Archive for the 'Probability theory' Category

Great mathematical ideas

Normblog has a regular feature, Writer’s Choice, where writers give their opinions of books which have influenced them.   Seeing this led me recently to think of the mathematical ideas which have influenced my own thinking.   In an earlier post, I wrote about the writers whose  books (and teachers whose lectures) directly influenced me.  I left many pure mathematicians and statisticians off that list because most mathematics and statistics I did not receive directly from their books, but indirectly, mediated through the textbooks and lectures of others.  It is time to make amends. 

Here then is a list of mathematical ideas which have had great influence on my thinking, along with their progenitors.  Not all of these ideas have yet proved useful in any practical sense, either to me or to the world – but there is still lots of time.   Some of these theories are very beautiful, and it is their elegance and beauty and profundity to which I respond.  Others are counter-intuitive and thus thought-provoking, and I recall them for this reason.

  • Euclid’s axiomatic treatment of (Euclidean) geometry
  • The various laws of large numbers, first proven by Jacob Bernoulli (which give a rational justification for reasoning from samples to populations)
  • The differential calculus of Isaac Newton and Gottfried Leibniz (the first formal treatment of change)
  • The Identity of Leonhard Euler:  exp(i * \pi) + 1 = 0, which mysteriously links two transcendental numbers (\pi and e) and an imaginary number i (the square root of minus one) with the identity of the addition operation (zero) and the identity of the multiplication operation (1).
  • The epsilon-delta arguments for the calculus of Augustin Louis Cauchy and Karl Weierstrass
  • The non-Euclidean geometries of Janos Bolyai, Nikolai Lobachevsky and Bernhard Riemann (which showed that 2-dimensional (or plane) geometry would be different if the surface it was done on was curved rather than flat – the arrival of post-modernism in mathematics)
  • The diagonalization proof of Georg Cantor that the Real numbers are not countable (showing that there is more than one type of infinity) (a proof-method later adopted by Gödel, mentioned below)
  • The axioms for the natural numbers of Giuseppe Peano
  • The space-filling curves of Giuseppe Peano and others (mapping the unit interval continuously to the unit square)
  • The axiomatic treatments of geometry of Mario Pieri and David Hilbert (releasing pure mathematics from any necessary connection to the real-world)
  • The algebraic topology of Henri Poincaré and many others (associating algebraic structures to topological spaces)
  • The paradox of set theory of Bertrand Russell (asking whether the set of all sets which do not contain themselves contains itself)
  • The Fixed Point Theorem of Jan Brouwer (which, inter alia, has been used to prove that certain purely-artificial mathematical constructs called economies under some conditions contain equilibria)
  • The theory of measure and integration of Henri Lebesgue
  • The constructivism of Jan Brouwer (which taught us to think differently about mathematical knowledge)
  • The statistical decision theory of Jerzy Neyman and Egon Pearson (which enabled us to bound the potential errors of statistical inference)
  • The axioms for probability theory of Andrey Kolmogorov (which formalized one common method for representing uncertainty)
  • The BHK axioms for intuitionistic logic, associated to the names of Jan Brouwer, Arend Heyting and Andrey Kolmogorov (which enabled the formal treatment of intuitionism)
  • The incompleteness theorems of Kurt Gödel (which identified some limits to mathematical knowledge)
  • The theory of categories of Sam Eilenberg and Saunders Mac Lane (using pure mathematics to model what pure mathematicians do, and enabling concise, abstract and elegant presentations of mathematical knowledge)
  • Possible-worlds semantics for modal logics (due to many people, but often named for Saul Kripke)
  • The topos theory of Alexander Grothendieck (generalizing the category of sets)
  • The proof by Paul Cohen of the logical independence of the Axiom of Choice from the Zermelo-Fraenkel axioms of Set Theory (which establishes Choice as one truly weird axiom!)
  • The non-standard analysis of Abraham Robinson and the synthetic differential geometry of Anders Kock (which formalize infinitesimal arithmetic)
  • The non-probabilistic representations of uncertainty of Arthur Dempster, Glenn Shafer and others (which provide formal representations of uncertainty without the weaknesses of probability theory)
  • The information geometry of Shunichi Amari, Ole Barndorff-Nielsen, Nikolai Chentsov, Bradley Efron, and others (showing that the methods of statistical inference are not just ad hoc procedures)
  • The robust statistical methods of Peter Huber and others 
  • The proof by Andrew Wiles of The Theorem Formerly Known as Fermat’s Last (which proof I don’t yet follow).
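
Two of the ideas in this list are easy to poke at numerically. What follows is only a sketch in Python, not a proof of anything: the function names, the fair-coin example and the random seed are my own assumptions for illustration. The first function measures how far exp(i * \pi) + 1 is from zero in floating-point arithmetic, and the second simulates coin flips to show a sample fraction settling towards the underlying probability, as the laws of large numbers say it should.

```python
import cmath
import math
import random


def euler_identity_residual() -> float:
    """Return |exp(i*pi) + 1|, which is zero up to floating-point rounding."""
    return abs(cmath.exp(1j * math.pi) + 1)


def fraction_of_heads(n: int, p: float = 0.5, seed: int = 0) -> float:
    """Return the fraction of heads in n simulated flips of a coin with P(heads) = p.

    The seed is fixed only so that repeated runs give the same numbers.
    """
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n) if rng.random() < p)
    return heads / n


if __name__ == "__main__":
    print("|exp(i*pi) + 1| =", euler_identity_residual())
    # Larger samples should give fractions closer to p = 0.5.
    for n in (10, 1_000, 100_000):
        print(n, "flips: fraction of heads =", fraction_of_heads(n))
```

The residual is not exactly zero because \pi cannot be represented exactly as a floating-point number, which is a small reminder that a numerical check is not a proof.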

Some of these ideas are among the most sublime and beautiful thoughts of humankind.  Not having an education which has equipped one to appreciate these ideas would be like being tone-deaf.





The decade around 1664

We noted before that one consequence of the rise of coffee-houses in 17th-century Europe was the development of probability theory as a mathematical treatment of reasoning with uncertainty.   Ian Hacking’s history of the emergence of probabilistic ideas in Europe has a nice articulation of the key events, all of which took place a decade either side of 1664:

  • 1654:  Pascal wrote to Fermat with his ideas about probability
  • 1657: Huygens wrote the first textbook on probability to be published, and Pascal was the first to apply probability ideas to problems other than games of chance
  • 1662: The Port Royal Logic was the first publication to mention numerical measurements of something called probability, and Leibniz applied probability to problems in legal reasoning
  • 1662:  London merchant John Graunt published the first set of statistics drawn from records of mortality
  • Late 1660s:  Probability theory was used by John Hudde and by Johan de Witt in Amsterdam to provide a sound basis for reasoning about annuities (Hacking 1975, p.11).

Developments in the use of symbolic algebra in Italy in the 16th-century provided the technical basis upon which a formal theory of uncertainty could be erected.  And coffee-houses certainly aided the dissemination of probabilistic ideas, both in spoken and written form.   Coffee houses may even have aided the creation of these ideas – new mathematical concepts are only rarely created by a solitary person working alone in a garret, but usually arise instead through conversation and debate among people with partial or half-formed ideas.  

However, one aspect of the rise of probability in the mid 17th century is still a mystery to me:  what event or phenomenon led so many people across Europe to be interested in reasoning about uncertainty at this time?  Although 1664 saw the establishment of a famous brewery in Strasbourg, I suspect the main motivation was the prevalence of bubonic plague in Europe.   Although plague had been around for many centuries, the Catholic vs. Protestant religious wars of the previous 150 years had, I believe, led many intelligent people to abandon or lessen their faith in religious explanations of uncertain phenomena.   René Descartes, for example, was led to cogito, ergo sum when seeking beliefs which peoples of all faiths or none could agree on.  Without religion, alternative models to explain or predict human deaths, morbidity and natural disasters were required.   The insurance of ocean-going vessels provided a financial incentive for finding good predictive models of such events.

Hacking notes (pp. 4-5) that, historically, probability theory has mostly developed in response to problems about uncertain reasoning in other domains:  In the 17th century, these were problems in insurance and annuities, in the 18th, astronomy, the 19th, biometrics and statistical mechanics, and the early 20th, agricultural experiments.  For more on the connection between statistical theory and experiments in agriculture, see Hogben (1957).  For the relationship of 20th-century probability theory to statistical physics, see von Plato (1994).

References:

Ian Hacking [1975]:  The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference. London, UK: Cambridge University Press.

Lancelot Hogben [1957]: Statistical Theory. W. W. Norton.

J. von Plato [1994]:  Creating Modern Probability:  Its Mathematics, Physics and Philosophy in Historical Perspective.  Cambridge Studies in Probability, Induction, and Decision Theory.  Cambridge, UK:  Cambridge University Press.




Retroflexive decision-making

How do companies make major decisions?  The gurus of classical Decision Theory – people like economist Jimmie Savage and statistician Dennis Lindley – tell us that there is only one correct way to make decisions:  List all the possible actions, list the potential consequences of each action, assign utilities and probabilities of occurrence to each consequence, multiply these numbers together for each consequence and then add the resulting products for each action to get an expected utility for each action, and finally choose that action which maximizes expected utility.
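
For concreteness, here is the classical recipe written out as a few lines of Python. This is a minimal sketch under my own assumptions: the two actions, their probabilities and their utilities are invented purely for illustration, and the names `expected_utility` and `best_action` are hypothetical, not taken from Savage or Lindley.

```python
from typing import Dict, List, Tuple

# Each action maps to a list of (probability, utility) pairs,
# one pair per potential consequence of that action.
Consequences = List[Tuple[float, float]]


def expected_utility(consequences: Consequences) -> float:
    """Multiply probability by utility for each consequence, then add the products."""
    return sum(p * u for p, u in consequences)


def best_action(actions: Dict[str, Consequences]) -> str:
    """Choose the action whose expected utility is largest."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical numbers, for illustration only.
ACTIONS: Dict[str, Consequences] = {
    "launch new product": [(0.3, 100.0), (0.7, -20.0)],
    "do nothing": [(1.0, 5.0)],
}

if __name__ == "__main__":
    for name, consequences in ACTIONS.items():
        print(name, "->", expected_utility(consequences))
    print("classical choice:", best_action(ACTIONS))
```

Note how much the sketch takes for granted: the list of actions, the list of consequences, and every probability and utility must all be known before the calculation can start.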

There are many, many problems with this model, not least that it is not what companies – or intelligent, purposive individuals for that matter – actually do.  Those who have worked in companies know that nothing so simplistic or static describes intelligent, rational decision making, nor should it.  Moreover, that their model was flawed as a description of reality was known at the time to Savage, Lindley, et al., because it was pointed out to them six decades ago by people such as George Shackle, an economist who had actually worked in industry and who drew on his experience.  The mute, autistic behemoth that is mathematical economics, however, does not stop or change direction merely because its utter disconnection from empirical reality is noticed by someone, and so – TO THIS VERY DAY – students in business schools still learn the classical theory.  I guess for the students it’s a case of:  Who are we going to believe – our textbooks, or our own eyes?  From my first year as an undergraduate taking Economics 101, I had trouble believing my textbooks.

So what might be a better model of decision-making?  First, we need to recognize that corporate decision-making is almost always something dynamic, not static – it takes place over time, not in a single stage of analysis, and we would do better to describe a process, rather than just giving a formula for calculating an outcome.   Second, precisely because the process is dynamic, many of the inputs assumed by the classical model do not exist, or are not known to the participants, at the start, but emerge in the course of the decision-making process.   Here, I mean things such as:  possible actions, potential consequences, preferences (or utilities), and measures of uncertainty (which may or may not include probabilities).     Third, in large organizations, decision-making is a group activity, with inputs and comments from many people.   If you believe – as Savage and Lindley did – that there is only one correct way to make a decision, then your model would contain no scope for subjective inputs or stakeholder revisions, which is yet another of the many failings of the classical model.    Fourth, in the real world, people need to consider – and do consider – the potential downsides as well as the upsides of an action, and they need to do this – and they do do this – separately, not merged into a summary statistic such as “utility”.   So, if  one possible consequence of an action-option is catastrophic loss, then no amount of maximum-expected-utility quantitative summary gibberish should permit a rational decision-maker to choose that option without great pause (or insurance).   Shackle knew this, so his model considers downsides as well as upsides.   That Savage and his pals ignored this one can only assume is the result of the impossibility of catastrophic loss ever occurring to a tenured academic.
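
The fourth point can be made with a toy calculation. The sketch below is only an illustration under assumptions of my own: the payoffs, the probabilities and the ruin threshold are all invented, and it is not Shackle's model. It shows an option with a small chance of catastrophic loss winning on the summary statistic, and then reports the upside and the downside of each option separately so that the catastrophe stays visible.

```python
from typing import List, Tuple

# (probability, payoff) pairs, one per potential consequence of an option.
Consequences = List[Tuple[float, float]]

# Hypothetical level of loss that the decision-maker could not survive.
RUIN_THRESHOLD = -500.0


def expected_value(consequences: Consequences) -> float:
    """Probability-weighted sum of payoffs: the single summary statistic."""
    return sum(p * payoff for p, payoff in consequences)


def upside_and_downside(consequences: Consequences) -> Tuple[float, float]:
    """Return the best and the worst payoff separately, without merging them."""
    payoffs = [payoff for _, payoff in consequences]
    return max(payoffs), min(payoffs)


def has_catastrophic_downside(consequences: Consequences) -> bool:
    """True if any possible consequence falls at or below the ruin threshold."""
    return any(payoff <= RUIN_THRESHOLD for _, payoff in consequences)


# Invented numbers, for illustration only.
SAFE: Consequences = [(1.0, 10.0)]
RISKY: Consequences = [(0.99, 30.0), (0.01, -1000.0)]

if __name__ == "__main__":
    for name, option in (("safe", SAFE), ("risky", RISKY)):
        up, down = upside_and_downside(option)
        print(name, "expected value:", expected_value(option),
              "upside:", up, "downside:", down,
              "catastrophic:", has_catastrophic_downside(option))
```

With these numbers the risky option has the larger expected value, yet it is the only one carrying a possible loss beyond the ruin threshold, which is exactly the information a single summary number hides.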

So let us try to articulate a staged process for what companies actually do when they make major decisions, such as major investments or new business planning:

  1. Describe the present situation and the way or ways it may evolve in the future.  We call these different future paths scenarios.   Making assumptions about the present and the future is also called taking a view.
  2. For each scenario, identify a list of possible actions, able to be executed under the scenario.
  3. For each scenario and action, identify the possible upsides and downsides.
  4. Some actions under some scenarios will have attractive upsides.   What can be done to increase the likelihood of these upsides occurring?  What can be done to make them even more attractive?
  5. Some actions under some scenarios will have unattractive downsides.   What can be done to eliminate these downsides altogether or to decrease their likelihood of occurring?   What can be done to ameliorate, to mitigate, to distribute to others, or to postpone the effects of these downsides?
  6. In the light of what was learned in doing steps 1-5, go back to step 1 and repeat it.
  7. In the light of what was learned in doing steps 1-6, go back to step 2 and repeat steps 2-5.  For example, by modifying or combining actions, it may be possible to shift attractive upsides or unattractive downsides from one action to another.
  8. As new information comes to hand, occasionally repeat step 1. Repeat step 7 as often as time permits.  

This decision process will be familiar to anyone who has prepared a business plan for a new venture, either for personal investment, or for financial investors and bankers, or for business partners.   Having access to spreadsheet software such as Lotus 1-2-3 or Microsoft Excel has certainly made this process easier to undertake.  But, contrary to the beliefs of many, people made major decisions before the invention of spreadsheets, and they did so using processes similar to this, as Shackle’s work evidences.

Because this model involves revision of initial ideas in repeated stages, it bears some resemblance to the retroflexive argumentation theory of philosopher Harald Wohlrapp.  Hence, I call it Retroflexive Decision Theory.  I will explore this model in more detail in future posts.

References:

D. Lindley [1985]:  Making Decisions.  Second Edition. London, UK: John Wiley and Sons.

L. J. Savage [1954]: The Foundations of Statistics.  New York, NY, USA:  Wiley.

G. L. S. Shackle [1961]: Decision, Order and Time in Human Affairs. Cambridge, UK:  Cambridge University Press.

H. Wohlrapp [1998]:  A new light on non-deductive argumentation schemes.  Argumentation, 12: 341-350.





Banking on Linda

Over at “This Blog Sits”, Grant McCracken has a nice post about a paradigm example often used in mainstream economics to chastise everyday human reasoners. A nice discussion has developed. I thought to re-post one of my comments, which I do here:

“The first point — which should be obvious to anyone who deals professionally with probability, but often seems not — is that the answer to a problem involving uncertainty depends very crucially on its mathematical formulation. We are given a situation expressed in ordinary English words and asked to use it to make a judgment. The probability theorists have arrived at a way of translating such situations from natural human language into a formal mathematical language, and using this formalism, to arrive at an answer to the situation which they deem correct. However, natural language may be imprecise (as in the example, as gek notes). Imprecision of natural language is a key reason for attempting a translation into a formal language, since doing so can clarify what is vague or ambiguous. But imprecision also means that there may be more than one reasonable translation of the same problem situation, even if we all agreed on what formal language to use and on how to do the translation. There may in fact be more than one correct answer.

There is much of background relevance here that may not be known to everyone.  First, note that it took about 270 years from the first mathematical formulations of uncertainty using probability (in the 1660s) to reach a sort-of consensus on a set of mathematical axioms for probability theory (the standard axioms, due to Andrey Kolmogorov, in the 1930s).   By contrast, the differential calculus, invented about the same time as probability in the 17th century, was already rigorously formalized (using epsilon-delta arguments) by the mid-19th century.   Dealing formally with uncertainty is hard, and intuitions differ greatly, even for the mathematically adept.

Second, even now, the Kolmogorov axioms are not uncontested. Although it often comes as a surprise to statisticians and mathematicians, there is a whole community of intelligent, mathematically-adept people in Artificial Intelligence who prefer to use alternative formalisms to probability theory, at least for some problem domains. These alternatives (such as Dempster-Shafer theory and possibility theory) are preferred to probability theory because they are more expressive (more situations can be adequately represented) and because they are easier to manipulate for some types of problems than probability theory. Let no one believe, then, that probability theory is accepted by every mathematically-adept expert who works with uncertainty.

Historical aside: In fact, ever since the 1660s, there has been a consistent minority of people dissenting from the standard view of probability theory, a minority which has mostly been erased from the textbooks. Typically, these dissidents have tried unsuccessfully to apply probability theory to real-world problems, such as those encountered by judges and juries (eg, Leibniz in the 17th century), doctors (eg, von Kries in the 19th), business investors (eg, Shackle in the 20th), and now intelligent computer systems (since the 1970s). One can have an entire university education in mathematical statistics, as I did, and never hear mention of this dissenting stream. A science that was confident of its own foundations would surely not need to suppress alternative views.

Third, intelligent, expert, mathematically-adept people who work with uncertainty do not even yet agree on what the notion of “probability” means, or to what it may validly apply. Donald Gillies, a professor of philosophy at the University of London, wrote a nice book, Philosophical Theories of Probability, which outlines the main alternative interpretations. A key difference of opinion concerns the scope of probability expressions (eg, over which types of natural language statements may one validly apply the translation mechanism). Note that Gillies wrote his book 70-some years after Kolmogorov’s axioms. In addition, there are other social or cultural factors, usually ignored by mathematically-adept experts, which may inform one’s interpretations of uncertainty and probability. A view that the universe is deterministic, or that one’s spiritual fate is pre-determined before birth, may be inconsistent with any of these interpretations of uncertainty, for instance. I have yet to see a Taoist theory of uncertainty, but I am sure it would differ from anything developed so far.

I write this comment to give some context to our discussion. Mainstream economists and statisticians are fond of castigating ordinary people for being confused or for acting irrationally when faced with situations involving uncertainty, merely because the judgements of ordinary people do not always conform to the Kolmogorov axioms and the deductive consequences of these axioms. It is surely unreasonable to cast such aspersions when experts themselves disagree on what probability is, to what statements probabilities may be validly applied, and on how uncertainty should be formally represented.”

Reference:

Donald Gillies [2000]: Philosophical Theories of Probability. (London, UK: Routledge)
