Tuesday, September 1, 2015

Nothing is Certain – Cromwell's Rule

I here defend the proposition that absolutely nothing is absolutely certain, where certainty is understood as a flat 0 or 1 as a Bayes prior.

My last post applied Bayes's Theorem to a proposition to which many people would apply a subjective probability of 1 (committed theists) or 0 (committed atheists). In fact, I would guess that at least 70% of the population would assess the probability of the existence of God at 1, and perhaps 20% at 0. I argued in passing that 0 and 1 probabilities are inappropriate when we are doing subjective probabilities, not only in this case, but in the case of each and every proposition. What I contended for is a strong form of what Dennis Lindley called “Cromwell's Rule.” The source of this name is in the following passage from his book Making Decisions in which he is defending the proposition that one's subjective probability for the moon's being made of green cheese should be greater than zero.

. . . it can be as small as 1 in a million, but have it there since otherwise an army of astronauts returning with samples of the said cheese will leave you unmoved. A probability of one is equally dangerous because then the probability of ~E will be zero. So never believe in anything absolutely, leave some room for doubt: as Oliver Cromwell told the Church of Scotland 'I beseech you in the bowels of Christ think it possible you may be mistaken.'
p. 104.

Lindley gives only one exception, that for the truths of logic, among which he includes such mathematics as “2 + 2 = 4”. See also Understanding Uncertainty, Revised Edition, Sec. 6.8.

Lindley is wrong about the 1 in a million. The probability of a moon of green cheese needs to be a few orders of magnitude lower than that. He is right, however, for the reason that he gives, that it must, as a theoretical matter, be non-zero.
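Lindley's astronaut point can be checked with a few lines of Python. This is only a sketch under made-up assumptions: the function names are mine, and the likelihoods (each cheese sample taken to be about 1000 times more probable if the moon is green cheese than if it is not) are invented for illustration, not taken from Lindley.

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H|E) from the prior P(H) and the two likelihoods."""
    numerator = p_e_given_h * prior
    denominator = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / denominator


def after_samples(prior, n_samples):
    """Update once per returning astronaut's sample (hypothetical likelihoods)."""
    p = prior
    for _ in range(n_samples):
        # Assumed: a cheese sample is ~1000x likelier if H (green cheese) is true.
        p = bayes_update(p, 0.999, 0.000999)
    return p


# A flat 0 is unmoved by any number of samples; a tiny non-zero prior is not.
print(after_samples(0.0, 50))
print(after_samples(1e-9, 5))
# A flat 1 is equally unmoved by evidence that points the other way.
print(bayes_update(1.0, 0.000999, 0.999))
```

On these assumed numbers, a prior of one in a billion is carried close to certainty by five samples, while the flat 0 stays exactly where it started however many astronauts return.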

Possible Exceptions to Cromwell's Rule

Let me distinguish four levels of strength for mathematical exceptions to Cromwell's Rule:

      1. Super Strong Exception: 0 or 1 is applied to all propositions whose falsity or truth is a matter of logic or mathematics.
      2. Very Strong Exception: 0 or 1 is applied to all propositions whose falsity or truth is accepted as a matter of logic or mathematics by the collective wisdom of the mathematical community.
      3. Strong Exception: 0 or 1 is to be applied to all propositions accepted by the Bayes user solely as a matter of logic and mathematics.
      4. Weak Exception: 0 or 1 is to be applied to all propositions accepted by the Bayes user solely through the use of easy logic and mathematics.

It is not clear to me to which of these Lindley would commit. His examples support only the Weak Exception, but his language at points seems to border on endorsement of the Super Strong. Were I forced into an unambiguous interpretation of Lindley, I would lean towards the Very Strong or Strong Exceptions.

Not the Super Strong Exception

We can dispense with the Super Strong Exception quickly. Let S be: “The 3rd digit in the 10 to the trillionth power prime is 7.” Either S is true or not-S is, and which is true is solely a matter of elementary mathematics (although well beyond calculation). Clearly, we do not want to assign 1 to either of these propositions, though one of them is as true as true can be. (There would be some reason to assign .1 to S; none to assign 1 or 0.)

Not the Very Strong Exception

If the person doing Bayes is herself suitably expert in the sort of math a proposition involves, then Exception 2 is equivalent to Exception 3. So the interesting case is where the subject lacks such expertise. Suppose I see a headline on the first page of the New York Times one morning, “Counterexample to Fermat's Theorem,” with a subhead well down in the article: “Suspect Section of Wiles's Proof.” On the basis of this new evidence, what should my credence level in the theorem be? It may well be that it should not drop very much until I see the counterexample, but surely it should not be 1. Once I check the counterexample and find it sound, of course, my updating would drop the probability way down. It would be far more likely that there was a mistake in the Wiles proof than that I and all those who checked the counterexample before it got into the Times should be wrong about it. This thought experiment seems to me to show that the Very Strong Exception to Cromwell's Rule must be rejected along with the Super Strong Exception.

Not the Strong Exception

I once understood a proof, or at least thought I understood a proof, that was the subject of a semester-long course. Given a few hours, I was confident that I could explain the basics of the proof to anyone with a reasonable level of mathematical sophistication in set theory and proof theory. With more time, I could convey the details. Should I have assigned a probability of 1 to the proved proposition? Imagine a New York Times story again. This time it would be on one of the inside pages that the proof would be announced to have a subtle fallacy. Surely the proof was long and difficult enough that a mistake could have gotten past its author, and even more easily past me. I should not, I think, have asserted a Cromwell's Rule exception.

Not the Weak Exception

This brings us to the probability of the proposition “2 + 2 = 4.” If we are to reject the Strong Exception, as I think we must, and still accept the Weak Exception, we are going to have to draw a line at some point on a spectrum that has difficult pieces of math at one end and easy ones at the other.

Now, I concede, in fact I would insist, that many arguments of the following form are fallacious: A and B cannot rationally be treated differently because any line dividing A from B will be completely arbitrary. It is perfectly rational to run more cautiously on a mountain path at night than it is in the daylight despite the arbitrariness of denominating a minute that separates day from night. Note that in this case, however, we might well run with increasing caution as twilight deepened. There is no problem in increasing prior probability of mathematical propositions from .9 by stages to .99999. It is the transition to a flat 1 that is dangerous.

What is the easiest math problem you ever got wrong? Do not count the time you marked an answer without even really looking because you were out of time. But do go ahead and count mistakes in third grade or when you were tired at the end of a test and perhaps coming down with the flu. Would your condition have to be so much worse than that for you to make a mistake about an easy addition?

Another thought experiment. You are shown an arithmetic test that has some problems as easy as “2 + 2 = 4.” You then watch as a subject, having been given an injection, takes the test. At intervals the subject is asked whether he is absolutely certain of each answer, to which he replies in the confident affirmative, volunteering, a couple of times, that he has checked them. You, however, have noticed three mistakes. You were also somewhat surprised when the subject spontaneously interrupted his task to declaim “The world is all that is the suitcase.” A few minutes later this was followed by “Quadruplicity drinks procrastination.”

You would not, I take it, be impressed by the subject's sincere certainty that he had all the problems right, nor by his protestation that the problems were too simple for him possibly to make any mistake, and that, in any event, he had gone over them three times. You might object that this story is impossible. A person drugged in this fashion would report some kind of confusion or fuzziness of thought. I concede that he probably would. 99,999 times in a hundred thousand there would be some such tip-off. There would, however, be that .00001 case, or perhaps it is a one in a hundred billion case, but there is some remote possibility of a brain glitch of some sort compatible with a sincere and subjectively certain affirmation of “2 + 2 = 5.”

As I suggested in my prior post, we must not be led astray by the thought that nothing in the empirical world can get between “2 + 2 = 4” and what makes it true. For degrees of confidence what is important is whether anything, even any extraordinarily rare thing, can get in between the mathematical proposition and your analysis of it. Lindley, though one of the founders of the Bayesian, subjective probabilities, approach to probability, was seduced into what amounts to a form of frequentism when it came to mathematical propositions. “2 + 2 = 4” is never false. It is not false once in a billion times. But then, “the moon is not made of green cheese” is not false either – ever. Both propositions, however, are of a sort about which we might make a mistake, if extraordinarily rarely. Evidence on the possibility of a mistake in our evidence gathering or our thinking is always in order. That is why even the Weak Exception to Cromwell's Rule is unacceptable.

The Practical Bayesian Objection

Having concluded that there ought to be no exceptions at all to Cromwell's Rule, even for easy propositions of logic or arithmetic, let me confront two arguments that I am wrong. The two come from radically different directions. The first reminds us that Bayesian probabilities are a practical workaday methodology utilized by the social sciences, artificial intelligence, image processing, spam filters, and on and on down an ever-growing list. The second idealizes the Bayes formula as pure mathematics.

The practical move is straightforward. Getting our heads out of the clouds of abstract theory, the dispute as to whether there are flat 1 or only very nearly 1 prior probabilities may seem to have some resemblance to the question how many angels can dance upon the head of a pin. I have some sympathy for this objection because no one, so far as I know, has seriously applied Bayes to mathematical propositions. The proper prior for “2 + 2 = 4” is not a question that arises for those who use Bayes. That is not, however, much of a reason not to try to nail down Bayesian theory. It may be a reason not to try to force the discussion upon those who design Bayes updating into their spam filters.

In fact, I think the move can be turned back against my critics. It might make some sort of difference in principle, but if we are here concerned with practical Bayesian applications, what possible practical difference could there be between a prior confidence of 1 and a Cromwell's Rule prior confidence of .999999?
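To put rough numbers on that question, here is a small Python sketch. The function name and the likelihood ratios are my own illustrative assumptions; the only thing taken from the text is the pair of priors, 1 and .999999.

```python
def posterior(prior, likelihood_ratio):
    """P(H|E), where likelihood_ratio = P(E|H) / P(E|not-H)."""
    weighted = prior * likelihood_ratio
    return weighted / (weighted + (1.0 - prior))


# Workaday evidence: ten to one against H (an assumed figure).
everyday = 0.1
gap = posterior(1.0, everyday) - posterior(0.999999, everyday)
print(gap)  # the two priors end up within about one part in 100,000

# Overwhelming evidence: ten million to one against H (also assumed).
overwhelming = 1e-7
print(posterior(1.0, overwhelming))       # the flat 1 does not move
print(posterior(0.999999, overwhelming))  # the Cromwell prior falls below .5
```

On these figures the two priors are practically indistinguishable for anything a spam filter is likely to see. They part company only when the evidence against is on the order of a million to one, which is the kind of case the thought experiments above were about.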

The Theoretical Bayesian Objection

The opposite move in criticism of an exceptionless Cromwell's Rule insists that what there is to Bayes in the first instance is a bit of pure mathematics. Whether P(H) is defined over values x, 0 ≤ x ≤ 1 or 0 < x < 1, is a matter of the mathematics and, like any other matter of pure mathematics, does not concern itself with the possibility of mathematical mistakes. Mathematical reasoning and mathematical formulae are never disfigured by the possibility that someone will make a logical or mathematical mistake in their application. So, the objection continues, Lindley was right to make either a Super Strong or at least a Very Strong Exception to Cromwell's Rule for mathematical propositions.

Pure mathematics, however, does not settle the ≤ versus < question. It is a matter of the application of the formula. Updating a prior degree of credence requires that there be something with a degree of credence. That something need not be a human with degrees of belief. The formula could be employed by a computer, an extraterrestrial, or a deity. If it is used by a being who is omniscient with respect to a particular field, then it will be appropriate for that being to assign a prior of 0 or 1 to any proposition of that field. Rarely, however, will such a being have much use for Bayesian updating.

Here is a possible case. Assume Swinburne's God, who knows all that there is to know, but does not know all of the future – because not all of the future is fixed (free will, quantum indeterminism). Consider the following proposition “Goldbach's conjecture is true and human beings will set foot on Mars before 2300.” Suppose that there is a new bit of evidence about the funding of the US manned space program. If you or I wanted to update our rational credence in this proposition, we would have to think about what our priors should be for Goldbach's conjecture and the Mars landing. (Neither would be particularly easy.)

For Swinburne's God, Goldbach's conjecture gets a 0 or a 1, and so he either assigns 0 to the conjunction or updates his Mars prior. For this reason, I would grant that the Super Strong Exception applies whenever Bayes is used by an omniscient being, even if omniscience is limited – so long as mathematics is without the limitation. For those of us users of Bayes not blessed with omniscience, “<” is the appropriate application of the formula.

Metaphilosophical Conclusion

Where Cromwell's Rule is most likely to see action is not with respect to math, to green cheese on the moon, or to Moore's “Here is one hand, here is another.” Attempts to bring Bayes into the discussion of traditionally philosophical issues are becoming a little less rare, and when they are made, it is more likely to be where evidence is adduced respecting the existence of God, a physics with more than three spatial dimensions, the continuum hypothesis, a Platonism of mathematical objects, souls, noumena, free will, or the like. Bayesian methods may sometimes be attempted where people have strong commitments or are particularly impressed by the limits of what they can conceive. Those who want to bring empirical evidence to bear upon such questions should not be debarred by the assertion of an exception to Cromwell's Rule.


  1. You mention a computer briefly in the Theoretical Bayes section. How does such a mechanical device sit with the Weak Bayes exception, do you think? Still suspect because if it exists in the real world, bugs are possible? At least it shouldn't be subject to the same faults in improper credence as a drugged human, but maybe the bug possibility means it really is limited in the same way.

    Then maybe more practical. I'm used to supplying Bayes theorem with probability distributions over a set. But if I don't get to assign a '0' to any option for a statement, I'm not sure I can define that set. For example, say 2+2=x. My normal probability distribution would have a very strong spike at 4. Now, do I only consider positive integers as options for x? Or must I give very small probabilities over the whole real continuum? What about complex numbers? What about considering the possibility the equation doesn't have a numerical solution? How may one define a domain of possibility if no possibilities may be ruled out?

    1. I do think even the simplest circuits have some, perhaps very low, probability of a bug.

      Whether to define a distribution over integers, reals, or complex might depend upon what kind of game one was hunting. If I am doing probabilities of hitting a prime, I stick with integers; if positrons in a space, maybe rationals. I am a little worried that defining the domain of possibility may be incompatible with my thesis. Maybe, however, it is enough to say that such a definition is (a) always a matter of one's particular purposes and resources at the time, and (b) should always be regarded as having something of the provisional about it.
