m+n white balls would be drawn in N trials. Then the probability that the observed event has resulted from some possibility above u is expressed by $\sum \phi(x) \times f(x)$, summed from m to N, divided by the same expression summed from 0 to N.
This, as I understand, is the method pursued by Laplace in investigating the probability that the difference in the ratio of male to female births, as observed in Paris and in London (respectively 31 and 19), is due to a real difference between the two localities (“Theor. Analytique des Prob.,” Book II., Art. 29); mutatis mutandis, that is, it being observed first that Laplace's m is derived only from a finite set of observations (say at London), whereas ours is derived deductively from an infinite set of observations, the experience of games of chance and even more* widely diffused experiences, from the beginning of time. And secondly, in comparing our formula with Laplace's method, we must allow for his characteristic neglect of à priori probabilities. Laplace's reasoning is abridged by Mr. Todhunter, in his “History of Probabilities,” Arts. 902, 1018. Laplace is followed by Demorgan, in the treatise on Probabilities published in the “Encyclopæd. Metrop.,” at section 145, which the author entitles, “Determination of the Presumption that Increased Frequency of an Event has a Particular Cause.” The same method is employed by Cournot in his masterly discussion of à posteriori Probabilities (in the eighth chapter of “Exposition de la Théorie des Chances”). The reader who may wish to see the identical (or as nearly as possible the same) problem which we have in hand, discussed by a first-rate authority, is referred to Cournot, section 99; where it is to be observed that our case is that noted by Cournot when his m' (our N) is “très petit par rapport à m” (his m corresponding to our infinite set of observations afforded by games of chance, &c.).
But however well established the preceding formula as an organon of statistics (6), the following schema, savouring more of Bernouilli than of Bayes, is perhaps more appropriate to the particular problem in hand. Let α be the à priori probability that chance alone should have been the régime under which the observed event occurred. Let p be the objective probability that, chance being the régime, a deviation from u in the direction of success at least as great as v should occur. Let β be the à priori probability that there should have been some additional agency. Let γ be the (not in general objective) probability that, such additional agency existing, the observed event should occur. Then the required à posteriori probability in favour of the additional agency is $\frac{\beta\gamma}{\beta\gamma + \alpha p}$; where α = 1 − β.
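The schema just stated may be sketched numerically as follows. This is only an illustration under stated assumptions: the function name and the sample values of p, γ and β are hypothetical and are not taken from the text; only the ratio βγ/(βγ + αp), with α = 1 − β, comes from the passage above.

```python
def posterior_additional_agency(p, gamma, beta):
    """Return beta*gamma / (beta*gamma + alpha*p), with alpha = 1 - beta.

    p     : probability, chance alone acting, of a deviation at least as great
            as the one observed
    gamma : probability, an additional agency existing, of the observed event
    beta  : a priori probability of an additional agency
    """
    for name, value in (("p", p), ("gamma", gamma), ("beta", beta)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    alpha = 1.0 - beta  # a priori probability of chance alone
    denominator = beta * gamma + alpha * p
    if denominator == 0.0:
        raise ValueError("the observed event is impossible under both regimes")
    return beta * gamma / denominator


# Hypothetical inputs (assumptions, not values from the text):
# even a priori odds, gamma = 0.5, and a small chance probability p = 0.01.
example = posterior_additional_agency(p=0.01, gamma=0.5, beta=0.5)
print(round(example, 4))
```

With these assumed inputs the result lies close to unity, because p is small while γ is substantial.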
* I have dwelt upon this sort of experience elsewhere: Mind, April, 1884; Hermathena, 1884.
Such, as I understand, is the method pursued by Laplace in his investigation (“Theor. Analyt.," Book II., Art. 25) whether the difference between barometrical observation at different hours of the day is due to cause or chance alone. Laplace is followed by Demorgan in his section 139 entitled “On the Question whether Observed Discrepancies are Consequences of a General Law, or Accidental Fluctuations.” Such also is the method employed by Herschel (Essay on Quetelet) in determining the probability that the difference between the numbers of male and female births is not accidental, and that the connection between the binary stars is physical.
It may be objected, perhaps, to both these methods that they do not utilise all our knowledge; for that, as regards the second method, we are given the particular deviation from u, namely, v, while we take account only of the fact that the deviation belongs to the class extending from v to 1-u. In the first method, indeed, we take our stand upon the particular event, the deviation of exactly v. But, on the other hand, we do not take account of our exact knowledge of u.
The answer would have been the same if we had been given only that this fraction was somewhere between zero and what we now know to be its exact value.
This difficulty may be partially cleared up by the following illustration (borrowed from Laplace). Suppose we know that there are a thousand tickets in a certain lottery, whereof a hundred are red and the rest white, and that each has a certain number inscribed. If a red ticket is drawn, though it has a particular number inscribed on it, yet we cannot utilise that knowledge in the absence of any knowledge whether the agency, other than chance, would prefer one number to another. We may have to put down the (objective) probability that chance alone existing the red ticket would have been drawn as 1/10. But now let it be known that the particular number was prophesied, or is, and might have been found out to be, the prize-bearing ticket; then, indeed, we obtain a hold whereby to bring to bear our knowledge of the differential chance, that is 1/1000. In our problem, with reference, for example, to the second method above exhibited, we can assign certainly the differential probability that the exact deviation v should result from chance alone. But we cannot similarly differentiate our vague knowledge about the
We may assign, certainly, the form of such an argument, but when we come to our second operation we shall find that it is an empty form. This foredoomed form might be $\frac{\beta\gamma_1}{\beta\gamma_1 + \alpha p_1}$; where, corresponding to the notation above employed, $p_1$ is the (very small) probability that the particular deviation v should occur under the régime of chance; $\gamma_1$ is the probability (presumably of the same order of magnitude) that, an additional agency existing, the exact deviation v should have occurred; α and β are as before.
The only interpretation which I can put on Professor Lodge's reasoning upon the problem now in hand (in the Proceedings of the S.P.R., Part VII.), is that it is an attempt in some way to evade the difficulty here noticed. But the originality of his reasoning renders it difficult for the book-taught student to understand it.
(2) Still under the heading devoted to the first operation, we come now to our second problem. It seems a sufficient (though for reasons already intimated it is an imperfect) statement to posit the same formula as in the second method of the preceding problem (viz., $\frac{\beta\gamma}{\beta\gamma + \alpha p}$), substituting for the p of that formula the continued product p p′ p″, &c., expressing the probability that under the régime of chance all the observed results m+n, m+n′, &c., would have diverged in the same direction from the most probable result, m, by n, n′, &c. (Had the datum been that the observed results had diverged on one side or the other, it would have been proper to take each p as expressing that degree of divergence on one side or the other.) The import of γ is analogously modified.
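The substitution of the continued product for p may be illustrated by a short sketch. The function name and all numerical values are hypothetical. It is also an assumption of the sketch that γ is supplied as a single number, namely the probability, an additional agency existing, of the whole set of observed results.

```python
import math


def posterior_for_series(p_values, gamma, beta):
    """Posterior for an additional agency when several series all diverge
    in the same direction.

    p_values : the separate chance probabilities p, p', p'', ... of each series
    gamma    : probability of the whole set of results if the agency exists
               (assumed here to be given as one number)
    beta     : a priori probability of an additional agency
    """
    alpha = 1.0 - beta
    continued_product = math.prod(p_values)  # p * p' * p'' * ...
    return beta * gamma / (beta * gamma + alpha * continued_product)


# Hypothetical series (assumed values): three runs, each moderately unlikely
# under chance alone.
result = posterior_for_series([0.1, 0.2, 0.05], gamma=0.5, beta=0.5)
print(round(result, 4))
```

Even though no single series is very improbable in this assumed case, the continued product is small, and it is this product which carries the weight of the proof.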
It will be noticed that this formula differs from that offered by Mr. Gurney in Part VII. of these Proceedings. But, as above intimated, it does not follow that, because two formulæ are different, both cannot be right. They may be equally serviceable and equally imperfect. In the present case Mr. Gurney's formula appears to be quite as accurate as ours,* but not, as will presently be pointed out, substantially more serviceable.
(3) The third problem may be reduced to the second (or first), by grouping the given series so as to constitute a set, in all of which the successes are in excess. This method, doubtless, does not utilise all our information. But it is convenient; and it might be difficult to frame a more useful formula without special knowledge of the subject matter. Much would turn upon the probability that the agency other than chance, if existing, would have been attended by the observed chequered result. If it were known or suspected to be a fitful agency, not much presumption against it would be created by defective series.
* Poisson (Recherches, Art. 64) indicates the difference between these two procedures, without expressing a preference.
II. For the methods appropriate to the second operation the reader is referred to the paper on à priori probabilities in the Philosophical Magazine, September, 1884, and to the authorities therein cited. It is pointed out in the article referred to that an accurate knowledge of the values under consideration can often be dispensed with, and that an inaccurate knowledge is often derivable from experience; partly by a copious simple induction, and partly by inference from the success which has attended the hypothetical values which have been usually assigned to these quantities. To apply these principles to the problems in hand. (1) For the first problem and the (a) first method the à priori facility function $\phi(x)$ can, to a large extent, be ignored, when N is large; as Cournot has well exhibited in the eighth chapter of the work already referred to. I would further contend that there is some empirical ground for treating the function as a constant (as is usual in inverse reasoning founded on Bayes' theorem and the cognate theory of errors of observation). Accordingly the sought à posteriori probability reduces to the objective probability $\sum f(x)$ between proper limits, divided by the same, summed between extreme limits.
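The reduction to a ratio of sums may be sketched as follows. The function names and numerical values are hypothetical. In particular, the form given to f(x) in the example is an assumption made only for illustration: it is taken as the binomial probability of the observed number of successes in N trials when the underlying chance is x/N, a form the passage does not itself specify.

```python
import math


def ratio_of_sums(f, lower, upper, phi=lambda x: 1.0):
    """Sum of phi(x)*f(x) for x = lower..upper, divided by the same
    sum taken over x = 0..upper. A constant phi cancels out of the ratio."""
    numerator = sum(phi(x) * f(x) for x in range(lower, upper + 1))
    denominator = sum(phi(x) * f(x) for x in range(0, upper + 1))
    return numerator / denominator


def binomial_f(successes, trials):
    """ASSUMED form of f(x): probability of the observed successes when the
    underlying chance of success is x / trials."""
    def f(x):
        chance = x / trials
        return (math.comb(trials, successes)
                * chance ** successes
                * (1.0 - chance) ** (trials - successes))
    return f


# Hypothetical data (assumed): 8 successes in 10 trials; possibilities above
# one half correspond to x = 6..10.
print(ratio_of_sums(binomial_f(8, 10), lower=6, upper=10))
```

Whether the facility function is put equal to one or to any other constant makes no difference to the result, which is the point of treating it as constant.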
[...] $\frac{\beta\gamma}{\beta\gamma + \alpha p}$ [...] additional agency, it is consonant, I submit, to experience to put ½ both for α and β. To put that same value for γ appears, while not contradicted by, yet less agreeable to, experience. In fact, we know of some kinds
of agencies which, if they exist, are extremely likely to make themselves felt (e.g., imposture). Accordingly Mill, discussing a similar problem (“Logic,” Book III., chap. xviii., section 6), says: “The law of nature, if real, would certainly produce the series of coincidences.” And so Poisson, in a passage above referred to, supposes a cause “capable de le [the observed event] produire nécessairement.” But it really is not very important what particular value we assign to one of these à priori constants, provided that we are careful not to build upon any particularity which does not rest upon our rough though solid ground of experience. In the present case all that we really know about γ is that it is substantial, not in general indefinitely small. But we must not build any conclusion on its fractional character, seeing that it may very well be in the neighbourhood of unity. The importance of this remark will appear when we come to the second problem. In the present case, since neither α nor β nor γ is very small, if p is very small the above written expression for the à posteriori probability in favour of additional agency reduces by Taylor's theorem to $1 - \frac{\alpha}{\beta\gamma}\,p$. Thus the objective probability p may be taken as a rough measure of the sought à posteriori probability in favour of mere chance. This reasoning is authorised by Donkin and even by Boole, who is so mightily scrupulous about the undetermined constants of probabilities (see the authorities cited in the paper on à priori Probabilities in Philosophical Magazine). The conclusion is agreeable to the summary practice of Laplace and Herschel. They have not thought it worth while to construct a scaffolding of unknown constants which would have to be taken down again. The third formula
$\frac{\beta\gamma_1}{\beta\gamma_1 + \alpha p_1}$ attempts to utilise our knowledge of the particular deviation n, and the particular, most probable value from which it is a deviation, viz., m. $p_1$ is the objective probability that this particular deviation should occur in the régime of chance. $p_1$ we know; but what is $\gamma_1$? It is a magnitude presumably of the same order as $p_1$. Accordingly the above expression is thoroughly indeterminate. It will be remembered that this formula is here criticised not as being identical with the rule given by Professor Lodge, but as that to which the principle he employs might seem to lead. His rule, however obtained, is so far a good rule as (in common with an indefinite number of rules that might be constructed) it always varies in the same direction as the rule sanctioned by Laplace, Demorgan, Herschel, and the other masters of the science of probabilities. What is here termed p always increases with the increase, and decreases with the decrease of Professor Lodge's
(Proceedings, Part VII., p. 261, top). But it happens that Professor Lodge's rule does less than justice to the argument in favour of
agency other than chance.
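The contrast drawn in this paragraph, between a form that is determinate when p is small and a form that is indeterminate when both its constants are small, may be checked numerically. The sketch below is illustrative only: the function name and every number in it are assumptions, and the à priori probabilities are both put at one half, so that the first-order approximation becomes 1 − p/γ.

```python
def posterior(p, gamma, beta=0.5):
    """beta*gamma / (beta*gamma + alpha*p), with alpha = 1 - beta."""
    alpha = 1.0 - beta
    return beta * gamma / (beta * gamma + alpha * p)


# Second method: gamma substantial, p very small (assumed values).
# The exact value and the first-order approximation nearly coincide.
p, gamma = 1e-3, 0.5
print(posterior(p, gamma), 1.0 - p / gamma)

# Third formula: the chance probability of the exact deviation is tiny, and
# the corresponding probability under an agency is only known to be of the
# same order. Varying the assumed ratio between them moves the answer from
# near zero to near unity.
p_exact = 1e-6
for ratio in (0.1, 1.0, 10.0):
    gamma_exact = ratio * p_exact
    print(ratio, posterior(p_exact, gamma_exact))
```

In the first case the unknown constant hardly affects the conclusion; in the second the conclusion depends almost wholly upon a ratio about which nothing is known.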
(2) We come now to the second problem, concerning which, under the heading of the second operation, there need hardly be added anything. As under (1) we see (or will see presently) that p is the effective measure of the probability (the à posteriori probability) of mere chance, so under (2) the real grip of proof consists in p × p′ ×, &c. If we replace the value which Mr. Gurney assigns as “the probability of obtaining at least that degree of success, if chance +” act, by our γ, his “final value” will become
So far as there is reason to think (with Mill) that “the law of nature, if real, would certainly produce the series of coincidences,” Mr. Gurney seems to underrate the probability in favour of a cause other than chance, by assigning to $1/\gamma$ a value (2) which, being raised to the nth power, unduly swells the denominator. If each p, or the average (the geometric mean) of the p's, were ½, Mr. Gurney's formula would be void of any probative content. But this is contrary to common sense.
It is contrary to this elementary principle of statistics : that, if an event may indifferently happen one way or another, be either plus or minus, and it repeatedly happens one way, then there must be a cause other than chance for that repetition.* According to this new rule it is no
* It is evidently owing to a mere lapsus plumæ on the part of Mr. Gurney that this consequence can be fastened upon him. For at p. 256 he implies the principle for which we are here contending. It may be as well to repeat that my contention is not against Mr. Gurney's reasoning, which is excellent; but against his assumption of the premiss: that, “if chance +” act, the probability