Friday, June 8, 2012

Non-measurable sets and interval-valued probabilities

I think there is nothing new here, but I want to collect together some facts that are interesting to me.

Suppose m is a countably additive measure on a set U. Then it's pretty easy to show that for any subset B of U, measurable or not, there exist measurable sets A and C such that:

  • A is a subset of B and B is a subset of C
  • A is maximal in measure among the measurable subsets of B: for every measurable subset A' of B, m(A')≤m(A)
  • C is minimal in measure among the measurable supersets of B: for every measurable superset C' of B, m(C')≥m(C).
(Cf. Van Vleck.) We can now define the lower measure lm(B)=m(A) and the upper measure um(B)=m(C). Of course lm(B)≤um(B) for all B.
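To make the definitions concrete, here is a minimal sketch on a finite toy space. Everything in it is an assumption made for illustration and is not from the post: U = {0,...,5}, the measurable sets are taken to be exactly the unions of three blocks, and the names `BLOCKS`, `lower_measure` and `upper_measure` are hypothetical. In this setting A is the union of the blocks contained in B and C is the union of the blocks that meet B.

```python
from fractions import Fraction

# Hypothetical toy setup: U = {0,...,5}. The measurable sets are exactly the
# unions of these blocks, and m gives each block the mass shown.
BLOCKS = {
    frozenset({0, 1}): Fraction(1, 5),
    frozenset({2, 3}): Fraction(1, 2),
    frozenset({4, 5}): Fraction(3, 10),
}


def lower_measure(B):
    """lm(B) = m(A), where A is the union of the blocks contained in B."""
    B = frozenset(B)
    return sum((mass for block, mass in BLOCKS.items() if block <= B), Fraction(0))


def upper_measure(B):
    """um(B) = m(C), where C is the union of the blocks that meet B."""
    B = frozenset(B)
    return sum((mass for block, mass in BLOCKS.items() if block & B), Fraction(0))


B = {0, 1, 2}  # not a union of blocks, so not measurable in this toy model
print(lower_measure(B), upper_measure(B))  # 1/5 7/10
```

For a set that is a union of blocks, such as {2, 3}, the two functions agree, which matches the idea that measurable sets are those with lm(B)=um(B).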

If m is a probability measure (i.e., m(U)=1), we can then extend the measure m on U to a complete measure (i.e., one such that any subset of a set of measure zero is also measurable) simply by taking as measurable all sets B such that lm(B)=um(B), and then setting m(B)=um(B).

From now on assume m is a complete probability measure. Then a set B is measurable if and only if lm(B)=um(B).

Suppose that X1,X2,... are independent random variables taking their values in U, with the probability that Xi is in B being equal to m(B) for every measurable B. Let Sn(B) be the number of the variables X1,...,Xn whose values are in B. If B is measurable, then the Strong Law of Large Numbers implies that almost surely (i.e., with probability 1), Sn(B)/n converges to m(B). It immediately follows that in general, whether or not B is measurable, almost surely

  • lm(B) ≤ liminf Sn(B)/n ≤ limsup Sn(B)/n ≤ um(B).
In other words, almost surely, all the limit points of the asymptotic frequency Sn(B)/n fall between the lower and upper measures of B.
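The step from the measurable case to the general one can be spelled out as follows. This is only a sketch of the argument, with A and C being the measurable sets chosen for B earlier. Since A is a subset of B and B is a subset of C, the counts satisfy S_n(A) ≤ S_n(B) ≤ S_n(C) for every n, and the Strong Law applies to A and C:

```latex
\[
  \mathrm{lm}(B) = m(A) = \lim_{n\to\infty} \frac{S_n(A)}{n}
  \le \liminf_{n\to\infty} \frac{S_n(B)}{n}
  \le \limsup_{n\to\infty} \frac{S_n(B)}{n}
  \le \lim_{n\to\infty} \frac{S_n(C)}{n} = m(C) = \mathrm{um}(B)
  \quad \text{almost surely.}
\]
```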

It would be interesting to see what else we can say about the limit points of the asymptotic frequency. One might speculate that lm(B) and um(B) are almost surely limit points of the asymptotic frequency, but I think that's not true in general. But could it be true in the special case where m is Lebesgue measure on an interval?

I've been thinking from time to time about this question: What do asymptotic frequencies of visits to a nonmeasurable set look like? Still no answer.

In any case, the above stuff suggests that dealing with nonmeasurable sets might be a good application for interval-valued probabilities, where we assign the interval [lm(B),um(B)] as the probability of B.
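As a rough illustration of the interval-valued idea, here is a small simulation on a finite toy model. The model is an assumption of this sketch and not something from the post: m is defined only on three blocks, so it says nothing about which point inside a block is hit, and the hypothetical `pick` argument stands in for that unknown rule. Whatever rule is used, the observed frequency of B should settle inside [lm(B),um(B)], up to sampling noise.

```python
import random
from fractions import Fraction

# Hypothetical toy model: (block, mass) pairs. The measurable sets are the
# unions of blocks; B below is not one of them.
BLOCKS = [((0, 1), Fraction(1, 5)), ((2, 3), Fraction(1, 2)), ((4, 5), Fraction(3, 10))]
B = {0, 1, 2}


def interval_probability(B):
    """Return the interval (lm(B), um(B)) for the toy model."""
    lm = sum((w for blk, w in BLOCKS if set(blk) <= B), Fraction(0))
    um = sum((w for blk, w in BLOCKS if set(blk) & B), Fraction(0))
    return lm, um


def simulate(n, pick, seed=0):
    """Frequency of visits to B in n draws.

    Blocks are drawn according to m; `pick(block, i)` is an arbitrary rule
    (an assumption of this sketch) choosing the point inside the block.
    """
    rng = random.Random(seed)
    blocks = [blk for blk, _ in BLOCKS]
    weights = [float(w) for _, w in BLOCKS]
    hits = 0
    for i in range(n):
        blk = rng.choices(blocks, weights=weights)[0]
        hits += pick(blk, i) in B
    return hits / n


lm, um = interval_probability(B)
low = simulate(20000, lambda blk, i: blk[1])       # rule that avoids B when it can
high = simulate(20000, lambda blk, i: blk[0])      # rule that lands in B when it can
mixed = simulate(20000, lambda blk, i: blk[i % 2])  # alternating rule
print(f"[{lm}, {um}]", low, mixed, high)
```

The two extreme rules push the frequency toward the endpoints of the interval, while the alternating rule lands somewhere in between. In this finite model the endpoints are attainable; the post's question about the Lebesgue case is a different and harder matter.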

Oh, and finally, it's worth noting that Van Vleck has in effect shown that if m is Lebesgue measure on [0,1], then there is a subset of [0,1] whose lower measure is zero and whose upper measure is one.

1 comment:

Alexander R Pruss said...

There is a literature on such stuff that I didn't know when I did the post. E.g., lm is usually denoted m subscript * and um is usually denoted m superscript *: these are the lower and upper measures generated by m. The existence of sets A and C is well-known, too. E.g., Lemma 1.2.2 in A. van der Vaart and J. Wellner, Weak Convergence and Empirical Processes: With Applications to Statistics, New York: Springer, 1996. Though Van Vleck knew it already in the case of Lebesgue measure, and the proof there is presumably the same as in the general case.