Probabilities on a Lusin space

Posted on May 13, 2020 by Stéphane Laurent
Tags: maths

We prove Prohorov’s theorem for a Lusin space, and we prove that the space of probability measures on a Lusin space is Lusin.

Space of probability measures

For a topological space \(E\), we denote by \(\mathcal{B}_E\) the Borel \(\sigma\)-field on \(E\), by \(C_b(E)\) the space of real-valued bounded continuous functions on \(E\) and by \(\Pr(E)\) the topological space of probabilities on \(E\) equipped with the narrow topology.

The canonical basis of neighborhoods at \(\mu \in \Pr(E)\) is the family of open sets \[ \Bigl\{\nu\in\Pr(E) \mid \bigl|\nu(f_i)-\mu(f_i)\bigl| < \epsilon, i\in [\![1,k]\!]\Bigr\} \] for \(\epsilon>0\), \(k \geqslant 1\) and \(f_i \in C_b(E)\).

In the next section, we will use the fact that a net \((\mu_\lambda)\) in \(\Pr(E)\) converges to \(\mu_\infty \in \Pr(E)\) if and only if \(\mu_\lambda(f) \to \mu_\infty(f)\) for every \(f \in C_b(E)\).

Proposition 1. Let \(\mu\in\Pr(E)\) and \((\mu_n)\) be a sequence in \(\Pr(E)\). If \(\limsup \mu_n(F) \leqslant \mu(F)\) for all closed sets \(F\subset E\), then \((\mu_n)\) converges to \(\mu\). The converse is true if \(E\) is metrizable.

Proof. See The conditional law as a random probability. □

Lemma 2. Let \(E\) be a metrizable space, and \(A \in \mathcal{B}_E\). The map \(\Pr(E) \ni \mu \mapsto \mu(A) \in [0,1]\) is Borel measurable.

Proof. Let \(\mathcal{D}\) be the class of \(A \in \mathcal{B}_E\) such that the map \(\mu \mapsto \mu(A)\) is measurable. For every \(f \in C_b(E)\), the map \(\mu \mapsto \mu(f)\) is continuous, hence it is measurable. Let \(F \subset E\) be closed. For every integer \(n\geqslant 1\) define \(f_n\in C_b(E)\) by \(f_n(x)=\max\bigl\{0,1-n d(x,F)\bigr\}\) where \(d\) is a metric on \(E\) compatible with the topology. Then \(f_n(x)\downarrow\mathbf{1}_F(x)\) for every \(x\in E\). By monotone convergence, \(F \in \mathcal{D}\). It is not difficult to check that \(\mathcal{D}\) is a \(\lambda\)-system. By the monotone class theorem, \(\mathcal{D} = \mathcal{B}_E\). □

Prohorov’s theorem in the Lusin case

Here we give a proof of Prohorov’s theorem for a Lusin space \(E\), i.e. \(E\) is homeomorphic to a Borel subset of a compact metric space. We follow [2]. We will only admit Stone-Weierstrass’s theorem and Riesz’s representation theorem, and the theorems used in the proof of the following theorem, which shows that the Lusin assumption is enough for us.

Theorem 3. A Polish space is Lusin. More precisely, a Polish space is homeomorphic to a \(G_\delta\) subset of the Hilbert cube \({[0,1]}^\mathbb{N}\).

Proof. See theorems 3.11 and 4.14 in [1]. □

Proposition 4. Let \((K,d)\) be a compact metric space. Then \(C(K)\) is separable and \(\Pr(K)\) is compact metrizable.

Proof. As a compact metric space, \(K\) is separable. Let \((x_n)\) be a dense sequence of \(K\) and define \(h_n \in C(K)\) by \(h_n(x)=d(x,x_n)\). The family of functions \(\{h_n\}\) separates points of \(K\), that is to say, for every \(x,y \in K\), there exists \(i\) such that \(h_i(x) \neq h_i(y)\). Indeed, let \(x,y\in K\) be distinct. Take \((n_j)\) such that \(d(x,x_{n_j}) \to 0\). If we had \(h_i(x) = h_i(y)\) for every \(i\), we would have \(d(y,x_{n_j}) \to 0\), therefore we would have \(x=y\).

Consider the subalgebra \(A\) of \(C(K)\) consisting of all polynomials in the \(h_k\), that is, the class of functions that are finite sums of the form \[ q\mathbf{1} + \sum q_{k_1,\ldots,k_r;n_1,\ldots,n_r}h_{k_1}^{n_1}\cdots h_{k_r}^{n_r}, \] where \(q\) and the \(q_{k_1,\ldots,k_r;n_1,\ldots,n_r}\) are rational constants. The closure of this subalgebra is a subalgebra of \(C(J)\), it contains all constant functions and it separates points of \(K\). Therefore, by Stone-Weierstrass’s theorem (see [3]), \(A\) is dense in \(C(K)\). Hence \(C(K)\) is separable because \(A\) is countable. Let \((f_n)\) be a dense sequence in \(C(K)\). Consider the map \[ \begin{array}{rccl} \Phi\colon & \Pr(K) & \to & V := \displaystyle\prod_{n=1}^\infty \bigl[-{\Vert f_n\Vert}_\infty, {\Vert f_n\Vert}_\infty\bigr] \\ & \mu & \mapsto & \bigl(\mu(f_1), \mu(f_2), \ldots \bigr) \end{array}. \]

  • The map \(\Phi\) is injective: if \(\mu,\nu\in\Pr(K)\) satisfy \(\mu(f_n) = \nu(f_n)\) for every \(n\), then \(\mu(f)=\nu(f)\) for every \(f\in C(K)\) by dominated convergence; this implies \(\mu=\nu\) (theorem 1.2 in [4]).

  • The map \(\Phi\) is continuous because each coordinate map \(\mu\mapsto\mu(f_n)\) is continuous.

  • The map \(\Phi^{-1}\) is continuous. Indeed, take a net \({(v_\lambda)}_{\lambda\in\Lambda}\) in \(\Phi(V)\) converging to \(v_\infty \in \Phi(V)\). Let \(\mu_\lambda\) such that \(v_\lambda = \Phi(\mu_\lambda)\) for every \(\lambda \in \Lambda\cup\{\infty\}\). Thus \(\mu_\lambda(f_j) \to \mu_\infty(f_j)\) for every \(j \geqslant 1\). For any \(f\in C(K)\), we have \[ \bigl|\mu_\lambda(f) - \mu_\infty(f)\bigr| \leqslant 2{\Vert f-f_j\Vert}_\infty + \bigl|\mu_\lambda(f_j) - \mu_\infty(f_j)\bigr| \] for every \(\lambda\) and every \(j\). Hence \[ \limsup_{\lambda} \bigl|\mu_\lambda(f) - \mu_\infty(f)\bigr| \leqslant 2{\Vert f-f_j\Vert}_\infty. \] There exists a subsequence of \({\Vert f-f_j\Vert}_\infty\) converging to \(0\). It follows that \[ \limsup_{\lambda} \bigl|\mu_\lambda(f) - \mu_\infty(f)\bigr| = 0 = \lim_{\lambda} \bigl|\mu_\lambda(f) - \mu_\infty(f)\bigr| \] for all \(f\in C(K)\), hence \(\mu_\lambda \to \mu_\infty\).

Thus \(\Phi\) is a homeomorphism. It remains to show that \(\Phi\bigl(\Pr(K)\bigr)\) is closed in the compact space \(V\). Since \(V\) is metrizable (as a countable product of metrizable spaces), it suffices to show that a sequence of \(\Phi\bigl(\Pr(K)\bigr)\) converges in \(\Phi\bigl(\Pr(K)\bigr)\) whenever it converges in \(V\). Let \((\mu_k)\) be a sequence in \(\Pr(K)\) such that the sequence \(\bigl(\Phi(\mu_k)\bigr)\) converges to \((\alpha_n) \in V\). Let \(f \in C_b(K)\). There exists a sequence \((j_r)\) of integers such that \({\Vert f_{j_r} - f \Vert}_\infty \to 0\). We have \[ \bigl|\mu_n(f) - \mu_m(f)\bigr| \leqslant 2{\Vert f-f_{j_r}\Vert}_\infty + \bigl|\mu_n(f_{j_r}) - \mu_m(f_{j_r})\bigr|. \] The second term of the sum goes to \(0\) as \(m,n\to\infty\) because \(\mu_k(f_{j_r})\) is the \(j_r\)-th component of \(\Phi(\mu_k)\), which goes to \(\alpha_{j_r}\). Hence \[ \limsup_{m,n\to\infty} \bigl|\mu_n(f) - \mu_m(f)\bigr| \leqslant 2{\Vert f-f_{j_r}\Vert}_\infty, \] therefore \[ \limsup_{m,n\to\infty} \bigl|\mu_n(f) - \mu_m(f)\bigr| = 0. \] Thus the limit \(\lim \mu_n(f) =: \Lambda(f)\) exists. The map \(f \mapsto \Lambda(f)\) is an increasing linear functional on \(C(K)\) which maps \(\mathbf{1}\) to \(1\). By Riesz’s representation theorem (see [3]), there exists \(\mu\in\Pr(K)\) such that \(\Lambda(f) = \mu(f)\). In particular \(\mu(f_j) = \alpha_j\), thus \(\Phi(\mu) = (\alpha_1,\alpha_2,\ldots)\). In other words, \(\Phi\bigl(\Pr(K)\bigr)\) is closed in \(V\). □

Hereafter, \(E\) denotes a Lusin space, which means that \(E\) is homeomorphic to a Borel subset \(E'\) of a compact metric space \((K,d)\). Let \(A\subset K\) be a Borel set.
Any \(\mu\in\Pr(E)\) can be extended to a probability \(\hat\mu\in\Pr(K)\) by setting \[ \hat\mu(A') = \mu\bigl(\iota^{-1}(A'\cap E')\bigr) = \mu\bigl(\iota^{-1}(A')\cap E\bigr) \] where \(\iota\colon E \to E'\) is a homeomorphism. Observe that \(\hat\mu(K\setminus E') = 0\). The map \(\mu \mapsto \hat\mu\) is injective because \(\iota\) is a bimeasurable bijection. Its image is the set \[ Q = \bigl\{\nu \in \Pr(K) \mid \nu(E')=1\bigr\}. \] Indeed, if \(\nu \in Q\), then \(\nu = \hat\mu\) for the probability \(\mu \in \Pr(E)\) defined by \(\mu(A) = \nu(\iota(A))\).

Lemma 5. The map \(Q \ni \nu \mapsto \nu_{|\mathcal{B}_{E'}} \in \Pr(E')\) is continuous.

Proof. Denote this map by \(\kappa\). Since \(\Pr(K)\) is metrizable by Proposition 3, \(Q\) is metrizable as well, and it suffices to show that \(\kappa\) is sequentially continuous. Take a sequence \((\mu_n)\) in \(Q\) and \(\mu_\infty \in Q\) such that \(\mu_n \to \mu_\infty\) in \(Q\). One also has \(\mu_n \to \mu_\infty\) in \(\Pr(K)\), and then, by Proposition 1, \[ \forall F\subset K \text{ closed}, \limsup \mu_n(F) \leqslant \mu_\infty(F). \] Let \(F' \subset E'\) closed, hence \(F' = F \cap E'\) for a closed \(F \subset K\). Then \[ \limsup \kappa(\mu_n)(F') = \limsup \mu_n(F') \leqslant \limsup \mu_n(F) \leqslant \mu_\infty(F). \] But \(\mu_\infty \in Q\), therefore \(\mu_\infty(F) = \mu_\infty(F') = \kappa(\mu_\infty)(F')\). By Proposition 1, \(\kappa(\mu_n) \to \kappa(\mu_\infty)\) in \(\Pr(E')\). □

Lemma 6. The subset \(Q\) of \(\Pr(K)\) is homeomorphic to \(\Pr(E')\). The map \(\Phi\colon\Pr(E')\to Q\) defined by \(\Phi(\nu)(A) = \nu(A\cap E')\) is a homeomorphism.

Proof. One has \(\mathcal{B}_{E'} = \{A\cap E' \mid A \in \mathcal{B}_K\}\). Hence it is clear that \(\Phi\) is injective. Let \(\mu\in Q\) and \(A'\in\mathcal{B}_{E'}\). One has \(A' = A\cap E'\) for a certain \(A \in \mathcal{B}_K\). Define \(\mu'(A') = \mu(A)\). Since \(\mu(A) = \mu(A\cap E')=\mu(A')\), this definition of \(\mu'(A')\) does not depend on the choice of \(A\). In fact \(\mu' = \mu_{|\mathcal{B}_{E'}}\). So \(\mu' \in \Pr(E')\) and \(\Phi(\mu') = \mu\). Thus \(\Phi\) is surjective.

Lemma 5 says that the inverse map \(\Phi^{-1}\) is continuous.

Now we will show that \(\Phi\) is continuous. The space \(\Pr(E')\) is pseudometrizable: we know from the previous proposition that \(\Pr(K)\) is metrizable, and with the help of a metric \(\rho\) on \(\Pr(K)\) one can define a pseudometric \(\rho'\) on \(\Pr(E')\) by \(\rho'(\mu',\nu') = \rho\bigl(\Phi(\mu'),\Phi(\nu')\bigr)\). Therefore it is sufficient to prove that \(\Phi\) is sequentially continuous.

The map \(\Phi\) is sequentially continuous. Indeed, consider a sequence \((\nu_n)\) in \(\Pr(E')\) and \(\nu_\infty \in \Pr(E')\) such that \(\nu_n \mapsto \nu_\infty\) in \(\Pr(E')\). By Proposition 1, one has \(\limsup\nu_n(F')\leqslant\nu_\infty(F')\) for all closed \(F' \subset E'\). Let \(F \subset K\) be closed. One has \(\Phi(\nu_n)(F) = \nu_n(F \cap E')\). But \(F \cap E'\) is closed in \(E'\). Therefore \[ \limsup \Phi(\nu_n)(F) = \limsup \nu_n(F \cap E') \leqslant \nu_\infty(F \cap E') = \Phi(\nu_\infty)(F). \] By Proposition 1, \(\Phi(\nu_n) \to \Phi(\nu_\infty)\) in \(\Pr(K)\). Hence \(\Phi(\nu_n) \to \Phi(\nu_\infty)\) in \(Q\). Indeed, for every \(\Pr(K)\)-neighborhood \(V\) of \(\Phi(\nu_\infty)\), there exists an integer \(N\) such that \(\Phi(\nu_n) \in V\) for every \(n \geqslant N\). Let \(W\) be a \(Q\)-neighborhood of \(\Phi(\nu_\infty)\), thus \(W = V \cap Q\) where \(V\) is a \(\Pr(K)\)-neighborhood of \(\Phi(\nu_\infty)\). There exists \(N\) such that \(\Phi(\nu_n) \in V\) for every \(n \geqslant N\). But \(\Phi(\nu_n) \in Q\), hence \(\Phi(\nu_n) \in W\). Thus \(\Phi(\nu_n) \to \Phi(\nu_\infty)\) in \(Q\). □

Theorem 7. The map \(\mu\mapsto\hat\mu\) defined above is a homeomorphism from \(\Pr(E)\) to the subset \(Q\) of \(\Pr(K)\). Hence \(\Pr(E)\) is metrizable.

Proof. This map is bijective. Let \((\mu_\lambda)\) be a net in \(\Pr(E)\) and \(\mu_\infty\in\Pr(E)\). We must show that the \(\Pr(E)\)-convergence \(\mu_\lambda\to\mu_\infty\) and the \(Q\)-convergence \(\widehat{\mu_\lambda}\to\widehat{\mu_\infty}\) are equivalent. By the previous lemma, this \(Q\)-convergence is equivalent to the \(\Pr(E')\)-convergence \(\Phi(\widehat{\mu_\lambda})\to\Phi(\widehat{\mu_\infty})\).

Observe that \(\nu(f) = \Phi(\hat\nu)(f\circ\iota^{-1})\) for every \(\nu \in \Pr(E)\) and all \(f\in C_b(E)\). Indeed, this is true when \(f\) is the indicator function of a Borel set of \(E\) and hence this is also true when \(f\) is a finite linear combination of such indicator functions. But every \(f \in C_b(E)\) is the dominated limit of a sequence \((f_n)\) of such linear combinations: take \[ f_n(x) = \sum_{j=1}^\infty \frac{j-1}{n} \mathbf{1}_{f^{-1}\left(\left]\frac{j-1}{n},\frac{j}{n}\right]\right)}(x). \] Therefore the equality holds for every \(f\in C_b(E)\) by dominated convergence.

Assume that \(\mu_\lambda\to\mu_\infty\) and take \(f' \in C_b(E')\). Then \(f' = f \circ \iota^{-1}\) for \(f = f'\circ\iota \in C_b(E)\), and \(\nu(f) = \Phi(\hat\nu)(f')\) for all \(\nu\in\Pr(E)\). Therefore \[ \Phi(\widehat{\mu_\lambda})(f') = \mu_\lambda(f) \to \mu_\infty(f) = \Phi(\widehat{\mu_\infty})(f'). \] That shows that \(\Phi(\widehat{\mu_\lambda}) \to \Phi(\widehat{\mu_\infty})\).

Now assume that \(\Phi(\widehat{\mu_\lambda})\to\Phi(\widehat{\mu_\infty})\) in \(\Pr(E')\). Let \(f \in C_b(E)\) and define \(f' = f \circ \iota^{-1} \in C_b(E')\). Then \[ \mu_\lambda(f) = \Phi(\widehat{\mu_\lambda})(f') \to \Phi(\widehat{\mu_\infty})(f') = \mu_\infty(f). \] Therefore \(\mu_\lambda \to \mu_\infty\). □

Theorem 8. The space \(\Pr(E)\) is a Lusin space.

Proof. The subset \(Q\) of \(\Pr(K)\) is Borel by Lemma 2. By Proposition 4, \(\Pr(K)\) is compact and metrizable. Thus the result follows from Lemma 7. □

Theorem (Prohorov). If \(\Lambda \subset \Pr(E)\) is tight, then every sequence in \(\Lambda\) has a convergent subsequence.

Proof. For every \(\epsilon>0\), let \(K_\epsilon \subset E\) be a compact subset of \(E\) such that \(\mu(K_\epsilon) > 1-\epsilon\) for all \(\mu\in\Lambda\). Let \((\mu_n)\) be a sequence in \(\Lambda\). We know by the first proposition of this section that \(\Pr(K)\) is compact metrizable, therefore the sequence \((\widehat{\mu_n})\) has a convergent subsequence \((\widehat{\mu_{n_k}})\). Denote by \(\nu \in \Pr(K)\) its limit. Let \(A_\epsilon = \iota(K_\epsilon) \subset E'\). One has \(\widehat{\mu_{n_k}}(A_\epsilon) = \mu_{n_k}(K_\epsilon) > 1-\epsilon\). By Proposition 1, one has \(\limsup_k \widehat{\mu_{n_k}}(A_\epsilon) \leqslant \nu(A_\epsilon)\). Therefore \(\nu(E') \geqslant \nu(A_\epsilon) \geqslant 1-\epsilon\). Since this is true for every \(\epsilon>0\), \(\nu(E') = 1\). Thus the sequence \((\widehat{\mu_{n_k}})\) is a sequence in \(Q\) and its limit is in \(Q\). Therefore it converges in \(Q\). By Theorem 7, the sequence \((\mu_{n_k})\) converges in \(\Pr(E)\). □

Kernels as random probabilities

Here we prove that a probability kernel on a Lusin space defines a measurable random probability.

Lemma 9. Let \(E\) be a Lusin space and \(\Omega\) be a probability space. Then a map \(\Gamma\colon\Omega\to\Pr(E)\) is measurable if and only if \(\omega\mapsto\Gamma_\omega(f)\) is measurable for every \(f\in C_b(E)\), where we denote by \(\Gamma_\omega\) the probability \(\mathcal{B}_E \ni A \mapsto \Gamma(\omega)(A)\).

Proof. In The conditional law as a random probability, we proved this result for a Polish space \(E\). But we only used the fact that \(\Pr(E)\) is strongly Lindelöf in this case. The space \(\Pr(E)\) is Lusin by Theorem 8, hence it is strongly Lindelöf. □

Theorem 10. Let \(E\) be a metric space. Let \(\Omega\) be a probability space, and \(\Gamma\colon\Omega\to\Pr(E)\) be a map. We denote by \(\Gamma_\omega\) the image of \(\omega\) by \(\Gamma\). Define the map \(K\colon\Omega\times\mathcal{B}_E \to [0,1]\) by \(K(\omega,A) = \Gamma_\omega(A)\). If \(\Gamma\) is measurable, then \(K\) is a probability kernel from \(\Omega\) to \(E\). If \(E\) is Lusin, the converse is true.

Proof. We proved the direct implication in The conditional law as a random probability. We also proved the converse for a Polish space \(E\), but to do so we used the same lemma as Lemma 9 for a Polish space \(E\). Thus the proof is similar for a Lusin space, using Lemma 9. □

References

  • [1] Alexander S. Kechris. Classical descriptive set theory. 1995.

  • [2] L. C. G. Rogers, David Williams. Diffusions, Markov Processes, and Martingales. Volume 1: Foundations. Second edition, 1994.

  • [3] N. Dunford, J. T. Schwartz. Linear Operators: Part I, General Theory. 1958.

  • [4] Patrick Billingsley. Convergence of probability measures. Second edition, 1999.