Matrix variate Beta distributions

Posted on December 12, 2017 by Stéphane Laurent
Tags: R, maths, statistics

\(\newcommand{\etr}{\textrm{etr}}\)

One asked me what are the densities of the matrix variate Beta distributions in the matrixsampling package. I’m going to give them here.

Definitions

Two definitions of the matrix variate Beta type I distribution were proposed. We will denote them by \(\mathcal{B}I_p^{(1)}(a,b,\Theta_1,\Theta_2)\) and \(\mathcal{B}I_p^{(2)}(a,b,\Theta_1,\Theta_2)\), where \(\Theta_1\) and \(\Theta_2\) are the noncentrality parameters. Take two independent Wishart random matrices \(W_1 \sim \mathcal{W}_p(2a, I_p, \Theta_1)\) and \(W_2 \sim \mathcal{W}_p(2b, I_p, \Theta_2)\). Then \(\mathcal{B}I_p^{(1)}(a,b,\Theta_1,\Theta_2)\) is the distribution of \[ U_1 = {(W_1+W_2)}^{-\frac12}W_1{(W_1+W_2)}^{-\frac12}, \] while \(\mathcal{B}I_p^{(2)}(a,b,\Theta_1,\Theta_2)\) is the distribution of \[ U_2 = W_1^\frac12{(W_1+W_2)}^{-1}W_1^\frac12. \] The condition \(a+b > \frac12(p-1)\) is required in order for \(W_1+W_2\) to be invertible.

In the central case, i.e. when both \(\Theta_1\) and \(\Theta_2\) are the null matrices, these two distributions are the same. More generally, as we will see, they are the same when \(\Theta_1\) and \(\Theta_2\) are scalar.

Similarly, two definitions of the matrix variate Beta type II distribution were proposed. We will denote them by \(\mathcal{B}II_p^{(1)}(a,b,\Theta_1,\Theta_2)\) and \(\mathcal{B}II_p^{(2)}(a,b,\Theta_1,\Theta_2)\). The first one is the distribution of \[ V_1 = W_2^{-\frac12} W_1 W_2^{-\frac12}, \] while the second one is the distribution of \[ V_2 = W_1^\frac12 {W_2}^{-1} W_1^\frac12. \] The condition \(b > \frac12(p-1)\) is required in order for \(W_2\) to be invertible.

Similarly to the type I, these two distributions are the same in the central case, and more generally when \(\Theta_1\) and \(\Theta_2\) are scalar.

Under the second definition, the Beta type I distribution is related to the Beta type II distribution by \(U_2 \sim V_2{(I_p+V_2)}^{-1}\).

Hypergeometric matrix function

The densities of the matrix Beta distributions involve the hypergeometric function of matrix argument \({}_0\!F_1\). We will use the property \({}_0\!F_1(\alpha, AB)={}_0\!F_1(\alpha, BA)\) (to simplify the densities when \(\Theta_1\) or \(\Theta_2\) are scalar).

Densities and identities

We provide the densities here. We will prove these results in the next section. Note that \(a\) and \(b\) must satisfy \(a,b > \frac12(p+1)\) in order for each distribution to have a density.

We denote by \(\etr\) the function taking the exponential of the trace of a matrix.

\(\boxed{\mathcal{B}I_p^{(1)}(a, b, \Theta_1, \Theta_2)}\)

Recall that \[ U_1 = {(W_1+W_2)}^{-\frac12} W_1 {(W_1+W_2)}^{-\frac12}. \] It is clear from this definition that \[ I_p - U_1 \sim \mathcal{B}I_p^{(1)}(b, a, \Theta_2, \Theta_1). \] The density of \(U_1\) is \[ \begin{split} & \mathcal{B}I_p^{(1)}(U \mid a, b, \Theta_1, \Theta_2) \propto {\det(U)}^{a-\frac12(p+1)} {\det(I_p-U)}^{b-\frac12(p+1)} \\ & \quad \int_{S>0} \etr(-S) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1S^{\frac12} U S^\frac12\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2S^{\frac12}(I_p-U)S^\frac12\right) \mathrm{d}S \end{split} \] for \(0 < U < I_p\).

If \(\Theta_1\) and \(\Theta_2\) are scalar, \[ \begin{split} & \mathcal{B}I_p^{(1)}(U \mid a, b, \Theta_1, \Theta_2) \propto {\det(U)}^{a-\frac12(p+1)} {\det(I_p-U)}^{b-\frac12(p+1)} \\ & \quad \int_{S>0} \etr(-S) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1SU\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2S(I_p-U)\right) \mathrm{d}S. \end{split} \]

\(\boxed{\mathcal{B}I_p^{(2)}(a, b, \Theta_1, \Theta_2)}\)

Recall that \[ U_2 = W_1^\frac12{(W_1+W_2)}^{-1}W_1^\frac12. \]

The density of \(U_2\) is \[ \begin{split} & \mathcal{B}I_p^{(2)}(U \mid a, b, \Theta_1, \Theta_2) \propto {\det(U)}^{-b-\frac12(p+1)} {\det(I_p-U)}^{b-\frac12(p+1)} \\ & \quad \int_{S>0} \etr(-U^{-1}S) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1S\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2 S^\frac12 U^{-1}(I_p-U)S^{\frac12}\right)\mathrm{d}S \end{split} \] for \(0 < U < I_p\).

If \(\Theta_1\) and \(\Theta_2\) are scalar, it is equal to the density of \(\mathcal{B}I_p^{(1)}(a, b, \Theta_1, \Theta_2)\).

\(\boxed{\mathcal{B}II_p^{(1)}(a,b,\Theta_1,\Theta_2)}\)

More generally, we give the density of \[ V_1 = {(W_2^{-\frac12})}' W_1 W_2^{-\frac12} \] where \(W_2^{\frac12}{(W_2^{\frac12})}' = W_2\).

This density of \(V_1\) is \[ \begin{split} & \mathcal{B}II_p^{(1)}(V \mid a, b, \Theta_1, \Theta_2) \propto {\det(V)}^{a-\frac12(p+1)} \\ & \quad \int_{S>0} \etr\bigl(-(I_p+V)S\bigr) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1{(S^{\frac12})}' V S^{\frac12}\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2S\right) \mathrm{d}S \end{split} \] for \(V>0\).

If \(\Theta_1\) is scalar, the distribution does not depend on the choice of \(W_2^\frac12\).

\(\boxed{\mathcal{B}II_p^{(2)}(a,b,\Theta_1,\Theta_2)}\)

More generally, we give the density of \[ V_2 = W_1^{\frac12} W_2^{-1} {(W_1^{\frac12})}' \] where \(W_1^{\frac12}{(W_1^{\frac12})}' = W_1\).

Assuming \(a > \frac12(p-1)\), it is clear from the definitions that \[ V_2^{-1} \sim \mathcal{B}II_p^{(1)}(b,a,\Theta_2,\Theta_1). \]

The density of \(V_2\) is \[ \begin{split} & \mathcal{B}II_p^{(2)}(V \mid a, b, \Theta_1, \Theta_2) \propto {\det(V)}^{-b-\frac12(p+1)} \\ & \quad \int_{S >0} \etr\bigl(-(I_p+V^{-1})S\bigr) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1S\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2{(S^\frac12)}' V^{-1} S^\frac12\right) \mathrm{d}S. \end{split} \] for \(V >0\).

If \(\Theta_2\) is scalar, the distribution does not depend on the choice of \(W_1^\frac12\).

If \(\Theta_1\) and \(\Theta_2\) are scalar, this is the same distribution as \(\mathcal{B}II_p^{(1)}(a,b,\Theta_1,\Theta_2)\).

If we take \(W_1^{\frac12}\) the symmetric square root of \(W_1\), then \(V_2{(I_p+V_2)}^{-1} \sim \mathcal{B}I_p^{(2)}(a,b,\Theta_1,\Theta_2)\).

Proofs

To derive the densities, we start with the joint density of \(W_1\) and \(W_2\) which is \[ C \, \etr\left(-\frac12 W_1\right)\etr\left(-\frac12 W_2\right) {\det(W_1)}^{a-\frac12(p+1)} {\det(W_2)}^{b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{4}\Theta_1W_1\right) {}_0\!F_1\left(b, \frac{1}{4}\Theta_2W_2\right) \] where \(C\) is a constant.

In our following calculations, we always denote by \(C\) a constant not depending of the variables. The value of \(C\) is relative to each expression in which it is contained: two \(C\)’s written at two different places have not the same value.

\(\boxed{\mathcal{B}I_p^{(1)}(a,b,\Theta_1,\Theta_2)}\)

Recall that \(\mathcal{B}I_p^{(1)}(a,b,\Theta_1,\Theta_2)\) is the distribution of \[ U = {(W_1+W_2)}^{-\frac12}W_1{(W_1+W_2)}^{-\frac12}. \] By applying the transformation \(W_1+W_2=S\) and \(W_1 = S^{\frac12} U S^\frac12\), whose Jacobian is \(J(W_1, W_2 \rightarrow U, S) = {\det(S)}^{\frac12(p+1)}\), we get the pdf of \((U,S)\) as \[ \begin{aligned} C\, & {\det(U)}^{a-\frac12(p+1)} {\det(I_p-U)}^{b-\frac12(p+1)} \\ & \etr\left(-\frac12 S\right) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{4}\Theta_1S^{\frac12} U S^\frac12\right) {}_0\!F_1\left(b, \frac{1}{4}\Theta_2S^{\frac12}(I_p-U) S^\frac12\right). \end{aligned} \] Thus the density of \(U\) is \[ \begin{aligned} C\, & {\det(U)}^{a-\frac12(p+1)} {\det(I_p-U)}^{b-\frac12(p+1)} \\ & \int_{S>0} \etr\left(-\frac12 S\right) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{4}\Theta_1S^{\frac12}U{(S^\frac12)}'\right) {}_0\!F_1\left(b, \frac{1}{4}\Theta_2S^{\frac12}(I_p-U)S^\frac12\right) \mathrm{d}S \\ = C\,& {\det(U)}^{a-\frac12(p+1)} {\det(I_p-U)}^{b-\frac12(p+1)} \\ & \int_{S>0} \etr(-S) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1S^{\frac12} U S^\frac12\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2S^{\frac12}(I_p-U)S^\frac12\right) \mathrm{d}S. \end{aligned} \]

\(\boxed{\mathcal{B}I_p^{(2)}(a,b,\Theta_1,\Theta_2)}\)

Recall that \(\mathcal{B}I_p^{(2)}(a,b,\Theta_1,\Theta_2)\) is the distribution of \[ U = W_1^\frac12{(W_1+W_2)}^{-1}W_1^\frac12. \]

We use the transformation \(W_1+W_2 = W_1^{\frac12}U^{-1}W_1^{\frac12}\). The Jacobian of this transformation is \(J(W_1, W_2 \rightarrow U, W_1) = {\det(W_1)}^{\frac12(p+1)} {\det(U)}^{-(p+1)}\), and we get the pdf of \((U,W_1)\) as \[ \begin{aligned} C \, & {\det(W_1)}^{a+b-\frac12(p+1)} {\det(U)}^{-(p+1)} {\det(U^{-1}-I_p)}^{b-\frac12(p+1)} \\ & \etr\left(-\frac12 W_1U^{-1}\right) {}_0\!F_1\left(a, \frac{1}{4}\Theta_1W_1\right) {}_0\!F_1\left(b, \frac{1}{4}\Theta_2W_1^{\frac12}(U^{-1}-I_p)W_1^\frac12\right) \\ = C \, & {\det(W_1)}^{a+b-\frac12(p+1)} {\det(U)}^{-b-\frac12(p+1)} {\det(I_p-U)}^{b-\frac12(p+1)} \\ & \etr\left(-\frac12 W_1U^{-1}\right) {}_0\!F_1\left(a, \frac{1}{4}\Theta_1W_1\right) {}_0\!F_1\left(b, \frac{1}{4}\Theta_2W_1^{\frac12}U^{-1}(I_p-U)W_1^\frac12\right). \end{aligned} \] Thus the density of \(U\) is \[ \begin{aligned} C\, & {\det(U)}^{-b-\frac12(p+1)} {\det(I_p-U)}^{b-\frac12(p+1)} \\ & \int_{W_1>0} \etr\left(-\frac12 W_1U^{-1}\right) {\det(W_1)}^{a+b-\frac12(p+1)} \\ & \qquad {}_0\!F_1\left(a, \frac{1}{4}\Theta_1W_1\right) {}_0\!F_1\left(b, \frac{1}{4}\Theta_2W_1^{\frac12}U^{-1}(I_p-U)W_1^\frac12\right) \mathrm{d}W_1\\ = C\, & {\det\bigl(U{(I_p-U)}^{-1}\bigr)}^{-b-\frac12(p+1)} {\det(I_p-U)}^{-(p+1)} \\ & \int_{S>0} \etr(- U^{-1}S) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1S\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2S^{\frac12}U^{-1}(I_p-U)S^\frac12\right) \mathrm{d}S. \end{aligned} \] Let’s derive the density of \(U{(I_p-U)}^{-1}\). Using the transformation \(U = V{(I_p+V)}^{-1}\) with Jacobian \(J(U \rightarrow V) = {\det(I_p+V)}^{-(p+1)}\), we get the density of \(V\) as \[ \begin{aligned} C\, & {\det(V)}^{-b-\frac12(p+1)} \\ & \int_{S>0} \etr\bigl(-(I_p+V^{-1})S\bigr) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1S\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2S^{\frac12}V^{-1}S^\frac12\right) \mathrm{d}S. \end{aligned} \] This is the density of \(\mathcal{B}II_p^{(2)}(a,b,\Theta_1,\Theta_2)\).

\(\boxed{\mathcal{B}II_p^{(1)}(a,b,\Theta_1,\Theta_2)}\)

Recall that \(\mathcal{B}II_p^{(1)}(a,b,\Theta_1,\Theta_2)\) is the distribution of \[ V = {(W_2^{-\frac12})}' W_1 W_2^{-\frac12}. \]

Transforming \(W_1 = {(W_2^{\frac12})}' V W_2^{\frac12}\) with Jacobian \(J(W_1, W_2 \rightarrow V, W_2) = {\det(W_2)}^{\frac12(p+1)}\), we get the density of \((V,W_2)\) as \[ \begin{aligned} C\, & {\det(V)}^{a-\frac12(p+1)} {\det(W_2)}^{a+b-\frac12(p+1)} \\ & \etr\left(-\frac12 (I_p+V)W_2\right) {}_0\!F_1\left(a, \frac{1}{4}\Theta_1 {(W_2^{\frac12})}' V W_2^{\frac12}\right) {}_0\!F_1\left(b, \frac{1}{4}\Theta_2W_2\right). \end{aligned} \] Thus, the density of \(V\) is \[ \begin{aligned} C\, & {\det(V)}^{a-\frac12(p+1)} \\ & \int_{S>0} \etr\bigl(-(I_p+V)S\bigr) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1{(S^{\frac12})}' V S^{\frac12}\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2S\right) \mathrm{d}S. \end{aligned} \]

\(\boxed{\mathcal{B}II_p^{(2)}(a,b,\Theta_1,\Theta_2)}\)

Recall that \(\mathcal{B}II_p^{(2)}(a,b,\Theta_1,\Theta_2)\) is the distribution of \[ V = W_1^{\frac12} W_2^{-1} {(W_1^{\frac12})}'. \]

We apply the transformation \(W_2 = {(W_1^\frac12)}' V^{-1} W_1^\frac12\). The Jacobian of this transformation is \(J(W_1, W_2 \rightarrow V, W_1) = {\det(W_1)}^{\frac12(p+1)}{\det(V)}^{-(p+1)}\). We get the density of \((V,W_1)\) as \[ \begin{aligned} C\, & {\det(W_1)}^{a+b-\frac12(p+1)} {\det(V)}^{-b-\frac12(p+1)} \\ & \etr\left(-\frac12 W_1\right)\etr\left(-\frac12 W_1V^{-1}\right) {}_0\!F_1\left(a, \frac{1}{4}\Theta_1W_1\right) {}_0\!F_1\left(b, \frac{1}{4}\Theta_2{(W_1^\frac12)}' V^{-1} W_1^\frac12\right). \end{aligned} \] Thus the density of \(V\) is \[ \begin{aligned} C\, & {\det(V)}^{-b-\frac12(p+1)} \\ & \int_{S >0} \etr\bigl(-(I_p+V^{-1})S\bigr) {\det(S)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1S\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2{(S^\frac12)}' V^{-1} S^\frac12\right) \mathrm{d}S. \end{aligned} \] Now assume \(\Theta_2\) is scalar. Then one has \[ {}_0\!F_1\left(b, \frac{1}{2}\Theta_2{(S^\frac12)}' V^{-1} S^\frac12\right) = {}_0\!F_1\left(b, \frac{1}{2}\Theta_2 S V^{-1}\right). \] Doing the change of variables \(R = S V^{-1}\) in the integral, we get the density of \(V\) as \[ \begin{aligned} C\, & {\det(V)}^{a-\frac12(p+1)} \\ & \int_{R >0} \etr\bigl(-(I_p+V)R\bigr) {\det(R)}^{a+b-\frac12(p+1)} {}_0\!F_1\left(a, \frac{1}{2}\Theta_1 R V\right) {}_0\!F_1\left(b, \frac{1}{2}\Theta_2 R\right) \mathrm{d}R. \end{aligned} \] If in addition, \(\Theta_1\) is scalar, this is the density of \(\mathcal{B}II_p^{(1)}(a,b,\Theta_1,\Theta_2)\).

Simplifications in the singly noncentral case

One has \({}_0\!F_1(\alpha, \mathbf{0}) = 1\). Thus, when \(\Theta_1\) or \(\Theta_2\) is the null matrix, we get integrals like \[ \int_{S>0} \etr(-ZS) {\det(S)}^{\alpha-\frac12(p+1)} {}_0\!F_1\left(\beta, \frac{1}{2}\Theta S T \right) \mathrm{d}S, \] for example in the density of \(\mathcal{B}I_p^{(2)}(a,b,\Theta_1,\Theta_2)\) when \(\Theta_2\) is the null matrix, or in the density of \(\mathcal{B}I_p^{(1)}(a,b,\Theta_1,\Theta_2)\) when \(\Theta_1\) is the null matrix and \(\Theta_2\) is scalar.

This integral is equal to \[ \Gamma_p(\alpha){\det(Z)}^{-\alpha} {}_1\!F_1\left(\alpha, \beta, \frac{1}{2}\Theta Z^{-1} T \right). \]