Overview
This article explains the beta distribution, one of the continuous probability distributions.
Probability density function
A random variable $X$ is said to follow the beta distribution with parameters $\alpha, \beta$, written $X \sim Beta(\alpha, \beta)$, when it has the following probability density function.
$$
f_X(x) =
\begin{cases}
\frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha, \beta)} & 0 \le x \le 1 \\
0 & \text{otherwise}
\end{cases}
$$
Here $\Beta(\alpha, \beta)$ is the beta function, and $\alpha > 0, \beta > 0$.
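As a quick check of the definition, the density can be written out directly and compared with `scipy.stats.beta.pdf`. This is only a minimal sketch: the helper name `beta_pdf` and the parameter values are illustrative assumptions, not part of the original article.

```python
import numpy as np
from scipy import special, stats


def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta), written directly from the definition."""
    x = np.asarray(x, dtype=float)
    inside = (x >= 0.0) & (x <= 1.0)
    # Clip so the power terms are never evaluated outside the support.
    xc = np.clip(x, 0.0, 1.0)
    dens = xc ** (alpha - 1) * (1 - xc) ** (beta - 1) / special.beta(alpha, beta)
    # The density is 0 outside [0, 1].
    return np.where(inside, dens, 0.0)


alpha, beta = 2.0, 3.0  # illustrative parameter values (assumption)
xs = np.array([-0.5, 0.1, 0.5, 0.9, 1.5])
print(beta_pdf(xs, alpha, beta))
print(stats.beta.pdf(xs, alpha, beta))  # should agree with the line above
```

Both printed rows should match, including the zeros outside $[0, 1]$.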
Verifying that it is a probability density function
From
$$
\int_0^1 x^{\alpha - 1} (1 - x)^{\beta - 1} dx
= \frac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha + \beta)}
$$
and
$$
\frac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha + \beta)} = \Beta(\alpha, \beta)
$$
it follows that
$$
\int_0^1 \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha, \beta)} dx
= 1
$$
Since $f_X(x) \ge 0$ also holds, $f_X$ is a probability density function.
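The normalization can also be checked numerically. The sketch below integrates the unnormalized kernel with `scipy.integrate.quad` and divides by `scipy.special.beta`; the function name `kernel` and the parameter pairs are assumptions chosen for illustration.

```python
from scipy import integrate, special


def kernel(x, alpha, beta):
    """Unnormalized beta density x^(alpha-1) * (1-x)^(beta-1)."""
    return x ** (alpha - 1) * (1 - x) ** (beta - 1)


ratios = []
for alpha, beta in [(2.0, 3.0), (5.0, 1.0), (2.31, 0.627)]:
    # Numerical value of the integral over [0, 1].
    area, _ = integrate.quad(kernel, 0.0, 1.0, args=(alpha, beta))
    # Dividing by B(alpha, beta) should give approximately 1.
    ratios.append(area / special.beta(alpha, beta))
    print(alpha, beta, ratios[-1])
```

Each ratio should be 1 up to the accuracy of the numerical integration.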
Cumulative distribution function
For $0 \le x \le 1$,
$$
\begin{aligned}
P(X \le x)
&= \int_0^x \frac{t^{\alpha - 1} (1 - t)^{\beta - 1}}{\Beta(\alpha, \beta)} dt \\
&= \frac{\Beta_x(\alpha, \beta)}{\Beta(\alpha, \beta)} \\
&= I_x(\alpha, \beta)
\end{aligned}
$$
Here $\Beta_x(\alpha, \beta)$ is the incomplete beta function and $I_x(\alpha, \beta)$ is the regularized incomplete beta function.
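The three expressions for the cumulative distribution function can be compared numerically. In SciPy, `scipy.special.betainc(a, b, x)` is the regularized incomplete beta function $I_x(a, b)$. The parameter values and the evaluation point below are illustrative assumptions.

```python
from scipy import integrate, special, stats

alpha, beta = 2.0, 3.0  # illustrative parameters (assumption)
x = 0.4                 # illustrative evaluation point (assumption)

# Incomplete beta function B_x(alpha, beta) by direct numerical integration.
bx, _ = integrate.quad(lambda t: t ** (alpha - 1) * (1 - t) ** (beta - 1), 0.0, x)
cdf_from_definition = bx / special.beta(alpha, beta)

# Regularized incomplete beta function I_x(alpha, beta).
cdf_regularized = special.betainc(alpha, beta, x)

# CDF as provided by scipy.stats.
cdf_scipy = stats.beta.cdf(x, alpha, beta)

print(cdf_from_definition, cdf_regularized, cdf_scipy)
```

For these values the integral can also be done by hand, $12 \int_0^{0.4} t (1 - t)^2 dt = 0.5248$, which is what all three numbers should show.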
Expected value
$$
\begin{aligned}
E[X]
&= \int_0^1 x \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha, \beta)} dx \\
&= \int_0^1 \frac{x^\alpha (1 - x)^{\beta - 1}}{\Beta(\alpha, \beta)} dx
\end{aligned}
$$
Setting $\alpha' = \alpha + 1$, this becomes
$$
\int_0^1 \frac{x^{\alpha' - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha' - 1, \beta)} dx
$$
Using the following property of the beta function,
$$
\Beta(\alpha, \beta) = \frac{\alpha + \beta}{\alpha} \Beta(\alpha + 1, \beta)
$$
we obtain
$$
\begin{aligned}
& \int_0^1 \frac{x^{\alpha' - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha' - 1, \beta)} dx \\
&= \frac{\alpha' - 1}{\alpha' + \beta - 1}
\int_0^1 \frac{x^{\alpha' - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha', \beta)} dx \\
&= \frac{\alpha' - 1}{\alpha' + \beta - 1} \quad \because \text{the second factor is the integral of the beta density with parameters } \alpha', \beta \text{, which is 1} \\
&= \frac{\alpha}{\alpha + \beta} \quad \because \alpha' = \alpha + 1
\end{aligned}
$$
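The closed form $\alpha / (\alpha + \beta)$ can be compared against a direct numerical evaluation of $\int_0^1 x f_X(x) dx$. This is a sketch only; the helper name `beta_mean_numeric` and the parameter pairs are assumptions for illustration.

```python
from scipy import integrate, stats


def beta_mean_numeric(alpha, beta):
    """E[X] computed as the integral of x * f_X(x) over [0, 1]."""
    value, _ = integrate.quad(
        lambda x: x * stats.beta.pdf(x, alpha, beta), 0.0, 1.0
    )
    return value


params = [(2.0, 3.0), (5.0, 1.0), (2.31, 0.627)]
for alpha, beta in params:
    # Numerical integral next to the closed form alpha / (alpha + beta).
    print(alpha, beta, beta_mean_numeric(alpha, beta), alpha / (alpha + beta))
```

The last two columns should agree up to the integration error.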
Variance
First compute the second moment.
$$
\begin{aligned}
E[X^2]
&= \int_0^1 x^2 \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha, \beta)} dx \\
&= \int_0^1 \frac{x^{\alpha + 1} (1 - x)^{\beta - 1}}{\Beta(\alpha, \beta)} dx
\end{aligned}
$$
Setting $\alpha' - 1 = \alpha + 1$, this becomes
$$
\int_0^1 \frac{x^{\alpha' - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha' - 2, \beta)} dx
$$
Applying the property of the beta function twice gives
$$
\Beta(\alpha - 2, \beta)
= \frac{\alpha + \beta - 2}{\alpha - 2} \Beta(\alpha - 1, \beta)
= \frac{\alpha + \beta - 2}{\alpha - 2} \frac{\alpha + \beta - 1}{\alpha - 1} \Beta(\alpha, \beta)
$$
and therefore
$$
\begin{aligned}
& \int_0^1 \frac{x^{\alpha' - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha' - 2, \beta)} dx \\
&= \frac{\alpha' - 2}{\alpha' + \beta - 2} \frac{\alpha' - 1}{\alpha' + \beta - 1}
\int_0^1 \frac{x^{\alpha' - 1} (1 - x)^{\beta - 1}}{\Beta(\alpha', \beta)} dx \\
&= \frac{\alpha' - 2}{\alpha' + \beta - 2} \frac{\alpha' - 1}{\alpha' + \beta - 1} \quad \because \text{the second factor is the integral of the beta density with parameters } \alpha', \beta \text{, which is 1} \\
&= \frac{\alpha(\alpha + 1)}{(\alpha + \beta)(\alpha + \beta + 1)} \quad \because \alpha' - 1 = \alpha + 1
\end{aligned}
$$
Hence,
$$
\begin{aligned}
Var[X]
&= E[X^2] - (E[X])^2 \\
&= \frac{\alpha(\alpha + 1)}{(\alpha + \beta)(\alpha + \beta + 1)}
- \left(\frac{\alpha}{\alpha + \beta}\right)^2 \\
&= \frac{\alpha(\alpha + 1)(\alpha + \beta) - \alpha^2(\alpha + \beta + 1)}
{(\alpha + \beta)^2(\alpha + \beta + 1)} \\
&= \frac{\alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}
\end{aligned}
$$
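The second moment in the derivation above is, by the definition of the beta function, the ratio $\Beta(\alpha + 2, \beta) / \Beta(\alpha, \beta)$, so both it and the variance can be checked without any numerical integration. The helper names and parameter values in this sketch are assumptions for illustration.

```python
from scipy import special, stats


def beta_second_moment(alpha, beta):
    """E[X^2] as the ratio B(alpha + 2, beta) / B(alpha, beta)."""
    return special.beta(alpha + 2, beta) / special.beta(alpha, beta)


def beta_variance(alpha, beta):
    """Closed-form variance alpha*beta / ((alpha+beta)^2 (alpha+beta+1))."""
    return alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))


alpha, beta = 2.0, 3.0  # illustrative parameters (assumption)
mean = alpha / (alpha + beta)
second = beta_second_moment(alpha, beta)

# Second moment: beta-function ratio vs. the closed form from the derivation.
print(second, alpha * (alpha + 1) / ((alpha + beta) * (alpha + beta + 1)))
# Variance: E[X^2] - (E[X])^2 vs. the closed form vs. scipy.stats.
print(second - mean ** 2, beta_variance(alpha, beta), stats.beta.var(alpha, beta))
```

For $\alpha = 2, \beta = 3$ the second moment is $0.2$ and the variance is $0.04$, up to floating-point rounding.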
Standard deviation
$$
Std[X] = \sqrt{Var[X]}
= \left(\frac{\alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}\right)^{\frac{1}{2}}
$$
Moment generating function
$$
\begin{aligned}
m_X(t)
&= E[e^{tX}] \\
&= \int_0^1 e^{tx} f_X(x) dx \\
&= \frac{1}{\Beta(\alpha, \beta)} \int_0^1
\left( \sum_{k = 0}^\infty \frac{(tx)^k}{k!} \right)
x^{\alpha - 1} (1 - x)^{\beta - 1} dx
\quad \because \text{Taylor expansion of } e^{tx} \\
&= \sum_{k = 0}^\infty \frac{t^k}{k!}
\frac{1}{\Beta(\alpha, \beta)} \int_0^1
x^{\alpha + k - 1} (1 - x)^{\beta - 1} dx
\quad \because \text{the series and the integral can be interchanged within the radius of convergence} \\
&= \sum_{k = 0}^\infty \frac{t^k}{k!}
\frac{\Beta(\alpha + k, \beta)}{\Beta(\alpha, \beta)}
\quad \because \text{definition of the beta function} \\
&= 1 + \sum_{k = 1}^\infty \frac{t^k}{k!}
\frac{\Beta(\alpha + k, \beta)}{\Beta(\alpha, \beta)} \\
&= 1 + \sum_{k = 1}^\infty \frac{t^k}{k!}
\frac{\Gamma(\alpha + k) \Gamma(\beta)}{\Gamma(\alpha + \beta + k)}
\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)}
\quad \because \text{relation between the beta and gamma functions} \\
&= 1 + \sum_{k = 1}^\infty \frac{t^k}{k!}
\frac{\Gamma(\alpha + k)}{\Gamma(\alpha)}
\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha + \beta + k)} \\
&= 1 + \sum_{k = 1}^\infty \frac{t^k}{k!}
\frac{\Gamma(\alpha) \prod_{r = 0}^{k - 1} (\alpha + r)}{\Gamma(\alpha)}
\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha + \beta) \prod_{r = 0}^{k - 1} (\alpha + \beta + r)}
\quad \because \text{property of the gamma function, } \Gamma(z + 1) = z \Gamma(z) \\
&= 1 + \sum_{k = 1}^\infty
\prod_{r = 0}^{k - 1} \left( \frac{\alpha + r}{\alpha + \beta + r} \right) \frac{t^k}{k!}
\end{aligned}
$$
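The final series can be evaluated with a simple recurrence, since the $k$-th term equals the $(k-1)$-th term times $\frac{\alpha + k - 1}{\alpha + \beta + k - 1} \cdot \frac{t}{k}$. The sketch below compares a truncated series with a direct numerical evaluation of $E[e^{tX}]$. The function names, the truncation at 60 terms, and the parameter values are assumptions for illustration.

```python
import numpy as np
from scipy import integrate, stats


def beta_mgf_series(t, alpha, beta, terms=60):
    """Truncated series 1 + sum_k prod_{r<k} (alpha+r)/(alpha+beta+r) * t^k / k!."""
    total = 1.0
    term = 1.0  # holds the k-th term of the series; starts at the k = 0 term
    for k in range(1, terms + 1):
        # Multiply in the new product factor (r = k - 1) and t / k.
        term *= (alpha + k - 1) / (alpha + beta + k - 1) * t / k
        total += term
    return total


def beta_mgf_numeric(t, alpha, beta):
    """E[exp(tX)] by numerical integration against the density."""
    value, _ = integrate.quad(
        lambda x: np.exp(t * x) * stats.beta.pdf(x, alpha, beta), 0.0, 1.0
    )
    return value


alpha, beta, t = 2.0, 3.0, 1.5  # illustrative values (assumption)
print(beta_mgf_series(t, alpha, beta), beta_mgf_numeric(t, alpha, beta))
```

The two printed values should agree closely, and at $t = 0$ the series reduces to 1.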
The beta distribution in scipy.stats
scipy.stats.beta creates a random variable that follows the beta distribution.
Sampling
[0.99813457 0.5202959 0.95723974 0.92700918 0.8402795 ]
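The code that produced the five draws above is not shown, so the following is only a sketch of how such a sample could be drawn. The parameters `a = 2.31`, `b = 0.627` are an assumption, chosen because they are consistent with the mean and variance printed in the statistics section. The seed is also an assumption, so the numbers will not match the output above.

```python
from scipy import stats

a, b = 2.31, 0.627  # assumed parameters (not stated in the original)
rv = stats.beta(a, b)  # "frozen" random variable with fixed parameters

# Draw 5 samples; random_state is fixed only to make the sketch repeatable.
samples = rv.rvs(size=5, random_state=0)
print(samples)
```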
Probability density function
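The original plotting code is not shown here. As a minimal sketch, the density of a frozen `scipy.stats.beta` variable can be evaluated on a grid with `pdf`; the resulting arrays can then be passed to any plotting library. The parameters `a = 2.31`, `b = 0.627` and the grid size are assumptions.

```python
import numpy as np
from scipy import stats

a, b = 2.31, 0.627  # assumed parameters (not stated in the original)
rv = stats.beta(a, b)

# Grid between the 1% and 99% quantiles, avoiding the endpoints of [0, 1].
x = np.linspace(rv.ppf(0.01), rv.ppf(0.99), 5)
density = rv.pdf(x)

for xi, di in zip(x, density):
    print(f"x = {xi:.4f}, pdf = {di:.4f}")
```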
Drawing the density function interactively with ipywidgets
Statistics
mean 0.7865168539325842
var 0.04264874077027537
std 0.20651571555277667
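The code behind these numbers is not shown. The sketch below assumes the parameters `a = 2.31`, `b = 0.627`, for which the closed forms $\alpha / (\alpha + \beta)$ and $\alpha \beta / ((\alpha + \beta)^2 (\alpha + \beta + 1))$ agree with the printed mean and variance. Treat the parameter choice as an inference, not something stated by the article.

```python
from scipy import stats

a, b = 2.31, 0.627  # assumed parameters, inferred from the printed statistics
rv = stats.beta(a, b)

# Mean, variance and standard deviation of the frozen distribution.
print("mean", rv.mean())
print("var", rv.var())
print("std", rv.std())

# Closed forms from the derivations, for comparison.
print("mean (formula)", a / (a + b))
print("var (formula)", a * b / ((a + b) ** 2 * (a + b + 1)))
```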