# Unbiased estimator of the population variance

Let $\epsilon_i=x_i-\mu$, where $\mu$ and $\sigma^2$ are the population mean and variance of $x_i$, and assume that $\epsilon_i$ and $\epsilon_j$ are independent when $i\neq j$. Then $\mathrm{E}\left(\epsilon_i\right)=0$ and $\mathrm{V}\left(\epsilon_i\right)=\sigma^2=\mathrm{E}\left(\epsilon_i^2\right)$. Setting $\bar{\epsilon}=\dfrac{1}{n}\displaystyle\sum_{j=1}^n\epsilon_j$ gives $\bar{x}=\bar{\epsilon}+\mu$, so

$\begin{eqnarray}\mathrm{E}\left(\left(x_i-\bar{x}\right)^2\right)&=&\mathrm{E}\left(\left(\left(\epsilon_i+\mu\right)-\left(\bar{\epsilon}+\mu\right)\right)^2\right)\\
&=&\mathrm{E}\left(\left(\epsilon_i-\bar{\epsilon}\right)^2\right)\\
&=&\mathrm{E}\left(\left(\epsilon_i-\dfrac{1}{n}\displaystyle\sum_{j=1}^n\epsilon_j\right)^2\right)\\
&=&\mathrm{E}\left(\left(\epsilon_i-\left(\dfrac{1}{n}\epsilon_i+\dfrac{1}{n}\displaystyle\sum_{j=1,\,j\neq i}^n\epsilon_j\right)\right)^2\right)\\
&=&\mathrm{E}\left(\left(\epsilon_i-\dfrac{1}{n}\epsilon_i-\dfrac{1}{n}\displaystyle\sum_{j=1,\,j\neq i}^n\epsilon_j\right)^2\right)\\
&=&\mathrm{E}\left(\left(\dfrac{n-1}{n}\epsilon_i-\dfrac{1}{n}\displaystyle\sum_{j=1,\,j\neq i}^n\epsilon_j\right)^2\right)\\
&=&\mathrm{E}\left(\left(\dfrac{n-1}{n}\epsilon_i\right)^2+\left(\dfrac{1}{n}\displaystyle\sum_{j=1,\,j\neq i}^n\epsilon_j\right)^2-2\left(\dfrac{n-1}{n}\epsilon_i\right)\left(\dfrac{1}{n}\displaystyle\sum_{j=1,\,j\neq i}^n\epsilon_j\right)\right)
\end{eqnarray}$

Expanding the last term on the right-hand side produces products $\epsilon_i\epsilon_j$ with $j\neq i$. By independence and $\mathrm{E}\left(\epsilon_j\right)=0$, we have $\mathrm{E}\left(\epsilon_i\epsilon_j\right)=\mathrm{E}\left(\epsilon_i\right)\mathrm{E}\left(\epsilon_j\right)=0$, so the expectation of the cross term vanishes:

$\mathrm{E}\left(-2\left(\dfrac{n-1}{n}\epsilon_i\right)\left(\dfrac{1}{n}\displaystyle\sum_{j=1,\,j\neq i}^n\epsilon_j\right)\right)=-\dfrac{2\left(n-1\right)}{n^2}\displaystyle\sum_{j=1,\,j\neq i}^n\mathrm{E}\left(\epsilon_i\epsilon_j\right)=0$

For the same reason, the products $\epsilon_j\epsilon_k$ with $j\neq k$ inside the squared sum have zero expectation, which leaves only the terms $\mathrm{E}\left(\epsilon_j^2\right)$. Hence

$\begin{eqnarray}\mathrm{E}\left(\left(x_i-\bar{x}\right)^2\right)&=&\mathrm{E}\left(\left(\dfrac{n-1}{n}\epsilon_i\right)^2\right)+\mathrm{E}\left(\left(\dfrac{1}{n}\displaystyle\sum_{j=1,\,j\neq i}^n\epsilon_j\right)^2\right)\\
&=&\left(\dfrac{n-1}{n}\right)^2\mathrm{E}\left(\epsilon_i^2\right)+\dfrac{1}{n^2}\displaystyle\sum_{j=1,\,j\neq i}^n\mathrm{E}\left(\epsilon_j^2\right)\\
&=&\left(\dfrac{n-1}{n}\right)^2\sigma^2+\dfrac{1}{n^2}\left(n-1\right)\sigma^2\\
&=&\left(\left(\dfrac{n-1}{n}\right)^2+\dfrac{1}{n^2}\left(n-1\right)\right)\sigma^2\\
&=&\left(\dfrac{n^2-2n+1+n-1}{n^2}\right)\sigma^2\\
&=&\left(\dfrac{n^2-n}{n^2}\right)\sigma^2\\
&=&\dfrac{n-1}{n}\sigma^2\end{eqnarray}$
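
The identity $\mathrm{E}\left(\left(x_i-\bar{x}\right)^2\right)=\dfrac{n-1}{n}\sigma^2$ can also be checked numerically. The following is a minimal sketch, not part of the original derivation: the values of `sigma`, `n`, and `reps`, the seed, and the choice of a normal distribution are all assumptions made for illustration.

```r
# Monte Carlo check of E((x_i - xbar)^2) = (n - 1) / n * sigma^2.
# sigma, n, reps, and the seed are illustrative assumptions.
set.seed(1)
sigma <- 3
n <- 20
reps <- 100000

# Each column of `samples` is one independent sample of size n from N(0, sigma^2).
samples <- matrix(rnorm(n * reps, mean = 0, sd = sigma), nrow = n)

# Squared deviation of the first observation from the mean of its own sample.
dev_sq <- (samples[1, ] - colMeans(samples))^2

empirical <- mean(dev_sq)              # Monte Carlo estimate of the expectation
theoretical <- (n - 1) / n * sigma^2   # value given by the derivation

c(empirical = empirical, theoretical = theoretical)
```

With these settings the theoretical value is $\dfrac{19}{20}\times 9=8.55$, and the empirical mean should land close to it, up to Monte Carlo error.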

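Therefore, $\dfrac{1}{n-1}\mathrm{E}\left(\left(x_i-\bar{x}\right)^2\right)=\dfrac{1}{n}\sigma^2=\dfrac{1}{n}\mathrm{E}\left(\left(x_i-\mu\right)^2\right)$. Summing over $i=1,\dots,n$ gives $\mathrm{E}\left(\dfrac{1}{n-1}\displaystyle\sum_{i=1}^n\left(x_i-\bar{x}\right)^2\right)=\sigma^2$, so the sample variance with denominator $n-1$ is an unbiased estimator of the population variance, whereas the version with denominator $n$ has expectation $\dfrac{n-1}{n}\sigma^2$.

We now confirm the unbiased estimator of the population variance (the unbiased variance) by simulation.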

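We generate a population ( N=10000 ) from $\mathrm{N}(0,3^2)$, that is, mean 0 and standard deviation 3, and draw 1000 samples ( n=20 ) from it with replacement. For each sample we compute the "unbiased variance (denominator n-1 )" and the "variance with denominator n ", and store them in the vectors v1 and v2 respectively.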

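set.seed(20240630)
v1 <- v2 <- numeric(1000)  # preallocate one slot per sample
N <- 10000
X <- rnorm(N, mean = 0, sd = 3)  # population: N draws with standard deviation 3
n <- 20
for (i in seq_len(1000)) {
  x <- sample(x = X, size = n, replace = TRUE)  # one sample of size n, drawn with replacement
  v1[i] <- sum((x - mean(x))^2) / (n - 1)  # unbiased variance (denominator n - 1)
  v2[i] <- sum((x - mean(x))^2) / n  # variance with denominator n
}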
sum((X - mean(X))^2) / N  # variance of the population itself (denominator N)
[1] 9.150467

summary(v1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  2.055   6.832   8.777   9.174  11.027  23.471

summary(v2)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.952   6.491   8.338   8.716  10.476  22.297
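
In the output above, the mean of v1 (9.174) is close to the population variance (9.150467), while the mean of v2 (8.716) falls below it. The ratio 8.716 / 9.174 is about 0.95, which matches $\dfrac{n-1}{n}=\dfrac{19}{20}$. The sketch below illustrates the same relationship on a single sample and compares the hand-written formula with R's built-in `var()`, which divides by n-1. The sample here is an illustrative assumption, not the data behind the output above.

```r
# Sketch: relate the two estimators and compare with the built-in var().
# The seed and the sample are illustrative assumptions.
set.seed(1)
n <- 20
x <- rnorm(n, mean = 0, sd = 3)

unbiased <- sum((x - mean(x))^2) / (n - 1)  # denominator n - 1
biased <- sum((x - mean(x))^2) / n          # denominator n

# var() uses the denominator n - 1, so it agrees with `unbiased`.
all.equal(unbiased, var(x))

# The two estimators differ exactly by the factor (n - 1) / n.
all.equal(biased, unbiased * (n - 1) / n)
```

Both comparisons should return `TRUE`. Because the factor $\dfrac{n-1}{n}$ is the same for every sample, the mean of v2 is always the mean of v1 scaled by that factor.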