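Why n-1 in the sample variance?
A mathematical proof of why \(n-1\) is used in the denominator when computing the sample variance. E() denotes the expectation (expected value), a concept you may not have met yet when reading this page. Throughout, \(x_1,\dots,x_n\) are assumed to be independent observations from a population with mean \(\mu\) and variance \(\sigma^2\). We start with the estimator that divides by \(n\):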
\[ E(s^2)=E\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2\right) \]
Rewrite \(x_i-\bar{x}=x_i-\bar{x}-\mu+\mu=(x_i-\mu)-(\bar{x}-\mu)\), so:
\[ =E\left(\frac{1}{n}\sum_{i=1}^{n}\left((x_i-\mu)-(\bar{x}-\mu)\right)^2\right) \]
\[ =E\left(\frac{1}{n}\sum_{i=1}^{n}\left((x_i-\mu)^2-2(x_i-\mu)(\bar{x}-\mu)+(\bar{x}-\mu)^2\right)\right) \]
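Note that \(\sum_{i=1}^{n}1=n\) and that \(\bar{x}-\mu\) does not depend on \(i\), so it can be factored out of the sums. Distributing \(\frac{1}{n}\sum_{i=1}^{n}\) over the three terms gives:
\[ =E\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2-\frac{2}{n}(\bar{x}-\mu)\sum_{i=1}^{n}(x_i-\mu)+\frac{1}{n}(\bar{x}-\mu)^2\sum_{i=1}^{n}1\right) \]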
We have
\[ \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)=\frac{1}{n}\sum_{i=1}^{n}x_i-\frac{n}{n}\mu \]
and we know that the sample mean is
\[ \bar{x}=\frac{\sum_{i=1}^{n}x_i}{n} \]
Therefore:
\[ \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)=\bar{x}-\mu \]
Substituting \(\sum_{i=1}^{n}(x_i-\mu)=n(\bar{x}-\mu)\) and \(\sum_{i=1}^{n}1=n\) into the main chain of equalities:
\[ =E\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2-\frac{2}{n}(\bar{x}-\mu)\cdot n(\bar{x}-\mu)+\frac{n}{n}(\bar{x}-\mu)^2\right) \]
\[ =E\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2-\frac{2}{n}n(\bar{x}-\mu)^2+(\bar{x}-\mu)^2\right) \]
\[ =E\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2-2(\bar{x}-\mu)^2+(\bar{x}-\mu)^2\right) \]
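The algebra up to this point does not involve the expectation at all: for any sample and any constant \(\mu\), \(\frac{1}{n}\sum(x_i-\bar{x})^2=\frac{1}{n}\sum(x_i-\mu)^2-(\bar{x}-\mu)^2\). The following is a minimal sketch that checks this identity with exact rational arithmetic. The sample values, the value of `mu`, and the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean as an exact Fraction."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def decomposition_sides(sample, mu):
    """Return both sides of the identity
    (1/n) * sum((x_i - xbar)^2) == (1/n) * sum((x_i - mu)^2) - (xbar - mu)^2.
    """
    xs = [Fraction(x) for x in sample]
    mu = Fraction(mu)
    xbar = mean(xs)
    left = mean((x - xbar) ** 2 for x in xs)
    right = mean((x - mu) ** 2 for x in xs) - (xbar - mu) ** 2
    return left, right


# Hypothetical data and a hypothetical "true" mean, for illustration only.
left, right = decomposition_sides([2, 4, 4, 7], mu=5)
print(left, right)  # both sides are 51/16
```

Because the identity holds for every sample, taking expectations of both sides is what brings \(\sigma^2\) and \(\mathrm{var}(\bar{x})\) into the argument.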
\[ =E\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2\right)+E\left(-2(\bar{x}-\mu)^2+(\bar{x}-\mu)^2\right) \]
\[ =E\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2\right)-E\left((\bar{x}-\mu)^2\right) \]
\[ =\frac{1}{n}\sum_{i=1}^{n}E\left((x_i-\mu)^2\right)-E\left((\bar{x}-\mu)^2\right) \]
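On the other hand, by the definition of the mean, \(\mu=E(x)\), and by the definition of the variance, \(\sigma^2=E((x-E(x))^2)=\mathrm{var}(x)\). Since \(E(\bar{x})=\mu\) as well, \(E\left((\bar{x}-\mu)^2\right)=\mathrm{var}(\bar{x})\). Therefore: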
\[ =\frac{1}{n}\sum_{i=1}^{n}\mathrm{var}(x_i)-\mathrm{var}(\bar{x}) \]
For independent observations with common variance \(\sigma^2\), \(\mathrm{var}(\bar{x})=\frac{\sigma^2}{n}\) and \(\sum_{i=1}^{n}\mathrm{var}(x_i)=\sigma^2\sum_{i=1}^{n}1=n\sigma^2\), so:
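The fact \(\mathrm{var}(\bar{x})=\sigma^2/n\) can be checked exactly on a small case. The sketch below assumes a hypothetical population of three equally likely values and enumerates every ordered sample of size \(n\) drawn with replacement, which models independent, identically distributed observations. The helper names are illustrative, not part of the original derivation.

```python
from fractions import Fraction
from itertools import product


def exact_mean(values):
    """Mean of equally likely values, as an exact Fraction."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def exact_variance(values):
    """Variance E((x - E(x))^2) over equally likely values."""
    values = list(values)
    m = exact_mean(values)
    return exact_mean((v - m) ** 2 for v in values)


# Hypothetical population: each value is drawn with probability 1/3.
population = [Fraction(v) for v in (1, 2, 6)]
n = 3

sigma2 = exact_variance(population)

# All 3**n ordered samples are equally likely under independent sampling.
sample_means = [exact_mean(s) for s in product(population, repeat=n)]
var_of_mean = exact_variance(sample_means)

print(sigma2, var_of_mean, sigma2 / n)
```

The last two printed values agree, which is the relation used in the next step of the derivation.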
\[ =\frac{1}{n}\cdot n\sigma^2-\frac{\sigma^2}{n} \]
\[ =\sigma^2-\frac{\sigma^2}{n} =\sigma^2\left(\frac{n-1}{n}\right)\neq\sigma^2 \]
This yields a BIASED estimator of the variance: on average it underestimates \(\sigma^2\) by the factor \(\frac{n-1}{n}\).
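The result \(\sigma^2\frac{n-1}{n}\) can be reproduced exactly by brute force. The sketch below assumes a hypothetical three-value population with equally likely values and averages the divide-by-\(n\) estimator over every ordered sample of size \(n\) drawn with replacement. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    """Mean of equally likely values, as an exact Fraction."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def variance_divided_by_n(sample):
    """The estimator (1/n) * sum((x_i - xbar)^2)."""
    xbar = mean(sample)
    return mean((x - xbar) ** 2 for x in sample)


# Hypothetical population; applying the same formula to the whole
# population gives its true variance sigma^2.
population = [Fraction(v) for v in (1, 2, 6)]
n = 4
sigma2 = variance_divided_by_n(population)

# Exact expectation: average the estimator over all equally likely samples.
expected = mean(variance_divided_by_n(s) for s in product(population, repeat=n))

print(expected, sigma2 * Fraction(n - 1, n))  # both are 7/2
```

Here \(\sigma^2=14/3\), while the estimator averages to \(7/2\), which is smaller by exactly the factor \(3/4\).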
If \(n-1\) is used instead of \(n\), the same steps produce an unbiased estimator:
\[ E\left(\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2\right)=E\left(\frac{1}{n-1}\sum_{i=1}^{n}\left((x_i-\mu)-(\bar{x}-\mu)\right)^2\right) \]
\[ =E\left(\frac{1}{n-1}\sum_{i=1}^{n}\left((x_i-\mu)^2-2(x_i-\mu)(\bar{x}-\mu)+(\bar{x}-\mu)^2\right)\right) \]
\[ =E\left(\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\mu)^2-\frac{2}{n-1}(\bar{x}-\mu)\sum_{i=1}^{n}(x_i-\mu)+\frac{1}{n-1}(\bar{x}-\mu)^2\sum_{i=1}^{n}1\right) \]
\[ =E\left(\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\mu)^2-\frac{2}{n-1}(\bar{x}-\mu)\cdot n(\bar{x}-\mu)+\frac{n}{n-1}(\bar{x}-\mu)^2\right) \]
\[ =E\left(\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\mu)^2-\frac{2n}{n-1}(\bar{x}-\mu)^2+\frac{n}{n-1}(\bar{x}-\mu)^2\right) \]
\[ =E\left(\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\mu)^2\right)+\frac{n}{n-1}E\left(-2(\bar{x}-\mu)^2+(\bar{x}-\mu)^2\right) \]
\[ =E\left(\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\mu)^2\right)-\frac{n}{n-1}E\left((\bar{x}-\mu)^2\right) \]
\[ =\frac{1}{n-1}\sum_{i=1}^{n}E\left((x_i-\mu)^2\right)-\frac{n}{n-1}E\left((\bar{x}-\mu)^2\right) \]
\[ =\frac{1}{n-1}\sum_{i=1}^{n}\mathrm{var}(x_i)-\frac{n}{n-1}\mathrm{var}(\bar{x}) \]
\[ =\frac{1}{n-1}\cdot n\sigma^2-\frac{n}{n-1}\frac{\sigma^2}{n} \]
\[ =\frac{n}{n-1}\sigma^2-\frac{1}{n-1}\sigma^2 \]
\[ =\frac{n}{n-1}\left(\sigma^2-\frac{\sigma^2}{n}\right) \]
\[ =\frac{n}{n-1}\left[\sigma^2\left(1-\frac{1}{n}\right)\right] \]
\[ =\frac{n}{n-1}\left[\sigma^2\frac{n-1}{n}\right] =\sigma^2 \]
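Both conclusions can also be observed empirically. The following is a rough Monte Carlo sketch, assuming normally distributed data with \(\sigma=2\) (so \(\sigma^2=4\)); the distribution, sample size, trial count, and seed are arbitrary choices for illustration. It relies on Python's standard `statistics` module, where `pvariance` divides by \(n\) and `variance` divides by \(n-1\).

```python
import random
import statistics


def average_estimates(n, trials, seed=0):
    """Average both variance estimators over many simulated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        # Assumed data model: normal with mean 0 and sigma 2 (variance 4).
        sample = [rng.gauss(0.0, 2.0) for _ in range(n)]
        biased_total += statistics.pvariance(sample)   # divides by n
        unbiased_total += statistics.variance(sample)  # divides by n - 1
    return biased_total / trials, unbiased_total / trials


biased_avg, unbiased_avg = average_estimates(n=5, trials=20000)
print(biased_avg, unbiased_avg)
```

With \(n=5\), the first average should land near \(4\cdot\frac{4}{5}=3.2\) and the second near \(4\), up to simulation noise.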