{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Discrete Mixture Models\n",
"\n",
"<font color='red'> This solution differs from the one published here:\n",
"http://www.stat.columbia.edu/~gelman/book/solutions3.pdf so it's probably wrong.\n",
"</font>\n",
"\n",
"Discrete mixture models: if $p_m(\\theta)$, for $m=1,\\ldots,M$ are conjugate prior densities for the sampling model $y|\\theta$, show that the class of finite mixture prior densities given by \n",
"$$\n",
"p(\\theta)=\\sum_{1}^{M} \\lambda_m p_m(\\theta)\n",
"$$\n",
"is also a conjugate class, where the $\\lambda_m$’s are nonnegative weights that sum to 1. This can provide a useful extension of the natural conjugate prior family to more flexible distributional forms. As an example, use the mixture form to create a bimodal prior density for a normal mean, that is thought to be near $1$, with a standard deviation of $0.5$, but has a small probability of being near $−1$, with the same standard deviation. If the variance of each observation $y_1,\\ldots,y_{10}$ is known to be $1$, and their observed mean is $y =−0.25$, derive your posterior distribution for the mean, making a sketch of both prior and posterior densities. Be careful: the prior and posterior mixture proportions are different.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's skip the theory part and look at the example.\n",
"\n",
"We have\n",
"$$\n",
"p(\\theta|y_1,\\ldots,y_{10})\\propto p(y_1,\\ldots,y_10|\\theta)p(\\theta)$$\n",
"so\n",
"$$\n",
"p(\\theta|\\{y_{i}\\})\\propto \\sum \\lambda_{m}p(\\{y_{i}\\}|\\theta)p_{m}(\\theta)\n",
"$$\n",
"\n",
"Each of the terms $p_{m}(\\theta)p(\\{y_{i}\\}|\\theta)$\n",
"is equal to $p_{m}(\\theta|\\{y_{i}\\})p_{m}(\\{y_{i}\\})$.\n",
"\n",
"Therefore the total posterior density is a weighted sum\n",
"of the individual posteriors:\n",
"\n",
"$$p(\\theta|\\{y_{i}\\})=\\sum c_{m}p_{m}(\\theta|\\{y_{i}\\})$$\n",
"where \n",
"$$\n",
"c_{m}=\\frac{\\lambda_m p_{m}(\\{y_{i}\\})}{\\sum_{m} \\lambda_m p_{m}(\\{y_{i}\\}}\n",
"$$\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"<font color=\"red\"> This is the part that was wrong -- I leave it here for historical purposes, but the correct part follows. </font>\n",
"<p>\n",
"In the special case under consideration, $p_1$ is normal with mean $-1$ and $\\sigma=.5$, $p_2$ is normal with mean $1$ and $\\sigma=.5$ and we can set $\\lambda_1=.1$ and $\\lambda_2=.9$. The $p_m(\\{y_{i}\\})$ can be calculated from the $t$ distribution. Drawing a sample of size $10$ from $p_1$ and getting a sample mean of $-.25$ and a sample variance of $1$ gives a $t$-statistics of \n",
"$$\\frac{(\\overline{y}-\\mu)}{s/\\sqrt{N}}=\\frac{(-.25+1)}{1/\\sqrt{10}}=\\sqrt{10}(-.25+1)$$ in the first case and $\\sqrt{10}(-.25-1)$ in the second. \n",
"<p>\n",
"<font color=\"red\"> Now we continue with what is correct </font>\n",
"\n",
"We need to properly interpret $p_m(\\{y_{i}\\})$ and for that we should remember where it comes from. We\n",
"rewrote\n",
"$$\n",
"p(\\{y_{i}\\}|\\theta)p_{m}(\\theta)=p_m(\\{y_{i}\\},\\theta)=p_{m}(\\{y_{i}\\})p_{m}(\\theta|\\{y_{i}\\})\n",
"$$\n",
"We're dealing here with normal distributions. On the left, the quadratic form that is the log-likelihood of the\n",
"relevant bivariate normal is\n",
"$$Q=\\frac{(\\overline{y}-\\theta)^2}{\\sigma^2}+\\frac{(\\theta-\\mu)^2}{\\tau^2}\n",
"$$\n",
"where, more specifically, $\\overline{y}=-.25$, $\\mu=\\pm 1$, $\\sigma^2=1/10$, and $\\tau^2=0.25=0.5^2$. \n",
"The first term comes from $p_{m}(\\overline{y}|\\theta)$ and the second from $p_{m}(\\theta)$.\n",
"\n",
"Pure algebra (by expanding, writing $Q$ as a quadratic in $\\theta$, and completing the square) gives us\n",
"the expression\n",
"$$\n",
"Q=\\frac{(\\theta-\\mu_{1})^2}{\\tau_1^2}+\\frac{(\\overline{y}-\\mu)^2}{\\sigma^2+\\tau^2}\n",
"$$\n",
"where \n",
"$$\n",
"\\mu_{1}=\\frac{\\mu/\\tau^2+\\overline{y}/\\sigma^2}{\\tau_1^2}\n",
"$$\n",
"and \n",
"$$\n",
"\\frac{1}{\\tau_1^2}=\\frac{1}{\\sigma^2}+\\frac{1}{\\tau^2}\n",
"$$\n",
"\n",
"In the context of the problem under discussion, the two terms tell us that\n",
"$$\n",
"\\theta\\sim N(\\mu_1,\\tau_1^2)\n",
"$$\n",
"where $\\tau_1^2=1/(1/.1+1/.25)=1/14$ and $\\mu_1=(-.25/.1\\pm 1/.25)/14$ giving $-.46$ and $.11$. \n",
"and $p_{m}(\\{y_{i}\\})=N(-.25,\\pm 1, .1+.25)$.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.30191827840729224 0.07235502834102417\n"
]
}
],
"source": [
"import numpy as np\n",
"from scipy.stats import norm, t\n",
"import matplotlib.pyplot as plt\n",
"p2y=norm.pdf(-.25,1,np.sqrt(.35))\n",
"p1y=norm.pdf(-.25,-1,np.sqrt(.35))\n",
"lambda1,lambda2=(.1,.9)\n",
"print(p1y,p2y)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"wt1=lambda1*p1y/(lambda1*p1y+lambda2*p2y)\n",
"wt2=lambda2*p2y/(lambda1*p1y+lambda2*p2y)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.3167705292212528 0.6832294707787472\n"
]
}
],
"source": [
"print(wt1, wt2)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"def posterior(prior_mean,prior_variance,sample_mean,pop_variance,n):\n",
" post_var=1/((1/prior_variance) + n/pop_variance)\n",
" post_mean=(prior_mean/prior_variance+sample_mean*n/pop_variance)/(1/post_var)\n",
" return post_mean, post_var"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-0.4642857142857143 0.10714285714285714\n"
]
}
],
"source": [
"post_mean1,post_var1=posterior(-1,.25,-.25,1,10)\n",
"post_mean2,post_var2=posterior(1,.25,-.25,1,10)\n",
"print(post_mean1,post_mean2)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x=np.linspace(-3,3,1000)\n",
"y1=lambda1*norm.pdf(x,-1,.5)+lambda2*norm.pdf(x,1,.5)\n",
"y2=wt1*norm.pdf(x,post_mean1,np.sqrt(post_var1))+wt2*norm.pdf(x,post_mean2,np.sqrt(post_var2))\n",
"fig,ax=plt.subplots(1)\n",
"ax.plot(x,y1,color='red',label='prior')\n",
"ax.plot(x,y2,color='blue',label='posterior')\n",
"ax.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This now matches the solution published on the web!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}