{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Problem 5.9.1\n",
"\n",
"### Exchangeability with known model parameters\n",
"\n",
"For each of the following three examples, answer: \n",
"1. Are observations $y_1$ and $y_2$ exchangeable? \n",
"2. Are observations $y_1$ and $y_2$ independent? \n",
"3. Can we act as if the two observations are independent? \n",
"\n",
"Examples:\n",
"1. A box has one black ball and one white ball. We pick a ball $y_1$ at random, put it back, and pick another ball $y_2$ at random. \n",
"\n",
"Here the events are clearly independent and exchangeable. \n",
"\n",
"\n",
"2. A box has one black ball and one white ball. We pick a ball $y_1$ at random, we do not put it back, then we pick ball $y_2$.\n",
"\n",
"In this case there are four outcomes: (BB), (BW), (WB), (WW) and of these four only (WB) and (BW) have non-zero probability (1/2). Since the likelihood is symmetric, the observations are exchangeable. Clearly, though,\n",
"the events aren't independent; for example P(B|B)=0 and P(B|W)=1. And you clearly can't act as if they're independent since the second observation is determined by the first.\n",
"\n",
"3. A box has a million black balls and a million white balls. We pick a ball $y_1$ at random, we do not put it back, then we pick ball $y_2$ at random.\n",
"\n",
"These observations are exchangeable since $P(BW)=\\frac{1}{2}\\cdot\\frac{1000000}{1999999}=P(WB)$ and $P(BB)=\\frac{1}{2}\\cdot\\frac{999999}{1999999}=P(WW)$. They are not exactly independent, but we can act as if they are, since all four probabilities are very nearly $1/4$.\n",
"\n",
"\n",
"Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B.. Bayesian Data Analysis, Third Edition (Chapman & Hall/CRC Texts in Statistical Science) (Page 134). CRC Press. Kindle Edition. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Problem 5.9.2\n",
"\n",
"We ask the same questions as in the preceding problem but under the conditions:\n",
"\n",
"1. A box has $n$ balls colored black and white, but we don't know how many of each. We pick a ball, put it back, then pick another.\n",
"2. Same except we pick a ball, don't put it back, then pick another.\n",
"3. Suppose we know that there are a lot of balls of each color.\n",
"\n",
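"In the first case, let $\\theta$ be the proportion of white balls in the urn. Then $P(BW)=(1-\\theta)\\theta$\n",
"and $P(WB)=\\theta(1-\\theta)$, so the observations are exchangeable. Also $P(BB)=(1-\\theta)^2=P(B)^2$, $P(BW)=P(WB)=P(W)P(B)$,\n",
"and $P(WW)=\\theta^2=P(W)^2$, so they are independent given $\\theta$. Since $\\theta$ is unknown, though, they are not independent marginally: observing $y_1$ tells us something about $\\theta$ and hence about $y_2$. \n",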
"\n",
"In the second case, we have $P(WB)=\\theta\\cdot n(1-\\theta)/(n-1)$ and $P(BW)=(1-\\theta)\\cdot n\\theta/(n-1)$,\n",
"and these are the same, so the observations are exchangeable. But they are not independent, even given $\\theta$, since $P(WB)\\ne P(W)P(B)$.\n",
"Also $P(WW)=\\theta(n\\theta-1)/(n-1)$ and $P(BB)=(1-\\theta)(n-n\\theta-1)/(n-1)$.\n",
"\n",
"They do get close to independent (given $\\theta$) if $n$ is large, which covers the third case.\n",
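"\n",
"As a rough check on how close, here is a short calculation using the formulas above, writing $P_{nr}$ for sampling without replacement and $P_{r}$ for sampling with replacement (notation introduced here for convenience):\n",
"$$\n",
"P_{nr}(WW)-P_{r}(WW)=\\frac{\\theta(n\\theta-1)}{n-1}-\\theta^2=\\frac{\\theta\\left[(n\\theta-1)-(n-1)\\theta\\right]}{n-1}=-\\frac{\\theta(1-\\theta)}{n-1}.\n",
"$$\n",
"Since $\\theta(1-\\theta)\\le 1/4$, the difference is at most $\\frac{1}{4(n-1)}$ in absolute value, so it shrinks like $1/n$.\n",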
"\n"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot the difference Pnr(WW) - Pr(WW), where Pnr is the probability without\n",
"# replacement and Pr is the probability with replacement, as a function of theta.\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"x = np.linspace(0, 1, 100)  # proportion of white balls (theta)\n",
"fig, ax = plt.subplots(1, 2)\n",
"for axis, n in zip(ax, (5, 100)):\n",
"    axis.plot(x, x*(n*x - 1)/(n - 1) - x*x)\n",
"    axis.set_title('n=%d' % n)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Problem 5.9.4\n",
"\n",
"Exchangeable prior distributions: suppose it is known a priori that the $2J$ parameters $\\theta_1,\\ldots,\\theta_{2J}$ are clustered into two groups, with exactly half being drawn from a $N(1, 1)$ distribution, and the other half being drawn from a $N(-1, 1)$ distribution, but we have not observed which parameters come from which distribution. \n",
"\n",
"1. Are $\\theta_1,\\ldots,\\theta_{2J}$ exchangeable under this prior distribution? \n",
"2. Show that this distribution cannot be written as a mixture of independent and identically distributed components.\n",
"3. Why can we not simply take the limit as $J\\to\\infty$ and get a counterexample to de Finetti’s theorem?\n",
"\n",
"Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B.. Bayesian Data Analysis, Third Edition (Chapman & Hall/CRC Texts in Statistical Science) (Page 134). CRC Press. Kindle Edition. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the case $J=1$. This is very much like the earlier problems, but with a continuous distribution.\n",
"Here $P(x,y)=(1/2)P(x,N(1,1))P(y,N(-1,1))+(1/2)P(x,N(-1,1))P(y,N(1,1))$, where $P(x,N(\\mu,1))$ denotes the $N(\\mu,1)$ density at $x$.\n",
"This expression is unchanged when $x$ and $y$ are swapped, so $P(x,y)=P(y,x)$ and the distribution is exchangeable.\n",
"\n",
"After a small amount of cheating (by looking at some solutions by Gelman) the suggestion is to look at the covariance of $y_1$ and $y_2$. Informally, they should have negative covariance because if $y_1$ is large, it suggests that it came from $N(1,1)$; but then $y_2$ comes from $N(-1,1)$ so it should be small.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
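"from scipy.stats import norm\n",
"\n",
"# Frozen distributions for the two clusters.\n",
"y1 = norm(loc=1, scale=1)\n",
"y2 = norm(loc=-1, scale=1)\n",
"# Draw 500 independent samples from each distribution.\n",
"y1_sample = y1.rvs(500)\n",
"y2_sample = y2.rvs(500)\n"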
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sample of our situation is $(y_1,y_2)$ or $(y_2,y_1)$ with equal probability, where $y_1\\sim N(1,1)$ and $y_2\\sim N(-1,1)$ are independent. So the mean of each variable is zero, and the covariance is $E(y_1y_2)=E(y_1)E(y_2)=-1$. The next problem (5.9.5) shows that mixtures of iid variables have non-negative covariances, so this distribution cannot be such a mixture."
]
},
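{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of the calculation behind this, write $(\\theta_1,\\theta_2)$ for the exchangeable pair, equal to $(y_1,y_2)$ or $(y_2,y_1)$ with probability $1/2$ each, where $y_1\\sim N(1,1)$ and $y_2\\sim N(-1,1)$ are independent (the $\\theta$ notation is borrowed from the problem statement). Then\n",
"$$\n",
"E(\\theta_1)=\\tfrac{1}{2}E(y_1)+\\tfrac{1}{2}E(y_2)=\\tfrac{1}{2}(1)+\\tfrac{1}{2}(-1)=0,\n",
"$$\n",
"$$\n",
"\\mathrm{cov}(\\theta_1,\\theta_2)=E(\\theta_1\\theta_2)-E(\\theta_1)E(\\theta_2)=E(y_1y_2)-0=E(y_1)E(y_2)=-1.\n",
"$$\n",
"For scale, $\\mathrm{var}(\\theta_1)=E(\\theta_1^2)=\\tfrac{1}{2}E(y_1^2)+\\tfrac{1}{2}E(y_2^2)=\\tfrac{1}{2}(2)+\\tfrac{1}{2}(2)=2$, so the correlation is $-1/2$."
]
},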
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Covariance= -1.0023058663082174\n"
]
}
],
"source": [
"cov=sum(y1_sample*y2_sample)/500\n",
"print(\"Covariance=\",cov)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the general case, the joint probability distribution can be written\n",
"$$\n",
"P(y_1,\\ldots,y_{2J})=(\\binom{2J}{J})^{-1}\\sum_{{S\\subset [2J]}\\atop{|S|=J}} P_{S}(y_1,\\ldots,y_{2J})\n",
"$$\n",
"where $$P_{S}(y_1,\\ldots,y_{2J})=\\prod_{i\\in S} P(y_i,N(1,1))\\prod_{j\\not\\in S} P(y_j,N(-1,1)).$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To understand the covariance of, say $y_1$ and $y_2$, we need to know how often they are chosen from the same distribution and how often they are chosen from different ones. That raises the combinatorial question of how many of the partitions of $[2J]$ have $y_1$ and $y_2$ together, and how many of them separate $y_1$ and $y_2$. To have them together, we first pick $y_1$ and $y_2$, and then choose $J-2$ additional elements from the remaining $2J-2$. So there are $\\binom{2J-2}{J-2}$ subsets of size $J$ that contain both $y_1$ and $y_2$. To split them, we pick $J-1$ elements from the $2J-2$ elements other than $y_1$ and $y_2$ and combine those with $y_1$ (for example) so there are $\\binom{2J-2}{J-1}$ sets that split them. \n",
"\n",
"When computing the covariance, the cases where $y_1$ and $y_2$ are together contribute $+1$, and the cases where they are split contribute $-1$. This gives the following:\n",
"$$\n",
"\\mathrm{cov}(y_1,y_2)=\\frac{2(\\binom{2J-2}{J-2}-\\binom{2J-2}{J-1})}{\\binom{2J}{J}}\n",
"$$\n",
"The factor of two in the numerator comes from the fact that each partition corresponds to two subsets ($S$ and its complement), so the number of partitions is $1/2$ of $\\binom{2J}{J}$. \n",
"\n",
"Some trial computations give the explicit formula $\\mathrm{cov}(y_1,y_2)=-\\frac{1}{2J-1}$, which goes to zero as $J\\to\\infty$.\n",
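"\n",
"The closed form can also be checked by hand. A short derivation, using only the factorial formula for binomial coefficients:\n",
"$$\n",
"\\binom{2J-2}{J-2}=\\frac{J-1}{J}\\binom{2J-2}{J-1},\\qquad \\binom{2J}{J}=\\frac{2J(2J-1)}{J^2}\\binom{2J-2}{J-1},\n",
"$$\n",
"so\n",
"$$\n",
"\\mathrm{cov}(y_1,y_2)=\\frac{2\\left(\\frac{J-1}{J}-1\\right)}{\\frac{2J(2J-1)}{J^2}}=-\\frac{2}{J}\\cdot\\frac{J^2}{2J(2J-1)}=-\\frac{1}{2J-1}.\n",
"$$\n",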
"\n",
"The next problem (5.9.5) shows that in a mixture, the correlations are non-negative, so this shows we don't have a mixture of iid variables."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 -1.0\n",
"2 -3.0\n",
"3 -5.0\n",
"4 -7.0\n",
"5 -9.0\n",
"6 -11.0\n",
"7 -13.0\n",
"8 -15.0\n",
"9 -17.0\n"
]
}
],
"source": [
"# Print J and the reciprocal of the covariance formula above; the values -(2J-1) match cov = -1/(2J-1).\n",
"from scipy.special import binom\n",
"for i in range(1,10):\n",
" print(i,(binom(2*i,i)/2/(binom(2*i-2,i-2)-binom(2*i-2,i-1))))\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For the last point, I think the way to think of it is that $y_1$ is a combination of $\\binom{2J-1}{J-1}$ copies of $N(1,1)$ (corresponding to the subsets in which $y_1$ is in the first half) and $\\binom{2J-1}{J}$ copies of $N(-1,1)$ (corresponding to the subsets in which $y_1$ is in the second half). Note that these two numbers are equal, by the symmetry $\\binom{m}{k}=\\binom{m}{m-k}$, so each $y_i$ is marginally an equal mixture of $N(1,1)$ and $N(-1,1)$. In the limit the correlation between different $y_i$'s drops to zero and the joint distribution approaches independent draws from that mixture, which is itself a (trivial) mixture of iid components, so there's no contradiction to de Finetti's theorem."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Problem 5.9.5\n",
"\n",
"Suppose that the distribution of $\\theta=(\\theta_1,\\ldots,\\theta_{J})$ can be written as a mixture of independent and identically distributed components:\n",
"$$\n",
"p(\\theta)=\\int \\prod_{j=1}^{J} p(\\theta_{j}|\\phi)p(\\phi)d\\phi.\n",
"$$\n",
"Prove that the covariances $\\mathrm{cov}(\\theta_{i},\\theta_{j})$ are all non-negative.\n",
"\n",
"Here we write $y_j$ for $\\theta_j$ and apply the law of total covariance:\n",
"$$\n",
"\\mathrm{cov}(y_1,y_2)=E_{\\phi}(\\mathrm{cov}(y_1,y_2|\\phi))+\\mathrm{cov}_{\\phi}(E(y_1|\\phi),E(y_2|\\phi))\n",
"$$\n",
"The first term is zero (since $y_1$ and $y_2$ are independent, conditional on $\\phi$). For the second term, $E(y_1|\\phi)=E(y_2|\\phi)=\\mu(\\phi)$ because the $y_i$ are identically distributed given $\\phi$;\n",
"thus this term is $\\mathrm{var}(\\mu(\\phi))\\ge 0$.\n"
]
},
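{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an illustration, here is one hypothetical instance (the normal model below is an assumption chosen for convenience, not part of the problem): suppose $y_j\\mid\\phi\\sim N(\\phi,1)$ independently and $\\phi\\sim N(0,\\tau^2)$. Then $\\mu(\\phi)=\\phi$, and for $i\\ne j$\n",
"$$\n",
"\\mathrm{cov}(y_i,y_j)=E_{\\phi}(0)+\\mathrm{var}(\\phi)=\\tau^2\\ge 0,\n",
"$$\n",
"with equality only when $\\tau=0$, that is, when $\\phi$ is known and the $y_j$ are fully independent."
]
},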
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}