From 06b98473012640930d014924b627002c99b20914 Mon Sep 17 00:00:00 2001 From: jeremyteitelbaum Date: Wed, 2 May 2018 09:44:46 -0400 Subject: [PATCH 1/4] catching up --- BDA 5.9.3.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/BDA 5.9.3.ipynb b/BDA 5.9.3.ipynb index abfbe93..f77853a 100644 --- a/BDA 5.9.3.ipynb +++ b/BDA 5.9.3.ipynb @@ -492,7 +492,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.5" + "version": "3.6.4" } }, "nbformat": 4, From 722cff49a00a72fc7f8ecd3291d3eb6b77227937 Mon Sep 17 00:00:00 2001 From: jeremyteitelbaum Date: Wed, 2 May 2018 10:26:06 -0400 Subject: [PATCH 2/4] fixed 5.9.8 --- BDA 5.9.8.ipynb | 98 +++++++++++++++++++++++++++++++++---------- Useful Formulae.ipynb | 62 +-------------------------- 2 files changed, 78 insertions(+), 82 deletions(-) diff --git a/BDA 5.9.8.ipynb b/BDA 5.9.8.ipynb index 8ad5732..9f6a2e1 100644 --- a/BDA 5.9.8.ipynb +++ b/BDA 5.9.8.ipynb @@ -7,8 +7,8 @@ "# Discrete Mixture Models\n", "\n", " This solution differs from the one published here:\n", - "http://www.stat.columbia.edu/~gelman/book/solutions3.pdf\n", - "\n", + "http://www.stat.columbia.edu/~gelman/book/solutions3.pdf and is probably wrong\n", + "\n", "\n", "Discrete mixture models: if $p_m(\\theta)$, for $m=1,\\ldots,M$ are conjugate prior densities for the sampling model $y|\\theta$, show that the class of finite mixture prior densities given by \n", "$$\n", @@ -50,29 +50,76 @@ "cell_type": "markdown", "metadata": {}, "source": [ + "\n", + " This is the part that was wrong -- I leave it here for historical purposes, but the correct part follows. \n", + "

\n", "In the special case under consideration, $p_1$ is normal with mean $-1$ and $\\sigma=.5$, $p_2$ is normal with mean $1$ and $\\sigma=.5$ and we can set $\\lambda_1=.1$ and $\\lambda_2=.9$. The $p_m(\\{y_{i}\\})$ can be calculated from the $t$ distribution. Drawing a sample of size $10$ from $p_1$ and getting a sample mean of $-.25$ and a sample variance of $1$ gives a $t$-statistics of \n", - "$$\\frac{(\\overline{y}-\\mu)}{s/\\sqrt{N}}=\\frac{(-.25+1)}{1/\\sqrt{10}}=\\sqrt{10}(-.25+1)$$ in the first case and $\\sqrt{10}(-.25-1)$ in the second. " + "$$\\frac{(\\overline{y}-\\mu)}{s/\\sqrt{N}}=\\frac{(-.25+1)}{1/\\sqrt{10}}=\\sqrt{10}(-.25+1)$$ in the first case and $\\sqrt{10}(-.25-1)$ in the second. \n", + "

\n", + " Now we continue with what is correct \n", + "\n", + "We need to properly interpret $p_m(\\{y_{i}\\})$ and for that we should remember where it comes from. We\n", + "rewrote\n", + "$$\n", + "p(\\{y_{i}\\}|\\theta)p_{m}(\\theta)=p_m(\\{y_{i}\\},\\theta)=p_{m}(\\{y_{i}\\})p_{m}(\\theta|\\{y_{i}\\})\n", + "$$\n", + "We're dealing here with normal distributions. On the left, the quadratic form that is (up to a factor of $-1/2$ and an additive constant) the log of the\n", + "relevant bivariate normal density is\n", + "$$Q=\\frac{(\\overline{y}-\\theta)^2}{\\sigma^2}+\\frac{(\\theta-\\mu)^2}{\\tau^2}\n", + "$$\n", + "where, more specifically, $\\overline{y}=-.25$, $\\mu=\\pm 1$, $\\sigma^2=1/10$, and $\\tau^2=0.25=0.5^2$. \n", + "The first term comes from $p_{m}(\\overline{y}|\\theta)$ and the second from $p_{m}(\\theta)$.\n", + "\n", + "Pure algebra (by expanding, writing $Q$ as a quadratic in $\\theta$, and completing the square) gives us\n", + "the expression\n", + "$$\n", + "Q=\\frac{(\\theta-\\mu_{1})^2}{\\tau_1^2}+\\frac{(\\overline{y}-\\mu)^2}{\\sigma^2+\\tau^2}\n", + "$$\n", + "where \n", + "$$\n", + "\\mu_{1}=\\frac{\\mu/\\tau^2+\\overline{y}/\\sigma^2}{1/\\tau_1^2}\n", + "$$\n", + "and \n", + "$$\n", + "\\frac{1}{\\tau_1^2}=\\frac{1}{\\sigma^2}+\\frac{1}{\\tau^2}\n", + "$$\n", + "\n", + "In the context of the problem under discussion, the two terms tell us that\n", + "$$\n", + "\\theta\\sim N(\\mu_1,\\tau_1^2)\n", + "$$\n", + "where $\\tau_1^2=1/(1/.1+1/.25)=1/14$ and $\\mu_1=(-.25/.1\\pm 1/.25)/14$ giving $6.5$ or $1.5$. 
\n", + "and $p_{m}(\\{y_{i}\\})=N(-.25,\\pm 1, .1+.25)$.\n", + "\n", + "\n" ] }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 17, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.30191827840729224 0.07235502834102417\n" + ] + } + ], "source": [ "import numpy as np\n", "from scipy.stats import norm, t\n", "import matplotlib.pyplot as plt\n", - "t_1=np.sqrt(10)*.75\n", - "t_2=np.sqrt(10)*(-1.25)\n", - "p1y=t.pdf(t_1,df=9)\n", - "p2y=t.pdf(t_2,df=9)\n", - "lambda1,lambda2=(.1,.9)\n" + "p2y=norm.pdf(-.25,1,np.sqrt(.35))\n", + "p1y=norm.pdf(-.25,-1,np.sqrt(.35))\n", + "lambda1,lambda2=(.1,.9)\n", + "print(p1y,p2y)" ] }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 18, "metadata": {}, "outputs": [], "source": [ @@ -82,14 +129,14 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "0.600590082806086 0.3994099171939141\n" + "0.3167705292212528 0.6832294707787472\n" ] } ], @@ -99,7 +146,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 20, "metadata": {}, "outputs": [], "source": [ @@ -111,22 +158,31 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 21, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "-0.4642857142857143 0.10714285714285714\n" + ] + } + ], "source": [ "post_mean1,post_var1=posterior(-1,.25,-.25,1,10)\n", - "post_mean2,post_var2=posterior(1,.25,-.25,1,10)" + "post_mean2,post_var2=posterior(1,.25,-.25,1,10)\n", + "print(post_mean1,post_mean2)" ] }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 22, "metadata": {}, "outputs": [ { "data": { - "image/png": "\n", + "image/png": "\n", "text/plain": [ "

" ] @@ -150,7 +206,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This isn't consistent with the solutions, which I don't understand. The key issue is the meaning of the statement that \"the variance of each observation is known to be 1\". How does one compute $p_{m}(\\{y_{i}\\})$?" + "This now matches the solution published on the web!" ] }, { @@ -177,7 +233,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.5" + "version": "3.6.4" } }, "nbformat": 4, diff --git a/Useful Formulae.ipynb b/Useful Formulae.ipynb index 5a3d891..1d0ddd3 100644 --- a/Useful Formulae.ipynb +++ b/Useful Formulae.ipynb @@ -52,66 +52,6 @@ " " ] }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "0.7403867575800461" - ] - }, - "execution_count": 26, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "post_sample(-.25,1,.25,-.25,1,10)+post_sample(-.25,-1,.25,-.25,1,10)" - ] - }, - { - "cell_type": "code", - "execution_count": 29, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "0.05095226579074726" - ] - }, - "execution_count": 29, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - ".1*post_sample(-.25,-1,.25,-.25,1,10)/(post_sample(-.25,1,.25,-.25,1,10)+post_sample(-.25,-1,.25,-.25,1,10))" - ] - }, - { - "cell_type": "code", - "execution_count": 30, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "0.4414296078832747" - ] - }, - "execution_count": 30, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - ".9*post_sample(-.25,1,.25,-.25,1,10)/(post_sample(-.25,1,.25,-.25,1,10)+post_sample(-.25,-1,.25,-.25,1,10))" - ] - }, { "cell_type": "code", "execution_count": null, @@ -136,7 +76,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.5" + "version": "3.6.4" } }, "nbformat": 4, From 
6a7223b767323034fb5cb9e1cc10898bbf3a7f5b Mon Sep 17 00:00:00 2001 From: Jeremy Teitelbaum Date: Mon, 15 Oct 2018 07:06:03 -0400 Subject: [PATCH 3/4] more notes on likelihood ratios --- LikelihoodRatioNotes.ipynb | 377 +++++++++++++++++++++++++++++++++++++ 1 file changed, 377 insertions(+) create mode 100644 LikelihoodRatioNotes.ipynb diff --git a/LikelihoodRatioNotes.ipynb b/LikelihoodRatioNotes.ipynb new file mode 100644 index 0000000..e189011 --- /dev/null +++ b/LikelihoodRatioNotes.ipynb @@ -0,0 +1,377 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Notes on the likelihood ratio test" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Consider the very simple situation in which we have a set of data which we believe is generated by a Poisson distribution. " + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [], + "source": [ + "from scipy.stats import poisson\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "import numpy as np\n", + "sns.set_style('darkgrid')\n", + "import warnings\n", + "warnings.filterwarnings(\"ignore\",category=FutureWarning)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Our data comes from a Poisson(1) distribution. " + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [], + "source": [ + "X = poisson(1)\n", + "sample=X.rvs(1000)" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "j=sns.distplot(sample,kde=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'd like to consider the possibility that our data came from a Poisson distribution that has a mean of $1.1$. We look at the likelihood ratio of our data with respect to the null hypothesis that the mean is $1$ and the alternative hypothesis that it is $1.1$.\n", + "\n", + "In the null situation, the probability $P(X=n)$ is $e^{-1}/n!$. So the log likelihood for the sample is \n", + "$$\n", + "\\mathcal{L}_{1}(x)=\\sum_{i=1}^{N} (-\\log(x_i!)-1).\n", + "$$\n", + "\n", + "For a mean of $\\lambda$, we have $P(X=n)=\\lambda^{n}e^{-\\lambda}/n!$ and\n", + "so the log likelihood is\n", + "$$\n", + "\\mathcal{L}_{\\lambda}(x)=\\sum_{i=1}^{N} \\left(x_{i}\\log(\\lambda)-\\lambda-\\log(x_{i}!)\\right).\n", + "$$\n", + "\n", + "The difference is\n", + "$$\n", + "\\mathcal{L}_{\\lambda}(x)-\\mathcal{L}_{1}(x) = N-N\\lambda+\\log(\\lambda)\\sum x_{i}=N(-\\lambda+1+\\log(\\lambda)\\overline{x})\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For $\\lambda>1$, this difference is positive (meaning that the alternative is more likely than the null) when \n", + "$$\n", + "\\overline{x}>\\frac{(\\lambda -1)}{\\log(\\lambda)}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To see what's going on, let's repeatedly draw samples of size 1000 from Poisson(1) and Poisson(1.1) distributions (5000 samples from each) and see how the sample mean $\\overline{x}$ is distributed in each case." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 112, + "metadata": {}, + "outputs": [], + "source": [ + "N=1000\n", + "sample_means_1=[poisson(1).rvs(N).mean() for i in range(5000)]\n", + "sample_means_2=[poisson(1.1).rvs(N).mean() for i in range(5000)]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We plot these distributions, and include a vertical line at the critical value that gives equal likelihood either way as computed above." + ] + }, + { + "cell_type": "code", + "execution_count": 113, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "ax=sns.distplot(sample_means_1,kde=False)\n", + "sns.distplot(sample_means_2,kde=False,ax=ax)\n", + "s=.1/np.log(1.1)\n", + "j=ax.axvline(s)\n", + "t=ax.set_title(\"Critical value is {0:.2f}\".format(s))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As a check, we can compute the \"errors\" by looking at how often a sample from poisson(1) lies to the right of the critical line, and vice versa." + ] + }, + { + "cell_type": "code", + "execution_count": 114, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Fraction of draws from poisson(1.1) that lie to the left of the critical line: 0.068\n", + "Fraction of draws from poisson(1) that lie to the right of the critical line: 0.0642\n" + ] + } + ], + "source": [ + "print('Fraction of draws from poisson(1.1) that lie to the left of the critical line:',len([x for x in sample_means_2 if x<s])/len(sample_means_2))\n", + "print('Fraction of draws from poisson(1) that lie to the right of the critical line:',len([x for x in sample_means_1 if x>s])/len(sample_means_1))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The idea behind a likelihood ratio test is that we have a parameter space $\\Omega$ and a random variable $X$ that is distributed as $P_{\\theta}$ for some $\\theta\\in\\Omega$. We have a partition of $\\Omega$ into sets $\\Omega_{0}$ and $\\Omega_{1}$ and based on the outcome of an experiment $X$ we want to accept the hypothesis that $\\theta\\in\\Omega_{0}$ if $X\\not\\in S$, where $S$ is a chosen rejection region. The power function of this test is \n", + "$$\n", + "\\beta(\\theta)=P_{\\theta}(X\\in S)\n", + "$$\n", + "and the significance level is\n", + "$$\n", + "\\alpha = \\sup_{\\theta\\in\\Omega_{0}}P_{\\theta}(X\\in S).\n", + "$$\n", + "\n", + "The power function tells us, for a given $\\theta$, how likely it is that we will reject $H_{0}$.\n", + "The significance level tells us the highest possible chance that we reject $H_{0}$ when it is in fact true. 
\n", + "\n", + "In the example above the parameter space has only two values: $\\theta = 1$ and $\\theta = 1.1$.\n", + "Since $S$ is the space where we accept $H_{1}$ and reject $H_{0}$, the two values of the power function are:\n", + "$$\n", + "\\beta(1)=\\textrm{ the area under the blue graph to the right of $s$}\\sim .06\n", + "$$\n", + "and\n", + "$$\n", + "\\beta(1.1)=\\textrm{ the area under the orange graph to the right of $s$}\\sim .93\n", + "$$\n", + "\n", + "The significance level is $.06$.\n", + "\n", + "In other words: Suppose we draw $1000$ samples from our Poisson distribution and compute the mean $\\overline{x}$. If $\\overline{x}" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "N=50\n", + "sample_means_1=[poisson(1).rvs(N).mean() for i in range(5000)]\n", + "sample_means_2=[poisson(1.1).rvs(N).mean() for i in range(5000)]\n", + "ax=sns.distplot(sample_means_1,kde=False)\n", + "sns.distplot(sample_means_2,kde=False,ax=ax)\n", + "s=.1/np.log(1.1)\n", + "j=ax.axvline(s)\n", + "t=ax.set_title(\"Critical value is {0:.2f}\".format(s))" + ] + }, + { + "cell_type": "code", + "execution_count": 116, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Fraction of draws from poisson(1.1) that lie to the left of the critical line: 0.3672\n", + "Fraction of draws from poisson(1) that lie to the right of the critical line: 0.3476\n" + ] + } + ], + "source": [ + "print('Fraction of draws from poisson(1.1) that lie to the left of the critical line:',len([x for x in sample_means_2 if x<s])/len(sample_means_2))\n", + "print('Fraction of draws from poisson(1) that lie to the right of the critical line:',len([x for x in sample_means_1 if x>s])/len(sample_means_1))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this case, the significance level is about $35\\%$. And if we draw the line so that the chance of falsely rejecting the null hypothesis is only $5\\%$, then the chance of correctly choosing the alternative will be low. 
This is an example of an experiment with **low power** -- there's not enough information to distinguish the two cases." + ] + }, + { + "cell_type": "code", + "execution_count": 117, + "metadata": {}, + "outputs": [], + "source": [ + "from scipy.stats import uniform\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also look at the uniform distribution. Here we look at the maximum of 100 draws from a uniform distribution from 0 to 1, and from 0 to 1.01. " + ] + }, + { + "cell_type": "code", + "execution_count": 130, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "j=sns.distplot([uniform(0,1).rvs(100).max() for i in range(1000)],kde=False)\n", + "j=sns.distplot([uniform(0,1.01).rvs(100).max() for i in range(1000)],kde=False,ax=j)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "So suppose we have a sample $x$ of $100$ numbers and we want to use it to test whether it comes from the uniform distribution on $[0,1]$ (null hypothesis) or from the uniform distribution on $[0,1.01]$ (alternative hypothesis).\n", + "\n", + "The probability density for $[0,1]$ can be expressed in terms of the maximum $T$ and the minimum $M$ of the sample $x$. We have $P(T<r)=r^{100}$ for $0\\le r\\le 1$, and $P(T<r)=1$ for $r>1$. The associated density is the derivative of this, $p(r)=100r^{99}$, but is zero outside the interval.\n", + "\n", + "Similarly we have $p_{1.01}(r)=100r^{99}/(1.01)^{100}$ and zero outside $[0,1.01]$.\n", + "\n", + "So the likelihood ratio is\n", + "$$\\frac{p_{1.01}(r)}{p(r)}=(1.01)^{-100}$$\n", + "for $0\\le r\\le 1$ and is $\\infty$ between $1$ and $1.01$.\n", + "\n", + "What this means is that if we get a sample with $r>1$ then we should obviously reject the null hypothesis. If we use $r=1$ as our criterion, then we will never falsely reject the null hypothesis, but since $(1/(1.01))^{100}=.37$,\n", + "we will falsely accept it $37\\%$ of the time when the alternative is true. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 132, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.3697112123291189" + ] + }, + "execution_count": 132, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "(1/1.01)**100" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.6" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} From 62ca64acb75f6ef4c6632ec3050eab32e7f425fc Mon Sep 17 00:00:00 2001 From: Jeremy Teitelbaum Date: Mon, 15 Oct 2018 11:19:39 -0400 Subject: [PATCH 4/4] changes to 5.9.8 --- BDA 5.9.8.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/BDA 5.9.8.ipynb b/BDA 5.9.8.ipynb index 9f6a2e1..c3ad5cd 100644 --- a/BDA 5.9.8.ipynb +++ b/BDA 5.9.8.ipynb @@ -88,7 +88,7 @@ "$$\n", "\\theta\\sim N(\\mu_1,\\tau_1^2)\n", "$$\n", - "where $\\tau_1^2=1/(1/.1+1/.25)=1/14$ and $\\mu_1=(-.25/.1\\pm 1/.25)/14$ giving $6.5$ or $1.5$. \n", + "where $\\tau_1^2=1/(1/.1+1/.25)=1/14$ and $\\mu_1=(-.25/.1\\pm 1/.25)/14$ giving $-.46$ and $.11$. \n", "and $p_{m}(\\{y_{i}\\})=N(-.25,\\pm 1, .1+.25)$.\n", "\n", "\n"
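The corrected posterior means in this last hunk ($-.46$ and $.11$) and the mixture weights printed in the 5.9.8 notebook can be re-derived with a few lines of plain Python. This is only a sketch for checking the arithmetic, not part of the patch. The helper names `normal_pdf` and `posterior_component` are hypothetical and are not the notebook's own `posterior` function, whose body does not appear in this series. It assumes, as the notebook text does, $\overline{y}=-.25$, observation variance $1$, $N=10$, prior means $\pm 1$, prior variance $.25$, and prior weights $.1$ and $.9$.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x (var is a variance, not a standard deviation)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_component(prior_mean, prior_var, ybar, obs_var, n):
    """Conjugate normal update for one mixture component (hypothetical helper).

    Returns the posterior mean and variance of theta, plus the marginal
    density of ybar under this component, N(ybar | prior_mean, prior_var + obs_var/n).
    """
    se2 = obs_var / n                               # sigma^2 in the notes: variance of ybar
    post_var = 1.0 / (1.0 / prior_var + 1.0 / se2)  # tau_1^2
    post_mean = post_var * (prior_mean / prior_var + ybar / se2)  # mu_1
    marginal = normal_pdf(ybar, prior_mean, prior_var + se2)
    return post_mean, post_var, marginal


ybar, obs_var, n = -0.25, 1.0, 10
lambda1, lambda2 = 0.1, 0.9

post_mean1, post_var1, p1y = posterior_component(-1.0, 0.25, ybar, obs_var, n)
post_mean2, post_var2, p2y = posterior_component(1.0, 0.25, ybar, obs_var, n)

# Posterior mixture weights are proportional to lambda_m * p_m(y).
total = lambda1 * p1y + lambda2 * p2y
w1, w2 = lambda1 * p1y / total, lambda2 * p2y / total

print(post_mean1, post_mean2)  # posterior means of the two components
print(post_var1)               # shared posterior variance, 1/14
print(w1, w2)                  # posterior mixture weights
```

Under these assumptions the two means come out as $-6.5/14$ and $1.5/14$, which round to the $-.46$ and $.11$ in the corrected line. The $6.5$ and $1.5$ in the earlier text are the numerators before dividing by $14$, without the sign on the first.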