Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
1 contributor

Users who have contributed to this file

108 lines (94 sloc) 3.78 KB
# coding: utf-8
# #### Problem 3.10.8
#
# Analysis of proportions: a survey was done of bicycle and
# other vehicular traffic in the neighborhood of the campus of the
# University of California, Berkeley, in the spring of 1993.
# Sixty city blocks were selected at random; each block was observed
# for one hour, and the numbers of bicycles and other vehicles traveling
# along that block were recorded. The sampling was stratified into six
# types of city blocks: busy, fairly busy, and residential streets, with
# and without bike routes, with ten blocks measured in each stratum.
# Table 3.3 displays the number of bicycles and other vehicles
# recorded in the study. For this problem, restrict your attention
# to the first four rows of the table: the data on residential streets.
#
# (a) Let $y_1$ , . . . , $y_{10}$ and $z_1$ , . . . , $z_8$ be the
# observed proportion of traffic that was on bicycles in the residential
# streets with bike lanes and with no bike lanes, respectively
# (so $y_1 = 16/(16 + 58)$ and $z_1 = 12/(12 + 113)$, for example).
# Set up a model so that the $y_i$ ’s are independent and identically
# distributed given parameters $\theta_y$ and the $z_i$ ’s are
# independent and identically distributed given parameters $\theta_z$ .
#
# (b) Set up a prior distribution that is independent in
# $\theta_y$ and $\theta_z$ .
#
# (c) Determine the posterior distribution for the parameters
# in your model and draw 1000 simulations from the posterior distribution.
# (Hint: $\theta_y$ and $\theta_z$ are independent in the posterior
# distribution, so they can be simulated independently.)
#
# (d) Let $\mu_y = E(y_i |\theta_y )$ be the mean of the distribution
# of the $y_i$ ’s; $\mu_y$ will be a function of $\theta_y$.
# Similarly, define $\mu_z$ . Using your posterior simulations from (c),
# plot a histogram of the posterior simulations of $\mu_y-\mu_z$, the
# expected difference in proportions in bicycle traffic on residential
# streets with and without bike lanes. We return to this example in
# Exercise 5.13.
#
# Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.;
# Vehtari, Aki; Rubin, Donald B.. Bayesian Data Analysis,
# Third Edition (Chapman & Hall/CRC Texts in Statistical Science) (Page 81).
# CRC Press. Kindle Edition.
#
# #### Data
# |Type |Bike lane? |Counts of Bikes/others|
# |--- |----------|----|
# |Residential |yes |16/58, 9/90, 10/48, 13/57, 19/103, 20/57, 18/86, 17/112, 35/273, 55/64 |
# |Residential |no |12/113, 1/18, 2/14, 4/44, 9/208, 7/67, 9/29, 8/154|
#
# Gelman, Andrew; Carlin, John B.; Stern, Hal S.;
# Dunson, David B.; Vehtari, Aki; Rubin, Donald B..
# Bayesian Data Analysis, Third Edition (
# Chapman & Hall/CRC Texts in Statistical Science)
# (Page 81). CRC Press. Kindle Edition.
# #### Probably best to first do 3.10.6
# For that problem see the reference
# [Raftery, 1988](https://www.stat.washington.edu/raftery/Research/PDF/bka1988.pdf)
import pystan
import numpy as np
import matplotlib.pyplot as plt
stan_code="""
data {
int<lower=1> N;
int bikes[N];
int others[N];
}
parameters {
real<lower=0> theta_b;
real<lower=0> theta_v;
}
model {
theta_b~uniform(0,100);
theta_v~uniform(0,100);
bikes~poisson(theta_b);
others~poisson(theta_v);
}
generated quantities {
real b_ppc;
real o_ppc;
real p ;
o_ppc=poisson_rng(theta_v);
b_ppc=poisson_rng(theta_b);
p=o_ppc/(o_ppc+b_ppc);
}
"""
sm=pystan.StanModel(model_code=stan_code)
fit=sm.sampling(data=dict({'N':10,'bikes':[16,9,10,13,19,20,18,17,35,55],'others':[58, 90, 48, 57, 103, 57, 86, 112, 273, 64] }))
print(fit.extract())
print(len(fit.extract()['b_ppc']))
fig,ax=plt.subplots(1,1)
ax.hist(fit.extract()['b_ppc'],density=True)
#ax[1].hist(bikes,density=True)
plt.show()
You can’t perform that action at this time.