diff --git a/lecture_16/lecture_16.ipynb b/lecture_16/lecture_16.ipynb
index 7ab08d2..206e84c 100644
--- a/lecture_16/lecture_16.ipynb
+++ b/lecture_16/lecture_16.ipynb
@@ -20644,7 +20644,7 @@
},
{
"cell_type": "code",
- "execution_count": 123,
+ "execution_count": 200,
"metadata": {
"collapsed": false
},
@@ -20793,12 +20793,20 @@
"\n",
"\n",
"\n",
- "\tgnuplot_plot_1a\n",
- "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\t\n",
+ "\tdata\n",
+ "\n",
"\n",
"\n",
- "\t \n",
- "\t\n",
+ "\t\n",
+ "\t\tdata\n",
+ "\t\n",
+ "\n",
+ "\n",
+ "\t\n",
"\t\n",
- "\tgnuplot_plot_2a\n",
+ "\tbest-fit\n",
+ "\n",
+ "\t\n",
+ "\t\tbest-fit\n",
+ "\t\n",
+ "\n",
"\n",
- "\t\n",
+ "\t\n",
"\t\n",
"\n",
"\n",
@@ -20851,6 +20865,7 @@
"a=A\\b\n",
"\n",
"plot(x,y,'o',x,a(1)+a(2)*x)\n",
+ "legend('data','best-fit','Location','NorthWest')\n",
"xlabel('Force (N)')\n",
"ylabel('velocity (m/s)')"
]
diff --git a/lecture_17/.ipynb_checkpoints/lecture_16-checkpoint.ipynb b/lecture_17/.ipynb_checkpoints/lecture_16-checkpoint.ipynb
new file mode 100644
index 0000000..37443ae
--- /dev/null
+++ b/lecture_17/.ipynb_checkpoints/lecture_16-checkpoint.ipynb
@@ -0,0 +1,754 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 171,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "setdefaults"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 172,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "%plot --format svg"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "### Project Ideas so far\n",
+ "\n",
+ "- Nothing yet...probably something heat transfer related\n",
+ "\n",
+ "- Modeling Propulsion or Propagation of Sound Waves\n",
+ "\n",
+ "- Low Thrust Orbital Transfer\n",
+ "\n",
+ "- Tracking motion of a satellite entering orbit until impact\n",
+ "\n",
+ "- What ever you think is best.\n",
+ "\n",
+ "- You had heat transfer project as an option; that sounded cool\n",
+ "\n",
+ "- Heat transfer through a pipe\n",
+ "\n",
+ "- I would prefer to do something with beam/plate mechanics or vibrations than a heat transfer or thermo problem\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Questions from you:\n",
+ "\n",
+ "- Is attempting to divide by zero an acceptable project idea?\n",
+ "\n",
+ "- Would it be alright if we worked in a group of 4?\n",
+ "\n",
+ "- What are acceptable project topics?\n",
+ "\n",
+ "- How do the exams look? \n",
+ "\n",
+ "- Is there no pdf for the lecture today?\n",
+ "\n",
+ "- Thank you for making the formatted lectures available!\n",
+ "\n",
+ "- did you do anything cool over spring break?\n",
+ "\n",
+ "- Could we have a group of 4? We don't want to have to choose which one of us is on their own.\n",
+ "\n",
+ "- In HW 5 should there be 4 vectors as an input?\n",
+ "\n",
+ "- Would it be possible for me to join a group of 3? I seem to be the odd man out in two 3 member groups that my friends are in."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Linear Least Squares Regression\n",
+ "\n",
+ "When approximating a set of data as a polynomial, there will always be error introduced (except in 2 cases). \n",
+ "\n",
+ "For a straight line, the actual data points, $y_{i}$ as a function of the independent variable, $x_{i}$, is:\n",
+ "\n",
+ "$y_{i}=a_{0}+a_{1}x_{i}+e_{i}$\n",
+ "\n",
+ "where $a_{0}$ and $a_{1}$ are the intercept and slope of the line and $e_{i}$ is the error between the approximate function and the recorded data point. \n",
+ "\n",
+ "We make the following assumptions in this analysis:\n",
+ "\n",
+ "1. Each x has a fixed value; it is not random and is known without error.\n",
+ "\n",
+ "2. The y values are independent random variables and all have the same variance.\n",
+ "\n",
+ "3. The y values for a given x must be normally distributed.\n",
+ "\n",
+ "The total error is \n",
+ "\n",
+ "$\\sum_{i=1}^{n}e_{i}=\\sum_{i=1}^{n}(y_{i}-a_{0}-a_{1}x_{i})$\n",
+ "\n",
+ "we don't care about the sign though. One approach has been demonstrated to provide a unique solution is minimizing the sum of squares error or\n",
+ "\n",
+ "$S_{r}=\\sum_{i=1}^{n}e_{i}^{2}=\\sum_{i=1}^{n}(y_{i}-a_{0}-a_{1}x_{i})^{2}$\n",
+ "\n",
+ "Where, $S_{r}$ is the sum of squares error (SSE). \n",
+ "\n",
+ "$\\frac{\\partial S_{r}}{\\partial a_{0}}=-2\\sum(y_{i}-a_{0}-a_{1}x_{i})$\n",
+ "\n",
+ "$\\frac{\\partial S_{r}}{\\partial a_{1}}=-2\\sum(y_{i}-a_{0}-a_{1}x_{i})x_{i}$\n",
+ "\n",
+ "The minimum $S_{r}$ occurrs when the partial derivatives are 0. \n",
+ "\n",
+ "$\\sum y_{i}= \\sum a_{0}+\\sum a_{1}x_{i}$\n",
+ "\n",
+ "$\\sum x_{i}y_{i}= \\sum a_{0}x_{i}+\\sum a_{1}x_{i}^{2}$\n",
+ "\n",
+ "$\\left[\\begin{array}{c}\n",
+ "\\sum y_{i}\\\\\n",
+ "\\sum x_{i}y_{i}\\end{array}\\right]=\n",
+ "\\left[\\begin{array}{cc}\n",
+ "n & \\sum x_{i}\\\\\n",
+ "\\sum x_{i} & \\sum x_{i}^{2}\\end{array}\\right]\n",
+ "\\left[\\begin{array}{c}\n",
+ "a_{0}\\\\\n",
+ "a_{1}\\end{array}\\right]$\n",
+ "\n",
+ "\n",
+ "$b=Ax$\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Example \n",
+ "\n",
+ "Find drag coefficient with best-fit line to experimental data\n",
+ "\n",
+ "|i | v (m/s) | F (N) |\n",
+ "|---|---|---|\n",
+ "|1 | 10 | 25 |\n",
+ "|2 | 20 | 70 |\n",
+ "|3 | 30 | 380|\n",
+ "|4 | 40 | 550|\n",
+ "|5 | 50 | 610|\n",
+ "|6 | 60 | 1220|\n",
+ "|7 | 70 | 830 |\n",
+ "|8 |80 | 1450|"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 200,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "a =\n",
+ "\n",
+ " -234.286\n",
+ " 19.470\n",
+ "\n"
+ ]
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "drag_data=[...\n",
+ "1 , 10 , 25 \n",
+ "2 , 20 , 70 \n",
+ "3 , 30 , 380\n",
+ "4 , 40 , 550\n",
+ "5 , 50 , 610\n",
+ "6 , 60 , 1220\n",
+ "7 , 70 , 830 \n",
+ "8 ,80 , 1450];\n",
+ "x=drag_data(:,2);\n",
+ "y=drag_data(:,3);\n",
+ "\n",
+ "b=[sum(y);sum(x.*y)];\n",
+ "A=[length(x),sum(x);\n",
+ " sum(x), sum(x.^2)];\n",
+ " \n",
+ "a=A\\b\n",
+ "\n",
+ "plot(x,y,'o',x,a(1)+a(2)*x)\n",
+ "legend('data','best-fit','Location','NorthWest')\n",
+ "xlabel('Force (N)')\n",
+ "ylabel('velocity (m/s)')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "How do we know its a \"good\" fit? \n",
+ "\n",
+ "Can compare the sum of squares error to the total sum of squares of the dependent variable (here F). \n",
+ "\n",
+ "$S_{r}=\\sum(y_{i}-a_{0}-a_{1}x_{i})^{2}$\n",
+ "\n",
+ "$S_{t}=\\sum(y_{i}-\\bar{y})^{2}$\n",
+ "\n",
+ "Then, we can calculate the *coefficient of determination*, $r^{2}$ or *correlation coefficient*, r. \n",
+ "\n",
+ "$r^{2}=\\frac{S_{t}-S_{r}}{S_{t}}$\n",
+ "\n",
+ "This represents the relative improvement of assuming that y is a function of x (if the x-values are not random and the y-values are random)\n",
+ "\n",
+ "For further information regarding statistical work on regression, look at \n",
+ "[NIST Statistics Handbook](http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd44.htm)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 128,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "St = 1.8083e+06\n",
+ "St = 1.8083e+06\n"
+ ]
+ }
+ ],
+ "source": [
+ "Sr=sum((y-a(1)-a(2)*x).^2);\n",
+ "St=std(y)^2*(length(y)-1)\n",
+ "St=sum((y-mean(y)).^2)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 130,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "r2 = 0.88049\n",
+ "r = 0.93834\n"
+ ]
+ }
+ ],
+ "source": [
+ "r2=(St-Sr)/St\n",
+ "r=sqrt(r2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Limiting cases \n",
+ "\n",
+ "#### $r^{2}=0$ $S_{r}=S{t}$\n",
+ "\n",
+ "#### $r^{2}=1$ $S_{r}=0$"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 152,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "gen_examples\n",
+ "plot(x1(1:50:end),y1(1:50:end),'s',x2,y2,'o')\n",
+ "legend('Case 1','Case 2','Location','NorthWest')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 157,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "a1 =\n",
+ "\n",
+ " 0.0497269\n",
+ " 0.0016818\n",
+ "\n",
+ "a2 =\n",
+ "\n",
+ " 0\n",
+ " 1\n",
+ "\n",
+ "Sr1 = 0.82607\n",
+ "St1 = 0.82631\n",
+ "coefficient of determination in Case 1 is 0.000286\n",
+ "Sr2 = 0\n",
+ "St2 = 1.0185\n",
+ "coefficient of determination in Case 2 is 1.000000\n"
+ ]
+ }
+ ],
+ "source": [
+ "b1=[sum(y1);sum(x1.*y1)];\n",
+ "A1=[length(x1),sum(x1);\n",
+ " sum(x1), sum(x1.^2)];\n",
+ " \n",
+ "a1=A1\\b1\n",
+ "\n",
+ "b2=[sum(y2);sum(x2.*y2)];\n",
+ "A2=[length(x2),sum(x2);\n",
+ " sum(x2), sum(x2.^2)];\n",
+ " \n",
+ "a2=A2\\b2\n",
+ "\n",
+ "Sr1=sum((y1-a1(1)-a1(2)*x1).^2)\n",
+ "St1=sum((y1-mean(y1)).^2)\n",
+ "fprintf('coefficient of determination in Case 1 is %f\\n',1-Sr1/St1)\n",
+ "\n",
+ "Sr2=sum((y2-a2(1)-a2(2)*x2).^2)\n",
+ "St2=sum((y2-mean(y2)).^2)\n",
+ "\n",
+ "fprintf('coefficient of determination in Case 2 is %f\\n',1-Sr2/St2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "collapsed": true
+ },
+ "source": [
+ "# General Regression (Linear and nonlinear)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Octave",
+ "language": "octave",
+ "name": "octave"
+ },
+ "language_info": {
+ "file_extension": ".m",
+ "help_links": [
+ {
+ "text": "MetaKernel Magics",
+ "url": "https://github.com/calysto/metakernel/blob/master/metakernel/magics/README.md"
+ }
+ ],
+ "mimetype": "text/x-octave",
+ "name": "octave",
+ "version": "0.19.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/lecture_17/.ipynb_checkpoints/lecture_17-checkpoint.ipynb b/lecture_17/.ipynb_checkpoints/lecture_17-checkpoint.ipynb
new file mode 100644
index 0000000..2fd6442
--- /dev/null
+++ b/lecture_17/.ipynb_checkpoints/lecture_17-checkpoint.ipynb
@@ -0,0 +1,6 @@
+{
+ "cells": [],
+ "metadata": {},
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/lecture_17/gen_examples.m b/lecture_17/gen_examples.m
new file mode 100644
index 0000000..c5d6890
--- /dev/null
+++ b/lecture_17/gen_examples.m
@@ -0,0 +1,5 @@
+x1=linspace(0,1,1000)';
+y1=0*x1+0.1*rand(length(x1),1);
+
+x2=linspace(0,1,10)';
+y2=1*x2;
diff --git a/lecture_17/in-class_regression.pdf b/lecture_17/in-class_regression.pdf
new file mode 100644
index 0000000..628b495
Binary files /dev/null and b/lecture_17/in-class_regression.pdf differ
diff --git a/lecture_17/lecture_17.ipynb b/lecture_17/lecture_17.ipynb
new file mode 100644
index 0000000..35e61da
--- /dev/null
+++ b/lecture_17/lecture_17.ipynb
@@ -0,0 +1,2710 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "setdefaults"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "%plot --format svg"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "### Project Ideas so far\n",
+ "\n",
+ "- Nothing yet...probably something heat transfer related\n",
+ "\n",
+ "- Modeling Propulsion or Propagation of Sound Waves\n",
+ "\n",
+ "- Low Thrust Orbital Transfer\n",
+ "\n",
+ "- Tracking motion of a satellite entering orbit until impact\n",
+ "\n",
+ "- What ever you think is best.\n",
+ "\n",
+ "- You had heat transfer project as an option; that sounded cool\n",
+ "\n",
+ "- Heat transfer through a pipe\n",
+ "\n",
+ "- I would prefer to do something with beam/plate mechanics or vibrations than a heat transfer or thermo problem\n",
+ "\n",
+ "- Modeling Couette flow with a pressure gradient using a discretized form of the Navier-Stokes equation and comparing it to the analytical solution\n",
+ "\n",
+ "- Software to instruct a robotic arm to orient itself based on input from a gyroscope"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Questions from you:\n",
+ "\n",
+ "How was your spring break\n",
+ "\n",
+ "Are grades for hw 3 and 4 going to be posted?\n",
+ "\n",
+ "For class occasionally switching to doc cam and working through problems by hand might help get ideas across.\n",
+ "\n",
+ "Are you assigning those who do not have groups to groups?\n",
+ "\n",
+ "How do you graph a standard distribution in Matlab?\n",
+ "\n",
+ "What is the longest code you have written?\n",
+ "\n",
+ "how to approach probability for hw 5?\n",
+ "??\n",
+ "\n",
+ "Will you be assigning groups to people who do not currently have one? \n",
+ "\n",
+ "what are some basic guidelines we should use to brainstorm project ideas?\n",
+ "\n",
+ "Are you a fan of bananas?\n",
+ "\n",
+ "Going through code isn't the most helpful, because you can easily lose interest. But I am not sure what else you can do.\n",
+ "\n",
+ "Has lecture 15 been posted yet? Looking in the repository I can't seem to find it."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Linear Least Squares Regression\n",
+ "\n",
+ "When approximating a set of data as a polynomial, there will always be error introduced (except in 2 cases). \n",
+ "\n",
+ "For a straight line, the actual data points, $y_{i}$ as a function of the independent variable, $x_{i}$, is:\n",
+ "\n",
+ "$y_{i}=a_{0}+a_{1}x_{i}+e_{i}$\n",
+ "\n",
+ "where $a_{0}$ and $a_{1}$ are the intercept and slope of the line and $e_{i}$ is the error between the approximate function and the recorded data point. \n",
+ "\n",
+ "We make the following assumptions in this analysis:\n",
+ "\n",
+ "1. Each x has a fixed value; it is not random and is known without error.\n",
+ "\n",
+ "2. The y values are independent random variables and all have the same variance.\n",
+ "\n",
+ "3. The y values for a given x must be normally distributed.\n",
+ "\n",
+ "The total error is \n",
+ "\n",
+ "$\\sum_{i=1}^{n}e_{i}=\\sum_{i=1}^{n}(y_{i}-a_{0}-a_{1}x_{i})$\n",
+ "\n",
+ "we don't care about the sign though. One approach has been demonstrated to provide a unique solution is minimizing the sum of squares error or\n",
+ "\n",
+ "$S_{r}=\\sum_{i=1}^{n}e_{i}^{2}=\\sum_{i=1}^{n}(y_{i}-a_{0}-a_{1}x_{i})^{2}$\n",
+ "\n",
+ "Where, $S_{r}$ is the sum of squares error (SSE). \n",
+ "\n",
+ "$\\frac{\\partial S_{r}}{\\partial a_{0}}=-2\\sum(y_{i}-a_{0}-a_{1}x_{i})$\n",
+ "\n",
+ "$\\frac{\\partial S_{r}}{\\partial a_{1}}=-2\\sum(y_{i}-a_{0}-a_{1}x_{i})x_{i}$\n",
+ "\n",
+ "The minimum $S_{r}$ occurrs when the partial derivatives are 0. \n",
+ "\n",
+ "$\\sum y_{i}= \\sum a_{0}+\\sum a_{1}x_{i}$\n",
+ "\n",
+ "$\\sum x_{i}y_{i}= \\sum a_{0}x_{i}+\\sum a_{1}x_{i}^{2}$\n",
+ "\n",
+ "$\\left[\\begin{array}{c}\n",
+ "\\sum y_{i}\\\\\n",
+ "\\sum x_{i}y_{i}\\end{array}\\right]=\n",
+ "\\left[\\begin{array}{cc}\n",
+ "n & \\sum x_{i}\\\\\n",
+ "\\sum x_{i} & \\sum x_{i}^{2}\\end{array}\\right]\n",
+ "\\left[\\begin{array}{c}\n",
+ "a_{0}\\\\\n",
+ "a_{1}\\end{array}\\right]$\n",
+ "\n",
+ "\n",
+ "$b=Ax$\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Example \n",
+ "\n",
+ "Find drag coefficient with best-fit line to experimental data\n",
+ "\n",
+ "|i | v (m/s) | F (N) |\n",
+ "|---|---|---|\n",
+ "|1 | 10 | 25 |\n",
+ "|2 | 20 | 70 |\n",
+ "|3 | 30 | 380|\n",
+ "|4 | 40 | 550|\n",
+ "|5 | 50 | 610|\n",
+ "|6 | 60 | 1220|\n",
+ "|7 | 70 | 830 |\n",
+ "|8 |80 | 1450|"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 68,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "a =\n",
+ "\n",
+ " -234.286\n",
+ " 19.470\n",
+ "\n"
+ ]
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "drag_data=[...\n",
+ "1 , 10 , 25 \n",
+ "2 , 20 , 70 \n",
+ "3 , 30 , 380\n",
+ "4 , 40 , 550\n",
+ "5 , 50 , 610\n",
+ "6 , 60 , 1220\n",
+ "7 , 70 , 830 \n",
+ "8 ,80 , 1450];\n",
+ "x=drag_data(:,2);\n",
+ "y=drag_data(:,3);\n",
+ "\n",
+ "b=[sum(y);sum(x.*y)];\n",
+ "A=[length(x),sum(x);\n",
+ " sum(x), sum(x.^2)];\n",
+ " \n",
+ "a=A\\b\n",
+ "\n",
+ "plot(x,y,'o',x,a(1)+a(2)*x)\n",
+ "legend('data','best-fit','Location','NorthWest')\n",
+ "xlabel('Force (N)')\n",
+ "ylabel('velocity (m/s)')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 72,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "plot(x,y-a(1)-40*x)\n",
+ "legend('data','best-fit','Location','NorthWest')\n",
+ "ylabel('residuals (N)')\n",
+ "xlabel('velocity (m/s)')\n",
+ "title('Model does not capture measurements')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "How do we know its a \"good\" fit? \n",
+ "\n",
+ "Can compare the sum of squares error to the total sum of squares of the dependent variable (here F). \n",
+ "\n",
+ "$S_{r}=\\sum(y_{i}-a_{0}-a_{1}x_{i})^{2}$\n",
+ "\n",
+ "$S_{t}=\\sum(y_{i}-\\bar{y})^{2}$\n",
+ "\n",
+ "Then, we can calculate the *coefficient of determination*, $r^{2}$ or *correlation coefficient*, r. \n",
+ "\n",
+ "$r^{2}=\\frac{S_{t}-S_{r}}{S_{t}}$\n",
+ "\n",
+ "This represents the relative improvement of assuming that y is a function of x (if the x-values are not random and the y-values are random)\n",
+ "\n",
+ "For further information regarding statistical work on regression, look at \n",
+ "[NIST Statistics Handbook](http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd44.htm)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 73,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "St = 1.8083e+06\n",
+ "St = 1.8083e+06\n"
+ ]
+ }
+ ],
+ "source": [
+ "Sr=sum((y-a(1)-a(2)*x).^2);\n",
+ "St=std(y)^2*(length(y)-1)\n",
+ "St=sum((y-mean(y)).^2)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 74,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "r2 = 0.88049\n",
+ "r = 0.93834\n"
+ ]
+ }
+ ],
+ "source": [
+ "r2=(St-Sr)/St\n",
+ "r=sqrt(r2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Limiting cases \n",
+ "\n",
+ "#### $r^{2}=0$ $S_{r}=S{t}$\n",
+ "\n",
+ "#### $r^{2}=1$ $S_{r}=0$"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 76,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "gen_examples\n",
+ "plot(x1(1:50:end),y1(1:50:end),'s',x2,y2,'o')\n",
+ "legend('Case 1','Case 2','Location','NorthWest')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 77,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "a1 =\n",
+ "\n",
+ " 0.0482496\n",
+ " 0.0017062\n",
+ "\n",
+ "a2 =\n",
+ "\n",
+ " 0\n",
+ " 1\n",
+ "\n",
+ "Sr1 = 0.83505\n",
+ "St1 = 0.83529\n",
+ "coefficient of determination in Case 1 is 0.000291\n",
+ "Sr2 = 0\n",
+ "St2 = 1.0185\n",
+ "coefficient of determination in Case 2 is 1.000000\n"
+ ]
+ }
+ ],
+ "source": [
+ "b1=[sum(y1);sum(x1.*y1)];\n",
+ "A1=[length(x1),sum(x1);\n",
+ " sum(x1), sum(x1.^2)];\n",
+ " \n",
+ "a1=A1\\b1\n",
+ "\n",
+ "b2=[sum(y2);sum(x2.*y2)];\n",
+ "A2=[length(x2),sum(x2);\n",
+ " sum(x2), sum(x2.^2)];\n",
+ " \n",
+ "a2=A2\\b2\n",
+ "\n",
+ "Sr1=sum((y1-a1(1)-a1(2)*x1).^2)\n",
+ "St1=sum((y1-mean(y1)).^2)\n",
+ "fprintf('coefficient of determination in Case 1 is %f\\n',1-Sr1/St1)\n",
+ "\n",
+ "Sr2=sum((y2-a2(1)-a2(2)*x2).^2)\n",
+ "St2=sum((y2-mean(y2)).^2)\n",
+ "\n",
+ "fprintf('coefficient of determination in Case 2 is %f\\n',1-Sr2/St2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "collapsed": true
+ },
+ "source": [
+ "# General Linear Regression"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In general, you may want to fit other polynomials besides degree-1 (straight-lines)\n",
+ "\n",
+ "$y=a_{0}+a_{1}x+a_{2}x^{2}+\\cdots+a_{m}x^{m}+e$\n",
+ "\n",
+ "Now, the solution for $a_{0},~a_{1},...a_{m}$ is the minimization of m+1-dependent linear equations. \n",
+ "\n",
+ "Consider the following data:\n",
+ "\n",
+ "| x | y |\n",
+ "|---|---|\n",
+ "| 0.00 | 21.50 |\n",
+ "| 2.00 | 20.84 |\n",
+ "| 4.00 | 23.19 |\n",
+ "| 6.00 | 22.69 |\n",
+ "| 8.00 | 30.27 |\n",
+ "| 10.00 | 40.11 |\n",
+ "| 12.00 | 43.31 |\n",
+ "| 14.00 | 54.79 |\n",
+ "| 16.00 | 70.88 |\n",
+ "| 18.00 | 89.48 |\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 78,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "load xy_data.csv\n",
+ "x=xy_data(1,:)';\n",
+ "y=xy_data(2,:)';\n",
+ "plot(x,y,'o')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 46,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "xy_data =\n",
+ "\n",
+ " Columns 1 through 7:\n",
+ "\n",
+ " 0.00000 2.00000 4.00000 6.00000 8.00000 10.00000 12.00000\n",
+ " 21.50114 20.84153 23.19201 22.69498 30.26687 40.11075 43.30543\n",
+ "\n",
+ " Columns 8 through 11:\n",
+ "\n",
+ " 14.00000 16.00000 18.00000 20.00000\n",
+ " 54.78730 70.88443 89.48368 97.28135\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "xy_data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The model can be rewritten as \n",
+ "\n",
+ "$y=\\left[Z\\right]a+e$\n",
+ "\n",
+ "where $a=\\left[\\begin{array}{c}\n",
+ " a_{0}\\\\\n",
+ " a_{1}\\\\\n",
+ " a_{2}\\end{array}\\right]$\n",
+ " \n",
+ "$[Z]=\\left[\\begin{array} \n",
+ "1 & x_{1} & x_{1}^{2} \\\\\n",
+ "1 & x_{2} & x_{2}^{2} \\\\\n",
+ "1 & x_{3} & x_{3}^{2} \\\\\n",
+ "1 & x_{4} & x_{4}^{2} \\\\\n",
+ "1 & x_{5} & x_{5}^{2} \\\\\n",
+ "1 & x_{6} & x_{6}^{2} \\\\\n",
+ "1 & x_{7} & x_{7}^{2} \\\\\n",
+ "1 & x_{8} & x_{8}^{2} \\\\\n",
+ "1 & x_{9} & x_{9}^{2} \\\\\n",
+ "1 & x_{10} & x_{10}^{2} \\end{array}\\right]$\n",
+ "\n",
+ "The sum of squares residuals for this model is\n",
+ "\n",
+ "$S_{r}=\\sum_{i=1}^{n}\\left(y_{i}-\\sum_{j=0}^{m}a_{j}z_{ji}\\right)$\n",
+ "\n",
+ "Minimizing this function results in\n",
+ "\n",
+ "$y=[Z]a$\n",
+ "\n",
+ "->**A standard Linear Algebra Problem**\n",
+ "\n",
+ "*the vector a is unknown, and Z is calculated based upon the assumed function*\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 79,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Z =\n",
+ "\n",
+ " 1 0 0\n",
+ " 1 2 4\n",
+ " 1 4 16\n",
+ " 1 6 36\n",
+ " 1 8 64\n",
+ " 1 10 100\n",
+ " 1 12 144\n",
+ " 1 14 196\n",
+ " 1 16 256\n",
+ " 1 18 324\n",
+ " 1 20 400\n",
+ "\n",
+ "a =\n",
+ "\n",
+ " 21.40341\n",
+ " -0.81538\n",
+ " 0.23935\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "Z=[x.^0,x,x.^2]\n",
+ "\n",
+ "a=Z\\y"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 80,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "x_fcn=linspace(min(x),max(x));\n",
+ "plot(x,y,'o',x_fcn,a(1)+a(2)*x_fcn+a(3)*x_fcn.^2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### General Coefficient of Determination\n",
+ "\n",
+ "$r^{2}=\\frac{S_{t}-S_{r}}{S_{t}}=1-\\frac{S_{r}}{S_t}$"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 49,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "St = 27.923\n",
+ "Sr = 2.6366\n"
+ ]
+ }
+ ],
+ "source": [
+ "St=std(y)\n",
+ "Sr=std(y-a(1)-a(2)*x-a(3)*x.^2)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 81,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "the coefficient of determination for this fit is 0.880485\n",
+ "the correlation coefficient this fit is 0.938342\n"
+ ]
+ }
+ ],
+ "source": [
+ "r2=1-Sr/St;\n",
+ "r=sqrt(r2);\n",
+ "\n",
+ "fprintf('the coefficient of determination for this fit is %f',r2)\n",
+ "fprintf('the correlation coefficient this fit is %f',r)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Compare this to a straight line fit"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 83,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "St = 27.923\n",
+ "Sr = 9.2520\n",
+ "the coefficient of determination for this fit is 0.668655\n",
+ "the correlation coefficient this fit is 0.817713\n"
+ ]
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "Z=[ones(size(x)) x];\n",
+ "a_line=Z\\y;\n",
+ "x_fcn=linspace(min(x),max(x));\n",
+ "plot(x,y,'o',x_fcn,a(1)+a(2)*x_fcn+a(3)*x_fcn.^2,...\n",
+ "x_fcn,a_line(1)+a_line(2)*x_fcn)\n",
+ "St=std(y)\n",
+ "Sr=std(y-a_line(1)-a_line(2)*x)\n",
+ "r2=1-Sr/St;\n",
+ "r=sqrt(r2);\n",
+ "\n",
+ "fprintf('the coefficient of determination for this fit is %f',r2)\n",
+ "fprintf('the correlation coefficient this fit is %f',r)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 85,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "plot(x,y-a_line(1)-a_line(2)*x)\n",
+ "title('residuals of straight line')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 86,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "plot(x,y-a(1)-a(2)*x-a(3)*x.^2)\n",
+ "title('residuals of parabola')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Warning \n",
+ "**Coefficient of determination reduction does not always mean a better fit**\n",
+ "\n",
+ "Try the function, \n",
+ "\n",
+ "$y(x)=a_0+a_{1}x+a_{2}x^{2}+a_{4}x^{4}+a_{5}x^{5}+a_{5}x^{5}+a_{6}x^{6}+a_{7}x^{7}+a_{8}x^{8}$"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 90,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "a_overfit =\n",
+ "\n",
+ " 2.1487e+01\n",
+ " -1.4264e+01\n",
+ " 1.5240e+01\n",
+ " -6.0483e+00\n",
+ " 1.1887e+00\n",
+ " -1.2651e-01\n",
+ " 7.4379e-03\n",
+ " -2.2702e-04\n",
+ " 2.8063e-06\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "Z=[ones(size(x)) x x.^2 x.^3 x.^4 x.^5 x.^6 x.^7 x.^8];\n",
+ "a_overfit=Z\\y"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 91,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "plot(x,y,'o',x_fcn,a(1)+a(2)*x_fcn+a(3)*x_fcn.^2,...\n",
+ "x_fcn,a_overfit(1)+a_overfit(2)*x_fcn+a_overfit(3)*x_fcn.^2+...\n",
+ "a_overfit(4)*x_fcn.^3+a_overfit(5)*x_fcn.^4+...\n",
+ "a_overfit(6)*x_fcn.^5+a_overfit(7)*x_fcn.^6+...\n",
+ "a_overfit(8)*x_fcn.^7+a_overfit(9)*x_fcn.^8)\n",
+ "legend('data','parabola','n=8-fit','Location','NorthWest')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 92,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "St = 27.923\n",
+ "Sr = 0.77320\n",
+ "the coefficient of determination for this fit is 0.972309\n",
+ "the correlation coefficient this fit is 0.986057\n"
+ ]
+ }
+ ],
+ "source": [
+ "St=std(y)\n",
+ "Sr=std(y-polyval(a_overfit(end:-1:1),x))\n",
+ "r2=1-Sr/St;\n",
+ "r=sqrt(r2);\n",
+ "\n",
+ "fprintf('the coefficient of determination for this fit is %f',r2)\n",
+ "fprintf('the correlation coefficient this fit is %f',r)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Linear Regression is only limited by the ability to separate the parameters from the function to achieve\n",
+ "\n",
+ "$y=[Z]a$\n",
+ "\n",
+ "$Z$ can be any function of the independent variable(s)\n",
+ "\n",
+ "**Example**:\n",
+ "We assume two functions are added together, sin(t) and sin(3t). What are the amplitudes?\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "load sin_data.csv\n",
+ "t=sin_data(1,:)';\n",
+ "y=sin_data(2,:)';\n",
+ "plot(t,y)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "a =\n",
+ "\n",
+ " 0.99727\n",
+ " 0.50251\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "Z=[sin(t) sin(3*t)];\n",
+ "a=Z\\y"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "plot(t,y,'.',t,a(1)*sin(t)+a(2)*sin(3*t))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Octave",
+ "language": "octave",
+ "name": "octave"
+ },
+ "language_info": {
+ "file_extension": ".m",
+ "help_links": [
+ {
+ "text": "MetaKernel Magics",
+ "url": "https://github.com/calysto/metakernel/blob/master/metakernel/magics/README.md"
+ }
+ ],
+ "mimetype": "text/x-octave",
+ "name": "octave",
+ "version": "0.19.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/lecture_17/octave-workspace b/lecture_17/octave-workspace
new file mode 100644
index 0000000..52eed6b
Binary files /dev/null and b/lecture_17/octave-workspace differ
diff --git a/lecture_17/q1.png b/lecture_17/q1.png
new file mode 100644
index 0000000..7c89cbf
Binary files /dev/null and b/lecture_17/q1.png differ
diff --git a/lecture_17/q2.png b/lecture_17/q2.png
new file mode 100644
index 0000000..7eea25c
Binary files /dev/null and b/lecture_17/q2.png differ
diff --git a/lecture_17/sin_data.csv b/lecture_17/sin_data.csv
new file mode 100644
index 0000000..a5ff272
--- /dev/null
+++ b/lecture_17/sin_data.csv
@@ -0,0 +1,2 @@
+ 0.00000000e+00 1.26933037e-01 2.53866073e-01 3.80799110e-01 5.07732146e-01 6.34665183e-01 7.61598219e-01 8.88531256e-01 1.01546429e+00 1.14239733e+00 1.26933037e+00 1.39626340e+00 1.52319644e+00 1.65012947e+00 1.77706251e+00 1.90399555e+00 2.03092858e+00 2.15786162e+00 2.28479466e+00 2.41172769e+00 2.53866073e+00 2.66559377e+00 2.79252680e+00 2.91945984e+00 3.04639288e+00 3.17332591e+00 3.30025895e+00 3.42719199e+00 3.55412502e+00 3.68105806e+00 3.80799110e+00 3.93492413e+00 4.06185717e+00 4.18879020e+00 4.31572324e+00 4.44265628e+00 4.56958931e+00 4.69652235e+00 4.82345539e+00 4.95038842e+00 5.07732146e+00 5.20425450e+00 5.33118753e+00 5.45812057e+00 5.58505361e+00 5.71198664e+00 5.83891968e+00 5.96585272e+00 6.09278575e+00 6.21971879e+00 6.34665183e+00 6.47358486e+00 6.60051790e+00 6.72745093e+00 6.85438397e+00 6.98131701e+00 7.10825004e+00 7.23518308e+00 7.36211612e+00 7.48904915e+00 7.61598219e+00 7.74291523e+00 7.86984826e+00 7.99678130e+00 8.12371434e+00 8.25064737e+00 8.37758041e+00 8.50451345e+00 8.63144648e+00 8.75837952e+00 8.88531256e+00 9.01224559e+00 9.13917863e+00 9.26611167e+00 9.39304470e+00 9.51997774e+00 9.64691077e+00 9.77384381e+00 9.90077685e+00 1.00277099e+01 1.01546429e+01 1.02815760e+01 1.04085090e+01 1.05354420e+01 1.06623751e+01 1.07893081e+01 1.09162411e+01 1.10431742e+01 1.11701072e+01 1.12970402e+01 1.14239733e+01 1.15509063e+01 1.16778394e+01 1.18047724e+01 1.19317054e+01 1.20586385e+01 1.21855715e+01 1.23125045e+01 1.24394376e+01 1.25663706e+01
+ 9.15756288e-02 3.39393873e-01 6.28875306e-01 7.67713096e-01 1.05094584e+00 9.70887288e-01 9.84265740e-01 1.02589034e+00 8.53218113e-01 6.90197665e-01 5.51277193e-01 5.01564914e-01 5.25455797e-01 5.87052838e-01 5.41394658e-01 7.12365594e-01 8.14839678e-01 9.80181855e-01 9.44430709e-01 1.06728057e+00 1.15166322e+00 8.99464065e-01 7.77225453e-01 5.92618124e-01 3.08822183e-01 -1.07884730e-03 -3.46563271e-01 -5.64836023e-01 -8.11931510e-01 -1.05925186e+00 -1.13323611e+00 -1.11986890e+00 -8.88336727e-01 -9.54113139e-01 -6.81378679e-01 -6.02369117e-01 -4.78684439e-01 -5.88160325e-01 -4.93580777e-01 -5.68747320e-01 -7.51641934e-01 -8.14672884e-01 -9.53191554e-01 -9.55337518e-01 -9.85995556e-01 -9.63373597e-01 -1.01511061e+00 -7.56467517e-01 -4.17379564e-01 -1.22340361e-01 2.16273929e-01 5.16909714e-01 7.77031694e-01 1.00653798e+00 9.35718089e-01 1.00660116e+00 1.11177057e+00 9.85485116e-01 8.54344900e-01 6.26444042e-01 6.28124048e-01 4.27764254e-01 5.93991751e-01 4.79248018e-01 7.17522492e-01 7.35927848e-01 9.08802925e-01 9.38646871e-01 1.13125860e+00 1.07247935e+00 1.05198782e+00 9.41647332e-01 6.98801244e-01 4.03193543e-01 1.37009682e-01 -1.43203880e-01 -4.64369445e-01 -6.94978252e-01 -1.03483196e+00 -1.10261288e+00 -1.12892727e+00 -1.03902484e+00 -8.53573083e-01 -7.01815315e-01 -6.84745997e-01 -6.14189417e-01 -4.70090797e-01 -5.95052432e-01 -5.96497000e-01 -5.66861911e-01 -7.18239679e-01 -9.52873043e-01 -9.37512847e-01 -1.15782985e+00 -1.03858206e+00 -1.03182712e+00 -8.45121554e-01 -5.61821980e-01 -2.83427014e-01 -8.27056140e-02
diff --git a/lecture_17/xy_data.csv b/lecture_17/xy_data.csv
new file mode 100644
index 0000000..15b919d
--- /dev/null
+++ b/lecture_17/xy_data.csv
@@ -0,0 +1,2 @@
+ 0.00000000e+00 2.00000000e+00 4.00000000e+00 6.00000000e+00 8.00000000e+00 1.00000000e+01 1.20000000e+01 1.40000000e+01 1.60000000e+01 1.80000000e+01 2.00000000e+01
+ 2.15011412e+01 2.08415256e+01 2.31920098e+01 2.26949773e+01 3.02668745e+01 4.01107461e+01 4.33054255e+01 5.47873036e+01 7.08844291e+01 8.94836828e+01 9.72813533e+01
diff --git a/lecture_18/.Newtint.m.swp b/lecture_18/.Newtint.m.swp
new file mode 100644
index 0000000..769fe2b
Binary files /dev/null and b/lecture_18/.Newtint.m.swp differ
diff --git a/lecture_18/.ipynb_checkpoints/lecture_18-checkpoint.ipynb b/lecture_18/.ipynb_checkpoints/lecture_18-checkpoint.ipynb
new file mode 100644
index 0000000..2fd6442
--- /dev/null
+++ b/lecture_18/.ipynb_checkpoints/lecture_18-checkpoint.ipynb
@@ -0,0 +1,6 @@
+{
+ "cells": [],
+ "metadata": {},
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/lecture_18/Newtint.m b/lecture_18/Newtint.m
new file mode 100644
index 0000000..93e77f6
--- /dev/null
+++ b/lecture_18/Newtint.m
@@ -0,0 +1,33 @@
+function yint = Newtint_bak(x,y,xx)
+% Newtint: Newton interpolating polynomial
+% yint = Newtint(x,y,xx): Uses an (n - 1)-order Newton
+% interpolating polynomial based on n data points (x, y)
+% to determine a value of the dependent variable (yint)
+% at a given value of the independent variable, xx.
+% input:
+% x = independent variable
+% y = dependent variable
+% xx = value of independent variable at which
+% interpolation is calculated
+% output:
+% yint = interpolated value of dependent variable
+
+% compute the finite divided differences in the form of a
+% difference table
+n = length(x);
+if length(y)~=n, error('x and y must be same length'); end
+b = zeros(n,n);
+% assign dependent variables to the first column of b.
+b(:,1) = y(:); % the (:) ensures that y is a column vector.
+for j = 2:n
+ for i = 1:n-j+1
+ b(i,j) = (b(i+1,j-1)-b(i,j-1))/(x(i+j-1)-x(i));
+ end
+end
+% use the finite divided differences to interpolate
+xt = 1;
+yint = b(1,1);
+for j = 1:n-1
+ xt = xt*(xx-x(j));
+ yint = yint+b(1,j+1)*xt;
+end
diff --git a/lecture_18/challenger_oring.csv b/lecture_18/challenger_oring.csv
new file mode 100644
index 0000000..11d647e
--- /dev/null
+++ b/lecture_18/challenger_oring.csv
@@ -0,0 +1,24 @@
+Flight#,Temp,O-Ring Problem
+1,53,1
+2,57,1
+3,58,1
+4,63,1
+5,66,0
+6,66.8,0
+7,67,0
+8,67.2,0
+9,68,0
+10,69,0
+11,69.8,1
+12,69.8,0
+13,70.2,1
+14,70.2,0
+15,72,0
+16,73,0
+17,75,0
+18,75,1
+19,75.8,0
+20,76.2,0
+21,78,0
+22,79,0
+23,81,0
diff --git a/lecture_18/lecture_18.ipynb b/lecture_18/lecture_18.ipynb
new file mode 100644
index 0000000..10a9d0a
--- /dev/null
+++ b/lecture_18/lecture_18.ipynb
@@ -0,0 +1,1882 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "setdefaults"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "%plot --format svg"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Nonlinear Regression\n",
+ "\n",
+ "We can define any function and minimize the sum of squares error even if the constants cannot be separated.\n",
+ "\n",
+ "$S_{r}=\\left[y-f(z_{1},z_{2},...)\\right]^{2}$\n",
+ "\n",
+ "Consider the function, \n",
+ "\n",
+ "$f(x) = a_{0}(1-e^{a_{1}x})$\n",
+ "\n",
+ "We can define the sum of squares error as a function of $a_{0}$ and $a_{1}$:\n",
+ "\n",
+ "$f_{SSE}(a_{0},a_{1})=\\sum_{i=1}^{n}\\left[y- a_{0}(1-e^{a_{1}x})\\right]^{2}$"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "function [SSE,yhat] = sse_nonlin_exp(a,x,y)\n",
+ " % This is a sum of squares error function based on \n",
+ " % the two input constants a0 and a1 where a=[a0,a1]\n",
+ " % and the data is x (independent), y (dependent)\n",
+ " % and yhat is the model with the given a0 and a1 values\n",
+ " a0=a(1);\n",
+ " a1=a(2);\n",
+ " yhat=a0*(1-exp(a1*x));\n",
+ " SSE=sum((y-a0*(1-exp(a1*x))).^2);\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Where the data we are fitting is:\n",
+ "\n",
+ "| x | y |\n",
+ "|---|---|\n",
+ " | 0.0 | 0.41213|\n",
+ " | 1.0 | -2.65190|\n",
+ " | 2.0 | 15.04049|\n",
+ " | 3.0 | 5.19368|\n",
+ " | 4.0 | -0.71086|\n",
+ " | 5.0 | 12.69008|\n",
+ " | 6.0 | 29.20309|\n",
+ " | 7.0 | 58.68879|\n",
+ " | 8.0 | 91.61117|\n",
+ " | 9.0 | 173.75492|\n",
+ " | 10.0 | 259.04083|"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "data=[\n",
+ " 0.00000 0.41213\n",
+ " 1.00000 -2.65190\n",
+ " 2.00000 15.04049\n",
+ " 3.00000 5.19368\n",
+ " 4.00000 -0.71086\n",
+ " 5.00000 12.69008\n",
+ " 6.00000 29.20309\n",
+ " 7.00000 58.68879\n",
+ " 8.00000 91.61117\n",
+ " 9.00000 173.75492\n",
+ " 10.00000 259.04083\n",
+ "\n",
+ "];\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "the sum of squares for a0=-2.00 and a1=0.20 is 98118.4\n"
+ ]
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "[SSE,yhat]=sse_nonlin_exp([-2,0.2],data(:,1),data(:,2));\n",
+ "fprintf('the sum of squares for a0=%1.2f and a1=%1.2f is %1.1f',...\n",
+ "-2,0.2,SSE)\n",
+ "plot(data(:,1),data(:,2),'o',data(:,1),yhat)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "a =\n",
+ "\n",
+ " -1.71891 0.50449\n",
+ "\n",
+ "fsse = 633.70\n"
+ ]
+ }
+ ],
+ "source": [
+ "[a,fsse]=fminsearch(@(a) sse_nonlin_exp(a,data(:,1),data(:,2)),[-2,0.2])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "[sse,yhat]=sse_nonlin_exp(a,data(:,1),data(:,2));\n",
+ "plot(data(:,1),data(:,2),'o',data(:,1),yhat)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Case Study: Logistic Regression\n",
+ "\n",
+ "Many times the variable you predict is a binary (or discrete) value, such as pass/fail, broken/not-broken, etc. \n",
+ "\n",
+ "One method to fit this type of data is called [**logistic regression**](https://en.wikipedia.org/wiki/Logistic_regression).\n",
+ "\n",
+ "[Logistic Regression link 2](http://www.holehouse.org/mlclass/06_Logistic_Regression.html)\n",
+ "\n",
+ "We use a function that varies from 0 to 1 called a logistic function:\n",
+ "\n",
+ "$\\sigma(t)=\\frac{1}{1+e^{-t}}$\n",
+ "\n",
+ "We can use this function to describe the likelihood of failure (1) or success (0). When z=0, the probability of failure is 50%. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "t=linspace(-10,10);\n",
+ "sigma=@(t) 1./(1+exp(-t));\n",
+ "plot(t,sigma(t))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now we make the assumption that we can predict the boundary between the pass-fail criteria with a function of our independent variable e.g.\n",
+ "\n",
+ "$y=\\left\\{\\begin{array}{cc} \n",
+ "1 & a_{0}+a_{1}x +\\epsilon >0 \\\\\n",
+ "0 & else \\end{array} \\right\\}$\n",
+ "\n",
+ "so the logistic function is now:\n",
+ "\n",
+ "$\\sigma(x)=\\frac{1}{1+e^{-(a_{0}+a_{1}x)}}$\n",
+ "\n",
+ "Here, there is not a direct sum of squares error, so we minimize a cost function: \n",
+ "\n",
+ "$J(a_{0},a_{1})=\\sum_{i=1}^{n}\\left[-y_{i}\\log(\\sigma(x_{i}))-(1-y_{i})\\log((1-\\sigma(x_{i})))\\right]$\n",
+ "\n",
+ "y=0,1 \n",
+ "\n",
+ "So the cost function either sums the "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Example: Challenger O-ring failures\n",
+ "\n",
+ "The O-rings on the Challenger shuttles had problems when temperatures became low. We can look at the conditions when damage was observed to determine the likelihood of failure. \n",
+ "\n",
+ "[Challenger O-ring data powerpoint](https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0ahUKEwjZvL7jkP3SAhUp04MKHXkXDkMQFggcMAA&url=http%3A%2F%2Fwww.stat.ufl.edu%2F~winner%2Fcases%2Fchallenger.ppt&usg=AFQjCNFyjwT7NmRthDkDEgch75Fc5dc66w&sig2=_qeteX6-ZEBwPW8SZN1mIA)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "oring =\n",
+ "\n",
+ " 1.00000 53.00000 1.00000\n",
+ " 2.00000 57.00000 1.00000\n",
+ " 3.00000 58.00000 1.00000\n",
+ " 4.00000 63.00000 1.00000\n",
+ " 5.00000 66.00000 0.00000\n",
+ " 6.00000 66.80000 0.00000\n",
+ " 7.00000 67.00000 0.00000\n",
+ " 8.00000 67.20000 0.00000\n",
+ " 9.00000 68.00000 0.00000\n",
+ " 10.00000 69.00000 0.00000\n",
+ " 11.00000 69.80000 1.00000\n",
+ " 12.00000 69.80000 0.00000\n",
+ " 13.00000 70.20000 1.00000\n",
+ " 14.00000 70.20000 0.00000\n",
+ " 15.00000 72.00000 0.00000\n",
+ " 16.00000 73.00000 0.00000\n",
+ " 17.00000 75.00000 0.00000\n",
+ " 18.00000 75.00000 1.00000\n",
+ " 19.00000 75.80000 0.00000\n",
+ " 20.00000 76.20000 0.00000\n",
+ " 21.00000 78.00000 0.00000\n",
+ " 22.00000 79.00000 0.00000\n",
+ " 23.00000 81.00000 0.00000\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "% read data from csv file \n",
+ "% col 1 = index\n",
+ "% col 2 = temperature\n",
+ "% col 3 = 1 if damaged, 0 if undamaged\n",
+ "oring=dlmread('challenger_oring.csv',',',1,0)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "plot(oring(:,2),oring(:,3),'o')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 54,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": [
+ "function J=sse_logistic(a,x,y)\n",
+ " % Create function to calculate SSE of logistic function\n",
+ " % t = a0+a1*x\n",
+ " % sigma(t) = 1./(1+e^(-t))\n",
+ " sigma=@(t) 1./(1+exp(-t));\n",
+ " a0=a(1);\n",
+ " a1=a(2);\n",
+ " t=a0+a1*x;\n",
+ " J = 1/length(x)*sum(-y.*log(sigma(t))-(1-y).*log(1-sigma(t)));\n",
+ "end"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 61,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "J = 0.88822\n",
+ "a =\n",
+ "\n",
+ " 15.03501 -0.23205\n",
+ "\n"
+ ]
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "J=sse_logistic([10,-0.2],oring(:,2),oring(:,3))\n",
+ "a=fminsearch(@(a) sse_logistic(a,oring(:,2),oring(:,3)),[0,-3])\n",
+ "\n",
+ "T=linspace(50,85);\n",
+ "plot(oring(:,2),oring(:,3),'o',T,sigma(a(1)+a(2)*T))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 75,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "probability of failure when 70 degrees is 23.00% \n",
+ "probability of failure when 60 degrees is 75.25%\n"
+ ]
+ }
+ ],
+ "source": [
+ "fprintf('probability of failure when 70 degrees is %1.2f%% ',100*sigma(a(1)+a(2)*70))\n",
+ "fprintf('probability of failure when 60 degrees is %1.2f%%',100*sigma(a(1)+a(2)*60))\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Interpolation\n",
+ "\n",
+ "Using regression (linear and nonlinear) you are faced with the problem, that you have lots of noisy data and you want to fit a physical model to it. \n",
+ "\n",
+ "You can use interpolation to solve the opposite problem, you have a little data with very little noise.\n",
+ "\n",
+ "## Linear interpolation\n",
+ "\n",
+ "If you are trying to find the value of f(x) for x between $x_{1}$ and $x_{2}$, then you can match the slopes\n",
+ "\n",
+ "$\\frac{f(x)-f(x_{1})}{x-x_{1}}=\\frac{f(x_{2})-f(x_{1})}{x_{2}-x_{1}}$\n",
+ "\n",
+ "or\n",
+ "\n",
+ "$f(x)=f(x_{1})+(x-x_{1})\\frac{f(x_{2})-f(x_{1})}{x_{2}-x_{1}}$\n",
+ "\n",
+ "### Example: Logarithms\n",
+ "\n",
+ "Engineers used to have to use interpolation in logarithm tables for calculations. Find ln(2) from \n",
+ "\n",
+ "a. ln(1) and ln(6)\n",
+ "\n",
+ "b. ln(1) and ln(4)\n",
+ "\n",
+ "c. just calculate it as ln(2)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 84,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "ln(2)~0.358352\n",
+ "ln(2)~0.462098\n",
+ "ln(2)=0.693147\n"
+ ]
+ }
+ ],
+ "source": [
+ "ln2_16=log(1)+(log(6)-log(1))/(6-1)*(2-1);\n",
+ "fprintf('ln(2)~%f\\n',ln2_16)\n",
+ "ln2_14=log(1)+(log(4)-log(1))/(4-1)*(2-1);\n",
+ "fprintf('ln(2)~%f\\n',ln2_14)\n",
+ "ln2=log(2);\n",
+ "fprintf('ln(2)=%f\\n',ln2)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 87,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "x=linspace(1,6);\n",
+ "plot(x,log(x),2,log(2),'*',...\n",
+ "[1,2,6],[log(1),ln2_16,log(6)],'o-',...\n",
+ "[1,2,4],[log(1),ln2_14,log(4)],'s-')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Quadratic interpolation (intro curvature)\n",
+ "\n",
+ "Assume function is parabola between 3 points. The function is can be written as:\n",
+ "\n",
+ "$f_{2}(x)=b_{1}+b_{2}(x-x_{1})+b_{3}(x-x_{1})(x-x_{2})$\n",
+ "\n",
+ "When $x=x_{1}$\n",
+ "\n",
+ "$f(x_{1})=b_{1}$\n",
+ "\n",
+ "when $x=x_{2}$\n",
+ "\n",
+ "$b_{2}=\\frac{f(x_{2})-f(x_{1})}{x_{2}-x_{1}}$\n",
+ "\n",
+ "when $x=x_{3}$\n",
+ "\n",
+ "$b_{3}=\\frac{\\frac{f(x_{3})-f(x_{2})}{x_{3}-x_{2}}\n",
+ "-\\frac{f(x_{2})-f(x_{1})}{x_{2}-x_{1}}}{x_{3}-x_{1}}$\n",
+ "\n",
+ "#### Reexamining the ln(2) with ln(1), ln(4), and ln(6):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 89,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "b1 = 0\n",
+ "b2 = 0.46210\n",
+ "b3 = -0.051873\n"
+ ]
+ }
+ ],
+ "source": [
+ "x1=1;\n",
+ "x2=4;\n",
+ "x3=6;\n",
+ "f1=log(x1);\n",
+ "f2=log(x2);\n",
+ "f3=log(x3);\n",
+ "\n",
+ "b1=f1\n",
+ "b2=(f2-f1)/(x2-x1)\n",
+ "b3=(f3-f2)/(x3-x2)-b2;\n",
+ "b3=b3/(x3-x1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 91,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "x=linspace(1,6);\n",
+ "f=@(x) b1+b2*(x-x1)+b3*(x-x1).*(x-x2);\n",
+ "plot(x,log(x),2,log(2),'*',...\n",
+ "[1,4,6],[log(1),log(4),log(6)],'ro',...\n",
+ "x,f(x),'r-',2,f(2),'s')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Newton's Interpolating Polynomials\n",
+ "\n",
+ "For n-data points, we can fit an (n-1)th-polynomial\n",
+ "\n",
+ "$f_{n-1}(x)=b_{1}+b_{2}(x-x_{1})+\\cdots+b_{n}(x-x_{1})(x-x_{2})\\cdots(x-x_{n})$\n",
+ "\n",
+ "where \n",
+ "\n",
+ "$b_{1}=f(x_{1})$\n",
+ "\n",
+ "$b_{2}=\\frac{f(x_{2})-f(x_{1})}{x_{2}-x_{1}}$\n",
+ "\n",
+ "$b_{3}=\\frac{\\frac{f(x_{3})-f(x_{2})}{x_{3}-x_{2}}\n",
+ "-b_{2}}{x_{3}-x_{1}}$\n",
+ "\n",
+ "$\\vdots$\n",
+ "\n",
+ "$b_{n}=f(x_{n},x_{n-1},...,x_{2},x_{1})\n",
+ "=\\frac{f(x_{n},x_{n-1},...x_{2})-f(x_{n-1},x_{n-2},...,x_{1})}{x_{n}-x_{1}}$\n",
+ "\n",
+ "**e.g. for 4 data points:**\n",
+ "\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 105,
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "ln(2)=0.693147\n",
+ "ln(2)~0.722462\n"
+ ]
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "\n",
+ "yy=zeros(size(xx));\n",
+ "x=[0.5,1,3,4,5,6]; % define independent var's\n",
+ "y=log(x); % define dependent var's\n",
+ "xx=linspace(min(x),max(x));\n",
+ "for i=1:length(xx)\n",
+ " yy(i)=Newtint(x,y,xx(i));\n",
+ "end\n",
+ "plot(xx,log(xx),2,log(2),'*',...\n",
+ "x,y,'ro',...\n",
+ "xx,yy,'r-')\n",
+ "\n",
+ "fprintf('ln(2)=%f',log(2))\n",
+ "fprintf('ln(2)~%f',Newtint(x,y,2))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Octave",
+ "language": "octave",
+ "name": "octave"
+ },
+ "language_info": {
+ "file_extension": ".m",
+ "help_links": [
+ {
+ "text": "MetaKernel Magics",
+ "url": "https://github.com/calysto/metakernel/blob/master/metakernel/magics/README.md"
+ }
+ ],
+ "mimetype": "text/x-octave",
+ "name": "octave",
+ "version": "0.19.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/lecture_18/newton_interpolation.png b/lecture_18/newton_interpolation.png
new file mode 100644
index 0000000..5990cb5
Binary files /dev/null and b/lecture_18/newton_interpolation.png differ