Using Ordinary Least Squares (OLS) to fit a 3rd order polynomial to noisy data

Following on from the previous post, where we used our ols() function to fit a straight line to noisy data, we can also use the OLS equations to fit a higher order polynomial to noisy data.

Along the same lines as the previous post, let's consider a 3rd order polynomial model. Again $y$ is a vector of measured data, but this time $\beta_{0}$, $\beta_{1}$, $\beta_{2}$ and $\beta_{3}$ are the cubic model parameters, $x$ represents the independent variable and $\epsilon$ represents the error vector. The model can therefore be expressed as:

$$y_{1} = \beta_{0} + \beta_{1}x_{1} + \beta_{2}x_{1}^2 + \beta_{3}x_{1}^3 + \epsilon_{1}$$
$$y_{2} = \beta_{0} + \beta_{1}x_{2} + \beta_{2}x_{2}^2 + \beta_{3}x_{2}^3 + \epsilon_{2}$$
$$\vdots$$
$$y_{n} = \beta_{0} + \beta_{1}x_{n} + \beta_{2}x_{n}^2 + \beta_{3}x_{n}^3 + \epsilon_{n}$$

From an estimation point of view, although this model is non-linear in $x$, it can be treated as a linear model if, instead of one independent variable ($x$), we consider the model to have three independent variables $x$, $x^2$ and $x^3$. Because the model is still linear in the parameters, we can write it in the required linear form:

$$\begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix} = \begin{bmatrix} 1 & x_{1} & x_{1}^2 & x_{1}^3 \\ 1 & x_{2} & x_{2}^2 & x_{2}^3 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_{n} & x_{n}^2 & x_{n}^3 \end{bmatrix} \begin{bmatrix} \beta_{0} \\ \beta_{1} \\ \beta_{2} \\ \beta_{3} \end{bmatrix} + \begin{bmatrix} \epsilon_{1} \\ \epsilon_{2} \\ \vdots \\ \epsilon_{n} \end{bmatrix}$$

Or in matrix form:

$$y=X \beta + \epsilon$$

And as per the previous article, we can calculate $\hat \beta$ from the OLS equation:

$$\hat \beta = (X^T X)^{-1} X^T y$$

Which we coded in our ols() function:

In [5]:
def ols(X, y):
    # solve the normal equations: beta_hat = (X^T X)^-1 X^T y
    beta = np.dot(np.linalg.inv(np.dot(X.T, X)), np.dot(X.T, y))
    return beta
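As an aside, explicitly inverting $X^T X$ is fine for a small, well-conditioned problem like this one, but in general it is numerically safer to solve the normal equations rather than form the inverse. A minimal sketch (not run as part of this notebook) of an equivalent function using numpy's built-in solver:

def ols_solve(X, y):
    # solve (X^T X) beta = X^T y directly, avoiding an explicit matrix inverse
    return np.linalg.solve(np.dot(X.T, X), np.dot(X.T, y))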

Test Data Generation

Now let's generate some noisy data to test the function in Python. The model is $y = 6.1 + 25.7x + 3.9x^2 - 4.5x^3$ with normally distributed noise with mean = 0 and standard deviation = 20:

In [6]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

N = 51

b0 = 6.1
b1 = 25.7
b2 = 3.9
b3 = -4.5

x = np.linspace(-4,4,N)
y = b0 + b1*x + b2*(x**2) + b3*(x**3)

# generate noise and add it, yn = y + e
mn = 0
sd = 20
e = np.random.normal(mn,sd,N)
yn = y + e

# set the default size of the plots
plt.rcParams['figure.figsize'] = (12,7)
# create ax object for plotting
fig, ax = plt.subplots()
ax.plot(x,y,'r-',label='y = '+str(b0)+' + '+str(b1)+'x'+' + '+str(b2)+'x^2'+' + '+str(b3)+'x^3')
ax.plot(x,yn,'b-',label='yn = y + e, mean = 0, sd = 20.0')

# add axes labels and title
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_title('Cubic Model with Noise'); ax.legend()
Out[6]:
<matplotlib.legend.Legend at 0x143a2210b70>

Testing the Function

Before we test the function, the vector x needs to be formatted correctly into the design matrix X:

In [7]:
print(x)
print(x.shape)
intercept = np.ones((N,1))
xr = x.reshape(-1,1) # convert the (N,) vector into an N x 1 column vector
X = np.concatenate((intercept, xr, xr**2, xr**3), axis=1)
print(X[0:5])
print(X.shape)
[-4.   -3.84 -3.68 -3.52 -3.36 -3.2  -3.04 -2.88 -2.72 -2.56 -2.4  -2.24
 -2.08 -1.92 -1.76 -1.6  -1.44 -1.28 -1.12 -0.96 -0.8  -0.64 -0.48 -0.32
 -0.16  0.    0.16  0.32  0.48  0.64  0.8   0.96  1.12  1.28  1.44  1.6
  1.76  1.92  2.08  2.24  2.4   2.56  2.72  2.88  3.04  3.2   3.36  3.52
  3.68  3.84  4.  ]
(51,)
[[  1.        -4.        16.       -64.      ]
 [  1.        -3.84      14.7456   -56.623104]
 [  1.        -3.68      13.5424   -49.836032]
 [  1.        -3.52      12.3904   -43.614208]
 [  1.        -3.36      11.2896   -37.933056]]
(51, 4)
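As a side note, numpy provides np.vander for building exactly this kind of Vandermonde-style matrix; with increasing=True the columns run from $x^0$ up to $x^3$, so the following one-liner should reproduce X:

X_alt = np.vander(x, 4, increasing=True)  # columns: 1, x, x^2, x^3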

Now we can test the ols function:

In [8]:
beta_hat = ols(X, yn)
print(beta_hat)
[11.11421318 23.37815119  3.12362826 -4.28018192]

For comparison, numpy.polyfit can also be used to fit the same 3rd order polynomial:

In [9]:
print(np.polyfit(x, yn, 3))
[-4.28018192  3.12362826 23.37815119 11.11421318]

numpy.polyfit() returns the polynomial coefficients with the highest order first, so, once the ordering is accounted for, they are identical to those calculated by our ols() function above, giving us confidence that our method is working as intended.
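As a quick sanity check, the polyfit coefficients can be reversed so they line up with beta_hat directly, for example:

print(np.polyfit(x, yn, 3)[::-1])  # reversed: lowest order first, same ordering as beta_hat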

Finally, we can overlay the predicted line on the figure shown earlier to compare the fitted model against the actual model.

In [10]:
yest = beta_hat[0] + beta_hat[1]*x + beta_hat[2]*x**2 + beta_hat[3]*x**3

# create ax object for plotting
fig, ax = plt.subplots()
ax.plot(x,y,'r-',label='y = '+str(b0)+' + '+str(b1)+'x'+' + '+str(b2)+'x^2'+' + '+str(b3)+'x^3')
ax.plot(x,yest,'g-',label='yest = '+"%.2f" % beta_hat[0]+' + '+"%.2f" % beta_hat[1]+'x'+' + '+"%.2f" % beta_hat[2]+'x^2'+' + '+"%.2f" % beta_hat[3]+'x^3')
ax.plot(x,yn,'b-',label='yn = y + e, mean = 0, sd = 20.0')

# add axes labels and title
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_title('Cubic Model with Noise'); ax.legend()
Out[10]:
<matplotlib.legend.Legend at 0x143a229efd0>
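Note that since we already have the design matrix, the fitted values in the cell above could equivalently be computed in one step as a matrix-vector product:

yest = np.dot(X, beta_hat)  # same as beta_hat[0] + beta_hat[1]*x + beta_hat[2]*x**2 + beta_hat[3]*x**3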

As we can see from the above plot, OLS can be a simple and effective method for fitting curves of best fit through noisy data.

Further posts will discuss the relationship between OLS and Maximum Likelihood Estimation, and whether we can use OLS when the error vector isn't normally distributed, as it is in this example.


