Sunday, January 31, 2016

Polynomial curve fitting

As given in the book, we are trying to estimate the relationship between an independent variable $x$ and its dependent variable $t$ (both scalars) by modelling the relationship between them as an $M$-th order polynomial. Then the model output $y$ can be written as a function of $x$ and the coefficients $w_0, w_1, \dots, w_M$,

$$y(x) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j$$

Note that by storing the coefficients that uniquely define this “model” as a vector $\mathbf{w}$,

$$\mathbf{w} = \begin{bmatrix} w_0 & w_1 & \cdots & w_M \end{bmatrix}^T$$

And by constructing a column vector $\boldsymbol{\phi}(x)$ (which is a function of $x$),

$$\boldsymbol{\phi}(x) = \begin{bmatrix} 1 & x & x^2 & \cdots & x^M \end{bmatrix}^T$$

we can rewrite the sigma in $y(x)$ more concisely as the dot product between the two vectors as shown below,

$$y(x, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(x)$$

We were till now talking about a single $x$ and $t$ combination. What if we are given a Training Set consisting of $N$ observations of this system that we are trying to model, i.e. we get $\mathbf{x}$ and its corresponding $\mathbf{t}$.

Where,

$$\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_N \end{bmatrix}^T, \qquad \mathbf{t} = \begin{bmatrix} t_1 & t_2 & \cdots & t_N \end{bmatrix}^T$$

We can now choose an Error function $E(\mathbf{w})$ which measures the misfit between the polynomial function $y(x, \mathbf{w})$ and the training set. The error function corresponds to (one half of) the sum of the squares of the displacements of each data point $t_n$ from the model estimate $y(x_n, \mathbf{w})$,

$$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \big( y(x_n, \mathbf{w}) - t_n \big)^2$$

Now we see how taking the square of the error, while eliminating the difference between positive and negative error, also helps us when we minimize $E(\mathbf{w})$ with respect to $\mathbf{w}$, in the sense that the derivative of $E(\mathbf{w})$ will be linear in terms of $\mathbf{w}$, assuring us that there exists a unique solution for the minimization of $E(\mathbf{w})$.

This is one of the reasons why we prefer to square the difference rather than take the modulus of the difference (mean absolute error), in which case the derivative will have discontinuities at every point where an individual error changes sign.

So in order to find the best fit polynomial (defined by $\mathbf{w}$) from the given training set, we can differentiate the error function with respect to each coefficient $w_i$ and equate the resulting linear equations in $\mathbf{w}$ to 0.

Now the partial differentiation of the error function with respect to $w_i$ is,

$$\frac{\partial E(\mathbf{w})}{\partial w_i} = \sum_{n=1}^{N} \left( \sum_{j=0}^{M} w_j x_n^j - t_n \right) x_n^i$$

The above equation is true for every $i$ from $0$ to $M$. When $i = 0$, the term $x_n^i$ will be one, not 0.

In the above equation note that when we partially differentiate with respect to $w_i$, all the other coefficients $w_j$ ($j \neq i$) are treated as constants. Let us now represent the above equations as a system of linear equations as shown below; then we can easily see how the Moore–Penrose pseudoinverse is the solution to this minimization problem.

Now we should note that we have the training set available to us, so $x_n$ and $t_n$ are known while $\mathbf{w}$ is our unknown.
So if we construct a matrix $\mathbf{A}$ of size $N \times (M+1)$ as follows,

$$\mathbf{A} = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^M \\ 1 & x_2 & x_2^2 & \cdots & x_2^M \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_N & x_N^2 & \cdots & x_N^M \end{bmatrix}$$

and we know $\mathbf{t} = \begin{bmatrix} t_1 & t_2 & \cdots & t_N \end{bmatrix}^T$ is a vector. Then our minimization of the error function is equivalent to wanting to find a $\mathbf{w}$ for the below equation such that we obtain a ‘best fit’ (least squares) solution,

$$\mathbf{A}\,\mathbf{w} = \mathbf{t}$$

Elaborating on why they are equivalent, note that the error function is a sum of squares, so the smallest conceivable value of the error function occurs when all the individual errors are 0. Here every row of the column vector $\mathbf{A}\,\mathbf{w}$ is the model’s output $y(x_n, \mathbf{w})$ for the corresponding input $x_n$. So $\mathbf{A}\,\mathbf{w} - \mathbf{t}$ will consist of the error values for each $(x_n, t_n)$ pair given to us from the training data set.

I will write a post explaining how the Moore–Penrose pseudoinverse works. But for now there are many sources for that.

Therefore the best possible solution is,

$$\mathbf{w}^{*} = \mathbf{A}^{+}\,\mathbf{t} = \left( \mathbf{A}^T \mathbf{A} \right)^{-1} \mathbf{A}^T\, \mathbf{t}$$
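
To make this concrete, here is a minimal MATLAB sketch (my own illustration, not from the original post) that builds the design matrix $\mathbf{A}$ described above and solves for $\mathbf{w}$ with the normal equations; the data and variable names are made up for the example.

```matlab
% Least-squares fit of an M-th order polynomial using w = (A'A)^(-1) A' t
N = 50;                              % number of training points
M = 3;                               % polynomial order
x = linspace(0, 1, N)';              % inputs x_n
t = sin(2*pi*x) + 0.1*randn(N, 1);   % noisy targets t_n

A = zeros(N, M + 1);                 % design matrix, row n = [1 x_n x_n^2 ... x_n^M]
for j = 0:M
    A(:, j + 1) = x.^j;
end

w = (A' * A) \ (A' * t);             % normal equations; equivalently w = pinv(A) * t
y_fit = A * w;                       % model estimate at the training inputs
```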


Thursday, January 21, 2016

Parameter Estimation

Objective

Use the System Identification Toolbox of MATLAB to estimate the transfer function of a system, given sampled data of the system's input and output.
Then use the minimum mean square error (MMSE) estimator method as an alternative way to estimate the model of the system.

Theory

The importance of mathematical modelling in various fields cannot be overstated. A model describes a system using mathematical concepts and language. When we want to perform tests on a complicated system which is difficult to physically obtain, expensive, or sensitive to failure, it is safer and cheaper to perform the same tests on the model using computer simulations rather than carry out repeated experiments and observations on the real system.

In making these “Models” we have various types,

Theoretical Models

These models are obtained from fundamental principles, such as the laws of conservation of mass, energy, and momentum along with other chemical principles such as chemical reaction kinetics and thermodynamic equilibrium, etc.

Empirical models

These models are based on experimental plant data. These models are developed using data fitting techniques such as linear and nonlinear regression.

We will be dealing with empirical models in the remainder of this experiment. Models obtained exclusively from experimental plant data are also known as black-box models. Such models do not provide a detailed description of the underlying physics of the process. However, they do provide a description of the dynamic relationship between inputs and outputs. Thus they are sometimes more suitable for control design and implementation.

Thus we come to the issue of Parameter Estimation: given a set of input-output data pairs, we need to estimate the values of the coefficients used in the model such that the model best fits the experimental evidence.

Let the relationship between input and output of the system be modeled as the difference equation given below,

$$y[k] = a\, y[k-1] + b\, u[k-1]$$

where $y[k]$ is the $k$-th output sample and $u[k]$ is the $k$-th input sample.

So for $n$ samples of input-output data pairs, we can write the set of simultaneous equations in matrix form in order to form a concise notation.

Consider a vector $Y$ having elements from $y[2]$ to $y[n]$,
and a matrix $\Phi$ having elements $y[1]$ to $y[n-1]$ as the first column and $u[1]$ to $u[n-1]$ as the second column,
and finally a vector $\theta$ which holds the two parameters $a$ and $b$,

$$Y = \begin{bmatrix} y[2] \\ y[3] \\ \vdots \\ y[n] \end{bmatrix}, \qquad \Phi = \begin{bmatrix} y[1] & u[1] \\ y[2] & u[2] \\ \vdots & \vdots \\ y[n-1] & u[n-1] \end{bmatrix}, \qquad \theta = \begin{bmatrix} a \\ b \end{bmatrix}$$

MMSE Derivation

So the equations formed from entering the n data pairs into the difference equation given above can be written in matrix form as follows,

$$Y = \Phi\,\theta$$

Even though we write an equals sign, it is usually not possible to find a $\theta$ which perfectly models the system; we just aim for a $\theta$ such that the model most faithfully fits the given data ($n$ samples). Note that as $n \to \infty$ (an infinite number of data points) $\Phi\,\theta \to Y$; until then the R.H.S. and L.H.S. merely continue to approach each other.

From the minimum mean square error (MMSE) estimator method, we get the following formula to calculate this $\theta$,

$$\theta = \left( \Phi^T \Phi \right)^{-1} \Phi^T\, Y$$

I will now detail the proof for the above equation which is used to estimate $\theta$ (i.e. the values of $a$ and $b$).

As I said before, we need to find a $\theta$ such that the model most faithfully fits the given data. How can we turn this statement into a relationship involving the parameters of interest?

So “most faithfully” can be taken to mean that we want to minimize the error. Let us then define the error of the model as follows,

$$E = Y - \Phi\,\theta$$

Now we don’t want to minimize this error $E$ directly, summed over all data samples. We care about the size of the deviation of $\Phi\,\theta$ from $Y$, not about the direction of this deviation, and positive and negative errors of successive samples can cancel each other out. Thus we should work with a quantity that captures the absolute value of the error. We would rather define the error as the value $P = E^T E$ (also known as Mean Square Error) rather than just $E$.

The reason why we prefer MSE (squaring) to taking the modulus when we want to quantify this error is that, while both eliminate the primary problem of “positive and negative error”, squaring makes the algebra much easier to work with and offers properties that the absolute value does not. Additionally, the squared difference has useful mathematical properties; in particular, it is continuously differentiable (which is nice when you want to minimize it, like now). There are additional reasons which can be found here.

Okay, getting back to our hunt for the best fit $\theta$ for the given data: we need to minimize $P$ with respect to $\theta$, and for that we differentiate $P$ with respect to $\theta$ and equate the result to zero.

Let’s expand $P$,

$$P = E^T E = (Y - \Phi\,\theta)^T (Y - \Phi\,\theta)$$

The transpose respects addition,

$$P = (Y^T - \theta^T \Phi^T)(Y - \Phi\,\theta)$$

Opening the brackets, note that $Y^T \Phi\,\theta = (\theta^T \Phi^T Y)^T = \theta^T \Phi^T Y$ since both are scalars,

$$P = Y^T Y - 2\,\theta^T \Phi^T Y + \theta^T \Phi^T \Phi\,\theta$$

Vector Calculus Side Note


$$\frac{\partial\,(\mathbf{x}^T \mathbf{a})}{\partial\,\mathbf{x}} = \frac{\partial\,(\mathbf{a}^T \mathbf{x})}{\partial\,\mathbf{x}} = \mathbf{a}^T, \qquad \frac{\partial\,(\mathbf{x}^T \mathbf{A}\,\mathbf{x})}{\partial\,\mathbf{x}} = \mathbf{x}^T\left(\mathbf{A} + \mathbf{A}^T\right)$$

Here both $\mathbf{x}$ and $\mathbf{a}$ are vectors while $\mathbf{A}$ is a matrix.


You can skip the following proof

Now the formulas given above can be understood by the fact that vectors are just used to hold multiple elements together - for the sole reason of applying operators onto them in bulk.

So in the first formula both $\mathbf{x}$ and $\mathbf{a}$ are vectors. What both $\mathbf{x}^T \mathbf{a}$ and $\mathbf{a}^T \mathbf{x}$ will be is a scalar value, which is the dot product of the two vectors,

$$\mathbf{x}^T \mathbf{a} = \mathbf{a}^T \mathbf{x} = \sum_{i} a_i x_i$$



Differentiating a scalar w.r.t. a vector is defined as a new vector with its $i$-th element being the derivative of that scalar with respect to the $i$-th element of the vector, so for the first formula,

$$\frac{\partial\,(\mathbf{a}^T \mathbf{x})}{\partial\,\mathbf{x}} = \begin{bmatrix} \frac{\partial\,(\sum_j a_j x_j)}{\partial x_1} & \frac{\partial\,(\sum_j a_j x_j)}{\partial x_2} & \cdots & \frac{\partial\,(\sum_j a_j x_j)}{\partial x_n} \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} = \mathbf{a}^T$$

Here I have to add that this topic divides academia:
two competing notational conventions split the field of matrix calculus into two separate groups. The two groups can be distinguished by whether they write the derivative of a scalar with respect to a vector as a column vector or a row vector.
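
For example (my own illustration, not from the original post), the same derivative in the two conventions looks like,

$$\frac{\partial\,(\mathbf{a}^T \mathbf{x})}{\partial\,\mathbf{x}} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \;\text{(column, denominator layout)} \qquad \text{vs.} \qquad \frac{\partial\,(\mathbf{a}^T \mathbf{x})}{\partial\,\mathbf{x}} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \;\text{(row, numerator layout)}$$

In this post the derivative is written as a row vector, which is why the formulas above come out as $\mathbf{a}^T$ and $\mathbf{x}^T(\mathbf{A} + \mathbf{A}^T)$.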

For the second formula, the proof is straightforward given the above result. Here $\mathbf{A}$ is a matrix. Apply the product rule, noting that the terms in the brackets are treated as constants when differentiating,

$$\frac{\partial\,(\mathbf{x}^T \mathbf{A}\,\mathbf{x})}{\partial\,\mathbf{x}} = \frac{\partial\,\big( \mathbf{x}^T (\mathbf{A}\,\mathbf{x}) \big)}{\partial\,\mathbf{x}} + \frac{\partial\,\big( (\mathbf{x}^T \mathbf{A})\,\mathbf{x} \big)}{\partial\,\mathbf{x}}$$

From the first formula, when comparing, $\mathbf{A}\,\mathbf{x}$ can be thought of as $\mathbf{a}$ while $\mathbf{x}^T \mathbf{A}$ as $\mathbf{a}^T$. So,

$$\frac{\partial\,(\mathbf{x}^T \mathbf{A}\,\mathbf{x})}{\partial\,\mathbf{x}} = (\mathbf{A}\,\mathbf{x})^T + \mathbf{x}^T \mathbf{A} = \mathbf{x}^T \mathbf{A}^T + \mathbf{x}^T \mathbf{A} = \mathbf{x}^T \left( \mathbf{A} + \mathbf{A}^T \right)$$


When we partially differentiate $P$ w.r.t. $\theta$, only $\theta$ is a variable while everything else is considered to be a constant.

Partially differentiating $P$ w.r.t. $\theta$ and equating to 0, we get the value of $\theta$ for which the Mean Square Error is minimum,

$$\frac{\partial P}{\partial \theta} = -2\,Y^T \Phi + \theta^T \left( \Phi^T \Phi + (\Phi^T \Phi)^T \right) = 0$$

Here $\Phi^T \Phi$ is a symmetric matrix, thus,

$$-2\,Y^T \Phi + 2\,\theta^T \Phi^T \Phi = 0$$

Moving all terms without $\theta$ to one side,

$$\theta^T\, \Phi^T \Phi = Y^T \Phi$$

So post-multiplying both sides with $(\Phi^T \Phi)^{-1}$,

$$\theta^T = Y^T \Phi\, (\Phi^T \Phi)^{-1}$$

Taking transpose on both sides, note that $\big( (\Phi^T \Phi)^{-1} \big)^T = (\Phi^T \Phi)^{-1}$ because $\big( M^{-1} \big)^T = \big( M^T \big)^{-1}$ and $(\Phi^T \Phi)^T = \Phi^T \Phi$.

Thus we get our $\theta$,

$$\theta = \left( \Phi^T \Phi \right)^{-1} \Phi^T\, Y$$

Simulation

I will first generate “data” using the Simulink library in MATLAB. There we can construct a model of a system and run simulations on it.

Basic block diagram

Here, as you can see, I am applying a step input to a first-order system and writing both the input (Step) and the output to the file try.mat. Looking at the settings of the Step source and the ‘To File’ block, you can see I have chosen a sample time of 0.1 secs and an initial time of 0 secs.

The settings for the Source - Step Input

The settings for the WriteToFile block

So we store time, input, and output into a variable called data, and this is saved to the file try.mat. Note that as we have a 0.1 sec sample time, we sample every 0.1 secs for 30 secs. Therefore we will have 301 samples (30/0.1 = 300 samples plus the initial sample at 0 secs).

Here the transfer function of the system used to generate the data is,

$$G(s) = \frac{5}{3s + 1}$$
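
As a side note, the same data can also be generated from the MATLAB command line instead of Simulink. A minimal sketch of my own (assuming the plant above, the 0.1 sec sampling, and a step applied at 1 sec, which is the Step block's default step time):

```matlab
% Alternative to the Simulink model: simulate G(s) = 5/(3s+1) directly
Gs = tf(5, [3 1]);          % plant used to generate the data
t  = (0:0.1:30)';           % 0.1 s sampling for 30 s -> 301 samples
u  = double(t >= 1);        % unit step applied at t = 1 s (as in the Step block)
y  = lsim(Gs, u, t);        % simulated output
data = [t'; u'; y'];        % same row layout as the 'data' variable in try.mat
```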

You might get an error: “Error encountered in block ‘exp1/To File’: Error opening or closing file ‘try.mat’”. The error will open the Diagnostic Viewer as shown below,

Cannot Write to file Diagnostic Viewer error

We can fix this error by opening MATLAB as administrator; the error is caused by MATLAB not having permission to write to disk. Open MATLAB as shown in the image below,

Open MATLAB as administrator

On opening the Scope block we will see the input and output superimposed on a graph, with amplitude as a function of time. The yellow line is the unit step input and the purple line is the system’s output.

Scope block shows the input output relationship graphically

In order to work with this data we need to load it into the MATLAB workspace. We use the load try.mat command to achieve this; once we do that, the data variable holding the time, input, and output as rows will be added to the MATLAB workspace. As you can see below, the first row is the time sampled every 0.1 secs, the second row is the step input which starts at 1 sec, and finally the system’s output is the third row of the data variable. Note that I printed data' for better readability, which is why the rows I refer to are shown as columns in the image below.

The steps to load the generated data into MATLAB workspace

This data variable can now be separated into its 3 components using the commands as shown below,

Commands to separate time, input and output out of data
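
The commands in the image amount to roughly the following (my reconstruction; the exact variable names in the screenshot may differ):

```matlab
load try.mat            % adds the 3x301 'data' variable to the workspace
t = data(1, :)';        % time vector, sampled every 0.1 s
u = data(2, :)';        % step input, switching to 1 at t = 1 s
y = data(3, :)';        % system output
```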

We will now be using the t, u, and y variables we have made in order to reverse engineer the above system transfer function.

The Minimum Mean Square Error (MMSE) estimator method

The first method is using the equation we just derived which models the system using a first order difference equation, and then minimizes the Mean Square Error between the actual system’s output and the model’s predicted output.

Now that we have the time, input, and output variables in the workspace, we can construct $Y$ and $\Phi$ using the commands as shown below,

The steps to construct Y and Phi

Note that in MATLAB array indices range from 1 to n rather than 0 to n−1 like most programming languages.

As I had explained in the proof, I am taking all but the first data element when I construct $Y$, and all but the last element when I construct $\Phi$.

Then I apply the formula to find the $\theta$ for which the MSE is minimum. This comes out to be $a$ = 0.9618 and $b$ = 0.1950.
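
Put together, the construction and the estimate look roughly like this sketch (my reconstruction of the commands in the screenshots above, not a copy of them):

```matlab
Y   = y(2:end);                     % y[k]   for k = 2 ... n
Phi = [y(1:end-1), u(1:end-1)];     % rows [y[k-1], u[k-1]]
theta = (Phi' * Phi) \ (Phi' * Y);  % MMSE estimate, theta = [a; b]
a = theta(1);                       % approximately 0.9618
b = theta(2);                       % approximately 0.1950
```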

Let us now verify how well these parameters fit the actual system. For this I am using MATLAB’s c2d function to take my continuous-time system and find its equivalent discrete-time system with a sampling time of 0.1 secs (just like the data we generated).

The code to find the actual $H(z)$, or more precisely the values of a and b which MATLAB finds for the same system
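
That code is presumably along these lines (a sketch of my own, assuming the plant and sample time given above):

```matlab
Gs = tf(5, [3 1]);       % the continuous-time plant G(s) = 5/(3s+1)
Hz = c2d(Gs, 0.1)        % zero-order-hold discretization with Ts = 0.1 s
% Hz comes out as 0.1639 / (z - 0.9672), i.e. a = 0.9672, b = 0.1639
```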

Note that the $H(z)$ shown in the image is an approximation of $G(s)$, and with a smaller sampling time this approximation eventually tends to the continuous-time model itself!

For better understanding I will elaborate on how the $H(z)$ tells us what $a$ and $b$ are. MATLAB tells us,

$$H(z) = \frac{Y(z)}{U(z)} = \frac{0.1639}{z - 0.9672}$$

Cross multiplying, we get

$$z\,Y(z) - 0.9672\,Y(z) = 0.1639\,U(z)$$

By the time shifting property of Z transforms, i.e. $z\,Y(z) \leftrightarrow y[k+1]$ (ignoring initial conditions),

$$y[k+1] = 0.9672\,y[k] + 0.1639\,u[k] \quad\Longleftrightarrow\quad y[k] = 0.9672\,y[k-1] + 0.1639\,u[k-1]$$

Doesn’t the above equation look familiar? Yes, it is of the same structure as the difference equation we used to model this system. Just as first-order systems are modelled in continuous time using $\frac{K}{\tau s + 1}$, we can use a difference equation of this form to model the equivalent discrete-time system.

Thus the best fit which MATLAB finds using its “inbuilt algorithms” (which are based on the same mathematics we used, but with multiple small modifications to improve speed and accuracy) is $a$ = 0.9672 and $b$ = 0.1639.
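
As a quick sanity check (my own arithmetic, not from the original post): a zero-order-hold discretization of $G(s) = \frac{5}{3s+1}$ with $T_s = 0.1$ sec gives exactly these values,

$$a = e^{-T_s/3} = e^{-0.1/3} \approx 0.9672, \qquad b = 5\,(1 - a) \approx 0.1639$$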

So an error of 0.0054 in $a$ and −0.0311 in $b$ seems acceptable.

System Identification Toolbox

We can also use the System Identification Toolbox that comes with MATLAB to estimate the same parameters. This method has additional benefits: it offers a choice of estimation algorithms, runs extremely fast, graphs the fitted model, etc.

We can use the ident command to open the System Identification app; we will see a window as shown below open up,

The system identification toolbox window

We can now import the data using which MATLAB will estimate the parameters,

The first dialog box used to import data into ident

Note that we need to keep the sampling time as 0.1 sec and the starting time as 0 secs, just as we defined them in the Simulink model. Now click Import to get the data into ident. The time-domain data will look like the image below,

Time domain data after importing


Now, using Estimate -> Transfer Function Models, we are prompted for the number of poles and zeros, which we need to enter.

Number of poles and zeros required for estimating

After which the toolbox estimates using the Instrumental Variable approach (which is the default), as shown below,

Estimation of transfer function

Now as you can see the Transfer function has been estimated with a 100% fit.

The transfer function estimated

This is absolutely correct, since $G(s) = \frac{5}{3s+1} = \frac{1.666\ldots}{s + 0.333\ldots}$, i.e. the estimated numerator is 1.666 and the estimated denominator constant is 0.333.
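
The same estimate can also be reproduced from the command line. A minimal sketch, assuming the toolbox's iddata and tfest functions and the u, y vectors loaded earlier:

```matlab
dat = iddata(y, u, 0.1);   % package the measured output/input with Ts = 0.1 s
sys = tfest(dat, 1, 0)     % estimate a transfer function with 1 pole and 0 zeros
% Expected result: approximately 1.666 / (s + 0.333), i.e. 5/(3s+1)
```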

We can even use the “Show in LTI” option as shown in the image above to see the response of an LTI system modelled on the estimated transfer function.

Show in LTI option


There is also a different MATLAB option to do the estimation, that is, using
Estimate -> Process Models,

Just untick the ‘Delay’ checkbox, uncheck the ‘Zero’ checkbox (as there are no zeros), and choose the correct number of poles (1) from the dropdown menu. Then click on Estimate; we get the results quite fast,

Estimate using Process Models

As we can see, the transfer function has been recovered from the time-domain data here using the process model $\frac{K_p}{1 + T_{p1}\,s}$, where $K_p$ is 5 and $T_{p1}$ is 3.

Conclusion

In conclusion, we have used experimental plant data (which we generated using Simulink) to estimate black-box models. We understood the mathematics behind this estimation and also employed the more convenient methods available in MATLAB for generating such models.

Addendum

We can show the benefits of empirical models very clearly when we model a FET amplifier circuit in the voltage-divider bias configuration.

JFET voltage divider bias

Now before we begin, let’s remember that a DC gate-to-source voltage controls the level of DC drain current through a relationship known as Shockley’s equation,

$$I_D = I_{DSS} \left( 1 - \frac{V_{GS}}{V_P} \right)^2$$

So we can express the change in drain current that will result from a change in gate-to-source voltage by using the transconductance factor $g_m$,

$$\Delta I_D = g_m\, \Delta V_{GS}$$

Now substituting the ac equivalent model for the JFET, we get the following circuit,

This is the AC equivalent model applied to JFET in voltage divider bias configuration

From the above circuit we find the input impedance $Z_i$ and output impedance $Z_o$ as,

$$Z_i = R_1 \parallel R_2, \qquad Z_o = r_d \parallel R_D$$

Setting $V_i$ = 0 Volts sets $V_{gs}$ and $g_m V_{gs}$ to zero (the controlled source becomes an open circuit), also leaving $r_d$ in parallel with $R_D$ at the output, which gives the $Z_o$ above.

Redrawn network

So the gain of the amplifier is,

$$A_v = \frac{V_o}{V_i} = -\,g_m \left( r_d \parallel R_D \right)$$

The analysis thus far has been limited to a particular frequency. For the amplifier, it was a frequency that normally permitted ignoring the effects of the capacitive elements, reducing the analysis to one that included only resistive elements and sources of the independent and controlled variety.

The junction capacitance (input capacitance) will be virtually increased due to Miller’s effect,

$$C_{M_i} = (1 - A_v)\, C_f$$

where $A_v$ is the gain of the amplifier and $C_f$ is the feedback capacitance.

So now the transfer function takes a first-order low-pass form,

$$\frac{V_o(s)}{V_i(s)} = \frac{A_v}{1 + s\,\tau}$$

where the time constant $\tau$ is set by the (Miller-increased) input capacitance and the resistance driving it.

Now, just like we obtained the above transfer function from first principles, we can use the parameter estimation methods detailed in the sections above to obtain the same transfer function directly from measured input-output data.

Thus knowing that the FET amplifier has first-order transfer function characteristics lets us model it using $\frac{K}{\tau s + 1}$.

Note that this is not a complete black-box model; it is in fact a grey-box model, as we know the structure (first order).
