Wednesday, October 7, 2015

Matrix Chain Multiplication

We got an assignment problem from our IC 303 DATA STRUCTURES AND ALGORITHMS course, and I'll be going into its portions in detail too. But for today's post I will solve the following problem of Matrix Chain Multiplication, and I hope the explanation of this problem will help bring clarity to how useful dynamic programming can be in solving problems.

The Assignment Problem

Find the cost involved in finding the product of 6 Matrices using both

  1. Dynamic programming (Matrix Chain Multiplication)
  2. Normal matrix multiplication.

Given,

Now first of all, what is the cost here? I think it's reasonable to assume the cost (time complexity) can be thought of as the total number of scalar multiplications needed to obtain the product.

For calculating $C = AB$, given that $A$ is a $p \times q$ matrix and $B$ is a $q \times r$ matrix,

it's trivial to prove that the number of scalar multiplications required is $p \cdot q \cdot r$.

For each element of $C$ we need to do $q$ multiplications; for the first row of $C$ we have $r$ such elements (the size of each row), so $q \cdot r$ multiplications.

Finally we have $p$ such rows (the number of rows), so the total number of scalar multiplications to calculate $C$ is $p \cdot q \cdot r$.

For example check out the image below, where $p, q, r$ are $4, 2, 3$.

(Image: Matrix Multiplication of a $4 \times 2$ matrix by a $2 \times 3$ matrix)
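To make the scalar-multiplication count concrete, here is a minimal Python sketch (my own illustration, not part of the assignment) that multiplies two small matrices with the schoolbook triple loop and counts every scalar multiplication performed:

```python
def multiply_count(A, B):
    """Multiply A (p x q) by B (q x r) with the schoolbook triple loop,
    returning the product and the number of scalar multiplications used."""
    p, q, r = len(A), len(B), len(B[0])
    C = [[0] * r for _ in range(p)]
    count = 0
    for i in range(p):          # p rows of the result
        for j in range(r):      # r elements per row
            for k in range(q):  # q multiplications per element
                C[i][j] += A[i][k] * B[k][j]
                count += 1
    return C, count

# A 4x2 times a 2x3, matching the example: expect 4 * 2 * 3 = 24 multiplications.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
B = [[1, 0, 1], [0, 1, 1]]
C, count = multiply_count(A, B)
```

The loop nesting mirrors the argument above: $q$ multiplications per element, $r$ elements per row, $p$ rows.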

Normal matrix multiplication.

OK, now that we have defined the cost, let's see what the cost of the normal multiplication in The Assignment Problem is.

Now, by normal multiplication let's consider the left-to-right grouping shown below, writing the dimensions of each matrix $A_i$ as $p_{i-1} \times p_i$:

$(((((A_1 A_2) A_3) A_4) A_5) A_6)$

So here first $A_1 A_2$ is done in $p_0 \cdot p_1 \cdot p_2$ multiplications,
then that solution with $A_3$ is done in $p_0 \cdot p_2 \cdot p_3$,
then that solution with $A_4$ is done in $p_0 \cdot p_3 \cdot p_4$,
then that solution with $A_5$ is done in $p_0 \cdot p_4 \cdot p_5$,
then that solution with $A_6$ is done in $p_0 \cdot p_5 \cdot p_6$.

Thus in the normal method the total number of scalar multiplications is the sum of these five terms (plugging in the dimensions given in the assignment yields the numeric cost).
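This left-to-right accumulation of costs can be sketched in Python (again my own illustration), assuming matrix $A_i$ has dimensions `dims[i-1] x dims[i]`:

```python
def left_to_right_cost(dims):
    """Scalar-multiplication cost of the grouping ((((A1 A2) A3) A4) ...),
    where matrix A_i has dimensions dims[i-1] x dims[i]."""
    total = 0
    for k in range(2, len(dims)):
        # The running product so far is dims[0] x dims[k-1]; multiplying it
        # by A_k (dims[k-1] x dims[k]) costs dims[0] * dims[k-1] * dims[k].
        total += dims[0] * dims[k - 1] * dims[k]
    return total
```

For instance, `left_to_right_cost([10, 100, 5, 50])` gives 7500: first a $10 \times 100$ times a $100 \times 5$ (5000 multiplications), then the $10 \times 5$ result times a $5 \times 50$ (2500 more).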

Now it should be obvious that we can find the product by different methods, and the cost will change as we change how we group the 6 matrices.

Therefore the matrix-chain multiplication problem is: given a chain of matrices to multiply, find the grouping which minimizes the total number of scalar multiplications.

In most cases the time we spend to find this special grouping is more than made up for by the savings in the effort to actually compute the product.

OK cool, so we now know that we need to group these matrices; what's the total number of ways that can be done?

Going by brute force, i.e. using the naive recursive algorithm, the time complexity is exponential: the number of ways to fully parenthesize a chain of $n$ matrices is the $(n-1)$-th Catalan number, which grows as $\Omega(4^n / n^{3/2})$,

while we get $O(n^3)$ time (with $O(n^2)$ space) for the dynamic programming solution to finding the grouping.

So we can agree we need to use dynamic programming to optimally parenthesize a matrix chain.
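As a side note, the count of groupings is exactly a Catalan number, which a short Python sketch (my own, standard library only) can compute:

```python
from math import comb

def num_groupings(n):
    """Number of ways to fully parenthesize a chain of n matrices:
    the (n-1)-th Catalan number, C(2(n-1), n-1) / n."""
    return comb(2 * (n - 1), n - 1) // n

# e.g. a chain of 6 matrices can be grouped in 42 different ways
ways = num_groupings(6)
```

Even at $n = 6$ there are 42 groupings to compare, and the count explodes from there, which is why brute force is hopeless for long chains.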

Dynamic programming

I hope you all know that divide and conquer algorithms work on the principle that we take a problem, break it into subproblems (divide) and recursively solve them (conquer).

Now what if, when you applied this kind of problem-solving strategy to a problem, you discovered that you were repeating your work, i.e. that the subproblems you were solving weren't unique. Would you solve them again and again? Nope, we would reuse the solution we already got at the first occurrence of that particular subproblem, right?!

So that’s the basic idea of how Dynamic programming works. So when we see a problem with both Overlapping subproblems (this means that the subproblems repeat themselves) and Optimal substructure (which just means that it’s possible to solve this problem optimally by joining the optimally solved subproblems) we can use D.P.
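To see overlapping subproblems in action outside the matrix setting, here's a toy Python sketch (my own illustration, not part of the assignment): the classic Fibonacci recursion, memoized so that each subproblem's body runs exactly once:

```python
from functools import lru_cache

calls = 0  # how many times the function body actually executes

@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci has overlapping subproblems; the cache
    stores each answer, so each distinct n is computed only once."""
    global calls
    calls += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

result = fib(30)  # only 31 body executions instead of millions of calls
```

The cache turns an exponential recursion into a linear one, which is exactly the payoff D.P. gives us on the matrix chain problem.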

So in the case of this Matrix Chain Multiplication problem, what does this "optimal substructure" look like?

The crucial idea is that given a chain of matrices $A_i, A_{i+1}, \ldots, A_j$, the Matrix Chain Multiplication problem can be broken down to
finding $k$ (with $i \le k < j$) such that both the group of matrices $A_i$ to $A_k$ as well as the group of matrices $A_{k+1}$ to $A_j$ are optimally arranged for multiplication.

Thus

$\text{cost}(A_i \cdots A_j) = \text{cost}(A_i \cdots A_k) + \text{cost}(A_{k+1} \cdots A_j) + p \cdot q \cdot r$

where $p$ is the number of rows of the first matrix $A_i$,
$q$ is the number of rows of the $(k+1)$-th matrix
or the number of columns of the $k$-th matrix,
and $r$ is the number of columns of the $j$-th matrix.

So here the first two terms are the cost of optimally solving the subproblems (i.e. consider the chains $A_i \cdots A_k$ and $A_{k+1} \cdots A_j$ as subproblems); now when you join those two subproblem solutions you get the optimal solution to the original problem involving $A_i \cdots A_j$. That's the key idea here.

Optimal solution of $A_i \cdots A_j$ = optimal solution of $A_i \cdots A_k$ joined with optimal solution of $A_{k+1} \cdots A_j$.
Optimal solution of $A_i \cdots A_k$ = a $p \times q$ matrix.
Optimal solution of $A_{k+1} \cdots A_j$ = a $q \times r$ matrix.
The term $p \cdot q \cdot r$ is incurred by the joining of the two subproblem solutions (the first solution is a $p \times q$ matrix and the second solution is a $q \times r$ matrix).
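Putting the recurrence together, here is a Python sketch of the standard bottom-up algorithm (the textbook MATRIX-CHAIN-ORDER; variable names are my own), which fills a cost table `m` and a split table `s`:

```python
import sys

def matrix_chain_order(p):
    """Bottom-up DP: matrix A_i has dimensions p[i-1] x p[i], for i = 1..n.
    Returns (minimum number of scalar multiplications, split table s)."""
    n = len(p) - 1
    # m[i][j]: min cost of computing A_i ... A_j; s[i][j]: best split point k.
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):              # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = sys.maxsize
            for k in range(i, j):               # try every split point
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j], s[i][j] = cost, k
    return m[1][n], s

best, s = matrix_chain_order([10, 100, 5, 50])
```

For dimensions `[10, 100, 5, 50]` this yields 7500, with `s[1][3] == 2`, i.e. the optimal grouping computes $(A_1 A_2)$ first. Each chain $A_i \cdots A_j$ is solved once and reused, giving the $O(n^3)$ time mentioned earlier.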

[TO BE CONTINUED]

Written with StackEdit.
