Project #3 -- Optimization of Functions of Several Variables



Introduction

One of the major applications of calculus is determining where functions attain their maximum or minimum values. In Calculus I, this was done with either the first or second derivative test. In multivariable calculus, this was done for functions of two variables. Specifically, for a function f(x,y), one found critical points (a,b) and then applied a second derivative test to see whether the values of f(x,y) near the critical point were always greater than f(a,b) (a local minimum) or always less than f(a,b) (a local maximum). The other possibilities are a saddle point and a failure of the test to decide.

Two shortcomings of the approach presented are that no explanation is given for why the method works and, more importantly, that there is nothing to suggest how it would be extended to a function of three or more variables. In this project, we see how linear algebra answers both concerns: the eigenvectors of the matrix of second derivatives determine the best coordinate system in which to see how the function behaves, and the eigenvalues then determine the nature of the critical point.


Background

This project assumes a knowledge of optimization of functions of two variables and an understanding of eigenvalues and eigenvectors. Suppose f(x,y) has continuous second derivatives, and (a,b) is a critical point of f, meaning that the partial derivatives $f_x$ and $f_y$ are both 0 at (a,b).

(Note that finding (a,b) means one must solve a system of two equations in two unknowns. However, this will often be a system of nonlinear equations, possibly with several solutions. There is no general technique in this case; anything goes!)
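When the system cannot be solved by hand, a numerical root-finder is one option. The following is a minimal sketch, not part of the original project: it applies Newton's method to the gradient of f(x,y) = 1/x + 1/y + 2xy (the example function used later in this section). The function names and the starting guess (0.5, 0.5) are assumptions chosen for illustration; a different guess may converge to a different critical point or fail to converge.

```python
def gradient(x, y):
    """Partial derivatives (f_x, f_y) of f(x, y) = 1/x + 1/y + 2xy."""
    return (-1.0 / x**2 + 2.0 * y, -1.0 / y**2 + 2.0 * x)


def hessian(x, y):
    """Second partial derivatives ((f_xx, f_xy), (f_yx, f_yy)) of the same f."""
    return ((2.0 / x**3, 2.0), (2.0, 2.0 / y**3))


def newton_critical_point(x, y, tol=1e-12, max_iter=50):
    """Look for a point where the gradient vanishes, starting from the guess (x, y)."""
    for _ in range(max_iter):
        gx, gy = gradient(x, y)
        if abs(gx) + abs(gy) < tol:
            break
        (a, b), (c, d) = hessian(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Hessian; try another starting guess")
        # Solve H * (dx, dy) = -(gx, gy) with Cramer's rule.
        dx = (-gx * d + gy * b) / det
        dy = (-gy * a + gx * c) / det
        x, y = x + dx, y + dy
    return x, y


if __name__ == "__main__":
    print(newton_critical_point(0.5, 0.5))
```

Starting from (0.5, 0.5), both coordinates should approach $2^{-1/3} \approx 0.7937$.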

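Recall the second-degree Taylor approximation of f(x,y) about (a,b):

$$f(x,y) \approx f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b) + \tfrac{1}{2}\left[ f_{xx}(a,b)(x-a)^2 + 2f_{xy}(a,b)(x-a)(y-b) + f_{yy}(a,b)(y-b)^2 \right]$$

Since (a,b) is a critical point, the partials $f_x$ and $f_y$ are 0 there. The factor of 1/2 is positive, so it does not affect the sign of the second-order terms, and we drop it from here on. Using delta notation, with $\Delta x = x - a$, $\Delta y = y - b$, and $\Delta f$ for twice the approximate change $f(x,y) - f(a,b)$, the above equation becomes

$$\Delta f = f_{xx}(a,b)\,\Delta x^2 + 2f_{xy}(a,b)\,\Delta x\,\Delta y + f_{yy}(a,b)\,\Delta y^2$$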

Now if this last expression is always positive for any nonzero increments in x and y, then we have a local minimum. If it is always negative, then we have a local maximum. If it takes on both signs for different (x,y) near (a,b), then we have a saddle point.
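One crude way to probe which of these cases applies is to evaluate the expression along many directions and record the signs that occur. The sketch below is only an illustration (the function name and the coefficient values in the usage lines are hypothetical, not taken from the project), and sampling can suggest an answer but does not prove it. Because the expression is quadratic, its sign depends only on the direction of the increment, so unit vectors suffice.

```python
import math


def sampled_signs(fxx, fxy, fyy, samples=360):
    """Signs taken by fxx*dx^2 + 2*fxy*dx*dy + fyy*dy^2 over sampled unit directions."""
    signs = set()
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        dx, dy = math.cos(angle), math.sin(angle)
        value = fxx * dx * dx + 2.0 * fxy * dx * dy + fyy * dy * dy
        if value > 0:
            signs.add("+")
        elif value < 0:
            signs.add("-")
        else:
            signs.add("0")
    return signs


if __name__ == "__main__":
    print(sampled_signs(2.0, 1.0, 2.0))  # hypothetical coefficients
    print(sampled_signs(1.0, 3.0, 1.0))  # hypothetical coefficients
```

Only "+" appearing suggests a local minimum, only "-" a local maximum, and both a saddle point.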

As an example, consider the function f(x,y) = 1/x + 1/y + 2xy. It has a critical point at $(2^{-1/3}, 2^{-1/3})$. The second derivatives are

$$f_{xx} = 2x^{-3}, \qquad f_{xy} = 2, \qquad f_{yy} = 2y^{-3}$$

which evaluate, respectively, to 4, 2, and 4. Thus the expression for $\Delta f$ is

$$\Delta f = 4\,\Delta x^2 + 4\,\Delta x\,\Delta y + 4\,\Delta y^2.$$

Given that $\Delta x$ and $\Delta y$ can take on any sign, independently of each other, it is hard to tell whether $\Delta f$ always has the same sign. This is where linear algebra comes in.

In matrix form,

$$\Delta f = \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

The matrix has eigenvalues 6 and 2, with eigenvectors (1, 1) and (1, -1) respectively. So if we take $(2^{-1/3}, 2^{-1/3})$ as the origin and change basis to the coordinate system of the eigenvectors (scaled to unit length), the expression for $\Delta f$ becomes

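$$\Delta f = \begin{pmatrix} \Delta u & \Delta v \end{pmatrix} \begin{pmatrix} 6 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} \Delta u \\ \Delta v \end{pmatrix}$$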

Because there are no cross terms ($\Delta u\,\Delta v$), it is now clear that $\Delta f = 6\,\Delta u^2 + 2\,\Delta v^2 > 0$ for any $\Delta u$ and $\Delta v$ that are not both zero, indicating a local minimum at $(2^{-1/3}, 2^{-1/3})$. (Multiply the matrix equation out if you don't see it.)
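As a numerical check of this change of basis, the sketch below (plain Python; the helper names are illustrative and not from the original project) computes the eigenvalues of the matrix with rows (4, 2) and (2, 4) using the closed-form formula for a symmetric 2x2 matrix. It then compares the quadratic form in $(\Delta x, \Delta y)$ with the diagonal form in $(\Delta u, \Delta v)$, where $(\Delta u, \Delta v)$ are coordinates along the unit eigenvectors $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$.

```python
import math

H = [[4.0, 2.0], [2.0, 4.0]]  # second-derivative matrix from the example


def eigenvalues_sym_2x2(m):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    mean = (m[0][0] + m[1][1]) / 2.0
    radius = math.hypot((m[0][0] - m[1][1]) / 2.0, m[0][1])
    return mean + radius, mean - radius


def quad_form(m, v):
    """Return v^T m v for a 2x2 matrix m and a 2-vector v."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))


def to_eigen_coords(dx, dy):
    """Coordinates (du, dv) along the unit eigenvectors (1,1)/sqrt(2) and (1,-1)/sqrt(2)."""
    s = 1.0 / math.sqrt(2.0)
    return s * (dx + dy), s * (dx - dy)


lam_u, lam_v = eigenvalues_sym_2x2(H)

if __name__ == "__main__":
    for dx, dy in [(1.0, 0.0), (0.3, -0.7), (-2.0, 5.0)]:
        du, dv = to_eigen_coords(dx, dy)
        original = quad_form(H, (dx, dy))
        diagonal = lam_u * du**2 + lam_v * dv**2
        print(f"dx={dx}, dy={dy}: {original:.6f} vs {diagonal:.6f}")
```

The two printed values on each line should agree, since a change of basis only re-expresses the same quantity.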

The same procedure works for a function of three variables, f(x,y,z). A critical point is where all three partial derivatives are zero ($f_x = f_y = f_z = 0$). The second-order Taylor expansion of f near a critical point (in delta notation) is

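$$\Delta f = f_{xx}\,\Delta x^2 + 2f_{xy}\,\Delta x\,\Delta y + 2f_{xz}\,\Delta x\,\Delta z + f_{yy}\,\Delta y^2 + 2f_{yz}\,\Delta y\,\Delta z + f_{zz}\,\Delta z^2$$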

all partials being evaluated at the critical point (a,b,c). The matrix form of this equation is

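$$\Delta f = \begin{pmatrix} \Delta x & \Delta y & \Delta z \end{pmatrix} \begin{pmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}$$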

and if we change to a coordinate system of the eigenvectors of the matrix, it will become diagonal. If all three diagonal entries (the eigenvalues) are positive, we have a local minimum. If all three are negative, we have a local maximum. If some are positive and some are negative, we have a saddle point. As an example, consider the function

$$f(x,y,z) = x^3 + y^3 + 2xyz + z^2$$

which has critical points at (0,0,0) and (3/2,3/2,-9/4). We'll examine the second point as a possible local extremum. Using the second derivatives

$$f_{xx} = 6x, \quad f_{xy} = 2z, \quad f_{xz} = 2y, \quad f_{yz} = 2x, \quad f_{yy} = 6y, \quad f_{zz} = 2$$

and the expression above for the second order Taylor Series, we have that the change in f near (3/2,3/2,-9/4) is

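$$\Delta f = \begin{pmatrix} \Delta x & \Delta y & \Delta z \end{pmatrix} \begin{pmatrix} 9 & -9/2 & 3 \\ -9/2 & 9 & 3 \\ 3 & 3 & 2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}$$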

which has eigenvalues of approximately 13.5, 7.67, and -1.17. Since they are not all of the same sign, the point (3/2,3/2,-9/4) is a saddle point of the function and not a local extremum.
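The three-variable example can be checked numerically. The sketch below is an illustration under stated assumptions, not the project's prescribed method: it finds the eigenvalues with cyclic Jacobi rotations written in plain Python (in practice a library routine such as NumPy's `numpy.linalg.eigvalsh` would normally be used), and the names `symmetric_eigenvalues` and `classify` are hypothetical helpers. The classifier covers only the three cases stated above and deliberately refuses to decide when an eigenvalue is zero.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-20, max_sweeps=50):
    """Eigenvalues (ascending) of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def gradient(x, y, z):
    """First partials of f(x, y, z) = x^3 + y^3 + 2xyz + z^2."""
    return (3 * x**2 + 2 * y * z, 3 * y**2 + 2 * x * z, 2 * x * y + 2 * z)


def hessian(x, y, z):
    """Matrix of second partials of the same f."""
    return [[6 * x, 2 * z, 2 * y],
            [2 * z, 6 * y, 2 * x],
            [2 * y, 2 * x, 2.0]]


def classify(eigenvalues, tol=1e-9):
    """Apply the sign rule stated above; zero eigenvalues are not covered by it."""
    if any(abs(ev) <= tol for ev in eigenvalues):
        raise ValueError("zero eigenvalue: not covered by the rule above")
    if all(ev > 0 for ev in eigenvalues):
        return "local minimum"
    if all(ev < 0 for ev in eigenvalues):
        return "local maximum"
    return "saddle point"


if __name__ == "__main__":
    point = (1.5, 1.5, -2.25)
    eigs = symmetric_eigenvalues(hessian(*point))
    print("gradient:", gradient(*point))
    print("eigenvalues:", [round(ev, 2) for ev in eigs], "->", classify(eigs))
```

Because the eigenvalue routine accepts a symmetric matrix of any size, the same sketch applies unchanged to functions of more than three variables.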


The Report

Your project should contain at least the following sections:

Introduction and Summary: This section should give a very brief account of the subject matter discussed in your report.

Summary of Ideas: Provide a 1-2 page summary of how one uses linear algebra techniques to evaluate critical points of functions of many variables. You can just summarize what I've written above if you like, but put it in your own words (do not simply cut and paste my words into your report!).

Exercises: Solve the following exercises:

In each of problems 1-3, find all critical points of f and test each one as a possible local extremum using linear algebra techniques. For at least one of the critical points in each problem, provide me with the change of basis matrix needed to change coordinates to the coordinate system of the eigenvectors.

  1. $f(x,y) = 3x - x^3 - 3xy^2$ (there are 4 critical points)

  2. $f(x,y) = 6xy^2 - 2x^3 - 3y^4$ (there are 3 critical points)

  3. $f(x,y) = \frac{x^4}{3} + \frac{y^4}{2} - 4xy^2 + 2x^2 + 2y^2 + 3$ (there are 5 critical points).

  4. Suppose for a function of three variables, f(x,y,z), that (2,1,5) is a critical point and that its matrix of second partial derivatives has eigenvalues of 2, 3, and -1. What can you say about the critical point (2,1,5)?

  5. Suppose a function of three variables has (2,7,6) as a critical point and that its matrix of second partial derivatives has eigenvalues of 2, 3, and 0. What can you say about the critical point? What generalization can you draw?

Conclusion: Give a satisfying concluding comment to your report.


Points

This project will be worth 25 points towards your final grade. The point breakdown for this project is as follows:

Project adapted from Kohlman, Prentice-Hall