MATH 494: Mathematical Foundations of Machine Learning

 

Assignment #7

 

(Due Monday, April 8, 2024)

 

 

This is the first of two assignments on unsupervised learning. You will create a noisy dataset based on a linear-combination pattern, and you will then use singular value decomposition (SVD) to reduce the dimensionality of the dataset.

 

·       Create an M x 5 matrix (you may use M = 20 or a similar value). Fill the first two columns with uniformly distributed random numbers in the range from 0.0 to 10.0. Display the matrix.

·       The next three columns will be linear combinations of the first two. The third column will be the sum of the first two. The fourth column will be ‘A’ times the first column plus ‘B’ times the second one, where ‘A’ and ‘B’ are the last and next-to-last digits of your USD ID #. The fifth column will be ‘B’ times the first column plus ‘A’ times the second one. (For the three of you with identical last two digits, use 9 minus the last digit for one of the weights.)

·       Add noise to the third, fourth, and fifth columns: add Gaussian noise with a mean of 0.0 and a standard deviation of about 1.0 to 2.0. Tweak the standard deviation if needed.

·       Perform the SVD and display the matrices U, S, and VT. (Note: to check the correctness of your decomposition procedure, you may compare your results to U, S, VT = np.linalg.svd(X), where X is your matrix. Keep in mind that np.linalg.svd returns S as a 1-D array of singular values rather than as a diagonal matrix. Checking is not mandatory!)

·       Ask the user for the value of ‘k’ (the user will enter 0, 1, 2, or 3).

·       Reduce the dimensionality by ‘k’ dimensions by obtaining a reduced version of the S matrix, which will have the size (5-k) x (5-k). Display the reduced matrix. (Note: choosing k = 0 means reassembling the original matrix.)

·       Compute and display the average relative error: the mean, taken over all elements, of the relative error between each element of the original matrix X and the corresponding element of its reduced version.
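
The steps above can be sketched roughly as follows. This is only a minimal illustration under several assumptions, not a reference solution: the function names, the example digits a = 3 and b = 7, the noise level of 1.5, and the fixed random seed are all hypothetical; the fifth column is read as ‘B’ times the first column plus ‘A’ times the second; np.linalg.svd stands in for your own decomposition procedure; and ‘k’ is passed as an argument instead of being requested from the user. The reduced version of X is assumed to be reassembled from the first 5-k columns of U, the reduced S, and the first 5-k rows of VT.

```python
import numpy as np


def build_dataset(m=20, a=3, b=7, noise_std=1.5, seed=0):
    """Build an m x 5 matrix: two random columns plus three noisy combinations."""
    rng = np.random.default_rng(seed)
    X = np.empty((m, 5))
    # Columns 1-2: uniform random numbers in [0.0, 10.0).
    X[:, :2] = rng.uniform(0.0, 10.0, size=(m, 2))
    # Columns 3-5: linear combinations of the first two columns.
    X[:, 2] = X[:, 0] + X[:, 1]
    X[:, 3] = a * X[:, 0] + b * X[:, 1]
    X[:, 4] = b * X[:, 0] + a * X[:, 1]  # assumed reading of the fifth column
    # Gaussian noise (mean 0.0) on the three derived columns only.
    X[:, 2:] += rng.normal(0.0, noise_std, size=(m, 3))
    return X


def reduce_rank(X, k):
    """Drop the k smallest singular values and reassemble an approximation of X."""
    # s is a 1-D array of singular values, sorted in descending order.
    U, s, VT = np.linalg.svd(X, full_matrices=False)
    r = s.size - k
    S_reduced = np.diag(s[:r])  # (5-k) x (5-k) diagonal matrix
    X_reduced = U[:, :r] @ S_reduced @ VT[:r, :]
    return S_reduced, X_reduced


def average_relative_error(X, X_reduced):
    """Mean of |x - x_reduced| / |x| over all elements (assumes no entry of X is 0)."""
    return float(np.mean(np.abs(X - X_reduced) / np.abs(X)))


if __name__ == "__main__":
    X = build_dataset()
    S_reduced, X_reduced = reduce_rank(X, k=2)
    print(S_reduced)
    print(average_relative_error(X, X_reduced))
```

Because the three derived columns are, up to noise, combinations of the first two, the two largest singular values should dominate, and the error should grow noticeably only once ‘k’ removes one of them. A larger noise standard deviation makes the smaller singular values, and hence the error for small ‘k’, larger.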