Enrollment in this course will be limited. If you are considering taking this course, follow the directions below

Who this course is for

This course is designed for graduate students in Statistics & Data Science who need to know about optimization and the essentials of numerical algorithm design and analysis. My intent is to help them design algorithms for Machine Learning and Data Analysis in their own research. The course could also be useful for graduate students in programs in Economics, SOM, and the sciences. It may also be suitable for some undergraduates. The prerequisites are knowledge of linear algebra, multivariate calculus, and probability.

Students who have taken Optimization Techniques (S&DS 430 / ENAS 530 / EENG 437 / ECON 413) probably won't get much out of this course. Students who have the time to take many courses are encouraged to instead take a combination of Algorithm Design (CPSC 365/366), Numerical Computation (CPSC 440), Applied Numerical Methods I (ENAS 440), Topics in Numerical Computation (CPSC 640), and some more advanced courses in optimization.

This course will eventually serve as an introduction to more advanced courses in optimization. It contains a lot of material that appears in the current (Fall 2019) version of Optimization Techniques. The next time Optimization Techniques is taught (probably in 2021-2022), it will assume this course as a prerequisite.

Topics to be covered.

Please note that this list of topics does not precisely agree with those advertised on Yale Course Search. And, this is the first time this course is being offered. This means that it is an experiment. I might find that I need to adjust the list of topics as we go.

Measuring the running times of algorithms

• Input size, condition numbers, bit precision, floating point.
• Assumptions under which algorithms run quickly.

Lower Bounds and NP Completeness

• Lower bounds for convex optimization in the oracle model.
• Explanation of NP Hardness, with examples from optimization.

Systems of Linear Equations

• Solution by Gaussian elimination (LU factorization), and the impact of pivoting.
• The condition number for systems of linear equations, its impact on bit complexity and its interpretation as the distance to ill-posedness.
• Kaczmarz method of solving systems of linear equations.
• Normal equations, and Cholesky factorization of positive definite matrices.
• QR Factorizations (if there is time)

The singular value decomposition

• Interpretation and Computation

Linear Programming

• Geometric Interpretation and Polytopes.
• Duality.
• Primal and Dual simplex methods.
• Condition numbers for linear programs.
• Von Neumann's iterative algorithm for linear programming, and its relation to the condition number.

Convex Programming

• Convex sets and convex functions. Separating hyperplanes.
• Lagrange duality, KKT conditions.
• Convergence analysis of gradient methods.
• Automatic Differentiation.

• Sparse solutions to linear equations, by linear programming and pursuit methods.
• Matrix completion by nuclear norm minimization.
• Solution heuristics, some subset of ILP, Simulated Annealing, Nelder-Mead, and Differential Evolution (if there is time).

Note that all of these treatments will be introductory, and we will not get to the most recent developments.

Work for the course

This course will feature regular homework, a midterm, and a final.

The final will be at the standard time: Wedneday, May 6, at 2pm. The midterm take place in class on Tuesday, March 3.

Materials

All readings for the course will come from materials that are freely available online (although some will require access from inside Yale or the VPN). Our main references will be

Some other material will come from papers and the lectures notes of the instructor.

Instructions for enrolling

First, add the course in Canvas. Then fill out the enrollment survey. Decisions about who is admitted to the course will be made before the first lecture.