I will try to keep updating this post, which lists my understanding of each of the following areas in convex optimization (in order of my interest).

from Bertsekas's Convex Analysis and Optimization

Lagrange multipliers

Lagrange multipliers are used as a measure of the sensitivity of a certain constraint, which is my main interest. They are expected to exist if the tangent cone can be expressed as the polar cone of the set of constraint gradients. Recall that the negative of the objective's gradient is included in the normal cone at $latex x^*$ (which, under regularity, is the cone spanned by the constraint gradients), i.e., the polar of the tangent cone at $latex x^*$. However, this expectation is valid only under regularity conditions such as LICQ.
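As a reminder to myself, the standard sensitivity statement behind this (assuming strong duality and that the perturbed optimal value function is differentiable, so this is only a sketch): writing $latex p^*(u) = \min \{ f(x) : h_i(x) \leq u_i \}$ for the perturbed problem,

$latex \nabla p^*(0) = -\lambda^*,$

so $latex \lambda_i^*$ measures how much the optimal value changes when constraint $latex i$ is relaxed.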

Next is optimization algorithms. The concepts of iteration and approximation play the key role. Gradients (or subgradients in the non-differentiable case) and polyhedral or proximal approximations are the required theory for each. Nocedal classifies iterative methods into line search and trust region methods, whose difference is whether we set the direction before or after the step size.
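A minimal sketch of the line-search side, assuming steepest descent with a backtracking (Armijo) rule; the test function, tolerances, and constants below are arbitrary choices of mine, not something from the book:

import numpy as np

def backtracking_line_search(f, grad, x, d, alpha=1.0, rho=0.5, c=1e-4):
    # Shrink alpha until the Armijo sufficient-decrease condition holds.
    fx = f(x)
    slope = grad(x).dot(d)
    while f(x + alpha * d) > fx + c * alpha * slope:
        alpha *= rho
    return alpha

def steepest_descent(f, grad, x0, tol=1e-8, max_iter=1000):
    # Line-search method: choose the direction first, then the step size.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = -g                                  # direction: steepest descent
        alpha = backtracking_line_search(f, grad, x, d)
        x = x + alpha * d
    return x

# toy quadratic with its minimum at the origin
f = lambda x: 0.5 * x.dot(x)
grad = lambda x: x
print(steepest_descent(f, grad, [3.0, -4.0]))   # converges to the origin

A trust region method would do the opposite: fix the maximum step size (the region radius) first, then choose the best direction and length within it by minimizing a model of the objective.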

Duality

Duality has many forms, including the following.
LP
$latex y'b \leq c'x$ (weak LP duality)
Lagrange (see note 1 at the end)

minimax inequality ($latex d^* \leq p^*$)
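For reference, the general minimax (weak duality) inequality behind $latex d^* \leq p^*$:

$latex \sup_{z} \inf_{x} \phi(x,z) \leq \inf_{x} \sup_{z} \phi(x,z).$

Taking $latex \phi(x,\lambda) = L(x,\lambda)$ with $latex \lambda \geq 0$, the left-hand side is $latex d^*$ and the right-hand side is $latex p^*$.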


saddle point

If a saddle point exists, the minimax inequality becomes an equality.
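A sketch of why, using the usual definition: $latex (x^*, z^*)$ is a saddle point if $latex \phi(x^*, z) \leq \phi(x^*, z^*) \leq \phi(x, z^*)$ for all $latex x, z$. Then

$latex \inf_{x} \sup_{z} \phi \leq \sup_{z} \phi(x^*, z) = \phi(x^*, z^*) = \inf_{x} \phi(x, z^*) \leq \sup_{z} \inf_{x} \phi,$

which, combined with the minimax inequality above, gives equality.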

Fenchel

With this, the duality between the subgradient method and the conditional gradient method is proved (Bach15), and it motivates mirror descent.
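For my own reference, the Fenchel duality I have in mind (stated under the usual convexity and regularity assumptions, so take it as a sketch):

$latex \min_{x} \; f(x) + g(x) \;=\; \max_{y} \; \{ -f^*(-y) - g^*(y) \}, \qquad f^*(y) = \sup_{x} \{ y'x - f(x) \}.$

Weak duality here is just the Fenchel inequality applied twice: $latex f(x) + f^*(-y) \geq -y'x$ and $latex g(x) + g^*(y) \geq y'x$, and adding them gives $latex f(x) + g(x) \geq -f^*(-y) - g^*(y)$.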

dual cone & conjugate function

Some quotes to ponder upon
1. polarity relation between cones is a special case of conjugacy between convex functions.
I know that the duality is $latex \min_{x \in K} f(x) = \min_{x \in K^*} f^*(x)$, but I cannot understand how a cone, which is a set, could actually be regarded as a function.
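One standard way this identification is made, which might be what the quote has in mind (my guess, not a statement from the book): identify a cone $latex C$ with its indicator function $latex \delta_C$ ($latex 0$ on $latex C$, $latex +\infty$ outside). Its conjugate is

$latex \delta_C^*(y) = \sup_{x \in C} y'x = \delta_{C^{\circ}}(y), \qquad C^{\circ} = \{ y : y'x \leq 0 \text{ for all } x \in C \},$

so conjugacy of indicator functions of cones is exactly polarity of the cones (up to the sign convention for polar vs. dual cone).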

2. the generalized Farkas lemma.
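To keep one concrete version here, this is my recollection of the classical cone form (not necessarily the general version the book means): for vectors $latex a_1, \dots, a_r$,

$latex \{ x : a_j'x \leq 0, \; j = 1, \dots, r \}^* = \mathrm{cone}\{ a_1, \dots, a_r \},$

i.e., $latex b'x \leq 0$ for every $latex x$ satisfying $latex a_j'x \leq 0$ if and only if $latex b$ is a nonnegative combination of the $latex a_j$ (finitely generated cones are closed, so no closure is needed).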

The normal cone and the tangent cone are polar (orthogonal) to each other, and based on this, it seems to imply a condition for the normal cone: for $latex y \in N_{x}(x)$, $latex x$ should be $latex \mathrm{conv}(N(y) \text{ basis}) + \mathrm{cone}(N_{y}(y))$. This might be related to the expression of the objective's gradient as a nonnegative combination of the active constraints' gradients.

  1. $latex L(x,\lambda) = f(x) + \sum_{i=1}^{m} \lambda_{i} \underbrace{h_{i}(x)}_{\leq 0} \leq f(x)$ for $latex \lambda \geq 0$ and feasible $latex x$. When proving weak duality, we first take the infimum over $latex x$ to get $latex g(\lambda, \nu) \leq p^*$, then take the supremum over the multipliers to get $latex d^* \leq p^*$.
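A tiny worked example of this chain, with a problem I chose just for illustration: minimize $latex x^2$ subject to $latex 1 - x \leq 0$. Then

$latex L(x,\lambda) = x^2 + \lambda(1 - x), \qquad g(\lambda) = \inf_x L(x,\lambda) = \lambda - \tfrac{\lambda^2}{4},$

and $latex d^* = \sup_{\lambda \geq 0} g(\lambda) = 1$ (attained at $latex \lambda^* = 2$), which matches $latex p^* = 1$ (attained at $latex x^* = 1$). Perturbing the constraint to $latex 1 - x \leq u$ gives $latex p^*(u) = (1-u)^2$ near $latex u = 0$, so $latex \frac{d}{du} p^*(0) = -2 = -\lambda^*$, consistent with the sensitivity interpretation above.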