I met Donald Goldfarb at the IEOR PhD seminar and asked some questions. The connections to my research were:
- optimization formulation of neural networks
- proof techniques used for adaptive algorithms
The optimization formulation of neural networks seemed similar to what I am doing with SBC_iteration, where I aim to minimize the gap between the generated and real data and the procedure is recursive. I asked whether there are algorithms that determine $L$ adaptively by monitoring the error, but he did not know of any. A justification for the regularizer in my SBC_iter algorithm could be made by tracking how regularizers $\Omega$ are justified in the NN literature. Currently I view that this can be elicited from the difference between the simulation prior and the inference prior: given a simulation prior $\pi_s$ that the modeler wants to use but that may cause bad computational outcomes (bad posterior geometry), my algorithm transforms it through iteration into a more computation-coherent, and therefore better-calibrated, prior.
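One way I might write this down (my own sketch with illustrative notation; $D$, $\lambda$, and $\pi^{(t)}$ are not from the seminar):

$$\pi^{(t+1)} \;=\; \arg\min_{\pi}\; D\!\left(p_{\text{gen}}\big(\pi^{(t)}\big),\, p_{\text{real}}\right) \;+\; \lambda\, \Omega\!\left(\pi \,\middle\|\, \pi_s\right),$$

where $D$ measures the gap between the generated and the real data, $\Omega$ penalizes deviation from the simulation prior $\pi_s$, and the iteration continues until the gap stabilizes.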
My other question was about convergence proof techniques: how convergence can be proved for the stochastic estimators described in Goals B and C. He commented on adaptive algorithms whose convergence is not monotone, such as a stochastic version of the Barzilai-Borwein method, where convergence is established in an ergodic sense. This paper may be related. He also introduced me to Shiqing Ma.
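My reading of "ergodic" convergence here (my gloss, not his exact statement): instead of per-iterate monotone descent, one bounds an averaged quantity over the iterates, for example something of the form

$$\frac{1}{K}\sum_{k=1}^{K} \mathbb{E}\!\left[\|\nabla f(x_k)\|^2\right] \;\le\; \frac{C}{\sqrt{K}},$$

so individual iterates may behave non-monotonically while the average still converges.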
I didn't ask this question, but it would be interesting if I could connect the role of the mass matrix in the leapfrog integrator (implemented in Stan's HMC algorithms here) with the Hessian, damping, etc. A connection between VI and stochastic optimization has been made in this paper. HMC with leapfrog proceeds as follows:
1. Start with an initial value $\theta_0$.
2. A new momentum vector $\rho$ is sampled and the current value of the parameter $\theta$ is updated using the leapfrog integrator with discretization time $\epsilon$ and number of steps $L$, according to the Hamiltonian dynamics.
3. A Metropolis acceptance step is applied, and a decision is made whether to update to the new state $(\theta^*, \rho^*)$ or keep the existing state.
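A minimal sketch of one such HMC transition (my own illustrative code, not Stan's implementation; the step size `eps`, number of steps `L`, and mass matrix `M` are inputs one would tune or adapt):

```python
import numpy as np

def leapfrog(theta, rho, grad_log_p, eps, L, M_inv):
    """Simulate Hamiltonian dynamics for L leapfrog steps of size eps."""
    theta, rho = theta.copy(), rho.copy()
    rho += 0.5 * eps * grad_log_p(theta)       # half step for momentum
    for _ in range(L - 1):
        theta += eps * (M_inv @ rho)           # full step for position
        rho += eps * grad_log_p(theta)         # full step for momentum
    theta += eps * (M_inv @ rho)               # last full position step
    rho += 0.5 * eps * grad_log_p(theta)       # final half step for momentum
    return theta, rho

def hmc_step(theta0, log_p, grad_log_p, eps, L, M):
    """One HMC transition: sample momentum, integrate, Metropolis accept/reject."""
    M_inv = np.linalg.inv(M)
    rho0 = np.random.multivariate_normal(np.zeros(len(theta0)), M)
    theta_star, rho_star = leapfrog(theta0, rho0, grad_log_p, eps, L, M_inv)
    # Hamiltonian H(theta, rho) = -log p(theta) + 0.5 * rho' M^{-1} rho
    h0 = -log_p(theta0) + 0.5 * rho0 @ M_inv @ rho0
    h1 = -log_p(theta_star) + 0.5 * rho_star @ M_inv @ rho_star
    # accept with probability min(1, exp(h0 - h1))
    if np.log(np.random.rand()) < h0 - h1:
        return theta_star
    return theta0
```

The mass matrix $M$ enters only through $M^{-1}\rho$ in the position update and through the kinetic energy, which is where I would expect a Hessian-like preconditioner or damping to plug in.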
The following are interesting snapshots, which include K-BFGS, self-concordance (using third-order information in the Hessian update), and the multi-affine formulation. I will share the slides when I get them.
He thought the multi-affine formulation, which extends the above formulation to NNs, would have great applications, and he hoped for its implementation.
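My understanding of the general idea of a multi-affine formulation of NN training (a sketch of the idea as I understand it, not taken from his slides): the forward pass is rewritten as constraints by introducing per-layer variables,

$$\min_{\{W_l\},\{z_l\},\{a_l\}} \; \ell(a_L, y) \quad \text{s.t.} \quad z_l = W_l a_{l-1}, \;\; a_l = \sigma(z_l), \;\; l = 1, \dots, L,$$

where each constraint $z_l = W_l a_{l-1}$ is affine in $W_l$ for fixed $a_{l-1}$ and vice versa (the multi-affine structure that ADMM-type or block-coordinate methods can exploit), while the nonlinear activation constraints need separate treatment.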
He was proud of the following work, which received review scores of 9, 9, 9, 7 (top 1%) at NeurIPS.