My interests fall largely into three areas:
1. Strategy for Security and Resilience: Organizational Regulatory Functions, Preventive Maintenance
2. Diagnostics for Generative and Dynamic Models: System Dynamics Simulation, Hierarchical Bayesian Models, Generative Adversarial Networks, Model Checking Diagnostics, Monte Carlo Simulation Methods, Simulation-based Calibration
3. Servicing flexible and scalable optimization: Auto-ML
1. Strategy for Security and Resilience
Ongoing engine failure prediction projects with the Korean military have led me to the topic of reliability prediction. Among the many forms of data, my expertise is in time series, built over three demand forecasting projects whose topics included hierarchical time series, multi-seasonal models, and exogenous feature engineering. I have focused in particular on systemic modeling: treating the multiple time series generated in parallel within a system as interdependent rather than independent. One example is information pooling among the different subgroups of a system via a hierarchical model.
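As a minimal sketch of such pooling (with synthetic data, and a known-variance shrinkage formula standing in for a full hierarchical model), partial pooling pulls each subgroup's estimate toward the grand mean in proportion to how noisy its own data are:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: demand series for 4 subgroups of one system, sharing
# a common level mu_true with subgroup offsets (tau) and observation noise (sigma).
mu_true, tau, sigma, n_obs = 100.0, 5.0, 10.0, 30
group_means_true = mu_true + rng.normal(0.0, tau, size=4)
y = np.array([rng.normal(m, sigma, size=n_obs) for m in group_means_true])

# No pooling: each subgroup estimated from its own data only.
no_pool = y.mean(axis=1)

# Complete pooling: one shared estimate, ignoring subgroup structure.
grand_mean = y.mean()

# Partial pooling (known-variance shrinkage): each subgroup mean is pulled
# toward the grand mean, weighted by data precision vs. between-group precision.
precision_data = n_obs / sigma**2
precision_prior = 1.0 / tau**2
partial_pool = (precision_data * no_pool + precision_prior * grand_mean) / (
    precision_data + precision_prior
)

print("no pooling:     ", np.round(no_pool, 1))
print("partial pooling:", np.round(partial_pool, 1))
```

Each partially pooled estimate lands between its subgroup's raw mean and the grand mean, which is the information-sharing behavior a hierarchical model provides automatically.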
Failure prediction software is required to be both accurate and interpretable. My insights on interpretable time series components (trend, seasonality, and event effects) and on hierarchically structured time series can meet both demands once applied.
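To illustrate what interpretable components buy us, here is a small least-squares sketch on synthetic daily data (the series and design matrix are invented for illustration): each fitted coefficient maps directly to a nameable effect such as the trend slope or the weekly seasonal amplitude.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical daily series: linear trend + weekly seasonality + noise.
t = np.arange(140)
y = 0.5 * t + 8.0 * np.sin(2 * np.pi * t / 7) + rng.normal(0.0, 1.0, t.size)

# Design matrix with interpretable columns: level, trend,
# and one Fourier pair for the weekly cycle.
X = np.column_stack([
    np.ones_like(t, dtype=float),   # level
    t,                              # trend
    np.sin(2 * np.pi * t / 7),      # weekly seasonality (sin)
    np.cos(2 * np.pi * t / 7),      # weekly seasonality (cos)
])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# The coefficients recover the simulated slope (0.5) and amplitude (8.0).
print("trend slope:", round(beta[1], 2))
print("seasonal amplitude:", round(float(np.hypot(beta[2], beta[3])), 1))
```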
These characteristics and requirements are common across domains, so my research can readily transfer to fields such as health care and economics. The following are ongoing lines of research extending my StanCon 2020 presentation.
2. Diagnostics for Generative and Dynamic Models
The superb stochastic approximations I have witnessed (Hamiltonian Monte Carlo is Stan's official engine), compared with deterministic approximations, piqued my interest. Stochastic approximations are convoluted, however: they rely on integrators of flows, i.e. discretized versions of the Fokker–Planck or Hamiltonian equations.
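As a sketch of such a discretization, the leapfrog integrator at the heart of HMC can be written in a few lines. A standard normal target is assumed here for illustration; this is not Stan's implementation:

```python
import numpy as np

def leapfrog(q, p, grad_U, eps, n_steps):
    """Leapfrog integration of Hamiltonian dynamics H(q, p) = U(q) + p^2 / 2."""
    p = p - 0.5 * eps * grad_U(q)      # initial half step for momentum
    for _ in range(n_steps - 1):
        q = q + eps * p                # full step for position
        p = p - eps * grad_U(q)        # full step for momentum
    q = q + eps * p                    # last full step for position
    p = p - 0.5 * eps * grad_U(q)      # final half step for momentum
    return q, p

# Standard normal target: U(q) = q^2 / 2, so grad_U(q) = q.
grad_U = lambda q: q
H = lambda q, p: 0.5 * q**2 + 0.5 * p**2

q0, p0 = 1.0, 0.5
q1, p1 = leapfrog(q0, p0, grad_U, eps=0.01, n_steps=100)

# The integrator is symplectic, so the Hamiltonian is nearly conserved.
print(H(q0, p0), H(q1, p1))
```

The near-conservation of H is exactly what makes the discretized flow usable inside a Metropolis correction: proposals stay in high-acceptance regions even after many steps.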
In retrospect, nature enjoys being defined relationally rather than prescriptively: heat, mass–energy equivalence, entropy, and the Hamiltonian equations do not come in a "min f(x) s.t." format. For this reason, dogged attempts to pave stochastic paths to equilibrium were eye-opening. I have observed these aesthetic mechanisms in three papers so far. First, this paper introduces a variational formulation of distributional learning, which solved the mystery of the Bayes formula for me; second, this paper views stochastic gradient descent trajectories as a Markov chain and derives their stationary distribution via an Ornstein–Uhlenbeck process; third, this paper tracks variance dissipation and gradient flow by formulating a time-reversed relative entropy process with a likelihood local martingale. A deeper understanding of these works can demystify detailed balance and its direct application to my main interest, MCMC.
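The Ornstein–Uhlenbeck view in the second paper can be made tangible with a minimal simulation (parameters chosen for illustration): an Euler–Maruyama discretization of dX = -θX dt + σ dW, whose empirical stationary variance matches the theoretical σ²/(2θ).

```python
import numpy as np

rng = np.random.default_rng(2)

# Ornstein–Uhlenbeck process dX = -theta * X dt + sigma dW, simulated with
# Euler–Maruyama; its stationary variance is sigma^2 / (2 * theta).
theta, sigma, dt, n_steps, n_paths = 1.0, 0.8, 0.01, 5000, 2000
x = np.zeros(n_paths)
for _ in range(n_steps):
    x += -theta * x * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)

print("empirical stationary variance: ", x.var())
print("theoretical sigma^2/(2*theta): ", sigma**2 / (2 * theta))
```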
3. Servicing flexible and scalable optimization
Scalability and flexibility must both be addressed to serve as a Korean logistics platform, as specified in this article summarizing the joint work of NextOpt and LogisAll, Korea's largest pallet company.
"Flexible optimization" is an oxymoron: optimization should aim for the tightest adjustment to its input, so any form of generality, i.e. similar performance on unforeseen input types, could be read as disloyalty to the given setting. We owe existing algorithms' flexibility to the effort of the optimization servicer. This menial work includes abstracting and classifying inputs; designing algorithms with countless heuristics and analyses of unexpected outcomes; and justifying and honing results, which Jaynes tersely expressed as "more thought." This could improve, however, under an Auto-ML framework.
The goal is an Auto-ML that learns the connections between models, and between parameters of neighboring models, within a network of regression, time-series, and causal models. Prior research is introduced in Sec. 7.4 of the Bayesian workflow paper. Research directions are shared here, detailing model-space navigation tools based on simulation and optimization. If your passion conjugates with this research, the issues of the modular Stan package, which are under active development (for instance, using combinatorics here), are the best place to join. The posterior of this research lies in the synthesis of Bayesian decision analysis with machine learning such as generative adversarial networks.
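To give a flavor of combinatorial model-space navigation (the component names below are purely illustrative, not the modular Stan interface), a model network can be sketched as the powerset of components, with edges between models that differ by exactly one component:

```python
from itertools import chain, combinations

# Hypothetical component set for a small model network; names are illustrative.
components = ["trend", "weekly_seasonality", "holiday_effect", "hierarchy"]

def powerset(items):
    """All subsets of the component set: every candidate model in this space."""
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def neighbors(model, space):
    """Models differing by exactly one component: edges of the model network."""
    return [m for m in space if len(set(m) ^ set(model)) == 1]

space = [frozenset(m) for m in powerset(components)]
base = frozenset(["trend"])
print(len(space))                        # 2^4 = 16 candidate models
print(sorted(sorted(m) for m in neighbors(base, space)))
```

Navigating such a graph, e.g. expanding a base model one component at a time and scoring each neighbor, is one concrete reading of "model space navigation tools with simulation and optimization."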
In previous projects, speed has been the main issue: the daily forecasts currently requested by 20,000 clients will surely grow in both size and diversity. Approximate computations, such as the adjoint-differentiated embedded Laplace approximation or variational inference, can address this. The necessity of approximate computation in turn highlights the need for calibration tools such as simulation-based calibration (SBC). Explanations of SBC's role and directions are being actively updated here. If you are interested in collaborating, issues are the best place to start our conversation.
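A minimal sketch of the SBC procedure on a toy model with an exact posterior (the prior and likelihood here are chosen purely for illustration): if inference is calibrated, the rank of each prior draw among its posterior draws is uniformly distributed.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy model with a closed-form posterior: theta ~ N(0, 1), y_i | theta ~ N(theta, 1).
# SBC loop: draw theta from the prior, simulate data, draw from the posterior,
# and record the rank of theta among the posterior draws.
n_sims, n_obs, n_post = 1000, 10, 99
ranks = np.empty(n_sims, dtype=int)
for s in range(n_sims):
    theta = rng.normal()                           # draw from prior
    y = rng.normal(theta, 1.0, size=n_obs)         # simulate data given theta
    post_mean = y.sum() / (n_obs + 1)              # conjugate normal posterior
    post_sd = 1.0 / np.sqrt(n_obs + 1)
    draws = rng.normal(post_mean, post_sd, size=n_post)
    ranks[s] = int((draws < theta).sum())          # rank in 0..n_post

# Calibrated inference => ranks uniform on 0..n_post, so mean near n_post / 2.
print(ranks.mean())
```

In practice the posterior draws would come from the inference engine under test (e.g. a variational or Laplace approximation), and deviations of the rank histogram from uniformity diagnose miscalibration.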