Sparse identification of non-linear dynamics

Sparse identification of nonlinear dynamics (SINDy) is a data-driven algorithm for discovering the governing equations of a dynamical system from measurements.[1] Given a series of snapshots of a dynamical system and their corresponding time derivatives, SINDy performs a sparsity-promoting regression (such as LASSO) of the derivatives on a library of nonlinear candidate functions of the snapshots to find the governing equations. This procedure relies on the assumption that most physical systems have only a few dominant terms that dictate the dynamics, given an appropriately selected coordinate system and good-quality training data.[2][3] It has been applied to identify the dynamics of fluids, based on proper orthogonal decomposition, as well as other complex dynamical systems, such as biological networks.[4]

Mathematical Overview

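First, consider a dynamical system of the form

$$\dot{\mathbf{x}} = \frac{d}{dt}\mathbf{x}(t) = \mathbf{f}(\mathbf{x}(t)), \tag{1}$$

where $\mathbf{x}(t) \in \mathbb{R}^n$ is a state vector (snapshot) of the system at time $t$ and the function $\mathbf{f}(\mathbf{x}(t))$ defines the equations of motion and constraints of the system. The time derivative may be either prescribed or numerically approximated from the snapshots.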

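With $\mathbf{x}$ and $\dot{\mathbf{x}}$ sampled at $m$ equidistant points in time ($t_1, t_2, \ldots, t_m$), these can be arranged into matrices of the form

$$\mathbf{X} = \begin{bmatrix} \mathbf{x}^T(t_1) \\ \mathbf{x}^T(t_2) \\ \vdots \\ \mathbf{x}^T(t_m) \end{bmatrix} = \begin{bmatrix} x_1(t_1) & x_2(t_1) & \cdots & x_n(t_1) \\ x_1(t_2) & x_2(t_2) & \cdots & x_n(t_2) \\ \vdots & \vdots & \ddots & \vdots \\ x_1(t_m) & x_2(t_m) & \cdots & x_n(t_m) \end{bmatrix}, \tag{2}$$

and similarly for $\dot{\mathbf{X}}$.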
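
As an illustration of this arrangement, the following minimal sketch builds a snapshot matrix with one row per sample time and approximates the derivative matrix by finite differences. The damped-oscillator trajectory, the sampling interval, and the use of NumPy's `np.gradient` are assumptions made for this example only; they are not part of the SINDy formulation.

```python
import numpy as np

# Hypothetical example trajectory: a damped oscillator with a known closed form,
# sampled at m equidistant times t_1, ..., t_m.
m = 1001
t = np.linspace(0.0, 10.0, m)
dt = t[1] - t[0]

# Snapshot matrix of shape (m, n): row i holds the state x(t_i), here with n = 2.
X = np.column_stack([
    np.exp(-0.1 * t) * np.cos(2.0 * t),
    -np.exp(-0.1 * t) * np.sin(2.0 * t),
])

# Derivative matrix of the same shape, approximated by second-order
# finite differences along the time axis (axis 0).
X_dot = np.gradient(X, dt, axis=0, edge_order=2)

# For this particular trajectory the exact derivatives are known, so the
# finite-difference error can be checked directly.
X_dot_exact = np.column_stack([
    -0.1 * X[:, 0] + 2.0 * X[:, 1],
    -2.0 * X[:, 0] - 0.1 * X[:, 1],
])
max_error = np.max(np.abs(X_dot - X_dot_exact))
print("max finite-difference error:", max_error)
```

With noisy measurements, plain finite differences amplify the noise, so a smoothed or regularized derivative estimate is usually preferable in practice.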

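Next, a library $\mathbf{\Theta}(\mathbf{X})$ of nonlinear candidate functions of the columns of $\mathbf{X}$ is constructed, which may be constant, polynomial, or more exotic functions (like trigonometric and rational terms, and so on):

$$\mathbf{\Theta}(\mathbf{X}) = \begin{bmatrix} \mathbf{1} & \mathbf{X} & \mathbf{X}^2 & \mathbf{X}^3 & \cdots & \sin(\mathbf{X}) & \cos(\mathbf{X}) & \cdots \end{bmatrix}. \tag{3}$$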
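
A candidate library of this kind can be sketched as follows. The function name `build_library`, its parameters, and the choice of monomials up to a given degree plus optional sine and cosine terms are illustrative assumptions; any other set of candidate functions could be used instead.

```python
import numpy as np
from itertools import combinations_with_replacement

def build_library(X, degree=2, include_trig=False):
    """Evaluate candidate functions on the snapshot matrix X of shape (m, n).

    Returns the library matrix (one column per candidate function) and
    a list of human-readable names for the columns.
    """
    m, n = X.shape
    columns, names = [np.ones(m)], ["1"]
    # All monomials of total degree 1..degree in the n state variables.
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(n), d):
            columns.append(np.prod(X[:, list(idx)], axis=1))
            names.append("*".join(f"x{i + 1}" for i in idx))
    # Optional trigonometric terms applied to each state variable.
    if include_trig:
        for func, label in ((np.sin, "sin"), (np.cos, "cos")):
            for i in range(n):
                columns.append(func(X[:, i]))
                names.append(f"{label}(x{i + 1})")
    return np.column_stack(columns), names

# Three snapshots of a two-dimensional state (made-up numbers).
X = np.array([[1.0, 2.0], [0.5, -1.0], [0.0, 3.0]])
Theta, names = build_library(X, degree=2, include_trig=True)
print(names)
print(Theta.shape)
```

Each row of the resulting matrix corresponds to one snapshot and each column to one candidate function, so the matrix has as many rows as the snapshot matrix.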

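The number of possible model structures from this library is combinatorially high. $\mathbf{f}(\mathbf{x}(t))$ is then replaced by $\mathbf{\Theta}(\mathbf{X})$ and a set of coefficient vectors $\mathbf{\Xi} = \begin{bmatrix} \boldsymbol{\xi}_1 & \boldsymbol{\xi}_2 & \cdots & \boldsymbol{\xi}_n \end{bmatrix}$ determining the active terms in $\mathbf{f}(\mathbf{x}(t))$:

$$\dot{\mathbf{X}} = \mathbf{\Theta}(\mathbf{X})\,\mathbf{\Xi}. \tag{4}$$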
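
To give a sense of how quickly the number of model structures grows, the sketch below counts the candidate terms in a purely polynomial library and the number of distinct subsets of active terms for a single equation. The three-state, degree-five setting is an arbitrary choice for illustration.

```python
from math import comb

def count_polynomial_terms(n_states, max_degree):
    """Number of monomials of total degree <= max_degree in n_states variables."""
    return comb(n_states + max_degree, max_degree)

n_states, max_degree = 3, 5
p = count_polynomial_terms(n_states, max_degree)

# Each of the p candidate terms is either active or inactive in one equation,
# so there are 2**p possible sparsity patterns per state variable.
n_structures_per_equation = 2 ** p
print(p, n_structures_per_equation)
```

Exhaustively testing every subset is therefore infeasible even for small systems, which is the motivation for a sparsity-promoting regression.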

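Because only a few terms are expected to be active at each point in time, an assumption is made that $\mathbf{f}(\mathbf{x}(t))$ admits a sparse representation in $\mathbf{\Theta}(\mathbf{X})$. This then becomes an optimization problem of finding a sparse $\mathbf{\Xi}$ that optimally embeds $\dot{\mathbf{X}}$. In other words, a parsimonious model is obtained by performing least squares regression on the system (4) with sparsity-promoting ($L_1$) regularization

$$\boldsymbol{\xi}_k = \underset{\boldsymbol{\xi}'_k}{\arg\min} \left\| \dot{\mathbf{X}}_k - \mathbf{\Theta}(\mathbf{X})\,\boldsymbol{\xi}'_k \right\|_2 + \lambda \left\| \boldsymbol{\xi}'_k \right\|_1, \tag{5}$$

where $\dot{\mathbf{X}}_k$ is the $k$-th column of $\dot{\mathbf{X}}$ and $\lambda$ is a regularization parameter. Finally, the sparse set of $\boldsymbol{\xi}_k$ can be used to reconstruct the dynamical system:

$$\dot{x}_k = \mathbf{\Theta}(\mathbf{x})\,\boldsymbol{\xi}_k. \tag{6}$$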
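
The regression step can be sketched with a small hand-written LASSO solver. This is a minimal illustration under stated assumptions, not a reference implementation: it minimizes the common squared-residual LASSO objective $\frac{1}{2m}\|\mathbf{y} - \mathbf{\Theta}\boldsymbol{\xi}\|_2^2 + \lambda\|\boldsymbol{\xi}\|_1$ by iterative soft-thresholding (ISTA), and the test system, the quadratic library, the value of `lam`, and the iteration count are all hypothetical choices. Exact derivatives are used so that the example is noise-free.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_ista(Theta, y, lam, n_iter=5000):
    """Minimize (1/(2m)) * ||y - Theta @ xi||_2^2 + lam * ||xi||_1 by ISTA."""
    m, p = Theta.shape
    # Step size 1/L, where L is the largest eigenvalue of Theta.T @ Theta / m.
    step = m / np.linalg.norm(Theta, 2) ** 2
    xi = np.zeros(p)
    for _ in range(n_iter):
        grad = Theta.T @ (Theta @ xi - y) / m
        xi = soft_threshold(xi - step * grad, step * lam)
    return xi

# Hypothetical test system: x1' = -0.1*x1 + 2*x2, x2' = -2*x1 - 0.1*x2,
# sampled from its closed-form solution with x(0) = (1, 0).
t = np.linspace(0.0, 10.0, 1001)
x1 = np.exp(-0.1 * t) * np.cos(2.0 * t)
x2 = -np.exp(-0.1 * t) * np.sin(2.0 * t)
X_dot = np.column_stack([-0.1 * x1 + 2.0 * x2, -2.0 * x1 - 0.1 * x2])

# Quadratic candidate library with columns [1, x1, x2, x1^2, x1*x2, x2^2].
names = ["1", "x1", "x2", "x1^2", "x1*x2", "x2^2"]
Theta = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])

# One sparse regression per state variable; column k of Xi holds xi_k.
Xi = np.column_stack(
    [lasso_ista(Theta, X_dot[:, k], lam=1e-3) for k in range(2)]
)

# Print the identified model, keeping only the nonzero coefficients.
for k in range(2):
    terms = [
        f"{Xi[j, k]:+.3f}*{names[j]}"
        for j in range(len(names))
        if Xi[j, k] != 0.0
    ]
    print(f"x{k + 1}' =", " ".join(terms))
```

The $L_1$ penalty shrinks the surviving coefficients slightly toward zero, so a common refinement is to refit ordinary least squares on the selected terms only. The original SINDy paper[2] instead uses a sequentially thresholded least-squares procedure, which is a different route to a sparse solution.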

References

  1. Brunton, Steven L.; Kutz, J. Nathan (2022-05-05). Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press. doi:10.1017/9781009089517. ISBN 9781009089517. Retrieved 2022-10-25.
  2. Brunton, Steven L.; Proctor, Joshua L.; Kutz, J. Nathan (2016-04-12). "Discovering governing equations from data by sparse identification of nonlinear dynamical systems". Proceedings of the National Academy of Sciences. 113 (15): 3932–3937. arXiv:1509.03580. Bibcode:2016PNAS..113.3932B. doi:10.1073/pnas.1517384113. ISSN 0027-8424. PMC 4839439. PMID 27035946.
  3. Huang, Yunfei; et al. (2022). "Sparse inference and active learning of stochastic differential equations from data". Scientific Reports. 12 (1): 21691. doi:10.1038/s41598-022-25638-9. PMC 9755218. PMID 36522347.
  4. Mangan, Niall M.; Brunton, Steven L.; Proctor, Joshua L.; Kutz, J. Nathan (2016-05-26). "Inferring biological networks by sparse identification of nonlinear dynamics". arXiv:1605.08368 [math.DS].