Theory

Inverse Problem

Consider a time-independent system, which can be described by the following model:

\[x= [\theta, \alpha]\]

where \(x\) is the state vector, \(\theta\) denotes the state variables characterizing the system behavior, and \(\alpha\) represents the state augmentation parameters for system design.

The stochastic formulations of the state vector and the observation can then be constructed by adding an additive random term to the deterministic formulation, as follows:

\[x=x_0+\eta\]
\[y=y_o+\epsilon\]

where \(\eta\) is the random perturbation on the initial state vector \(x_0\), \(y_o\) is the available observation of the system, and \(\epsilon\) is the random observation error. With a random state vector and observation, we can leverage Bayes' theorem to obtain a posterior distribution by combining the prior distribution and the data distribution; this is known as the inverse problem.
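
As a purely illustrative example, the sketch below draws one random state vector and one perturbed observation under assumed Gaussian statistics; the dimensions, covariance values, and variable names are assumptions made for this example only, not part of the toolbox.

    import numpy as np

    # Minimal sketch of the stochastic formulation above (illustrative values).
    rng = np.random.default_rng(0)

    n_state, n_obs = 5, 3

    x0 = np.ones(n_state)                 # deterministic initial state vector x_0
    y_o = np.array([1.2, 0.8, 1.1])       # available observation y_o

    P0 = 0.1 * np.eye(n_state)            # assumed covariance of the perturbation eta
    C = 0.05 * np.eye(n_obs)              # assumed observation error covariance of epsilon

    eta = rng.multivariate_normal(np.zeros(n_state), P0)    # random state perturbation
    epsilon = rng.multivariate_normal(np.zeros(n_obs), C)   # random observation error

    x = x0 + eta        # stochastic state vector
    y = y_o + epsilon   # stochastic observation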

\[p(x \mid y) \propto p(x)p(y \mid x)\]

Thus, the problem can be solved with data assimilation by finding the state vector that maximizes the posterior, i.e., the maximum a posteriori (MAP) estimate.
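
To make the MAP objective concrete, the sketch below evaluates the unnormalized log-posterior under an assumed Gaussian prior, Gaussian observation error, and a linear observation operator given as a matrix; these assumptions and names are chosen for the example and do not come from the toolbox.

    import numpy as np

    # Sketch: log p(x|y) = log p(x) + log p(y|x) + const, with Gaussian prior
    # (mean x0, covariance P), Gaussian observation error (covariance C), and a
    # linear observation operator H supplied as a matrix (an assumption here).
    def log_posterior(x, x0, P, y_o, C, H):
        """Return log p(x) + log p(y_o | x) up to an additive constant."""
        dx = x - x0                       # departure from the prior mean
        dy = y_o - H @ x                  # misfit in observation space
        log_prior = -0.5 * dx @ np.linalg.solve(P, dx)
        log_likelihood = -0.5 * dy @ np.linalg.solve(C, dy)
        return log_prior + log_likelihood

    # The MAP estimate is the x maximizing this quantity; the data assimilation
    # methods below provide practical ways to find or approximate it.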

Data Assimilation

Data assimilation methods can be broadly categorized into variational methods [A1] and Kalman filters [A2]. The variational method uses optimal control theory [A3] to reduce the misfit between the observation and the model realization in observation space, while the Kalman filter originates directly from the Bayesian formulation. However, the variational method requires considerable effort to construct the adjoint model, and the Kalman filter is quite costly for estimating the statistics of the state vector, especially for high-dimensional problems. To address these issues, a popular approach is to introduce ensemble techniques, and many ensemble-based methods have been proposed.

This toolbox is focused on ensemble-based data assimilation methods.

Ensemble Kalman Filtering (EnKF)

In the EnKF, the prior statistics are estimated by ensemble Monte Carlo sampling, and each sample at the \(k\)-th DA step can be defined as:

(1)\[x_k^{(j)} = [\theta_k^{(j)},\alpha_k^{(j)}] \qquad 1 \leq j \leq N_{en}\]

where \(N_{en}\) is the ensemble size.

With the ensemble representation in (1), the MAP estimation is equivalent to the determination of a variance-minimizing analysis. Accordingly, each sample is updated as:

(2)\[x_{k+1}^{(j)}=x_k^{(j)}+K(y^{(j)}-H(x_k^{(j)}))\]

with the gain matrix \(K=PH^T(HPH^T+C)^{-1}\), \(P = \frac{1}{N_{en}-1}X'X'^T\), and \(X'=(x^{(1)}-x^{(e)},x^{(2)}-x^{(e)},\dots,x^{(N_{en})}-x^{(e)})\), where \(H\) is the operator that maps from the state space to the observation space, \(C\) is the observation error covariance, and \(y^{(j)}\) is the observation for sample \(j\), obtained by perturbing \(y_o\) with the random error \(\epsilon\).
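
As a rough illustration of how the gain matrix and the update (2) can be assembled from an ensemble, here is a small sketch assuming the observation operator is available as a matrix \(H\); the function name, array shapes, and the explicit matrix inversion are choices made for the example, not the toolbox's implementation.

    import numpy as np

    # Sketch of the analysis step (2) for a linear observation operator H.
    def enkf_analysis(X, Y, H, C):
        """Update an ensemble X (n_state x N_en) using perturbed observations Y."""
        n_en = X.shape[1]
        X_prime = X - X.mean(axis=1, keepdims=True)       # anomaly matrix X'
        P = X_prime @ X_prime.T / (n_en - 1)              # sample covariance P
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + C)      # gain matrix K
        return X + K @ (Y - H @ X)                        # equation (2) for all samples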

The procedure of the EnKF can be summarized as follows (a code sketch of the full loop is given after the list):

  1. Give a first-guess (prior) state vector \(x^{(e)}\), and prescribe the prior and observation statistics;

  2. Realize \(N_{en}\) initial samples \(\{x^{(j)}\}_{j=1}^{N_{en}}\) around \(x^{(e)}\);

  3. Map the state vector to observation space by solving the RANS equations for each sample, obtaining \(HX\);

  4. Obtain the gain matrix \(K\) and update each sample based on (2);

  5. Return to step 3 and iterate until no further reduction of the misfit can be achieved.
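
The skeleton below strings these five steps together in one loop. The names `run_enkf` and `forward_model` are hypothetical, with `forward_model` standing in for the per-sample RANS solve of step 3; the cross-covariance form of the gain matrix (equivalent to \(K=PH^T(HPH^T+C)^{-1}\) for a linear \(H\)) and the fixed iteration count in place of the convergence check of step 5 are simplifying assumptions for this sketch.

    import numpy as np

    # Skeleton of the five-step EnKF procedure; `forward_model` is a placeholder
    # that maps one sample to observation space (not a toolbox function).
    def run_enkf(x_prior, P_prior, y_o, C, forward_model, n_en=50, n_iter=10, seed=0):
        rng = np.random.default_rng(seed)
        n_obs = y_o.size
        # Step 2: realize N_en samples around the prior state vector.
        X = rng.multivariate_normal(x_prior, P_prior, size=n_en).T   # (n_state, n_en)
        for _ in range(n_iter):
            # Step 3: map each sample to observation space (HX).
            HX = np.column_stack([forward_model(X[:, j]) for j in range(n_en)])
            # Step 4: gain matrix from ensemble cross-covariances.
            X_prime = X - X.mean(axis=1, keepdims=True)
            HX_prime = HX - HX.mean(axis=1, keepdims=True)
            PHT = X_prime @ HX_prime.T / (n_en - 1)
            HPHT = HX_prime @ HX_prime.T / (n_en - 1)
            K = PHT @ np.linalg.inv(HPHT + C)
            # Perturbed observations y^(j) = y_o + epsilon^(j), then update (2).
            Y = y_o[:, None] + rng.multivariate_normal(np.zeros(n_obs), C, size=n_en).T
            X = X + K @ (Y - HX)
        return X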

For more information, please refer to [A4].

References

[A1]

François-Xavier Le Dimet and Olivier Talagrand. Variational algorithms for analysis and assimilation of meteorological observations: theoretical aspects. Tellus A: Dynamic Meteorology and Oceanography, 38(2):97–110, 1986.

[A2]

Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1):35–45, 1960.

[A3]

Leonard David Berkovitz. Optimal control theory. Volume 12. Springer Science & Business Media, 2013.

[A4]

H. Xiao, J.-L. Wu, J.-X. Wang, R. Sun, and C. J. Roy. Quantifying and reducing model-form uncertainties in Reynolds-averaged Navier–Stokes simulations: a data-driven, physics-informed Bayesian approach. Journal of Computational Physics, 324:115–136, 2016.