Lectures
What is data assimilation?
by Prof. Alberto Carrassi (University of Bologna, Italy, and University of Reading, UK)
Abstract:
This lecture defines what data assimilation is and highlights its importance for operational forecasting with mathematical models based on evolution equations. The development of data assimilation techniques will be traced through application examples, and theoretical and practical challenges will be described.
Nudging and backward-forward approach for data assimilation
by Prof. Didier Auroux (Université Côte d'Azur, France)
Abstract:
Nudging is a data assimilation method that uses dynamical relaxation to adjust a model towards observations. The standard nudging algorithm consists in adding to the model equations a feedback term, proportional to the difference between the observations and the corresponding model state. Also known as the Luenberger (or asymptotic) observer, it theoretically requires an infinite time window to converge. The Back and Forth Nudging (BFN) algorithm has been introduced in order to extend the efficiency of nudging to finite/small time windows. It consists in alternately solving the model forwards and backwards in time, with a nudging term in both cases, over the assimilation window. These approaches can be extended to more complex observers, for which non-observed variables can also be corrected with observed ones. We will give in this talk an overview of nudging, observers, and backward-forward algorithms, with applications to oceanography and fluid dynamics.
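As a rough illustration of the standard nudging algorithm described in this abstract, the sketch below adds a feedback term to a toy model. The harmonic oscillator, the forward Euler stepping, the noise-free observations, and the gain value are all assumptions chosen for illustration; the Back and Forth Nudging iteration itself is not implemented here.

```python
import math

def final_error(gain, steps=2000, dt=0.01):
    """Return the state error after integrating truth and model side by side.

    Toy model (an assumption for illustration): harmonic oscillator
    dx/dt = v, dv/dt = -x, stepped with forward Euler.
    Only the position x is observed; gain = 0 means no nudging.
    """
    x_true, v_true = 1.0, 0.0    # reference ("true") trajectory
    x_mod, v_mod = -1.0, 0.5     # model started from a wrong initial state
    for _ in range(steps):
        y_obs = x_true           # noise-free observation of position only
        # Nudging: feedback term proportional to (observation - model state)
        dx_mod = v_mod + gain * (y_obs - x_mod)
        dv_mod = -x_mod          # unobserved variable, corrected only via coupling
        dx_true, dv_true = v_true, -x_true
        x_mod, v_mod = x_mod + dt * dx_mod, v_mod + dt * dv_mod
        x_true, v_true = x_true + dt * dx_true, v_true + dt * dv_true
    return math.hypot(x_mod - x_true, v_mod - v_true)

if __name__ == "__main__":
    print("error without nudging:", final_error(0.0))
    print("error with nudging   :", final_error(2.0))
```

In this toy setting the unobserved velocity is also pulled towards the truth, because the position correction propagates through the model coupling. The error decays only asymptotically, which is the limitation that motivates finite-window variants such as BFN.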
From least squares to Kalman filter, particle filter, and beyond
by Dr. Haroldo F. de Campos Velho (National Institute for Space Research, Brazil)
Abstract:
The talk presents the historical development of estimation theory from least squares up to the Kalman filter. The estimation of the model-error covariance matrix will be addressed, describing the methodologies developed for it. Examples of data assimilation with the Kalman filter will be described. Alternative schemes for data assimilation under non-Gaussian stochastic dynamics will also be presented, including cases where the statistical moments are not defined. Finally, machine learning techniques that emulate the Kalman filter, reducing the computational effort of data assimilation, will also be shown.
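To make the link between least squares and the Kalman filter concrete, here is a minimal scalar sketch. The function names, the default values, and the toy observations are assumptions for illustration only. With a static model (f = 1, q = 0) and a very uncertain prior, the filter estimate approaches the least-squares solution for a constant, which is the sample mean of the observations.

```python
def kalman_step(x, p, y, f=1.0, q=0.0, h=1.0, r=1.0):
    """One forecast/analysis cycle of a scalar Kalman filter.

    x, p : previous analysis mean and variance
    y    : new observation, modelled as y = h * x + noise of variance r
    f, q : linear model coefficient and model-error variance
    """
    # Forecast step: propagate mean and variance with the linear model
    x_f = f * x
    p_f = f * p * f + q
    # Analysis step: blend forecast and observation using the Kalman gain
    k = p_f * h / (h * p_f * h + r)
    x_a = x_f + k * (y - h * x_f)
    p_a = (1.0 - k * h) * p_f
    return x_a, p_a

def filter_constant(observations, x0=0.0, p0=1.0e6, r=1.0):
    """Estimate a constant from noisy observations (f = 1, q = 0)."""
    x, p = x0, p0
    for y in observations:
        x, p = kalman_step(x, p, y, r=r)
    return x, p

if __name__ == "__main__":
    obs = [0.9, 1.2, 1.1, 0.8]           # hypothetical observations
    print(filter_constant(obs))          # mean close to sum(obs) / len(obs)
```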
Ensemble Kalman filter
by Dr. Takemasa Miyoshi (RIKEN Center for Computational Science, Japan)
Abstract:
The ensemble Kalman filter (EnKF) is a sequential filter suitable for data assimilation in geophysical models. The EnKF estimates the error covariance matrix from ensemble forecasts. The EnKF is related to the particle filter but relies on quasi-linear and Gaussian assumptions. Where this approach is applicable, it is much more efficient than particle filter alternatives. In this talk, special attention is dedicated to the local EnKF for efficient high-dimensional geophysical applications.
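The sketch below illustrates how an ensemble supplies the error covariances used in the update. It is a serial, deterministic (square-root) variant of the EnKF for a single scalar observation, not the local EnKF discussed in the lecture. The two-variable state, the function name, and the numbers in the usage example are assumptions for illustration.

```python
import math

def ensrf_update(ens_x, ens_z, y, r):
    """Assimilate one observation y of variable x (error variance r).

    ens_x, ens_z : ensemble forecasts of the observed variable x and of an
                   unobserved variable z. Returns the updated ensembles.
    """
    n = len(ens_x)
    mean_x = sum(ens_x) / n
    mean_z = sum(ens_z) / n
    anom_x = [x - mean_x for x in ens_x]
    anom_z = [z - mean_z for z in ens_z]
    # Sample variance and covariance estimated from the ensemble forecast
    var_x = sum(a * a for a in anom_x) / (n - 1)
    cov_zx = sum(az * ax for az, ax in zip(anom_z, anom_x)) / (n - 1)
    # Kalman gains for the observed and the unobserved variable
    k_x = var_x / (var_x + r)
    k_z = cov_zx / (var_x + r)
    # Update of the ensemble mean
    innovation = y - mean_x
    mean_x_a = mean_x + k_x * innovation
    mean_z_a = mean_z + k_z * innovation
    # Update of the anomalies with a reduced gain, so that the analysis
    # spread matches the Kalman filter analysis variance
    alpha = 1.0 / (1.0 + math.sqrt(r / (var_x + r)))
    new_x = [mean_x_a + ax - alpha * k_x * ax for ax in anom_x]
    new_z = [mean_z_a + az - alpha * k_z * ax
             for az, ax in zip(anom_z, anom_x)]
    return new_x, new_z

if __name__ == "__main__":
    # Hypothetical three-member ensemble in which z is correlated with x
    print(ensrf_update([0.0, 1.0, 2.0], [0.0, 2.0, 4.0], y=3.0, r=1.0))
```

The unobserved variable z is corrected purely through its ensemble covariance with x, which is the mechanism that lets the EnKF spread observational information across the model state.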
Optimal interpolation and variational (3D/4D) methods
by Dr. Amos Lawless (University of Reading, UK)
Abstract:
In this talk, two methodologies for data assimilation will be shown: optimal interpolation (OI) and variational techniques (3D-Var and 4D-Var). I will describe how the OI formula arises as the Best Linear Unbiased Estimator (BLUE). We will then see the link between the OI formulation and three-dimensional variational assimilation (3D-Var). Four-dimensional variational assimilation (4D-Var), as an extension of 3D-Var incorporating a forecasting model, will then be derived. Practical implementations of 3D-Var and 4D-Var will be discussed, along with comments on the challenges.
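The link between the OI/BLUE formula and 3D-Var can be checked numerically in the scalar case. The sketch below assumes a linear observation operator h and Gaussian error variances b (background) and r (observation); the function names and the gradient descent minimizer are illustrative choices, not the lecture's implementation. For a linear operator, the minimizer of the 3D-Var cost function coincides with the BLUE analysis.

```python
def blue(x_b, b, y, r, h=1.0):
    """Closed-form OI/BLUE analysis: x_a = x_b + K (y - h x_b)."""
    k = b * h / (h * b * h + r)
    return x_b + k * (y - h * x_b)

def three_d_var(x_b, b, y, r, h=1.0, n_iter=200):
    """Minimize J(x) = (x - x_b)^2 / (2 b) + (y - h x)^2 / (2 r)
    by gradient descent, starting from the background x_b."""
    hessian = 1.0 / b + h * h / r    # constant because J is quadratic
    step = 0.5 / hessian             # conservative fixed step size
    x = x_b
    for _ in range(n_iter):
        grad = (x - x_b) / b - h * (y - h * x) / r
        x -= step * grad
    return x

if __name__ == "__main__":
    # Hypothetical background, observation, and error variances
    print(blue(1.0, 1.0, 3.0, 1.0), three_d_var(1.0, 1.0, 3.0, 1.0))
```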
Adjoint-free approach to 4D variational data assimilation
by Dr. Max Yaremchuk (Naval Research Laboratory, Stennis Space Center, USA)
Abstract:
Development and maintenance of the linearized and adjoint code for advanced circulation models is a challenging issue, requiring a significant proportion of the total effort in operational 4D-Var data assimilation. Ensemble-based assimilation techniques provide a derivative-free alternative, which appears to be competitive with traditional variational methods in many practical applications. This presentation gives an overview of methods that employ ensemble information for computing the sensitivities of numerical models to their free parameters and for estimating the cost function gradients when optimizing the fit of model solutions to observations.
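As a generic illustration of estimating a cost function gradient from ensemble information, without tangent linear or adjoint code, the sketch below regresses cost changes on parameter perturbations. The scalar parameter, the quadratic toy cost, and the function names are assumptions for illustration and do not represent the specific methods reviewed in the presentation.

```python
def ensemble_gradient(cost, p0, perturbations):
    """Estimate dJ/dp at p0 by linear regression of the cost values
    on an ensemble of parameter perturbations (no adjoint needed)."""
    costs = [cost(p0 + d) for d in perturbations]
    n = len(perturbations)
    mean_d = sum(perturbations) / n
    mean_j = sum(costs) / n
    cov_dj = sum((d - mean_d) * (j - mean_j)
                 for d, j in zip(perturbations, costs))
    var_d = sum((d - mean_d) ** 2 for d in perturbations)
    return cov_dj / var_d            # regression slope approximates dJ/dp

def toy_cost(p):
    """Hypothetical cost function with its minimum at p = 3."""
    return (p - 3.0) ** 2

if __name__ == "__main__":
    # The exact derivative of toy_cost at p = 1 is 2 * (1 - 3) = -4
    print(ensemble_gradient(toy_cost, 1.0, [-0.1, -0.05, 0.05, 0.1]))
```

Each ensemble member costs one model evaluation, so the members can be run in parallel. The accuracy of the estimate depends on the ensemble size and on the size of the perturbations.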
Hybrid methods: The best of ensemble Kalman filters and variational methods
by Prof. Marc Bocquet (CEREA École des Ponts & EdF R&D, Île-de-France, France)
Abstract:
In this lecture, I will explain how the advantages and downsides of ensemble Kalman filters and of variational methods led to the proposal of new hybrid data assimilation methods intended to capture the best of both worlds. This lecture will mostly focus on the methodological aspects, showing why and how these new hybrid methods were created. I will first compare ensemble filtering and variational methods. I will then describe the main available hybrid methods, which I will group into categories, such as the methods based on hybrid covariance matrices, the ensembles of data assimilations (EDA), the iterative ensemble Kalman smoother (IEnKS), and 4DEnVar. Their performance and potential will be illustrated with numerical models.
On particle filters, particle flow filters, and smoothers: towards fully nonlinear data assimilation
by Prof. Peter Jan van Leeuwen (Colorado State University, USA)
Abstract:
In this talk, I will introduce particle filters, particle flow filters, and smoother strategies for data assimilation. These techniques will be applied to strongly nonlinear models with challenging dynamics, where advanced data assimilation methods may offer better performance.
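For readers new to particle filters, the following minimal sketch shows one analysis step of a bootstrap particle filter: importance weighting with a Gaussian likelihood, followed by systematic resampling. The scalar state, the directly observed variable, and the function name are assumptions for illustration; particle flow filters and smoothers are not covered by this sketch.

```python
import math
import random

def pf_update(particles, y, r, rng):
    """Weight particles by the likelihood of observation y, then resample.

    particles : list of scalar states (the state is observed directly)
    r         : observation error variance (Gaussian likelihood assumed)
    rng       : random.Random instance used for the resampling offset
    Returns (normalized weights, resampled particles).
    """
    # Importance weights, computed in log space for numerical stability
    log_w = [-0.5 * (y - x) ** 2 / r for x in particles]
    max_lw = max(log_w)
    w = [math.exp(lw - max_lw) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]
    # Systematic resampling: one random offset, then evenly spaced pointers
    n = len(particles)
    u0 = rng.random() / n
    resampled = []
    i, cumulative = 0, w[0]
    for k in range(n):
        u = u0 + k / n
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += w[i]
        resampled.append(particles[i])
    return w, resampled

if __name__ == "__main__":
    rng = random.Random(0)
    prior = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]   # hypothetical prior particles
    weights, posterior = pf_update(prior, y=1.0, r=0.5, rng=rng)
    print(weights)
    print(posterior)
```

No linearity or Gaussianity of the state distribution is assumed here, only a known likelihood. The price is that the weights tend to collapse onto very few particles as the dimension grows.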
Data assimilation by neural networks on ocean circulation model
by Dr. Olmo Zavala-Romero (Florida State University, USA)
Abstract:
Data assimilation (DA) demands a huge computational effort in high-resolution, real-time global and basin-scale ocean prediction systems. Our application of DA techniques focuses on the HYCOM (HYbrid Coordinate Ocean Model) system. In this talk, neural network results for DA in an ocean circulation model are presented and discussed, including the accuracy of the analysis and the computational performance of the method.
Data assimilation on space weather models
by Prof. Ludger Scherliess (Utah State University, USA)
Abstract:
Physics-based data assimilation models have been used in meteorology and oceanography for several decades and are now becoming prevalent for specifications and forecasts of the near-Earth space environment. This increased use of data assimilation models for this region coincides with the increase in data suitable for assimilation. Of particular interest is the ionized part of the Earth’s upper atmosphere, called the ionosphere, due to its large impact on technological systems. Over the past years, data assimilation models for this region have been developed that assimilate a variety of different data types, including ground-based and space-based observations. The models differ greatly in their complexity and in the data assimilation schemes applied, which include ensemble Kalman filter and 4-D variational approaches. We will introduce the use of data assimilation for the space environment and address the various approaches. Examples will be presented to discuss their advantages and disadvantages.
Data assimilation in hydrology by neural network
by Prof. Marie-Amélie Boucher (University of Sherbrooke, Canada; visiting scientist at the European Centre for Medium-Range Weather Forecasts, UK)
Abstract:
In this short course, I will first provide a bit of background and explain the methodology that I proposed in 2020 as a proof of concept that it is possible to use Multilayer Perceptrons and Extreme Learning Machines to perform hydrological data assimilation. This method relies on the fact that neural networks can learn any underlying nonlinear relationship between inputs and outputs. In the context of data assimilation, the goal is to leverage that learning capability to train an ensemble of neural networks to learn the relationship between streamflow simulated by a hydrological model and the state variables that generated this simulation. Because the problem is ill-constrained (the same streamflow value could be generated by different sets of state variables), we include additional input variables to constrain the problem. Those additional variables will vary depending on the watershed but might include, for instance, precipitation during the days prior to t0. Once the relationship between simulated streamflow and state variables is learned, the neural network model can be applied to the most recent streamflow observations instead of simulations, in order to obtain corrected values for the state variables. Because this is an ensemble method, the result is an ensemble of candidate state variables that can be reinserted into the hydrological model at t0 in order to obtain a better starting point for a forecast. The method and results were published in Water Resources Research (Boucher et al. 2020), and the data is already available on the Harvard Dataverse (Boucher 2020). Reusing the data and codes from that paper, I will provide a step-by-step tutorial on the implementation of the method in Matlab (or Octave). Unfortunately, Python or R codes are not yet available.
Boucher, M-A (2020) "Data for manuscript 'Data assimilation for streamflow forecasting using Extreme Learning Machines and Multilayer Perceptrons'", Harvard Dataverse, V1, https://doi.org/10.7910/DVN/DB3AUE
Boucher, M-A, Quilty, J. and Adamowski, J. (2020) "Data assimilation for streamflow forecasting using extreme learning machines and multilayer perceptrons", Water Resources Research, 56, e2019WR026226, https://doi.org/10.1029/2019WR026226
WRF atmospheric model and data assimilation by neural network
by Dr. Vinicius A. Almeida (Federal University of Rio de Janeiro, Brazil)
Abstract:
The practical feasibility of neural network models for data assimilation using local observation data in the WRF model for the Rio de Janeiro metropolitan region in Brazil is shown. Using 6-hour forecast fields, the neural network models are able to emulate the 3D-Var results for surface and multi-level variables. The main result is the reduction in CPU time enabled by the neural network models, which cut the data assimilation CPU time by factors of 121 and 25 for different machine learning models in comparison to the 3D-Var method under the same hardware configuration.
Data assimilation: big data and exascale computing
by Dr. Takemasa Miyoshi (RIKEN Center for Computational Science, Japan)
Abstract:
Numerical weather prediction (NWP) supports our daily lives. Weather models require higher spatiotemporal resolutions to prepare for extreme weather disasters and reduce the uncertainty of predictions. The accuracy of the initial state of the weather simulation is also critical, requiring more advanced data assimilation (DA) technology. By combining resolution and ensemble size, the world's largest weather DA experiment was carried out using a global cloud-resolving model and an ensemble Kalman filter method. The number of grid points was approximately 4.4 trillion, and 1.3 PiB of data was passed from the model simulation part to the DA part. A data-centric application design and approximate computing were adopted to speed up the overall DA system. The DA system, named NICAM-LETKF, scales to 131,072 nodes (6,291,456 cores) of the supercomputer Fugaku with a sustained performance of 29 PFLOPS and 79 PFLOPS for the simulation and DA parts, respectively. The talk also covers pioneering new applications of data assimilation to various high-performance-computer simulations, using advanced data assimilation to make sense of "Big Data".
Data learning: integrating data assimilation and machine learning – Applications to the COVID-19 pandemic
by Dr. Rossella Arcucci (Imperial College London, UK)
Abstract:
This lecture fits into the context of digital twins, which are usually made of two components: a model and some data. When developing a digital twin, many fundamental questions exist, some connected with the data and its reliability and uncertainty, and some to do with dynamic model updating. To combine model and data, we use Data Assimilation (DA). DA is the approximation of the true state of some physical system by combining real-world observations with a dynamic model. DA models have increased in sophistication to better fit application requirements and circumvent implementation issues. Nevertheless, these approaches are incapable of fully overcoming their unrealistic assumptions. Machine Learning (ML) shows great capability in approximating nonlinear systems and extracting meaningful features from high-dimensional data. ML algorithms can assist or replace traditional forecasting methods. However, the data used during training in any ML algorithm include numerical, approximation, and round-off errors, which are trained into the forecasting model. Integration of ML with DA increases the reliability of prediction by including information in real time and with a physical meaning. This talk introduces Data Learning, a field that integrates Data Assimilation and Machine Learning to overcome limitations in applying these fields to real-world data. We present several Data Learning methods and results for some test cases for COVID-19, though the equations are general and can easily be applied elsewhere.
Real-time predictive modelling: machine learning and data assimilation in environmental problems
by Prof. Fangxin Fang (Imperial College London, UK)
Abstract:
Numerical simulations of fluid dynamics have been indispensable in applications relevant to physics and engineering. To improve predictive capability, numerical algorithms have become increasingly sophisticated, using higher spatial and temporal resolution. Advanced deep learning (DL) techniques have achieved great progress in rapidly predicting fluid flows without prior knowledge of the underlying physical relationships [1-5]. In this talk, an overview of DL techniques in fluid dynamics is provided. The focus will be on the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Generative Adversarial Network (GAN). Reduced order modelling (ROM) and data assimilation techniques will be introduced for real-time operational modelling and uncertainty analysis. Combining machine learning and data assimilation will be nothing short of revolutionary for a large number of disciplines. Examples of large-scale data-driven modelling of fluid flow problems will be presented: ozone forecasting in China and flood prediction in Denmark. An example of optimising sensor locations using adjoint-ROM will be given as well [6].
REFERENCES
[1] Cheng M, Fang F, Navon IM, Zheng J, Tang X, Zhu J, Pain CC: Spatio‐temporal Hourly and Daily Ozone Forecasting in China Using a Hybrid Machine Learning Model: Autoencoder and Generative Adversarial Networks, Journal of Advances in Modeling Earth Systems, 2022.
[2] M. Cheng, F. Fang, C.C. Pain and I.M. Navon, Data-driven modelling of nonlinear spatio-temporal fluid flows using a deep convolutional generative adversarial network, Computer Methods in Applied Mechanics and Engineering, v. 365, 2020.
[3] M. Cheng, F. Fang, C.C. Pain and I.M. Navon, An advanced hybrid deep adversarial autoencoder for parameterized nonlinear fluid flow modelling, Computer Methods in Applied Mechanics and Engineering, v. 372, 1-19, 2020.
[4] M. Cheng, F. Fang, T. Kinouchi, I.M. Navon and C.C. Pain, Long lead-time daily and monthly streamflow forecasting using machine learning methods, Journal of Hydrology, v. 590, 1-13, 2020.
[5] M. Cheng, F. Fang, I.M. Navon, C.C. Pain, A real-time flow forecasting with deep convolutional generative adversarial network: Application to flooding event in Denmark, Physics of Fluids, v. 33, 2021.
[6] Fang F, Pain C, Navon IM, Xiao D, 2016, An efficient goal based reduced order model approach for targeted adaptive observations, International Journal for Numerical Methods in Fluids, v. 83, 263-275.