Foundation Models as a New Backbone for Medium-range Forecasting and Data Assimilation

Katherine Breen
NASA
Geosciences

Recent advances in machine learning have led to rapid progress in data-driven weather forecasting, with foundation models trained on large reanalysis datasets demonstrating competitive skill at medium-range lead times. These developments raise key questions about how such models compare to established numerical weather prediction (NWP) systems and how they can be rigorously evaluated and integrated into existing forecasting frameworks. This talk presents a quantitative, out-of-the-box evaluation of state-of-the-art foundation models against operational NWP benchmarks using standard forecast skill metrics across multiple atmospheric variables, pressure levels, and regions, highlighting both their strengths and current limitations. The talk also discusses early research efforts by public-sector scientists to develop machine learning–based data assimilation methods within NASA forecast systems, aiming to combine learned representations with scientific transparency, physical consistency, and reproducibility. Together, these efforts motivate a broader discussion of the increasing role of privately developed weather forecasting models, many of which are optimized for proprietary or operational objectives rather than scientific inquiry. They also underscore the importance of open, independently evaluated, and scientifically grounded model development for the future of weather and climate research.

