Developing a Surrogate Model for SWAT with Remotely Sensed Soil Moisture Using Python

Katherine Breen
Baylor University

Deep learning algorithms can be used to build surrogate ("data-driven") models of physics-based models, reducing computation time and user bias and enabling real-time estimates for use in risk assessments. The objective of this project is to develop a deep learning model (DLM) that makes accurate near-real-time soil-moisture estimates using the USDA’s Soil & Water Assessment Tool (SWAT) in conjunction with soil-moisture retrievals from NASA’s Soil Moisture Active Passive (SMAP) satellite.

Remotely sensed parameters provide global, high-resolution datasets with minimal data latency relative to in situ observational networks. Prior to neural network development, soil-moisture products from two satellites, NASA’s SMAP and ESA’s Soil Moisture and Ocean Salinity (SMOS), were assessed for agreement with in situ observations from the Soil Climate Analysis Network (SCAN) within the area of interest chosen for DLM development. SMAP was chosen for further analysis because its overall agreement with the in situ stations was superior. SMAP data were acquired with the Python package pytesmo, which was used to import and compare soil-moisture retrievals from April 2015 to December 2017.

Training data for the model were generated by running SWAT for 2015 to 2017 on 3,200 hydrologic response units (HRUs) within the area of interest. The Python deep learning framework Keras was used to build the neural networks. A multilayer perceptron (MLP) model did not yield satisfactory results, so a hybrid architecture was developed in which a long short-term memory (LSTM) network processed the transient SWAT model inputs (precipitation, temperature, etc.) and an MLP processed the static data (curve number, soil depth, etc.), so that predictions drew on both the static physical parameters of each HRU and the historical meteorological data. The loss function was the root-mean-square error (RMSE) between SWAT-predicted and SMAP-retrieved soil moisture. The Adagrad optimizer was used, with a dropout rate of 0.2 for all layers.
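A hybrid architecture of this kind might be sketched in Keras as follows. This is only an illustration, not the model used in the project: the layer widths, the 30-step input window, the feature counts, the merging of the two branches by concatenation, and the placement of the dropout layers are all assumptions, since the abstract specifies only the LSTM/MLP split, the RMSE loss, the Adagrad optimizer, and the dropout rate of 0.2. The names `rmse` and `build_hybrid_model` are hypothetical.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def rmse(y_true, y_pred):
    """Root-mean-square error between target and predicted soil moisture."""
    return tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true)))


def build_hybrid_model(n_timesteps, n_dynamic, n_static, dropout=0.2):
    """Two-branch network: LSTM for time series, MLP for static HRU data.

    All layer sizes are placeholder assumptions for illustration.
    """
    # Branch 1: transient inputs (e.g. precipitation, temperature) per time step.
    dyn_in = keras.Input(shape=(n_timesteps, n_dynamic), name="dynamic_inputs")
    x = layers.LSTM(32, dropout=dropout)(dyn_in)
    x = layers.Dropout(dropout)(x)

    # Branch 2: static HRU parameters (e.g. curve number, soil depth).
    stat_in = keras.Input(shape=(n_static,), name="static_inputs")
    s = layers.Dense(16, activation="relu")(stat_in)
    s = layers.Dropout(dropout)(s)

    # Merge the two branches (concatenation is an assumption) and regress
    # a single soil-moisture value.
    merged = layers.concatenate([x, s])
    h = layers.Dense(16, activation="relu")(merged)
    h = layers.Dropout(dropout)(h)
    out = layers.Dense(1, name="soil_moisture")(h)

    model = keras.Model(inputs=[dyn_in, stat_in], outputs=out)
    model.compile(optimizer=keras.optimizers.Adagrad(), loss=rmse)
    return model


# Hypothetical dimensions: 30 daily steps, 4 weather variables, 6 static parameters.
model = build_hybrid_model(n_timesteps=30, n_dynamic=4, n_static=6)
```

Training would then pass the two input arrays as a list, for example `model.fit([dynamic_array, static_array], targets)`, where the array names are placeholders for data prepared from the SWAT runs.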

Back to Science at Extreme Scales: Where Big Data Meets Large-Scale Computing