TY - JOUR
T1 - Deep Learning Optimizes Data-Driven Representation of Soil Organic Carbon in Earth System Model Over the Conterminous United States
AU - Tao, Feng
AU - Zhou, Zhenghu
AU - Huang, Yuanyuan
AU - Li, Qianyu
AU - Lu, Xingjie
AU - Ma, Shuang
AU - Huang, Xiaomeng
AU - Liang, Yishuang
AU - Hugelius, Gustaf
AU - Jiang, Lifen
AU - Doughty, Russell
AU - Ren, Zhehao
AU - Luo, Yiqi
N1 - Funding Information:
We acknowledge the National Supercomputing Center in Wuxi for providing the supercomputing facility (Sunway). We thank Ms. Naiyu Jiang for comments and suggestions offered when writing this paper. Funding. This research is financially supported by the National Key R&D Programs of China (2016YFB0201100), the National Natural Science Foundation of China (41776010) and Qingdao National Laboratory for Marine Science and Technology (QNLM2016ORP0108).
Publisher Copyright:
Copyright © 2020 Tao, Zhou, Huang, Li, Lu, Ma, Huang, Liang, Hugelius, Jiang, Doughty, Ren and Luo.
PY - 2020/6/3
Y1 - 2020/6/3
N2 - Soil organic carbon (SOC) is a key component of the global carbon cycle, yet Earth system models do not represent it well enough to accurately predict global carbon dynamics in response to climate change. This study integrated deep learning, data assimilation, 25,444 vertical soil profiles, and the Community Land Model version 5 (CLM5) to optimize the model representation of SOC over the conterminous United States. We first constrained parameters in CLM5 using observations of vertical profiles of SOC, both in a batch mode (using all individual soil layers in one batch) and at individual sites (site-by-site). The parameter values estimated from the site-by-site data assimilation were then either randomly sampled (random-sampling) to generate continentally homogeneous (constant) parameter values, or maximally preserved in their spatially heterogeneous distributions (varying parameter values to match the spatial patterns from the site-by-site data assimilation) so as to optimize the spatial representation of SOC in CLM5 through a deep learning technique (neural networking) over the conterminous United States. Comparing the spatial distributions of SOC modeled by CLM5 against observations yielded increasing predictive accuracy from default CLM5 settings (R² = 0.32) to randomly sampled (0.36), one-batch estimated (0.43), and deep learning optimized (0.62) parameter values. While CLM5 with parameter values derived from the random-sampling and one-batch methods substantially corrected the overestimation of SOC storage produced by the default model parameters, considerable geographical biases remained. CLM5 with the spatially heterogeneous parameter values optimized by the neural networking method had the lowest estimation error and the smallest geographical biases across the conterminous United States. Our study indicates that deep learning in combination with data assimilation can significantly improve the representation of SOC by complex land biogeochemical models.
KW - Community Land Model version 5 (CLM5)
KW - Earth system model
KW - data assimilation
KW - deep learning
KW - soil carbon dynamics
KW - soil organic carbon representation
UR - http://www.scopus.com/inward/record.url?scp=85089870961&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85089870961&partnerID=8YFLogxK
U2 - 10.3389/fdata.2020.00017
DO - 10.3389/fdata.2020.00017
M3 - Article
AN - SCOPUS:85089870961
SN - 2624-909X
VL - 3
JO - Frontiers in Big Data
JF - Frontiers in Big Data
M1 - 17
ER -