TY - JOUR
T1 - Machine Learning-Based Prediction of Ecosystem-Scale CO2 Flux Measurements
AU - Uyekawa, Jeffrey
AU - Leland, John
AU - Bergl, Darby
AU - Liu, Yujie
AU - Richardson, Andrew D.
AU - Lucas, Benjamin
N1 - Publisher Copyright:
© 2025 by the authors.
PY - 2025/1
Y1 - 2025/1
N2 - AmeriFlux is a network of hundreds of sites across the contiguous United States providing tower-based ecosystem-scale carbon dioxide flux measurements at 30 min temporal resolution. While geographically wide-ranging, over its existence the network has suffered from multiple issues including towers regularly ceasing operation for extended periods and a lack of standardization of measurements between sites. In this study, we use machine learning algorithms to predict CO2 flux measurements at NEON sites (a subset of Ameriflux sites), creating a model to gap-fill measurements when sites are down or replace measurements when they are incorrect. Machine learning algorithms also have the ability to generalize to new sites, potentially even those without a flux tower. We compared the performance of seven machine learning algorithms using 35 environmental drivers and site-specific variables as predictors. We found that Extreme Gradient Boosting (XGBoost) consistently produced the most accurate predictions (Root Mean Squared Error of 1.81 μmolm−2s−1, R2 of 0.86). The model showed excellent performance testing on sites that are ecologically similar to other sites (the Mid Atlantic, New England, and the Rocky Mountains), but poorer performance at sites with fewer ecological similarities to other sites in the data (Pacific Northwest, Florida, and Puerto Rico). The results show strong potential for machine learning-based models to make more skillful predictions than state-of-the-art process-based models, being able to estimate the multi-year mean carbon balance to within an error ±50 gCm−2y−1 for 29 of our 44 test sites. These results have significant implications for being able to accurately predict the carbon flux or gap-fill an extended outage at any AmeriFlux site, and for being able to quantify carbon flux in support of natural climate solutions.
AB - AmeriFlux is a network of hundreds of sites across the contiguous United States providing tower-based ecosystem-scale carbon dioxide flux measurements at 30 min temporal resolution. While geographically wide-ranging, over its existence the network has suffered from multiple issues including towers regularly ceasing operation for extended periods and a lack of standardization of measurements between sites. In this study, we use machine learning algorithms to predict CO2 flux measurements at NEON sites (a subset of Ameriflux sites), creating a model to gap-fill measurements when sites are down or replace measurements when they are incorrect. Machine learning algorithms also have the ability to generalize to new sites, potentially even those without a flux tower. We compared the performance of seven machine learning algorithms using 35 environmental drivers and site-specific variables as predictors. We found that Extreme Gradient Boosting (XGBoost) consistently produced the most accurate predictions (Root Mean Squared Error of 1.81 μmolm−2s−1, R2 of 0.86). The model showed excellent performance testing on sites that are ecologically similar to other sites (the Mid Atlantic, New England, and the Rocky Mountains), but poorer performance at sites with fewer ecological similarities to other sites in the data (Pacific Northwest, Florida, and Puerto Rico). The results show strong potential for machine learning-based models to make more skillful predictions than state-of-the-art process-based models, being able to estimate the multi-year mean carbon balance to within an error ±50 gCm−2y−1 for 29 of our 44 test sites. These results have significant implications for being able to accurately predict the carbon flux or gap-fill an extended outage at any AmeriFlux site, and for being able to quantify carbon flux in support of natural climate solutions.
KW - AmeriFlux
KW - carbon dioxide flux
KW - machine learning
KW - National Ecological Observatory Network
KW - nature-based climate solutions
KW - phenocam
KW - XGBoost
UR - http://www.scopus.com/inward/record.url?scp=85215806853&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85215806853&partnerID=8YFLogxK
U2 - 10.3390/land14010124
DO - 10.3390/land14010124
M3 - Article
AN - SCOPUS:85215806853
SN - 2073-445X
VL - 14
JO - Land
JF - Land
IS - 1
M1 - 124
ER -