TY - JOUR
T1 - Characterizing sediment sources by non-negative matrix factorization of detrital geochronological data
AU - Saylor, J. E.
AU - Sundell, K. E.
AU - Sharman, G. R.
N1 - Funding Information:
This research was supported by grants from the National Science Foundation ( EAR-1550097 and - 1742952 ) to Saylor. The authors thank Dr. Noah McLean for extensive discussions and input which helped to shape this manuscript. We acknowledge Dr. P. Vermeesch who reviewed a previous version of this paper, and are extremely grateful to an anonymous reviewer for several helpful reviews. Finally, we thank Dr. A. Yin for help in editorial handling of this paper.
Publisher Copyright:
© 2019 Elsevier B.V.
PY - 2019/4/15
Y1 - 2019/4/15
N2 - This paper explores an inverse approach to the problem of characterizing sediment sources’ (“source” samples) age distributions based on samples from a particular depocenter (“sink” samples) using non-negative matrix factorization (NMF). It also outlines a method to determine the optimal number of sources to factorize from a set of sink samples (i.e., the optimum factorization rank). We demonstrate the power of this method by generating sink samples as random mixtures of known sources, factorizing them, and recovering the number of known sources, their age distributions, and the weighting functions used to generate the sink samples. Sensitivity testing indicates that similarity between factorized and known sources is positively correlated to 1) the number of sink samples, 2) the dissimilarity among sink samples, and 3) sink sample size. Specifically, the algorithm yields consistent, close similarity between factorized and known sources when the number of sink samples is more than ∼3 times the number of source samples, sink data sets are internally dissimilar (cross-correlation coefficient range >0.3, Kuiper V value range >0.35), and sink samples are well-characterized (>150–225 data points). However, similarity between known and factorized sources can be maintained while decreasing some of these variables if other variables are increased. Factorization of three empirical detrital zircon U–Pb data sets from the Book Cliffs, the Grand Canyon, and the Gulf of Mexico yields plausible source age distributions and weights. Factorization of the Book Cliffs data set yields five sources very similar to those recently independently proposed as the primary sources for Book Cliffs strata; confirming the utility of the NMF approach. The Grand Canyon data set exemplifies two general considerations when applying the NMF algorithm. First, although the NMF algorithm is able to identify source age distribution, additional geological details are required to discriminate between primary or recycled sources. Second, the NMF algorithm will identify the most basic elements of the mixed sink samples and so may subdivide sources that are themselves heterogeneous mixtures of more basic elements into those basic elements. Finally, application to a large Gulf of Mexico data set highlights the increased contribution from Appalachian sources during Cretaceous and Holocene time, potentially attributable to drainage reorganization. Although the algorithm reproduces known sources and yields reasonable sources for empirical data sets, inversions are inherently non-unique. Consequently, the results of NMF and their interpretations should be evaluated in light of independent geological evidence. The NMF algorithm is provided both as MATLAB code and a stand-alone graphical user interface for Windows and macOS (.exe and.app) along with all data sets discussed in this contribution.
AB - This paper explores an inverse approach to the problem of characterizing sediment sources’ (“source” samples) age distributions based on samples from a particular depocenter (“sink” samples) using non-negative matrix factorization (NMF). It also outlines a method to determine the optimal number of sources to factorize from a set of sink samples (i.e., the optimum factorization rank). We demonstrate the power of this method by generating sink samples as random mixtures of known sources, factorizing them, and recovering the number of known sources, their age distributions, and the weighting functions used to generate the sink samples. Sensitivity testing indicates that similarity between factorized and known sources is positively correlated to 1) the number of sink samples, 2) the dissimilarity among sink samples, and 3) sink sample size. Specifically, the algorithm yields consistent, close similarity between factorized and known sources when the number of sink samples is more than ∼3 times the number of source samples, sink data sets are internally dissimilar (cross-correlation coefficient range >0.3, Kuiper V value range >0.35), and sink samples are well-characterized (>150–225 data points). However, similarity between known and factorized sources can be maintained while decreasing some of these variables if other variables are increased. Factorization of three empirical detrital zircon U–Pb data sets from the Book Cliffs, the Grand Canyon, and the Gulf of Mexico yields plausible source age distributions and weights. Factorization of the Book Cliffs data set yields five sources very similar to those recently independently proposed as the primary sources for Book Cliffs strata; confirming the utility of the NMF approach. The Grand Canyon data set exemplifies two general considerations when applying the NMF algorithm. First, although the NMF algorithm is able to identify source age distribution, additional geological details are required to discriminate between primary or recycled sources. Second, the NMF algorithm will identify the most basic elements of the mixed sink samples and so may subdivide sources that are themselves heterogeneous mixtures of more basic elements into those basic elements. Finally, application to a large Gulf of Mexico data set highlights the increased contribution from Appalachian sources during Cretaceous and Holocene time, potentially attributable to drainage reorganization. Although the algorithm reproduces known sources and yields reasonable sources for empirical data sets, inversions are inherently non-unique. Consequently, the results of NMF and their interpretations should be evaluated in light of independent geological evidence. The NMF algorithm is provided both as MATLAB code and a stand-alone graphical user interface for Windows and macOS (.exe and.app) along with all data sets discussed in this contribution.
KW - Book Cliffs
KW - Grand Canyon
KW - Gulf of Mexico
KW - U–Pb
KW - low rank approximation
KW - mixture modeling
UR - http://www.scopus.com/inward/record.url?scp=85061405653&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061405653&partnerID=8YFLogxK
U2 - 10.1016/j.epsl.2019.01.044
DO - 10.1016/j.epsl.2019.01.044
M3 - Article
AN - SCOPUS:85061405653
SN - 0012-821X
VL - 512
SP - 46
EP - 58
JO - Earth and Planetary Sciences Letters
JF - Earth and Planetary Sciences Letters
ER -