TY - JOUR
T1 - Debates
T2 - Does Information Theory Provide a New Paradigm for Earth Science? Sharper Predictions Using Occam's Digital Razor
AU - Weijs, Steven V.
AU - Ruddell, Benjamin L.
N1 - Publisher Copyright: ©2020. American Geophysical Union. All Rights Reserved.
PY - 2020/2/1
Y1 - 2020/2/1
N2 - Occam's Razor is a bedrock principle of science philosophy, stating that the simplest hypothesis (or model) is preferred, at any given level of model predictive performance. A modern restatement often attributed to Einstein explains, “Everything should be made as simple as possible, but not simpler.” Using principles from (algorithmic) information theory, both model descriptive performance and model complexity can be quantified in bits. This quantification yields a Pareto-style trade-off between model complexity (length of the model program in bits) and model performance (information loss in bits, or the missing information, needed to describe the original observations). Model complexity and performance can be collapsed to one single measure of lossless model size, which, when minimized, leads to optimal model complexity versus loss trade-off for generalization and prediction. Our view puts both simple data-driven and complex physical-process-based models on a continuum, in the sense that both describe patterns in observed data in compressed form, with different degrees of generality, model complexity, and descriptive performance. Information theory-based assessment of compression performance with fair and meaningful accounting for model complexity will enable us to best compare and combine the strengths of physics knowledge and data-driven modeling for a given problem, given the availability of data. “Suppose we draw a set of points on paper in a totally random manner” … “I am saying it is possible to find a geometric line whose notation is constant and uniform, following a certain law, that will pass through all points, and in the same order they were drawn.” … “But if that law is strongly composed, the thing that conforms to it should be seen as irregular” Gottfried Wilhelm Leibniz, 1686: Discours de métaphysique V, VI (from French).
AB - Occam's Razor is a bedrock principle of science philosophy, stating that the simplest hypothesis (or model) is preferred, at any given level of model predictive performance. A modern restatement often attributed to Einstein explains, “Everything should be made as simple as possible, but not simpler.” Using principles from (algorithmic) information theory, both model descriptive performance and model complexity can be quantified in bits. This quantification yields a Pareto-style trade-off between model complexity (length of the model program in bits) and model performance (information loss in bits, or the missing information, needed to describe the original observations). Model complexity and performance can be collapsed to one single measure of lossless model size, which, when minimized, leads to optimal model complexity versus loss trade-off for generalization and prediction. Our view puts both simple data-driven and complex physical-process-based models on a continuum, in the sense that both describe patterns in observed data in compressed form, with different degrees of generality, model complexity, and descriptive performance. Information theory-based assessment of compression performance with fair and meaningful accounting for model complexity will enable us to best compare and combine the strengths of physics knowledge and data-driven modeling for a given problem, given the availability of data. “Suppose we draw a set of points on paper in a totally random manner” … “I am saying it is possible to find a geometric line whose notation is constant and uniform, following a certain law, that will pass through all points, and in the same order they were drawn.” … “But if that law is strongly composed, the thing that conforms to it should be seen as irregular” Gottfried Wilhelm Leibniz, 1686: Discours de métaphysique V, VI (from French).
KW - Occam's razor
KW - algorithmic information theory
KW - data compression
KW - data-driven modeling
KW - model complexity
KW - physically based modeling
UR - http://www.scopus.com/inward/record.url?scp=85080991482&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85080991482&partnerID=8YFLogxK
U2 - 10.1029/2019WR026471
DO - 10.1029/2019WR026471
M3 - Comment/debate
AN - SCOPUS:85080991482
SN - 0043-1397
VL - 56
JO - Water Resources Research
JF - Water Resources Research
IS - 2
M1 - e2019WR026471
ER -