TY - JOUR
T1 - Should we be concerned about multiple comparisons in hierarchical Bayesian models?
AU - Ogle, Kiona
AU - Peltier, Drew
AU - Fell, Michael
AU - Guo, Jessica
AU - Kropp, Heather
AU - Barber, Jarrett
N1 - Publisher Copyright: © 2018 The Authors. Methods in Ecology and Evolution © 2018 British Ecological Society
PY - 2019/4
Y1 - 2019/4
AB - Ecologists increasingly use hierarchical Bayesian (HB) models to estimate group-level parameters that vary by, for example, species, treatment level, habitat type or other factors. Group-level parameters may be compared to infer differences among levels. We would conclude a non-zero pairwise difference, separately, for each pair in the group, when the respective 95% credible interval excludes zero. Classical procedures suggest that the rejection procedure should be adjusted to control the family-wise error rate (FWER) for a family of differences. Adjustments for FWER have been considered unnecessary in HB models due to partial pooling whereby increased pooling strength – group-level parameters become more alike – could lead to decreased rejection rates (Type I error, FWER, or power) and increased false acceptance rates (Type II error and its family-wise analogue). To address this, we conducted a simulation experiment with factors of sample size, group size, balance (missingness), overall mean and ratio of within- to between-group variances, resulting in 2,016 factor-level combinations (‘scenarios’), replicated 100 times, producing 201,600 pseudo datasets analysed in a Bayesian framework. We evaluated the results in the context of a new partial pooling index (PPI), which we show is also applicable to more complex model structures based on four real-data examples. Simulation results confirm intuition that rejection rates (false and true) decrease and false acceptance rates increase with increasing PPI or pooling strength (scenario-level R² = 0.81–0.97). The relationship with PPI differed greatly for balanced versus unbalanced designs and was affected by group size, especially for family-wise errors. Critically, an HB model does not guarantee that the FWER will follow a set significance level (α); for example, even minor imbalance can lead to FWER > α for weak to moderate pooling. These results are confirmed by the real-data examples, suggesting that ecologists need to consider FWER when applying HB models, especially for large group sizes or incomplete datasets. Contrary to current thought, HB models are not immune to issues of multiplicity, and our proposed PPI offers a method for evaluating if a particular HB analysis is likely to produce FWER ≤ α (no adjustment or alternative solution required).
KW - borrowing of strength
KW - effective number of parameters
KW - family-wise error
KW - hierarchical model
KW - imbalance
KW - partial pooling
KW - type I error
UR - http://www.scopus.com/inward/record.url?scp=85059843813&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059843813&partnerID=8YFLogxK
U2 - 10.1111/2041-210X.13139
DO - 10.1111/2041-210X.13139
M3 - Article
AN - SCOPUS:85059843813
SN - 2041-210X
VL - 10
SP - 553
EP - 564
JO - Methods in Ecology and Evolution
JF - Methods in Ecology and Evolution
IS - 4
ER -