On the Probabilities of Environmental Extremes

Authors

  • Benjamin Kedem, Department of Mathematics and Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
  • Ryan M. Stauffer, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD 20742, USA
  • Xuze Zhang, Department of Mathematics and Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
  • Saumyadipta Pyne, Health Analytics Network, Pittsburgh, PA 15237; Public Health Dynamics Laboratory and Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA

DOI:

https://doi.org/10.6000/1929-6029.2021.10.07

Keywords:

Tail probabilities, repeated fusion, nitrogen dioxide, epidemiological, order statistics

Abstract

Environmental researchers, as well as epidemiologists, often need to determine the probability that a variable of interest exceeds a high threshold, based on observations that are much smaller than the threshold. Moreover, the data available for that task may be of only moderate size. This generic problem is addressed by repeatedly fusing the real data with synthetic computer-generated samples. The threshold probability of interest is approximated by certain subsequences created by an iterative algorithm that gives precise estimates. The method is illustrated using environmental data, including monitoring data on nitrogen dioxide levels in the air.

Published

2021-06-09

How to Cite

Kedem, B., Stauffer, R. M., Zhang, X., & Pyne, S. (2021). On the Probabilities of Environmental Extremes. International Journal of Statistics in Medical Research, 10, 72–84. https://doi.org/10.6000/1929-6029.2021.10.07

Section

General Articles