How to Analyze Censored Concentration Data Using Modern Statistical Methods of Survival Analysis: Background and Nonparametric Methods
Abstract
1. Introduction
2. How Do Censored Concentration Data Arise?
3. Examples of Methods for Analyzing Censored Concentration Data
3.1. Deleting Censored Concentrations
3.2. Data-Fabrication Methods
3.2.1. One-Half LRL
3.2.2. Methods Based on Quantiles or Order Statistics
3.3. Ignoring the Reporting Limits
3.4. Partitioning Concentrations into Discrete Classes
3.5. Survival-Analysis Methods
4. Basic Concepts and Terminology
4.1. Functions for Specifying Probability Distributions
4.2. Censored Data
- Avoid fabricating data whenever possible.
- Use only information from available data that is known with high confidence.
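One way to follow these two principles in practice is to store each measurement as the interval in which the true concentration is known to lie, rather than substituting a single made-up number for a nondetect. The following is a minimal sketch under stated assumptions: the names `CensoredConcentration` and `encode_measurement` are hypothetical, concentrations are assumed nonnegative, and only left-censoring at a reporting limit is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CensoredConcentration:
    """Interval known to contain the true concentration (hypothetical type)."""
    lower: float  # smallest value the true concentration could take
    upper: float  # largest value the true concentration could take

    @property
    def is_censored(self) -> bool:
        # An exact measurement has identical bounds; a censored one does not.
        return self.lower != self.upper


def encode_measurement(value: float, below_limit: bool) -> CensoredConcentration:
    """Encode one laboratory result without fabricating a value.

    value: the measured concentration, or the reporting limit if below_limit.
    below_limit: True when the result was reported only as "< reporting limit".
    """
    if value < 0:
        raise ValueError("concentrations and reporting limits must be nonnegative")
    if below_limit:
        # Assumption: all that is known is 0 <= concentration <= reporting limit.
        return CensoredConcentration(lower=0.0, upper=value)
    return CensoredConcentration(lower=value, upper=value)


# Example: one nondetect at a reporting limit of 5.0 and one measured value.
sample = [encode_measurement(5.0, below_limit=True),
          encode_measurement(12.3, below_limit=False)]
n_censored = sum(obs.is_censored for obs in sample)
```

Keeping both bounds means that later analysis steps can see exactly what is known about each observation and nothing more.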
5. Nonparametric Survival-Analysis Methods
- Nonparametric methods are much less sensitive to outliers than parametric methods.
- Nonparametric methods make fewer assumptions that must be verified regarding properties of the underlying populations from which the data are obtained. In particular, no specific probability distribution is assumed for concentration data. These methods are therefore applicable in many cases where parametric methods are invalid because their distributional assumptions are clearly untenable or where these assumptions cannot be convincingly assessed.
- In cases where the probability distribution assumed by a parametric method is consistent with properties of the sampled population (regardless of whether this consistency can be convincingly demonstrated) nonparametric methods often are nearly as efficient as parametric methods. This means that they require only slightly larger sample sizes to achieve the same statistical power.
1. Characterizing the distribution of concentrations for an individual study site or date;
2. Pairwise comparison of concentration distributions for different sites or dates;
3. Testing homogeneity and detecting monotonic trends in concentration distributions for three or more sites or dates.
5.1. Characterizing Concentration Distributions
5.1.1. R Example
5.1.2. SAS Example
5.2. Pairwise Comparison of Concentration Distributions
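- H0: $\mathrm{cPDF}_i(x) = \mathrm{cPDF}_j(x)$ for all x;
- H1: $\mathrm{cPDF}_i(x) \neq \mathrm{cPDF}_j(x)$ for some x (two-sided);
- H1: $\mathrm{cPDF}_i(x) \geq \mathrm{cPDF}_j(x)$ for all x, with “>” for some x (one-sided, “site i greater than site j”);
- H1: $\mathrm{cPDF}_i(x) \leq \mathrm{cPDF}_j(x)$ for all x, with “<” for some x (one-sided, “site i less than site j”).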
Hosmer et al. [65] describe the difficulty as follows: “A problem can occur if the estimated survival functions cross one another. This means that, in some time intervals, one group will have a more favorable survival experience, while in other time intervals, the other group will have the more favorable experience. This situation is analogous to having interaction present when applying Mantel-Haenszel methods to a stratified contingency table… Fleming, Harrington, and O’Sullivan [66] proposed a method that addresses the problem by using, as a test statistic, the maximum observed difference between the two survival functions. This test has not been implemented in any software package… For the time being, our only check is via a visual examination of the plot of the Kaplan-Meier estimator for the groups being compared. If we see that the curves cross, then this ‘interaction’ may be present.”

Thus, Hosmer et al. [65] recommend the same visual diagnostic that we outlined above: overlay plots of the two estimated cPDFs (or PDFs) and visually determine whether they decisively cross. Kaplan–Meier estimates of the cPDFs are used with right-censored data, and Turnbull estimates are used with left-censored, doubly censored, or interval-censored data.
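The visual check described above can be supplemented with a simple numerical one. The sketch below, assuming right-censored data and NumPy, computes Kaplan–Meier estimates of two cPDFs, evaluates them on a common grid, and reports whether their difference changes sign and how large the largest absolute difference is. The function names (`kaplan_meier`, `step_eval`, `compare_curves`) and the tolerance are illustrative assumptions. The maximum difference is only a descriptive quantity, not the test statistic of Fleming et al. [66], and Turnbull estimation for left-censored or interval-censored data is not shown.

```python
import numpy as np


def kaplan_meier(values, observed):
    """Kaplan-Meier estimate for right-censored data.

    values: recorded values (exact, or the censoring value if censored).
    observed: True where the value is exact, False where right-censored.
    Returns the distinct exact values and the estimated cPDF just after each.
    """
    values = np.asarray(values, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    event_values = np.unique(values[observed])
    surv, s = [], 1.0
    for v in event_values:
        at_risk = np.sum(values >= v)              # observations still "at risk"
        events = np.sum((values == v) & observed)  # exact observations equal to v
        s *= 1.0 - events / at_risk
        surv.append(s)
    return event_values, np.array(surv)


def step_eval(event_values, surv, grid):
    """Evaluate the right-continuous step estimate on grid (1.0 before first jump)."""
    idx = np.searchsorted(event_values, grid, side="right")
    return np.concatenate(([1.0], surv))[idx]


def compare_curves(vals_a, obs_a, vals_b, obs_b, tol=1e-12):
    """Return (curves_cross, max_abs_difference) for two estimated cPDFs."""
    ta, sa = kaplan_meier(vals_a, obs_a)
    tb, sb = kaplan_meier(vals_b, obs_b)
    grid = np.union1d(ta, tb)                      # all jump points of either curve
    diff = step_eval(ta, sa, grid) - step_eval(tb, sb, grid)
    signs = np.sign(diff[np.abs(diff) > tol])      # ignore points where curves agree
    crosses = bool(np.any(signs[1:] != signs[:-1]))
    return crosses, float(np.max(np.abs(diff)))


# Two toy groups with no censoring: the first pair is clearly ordered,
# the second pair crosses.
ordered = compare_curves([1, 2, 3, 4], [True] * 4, [5, 6, 7, 8], [True] * 4)
crossing = compare_curves([1, 1, 10, 10], [True] * 4, [5, 5, 5, 5], [True] * 4)
```

A sign change at only one or two grid points may reflect sampling noise, so the overlay plot should still be inspected before concluding that the curves decisively cross.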
5.2.1. R Example
5.2.2. SAS Example
5.3. Tests of Homogeneity and Monotonic Trends in Multiple Concentration Distributions
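- H0: $\mathrm{cPDF}_1(x) = \mathrm{cPDF}_2(x) = \cdots = \mathrm{cPDF}_k(x)$ for all x;
- H1: $\mathrm{cPDF}_i(x) \neq \mathrm{cPDF}_j(x)$ for some concentration (x) and pair of sites (i, j).

- H0: $\mathrm{cPDF}_1(x) = \mathrm{cPDF}_2(x) = \cdots = \mathrm{cPDF}_k(x)$ for all x;
- H1: $\mathrm{cPDF}_1(x) \geq \mathrm{cPDF}_2(x) \geq \cdots \geq \mathrm{cPDF}_k(x)$ for all x, with “>” for some x and pair (i, j) (“decreasing trend”);
- H1: $\mathrm{cPDF}_1(x) \leq \mathrm{cPDF}_2(x) \leq \cdots \leq \mathrm{cPDF}_k(x)$ for all x, with “<” for some x and pair (i, j) (“increasing trend”).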
5.3.1. R Example
5.3.2. SAS Example
6. Discussion
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Fusek, M. On testing reduction of left-censored Weibull distribution to exponential submodel. MENDEL Soft Comput. J. 2017, 23, 179–184.
2. Fusek, M.; Michálek, J. Left-censored samples from skewed distributions: Statistical inference and applications. Acta Univ. Agric. Et Silvic. Mendel. Brun. 2018, 66, 245–252.
3. Gibbons, R.D.; Bhaumik, D.; Aryal, S. Statistical Methods for Groundwater Monitoring, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2009.
4. Helsel, D.R. Statistics for Censored Environmental Data Using Minitab and R; John Wiley & Sons: Hoboken, NJ, USA, 2012.
5. Hsieh, P.H. Tales from the tail: Robust estimation of moments of environmental data with one-sided detection limits. Comput. Stat. Data Anal. 2012, 56, 4266–4277.
6. Huang, L.; Chen, L.; Wang, H.; Wang, L. Parametric and semiparametric estimation of correlation for gamma-distributed environmental pollutant data with non-detects. Measurement 2026, 261, 119981.
7. Huybrechts, T.; Thas, O.; Dewulf, J.; Van Langenhove, H. How to estimate moments and quantiles of environmental data sets with non-detected observations? A case study on volatile organic compounds in marine water samples. J. Chromatogr. A 2002, 975, 123–133.
8. Kroll, C.N.; Stedinger, J.R. Estimation of moments and quantiles using censored data. Water Resour. Res. 1996, 32, 1005–1012.
9. Leith, K.F.; Bowerman, W.W.; Wierda, M.R.; Best, D.A.; Grubb, T.G.; Sikarske, J.G. A comparison of techniques for assessing central tendency in left-censored data using PCB and p,p’DDE contaminant concentrations from Michigan’s Bald Eagle Biosentinel Program. Chemosphere 2010, 80, 7–12.
10. Shoari, N. Quantitative Analysis of Left-Censored Concentration Data in Environmental Site Characterization. Ph.D. thesis, École de Technologie Supérieure, Montreal, QC, Canada, 2016.
11. Shoari, N.; Dubé, J.S. An investigation of the impact of left-censored soil contamination data on the uncertainty of descriptive statistical parameters. Environ. Toxicol. Chem. 2016, 35, 2623–2631.
12. Shoari, N.; Dubé, J.S. Toward improved analysis of concentration data: Embracing nondetects. Environ. Toxicol. Chem. 2018, 37, 643–656.
13. Shoari, N.; Dubé, J.S.; Chenouri, S. On the use of the substitution method in left-censored environmental data. Hum. Ecol. Risk Assess. Int. J. 2016, 22, 435–446.
14. Silva, F.H.R.d.; Pinto, É.J.d.A. Assessment of left-censored data treatment methods using stochastic simulation. Braz. J. Water Resour. 2023, 28, e42.
15. Stow, C.A.; Webster, K.E.; Wagner, T.; Lottig, N.; Soranno, P.A.; Cha, Y. Small values in big data: The continuing need for appropriate metadata. Ecol. Inform. 2018, 45, 26–30.
16. Wood, M.; Beresford, N.; Copplestone, D. Limit of detection values in data analysis: Do they matter? Radioprotection 2011, 46, S85–S90.
17. Zoffoli, H.J.O.; Varella, C.A.A.; do Amaral-Sobrinho, N.M.B.; Zonta, E.; Tolón-Becerra, A. Method of median semi-variance for the analysis of left-censored data: Comparison with other techniques using environmental data. Chemosphere 2013, 93, 1701–1709.
18. ANSP. 2005 Sabine River Studies for the Eastman Chemical Company Texas Operations; Technical Report 06-08D2; Academy of Natural Sciences of Philadelphia: Philadelphia, PA, USA, 2007.
19. Hart, J.J.; Jamison, M.N.; McNair, J.N.; Szlag, D.C. Frequency and degradation of SARS-CoV-2 markers N1, N2, and E in sewage. J. Water Health 2023, 21, 514–524.
20. Schmitz, B.W.; Innes, G.K.; Prasek, S.M.; Betancourt, W.Q.; Stark, E.R.; Foster, A.R.; Abraham, A.G.; Gerba, C.P.; Pepper, I.L. Enumerating asymptomatic COVID-19 cases and estimating SARS-CoV-2 fecal shedding rates via wastewater-based epidemiology. Sci. Total Environ. 2021, 801, 149794.
21. Wu, J.; Wang, Z.; Lin, Y.; Zhang, L.; Chen, J.; Li, P.; Liu, W.; Wang, Y.; Yao, C.; Yang, K. Technical framework for wastewater-based epidemiology of SARS-CoV-2. Sci. Total Environ. 2021, 791, 148271.
22. Ando, H.; Reynolds, K. Handling left-censored wastewater surveillance data at the city level: A state-space model incorporating a logistic function. Water Res. 2026, 294, 125488.
23. Shrestha, A.; Dorevitch, S. Evaluation of rapid qPCR method for quantification of E. coli at non-point source impacted Lake Michigan beaches. Water Res. 2019, 156, 395–403.
24. Haugland, R.; Oshima, K.; Sivaganesan, M.; Dufour, A.; Varma, M.; Siefring, S.; Nappier, S.; Schnitker, B.; Briggs, S. Large-scale comparison of E. coli levels determined by culture and a qPCR method (EPA Draft Method C) in Michigan towards the implementation of rapid, multi-site beach testing. J. Microbiol. Methods 2021, 184, 106186.
25. McNair, J.N.; Lane, M.J.; Hart, J.J.; Porter, A.M.; Briggs, S.; Southwell, B.; Sivy, T.; Szlag, D.C.; Scull, B.T.; Pike, S.; et al. Validity assessment of Michigan’s proposed qPCR threshold value for rapid water-quality monitoring of E. coli contamination. Water Res. 2022, 226, 119235.
26. Saleem, F.; Schellhorn, H.E.; Simhon, A.; Edge, T.A. Same-day Enterococcus qPCR results of recreational water quality at two Toronto beaches provide added public health protection and reduced beach days lost. Can. J. Public Health 2023, 114, 676–687.
27. McNair, J.N.; Rediske, R.R.; Hart, J.J.; Jamison, M.N.; Briggs, S. Performance of Colilert-18 and qPCR for monitoring E. coli contamination at freshwater beaches in Michigan. Environments 2025, 12, 21.
28. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2025.
29. SAS Institute Inc. SAS OnlineDoc 9.1.3; SAS Institute Inc.: Cary, NC, USA, 2004.
30. Ware, J.H.; DeMets, D.L. Reanalysis of some baboon descent data. Biometrics 1976, 32, 459–463.
31. Gillespie, B.W.; Chen, Q.; Reichert, H.; Franzblau, A.; Hedgeman, E.; Lepkowski, J.; Adriaens, P.; Demond, A.; Luksemburg, W.; Garabrant, D.H. Estimating population distributions when some data are below a limit of detection by using a reverse Kaplan-Meier estimator. Epidemiology 2010, 21, S64–S70.
32. Currie, L.A. Nomenclature in evaluation of analytical methods including detection and quantification capabilities (IUPAC Recommendations 1995). Pure Appl. Chem. 1995, 67, 1699–1723.
33. NCCLS. Protocols for Determination of Limits of Detection and Limits of Quantitation; Approved Guideline; National Committee for Clinical Laboratory Standards: Wayne, PA, USA, 2004.
34. Eurachem. The Fitness for Purpose of Analytical Methods: A Laboratory Guide to Method Validation and Related Topics, 2nd ed.; Eurachem: Bucharest, Romania, 2014; ISBN 978-91-87461-59-0. Available online: www.eurachem.org (accessed on 31 December 2023).
35. Skoog, D.; Holler, F.; Crouch, S. Principles of Instrumental Analysis; Cengage Learning: Boston, MA, USA, 2018.
36. Fritz, J.S.; Schenk, G.H. Quantitative Analytical Chemistry; Allyn and Bacon, Inc.: Newton, MA, USA, 1987.
37. Kutner, M.H.; Nachtsheim, C.J.; Neter, J.; Li, W. Applied Linear Statistical Models; McGraw-Hill: New York, NY, USA, 2005.
38. Parker, P.A.; Vining, G.G.; Wilson, S.R.; Szarka, J.L., III; Johnson, N.G. The prediction properties of classical and inverse regression for the simple linear calibration problem. J. Qual. Technol. 2010, 42, 332–347.
39. Montgomery, D.C.; Peck, E.A.; Vining, G.G. Introduction to Linear Regression Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2021.
40. Antweiler, R.C.; Taylor, H.E. Evaluation of statistical treatments of left-censored environmental data using coincident uncensored data sets: I. Summary statistics. Environ. Sci. Technol. 2008, 42, 3732–3738.
41. Antweiler, R.C. Evaluation of statistical treatments of left-censored environmental data using coincident uncensored data sets. II. Group comparisons. Environ. Sci. Technol. 2015, 49, 13439–13446.
42. George, B.J.; Gains-Germain, L.; Broms, K.; Black, K.; Furman, M.; Hays, M.D.; Thomas, K.W.; Simmons, J.E. Censoring trace-level environmental data: Statistical analysis considerations to limit bias. Environ. Sci. Technol. 2021, 55, 3786–3795.
43. Shumway, R.H.; Azari, R.S.; Kayhanian, M. Statistical approaches to estimating mean water quality concentrations with detection limits. Environ. Sci. Technol. 2002, 36, 3345–3353.
44. Gilbert, R.O. Statistical Methods for Environmental Pollution Monitoring; John Wiley & Sons: Hoboken, NJ, USA, 1987.
45. Agresti, A.; Coull, B.A. Approximate is better than “exact” for interval estimation of binomial proportions. Am. Stat. 1998, 52, 119–126.
46. Brown, L.D.; Cai, T.T.; DasGupta, A. Interval estimation for a binomial proportion. Stat. Sci. 2001, 16, 101–133.
47. Agresti, A.; Caffo, B. Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. Am. Stat. 2000, 54, 280–288.
48. Agresti, A. Categorical Data Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2013.
49. Fagerland, M.W.; Lydersen, S.; Laake, P. The McNemar test for binary matched-pairs data: Mid-p and asymptotic are better than exact conditional. BMC Med. Res. Methodol. 2013, 13, 91.
50. Calhoun, P. Exact: Unconditional Exact Test, R Package Version 3.2; CRAN: Vienna, Austria, 2022. Available online: https://CRAN.R-project.org/package=Exact (accessed on 18 August 2023).
51. Shan, G.; Wang, W. ExactCIdiff: Inductive Confidence Intervals for the Difference Between Two Proportions, R Package Version 2.1; CRAN: Vienna, Austria, 2022. Available online: https://CRAN.R-project.org/package=ExactCIdiff (accessed on 18 August 2023).
52. Hollander, M.; Wolfe, D.A.; Chicken, E. Nonparametric Statistical Methods; John Wiley & Sons: Hoboken, NJ, USA, 2014.
53. Lydersen, S.; Fagerland, M.W.; Laake, P. Recommended tests for association in 2×2 tables. Stat. Med. 2009, 28, 1159–1175.
54. Attwood, K.; Park, S.; Hutson, A.D. Practical and robust test for comparing binomial proportions in the randomized phase II setting. Pharm. Stat. 2022, 21, 361–371.
55. Anderson-Bergman, C. icenReg: Regression models for interval censored data in R. J. Stat. Softw. 2017, 81, 1–23.
56. Feller, W. An Introduction to Probability Theory and Its Applications; John Wiley & Sons: Hoboken, NJ, USA, 1968.
57. Klein, J.P.; Moeschberger, M.L. Survival Analysis: Techniques for Censored and Truncated Data; Springer: New York, NY, USA, 2003.
58. Kalbfleisch, J.D.; Prentice, R.L. The Statistical Analysis of Failure Time Data; John Wiley & Sons: Hoboken, NJ, USA, 2002.
59. Conover, W.J. Practical Nonparametric Statistics; John Wiley & Sons: Hoboken, NJ, USA, 1999.
60. Lehmann, E.L. Nonparametrics: Statistical Methods Based on Ranks; Holden-Day: San Francisco, CA, USA, 1975.
61. Collett, D. Modelling Survival Data in Medical Research; Chapman and Hall/CRC: New York, NY, USA, 2023.
62. Turnbull, B.W. Nonparametric estimation of a survivorship function with doubly censored data. J. Am. Stat. Assoc. 1974, 69, 169–173.
63. Turnbull, B.W. The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Stat. Soc. Ser. B (Methodological) 1976, 38, 290–295.
64. Oller, R.; Langohr, K. FHtest: An R Package for the Comparison of Survival Curves with Censored Data. J. Stat. Softw. 2017, 81, 1–25.
65. Hosmer, D.W., Jr.; Lemeshow, S.; May, S. Applied Survival Analysis: Regression Modeling of Time-To-Event Data; John Wiley & Sons: Hoboken, NJ, USA, 2008.
66. Fleming, T.R.; Harrington, D.P.; O’Sullivan, M. Supremum versions of the log-rank and generalized Wilcoxon statistics. J. Am. Stat. Assoc. 1987, 82, 312–320.











| Name for Time-to-Event Data | Suggested Name for Concentration Data |
|---|---|
| Probability distribution function | Probability distribution function (PDF) |
| Survivor function | Complementary probability distribution function (cPDF) |
| Probability density function | Probability density function (pdf) |
| Hazard function | Attenuation function (AF) |

| Probability | Quantile | LCL | UCL |
|---|---|---|---|
| 0.25 | 9 | 8 | 11 |
| 0.50 | 30 | 26 | 34 |
| 0.75 | 84 | 75 | 93 |

| Statistical Task | Section |
|---|---|
| Characterize concentration distributions | Section 5.1 |
| • Estimate PDF, cPDF, and point-wise confidence intervals | |
| • Estimate quantiles and their confidence intervals | |
| Pairwise comparison of concentration distributions | Section 5.2 |
| Tests of homogeneity and monotonic trends | Section 5.3 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
McNair, J.N.; Frobish, D.; Ciarrocchi, I.; Rediske, R.R. How to Analyze Censored Concentration Data Using Modern Statistical Methods of Survival Analysis: Background and Nonparametric Methods. Water 2026, 18, 1135. https://doi.org/10.3390/w18101135

