Article
Peer-Review Record

Improving Air Pollutant Metal Oxide Sensor Quantification Practices through: An Exploration of Sensor Signal Normalization, Multi-Sensor and Universal Calibration Model Generation, and Physical Factors Such as Co-Location Duration and Sensor Age

by Kristen Okorn and Michael Hannigan
Atmosphere 2021, 12(5), 645; https://doi.org/10.3390/atmos12050645
Submission received: 24 April 2021 / Revised: 11 May 2021 / Accepted: 15 May 2021 / Published: 19 May 2021
(This article belongs to the Special Issue Atmospheric Trace Gas Source Detection and Quantification)

Round 1

Reviewer 1 Report

Synopsis:

In this paper, the authors explore some statistical methods for field calibration of low-cost sensors. Using co-located low-cost sensors as a reference is not particularly novel, but it has also not been heavily explored in the literature. The notable takeaway is that field sensors may be able to score themselves based on the standard deviation of a signal, assuming a normal response. The authors demonstrate this by deploying sensing devices with different co-location schemes and self-referential hierarchies at sites in Colorado and California. The authors also note a few concerns to consider during deployment, such as sensor age, recommending device replacement to alleviate problems. The conclusions suggest that in cases such as high-CH_4 alarm setups, it is possible to reduce the co-location of low-cost MO_x sensors with high-cost reference instruments.


Comments:
- p. 2 line 49 - Anything saying "recently been employed" requires a citation to back up the claim.
- p. 3 line 109 - "Since the sites in Colorado all pull from the same oilfield, we would expect to see somewhat similar chemical signatures from each of those two sites." - This makes a big assumption about activity at the site. Is all activity at this reservoir primary oil recovery? Secondary? Is all of this homogeneous across the oilfield? As the phases progress, there would be new injection techniques to liberate oil. Moving to tertiary and beyond, one would expect the gas makeup to change (e.g., EOR would impart CO_2 to the well). The aliphatic fingerprint may remain the same, but the partial pressures of each species would change.
- p. 3 line 138 - While citing the instrument paper here is acceptable, it would be kind to the reader to include some more information which is relevant about the instruments to this specific paper.  Notably, the reader needs to know:
  + sampling frequency
  + sampling duration -- presumably 24hr/day for the duration of the sampling period
  + specific sensors used (a TGS 2611 is a world of difference from an MQ-4)
- p. 3 line 140 - No pressure sensor?  The cited paper says it includes the BMP180.  Did those die?  Pressure is part of the bare minimum for future weather comparisons.
- p. 4 line 145 - What is the human/animal presence at these sites?  Emission of CO_2 by passing creatures, such as a tech going in and out of a trailer, can change the partial pressure of CH_4.
- p. 5 line 190 - I am beginning to be concerned that you keep mentioning linear models in the context of metal oxide sensors. The MO_x adsorption process approaches a Langmuir-type curve because the physisorption step dominates the kinetics. Any linear output the end user sees from these sensors is governed by electronic linearization of the raw data, thus introducing some error. If you are working with raw data from a MO_x sensor, the raw data will never be linear and you should not be using anything close to linear regression (a minimal non-linear fitting sketch follows this list). See Honeycutt et al. 2019 Sensors (https://doi.org/10.3390/s19143157).
- p. 5 line 198 - I am having trouble understanding this entire paragraph.  You talk about a reference instrument which implies that you have a separate "truth" standard.  Later in the para, you suggest that each sensor calibrates against its baseline mean value.  Am I reading this incorrectly?  If it is difficult to organize this para to make it easy to grasp, maybe a figure would help. 
  - Post reading edit: This makes a bit more sense now that I have read the rest of the paper, but that's a problem.  The reader should not have to work to grasp something when jumping around.  
- p. 6 line 211 - Ah, so not everything is linear fitting (ref my comment for p. 5 line 190).  Log won't work either for a mechanism dependent on surface coverage.
- p. 6 line 237 - Oh, here is your ground truth! You should mention this when the technique is introduced (ref. p. 5 line 198). I'm still not sure if you are comparing against another U/Y box or if you have a fancy reference instrument. Maybe you'll mention that soon....
- p. 6 line 239 - You still haven't said outright that this is a U/Y box, but I believe it is.  This confusion on the part of the reader may be alleviated by re-organization of this section.
- p. 7 line 295 - Add a new subsection or subsubsection for this part onward. Alternatively, since it appears to apply to all calibration models, you could move it to the "Universal..." section.
- p. 8 line 304 - This equation is really ugly. Why is it in the text? Why use full words instead of common variables? Is that lower-case "p" supposed to be pressure or a coefficient? Use \LaTeX or MathJax. E.g., for LaTeX, assuming the VOC terms are concentrations and the p terms are coefficients, you would use the following (a least-squares fitting sketch of the same form appears after this list):
\left[CO_{2 \left(ppm\right)}\right] = p_{1} + \left(p_{2} \times T\right) + \left(p_{3} \times R_{h}\right) + \left(p_{4} \times \left[VOC_{1}\right] \right) + \left(p_{5} \times \left[VOC_{2}\right] \right) + \left( p_{6} \frac{\left[VOC_{1}\right]}{\left[VOC_{2}\right]}\right) + \left( p_{7} \times t \right)
- p. 8 line 305 - What are the units of VOC_1 and VOC_2? Concentration? Volts?  What about humidity, relative or absolute %?  Units of T?
- p. 8 line 309 - More ugly math.
- p. 9 line 339 (Table 3 row 3 col 2) - Are those meant to be p-bar and r-bar (\bar{p} and \bar{r} in LaTeX)? Presumably it is attempting to show the mean. This is not rendering correctly in the inserted equation figure.
- p. 11 Fig. 4 - Are these color choices color-blind friendly? I honestly can't tell, but I've never seen a pink/green combo before. If you have considered this already, please ignore this comment. I'm guessing you are using R. Have you seen (https://stackoverflow.com/questions/57153428/r-plot-color-combinations-that-are-colorblind-accessible)? A palette sketch also follows this list.
- p. 11 Fig. 4 - I think there is too much information included in this figure and others like it.  As the reader, I'm having to work to get anything out of it.  Notably, the cross, X, and unfilled circle are really hard to make out with such light colors.
- p. 11 Fig 4 - As MO_x sensors have non-linear response curves, the nature and degree of error changes depending on the measured concentration. A sensor measuring [CH_4] = 0.1 ppm will have very different uncertainty than the same sensor measuring [CH_4] = 100 ppm. None of these figures report the mean value underlying each z-scored sensor measurement, greatly reducing the validity of the results in this figure. How do I know that a poorly performing sensor/site/calibration model is not just due to a sensor trying to measure a less-certain mean value?
- p. 12 line 422 - O_3 MO_x sensors generally perform a bit better than aliphatic CH_4 MO_x sensors due to the reaction mechanism of adsorbed species on the surface. While O_3 sensors are still largely dependent on Langmuir adsorption, the mechanism has fewer steps to redox at the surface. Thus, the observation that the clustering was poorer for O_3 is interesting. This is not an actionable comment, just...hmm.
- p.12 line 431; p. 14 line 482 - Nitpick.  Change to "Duration of Co-Location".  "Length" may imply distance to co-located instruments, which is non-trivial for these types of experiments.
- p.12 line 457 - "small baseline shifts" - how do you mean?  Are your sensors auto-calibrating somehow? That would be an important factor.  Are you implying local conditions for T, Rh, P changed?  That would mean something entirely different.
- p. 13 line 464 - What is the "Thermo"? This is the first time you have mentioned it.
- p. 13 line 480 - What is the "Picarro"? This is the first time you have mentioned it.
- p.14 line 526 - I had assumed you were using altitude corrected pressure values.  Are you using absolute pressure?  Consider careful wording here.
- p. 14 line 528 - "without the possibility of comparison with a reference instrument" - It feels like you are arguing against your thesis here.  Isn't one of the purposes of this paper to demonstrate that you can do ok without the big, fancy toys?  Do you mean that your conclusions work well locally but less well regionally?  That would be important to mention.
- p. 15 Fig 7 - The time trend of your box plots supports your text stating that a longer measurement period increases the quality of fit. Can you go beyond this? Can you predict the number of weeks needed to reach 95% confidence in your fit (a rough extrapolation sketch follows this list)? That would greatly increase the impact of this paper - MO_x sensors should be deployed at least x weeks to be taken seriously. That would also allow you to get another paper experimenting on this statistical prediction.
- p. 15 line 547 - "the minutely data" - What does this mean?  To me, "minutely" means "very small".  Is your sampling interval 60s?  See (https://english.stackexchange.com/questions/3091/weekly-daily-hourly-minutely).
- p. 16 near line 561 and line 568 - Is that a tiny, hovering ppm in the text here?  Check your figures to see if you need to crop out some extraneous axes.
- p. 16-17 Fig. 9 - Captions should not break from the figure. Ensure this doesn't happen in proofs. There may be a single dot on the extreme right of line 572 causing problems too.
- p. 17 line 575 - Big logic jump here.  You have not given enough information to support this claim in the way it is worded.  While the T and Elev may be A cause, all you can say for certain here is that there was a difference.  Boundary layer gas mixing is way more complex than this simplification.  
- p. 17 line 609 - All MO_x sensors require a burn-in period prior to operation.  Describe the burn-in protocol you did on sensors when starting work and when replacing them.
- p. 17 line 624 - PCR?  PCB?  Anyway, how do you mean?  Are you implying that the copper traces on PCB 1 somehow produce different results than the traces on PCB 2?  Are you using direct analog input from the sensors?  If so, then this may hold a grain of truth.  Are you using A/D converters?  If so, then this does not make sense unless you have really bad A/D converters.
- p. 18 line 652 - This paragraph underscores my comment at p. 17 line 609 regarding burn-in.  Without knowing your protocol, I can't tell if I should dismiss the information in this section.  Ramp-up suggests an incomplete burn-in.
- p. 18 line 665 - All MO_x sensors are T dependent. They include heating coils in the package to help mitigate this issue. If you see diurnal cycling, you should consider a temperature correction test (a quick sketch of one follows this list). CH_4 should RISE a bit at night as the partial pressure of water vapor drops with temperature and the partial pressure of CO_2 drops with decreased vegetation respiration. See (DOI: 10.1039/D0RA08593F).
- p. 20 line 709 - Please add a plus/minus in front of the numbers here and a degree sign for every Celsius.
- p. 20 line 719 - How close?  In which direction?  What is the prevailing wind at that site?
- p. 20 line 728 - How many km apart? A precise measurement here is the difference between someone ignoring your results and someone using an atmospheric model with 4 km^2 grid squares (a common size) to confirm them.
- p. 20 line 735 - These \sigma values are below the true measurement capabilities of TGS sensors near 0 ppm.  I don't think that really matters here as you are reporting what you statistically observed, so don't feel obligated to change this.  Instead, this comment is an FYI.
- p. 20 line 736 - Needs degree signs
- p. 20 line 743 - Nitpick: You should re-iterate that you are talking about co-location of MO_x sensors somewhere in this para.  Optical sensors are another problem for another paper.
- p. 21 line 764 - I don't like that you ignore the units for gas sensor signals.  Depending on how your electronics are set up, this is likely in terms of volts or conductivity.  This is less of a nitpick than my other comments on units, as this implies the authors may not 100% understand how their sensors work -- furthering my concerns about linear statistics on non-linear sensor responses.
- p. 21 line 771 - But zero centered effects have been studied in Honeycutt et al. 2019.
- p. 22 line 807 - Don't make the reader look at old papers to get the funding info.  Put it here.
- Citations - The editor will hit you with some edits here if you get accepted to publish.  At first glance, I see some obvious ones:
  - ref 21 - Needs date accessed
  - ref 28 - Title caps?
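
Sketch for p. 5 line 190: a minimal example of fitting a Langmuir-type response with SciPy instead of a linear model. All data values below are hypothetical placeholders, not the authors' measurements.

```python
# Minimal sketch (hypothetical data): fit a Langmuir-type response curve
# instead of a straight line to raw MOx output.
import numpy as np
from scipy.optimize import curve_fit

def langmuir(c, v_max, k):
    # Signal scales with fractional surface coverage K*c / (1 + K*c)
    return v_max * k * c / (1.0 + k * c)

conc = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0])        # ppm (placeholder)
signal = np.array([0.10, 0.20, 0.35, 0.65, 0.90, 1.15, 1.35])  # V (placeholder)

(v_max, k), _ = curve_fit(langmuir, conc, signal, p0=[1.5, 0.1])
print(f"V_max = {v_max:.3f} V, K = {k:.3f} per ppm")
```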
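Sketch for p. 8 line 304: an ordinary least-squares fit of the suggested equation form. The synthetic inputs, units, and coefficient values are assumptions for illustration only.

```python
# Sketch (synthetic inputs, assumed units): ordinary least squares for
# [CO2] = p1 + p2*T + p3*Rh + p4*VOC1 + p5*VOC2 + p6*(VOC1/VOC2) + p7*t
import numpy as np

rng = np.random.default_rng(0)
n = 200
T = rng.uniform(10, 35, n)               # degC (assumed)
Rh = rng.uniform(20, 80, n)              # % relative humidity (assumed)
voc1 = rng.uniform(0.2, 1.5, n)          # raw VOC sensor signal, V (assumed)
voc2 = rng.uniform(0.2, 1.5, n)
t = np.arange(n, dtype=float)            # elapsed time index
co2 = 400 + 2.0 * T + 0.5 * Rh + rng.normal(0, 5, n)  # synthetic reference, ppm

# Design matrix columns map to coefficients p1..p7
X = np.column_stack([np.ones(n), T, Rh, voc1, voc2, voc1 / voc2, t])
p, *_ = np.linalg.lstsq(X, co2, rcond=None)
print("p1..p7 =", np.round(p, 3))
```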
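Sketch for p. 11 Fig. 4: one way to get a colour-blind-safe palette, assuming a Python/matplotlib workflow rather than R; series labels and values are placeholders.

```python
# Sketch: colour-blind-accessible plotting via a built-in matplotlib style;
# the plotted data are placeholders, the point is the colour cycle.
import numpy as np
import matplotlib.pyplot as plt

plt.style.use('tableau-colorblind10')   # colour-blind-safe default colours
x = np.linspace(0, 10, 100)
for i, label in enumerate(['calibration', 'validation', 'universal']):
    plt.plot(x, np.sin(x + i), linewidth=2, label=label)
plt.legend()
plt.savefig('palette_demo.png', dpi=150)
```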
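Sketch for p. 15 Fig 7: a rough way to extrapolate the co-location duration needed to approach the plateau fit quality. The R^2 medians below are hypothetical, and the saturating-curve form is only one plausible choice.

```python
# Exploratory sketch (hypothetical R^2 medians): fit a saturating curve to
# fit quality vs. co-location duration and estimate weeks to near-plateau.
import numpy as np
from scipy.optimize import curve_fit

weeks = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
r2 = np.array([0.42, 0.58, 0.66, 0.71, 0.76, 0.78])   # placeholder medians

def saturating(w, r_inf, tau):
    return r_inf * (1.0 - np.exp(-w / tau))

(r_inf, tau), _ = curve_fit(saturating, weeks, r2, p0=[0.8, 2.0])
w95 = -tau * np.log(1.0 - 0.95)    # weeks to reach 95% of the plateau
print(f"plateau R^2 ~= {r_inf:.2f}; ~{w95:.1f} weeks to reach 95% of it")
```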
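Sketch for p. 18 line 665: a quick temperature-correction test on a synthetic diurnal series; the slope and noise magnitudes are placeholders.

```python
# Sketch (synthetic series): regress residuals on temperature and check
# whether removing the fitted T term flattens the diurnal cycle.
import numpy as np

rng = np.random.default_rng(1)
hours = np.arange(0.0, 72.0)                         # three days, hourly
temp = 20 + 8 * np.sin(2 * np.pi * hours / 24)       # diurnal T swing, degC
residual = 0.05 * (temp - temp.mean()) + rng.normal(0, 0.02, hours.size)

slope, intercept = np.polyfit(temp, residual, 1)     # first-order T dependence
corrected = residual - (slope * temp + intercept)
print(f"slope = {slope:.4f} ppm/degC; residual std before/after: "
      f"{residual.std():.4f} / {corrected.std():.4f}")
```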


General comments:
- On the whole, I quite like this paper and would recommend its (eventual) publication. The authors have a good logic for their field testing, the statistical section is quite strong, and the conclusions are great. That said, it needs a fair amount of work before publication. This reviewer has given a large number of comments, many of which are actionable. Most are quite minor, but a few are not. These problems need to be addressed before publication. The authors demonstrate a limited understanding of the underlying mechanisms of their sensors, potentially leading to some less-than-perfect results. A few things stand out as particularly troublesome, including the focus on linear analysis of decidedly non-linear sensor responses. Depending on the authors' response to this major concern, it may undermine their entire paper -- but I really hope that is not the case. Other fundamental changes required include reorganization of the presented material. While some sections flow well (mostly the statistics sections), others jump and skip concepts, introducing new concepts and terminology well after they should be introduced for ease of understanding.
- There are a few cases of grammatical errors.
- This entire article needs a good look at reorganization. It does not follow the "define everything", "describe each experiment", "present results" structure. Rather, it jumps around at times. I suggest the authors consider how the information flows together and check whether parts are overly entangled.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The work presented in this article is a study estimating methane and ozone concentrations using low-cost gas sensors near oil and gas facilities. The use of these sensors is an attempt to increase the number of pollutant detection points without deploying expensive reference instruments. The manuscript uses metal oxide sensor data as inputs for comparing different sensor signal normalization techniques. Moreover, different calibration models are explored and several factors affecting sensor performance are considered. The 1-Hop quantification approach, compared with the 1-Cal approach, is interesting as an attempt to reduce the time and energy required for co-location.

Despite these interesting aspects of the work, several elements must be improved:

  • The introduction does not provide a sufficient background to the problem, and relevant references are left out. In particular, it is well known that, among low-cost gas sensors, the least expensive but also the worst performing are chemoresistive sensors such as MOx devices. Better performance can be achieved with electrochemical sensors, as is widely documented in the scientific literature. In the particular case of ozone sensors, the study presented in the article "Design and Development of a Flexible, Plug-and-Play, Cost-Effective Tool for on-Field Evaluation of Gas Sensors" investigates sensor performance by comparing electrochemical and chemoresistive sensors, also examining different linear regression calibration models. Without referencing this kind of work, the overall picture of the problem cannot be exhaustively depicted.
  • Sensor performance depends on selectivity, sensitivity, and stability; calibration techniques are also important. However, performance in terms of R2 and residual magnitudes also depends on the actual gas concentration, that is, on the reference concentration readings. The best results in calibrating low-cost sensors are achieved when the average concentration of the gas is well above the sensor's limit of detection and when its standard deviation is wide enough. For this reason, when judging performance it is important to know the concentrations detected by the reference instruments, or at least their average, minimum, maximum, and standard deviation. In the case of ozone these data are not present (while the methane time series are shown, which may be sufficient). It is necessary to show them for a better understanding of sensor performance, or at least to show whisker plots reporting the minimum, maximum, average, etc. of the reference instrument readings.
  • Choosing centered RMSE and MBE as performance indicators does not make it possible to relate the results of this work to the data reported in similar studies, where RMSE, mean absolute error, or mean relative error are used; therefore, the reader cannot adequately assess the performance of the proposed approach (a short sketch of the relationships among these metrics follows this list).
  • For clarity, Figure 3 should report the units of the plot on the left (I presume they are millivolts).
  • The system used for summarizing the results in Figures 4, 8, and 9 is hard to interpret. It is not clear which whisker plot relates to the calibration or the validation dataset, for example, because the colour shades are too similar and the plots are not very readable. A different way of showing the data must be chosen; perhaps a simple table for CRMSE and MBE would be more effective. Concerning the R2 data, the whisker plots are too small and thin and must be enlarged for better visualization.
  • Line 666: "Although it appears models do not require a CANF for best performance, validation data may be more reliable with the inclusion of this term." The authors must show and clarify to what extent "validation data may be more reliable with the inclusion of this term". Otherwise, if the CANF does not improve performance and the validation data are not more reliable with its inclusion, why consider it at all? To what end?
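
Sketch of the metric relationships mentioned above, with placeholder values: RMSE^2 = CRMSE^2 + MBE^2, so plain RMSE can be recovered for cross-study comparison if MBE is reported alongside the centered RMSE.

```python
# Sketch (placeholder values): how the reported indicators relate, so plain
# RMSE/MAE can be recovered for comparison (RMSE^2 = CRMSE^2 + MBE^2).
import numpy as np

obs = np.array([1.9, 2.1, 2.4, 2.0, 2.6])    # reference readings (placeholder)
pred = np.array([2.0, 2.3, 2.2, 2.1, 2.9])   # sensor estimates (placeholder)

mbe = np.mean(pred - obs)                                  # mean bias error
rmse = np.sqrt(np.mean((pred - obs) ** 2))                 # root mean square error
crmse = np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2))
mae = np.mean(np.abs(pred - obs))                          # mean absolute error
print(f"MBE={mbe:.3f}  RMSE={rmse:.3f}  CRMSE={crmse:.3f}  MAE={mae:.3f}")
# Check: rmse**2 ~= crmse**2 + mbe**2 up to floating-point error
```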

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Thank you for addressing my numerous concerns.  I have recommended that this article be published in its current form.
