# Haphazard Intentional Sampling Techniques in Network Design of Monitoring Stations


## Abstract


## 1. Introduction

1. Select units for the comparison of treatments, and collect covariate data on all units.
2. Define an explicit criterion for covariate balance.
3. Randomize the units to treatment groups.
4. Check covariate balance and return to Step 3 if the allocation is unacceptable according to the criterion specified in Step 2; continue until the balance is acceptable.
5. Conduct the experiment using the final randomization obtained in Step 4.
6. Perform inference (using a randomization test that follows exactly Steps 2–4).
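Steps 2 to 4 of the procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the balance criterion used here (the largest absolute standardized difference in covariate means between the two groups) is a simple stand-in chosen to keep the example dependency-free, and the names `balance_score`, `rerandomize`, `threshold` and `max_tries` are hypothetical.

```python
import random
import statistics


def balance_score(w, X):
    """Stand-in balance criterion (an assumption, not the paper's):
    largest absolute standardized difference in covariate means."""
    worst = 0.0
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        g1 = [x for x, wi in zip(col, w) if wi == 1]
        g0 = [x for x, wi in zip(col, w) if wi == 0]
        diff = abs(statistics.mean(g1) - statistics.mean(g0))
        worst = max(worst, diff / statistics.pstdev(col))
    return worst


def rerandomize(X, n1, threshold, rng, max_tries=10_000):
    """Repeat Step 3 (randomize) until Step 4 (balance check) accepts."""
    n = len(X)
    for _ in range(max_tries):
        w = [1] * n1 + [0] * (n - n1)
        rng.shuffle(w)                          # Step 3: random allocation
        if balance_score(w, X) <= threshold:    # Step 4: accept or retry
            return w
    raise RuntimeError("no acceptable allocation found within max_tries")


# Toy data: 20 units with 2 synthetic covariates (not from the case study).
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
w = rerandomize(X, n1=10, threshold=0.25, rng=rng)
print(w, round(balance_score(w, X), 3))
```

A tighter `threshold` improves balance but raises the expected number of redraws, which is the trade-off the acceptance criterion in Step 2 has to settle.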

## 2. Haphazard Intentional Sampling

## 3. Case Study

## 4. Results

#### 4.1. Balance and Decoupling

- The balance criterion, measured by the Mahalanobis distance between the covariates of interest, $M(\mathbf{w}^{*},\mathbf{X})$: we computed the median and 95th percentile of $M(\mathbf{w}^{*},\mathbf{X})$ over the 500 allocations yielded by each method.
- The decoupling criterion, which concerns the absence of a systematic bias in allocating each pair of sampling units to the same group (positive association) or to different groups (negative association). For this purpose, we use Yule's coefficient of colligation [19]: for each pair of units $(i,j)\in\{1,2,\dots,n\}^{2}$, $i<j$, and for each pair of groups $(r,s)\in\{0,1\}^{2}$, let $z_{rs}(i,j)$ denote the number of allocations, among the 500, in which units $i$ and $j$ are assigned to groups $r$ and $s$, respectively. The Yule coefficient for the pair $(i,j)$ is computed as

  $$Y(i,j)=\frac{\sqrt{z_{00}(i,j)\,z_{11}(i,j)}-\sqrt{z_{01}(i,j)\,z_{10}(i,j)}}{\sqrt{z_{00}(i,j)\,z_{11}(i,j)}+\sqrt{z_{01}(i,j)\,z_{10}(i,j)}}.$$

  This coefficient lies in the interval $[-1,1]$ and measures how often the units $i$ and $j$ are allocated to the same or to different groups. It equals zero when the numbers of agreements (allocations to the same group) and disagreements (allocations to different groups) are equal, and it reaches its extreme values, $-1$ or $+1$, under total negative association (complete disagreement) or total positive association (complete agreement), respectively. The closer $Y(i,j)$ is to $+1$ or $-1$, the lower the decoupling provided by the allocation method with respect to $(i,j)$. So, for comparison purposes, we computed, for each method, the median and 95th percentile of $|Y(i,j)|$ among all pairs $(i,j)$.
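The decoupling criterion above can be computed directly from a list of allocations. The sketch below is a minimal illustration: the function names `yule_coefficient` and `decoupling_summary` are hypothetical, the allocations are generated by pure randomization on toy data rather than by any of the methods compared here, and pairs whose coefficient is undefined (zero denominator) are simply skipped, which is an assumption of this sketch.

```python
import math
import random
import statistics
from itertools import combinations


def yule_coefficient(allocations, i, j):
    """Yule's coefficient of colligation for units i and j,
    computed over a list of 0/1 allocation vectors."""
    z = {(r, s): 0 for r in (0, 1) for s in (0, 1)}
    for w in allocations:
        z[(w[i], w[j])] += 1                      # count z_rs(i, j)
    agree = math.sqrt(z[(0, 0)] * z[(1, 1)])
    disagree = math.sqrt(z[(0, 1)] * z[(1, 0)])
    if agree + disagree == 0:
        return None                               # coefficient undefined
    return (agree - disagree) / (agree + disagree)


def decoupling_summary(allocations):
    """Median and 95th percentile of |Y(i, j)| over all pairs i < j."""
    n = len(allocations[0])
    values = (yule_coefficient(allocations, i, j)
              for i, j in combinations(range(n), 2))
    abs_y = [abs(v) for v in values if v is not None]
    median = statistics.median(abs_y)
    p95 = statistics.quantiles(abs_y, n=20, method="inclusive")[-1]
    return median, p95


# Toy example: 500 pure random allocations of 8 units into two groups of 4.
rng = random.Random(1)
allocations = []
for _ in range(500):
    w = [1] * 4 + [0] * 4
    rng.shuffle(w)
    allocations.append(w)

median_abs_y, p95_abs_y = decoupling_summary(allocations)
print(round(median_abs_y, 3), round(p95_abs_y, 3))
```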

#### 4.2. Inference Power

1. Generate $B$ allocations $\mathbf{w}^{(1)},\mathbf{w}^{(2)},\dots,\mathbf{w}^{(B)}$ using method $\mathcal{M}(\mathbf{X})$, constrained to $\mathbf{1}\cdot(\mathbf{w}^{(b)})^{t}=n_{1}$ and $\mathbf{1}\cdot(\mathbf{1}-\mathbf{w}^{(b)})^{t}=n_{0}$.
2. For each generated allocation $\mathbf{w}^{(b)}$, compute the corresponding $\widehat{\tau}_{\mathbf{w}^{(b)},\mathbf{y}}$ according to Equation (9).
3. Estimate the p value by

   $$p\cong\frac{\sum_{b=1}^{B}I\left(\left|\widehat{\tau}_{\mathbf{w}^{(b)},\mathbf{y}}\right|\ge\left|\widehat{\tau}_{\mathbf{w},\mathbf{y}}\right|\right)}{B},$$

   where $I(\cdot)$ denotes the indicator function.
4. $H_{0}$ is rejected if $p\le\alpha$.
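The four steps of the randomization test can be sketched as follows. Two assumptions are made for illustration and are labeled in the code: the estimator $\widehat{\tau}$ is taken to be the difference in mean response between group 1 and group 0 (Equation (9) is not reproduced here), and the allocation method $\mathcal{M}(\mathbf{X})$ is replaced by pure randomization with fixed group sizes. The names `tau_hat`, `pure_random_allocation` and `randomization_p_value` are hypothetical.

```python
import random
import statistics


def tau_hat(w, y):
    """ASSUMED estimator: mean response of group 1 minus that of group 0."""
    g1 = [yi for yi, wi in zip(y, w) if wi == 1]
    g0 = [yi for yi, wi in zip(y, w) if wi == 0]
    return statistics.mean(g1) - statistics.mean(g0)


def pure_random_allocation(n, n1):
    """Stand-in for M(X): returns a function drawing a random allocation
    with exactly n1 units in group 1 and n - n1 units in group 0."""
    def draw(rng):
        w = [1] * n1 + [0] * (n - n1)
        rng.shuffle(w)
        return w
    return draw


def randomization_p_value(w, y, draw, B, rng):
    """Steps 1-3: share of B redrawn allocations whose |tau_hat| is at
    least as large as the one observed under the actual allocation w."""
    observed = abs(tau_hat(w, y))
    hits = sum(abs(tau_hat(draw(rng), y)) >= observed for _ in range(B))
    return hits / B


# Toy example: responses perfectly separated by group.
w_obs = [1] * 10 + [0] * 10
y_sep = [10.0] * 10 + [0.0] * 10
draw = pure_random_allocation(n=20, n1=10)
p = randomization_p_value(w_obs, y_sep, draw, B=500, rng=random.Random(2))
reject = p <= 0.05                      # Step 4 with alpha = 0.05
print(tau_hat(w_obs, y_sep), p, reject)
```

Because the redrawn allocations must come from the same method that produced the observed allocation, swapping `pure_random_allocation` for another generator changes the reference distribution and hence the p value.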

1. Generate an allocation $\mathbf{w}$ using the method $\mathcal{M}(\mathbf{X})$.
2. Simulate a response vector $\mathbf{y}$ in the following way. For $i\in\{1,\dots,n\}$:
   - Draw a random number $\mu_{i}^{0}\sim N(\theta,1)$, where $\theta=\sum_{j}\left(X_{i,j}-\overline{X}_{\cdot,j}\right)/\,\mathrm{sd}\left(X_{\cdot,j}\right)$ and $j$ indexes the columns of $\mathbf{X}$;
   - If $w_{i}=0$, then set $y_{i}=\mu_{i}^{0}$; otherwise, set $y_{i}=\mu_{i}^{0}+\tau$.
3. Apply the randomization test described above on $\mathbf{w},\mathbf{y}$ to test $H_{0}:\tau=0$, with a significance level $\alpha=0.05$ and $B=500$ allocations.
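A rough sketch of this simulation loop is given below. It makes several assumptions that are not taken from the text: pure randomization stands in for $\mathcal{M}(\mathbf{X})$, $\widehat{\tau}$ is assumed to be the difference in group means, $\mathrm{sd}$ is taken as the sample standard deviation, the covariates are synthetic, and the number of repetitions and the value of $B$ are reduced to keep the example fast. All function names are hypothetical.

```python
import random
import statistics


def draw_allocation(n, n1, rng):
    """Stand-in for M(X): pure randomization with n1 units in group 1."""
    w = [1] * n1 + [0] * (n - n1)
    rng.shuffle(w)
    return w


def tau_hat(w, y):
    """ASSUMED estimator: difference in mean response between groups."""
    g1 = [yi for yi, wi in zip(y, w) if wi == 1]
    g0 = [yi for yi, wi in zip(y, w) if wi == 0]
    return statistics.mean(g1) - statistics.mean(g0)


def simulate_response(w, X, tau, rng):
    """Step 2: y_i = mu_i + tau * w_i with mu_i ~ N(theta_i, 1), where
    theta_i is the sum of unit i's standardized covariates."""
    cols = list(zip(*X))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]   # sample sd (assumption)
    y = []
    for wi, row in zip(w, X):
        theta = sum((x - m) / s for x, m, s in zip(row, means, sds))
        y.append(rng.gauss(theta, 1.0) + tau * wi)
    return y


def rejects_null(w, y, n1, B, alpha, rng):
    """Step 3: randomization test of H0: tau = 0."""
    observed = abs(tau_hat(w, y))
    hits = sum(abs(tau_hat(draw_allocation(len(w), n1, rng), y)) >= observed
               for _ in range(B))
    return hits / B <= alpha


def estimate_power(X, n1, tau, reps, B, alpha, rng):
    """Fraction of simulated experiments in which H0 is rejected."""
    rejections = 0
    for _ in range(reps):
        w = draw_allocation(len(X), n1, rng)          # Step 1
        y = simulate_response(w, X, tau, rng)         # Step 2
        rejections += rejects_null(w, y, n1, B, alpha, rng)
    return rejections / reps


rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
power_curve = {tau: estimate_power(X, 10, tau, reps=20, B=200,
                                   alpha=0.05, rng=rng)
               for tau in (0.0, 1.0, 2.0)}
print(power_curve)
```

Repeating the loop over a grid of $\tau$ values, once per allocation method, yields one power curve per method.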

## 5. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Fossaluza, V.; Lauretto, M.S.; Pereira, C.A.B.; Stern, J.M. Combining Optimization and Randomization Approaches for the Design of Clinical Trials. In Interdisciplinary Bayesian Statistics; Springer: New York, NY, USA, 2015; pp. 173–184.
2. Pearl, J. Causality: Models, Reasoning, and Inference; Cambridge University Press: Cambridge, UK, 2000.
3. Stern, J. Decoupling, Sparsity, Randomization and Objective Bayesian Inference. Cybern. Hum. Knowing **2008**, 15, 49–68.
4. Sprott, D.A.; Farewell, V.T. Randomization in experimental science. Stat. Pap. **1993**, 34, 89–94.
5. Rubin, D.B. Comment: The design and analysis of gold standard randomized experiments. J. Am. Stat. Assoc. **2008**, 103, 1350–1353.
6. Bruhn, M.; McKenzie, D. In Pursuit of Balance: Randomization in Practice in Development Field Experiments. Am. Econ. J. Appl. Econ. **2009**, 1, 200–232.
7. Saa, O.; Stern, J.M. Auditable Blockchain Randomization Tool. arXiv **2019**, arXiv:1904.09500.
8. Morgan, K.L.; Rubin, D.B. Rerandomization to improve covariate balance in experiments. Ann. Stat. **2012**, 40, 1263–1282.
9. Morgan, K.L.; Rubin, D.B. Rerandomization to Balance Tiers of Covariates. J. Am. Stat. Assoc. **2015**, 110, 1412–1421.
10. Lauretto, M.S.; Nakano, F.; Pereira, C.A.B.; Stern, J.M. Intentional Sampling by goal optimization with decoupling by stochastic perturbation. AIP Conf. Proc. **2012**, 1490, 1490.
11. Lauretto, M.S.; Stern, R.B.; Morgan, K.L.; Clark, M.H.; Stern, J.M. Haphazard intentional allocation and rerandomization to improve covariate balance in experiments. AIP Conf. Proc. **2017**, 1853, 050003.
12. Golub, G.H.; Van Loan, C.F. Matrix Computations; JHU Press: Baltimore, MD, USA, 2012.
13. Wolsey, L.A.; Nemhauser, G.L. Integer and Combinatorial Optimization; John Wiley & Sons: Hoboken, NJ, USA, 2014.
14. Ward, J.; Wendell, R. Technical Note-A New Norm for Measuring Distance Which Yields Linear Location Problems. Oper. Res. **1980**, 28, 836–844.
15. Murtagh, B.A. Advanced Linear Programming: Computation and Practice; McGraw-Hill International Book Co.: New York, NY, USA, 1981.
16. Amorim, W. Web Scraping do Sistema de Qualidade do Ar da Cetesb; R Foundation for Statistical Computing: Sao Paulo, Brazil, 2018.
17. Gurobi Optimization Inc. Gurobi: Gurobi Optimizer 6.5 Interface, R package version 6.5-0; Gurobi Optimization Inc.: Beaverton, OR, USA, 2015.
18. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018.
19. Yule, G.U. On the Methods of Measuring Association Between Two Attributes. J. R. Stat. Soc. **1912**, 75, 579–652.
20. Pavlikov, K.; Uryasev, S. CVaR norm and applications in optimization. Optim. Lett. **2014**, 8, 1999–2020.
21. Gotoh, J.Y.; Uryasev, S. Two pairs of polyhedral norms versus $l_{p}$-norms: proximity and applications in optimization. Math. Program. **2016**, 156, 391–431.
22. Ward, J.; Wendell, R. Using Block Norms for Location Modeling. Oper. Res. **1985**, 33, 1074–1090.

**Figure 1.** Mahalanobis distances and absolute Yule coefficients yielded by the haphazard allocation method with $\lambda^{*}\in\{0.05,0.1,0.2,0.3,0.4\}$.

**Figure 2.** Difference between groups 0 and 1 with respect to the average of standardized covariate values for each type of allocation (adapted from Morgan and Rubin [9]).

**Figure 3.** Power curves for each allocation method for testing the absence of treatment effect, $H_{0}:\tau=0$.

**Table 1.** Mahalanobis distances and absolute Yule coefficients yielded by the haphazard allocation, rerandomization and pure randomization methods (500 allocations for each method).

| Method | Mahalanobis Distance (Median) | Mahalanobis Distance (95th perc.) | Absolute Yule Coefficient (Median) | Absolute Yule Coefficient (95th perc.) |
|---|---|---|---|---|
| Haphazard ($\lambda^{*}=0.05$) | 0.15 | 0.17 | 0.26 | 0.71 |
| Haphazard ($\lambda^{*}=0.10$) | 0.16 | 0.18 | 0.16 | 0.51 |
| Haphazard ($\lambda^{*}=0.20$) | 0.18 | 0.20 | 0.12 | 0.45 |
| Haphazard ($\lambda^{*}=0.30$) | 0.18 | 0.21 | 0.12 | 0.44 |
| Haphazard ($\lambda^{*}=0.40$) | 0.20 | 0.22 | 0.11 | 0.43 |
| Rerandomization | 0.44 | 0.48 | 0.07 | 0.26 |
| Pure random | 1.15 | 1.40 | 0.03 | 0.07 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Lauretto, M.S.; Stern, R.; Ribeiro, C.; Stern, J.
Haphazard Intentional Sampling Techniques in Network Design of Monitoring Stations. *Proceedings* **2019**, *33*, 12.
https://doi.org/10.3390/proceedings2019033012
