Challenges and Opportunities in the Statistical Analysis of Multiplex Immunofluorescence Data
Simple Summary
Abstract
1. Introduction
2. Data Preprocessing and Quality Control of mIF Data
2.1. mIF Assay and Data Generation
2.2. Quality Control of Generated Data
2.2.1. Conflicting Information between Markers (CD8 and FOXP3)
2.2.2. Batch Effects
3. Analysis of Summary Data
3.1. Analysis of the Number, Percentage or Density of Cells Positive for Immune Marker
3.2. Analysis Using Zero-Inflated and Over-Dispersed Distributions
3.3. Repeated Measurements
4. Clustering and Co-occurrence in Spatial Analysis of mIF
4.1. Pixel or Region-Based Methods
4.2. Distance and Nearest Neighbor-Based Methods
4.3. Spatial Point Process-Based Methods
4.3.1. Analyzing Number of Neighbors
4.3.2. Analyzing Distance to Neighbor
4.3.3. Considerations
5. Discussion and Conclusions
| Type of Analysis | Name | Empirical Formula | Theoretical Value under CSR | Comments |
|---|---|---|---|---|
| Pixel/Area Based | Morisita-Horn Index [110,111] | $MH(p_1,p_2)=\frac{2p_1p_2}{p_1^2+p_2^2}=\frac{2\sum_{k=1}^{P}p_1^k\times p_2^k}{\sum_{k=1}^{P}(p_1^k)^2+\sum_{k=1}^{P}(p_2^k)^2}$ | | |
| Pixel/Area Based | Duncan Segregation Index [113] | $D=2^{-1}\sum_{k=1}^{P}\left\lvert p_1^k/p_1-p_2^k/p_2\right\rvert$ | | |
| Nearest Neighbor | Euclidean Distance | $d(c_i,c_j)=\sqrt{(x_i-x_j)^2+(y_i-y_j)^2}$ | $(\lambda\pi r^2)^{-1}$ | |
| Nearest Neighbor | Nearest Neighbor | $\min_{j}d(c_i,c_j)$ | $\left((n-1)\lambda\pi r^2\right)^{-1}$ | |
| Spatial Point Processes | Ripley's K [115] | $\widehat{K}(r)=\left(n(n-1)\right)^{-1}\sum_{i=1}^{n}\sum_{i\ne j}1\left(d(c_i,c_j)\le r\right)e_{ij}$ | $\pi r^2$ | |
| Spatial Point Processes | Besag's L [117] | $\widehat{L}=\sqrt{\frac{\widehat{K}(r)}{\pi}}$ | $r$ | |
| Spatial Point Processes | Marcon's M [118] | $\widehat{M}=\frac{\widehat{K}(r)}{\pi r^2}$ | $1$ | |
| Spatial Point Processes | Pairwise Correlation Function [119,120] | $\widehat{g}(r)=(2\pi)^{-1}\sum_{i=1}^{n}\sum_{i\ne j}\frac{\kappa\left(r-d(c_i,c_j)\right)}{d(c_i,c_j)}e_{ij}$ | $\frac{K'(r)}{2\pi r}$ | |
| Spatial Point Processes | Hypothesized Interaction Distribution [121] | $\widehat{h}(i,j)=n^{-1}\sum_{i=1}^{n}\sum_{i\ne j}1\left(d(c_i,c_j)\le r\right)$ | $(n-1)\pi r^2$ | |
| Spatial Point Processes | Empty Space Function [122] | $\widehat{F}(r)=m^{-1}\sum_{i=1}^{m}1\left(r\le\min_{j}d(l_i,c_j)\le r+\Delta r\right)e_{ij}$ | $1-\exp(-\lambda\pi r^2)$ | |
| Spatial Point Processes | Nearest Neighbor Function [116] | $\widehat{G}(r)=n^{-1}\sum_{i=1}^{m}1\left(r\le\min_{j}d(c_i,c_j)\le r+\Delta r\right)e_{ij}$ | $1-\exp(-\lambda\pi r^2)$ | |
| Spatial Point Processes | Hazard Empty Space Function [123] or Hazard Nearest Neighbor Function | $h_{\alpha}=\frac{d}{dr}\left(-\log\left(1-\widehat{\alpha}(r)\right)\right)$ | $2\pi r\lambda$ | |
| Spatial Point Processes | J-function [124] | $\widehat{J}(r)=\frac{1-\widehat{G}(r)}{1-\widehat{F}(r)}$ | $1$ | |
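As an illustration of the two pixel/area-based indices in the table, the following is a minimal Python sketch, not the authors' implementation. It assumes that each image has been divided into P regions and that the number of cells of each of two types has been counted per region. The function names `morisita_horn` and `duncan_segregation` are hypothetical, and converting counts to within-type proportions before applying the Morisita-Horn formula is an assumption about how $p_1^k$ and $p_2^k$ are defined.

```python
import numpy as np


def morisita_horn(counts_1, counts_2):
    """Morisita-Horn overlap of two cell types across P regions.

    Assumption: region counts are first converted to proportions
    (each vector sums to 1) before the tabulated formula is applied.
    """
    p1 = np.asarray(counts_1, dtype=float)
    p2 = np.asarray(counts_2, dtype=float)
    p1 = p1 / p1.sum()
    p2 = p2 / p2.sum()
    # 2 * sum_k p1^k p2^k / (sum_k (p1^k)^2 + sum_k (p2^k)^2)
    return 2.0 * np.sum(p1 * p2) / (np.sum(p1 ** 2) + np.sum(p2 ** 2))


def duncan_segregation(counts_1, counts_2):
    """Duncan segregation index: half the summed absolute difference
    between the two types' regional shares."""
    c1 = np.asarray(counts_1, dtype=float)
    c2 = np.asarray(counts_2, dtype=float)
    return 0.5 * np.sum(np.abs(c1 / c1.sum() - c2 / c2.sum()))


if __name__ == "__main__":
    # Hypothetical counts of two cell types in four regions of one image.
    type_a = [10, 0, 5, 5]
    type_b = [2, 8, 5, 5]
    print("Morisita-Horn:", morisita_horn(type_a, type_b))
    print("Duncan:", duncan_segregation(type_a, type_b))
```

Under these definitions, identical regional distributions give a Morisita-Horn value of 1 and a Duncan index of 0, while two cell types that never share a region give 0 and 1, respectively.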


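The relationship between Ripley's K, Besag's L, and Marcon's M in the table can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than a reference implementation: the edge-correction weights $e_{ij}$ are all set to 1, and the double sum is scaled by the area of the observation window (the `area` argument, an assumption that is not written explicitly in the tabulated formula) so that the estimate is comparable to the CSR value $\pi r^2$. The function names are hypothetical.

```python
import numpy as np


def ripley_k(xy, r, area):
    """Naive Ripley's K estimate at radius r for points xy (n x 2 array).

    Assumptions: no edge correction (e_ij = 1) and scaling by the
    window area so that K is close to pi * r**2 under CSR.
    """
    xy = np.asarray(xy, dtype=float)
    n = len(xy)
    # Pairwise Euclidean distances d(c_i, c_j) via broadcasting.
    diff = xy[:, None, :] - xy[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)  # exclude pairs with i == j
    pairs_within_r = np.count_nonzero(d <= r)  # ordered pairs (i, j)
    return area * pairs_within_r / (n * (n - 1))


def besag_l(k_value):
    """Besag's L: variance-stabilized K, equal to r under CSR."""
    return np.sqrt(k_value / np.pi)


def marcon_m(k_value, r):
    """Marcon's M as tabulated: K relative to its CSR value, 1 under CSR."""
    return k_value / (np.pi * r ** 2)


if __name__ == "__main__":
    # Simulated CSR pattern on the unit square (hypothetical data).
    rng = np.random.default_rng(0)
    points = rng.uniform(0.0, 1.0, size=(500, 2))
    r = 0.05
    k_hat = ripley_k(points, r, area=1.0)
    print("K:", k_hat, "CSR value:", np.pi * r ** 2)
    print("L:", besag_l(k_hat), "CSR value:", r)
    print("M:", marcon_m(k_hat, r), "CSR value:", 1.0)
```

Because no edge correction is applied, estimates from this sketch are expected to fall somewhat below the CSR values for points near the window boundary. In practice an edge-corrected estimator from an established spatial statistics package would be preferable.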