1. Introduction
The three-dimensional organization of the genome is fundamental to gene regulation [1], as spatial proximity between regulatory elements and their target genes directly influences transcription [2,3]. Fluorescence in situ hybridization (FISH), including modern Oligopaint-based approaches, remains a gold-standard method for visualizing these spatial relationships with single-cell resolution within the native nuclear context [4,5,6,7].
While existing FISH studies and computational tools, sometimes complemented by Hi-C-based maps, have significantly advanced our understanding of large-scale chromosomal structures and long-range interactions spanning hundreds of kilobases to megabases [6,8,9,10,11,12,13], many critical regulatory interactions, such as enhancer–promoter loops and local chromatin contacts, occur at much finer genomic scales (<10–100 kb) [14]. At these scales, technical limitations, including chromatic aberration, can introduce measurement errors that are comparable to, or even exceed, the biological distances being investigated.
Despite the biological significance of short-range genomic measurements, many studies still rely on user-dependent distance measurements, which are prone to error [15,16,17,18]. Existing computational tools are typically not designed or rigorously validated for high-precision quantification at these scales. Moreover, the relationship between FISH probe design parameters (such as probe density and target sequence length) and measurement accuracy remains poorly characterized for short-distance applications.
Here we present FISH-Dist, an automated computational pipeline for quantitative 3D distance measurements in FISH imaging, with a focus on short-range genomic distances. FISH-Dist integrates deep learning-based spot detection [19], sub-pixel 3D Gaussian fitting, and complementary chromatic aberration correction strategies for high-resolution measurements. We validate the pipeline using orthogonal approaches, including colocalization experiments and calibrated nanorulers. Using FISH-Dist, we also systematically quantify how probe design parameters, such as labeling density and target sequence length, influence measurement accuracy, providing practical guidance for experimental planning. By addressing technical limitations in standard confocal FISH, FISH-Dist provides a reproducible framework for quantifying spatial relationships at the short genomic distances most relevant to gene regulation.
2. Materials and Methods
2.1. Oligopaint Probe Synthesis
Forty-five-nucleotide single-stranded DNA Oligopaint probes targeting specific DNA sequences were designed by Daicel Arbor Biosciences and synthesized from myTags Immortal Libraries (Daicel Arbor Biosciences, Ann Arbor, MI, USA) according to the manufacturer’s protocol [20]. Briefly, myTags Immortal Libraries were amplified by PCR, and the resulting products were column-purified. In vitro transcription was then performed to generate RNA, which was column-purified and used as a template for reverse transcription (RT). When RNA concentration was insufficient for RT, an additional ethanol precipitation was performed. RT was carried out using fluorescent primers (coupled at the 5′ end with a single ATTO 565 or ATTO 633 fluorescent dye) purchased from Integrated DNA Technologies (Coralville, IA, USA). Unincorporated primers were degraded, and RNA:DNA hybrids were column-purified. RNA fragments were subsequently removed by RNase treatment, and the resulting fluorescent single-stranded DNA probes were column-purified and stored at −20 °C. The fraction of successfully dye-conjugated probes was quantified using a Nanodrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) to ensure an appropriate labeling ratio. Detailed sequences for all oligos are available in Supplementary File S1.
2.2. Oligopaint Probe Design for Spatial Resolution Calibration in Multi-Color FISH
Oligopaint probes were designed to target a ∼10 kb endogenous genomic region (X:6,760,094–6,770,369, Drosophila genome Release 6) adjacent to the attP18 landing site on the X chromosome of Drosophila melanogaster. A total of 142 Oligopaint probes were designed to hybridize exclusively to the + DNA strand of this locus. For dual-color experiments, the same probe set was labeled with either ATTO 565 or ATTO 633 fluorophores.
2.3. Construction and Oligopaint Targeting of Synthetic Reporter Sequences
Two synthetic reporter sequences (R1 and R6) were designed to be orthogonal to the Drosophila melanogaster genome by Daicel Arbor Biosciences and synthesized by Genewiz. Each reporter consists of a 2 kb DNA sequence absent from the D. melanogaster genome, with similar overall GC content but distinct sequence composition. Each reporter was targeted by a set of 87 Oligopaint probes tiled across the entire sequence and both DNA strands. Transgenes were integrated at the ZH-2A landing site on the X chromosome using site-specific recombination. Detailed sequences for all oligos are available in Supplementary File S1.
2.4. Transgene Design for Defined 3D Genomic Distance Measurements
To evaluate the accuracy of 3D distance measurements under biological conditions, we generated transgenic constructs containing two distinct Oligopaint-tagged sets targeting R1 and R6, labeled with ATTO 565 and ATTO 633 and separated by defined spacer lengths of 0 kb, 3 kb, or 10 kb. In all constructs, the spacer is flanked on both sides by the synthetic reporter tags R1 and R6, which serve as independent Oligopaint target sites. The 3 kb spacer consists of randomly generated DNA sequences designed to minimize binding of known Drosophila transcription factors. The 10 kb spacer is a cherry-tagged yellow gene of Drosophila biarmipes. All constructs were integrated at the ZH-2A landing site. Detailed sequences for all constructs are available in Supplementary File S1.
2.5. Oligopaint Target Length Variants for Assessing Spatial Resolution
To examine how FISH target sequence length influences spatial resolution, a series of transgenic constructs was generated in which progressively shorter segments of the R6 synthetic reporter were used as Oligopaint targets. Three target lengths were tested—2 kb, 1 kb, and 500 bp—preserving sequence composition while varying target size. All constructs were integrated at the ZH-2A landing site, ensuring a constant genomic context across conditions.
2.6. Modulation of Oligopaint Probe Density
To assess the impact of Oligopaint probe density on measurement performance, the fraction of fluorescently labeled probes targeting a fixed 2 kb R6 genomic region was systematically varied while maintaining a constant total probe concentration. This was achieved by mixing fluorescently labeled probes with unlabeled (“cold”) probes targeting the same sequence, thereby preserving hybridization conditions while progressively reducing the number of fluorescent probes contributing to the detected signal.
Three labeling conditions were examined. In the fully labeled condition (∼100%), the 2 kb R6 locus could be covered by up to 87 labeled probes and no unlabeled probes; ATTO 565- and ATTO 633-labeled probe sets were each used at a final concentration of 3 ng/µL, resulting in a total labeled probe concentration of 6 ng/µL. In the intermediate labeling condition (∼66%), the locus could be covered by up to 57 labeled probes and 30 unlabeled probes; each labeled probe set was used at 2 ng/µL and supplemented with 2 ng/µL of unlabeled probes. In the lowest labeling condition (∼50%), the locus could be covered by up to 43 labeled probes and 43 unlabeled probes; here, each labeled probe set was used at 1.5 ng/µL and supplemented with 3 ng/µL of unlabeled probes. In all cases, the total probe concentration was maintained at 6 ng/µL. Hybridization, imaging, and signal detection were performed under standard conditions.
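The labeling fractions and total probe concentrations stated above can be checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the pipeline: the dictionary layout and function names are illustrative assumptions, and the only inputs are the probe counts and concentrations reported in this section, with two labeled probe sets (ATTO 565 and ATTO 633) per condition.

```python
# Probe counts and concentrations (ng/µL) as reported in Section 2.6.
# Tuple layout (illustrative): labeled probes, unlabeled probes,
# concentration of EACH labeled probe set, concentration of unlabeled probes.
conditions = {
    "full": (87, 0, 3.0, 0.0),
    "intermediate": (57, 30, 2.0, 2.0),
    "low": (43, 43, 1.5, 3.0),
}


def labeled_fraction(n_labeled, n_unlabeled):
    """Fraction of probe binding sites that can carry a fluorophore."""
    return n_labeled / (n_labeled + n_unlabeled)


def total_concentration(per_labeled_set, unlabeled, n_labeled_sets=2):
    """Total probe concentration: two labeled sets plus the cold probes."""
    return n_labeled_sets * per_labeled_set + unlabeled


summary = {
    name: (labeled_fraction(n_lab, n_unlab), total_concentration(c_lab, c_unlab))
    for name, (n_lab, n_unlab, c_lab, c_unlab) in conditions.items()
}

for name, (fraction, total) in summary.items():
    print(f"{name}: {fraction:.1%} labeled, {total:.1f} ng/µL total")
```

Under these inputs every condition sums to 6 ng/µL, and the intermediate condition corresponds to 57/87 labeled probes.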
2.7. Fluorescence In Situ Hybridization on Late Pupal Wings
At 70 h after puparium formation (APF), male flies (carrying a single X chromosome, and therefore a single set of each probe) were immobilized on adhesive tape, removed from their pupal cases, and gently dragged across the tape surface to remove residual cuticle shed during pupal development. Specimens were transferred to a small dish containing deionized water for approximately 1 min to allow wing unfolding. Wings were fixed in 4% formaldehyde in PBTT (1× PBS, 0.1% Tween-20, 0.3% Triton X-100) for 30 min at room temperature (RT), then washed twice in PBS. At 70 h APF, wings are enclosed by non-permeable cuticle and therefore require mechanical separation of the dorsal and ventral epithelial layers prior to probe hybridization. To achieve this, wings were transferred in a droplet of deionized water onto a non-adhesive substrate, typically the protective liner of double-sided tape. Excess water was removed, and wings were arranged flat without overlap. A strip of Tesa ECO & ULTRA tape (66 m × 50 mm), selected for minimal autofluorescence because it remains attached throughout subsequent staining, mounting, and imaging steps, was gently pressed onto the wings and lifted, ensuring wing adhesion to the tape. The taped wings were overlaid with a second tape strip and pressed gently to preserve structural integrity. The two tape layers were then carefully separated, thereby splitting the dorsal and ventral wing epithelial layers. Finally, the separated wing layers, still tape-mounted, were post-fixed for 20 min at RT in 4% formaldehyde in PBTT.
Wings were then briefly rinsed twice by rapid addition and removal of 500 µL freshly prepared PBS with 0.1% Tween-20 (PTw), followed by two 10 min washes in the same buffer. Wings were incubated for 1 h at 37 °C with RNase A diluted at 1:100 in PTw, then transferred to a 0.05 M HCl solution for 5 min to denature DNA and enhance probe penetration.
Wings were washed sequentially for 15 min each in 4:1, 1:1, 1:4, and 0:1 mixtures of PTw and FISH wash buffer (Fwb; 35% formamide, 0.1% Tween-20, 4X SSC). Fwb was prepared fresh before use and stored at 4 °C for a maximum of 48 h.
Wings were incubated with probes diluted to a final concentration of 3 ng/µL each in hybridization buffer (50% formamide, 5X SSC, 0.1% Tween-20, 100 µg/mL denatured herring sperm DNA, 50 µg/mL heparin). The samples were heated at 87 °C for 20 min in a thermocycler, then cooled gradually to 37 °C over 2 h. Hybridization was performed overnight for 12–16 h at 37 °C.
Following hybridization, wings were rinsed briefly, then washed twice for 1 h each at 37 °C in pre-warmed Fwb. Nuclei were stained with 20 mM Hoechst diluted 1:500 in PTw for 5 min, followed by four 10 min washes in PTw at RT. Wings were mounted in Vectashield, flattened under a coverslip using gentle pressure applied with magnets, and sealed with nail polish.
2.8. Nanorulers Calibration Standards
DNA origami nanorulers (GATTAquant GmbH, Munich, Germany) were employed as ground truth standards for validation of distance measurements [21]. These nanostructures consist of self-assembled DNA origami scaffolds with precisely positioned fluorescent dye attachment sites labeled with 40 ATTO 565 and 40 ATTO 633. Nanorulers with defined inter-dye separations of 120 nm and 160 nm were used to assess distance measurement accuracy, while a design with intermingled 40 ATTO 565 and 40 ATTO 633 dyes, effectively representing zero separation, served as a colocalization standard to evaluate measurement precision. Mounted slides were provided directly by the manufacturer. For each nanoruler type, images were acquired under conditions identical to those used for Drosophila wing samples.
2.9. Confocal Imaging
Hoechst, ATTO 565, and ATTO 633 channels were acquired on a Zeiss LSM 780 confocal microscope (Carl Zeiss Microscopy GmbH, Jena, Germany) using a Plan-Apochromat 40× oil immersion objective (NA 1.4). To minimize inter-channel displacement, line switching mode was employed instead of frame switching. Images were acquired with laser power of 0.2% (405 nm), 15% (565 nm), and 27% (633 nm); dwell time of 1.57 µs; voxel size of 103.8 nm × 103.8 nm × 250 nm; 16-bit depth. The pinhole was set to 1 Airy unit (AU) for the 488 nm wavelength.
2.10. Deep Learning-Based Detection of Nuclei and Spots
Nuclei and Oligopaint spots were detected using two independently trained deep learning models [19] applied to individual 2D planes of 3D image stacks, producing volumetric segmentations. Models were trained using EPySeg [22], with data augmentation (rotations, flips, intensity scaling). Probabilistic output maps were thresholded to yield binary masks for nuclei and point coordinates for spots.
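The thresholding step can be illustrated with a short sketch. It is not the FISH-Dist or EPySeg code: the function name, the 0.5 threshold, and the choice of one centroid per connected component are assumptions made for illustration, using NumPy and `scipy.ndimage` on a synthetic probability stack.

```python
import numpy as np
from scipy import ndimage


def spots_from_probability(prob_stack, threshold=0.5):
    """Threshold a (z, y, x) probability stack; return mask and blob centroids.

    The threshold value and the connected-component step are illustrative
    assumptions, not parameters taken from the pipeline.
    """
    mask = prob_stack > threshold
    labels, n_blobs = ndimage.label(mask)  # face-connected 3D components
    centroids = ndimage.center_of_mass(
        mask.astype(float), labels, np.arange(1, n_blobs + 1)
    )
    return mask, np.array(centroids, dtype=float).reshape(-1, 3)


# Synthetic stack with two blobs of high probability.
prob = np.zeros((5, 20, 20))
prob[2, 4:6, 4:6] = 0.9   # blob centered at (z, y, x) = (2, 4.5, 4.5)
prob[1:4, 15, 10] = 0.8   # blob centered at (z, y, x) = (2, 15, 10)

mask, spots = spots_from_probability(prob)
print(spots)
```

The same mask could serve directly as a binary nuclear segmentation, whereas for spots only the centroid coordinates are carried forward.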
2.11. Subpixel 3D Spot Centroid Extraction via Gaussian Fitting
Subpixel 3D coordinates of detected spots were refined using the Big-FISH Python library [23], providing initial coordinates and estimated spot sizes for a 3D Gaussian fit. This procedure yielded precise centroids even in the presence of moderate background noise.
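To make the idea of sub-pixel refinement concrete, the sketch below fits a 3D Gaussian to a small synthetic crop. It deliberately does not use the Big-FISH API; it relies on `scipy.optimize.curve_fit`, and the model parameterization (one axial and one lateral sigma plus a constant background), the function names, and the starting values are all assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian_3d(coords, amplitude, z0, y0, x0, sigma_z, sigma_yx, background):
    """Anisotropic 3D Gaussian evaluated at voxel coordinates (z, y, x)."""
    z, y, x = coords
    return background + amplitude * np.exp(
        -((z - z0) ** 2) / (2 * sigma_z ** 2)
        - ((y - y0) ** 2 + (x - x0) ** 2) / (2 * sigma_yx ** 2)
    )


def refine_spot(crop, initial_zyx, sigma_z=1.0, sigma_yx=1.5):
    """Return the fitted (z, y, x) center of a spot inside `crop` (voxels)."""
    z, y, x = np.indices(crop.shape)
    coords = np.vstack([z.ravel(), y.ravel(), x.ravel()]).astype(float)
    # Starting values: integer detection, assumed spot size, crude background.
    p0 = [crop.max() - crop.min(), *initial_zyx, sigma_z, sigma_yx, crop.min()]
    popt, _ = curve_fit(gaussian_3d, coords, crop.ravel().astype(float), p0=p0)
    return popt[1:4]


# Synthetic, noise-free spot with a known sub-voxel center.
true_center = (3.3, 5.6, 4.2)
zz, yy, xx = np.indices((7, 11, 11))
crop = gaussian_3d((zz, yy, xx), 100.0, *true_center, 1.2, 1.6, 10.0)

refined = refine_spot(crop, initial_zyx=(3, 6, 4))
print(refined)
```

On real data the crop would be cut around each detected spot, and the fitted center would be offset by the crop origin to recover stack coordinates.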
2.12. Spot Pairing Across Channels
Before applying chromatic aberration correction, corresponding Oligopaint spots were paired across channels based on spatial proximity. For each image, detected 3D spot coordinates were used to identify the nearest neighbor in the other channel. When applicable, only spots located within segmented nuclei were considered for pairing, thereby excluding non-genomic or non-specific signals. The resulting paired centroids serve as the input for chromatic aberration corrections.
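A minimal sketch of nearest-neighbor pairing is given below. It assumes spot coordinates in voxel units ordered (z, y, x), converts them to nanometers with the voxel size reported in Section 2.9, and pairs each spot in one channel with its closest spot in the other. The function names and the optional `max_dist_nm` cutoff are hypothetical and not taken from the pipeline.

```python
import math

# Voxel size (z, y, x) in nm, from the acquisition settings in Section 2.9.
VOXEL_NM = (250.0, 103.8, 103.8)


def to_nm(spot_zyx):
    """Convert a (z, y, x) voxel coordinate to nanometers."""
    return tuple(c * v for c, v in zip(spot_zyx, VOXEL_NM))


def pair_nearest(spots_a, spots_b, max_dist_nm=None):
    """For each spot in channel A, find the nearest spot in channel B.

    Returns a list of (index_a, index_b, distance_nm). `max_dist_nm` is a
    hypothetical optional cutoff for discarding implausible pairs.
    """
    pairs = []
    if not spots_b:
        return pairs
    b_nm = [to_nm(s) for s in spots_b]
    for i, spot in enumerate(spots_a):
        a_nm = to_nm(spot)
        j = min(range(len(b_nm)), key=lambda k: math.dist(a_nm, b_nm[k]))
        d = math.dist(a_nm, b_nm[j])
        if max_dist_nm is None or d <= max_dist_nm:
            pairs.append((i, j, d))
    return pairs


spots_565 = [(2.0, 10.0, 10.0), (3.0, 40.0, 42.0)]
spots_633 = [(3.2, 40.5, 41.0), (2.4, 10.5, 9.0)]
pairs = pair_nearest(spots_565, spots_633)
print(pairs)
```

Restricting `spots_a` and `spots_b` to coordinates that fall inside a nuclear mask before calling `pair_nearest` would reproduce the nucleus filter described above.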
2.13. Linear Chromatic Correction (LCC)
To correct for chromatic aberrations and ensure unbiased distance measurements, we applied a method that we term Linear Chromatic Correction (LCC) [14,24,25]. The principle underlying this correction is that, for paired FISH Oligopaint centroids, the mean difference between corresponding points should be zero in the absence of chromatic aberration (the pairs should have no preferred orientation), regardless of whether the points are colocalized or distant [14,24].
The correction procedure begins by computing the difference between each pair of centroids. The mean of these differences is subtracted to center the distribution at zero. For each spatial axis, a linear regression model is fitted to estimate the scaling bias. One set of points is then corrected using these per-axis scaling factors. Finally, any residual global translation is removed to ensure that the corrected differences remain centered at zero.
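The steps above can be sketched as follows. This is an illustration under stated assumptions, not the FISH-Dist implementation: the text does not specify which variables enter the regression, so the sketch assumes that, for each axis, the centered inter-channel difference is regressed on the reference-channel coordinate along that same axis. Function and variable names are hypothetical, and the data are synthetic.

```python
import numpy as np


def linear_chromatic_correction(ref, mov):
    """Correct `mov` toward `ref`; both are (n, 3) arrays of paired centroids."""
    ref = np.asarray(ref, dtype=float)
    mov = np.asarray(mov, dtype=float)
    diff = mov - ref
    centered = diff - diff.mean(axis=0)           # center differences at zero
    corrected = mov.copy()
    slopes = np.zeros(3)
    for axis in range(3):
        # ASSUMPTION: regress centered difference on the reference coordinate.
        slopes[axis], _ = np.polyfit(ref[:, axis], centered[:, axis], 1)
        corrected[:, axis] -= slopes[axis] * ref[:, axis]
    corrected -= (corrected - ref).mean(axis=0)   # remove residual translation
    return corrected, slopes


# Synthetic paired centroids (nm) with a per-axis scaling and a global shift.
rng = np.random.default_rng(0)
ref = rng.uniform(0, 50_000, size=(200, 3))
true_scale = np.array([1.002, 0.999, 1.004])
mov = ref * true_scale + np.array([30.0, -20.0, 150.0])

corrected, slopes = linear_chromatic_correction(ref, mov)
print(slopes)
```

On this noise-free example the fitted slopes equal the scaling error per axis and the corrected coordinates coincide with the reference; with real pairs, the residual spread reflects localization noise and true biological separation.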
2.14. Affine Chromatic Correction (ACC)
In addition to LCC, we implemented an independent global method to correct chromatic aberrations by aligning spot coordinates across channels using a 3D affine transformation. This approach compensates for systematic inter-channel discrepancies, including global translations, anisotropic scaling, rotations, and shear.
The procedure operates on paired, colocalized Oligopaint centroids extracted from multiple images sharing the same chromatic aberration profile, acquired on the same microscope within a limited time window. To reduce the influence of mismatched pairs, pairwise distances are filtered using the median and interquartile range (IQR), removing extreme outliers that could strongly bias the affine transformation. Filtered paired centroids from all images are then aggregated to estimate a single global affine transformation matrix.
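The outlier filter can be illustrated with a short sketch. The exact rule is not stated in the text, so the criterion used here (keep pairs whose distance lies within `k` interquartile ranges of the median, with `k = 1.5`) is an assumption, as are the function name and the example distances.

```python
import numpy as np


def iqr_keep_mask(distances, k=1.5):
    """Boolean mask of pair distances within k * IQR of the median.

    Both the rule and the default k are illustrative assumptions.
    """
    d = np.asarray(distances, dtype=float)
    q1, median, q3 = np.percentile(d, [25, 50, 75])
    return np.abs(d - median) <= k * (q3 - q1)


# Hypothetical pair distances in nm; the last one is a mismatched pair.
distances_nm = [48, 52, 55, 60, 47, 51, 950]
keep = iqr_keep_mask(distances_nm)
print(keep)
```

Pairs flagged `False` would be dropped before the centroids from all images are pooled for the affine fit.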
The affine transformation is estimated by minimizing the squared displacement between corresponding points, allowing for translation, rotation, anisotropic scaling, and shear. The transformation matrix is computed using an implementation adapted from a library by C. Gohlke [26], which solves the resulting least-squares point-set alignment problem using a singular value decomposition (SVD)-based approach. Once estimated, the transformation is applied to all points in one channel to generate corrected coordinates.
The resulting affine transform is stored and can be reapplied to images of non-colocalized spots acquired under the same conditions and within short time intervals, allowing consistent correction across datasets.
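A least-squares affine fit of this kind can be sketched in a few lines. The example below is not the adapted Gohlke implementation cited above; it swaps in `numpy.linalg.lstsq` on homogeneous coordinates, which solves the same least-squares problem for a general 3D affine map. Function names and the synthetic transform are assumptions.

```python
import numpy as np


def fit_affine_3d(moving, fixed):
    """Return the 4x4 affine matrix mapping `moving` onto `fixed` (least squares)."""
    moving = np.asarray(moving, dtype=float)
    fixed = np.asarray(fixed, dtype=float)
    homog = np.hstack([moving, np.ones((len(moving), 1))])   # (n, 4)
    params, *_ = np.linalg.lstsq(homog, fixed, rcond=None)   # (4, 3)
    matrix = np.eye(4)
    matrix[:3, :] = params.T   # linear part in [:3, :3], translation in [:3, 3]
    return matrix


def apply_affine_3d(matrix, points):
    """Apply a 4x4 affine matrix to an (n, 3) array of points."""
    points = np.asarray(points, dtype=float)
    return points @ matrix[:3, :3].T + matrix[:3, 3]


# Synthetic colocalized pairs (nm) related by a known affine transform.
rng = np.random.default_rng(1)
moving = rng.uniform(0, 20_000, size=(50, 3))
true_matrix = np.eye(4)
true_matrix[:3, :3] = [[1.002, 0.001, 0.0],
                       [0.0, 0.998, 0.0005],
                       [0.0, 0.0, 1.005]]
true_matrix[:3, 3] = [25.0, -15.0, 140.0]
fixed = apply_affine_3d(true_matrix, moving)

estimated = fit_affine_3d(moving, fixed)
registered = apply_affine_3d(estimated, moving)
```

The estimated matrix can be saved (for example with `numpy.save`) and later passed to `apply_affine_3d` for non-colocalized spots acquired under the same conditions, which mirrors the reuse described above.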
2.15. Distance Measurements and Resolution Assessment
Euclidean distances between registered, colocalized spots were calculated in 3D to assess spatial resolution. Distances for non-colocalized spots were similarly computed to quantify inter-locus spacing. Distributions were visualized using histograms and violin plots in Matplotlib (version 3.10.8) and Seaborn (version 0.13.2) [27,28], and robust statistics (median and interquartile range, IQR) were used to limit the influence of outliers and assess chromatic correction precision and spatial organization.
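The distance and summary-statistic computation can be sketched as below. The sketch assumes paired coordinates in voxel units ordered (z, y, x) and uses the voxel size from Section 2.9 to convert to nanometers; function names and the toy coordinates are illustrative only.

```python
import numpy as np

# Voxel size (z, y, x) in nm, from the acquisition settings in Section 2.9.
VOXEL_NM = np.array([250.0, 103.8, 103.8])


def pair_distances_nm(spots_a_vox, spots_b_vox, voxel_nm=VOXEL_NM):
    """3D Euclidean distance (nm) between paired (z, y, x) voxel coordinates."""
    delta = (np.asarray(spots_a_vox, float) - np.asarray(spots_b_vox, float)) * voxel_nm
    return np.linalg.norm(delta, axis=1)


def median_iqr(values):
    """Robust summary: median and interquartile range."""
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return median, q3 - q1


# Toy pairs: one voxel apart in z, one voxel apart in y, and identical.
channel_a = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
channel_b = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

distances = pair_distances_nm(channel_a, channel_b)
median, iqr = median_iqr(distances)
print(distances, median, iqr)
```

Scaling by the voxel size before taking the norm matters here because the axial step (250 nm) is more than twice the lateral pixel size (103.8 nm).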
4. Discussion
4.1. FISH-Dist Enables Reproducible 3D Measurements of Short-Range Genomic Distances
We developed FISH-Dist, an automated pipeline designed for quantitative 3D distance measurements in FISH imaging at the kilobase scale. FISH-Dist combines deep learning-based spot detection, sub-pixel 3D localization, and dual chromatic aberration correction methods (affine and linear), with performance assessed using calibrated standards such as DNA origami nanorulers.
While many existing FISH tools focus on large-scale chromosomal structures (hundreds of nanometers to micrometers), biologically critical interactions—such as enhancer–promoter contacts, local chromatin loops, regulatory clustering within topologically associating domains, DNA damage foci, and homologous chromosome pairing—occur at much shorter distances (10–50 kb). At these scales, chromatic aberration and localization precision are limiting factors. In our system, uncorrected inter-channel offsets exceeded 170 nm, with axial components approaching 160 nm—comparable to or larger than many true biological separations. FISH-Dist reduces these errors to approximately 50 nm for colocalized targets.
Additionally, we demonstrate that 2 kb probe targets offer an optimal balance between detection efficiency and spatial compactness, achieving ∼45 nm resolution. These advances enable robust, quantitative measurement of short-range genomic interactions that are typically inaccessible to conventional FISH analysis pipelines.
4.2. Probe Design Guidelines Emerge from Systematic Parameter Testing
Our systematic evaluation of probe design parameters yielded several actionable insights for experimental optimization. The relationship between target sequence length and resolution reveals a critical trade-off: longer sequences improve detection efficiency but compromise localization precision. Our 2 kb construct achieved optimal 45 nm resolution with robust detection (190.5 spots/image). Shortening to 1 kb degraded resolution to 75–80 nm despite reduced target size. The 500 bp construct showed slightly better resolution (∼70 nm) than the 1 kb construct, but both exhibited markedly reduced detection efficiency.
These findings indicate that competing effects exist, with shorter targets providing compact labeling but reduced signal intensity and fewer probe binding sites, both of which compromise localization. The 2 kb length represents a favorable compromise. Importantly, sequence composition influences performance even at fixed length, as the R1 construct showed reduced efficiency despite having the same length and number of probes as R6. Factors such as GC content, secondary structure, or chromatin accessibility likely modulate hybridization efficiency.
Probe density experiments showed that reducing labeling from 100% to 66% caused moderate degradation (45 nm to 60 nm resolution, 30% detection loss). Further reduction to 50% caused substantial degradation (80 nm resolution, 75% detection loss). Even with probe excess, reducing the fluorescent fraction diminishes photon counts, degrading centroid localization. For high spatial accuracy, we recommend maintaining at least 57 oligos/fluorophores per probe set spanning 2 kb (corresponding to ∼66% labeling in our conditions).
Our optimization was conducted in specific genomic and cellular contexts. Absolute values likely depend on chromatin accessibility, nuclear architecture, and cell type. However, general principles—optimal target length balancing signal and compactness, sequence-dependent performance, sensitivity to probe density—should apply broadly. We encourage pilot optimization in specific biological systems.
4.3. Methodological Considerations and Limitations
Validation with DNA origami nanorulers provides ground-truth accuracy under idealized conditions, while transgene-based measurements in fixed biological samples demonstrate accuracy in realistic contexts where chromatin organization and nuclear architecture are preserved. This combination represents a strength of our approach, as it bridges artificial calibration standards and biologically relevant measurements. Nevertheless, probe accessibility can vary with chromatin state, DNA sequence, and cell cycle stage. Although fixation stabilizes chromatin structure, labeled DNA retains conformational heterogeneity reflecting the ensemble of chromatin configurations present at fixation, such that measured distances represent spatial averages rather than rigid, single conformations.
Our chromatic aberration correction assumes a stable optical transformation between channels during image acquisition, and performance may degrade in samples with substantial axial drift or stage movement. In addition, the deep learning-based spot detection relies on training data representative of the target application. Although the trained model generalizes well across the range of probes, labeling densities, and imaging conditions tested here, users working with substantially different probe types or imaging modalities may benefit from retraining or fine-tuning the model. Due to their size, the training datasets and scripts are not publicly distributed but are available upon request.
Throughout validation and optimization, we employed two conceptually distinct chromatic aberration correction approaches, affine and linear, which consistently produced nearly identical corrected distances across all tested conditions, typically differing by only a few nanometers. This close agreement indicates that both methods accurately model the chromatic aberration in our imaging system and provides an internal consistency check that strengthens confidence in measurements.
Although the two approaches differ primarily in workflow rather than performance, maintaining both in the analysis pipeline offers important diagnostic value. Agreement between methods, as observed here, confirms optical stability and the adequacy of global correction models, whereas substantial divergence would indicate optical instability or model inadequacy and prompt re-evaluation of imaging conditions. Consistent with this interpretation, spatial analysis of residual displacements revealed relatively uniform chromatic aberration fields across the imaging volume, indicating that global correction models capture most systematic offsets in our system. Users are nevertheless encouraged to assess spatial uniformity in their own setups to determine the most appropriate correction strategy.
4.4. Comparison with Existing Methods
Several computational tools exist for FISH image analysis, each with distinct design priorities and validation strategies. Classical approaches based on watershed segmentation or intensity thresholding remain widely used, but they often require manual parameter tuning or segmentation, which is error-prone and can lead to substantial over-segmentation, particularly under variable background conditions or in regions with densely packed spots.
Particle tracking frameworks such as TrackMate [29] provide flexible and sophisticated spot detection and localization, capable of handling multiple spots per frame. Tools like the DNA-FISH Fiji plugin [30], which leverages TrackMate for spot detection, offer semi-automated analysis of FISH images, including distance measurements between genomic loci. However, in noisy images such as ours, these methods can generate excessive false-positive detections. While post-processing can mitigate this, the procedure is often slow and not optimized for high-throughput FISH distance measurements. Additionally, neither TrackMate nor the DNA-FISH plugin explicitly addresses chromatic aberration correction, which is critical for accurate short-range distance quantification.
FISH-quant [23] is the most directly comparable tool. However, it was developed primarily for RNA FISH and transcript quantification rather than precise genomic distance measurements and does not include chromatic aberration correction. In contrast, FISH-Dist is specifically designed for short-range genomic distances, implements two complementary chromatic aberration correction methods (linear and affine), and has been rigorously validated using both transgenic constructs and DNA origami nanorulers. Its systematic evaluation of probe design parameters provides increased confidence in quantitative accuracy compared with single-method or less extensively validated pipelines.
5. Conclusions
FISH-Dist has been developed with accessibility and reproducibility in mind. The complete analysis pipeline is available as open-source software, accompanied by comprehensive documentation, example datasets, and tutorials to facilitate adoption. The tool is provided as a command-line application and includes pre-trained deep learning models that support many standard use cases, while also allowing retraining or fine-tuning for specialized experimental conditions.
More broadly, FISH-Dist provides a validated and automated framework for quantitative 3D distance measurements in genome organization studies. By integrating chromatic aberration correction, robust spot detection, and systematic distance analysis into a single workflow, the pipeline reduces technical barriers and enables reproducible, high-throughput spatial measurements across diverse experimental systems.