1. Introduction
1.1. Paired Observations and Why They Matter in Anatomy
Anatomic variants are inherently paired: each individual has a left and a right side that share developmental, genetic, and environmental influences. Accordingly, the presence of a variant on one side is generally not independent of its presence on the other. Bilateral manifestation of a muscle, vessel, or skeletal feature often reflects shared embryologic pathways, whereas unilateral expression reflects asymmetric development within the same individual [1].
This paired structure is particularly relevant in osteological research, where data are often derived from skeletal collections assembled over long time spans and with variable specimen completeness. Left and right elements may originate from the same individual, from different individuals, or from mixtures of paired and unpaired material. Even when pairing is conceptually present, explicit left–right correspondence is frequently incomplete or unreported.
From an inferential standpoint, paired anatomy has two distinct implications. Laterality (whether a variant preferentially affects one side) depends exclusively on discordant individuals, that is, those with unilateral expression. Bilateralism (whether a variant tends to occur symmetrically) depends on joint occurrence within individuals. These quantities address different anatomical questions and should not be conflated [2].
In practice, however, many primary anatomic studies, particularly those based on osteological collections, report only side-specific prevalences (e.g., presence in x% of right sides and y% of left sides), without indicating how often the variant occurs bilaterally. Consequently, meta-analyses that treat bilateral prevalence as a distinct outcome remain uncommon despite its anatomical relevance [3]. When such studies are synthesized, the joint left–right distribution is unobserved, even though it determines both laterality and bilateralism. Ignoring this structure risks interpreting asymmetry and symmetry as population-level properties rather than as within-individual phenomena.
1.2. The Unavoidable Role of Dependence Assumptions
When joint left–right information is unavailable, some assumption about within-individual dependence is unavoidable. This situation is common in anatomical prevalence research—particularly in osteological collections—where only side-specific counts are reported and individual pairing cannot be reconstructed. In such cases, left–right dependence is not estimable but constitutes an unobserved structural feature of the data.
Assuming independence implies that presence on one side conveys no information about presence on the other—an assumption rarely consistent with anatomical development. At the opposite extreme, maximal feasible bilateral concordance attributes all asymmetry to marginal differences, leaving no scope for genuine unilateral expression. Both positions lie at the boundaries of the admissible dependence range and cannot be empirically justified when only marginals are available.
Many meta-analyses address variance instability or heterogeneity using fixed correlation values or transformations developed for independent binomial data. Generalized linear mixed models have been proposed as a principled alternative [4,5], and the Freeman–Tukey double-arcsine transformation remains widely used despite sustained methodological debate [6,7,8]. However, while these approaches target variance or distributional concerns, they do not resolve the central issue in paired anatomical data: the joint left–right distribution is unobserved and therefore unidentified.
The core problem is thus structural non-identifiability rather than estimation error. Infinitely many joint distributions are compatible with the same marginal prevalences, making it impossible to recover the “true” dependence from marginals alone. The methodological question is therefore not whether to assume dependence, but how to parameterize it in a way that is anatomically feasible, mathematically coherent, and transparently interpretable.
An explicit feasibility-based parameterization makes the dependence assumption visible and allows readers to assess how inference varies across alternative, yet admissible, joint structures. Such transparency is essential for principled meta-analysis of paired binary anatomical data.
Once dependence is recognized as structurally unidentifiable from marginals, the natural question becomes which anatomical estimands are most sensitive to that structural uncertainty.
1.3. Laterality and Bilateralism as Complementary Outcomes
Within the setting of unobserved joint left–right structure and unavoidable dependence assumptions, we focus on two complementary outcomes that capture the principal anatomical questions in prevalence studies: laterality and bilateralism. Although both arise from the same joint distribution, they target distinct features of anatomical variation and respond differently to uncertainty about within-individual dependence.
Laterality is quantified by the paired odds ratio, comparing right-only to left-only manifestation within individuals. By construction, bilateral cases do not enter this contrast. Laterality is therefore determined entirely by discordant individuals and reflects directional asymmetry at the individual level rather than differences in marginal side-specific prevalences. Statistically, it corresponds to a McNemar-type estimand [9].
Bilateralism, in contrast, is measured by bilateral prevalence—the proportion of individuals in whom the variant is present on both sides. This outcome captures symmetric manifestation within individuals and is often of intrinsic developmental or anatomical interest. It is conceptually distinct from overall or side-specific prevalence, and conflating these quantities can obscure meaningful patterns.
Although both endpoints derive from the same joint distribution, they behave differently when joint data are unreported. Laterality depends exclusively on discordant pairs and is therefore sensitive to how probability mass is allocated between discordant and concordant outcomes. Bilateral prevalence depends directly on the bilateral joint probability and varies linearly across the admissible dependence range. Recognizing this distinction is essential for interpreting meta-analytic results based solely on marginal side-specific data.
1.4. Existing Statistical Frameworks for Paired Binary Outcomes
The need to account for correlation in paired or clustered binary outcomes is well recognized in other biomedical fields. In ophthalmology, for example, treating two eyes as independent observations is explicitly discouraged, and paired or clustered analytic methods are standard practice [10,11]. More generally, several statistical frameworks have been developed for meta-analysis of correlated or bivariate binary outcomes, including marginal beta–binomial models [12], models for split-body or paired interventions [13], and copula-based approaches that explicitly parameterize within-study dependence [14,15].
These methods demonstrate that within-study dependence can, in principle, be modeled directly. However, they rely on a critical prerequisite: access to individual-level data or sufficiently detailed reporting of joint outcomes to identify the association structure. When the joint distribution is observed—or estimable—dependence parameters can be inferred using likelihood-based or Bayesian approaches.
Anatomical prevalence research rarely satisfies these conditions. Primary studies commonly report only marginal side-specific prevalences without joint left–right counts and may aggregate observations from specimens with uncertain or mixed pairing status, as often occurs in osteological collections. In such settings, the joint distribution is not simply unknown but fundamentally unidentified: multiple dependence structures are compatible with the same marginals. Consequently, even sophisticated statistical models cannot recover the true dependence without introducing additional assumptions.
The limitation in applying existing frameworks to anatomical meta-analysis is therefore informational rather than methodological. When joint outcomes are unreported, dependence cannot be estimated; it must be assumed. This context motivates approaches that treat dependence as a structural uncertainty to be parameterized transparently, rather than as a nuisance quantity to be inferred implicitly.
Despite the clear anatomical relevance of joint left–right occurrence, relatively few meta-analytic studies explicitly address bilateral prevalence or within-individual dependence in anatomical research. This scarcity appears to reflect reporting practices rather than a lack of biological interest. Primary anatomical studies—particularly in osteological collections—typically report only side-specific counts, and joint left–right distributions are rarely tabulated. Consequently, the methodological literature on paired binary meta-analysis has developed largely in clinical and ophthalmologic contexts, where joint information is routinely available, but remains underdeveloped in anatomical prevalence research. The present study addresses this structural gap.
1.5. A Feasibility-Based Approach to Unreported Dependence
Given these informational constraints, we adopt a feasibility-based approach. For any pair of marginal side-specific prevalences, the set of admissible joint prevalences is restricted by probability theory. The Fréchet bounds define the full range of joint left–right distributions compatible with the reported marginals.
In classical paired-design studies of bilateral anatomy, left–right dependence is typically quantified using an empirically estimated correlation coefficient, for example, in analyses of skeletal symmetry [2]. Such approaches presume that paired observations and joint outcomes are fully observed, allowing the dependence structure to be estimated rather than assumed.
When joint data are unavailable, however, imposing an arbitrary correlation value within the admissible range is ill-defined. Instead, we parameterize the feasible interval using a single dependence index λ, which maps the bilateral probability continuously from independence (λ = 0) to maximal feasible concordance (λ = 1). By construction, this parameterization ensures that all assumed joint distributions are compatible with the observed marginals and avoids infeasible configurations that can arise under fixed correlation assumptions.
Within this framework, we introduce the midway dependence hypothesis, defined by λ = 0.5. This choice places the assumed bilateral joint probability at the midpoint of the admissible Fréchet range. It does not assert biological truth or estimate the actual degree of association; rather, it provides a neutral and transparent reference when no joint information is available to favor either extreme. The midpoint is defined on the bilateral probability scale and does not correspond to the midpoint of any derived association measure (e.g., φ, correlation, or odds ratio).
The feasibility-based formulation offers several advantages. It separates the modeling assumption about dependence (λ) from derived association measures such as the phi coefficient, whose scale depends on the marginals and is therefore not invariant across settings. It permits simultaneous evaluation of laterality and bilateral prevalence under a common dependence structure, facilitating explicit sensitivity analysis. And it guarantees that all assumed joint distributions respect the constraints imposed by the observed data.
By treating within-subject dependence as bounded structural uncertainty rather than as a fixed correlation parameter, this approach aligns statistical modeling with the reporting realities of anatomical research and provides a principled foundation for the analytical and simulation results that follow.
1.6. Partial Pairing in Real-World Anatomical Datasets
In practice, laterality and bilateralism are often meta-analyzed using only published side-specific counts, without access to individual-level pairing information. Even when observations are conceptually paired, the extent to which left and right sides originate from the same individual is frequently unclear or inconsistently reported.
At one extreme, studies based on whole skeletons or intact bodies imply complete pairing, making strong within-individual concordance anatomically plausible. At the other extreme, datasets composed of isolated elements or pooled sides may justify treating observations as independent. More commonly, however, anatomical material represents a mixture of paired and unpaired observations, as occurs in osteological collections, cadaveric series, or aggregated datasets compiled over long periods with incomplete provenance. In such settings, neither full independence nor maximal bilateral concordance is likely to reflect reality.
When pairing is partial or unknown, dependence cannot be inferred from marginals, nor can it be reliably approximated by borrowing correlation estimates from other contexts. A compromise assumption is therefore required—one that acknowledges some degree of pairing without defaulting to boundary values. The feasibility-based dependence index formalizes this compromise by positioning the assumed joint distribution within the admissible range implied by the marginals.
Operationally, the midway dependence hypothesis (λ = 0.5) places the unobserved bilateral probability halfway between independence and maximal feasible concordance. A similar “midway” principle has been proposed for continuous repeated-measures meta-analysis when correlation information is missing [16]. In the present context, this choice does not assert equal proportions of paired and unpaired specimens, nor does it claim biological truth. Rather, it provides a transparent working assumption when the extent of pairing cannot be reconstructed from published data.
These practical realities of specimen provenance motivate frameworks that treat within-subject dependence as a structural feature of anatomical data rather than as a secondary modeling detail. By making dependence explicit and feasibility-constrained, the present approach enables consistent analysis of laterality and bilateralism across heterogeneous datasets with incomplete or absent pairing information.
1.7. Scope, Aims, and Structure of the Present Study
This study develops a principled framework for meta-analysis of paired binary anatomical data when individual-level left–right outcomes are unreported or incompletely observed, as commonly occurs in anatomical and osteological prevalence research. Rather than attempting to estimate within-individual dependence from marginal data—an inherently non-identifiable problem—we treat dependence as bounded structural uncertainty defined by feasibility constraints.
Within this framework, admissible joint distributions are parameterized using a feasibility-based dependence index λ, spanning the full range of joint structures compatible with the observed marginals. This formulation permits laterality and bilateral prevalence to be analyzed under a common dependence assumption while making explicit how inference depends on unobserved joint structure.
A central reference point is the midway dependence hypothesis (λ = 0.5), introduced as a neutral feasibility-based working assumption when neither independence nor maximal concordance can be empirically justified. It is not proposed as a biological model or as an estimate of true correlation, but as a transparent reference that enables interpretable inference and structured sensitivity analysis.
Using this framework, we evaluate how laterality and bilateral prevalence vary across the admissible dependence range. We demonstrate that laterality, quantified by paired odds ratios, is highly sensitive to dependence assumptions, particularly near the upper feasibility boundary, whereas bilateral prevalence is comparatively robust. We further examine the effects of rare variants and marginal imbalance, and provide exact reference values for derived correlation measures under maximal and midway assumptions to support transparent sensitivity analyses [16].
2. Materials and Methods
2.1. Paired Binary Data Structure, Identifiability, and Overview of the Analytical Framework
We consider an anatomical variant that may be present on the left side, the right side, on both sides, or on neither side of an individual. Each observation, therefore, consists of a paired binary outcome reflecting left–right presence within the same individual.
It is useful to distinguish three related levels of description: (i) individual-level pairing, (ii) the joint distribution of bilateral occurrence, and (iii) marginal side-specific prevalences.
At the individual level, left and right outcomes are intrinsically paired because they arise within the same organism and may share developmental, genetic, and environmental determinants. Consequently, they cannot generally be treated as independent observations.
At the population level, this pairing is summarized by the joint distribution $\{p_{ij}\}$, where $p_{ij}$ denotes the probability that the variant is present (1) or absent (0) on the left ($i$) and right ($j$) sides. The four joint probabilities fully characterize bilateral absence ($p_{00}$), unilateral left ($p_{10}$), unilateral right ($p_{01}$), and bilateral presence ($p_{11}$). Laterality and bilateralism are functions of this joint distribution.
In practice, however, most primary anatomical studies report only marginal prevalences: the proportions of individuals with the variant on the left side ($p_L$) and on the right side ($p_R$). These marginals are linear combinations of the joint probabilities but do not reveal how frequently left- and right-side occurrences co-occur within individuals. Multiple joint distributions, corresponding to different degrees of within-individual dependence, can produce identical marginal prevalences. When joint information is unavailable, the individual-level structure underlying laterality and bilateralism is therefore not identifiable from marginals alone.
To clarify this structural separation between observed marginals and the unobserved joint distribution, Figure 1 provides a schematic overview of the feasibility-based analytical framework developed in this study. The diagram summarizes (i) the reporting structure of primary studies, (ii) the feasible region of joint probabilities consistent with the marginals, (iii) the parameterization of within-individual dependence through a feasibility index λ, and (iv) the derivation of bilateralism and laterality estimands for meta-analytic synthesis.
In anatomical prevalence settings, variant expression may also vary with age (e.g., due to remodeling or functional adaptation). If dependence between sides differs across age strata, pooled marginal prevalences represent mixtures of age-specific joint distributions. In such cases, any inferred dependence parameter should be interpreted as an aggregate quantity induced by the study’s age composition rather than as a homogeneous biological constant. The feasibility constraints developed below remain valid for pooled marginals, but derived laterality and bilateralism measures are conditional on the underlying (and potentially unreported) age structure.
All subsequent modeling explicitly separates the observed marginals from the unobserved joint distribution that links them. The feasibility framework introduced in Section 2.4 formalizes this distinction mathematically. Appendix A provides formal definitions, derivations, and exact results underlying the feasibility-based framework developed in the main text.
2.2. Target Estimands: Laterality and Bilateralism
We consider two co-primary endpoints that summarize complementary aspects of paired anatomical variation: laterality and bilateralism. Both are functions of the same joint left–right distribution but address distinct biological questions and exhibit different sensitivities when joint information is unavailable.
2.2.1. Laterality
Laterality concerns directional asymmetry at the individual level, that is, whether a variant preferentially affects one side of the body. For paired binary data, laterality is quantified using the paired odds ratio, which compares the probability of right-only manifestation ($p_{01}$) to left-only manifestation ($p_{10}$).
This measure depends exclusively on discordant individuals (those with unilateral expression). Bilateral and bilaterally absent cases do not contribute to laterality because they exhibit no within-individual side preference. Laterality, therefore, captures directional asymmetry rather than overall prevalence.
When joint counts are reported, the paired odds ratio is directly identifiable. When only marginal prevalences are available, however, the number of discordant individuals is not determined by the marginals alone. In such cases, laterality estimation requires an assumption about how probability mass is allocated between concordant and discordant outcomes, making inference sensitive to the underlying dependence structure.
2.2.2. Bilateralism
Bilateralism addresses the frequency of symmetric expression within individuals. We quantify bilateralism using bilateral prevalence, defined as the joint probability of variant presence on both sides.
Bilateral prevalence differs conceptually from marginal side-specific prevalence and from the probability of having the variant on at least one side. Whereas marginals describe side-specific occurrence, bilateral prevalence captures concordant manifestation within individuals.
Like laterality, bilateral prevalence cannot be inferred from marginal prevalences alone when joint data are unreported. Any reconstruction, therefore, requires an assumption about the left–right dependence structure.
Together, laterality and bilateralism provide a complete summary of paired binary variation: laterality reflects directional asymmetry among discordant individuals, whereas bilateralism reflects symmetric expression among concordant individuals. The subsequent sections develop a feasibility-based framework for estimating these endpoints when the joint distribution is unobserved.
2.3. Feasible Joint Distributions
When joint left–right data are unreported, the central inferential challenge is that the joint distribution linking side-specific occurrences is unknown. However, it is not arbitrary. For any given pair of marginal prevalences, only a restricted set of joint distributions is mathematically admissible.
This restriction follows from basic probability constraints: joint probabilities must be non-negative, sum to one, and reproduce the observed marginals. As a consequence, the bilateral probability is bounded above and below by limits determined entirely by $p_L$ and $p_R$. These limits are the Fréchet bounds.
The Fréchet bounds define the most extreme joint structures compatible with the observed marginals. At the lower bound, bilateral co-occurrence is minimized subject to the marginal constraints; at the upper bound, it is maximized. Every admissible joint distribution lies between these extremes.
Importantly, independence is only one possible joint configuration within this feasible region and does not, in general, coincide with either bound. Thus, when only marginals are available, laterality and bilateral prevalence are not identifiable: different admissible joint distributions can reproduce the same marginals while implying different values for both endpoints.
The resulting inferential problem is therefore one of structural non-identifiability rather than sampling variability. Without additional assumptions, marginal prevalences alone do not determine how probability mass is distributed between concordant and discordant outcomes. Any analysis must explicitly specify where the true joint distribution lies within the feasible region.
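The non-identifiability described above can be illustrated with a minimal Python sketch. The two joint tables are hypothetical (not drawn from any study), and the key names `p11`, `p10`, `p01`, `p00` are an assumed convention in which the first index refers to the left side and the second to the right side. Both tables reproduce the same side-specific prevalences yet imply different bilateral prevalences and paired odds ratios.

```python
def marginals(p11, p10, p01, p00):
    """Return (left, right) marginal prevalences of a 2x2 joint table."""
    return p11 + p10, p11 + p01


# Two hypothetical joint tables sharing marginals of 0.20 (left) and 0.30 (right).
joint_a = dict(p11=0.06, p10=0.14, p01=0.24, p00=0.56)  # independence: 0.20 * 0.30
joint_b = dict(p11=0.15, p10=0.05, p01=0.15, p00=0.65)  # stronger concordance

for name, joint in (("A", joint_a), ("B", joint_b)):
    p_left, p_right = marginals(**joint)
    # Paired odds ratio: right-only versus left-only manifestation.
    paired_or = joint["p01"] / joint["p10"]
    print(name, round(p_left, 3), round(p_right, 3),
          joint["p11"], round(paired_or, 3))
```

A reader given only the two marginal prevalences could not tell table A from table B, which is why an explicit dependence assumption is needed.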
In the next section, we formalize these constraints by expressing the Fréchet bounds as inequalities on the joint probabilities and introducing a scalar parameter that spans the entire admissible dependence range.
2.4. Feasibility-Based Dependence Parameterization
As established in Section 2.3, when only marginal side-specific prevalences are available, the joint left–right distribution is not identifiable but is constrained to lie within a well-defined feasible region determined by the observed marginals. We now formalize these feasibility constraints and introduce a scalar parameter that spans the entire admissible range of within-individual dependence.
2.4.1. Feasibility Constraints
Let $L$ and $R$ denote binary indicators of variant presence on the left and right sides, respectively, with joint probabilities:
$$p_{ij} = \Pr(L = i,\ R = j), \qquad i, j \in \{0, 1\}.$$
The marginal prevalences are
$$p_L = \Pr(L = 1) = p_{11} + p_{10}, \qquad p_R = \Pr(R = 1) = p_{11} + p_{01}.$$
Because probabilities must be non-negative and sum to one, the bilateral probability $p_{11}$ cannot take arbitrary values once $p_L$ and $p_R$ are fixed. Instead, it is constrained by the Fréchet bounds:
$$\max(0,\ p_L + p_R - 1) \le p_{11} \le \min(p_L,\ p_R).$$
These bounds are sharp: every value of $p_{11}$ within this interval corresponds to at least one valid joint distribution consistent with the observed marginals, whereas values outside the interval are mathematically impossible. Independence corresponds to the interior value $p_{11} = p_L\,p_R$, which is feasible but not privileged.
The Fréchet bounds, therefore, define a one-dimensional feasible segment of joint distributions compatible with the observed marginals. Any assumption about within-individual dependence, in the absence of joint data, amounts to selecting a point along this segment.
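As a minimal sketch, the Fréchet bounds on the bilateral probability can be computed directly from the two marginals. The function name `frechet_bounds` and the example marginals are illustrative assumptions, not part of the original study.

```python
def frechet_bounds(p_left, p_right):
    """Lower and upper Frechet bounds on the bilateral (both-sides) probability."""
    if not (0.0 <= p_left <= 1.0 and 0.0 <= p_right <= 1.0):
        raise ValueError("marginal prevalences must lie in [0, 1]")
    lower = max(0.0, p_left + p_right - 1.0)
    upper = min(p_left, p_right)
    return lower, upper


# Hypothetical marginals: 20% on the left side, 30% on the right side.
lower, upper = frechet_bounds(0.20, 0.30)
independence = 0.20 * 0.30  # lies strictly inside the interval here
print(lower, independence, upper)
```

For rare variants the lower bound is zero, so the feasible segment runs from no bilateral cases up to the smaller of the two marginals.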
2.4.2. Parameterization via a Feasibility Index
To parameterize this selection transparently, we introduce a feasibility-based dependence index λ ∈ [0, 1], defined through linear interpolation within the admissible interval:
$$p_{11}(\lambda) = p_L\,p_R + \lambda\,[\min(p_L, p_R) - p_L\,p_R].$$
Under this construction, λ = 0 corresponds to independence, λ = 1 corresponds to maximal feasible bilateral concordance, while intermediate values span the entire admissible dependence range.
This formulation guarantees that every assumed joint distribution is feasible for the observed marginals.
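The interpolation can be sketched in a few lines of Python. The function name `bilateral_probability` and the example marginals are hypothetical; the anchors (independence at index 0, maximal feasible concordance at index 1) follow the description in the text.

```python
def bilateral_probability(p_left, p_right, lam):
    """Bilateral probability for feasibility index lam in [0, 1].

    lam = 0 gives the independence value p_left * p_right;
    lam = 1 gives the upper Frechet bound min(p_left, p_right).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    p_indep = p_left * p_right
    p_max = min(p_left, p_right)
    return p_indep + lam * (p_max - p_indep)


# Hypothetical marginals of 0.20 (left) and 0.30 (right).
for lam in (0.0, 0.5, 1.0):
    print(lam, round(bilateral_probability(0.20, 0.30, lam), 4))
```

Because the returned value never leaves the interval between the independence value and the upper bound, every choice of the index yields a feasible joint distribution.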
2.4.3. Reconstruction of the Joint Distribution
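Once $p_{11}$ is specified through λ, the remaining joint probabilities follow deterministically:
$$p_{10} = p_L - p_{11}, \qquad p_{01} = p_R - p_{11}, \qquad p_{00} = 1 - p_L - p_R + p_{11}.$$
Thus, specification of λ completes the reconstruction of the joint distribution $\{p_{00}, p_{01}, p_{10}, p_{11}\}$.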
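The reconstruction step can be sketched as follows. The function name `reconstruct_joint`, the dictionary keys (first index = left, second = right), and the example marginals are assumptions made for illustration.

```python
def reconstruct_joint(p_left, p_right, lam):
    """Rebuild the 2x2 joint table from two marginals and a feasibility index."""
    p_indep = p_left * p_right
    p11 = p_indep + lam * (min(p_left, p_right) - p_indep)
    return {
        "p11": p11,                           # bilateral presence
        "p10": p_left - p11,                  # left only
        "p01": p_right - p11,                 # right only
        "p00": 1.0 - p_left - p_right + p11,  # bilateral absence
    }


# Hypothetical marginals under the midway index of 0.5.
joint = reconstruct_joint(0.20, 0.30, 0.5)
print({key: round(value, 4) for key, value in joint.items()})
```

The four cells always sum to one and reproduce the input marginals, so the only free choice is the feasibility index.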
2.4.4. Link to Estimands
The reconstructed joint probabilities determine both estimands of interest: bilateralism, defined directly by bilateral prevalence $p_{11}$, and laterality, quantified by the paired odds ratio based on discordant outcomes,
$$\mathrm{OR}_{\text{paired}} = \frac{p_{01}}{p_{10}}.$$
Because the discordant probabilities $p_{10}$ and $p_{01}$ shrink as concordance approaches its upper bound, laterality may exhibit boundary instability as λ → 1, whereas bilateral prevalence varies smoothly across the feasible interval.
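The boundary behavior can be made concrete with a short sketch. The function name `paired_odds_ratio`, the tolerance `tol`, and the example marginals are hypothetical; the ratio is taken as right-only over left-only, matching the definition of laterality in the text.

```python
import math


def paired_odds_ratio(p_left, p_right, lam, tol=1e-12):
    """Right-only versus left-only odds ratio under feasibility index lam."""
    p_indep = p_left * p_right
    p11 = p_indep + lam * (min(p_left, p_right) - p_indep)
    p10 = p_left - p11   # left-only probability
    p01 = p_right - p11  # right-only probability
    if p10 <= tol and p01 <= tol:
        return math.nan  # no discordant mass at all: ratio undefined
    if p10 <= tol:
        return math.inf  # left-only mass has vanished
    return max(p01, 0.0) / p10


# Hypothetical marginals of 0.20 (left) and 0.30 (right).
for lam in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(lam, paired_odds_ratio(0.20, 0.30, lam))
```

With unequal marginals, the smaller discordant cell reaches zero at the upper bound, so the ratio grows without limit as the index approaches 1, while the bilateral probability itself changes only linearly.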
2.4.5. Interpretation of λ
By construction, λ is a dimensionless feasibility index defined on the admissible segment determined by the marginals. It is not a correlation coefficient. Correlation measures such as the phi coefficient arise as derived, marginal-dependent quantities once λ and the marginals are specified. We therefore treat λ as the primary modeling assumption, with all downstream estimands interpreted conditionally on this choice.
2.5. The Midway Dependence Hypothesis
2.5.1. The Midway Hypothesis as a Reflection of Symmetry
The feasibility-based parameterization introduced in Section 2.4 spans the entire admissible range of joint left–right dependence through the scalar index λ. For meta-analysis across studies, a working reference value of λ is required to compute study-level laterality and bilateral prevalence from reported marginals prior to pooling. In the absence of joint data, we therefore define a neutral reference point within the admissible dependence range, termed the midway dependence hypothesis.
Formally, the midway dependence hypothesis corresponds to the choice
$$\lambda = 0.5.$$
This choice places the bilateral joint probability exactly halfway between the value implied by independence and the value implied by maximal feasible concordance. Under this assumption,
$$p_{11}^{\mathrm{mid}} = \tfrac{1}{2}\,[\,p_L\,p_R + \min(p_L, p_R)\,].$$
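Because $p_{11}$ varies linearly with λ by construction, the midway hypothesis corresponds to the arithmetic midpoint of the interval for $p_{11}$ between independence and maximal concordance. The same linearity implies that the discordant probabilities $p_{10}$ and $p_{01}$ are positioned halfway between their independence and maximal-concordance values. The paired odds ratio, being a ratio of these two probabilities, is not linear in λ, so its midway value is generally not the midpoint of its boundary values on either the natural or the logarithmic scale.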
Importantly, the midway dependence hypothesis does not assert that the true within-individual dependence equals λ = 0.5, nor does it correspond to a fixed correlation on any conventional scale. Rather, it serves as a neutral feasibility-based reference, analogous to selecting the center of a bounded parameter space when no empirical information justifies favoring either extreme.
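As a worked illustration of the midway construction, take hypothetical marginals of 0.20 on the left and 0.30 on the right (values chosen only for arithmetic convenience; $p_{ij}$ uses the first index for the left side and the second for the right side):
$$p_{11}^{\mathrm{ind}} = 0.20 \times 0.30 = 0.06, \qquad p_{11}^{\max} = \min(0.20,\ 0.30) = 0.20, \qquad p_{11}^{\mathrm{mid}} = \tfrac{1}{2}(0.06 + 0.20) = 0.13,$$
$$p_{10} = 0.20 - 0.13 = 0.07, \qquad p_{01} = 0.30 - 0.13 = 0.17, \qquad p_{00} = 1 - 0.20 - 0.30 + 0.13 = 0.63.$$
The four cells sum to one and reproduce both marginals, so the midway table is a valid joint distribution for these inputs.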
This role is particularly relevant in anatomical datasets with partial or unknown pairing, where neither complete independence nor maximal bilateral concordance is anatomically or empirically defensible. By anchoring study-level estimation at the midpoint of the admissible range, the midway hypothesis provides a transparent and reproducible basis for subsequent meta-analytic pooling, while allowing sensitivity analyses across the full dependence spectrum λ ∈ [0, 1].
Subsequent sections examine how laterality and bilateral prevalence vary across this admissible range and assess the robustness of conclusions drawn under the midway assumption, particularly in settings involving rare variants or imbalanced marginals.
2.5.2. The Midway Hypothesis Under Functional Asymmetry
In anatomically asymmetric systems, differences between $p_L$ and $p_R$ may arise from structural or functional lateralization (e.g., hemispheric specialization or asymmetric organ arrangement). In such contexts, it may be desirable to define a biologically informed reference dependence level reflecting expected concordance under asymmetric predisposition.
One pragmatic approach is to define an asymmetry-adjusted reference value of λ such that the implied discordant probabilities preserve the observed marginal imbalance while avoiding implicit equal weighting of concordance extremes. For example, the reference may be shifted in proportion to the marginal imbalance, where δ is a sensitivity coefficient constrained to ensure that the adjusted reference remains within [0, 1]. This formulation preserves feasibility while allowing the reference point to shift modestly in proportion to the degree of marginal asymmetry.
Alternatively, independence (λ = 0) may serve as a biologically neutral baseline in systems where left and right development are assumed to be mechanistically independent despite marginal imbalance.
Marginal asymmetry ($p_L \neq p_R$) and concordance structure (λ) are conceptually distinct: the former captures directional predisposition, whereas the latter captures within-individual dependence conditional on those predispositions. The feasibility framework accommodates both without requiring symmetry assumptions.
2.5.3. Association-Scale Midpoint for Functionally Asymmetric Systems
The choice λ = 0.5 defines a midpoint on the bilateral probability scale within the Fréchet-feasible interval $[\,p_L\,p_R,\ \min(p_L, p_R)\,]$. In functionally asymmetric settings where $p_L \neq p_R$, it may be preferable to define a reference point on an association scale.
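With fixed marginals, define the log odds ratio of association as a function of the feasibility index,
$$\psi(\lambda) = \log \mathrm{OR}(\lambda),$$
where
$$\mathrm{OR}(\lambda) = \frac{p_{11}(\lambda)\,p_{00}(\lambda)}{p_{10}(\lambda)\,p_{01}(\lambda)}.$$
Independence corresponds to $\psi = 0$ at λ = 0. Because $\psi \to \infty$ at the upper Fréchet bound, we define a high-concordance interior anchor $\lambda_{\mathrm{hi}} < 1$, for example, a value of λ just below 1, and set $\psi_{\mathrm{hi}} = \psi(\lambda_{\mathrm{hi}})$.
An association-scale midpoint $\psi_{\mathrm{mid}}$ is then defined by
$$\psi_{\mathrm{mid}} = \tfrac{1}{2}\,\psi_{\mathrm{hi}}.$$
Mapping this value to the feasibility index yields
$$\lambda_{\mathrm{mid}} = \psi^{-1}(\psi_{\mathrm{mid}}).$$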
This defines a reference dependence level centered on the log-odds-ratio scale while remaining fully consistent with the feasibility constraints under marginal imbalance.
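One way the association-scale midpoint might be computed is by bisection on the log odds ratio, sketched below. Several elements are assumptions made for illustration rather than details from the study: the function names, the cross-product form of the odds ratio, the anchor value `lam_hi=0.99`, and the bisection tolerance. The sketch also assumes both marginals lie strictly between 0 and 1.

```python
import math


def log_odds_ratio(p_left, p_right, lam):
    """Log cross-product odds ratio of the reconstructed 2x2 table."""
    p_indep = p_left * p_right
    p11 = p_indep + lam * (min(p_left, p_right) - p_indep)
    p10, p01 = p_left - p11, p_right - p11
    p00 = 1.0 - p_left - p_right + p11
    return math.log(p11 * p00) - math.log(p10 * p01)


def association_midpoint(p_left, p_right, lam_hi=0.99, tol=1e-10):
    """Index whose log odds ratio is half that at the assumed anchor lam_hi."""
    target = 0.5 * log_odds_ratio(p_left, p_right, lam_hi)
    lo, hi = 0.0, lam_hi
    # The log odds ratio increases with the index, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_odds_ratio(p_left, p_right, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical marginals of 0.20 (left) and 0.30 (right).
print(association_midpoint(0.20, 0.30))
```

The result depends on the chosen anchor, so any analysis using this reference point would need to report the anchor alongside it.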
In subsequent analyses, study-level estimands are first computed under a specified reference value of λ (e.g., the midway hypothesis) and are then synthesized across studies using the meta-analytic models described in Section 2.6, with robustness evaluated over the admissible dependence range λ ∈ [0, 1].
2.6. Meta-Analytic Pooling of Reconstructed Estimands
Once study-level joint probabilities have been reconstructed under a specified feasibility index λ, the derived estimands (laterality and bilateral prevalence) are synthesized across studies using standard random-effects meta-analytic models.
2.6.1. Pooling Laterality
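Laterality is quantified by the paired odds ratio,
$$\mathrm{OR}_{\text{paired}} = \frac{p_{10}}{p_{01}},$$
where $p_{10}$ and $p_{01}$ denote the probabilities of the two discordant (unilateral) outcomes. It is analyzed on the logarithmic scale. For study $i$, let
$$y_i = \log \mathrm{OR}_{\text{paired},i}.$$
We assume a conventional random-effects model:
$$y_i = \mu + u_i + \varepsilon_i, \qquad u_i \sim N(0, \tau^2), \qquad \varepsilon_i \sim N(0, v_i),$$
where $\tau^2$ represents between-study heterogeneity and $v_i$ denotes within-study sampling variability. Estimation proceeds using inverse-variance weighting under standard random-effects methodology.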
2.6.2. Pooling Bilateral Prevalence
Bilateralism is quantified by bilateral prevalence $p_{11}$, analyzed on the logit scale:
$$z_i = \operatorname{logit}(p_{11,i}) = \log\frac{p_{11,i}}{1 - p_{11,i}}.$$
A parallel random-effects model is specified:
$$z_i = \mu_B + u_{B,i} + \varepsilon_{B,i}, \qquad u_{B,i} \sim N(0, \tau_B^2),$$
with between-study heterogeneity $\tau_B^2$.
Thus, for any chosen value of λ, study-level laterality and bilateral prevalence are reconstructed and subsequently pooled across studies. Sensitivity analyses are performed by varying λ across its admissible range.
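The pooling step can be illustrated with a short sketch. The text specifies only inverse-variance weighting under a random-effects model; the DerSimonian–Laird moment estimator used below is one common choice and is an assumption here, as are the large-sample variance formulas applied to reconstructed cell probabilities and all function names.

```python
import math


def log_paired_or(p10, p01, n):
    """Log paired odds ratio and its large-sample variance (1/n10 + 1/n01)."""
    return math.log(p10 / p01), 1.0 / (n * p10) + 1.0 / (n * p01)


def logit_prevalence(p11, n):
    """Logit of bilateral prevalence and its delta-method variance."""
    return math.log(p11 / (1.0 - p11)), 1.0 / (n * p11 * (1.0 - p11))


def pool_random_effects(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird tau^2)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q and the moment estimator of between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

In use, each study contributes one `(effect, variance)` pair from `log_paired_or` or `logit_prevalence`, computed from the cell probabilities reconstructed under the chosen λ, and the two lists are passed to `pool_random_effects`. Repeating this over a grid of λ values gives the sensitivity analysis described above.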
2.6.3. Derived Correlation Measures and Non-Invariance
Correlation measures are often used to summarize within-individual dependence in paired data. In the present framework, however, such measures are not treated as primary modeling parameters. Instead, they are derived quantities determined by the reconstructed joint distribution, conditional on the marginal prevalences and the chosen value of the feasibility index λ.
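For paired binary outcomes, a common association measure is the phi coefficient (ϕ), defined as the Pearson correlation between the binary indicators of left-side and right-side presence:
$$\phi = \frac{p_{11} - p_L p_R}{\sqrt{p_L(1 - p_L)\,p_R(1 - p_R)}},$$
where $p_L$ and $p_R$ are the marginal prevalences and $p_{11}$ is the bilateral probability. Substituting $p_{11}(\lambda)$ from Section 2.4 yields an induced correlation $\phi(\lambda)$. Because both numerator and denominator depend on the marginals, the same value of λ generally corresponds to different values of ϕ across prevalence settings. Thus, $\phi(\lambda)$ is not invariant under changes in overall prevalence or left–right imbalance.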
The attainable range of ϕ is itself constrained by the Fréchet bounds. For fixed marginals, the maximum and minimum occur at the feasibility limits of the bilateral probability. Under the midway dependence hypothesis (λ = 0.5), the induced correlation is obtained by evaluating ϕ at the midpoint of the admissible interval for the bilateral probability. This provides a convenient reference value but does not represent a universal or biologically fixed correlation parameter.
These results aid the interpretation of simulation findings but do not alter the central role of λ as the primary modeling assumption.
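The non-invariance of the induced correlation can be seen in a few lines of code. The sketch below assumes the feasibility mapping is linear between independence and the upper Fréchet bound, p11(λ) = pL·pR + λ·(min(pL, pR) − pL·pR); the function names are illustrative and not taken from the authors' scripts.

```python
import math


def p11_from_lambda(lam, p_left, p_right):
    """Bilateral probability at feasibility index lam (assumed linear mapping)."""
    indep = p_left * p_right
    return indep + lam * (min(p_left, p_right) - indep)


def phi(p11, p_left, p_right):
    """Phi coefficient of two binary indicators with the given joint and marginals."""
    scale = math.sqrt(p_left * (1 - p_left) * p_right * (1 - p_right))
    return (p11 - p_left * p_right) / scale


def induced_phi(lam, p_left, p_right):
    """Correlation implied by a given feasibility index and marginals."""
    return phi(p11_from_lambda(lam, p_left, p_right), p_left, p_right)
```

Evaluating `induced_phi(0.5, 0.10, 0.10)` and `induced_phi(0.5, 0.10, 0.05)` gives different correlations for the same λ, which is the sense in which ϕ is a derived rather than a primary quantity.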
2.7. Exact Feasible Range of the Phi Correlation
For fixed marginal prevalences $p_L$ and $p_R$, the phi coefficient ϕ defined in Section 2.6 is constrained by the Fréchet bounds on the bilateral probability $p_{11}$. We now make these constraints explicit and derive the corresponding admissible range of ϕ.
Because ϕ is a monotone increasing function of $p_{11}$ for fixed marginals, its minimum and maximum values occur at the lower and upper Fréchet bounds, respectively. Substituting the upper bound $p_{11} = \min(p_L, p_R)$ yields the maximum feasible correlation $\phi_{\max}$, while substituting the lower bound $p_{11} = \max(0, p_L + p_R - 1)$ yields the minimum feasible correlation $\phi_{\min}$. Together, these define the full admissible correlation range for paired binary data with the given marginals.
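As a minimal sketch of this calculation, the bounds can be obtained by evaluating the standard phi formula at the two Fréchet limits of the bilateral probability. The function name and argument names are illustrative.

```python
import math


def phi_bounds(p_left, p_right):
    """Minimum and maximum feasible phi for fixed marginals."""
    lower = max(0.0, p_left + p_right - 1.0)   # lower Fréchet bound on p11
    upper = min(p_left, p_right)               # upper Fréchet bound on p11
    indep = p_left * p_right
    scale = math.sqrt(p_left * (1 - p_left) * p_right * (1 - p_right))
    return (lower - indep) / scale, (upper - indep) / scale
```

For instance, `phi_bounds(0.10, 0.05)` returns the admissible correlation interval for a variant present in 10% of left sides and 5% of right sides.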
Both $\phi_{\max}$ and $\phi_{\min}$ depend explicitly on $p_L$ and $p_R$. When $p_L = p_R$, the maximum attainable correlation equals one, but the minimum exceeds −1 unless the common prevalence is 0.5. Under marginal imbalance, the maximum falls strictly below one and the attainable range may be substantially narrower.
Within the feasibility-based parameterization of Section 2.4, the induced correlation corresponding to a given λ is obtained by evaluating ϕ at $p_{11}(\lambda)$. In particular, under the midway dependence hypothesis (λ = 0.5),
$$\phi_{0.5} = \phi\bigl(p_{11}(0.5)\bigr),$$
which lies strictly between $\phi_{\min}$ and $\phi_{\max}$, with its exact value determined jointly by $p_L$ and $p_R$.
For reference, we provide closed-form expressions for $\phi_{\max}$ and $\phi_{\min}$ as functions of the marginals. These expressions are used to interpret the magnitude of correlation implied by feasibility-based assumptions and to illustrate how fixed values of λ translate to different correlation scales across studies. Numerical examples for representative prevalence scenarios are reported in the Supplementary Material.
This characterization reinforces a central implication of the feasibility framework: correlation is not an invariant descriptor of within-individual dependence in paired binary data. Its attainable values are constrained by the marginals, and comparisons based solely on correlation coefficients must therefore be interpreted in light of the underlying prevalence structure.
2.8. Behavior of the Midway Dependence Hypothesis Under Rare and Imbalanced Marginals
The interpretation of any dependence assumption in paired binary data depends critically on the marginal prevalences. This is especially relevant in anatomical studies, where many variants are rare and left–right prevalences may be substantially imbalanced. We therefore examine how the midway dependence hypothesis (λ = 0.5) behaves under such conditions and clarify its implications for derived correlation measures.
Within the feasibility framework, the unidentified joint probability is parameterized along the admissible segment between independence and maximal feasible concordance. The midway hypothesis places the bilateral probability $p_{11}$ at the midpoint of this segment, irrespective of the magnitude or balance of the marginals. Consequently, while the relative position within the feasible range remains fixed, the implied joint structure adapts automatically to the prevalence setting.
For rare variants, where both marginal prevalences $p_L$ and $p_R$ are small, the Fréchet upper bound for $p_{11}$ is $\min(p_L, p_R)$, whereas the independence value $p_L p_R$ is of smaller order. In this regime, the maximum feasible phi correlation approaches $\sqrt{\min(p_L, p_R)/\max(p_L, p_R)}$, which equals one only when the marginals are balanced, whereas the correlation under independence is zero. The midpoint assumption therefore induces a correlation strictly between these extremes, and its magnitude depends on the marginals and their balance.
Marginal imbalance further modifies this relationship. When $p_L \neq p_R$, the admissible range of $p_{11}$, and thus of ϕ, contracts. Under the midway hypothesis, the induced correlation reflects this contraction automatically, without requiring adjustment of λ. Thus, although λ remains fixed at 0.5, the implied association scale expands or contracts according to the marginal structure.
These observations underscore the distinction between the feasibility index and conventional correlation measures. A fixed value of λ does not imply a fixed value of ϕ, particularly under rare or imbalanced marginals. Rather, λ specifies a relative position within the admissible dependence region, whereas ϕ represents the absolute association implied by that position under given marginal constraints.
To aid interpretation, we evaluate exact and approximate expressions for the maximum feasible correlation and for the correlation induced by the midway hypothesis across representative scenarios involving rare variants and marginal imbalance. These results contextualize the magnitude of ϕ observed in simulations and empirical examples, and demonstrate that the midway hypothesis remains a neutral and internally coherent reference even in extreme prevalence regimes.
2.9. Dependence Parameterization Under Rare Variants
To clarify the relationship between the feasibility-based dependence index and conventional correlation measures, we examine the limiting behavior of the phi coefficient in the rare-variant regime. This setting is common in anatomical prevalence studies and provides analytic insight into how feasibility constraints translate into correlation scales.
We consider the regime in which both marginal prevalences are small, $p_L, p_R \ll 1$, allowing for possible left–right imbalance. In this case, the Fréchet upper bound for the bilateral probability satisfies
$$p_{11}^{\max} \approx \min(p_L, p_R)$$
up to terms of second order in the prevalences. Under independence, the joint probability is
$$p_{11}^{\text{ind}} = p_L p_R,$$
which is of smaller order.
Substituting these expressions into the definition of the phi coefficient yields simple approximations. At maximal feasible concordance, the attainable correlation satisfies
$$\phi_{\max} \approx \sqrt{\frac{\min(p_L, p_R)}{\max(p_L, p_R)}},$$
demonstrating the strong influence of marginal imbalance. Even when variants are rare, substantial left–right asymmetry can markedly reduce the maximum attainable correlation.
Under the midway dependence hypothesis (λ = 0.5), the joint probability is approximated by
$$p_{11}(0.5) \approx \tfrac{1}{2}\min(p_L, p_R),$$
and the induced correlation satisfies
$$\phi_{0.5} \approx \tfrac{1}{2}\,\phi_{\max}.$$
Thus, in the rare-variant limit, the midway hypothesis corresponds approximately to halving the maximum feasible correlation, irrespective of absolute prevalence. This provides an intuitive interpretation of λ = 0.5 when exact algebraic expressions are cumbersome.
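The quality of the rare-variant approximation can be checked numerically. The sketch below compares the exact maximum feasible phi, obtained by evaluating the standard phi formula at the upper Fréchet bound, with the square-root ratio approximation; the function names are illustrative.

```python
import math


def phi_max_exact(p_left, p_right):
    """Exact phi at the upper Fréchet bound p11 = min(pL, pR)."""
    small, large = sorted((p_left, p_right))
    return math.sqrt(small * (1 - large) / (large * (1 - small)))


def phi_max_rare(p_left, p_right):
    """Rare-variant approximation: sqrt(min / max) of the marginals."""
    return math.sqrt(min(p_left, p_right) / max(p_left, p_right))
```

Comparing `phi_max_exact(0.02, 0.01)` with `phi_max_rare(0.02, 0.01)` shows how close the two are at low prevalence; the gap widens as the prevalences increase.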
These approximations are used solely for analytic insight. All estimation and numerical evaluations rely on exact expressions accounting for finite-prevalence corrections, which are reported in the Supplementary Material.
By making explicit the rare-variant behavior of ϕ and λ, this section reinforces a central implication of the feasibility framework: correlation is inherently marginal-dependent. Its scale and interpretation cannot be separated from prevalence. In contrast, the dependence index λ remains invariant across prevalence regimes and therefore provides a more stable basis for modeling unreported joint dependence.
2.10. Simulation Study Design
Simulation studies were conducted to evaluate the behavior of laterality and bilateral prevalence when joint left–right information is unreported and dependence must be reconstructed under assumed values of the feasibility index λ. The design mirrors the analytical framework developed above and permits direct comparison between gold-standard inference based on fully observed joint data and reconstructed inference based on marginal data alone.
For each simulated study, marginal prevalences $p_L$ and $p_R$ were specified to represent anatomically realistic scenarios, including rare variants, balanced and imbalanced marginals, and varying overall occurrence. Given these marginals, a joint distribution was generated by selecting a value of λ and computing $p_{11}(\lambda)$ using the feasibility-based parameterization (Section 2.4). The remaining joint probabilities were determined uniquely by the marginal constraints.
Individual-level paired outcomes were then sampled from the resulting multinomial distribution over the four joint categories. From these complete data, gold-standard values of the paired odds ratio and bilateral prevalence were computed directly.
To emulate reporting limitations in primary anatomical studies, joint counts were subsequently discarded, retaining only marginal prevalences. Laterality and bilateral prevalence were then reconstructed under assumed dependence scenarios, including the midway hypothesis (λ = 0.5) and alternative values spanning the admissible range.
Simulated studies were combined using standard meta-analytic procedures. Laterality was analyzed on the log-paired odds ratio scale using inverse-variance weighting, and bilateral prevalence on the logit scale. Random-effects models were applied throughout to accommodate between-study heterogeneity.
Performance was assessed by comparing reconstructed estimates with gold-standard values across replicates. Evaluation metrics included bias, root mean squared error, confidence interval coverage, and indicators of numerical instability, particularly under rare-variant and strong-dependence conditions.
By explicitly separating data generation, information removal, dependence-based reconstruction, and meta-analytic aggregation, this design provides a controlled framework for quantifying the inferential consequences of feasibility-based assumptions and for interpreting the sensitivity analyses reported in the Results.
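The four stages of the design (generation, information removal, reconstruction, comparison) can be sketched for a single study as follows. This is an illustrative Python outline rather than the archived R code; it assumes the linear feasibility mapping between independence and the upper Fréchet bound, and all names are hypothetical.

```python
import math
import random


def joint_probs(lam, p_left, p_right):
    """(both, left-only, right-only, neither) under the assumed linear mapping."""
    indep = p_left * p_right
    p11 = indep + lam * (min(p_left, p_right) - indep)
    p10 = max(0.0, p_left - p11)    # clamp guards against rounding below zero
    p01 = max(0.0, p_right - p11)
    p00 = max(0.0, 1.0 - p11 - p10 - p01)
    return p11, p10, p01, p00


def simulate_counts(n, lam_true, p_left, p_right, rng):
    """Multinomial sample of n individuals over the four joint categories."""
    draws = rng.choices(range(4), weights=joint_probs(lam_true, p_left, p_right), k=n)
    return [draws.count(cell) for cell in range(4)]


def gold_standard(counts):
    """Paired odds ratio and bilateral prevalence from the full joint table."""
    n11, n10, n01, _ = counts
    paired_or = n10 / n01 if n01 > 0 else math.inf
    return paired_or, n11 / sum(counts)


def reconstruct(counts, lam_assumed):
    """Discard joint counts, keep the marginals, rebuild under lam_assumed."""
    n11, n10, n01, _ = counts
    n = sum(counts)
    p_left, p_right = (n11 + n10) / n, (n11 + n01) / n
    p11, p10, p01, _ = joint_probs(lam_assumed, p_left, p_right)
    paired_or = p10 / p01 if p01 > 0 else math.inf
    return paired_or, p11
```

A replicate would call `simulate_counts(500, 0.7, 0.10, 0.06, random.Random(1))`, then compare `gold_standard(counts)` with `reconstruct(counts, 0.5)`. The infinite odds ratio returned when a discordant cell is empty corresponds to the numerical instability tracked among the performance metrics.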
2.11. Propagation of Uncertainty in the Dependence Assumption and Unequal Marginals
The analyses described thus far treat the feasibility-based dependence index λ as a fixed working assumption. In practice, however, uncertainty about within-individual dependence may itself be substantial, particularly when joint left–right information is entirely unreported. To assess the robustness of laterality inference to such uncertainty, we extended the deterministic framework by allowing λ to vary stochastically rather than remaining fixed.
Uncertainty in the dependence assumption was modeled on the logit scale. Specifically, we assumed that logit(λ) follows a normal distribution with specified mean and variance, and mapped realizations back to the unit interval via the inverse logit transformation. This construction ensures that all sampled values of λ remain within the admissible unit interval, while allowing flexible control over the degree of uncertainty around a chosen reference value, such as the midway hypothesis.
For each scenario, we evaluated the impact of uncertainty in λ on the precision of laterality inference by computing the expected standard error of the log paired odds ratio as a function of the induced variance in λ. This approach parallels classical analyses of uncertainty propagation in paired continuous outcomes but is adapted here to the feasibility-based dependence framework for paired binary data.
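A minimal Monte Carlo sketch of this propagation step is given below. It assumes the linear feasibility mapping between independence and the upper Fréchet bound and the usual large-sample variance of the log paired odds ratio (the sum of reciprocal discordant counts); function names, the number of draws, and the seed are illustrative choices, not the authors' settings.

```python
import math
import random


def sample_lambda(mean_logit, sd_logit, size, rng):
    """Draw lambda values whose logit is normal; results lie in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-rng.gauss(mean_logit, sd_logit)))
            for _ in range(size)]


def se_log_paired_or(lam, p_left, p_right, n):
    """Large-sample SE of the log paired odds ratio at feasibility index lam."""
    indep = p_left * p_right
    p11 = indep + lam * (min(p_left, p_right) - indep)
    p10 = p_left - p11
    p01 = p_right - p11
    # Requires lam < 1 so that both discordant probabilities stay positive.
    return math.sqrt(1.0 / (n * p10) + 1.0 / (n * p01))


def expected_se(mean_logit, sd_logit, p_left, p_right, n, draws=2000, seed=1):
    """Average SE over logit-normal draws of lambda."""
    rng = random.Random(seed)
    lams = sample_lambda(mean_logit, sd_logit, draws, rng)
    return sum(se_log_paired_or(lam, p_left, p_right, n) for lam in lams) / draws
```

Setting `mean_logit=0` centers the distribution on the midway hypothesis; increasing `sd_logit` widens the uncertainty, and `expected_se` can then be tabulated against `sd_logit` for a given pair of marginals.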
In addition to uncertainty in dependence, we examined the effect of unequal marginal prevalences on the behavior and interpretability of the midway dependence hypothesis. Marginal asymmetry was summarized using the ratio of the two side-specific prevalences. Across a range of imbalance scenarios, we evaluated (i) the correlation implied by the midway dependence assumption as a function of marginal imbalance, and (ii) the relative precision of laterality estimates under the midway hypothesis compared with independence, expressed as the ratio of standard errors.
These analyses characterize how both uncertainty in the dependence assumption and asymmetry in marginal prevalences influence the stability and precision of laterality inference. They also clarify the conditions under which the midway dependence hypothesis provides a practically useful reference point, and those under which laterality estimates become inherently unstable regardless of the assumed dependence structure.
2.12. Simulation of Structured Heterogeneity in Within-Subject Dependence Marginals
To evaluate the impact of biologically structured heterogeneity in within-individual dependence, we extended the baseline simulation design to allow the feasibility-based dependence index λ to vary systematically across subgroups within studies. This extension was motivated by the possibility that bilateral concordance may differ by age (e.g., younger versus older individuals), even when only pooled marginal prevalences are reported.
For each simulated study, individuals were partitioned into two strata representing age groups. Let $g \in \{1, 2\}$ index strata within study $i$. A study-specific mixture proportion $\pi_i$ determined the relative size of the strata, with $n_{i1} = \pi_i n_i$ and $n_{i2} = (1 - \pi_i) n_i$, where $n_i$ denotes the total study sample size. In primary scenarios, marginal left- and right-side prevalences were held constant across strata to isolate the effect of heterogeneity in dependence; additional sensitivity scenarios allowed modest shifts in marginal prevalences between strata to reflect age-related differences in overall occurrence.
Stratum-specific joint distributions were constructed using the feasibility-based parameterization in Section 2.4, assigning distinct dependence parameters $\lambda_g$ to each stratum. Specifically, the bilateral joint probability within stratum $g$ was defined as
$$p_{11,g} = p_{11}(\lambda_g),$$
with remaining joint probabilities determined by the marginal constraints. Scenarios included homogeneous dependence ($\lambda_1 = \lambda_2$), moderate heterogeneity, and strong heterogeneity.
Individual-level paired binary outcomes were generated independently within each stratum by multinomial sampling from the corresponding joint distribution. Stratum-specific counts were then aggregated to produce pooled study-level joint tables, from which gold-standard laterality (paired odds ratio) and bilateral prevalence were computed. To mimic typical reporting limitations, joint information was subsequently discarded, retaining only pooled marginal prevalences. Laterality and bilateral prevalence were then reconstructed under working assumptions imposing a single study-level dependence parameter λ, including independence (λ = 0), the midway dependence hypothesis (λ = 0.5), and maximal feasible concordance (λ = 1). Reconstructed estimates were compared with pooled gold-standard quantities across simulation replicates.
Meta-analytic aggregation proceeded as in the baseline simulation using inverse-variance weighting for the log paired odds ratio and logit transformation for bilateral prevalence under random-effects models. Performance metrics included bias, root mean squared error, confidence interval coverage, and numerical instability indicators (e.g., near-vanishing discordant cells under reconstruction). This design quantifies the inferential consequences of unrecognized age-related heterogeneity in within-individual dependence when only pooled marginal data are available.
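The effect of mixing two strata can be illustrated at the level of expected probabilities. The sketch below assumes the linear feasibility mapping between independence and the upper Fréchet bound and marginals held constant across strata (the primary scenario); under those assumptions the pooled bilateral probability corresponds to a single effective index equal to the mixture-weighted average of the stratum values. Function names are illustrative.

```python
def p11_from_lambda(lam, p_left, p_right):
    """Bilateral probability at feasibility index lam (assumed linear mapping)."""
    indep = p_left * p_right
    return indep + lam * (min(p_left, p_right) - indep)


def pooled_bilateral(mix, lam_a, lam_b, p_left, p_right):
    """Expected pooled bilateral probability for two strata with shared marginals."""
    return (mix * p11_from_lambda(lam_a, p_left, p_right)
            + (1.0 - mix) * p11_from_lambda(lam_b, p_left, p_right))


def effective_lambda(p11, p_left, p_right):
    """Study-level feasibility index implied by a pooled bilateral probability."""
    indep = p_left * p_right
    return (p11 - indep) / (min(p_left, p_right) - indep)
```

With equal strata and `lam_a = 0.2`, `lam_b = 0.8`, the pooled table sits at an effective index of 0.5, so a midway reconstruction recovers the pooled bilateral prevalence in expectation. When marginals differ between strata this linear averaging no longer holds exactly, which is why the sensitivity scenarios are evaluated by simulation.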
2.13. Computational Implementation and Software
All analytical derivations, simulations, and graphical displays were implemented using the R statistical environment (version 4.2.2; R Foundation for Statistical Computing, Vienna, Austria). Data manipulation and aggregation were performed using the dplyr and tidyr packages, and all figures were produced using ggplot2. Multinomial sampling for paired binary outcomes was carried out using base R functions.
Simulation studies were implemented to evaluate the behavior of laterality and bilateral prevalence estimands across the full admissible range of within-individual dependence. Values of the feasibility-based dependence index λ were varied on a fine grid to ensure smooth and interpretable summaries, and simulation settings were chosen to reflect realistic anatomic prevalence scenarios. All simulation code was written to ensure reproducibility and consistency with the analytical framework described in Sections 2.1 through 2.11.
As a supportive aid during manuscript development, ChatGPT (version 5.2; OpenAI, San Francisco, CA, USA) was used for assistance in code structuring, language refinement, and consistency checking of analytical descriptions. All methodological choices, mathematical formulations, simulation designs, and interpretations were conceived, verified, and approved by the authors. ChatGPT was not used to generate data, perform statistical analyses, or determine scientific conclusions.
2.14. Code Availability
All R scripts required to reproduce the simulations, analytical calculations, and figures presented in this manuscript are archived at Zenodo (DOI: https://doi.org/10.5281/zenodo.18825830). The repository contains the complete workflow, including scripts for all main and Supplementary Figures. Random seeds are fixed to ensure full computational reproducibility. The code is distributed under the MIT License.
4. Discussion
4.1. Principal Findings: Complementarity and Structural Instability
This study addresses a structural limitation in anatomical prevalence research: the absence of reported joint left–right outcomes. By treating within-individual dependence as a bounded feasibility problem rather than an estimable parameter, we clarify how laterality and bilateralism behave under structural non-identifiability. Our results demonstrate a fundamental asymmetry between these endpoints: bilateral prevalence varies smoothly across the admissible dependence range, whereas laterality exhibits intrinsic boundary instability. The midway dependence hypothesis provides a neutral and numerically stable reference within this constrained space. The following sections discuss the theoretical, practical, and methodological implications of these findings [2,16].
The smooth and predictable behavior of bilateral prevalence across the entire admissible dependence range reflects its direct dependence on the joint probability of bilateral occurrence. Laterality, by contrast, is inherently sensitive to how probability mass is allocated between discordant and concordant outcomes, because it is determined exclusively by individuals exhibiting unilateral expression. This differential behavior is not a modeling artifact, nor a consequence of the chosen parameterization, but a structural property of paired binary anatomic data. Similar distinctions between symmetry and asymmetry at the individual level have long been emphasized in classical anatomical and morphological analyses [1,2].
From an interpretive perspective, these findings underscore the importance of treating laterality and bilateralism as distinct, complementary endpoints rather than interchangeable descriptors of anatomic variation. Reporting laterality without accounting for bilateralism, or vice versa, risks conflating asymmetric and symmetric manifestations and obscuring the role of within-individual dependence [3]. This risk is particularly salient in meta-analyses based on side-specific prevalence data, where the underlying joint structure is unobserved and the two outcomes may behave very differently under the same dependence assumptions.
A central finding of this work is the structural instability of laterality as within-individual dependence approaches its upper feasibility boundary. As dependence increases toward maximal feasible concordance, one of the discordant outcome categories—left-only or right-only manifestation—necessarily becomes sparse or empty, unless marginal prevalences are exactly balanced. Under these conditions, the paired odds ratio diverges, reflecting the vanishing information available to support directional asymmetry.
This boundary degeneracy is geometric in nature and arises directly from the Fréchet constraints imposed by the marginal prevalences, rather than from any particular modeling choice or estimation strategy. Importantly, the resulting instability persists even at moderate sample sizes and cannot be eliminated by alternative estimators, continuity corrections, or variance-stabilizing transformations. It therefore reflects an intrinsic limitation of laterality as an estimand when joint left–right data are unavailable and strong within-individual dependence is anatomically plausible.
Similar boundary phenomena have been described in other settings involving paired or clustered binary outcomes, particularly under extreme association structures [17]. In the anatomical context, however, this behavior has specific interpretive consequences: when bilateral concordance is high, laterality estimates become increasingly sensitive to unobserved dependence assumptions and may be dominated by a small number of unilateral cases. In contrast, bilateral prevalence remains well behaved across the entire admissible dependence range, reinforcing its role as a stable and complementary endpoint for meta-analysis of paired anatomic data.
Having established the structural instability of laterality and the stability of bilateral prevalence, we next consider the role of the midway dependence hypothesis as a practical working reference.
4.2. The Midway Hypothesis and Robustness Under Uncertainty and Marginal Imbalance
When joint left–right data are unreported, within-individual dependence is fundamentally unidentified and must be assumed rather than estimated. The feasibility-based framework developed here makes this assumption explicit by parameterizing the admissible joint distributions through the dependence index λ ∈ [0, 1], which spans the entire Fréchet-feasible range determined by the marginal prevalences [12,13,14,15].
Within this framework, the midway dependence hypothesis (λ = 0.5) occupies a natural and interpretable position. It corresponds to the midpoint of the feasible segment defined by the Fréchet bounds, thereby avoiding commitment to either independence or maximal feasible concordance in the absence of empirical joint information. Importantly, the midway hypothesis is not proposed as a biological model of symmetry, nor as an estimate of a true underlying correlation. Rather, it serves as a neutral, feasibility-based reference that makes the dependence assumption transparent and reproducible.
An analogous midpoint principle has previously been proposed for continuous repeated-measures meta-analysis under missing correlation information, where it was shown to balance variance attenuation between independence and perfect pairing [16]. In the present paired binary setting, the midway assumption plays a similar conceptual role, while remaining grounded in feasibility constraints rather than correlation scales. This distinction explains why the correlation implied by λ = 0.5 varies across studies with different marginal prevalences, and why interpreting λ as a surrogate correlation would be inappropriate.
Accordingly, the midway dependence hypothesis should be understood as a principled working reference that facilitates interpretable inference and structured sensitivity analysis under incomplete reporting, rather than as a claim about anatomical truth or the actual degree of left–right association. In settings where joint outcomes cannot be reconstructed—such as many osteological and prevalence-based datasets—it provides a transparent and internally consistent basis for analysis while explicitly acknowledging the limits imposed by the available data.
When λ varies across age strata, the pooled marginals correspond to a mixture of joint structures, so λ should be interpreted as a study-level working position in the feasible space rather than a homogeneous biological constant. Under such mixtures, laterality can shift because discordant mass is redistributed across strata, whereas bilateral prevalence remains comparatively stable due to its linear dependence on λ under the feasibility mapping.
Allowing uncertainty in the dependence assumption provides additional insight into the stability of laterality inference when joint left–right data are unreported. When the dependence parameter λ is treated as uncertain rather than fixed, moderate variability around the midway hypothesis leads to only modest inflation of the expected standard error of the paired log odds ratio. Substantial loss of precision arises primarily when uncertainty spans a large portion of the admissible dependence range, particularly near the upper feasibility boundary, where discordant outcomes become sparse.
Marginal imbalance further modulates these effects. Because both the feasible dependence range and the induced association scale depend on the marginal prevalences, asymmetry between left and right sides alters the precision and interpretability of laterality estimates under any fixed dependence assumption. These effects are consistent with earlier observations on the sensitivity of paired binary estimands to marginal imbalance and sparse discordant counts [17].
Taken together, these results indicate that the midway dependence hypothesis provides a practically robust reference across many realistic anatomical settings. At the same time, they clarify the conditions under which laterality inference becomes inherently unstable—namely, when marginal imbalance is pronounced and within-individual concordance is high—regardless of the assumed dependence structure.
4.3. Relation to Classical Methods and Existing Frameworks
Laterality in paired binary data is traditionally analyzed using McNemar-type methods that compare discordant outcomes within individuals. When full joint left–right data are available, the paired odds ratio is directly identifiable, and inference requires no additional assumptions beyond pairing.
The setting addressed here differs fundamentally. When only marginal side-specific prevalences are reported, discordant counts are unobserved and cannot be reconstructed without an explicit assumption about the joint distribution. The feasibility-based framework makes this assumption transparent and separates structural uncertainty arising from missing pairing information from estimation of laterality itself [12,13,14,15]. It does not replace classical paired tests but extends paired reasoning to contexts where joint data are unavailable, while preserving a clear distinction between what is observed, assumed, and inferred.
Variance-stabilizing transformations such as the Freeman–Tukey double-arcsine transformation remain widely used in prevalence meta-analysis, despite ongoing debate regarding their statistical properties and interpretability [6,7,8]. Generalized linear mixed models have likewise been advocated as principled alternatives for binomial prevalence synthesis [4,5]. However, irrespective of this debate, these approaches do not encode paired structure and therefore cannot resolve the left–right dependence problem underlying instability in laterality inference when discordant cells are sparse. This limitation is structural rather than technical: such methods were developed for independent binomial proportions and do not preserve the joint anatomy of paired outcomes.
Empirical anatomical studies frequently discuss laterality descriptively but rarely report joint distributions or conduct paired-inference analyses, which may partly explain the limited development of formal meta-analytic approaches in this domain. Previous methodological work on paired or clustered binary outcomes, including correlation-based adjustments and copula or mixed-effects formulations [10,11,12,13,14,15], assumes access to joint information within studies. In contrast, anatomical prevalence research often provides only marginal frequencies. Under these constraints, the principal inferential difficulty is structural non-identifiability rather than variance estimation. Recognizing this distinction clarifies why direct transplantation of correlation-based or transformation-based methods from clinical meta-analysis may be inappropriate in anatomical contexts.
4.4. Structured Heterogeneity and Extensions
The framework developed here is readily extensible to settings in which paired anatomic data are stratified by sex or other grouping variables. Sex-specific laterality and bilateralism are of particular interest in anatomical research, as they may reflect differences in developmental pathways, functional demands, or selective pressures. When sex-stratified marginal prevalences are reported without joint left–right data, the same feasibility constraints apply within each stratum, and the dependence parameterization can be applied independently [1,3].
More broadly, the feasibility-based approach does not preclude the use of alternative dependence models when additional information is available. Copula-based, random-effects, or multivariate models may be appropriate when joint distributions or individual-level data are reported, allowing dependence parameters to be estimated rather than assumed [12,13,14,15]. The present framework is intended specifically for the common situation in which joint information is missing and dependence is therefore unidentifiable.
Importantly, feasibility-based parameterization should be viewed as complementary rather than competing with these approaches. When richer data are available, feasibility constraints become inactive and standard modeling strategies apply. When inference must proceed from marginal data alone, feasibility provides a principled boundary within which dependence assumptions must reside.
Sex-stratified analyses are frequently performed in anatomical and paleopathological research because biological sex may influence both overall prevalence and laterality patterns. Beyond differences in marginal prevalences, it is also biologically plausible that within-individual dependence between sides may differ by sex. For functionally mediated variants—such as those associated with repetitive manual tasks or asymmetric biomechanical loading—sex-specific behavioral patterns could influence bilateral coordination and thus concordance structure. For example, stronger lateralized activity in one sex could plausibly reduce bilateral manifestation of certain variants, corresponding to a lower λ within the feasibility-based parameterization.
Within the present framework, λ indexes concordance conditional on the observed marginals. Nothing in the formulation requires λ to be constant across subgroups; the use of a common midpoint assumption (λ = 0.5) for both sexes is a neutral reference in the absence of subgroup-specific joint information. However, when biological plausibility suggests sex-structured dependence, this assumption should be evaluated rather than imposed.
In studies where full joint left–right data are available for at least a subset of individuals, sex-specific dependence parameters can be estimated directly. Let $p_{B}^{(M)}$ and $p_{B}^{(F)}$ denote observed bilateral prevalences within males and females, respectively, with corresponding marginals $(p_{L}^{(M)}, p_{R}^{(M)})$ and $(p_{L}^{(F)}, p_{R}^{(F)})$. Sex-specific dependence indices may be computed as

$$\hat{\lambda}_{s} = \frac{p_{B}^{(s)} - p_{L}^{(s)}\,p_{R}^{(s)}}{\min\left(p_{L}^{(s)}, p_{R}^{(s)}\right) - p_{L}^{(s)}\,p_{R}^{(s)}}, \qquad s \in \{M, F\}.$$

Statistical comparison of $\hat{\lambda}_{M}$ and $\hat{\lambda}_{F}$ may be performed using bootstrap confidence intervals or likelihood-based tests derived from the multinomial joint distribution within each sex. Alternatively, a constrained model assuming common λ across sexes may be compared against an unconstrained model allowing sex-specific λ values, using information criteria or likelihood ratio testing where feasible.
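As a minimal sketch of this estimation step, the following Python code computes a sex-specific λ from a 2×2 table of joint counts and attaches a percentile bootstrap interval. It assumes that λ is the position of the observed bilateral prevalence between the independence value and the maximal-concordance value, consistent with how the feasible range is described in this paper. The function names, the example counts, and the choice of a percentile bootstrap are illustrative assumptions and not part of the original analysis.

```python
import random


def lambda_index(p_bilateral, p_left, p_right):
    """Position of p_bilateral between independence (0) and maximal concordance (1)."""
    lower = p_left * p_right          # bilateral prevalence under independence
    upper = min(p_left, p_right)      # maximal feasible bilateral prevalence
    if upper <= lower:                # degenerate marginals (0 or 1): index undefined
        return float("nan")
    return (p_bilateral - lower) / (upper - lower)


def lambda_from_counts(n_both, n_left_only, n_right_only, n_neither):
    """Estimate lambda from the four joint left-right cell counts of one subgroup."""
    n = n_both + n_left_only + n_right_only + n_neither
    p_bilateral = n_both / n
    p_left = (n_both + n_left_only) / n
    p_right = (n_both + n_right_only) / n
    return lambda_index(p_bilateral, p_left, p_right)


def bootstrap_ci(counts, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for lambda, resampling individuals."""
    rng = random.Random(seed)
    n = sum(counts)
    cells = [0, 1, 2, 3]
    estimates = []
    for _ in range(n_boot):
        draw = rng.choices(cells, weights=counts, k=n)
        resampled = [draw.count(c) for c in cells]
        est = lambda_from_counts(*resampled)
        if est == est:                # skip NaN from degenerate resamples
            estimates.append(est)
    estimates.sort()
    lo = estimates[int((alpha / 2) * len(estimates))]
    hi = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lo, hi


# Hypothetical joint counts: (both, left only, right only, neither)
males = (30, 10, 20, 140)
females = (12, 18, 20, 150)

for label, counts in (("males", males), ("females", females)):
    estimate = lambda_from_counts(*counts)
    low, high = bootstrap_ci(counts, n_boot=500)
    print(f"{label}: lambda = {estimate:.3f}, 95% CI ({low:.3f}, {high:.3f})")
```

Clearly separated intervals for the two sexes would argue for sex-specific working values, whereas heavily overlapping intervals are compatible with a common λ. The bootstrap here is only one of the comparison options mentioned above.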
If substantial sex differences in λ are detected, sex-stratified reconstruction of laterality under sex-specific working values $\lambda_{M}$ and $\lambda_{F}$ may provide more biologically coherent inference. Conversely, if λ estimates overlap substantially, application of a common midpoint assumption may be considered adequate.
These considerations reinforce that marginal asymmetry, age structure, preservation bias, and sex heterogeneity represent distinct layers of uncertainty that can be examined empirically when sufficient joint information is available.
4.5. Applied Implications for Anatomical Meta-Analysis
The results of this study have several implications for the conduct and interpretation of meta-analyses of anatomic variants. First, meta-analyses that pool side-specific prevalences without accounting for within-individual pairing implicitly impose a dependence assumption that is rarely stated explicitly. Depending on the underlying joint structure, such practices may distort inference on both laterality and bilateral prevalence, even when primary data are sound [2, 3].
Second, reporting laterality without accompanying information on bilateral prevalence obscures the distinction between asymmetric and symmetric manifestations. As shown here, laterality and bilateralism respond very differently to uncertainty in within-individual dependence. Reporting both outcomes provides a more complete and anatomically meaningful description of paired variation, particularly when joint left–right data are unavailable.
Third, the widespread practice of adopting fixed correlation assumptions—either explicitly or implicitly—is problematic in paired binary settings. Because the attainable correlation range depends on the marginal prevalences, fixed-correlation heuristics may be infeasible or misleading across heterogeneous studies. Feasibility-based parameterization avoids this inconsistency by adapting automatically to the reported marginals and by making the assumed dependence structure transparent.
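The dependence of the attainable correlation on the marginals can be checked numerically. The sketch below uses the standard Fréchet bounds for two binary margins to compute the attainable range of the phi coefficient and to test whether a fixed correlation is feasible. The function names and the example prevalences are illustrative assumptions.

```python
from math import sqrt


def phi_range(p_left, p_right):
    """Attainable range of the phi coefficient for two binary margins."""
    denom = sqrt(p_left * (1 - p_left) * p_right * (1 - p_right))
    indep = p_left * p_right
    p_both_min = max(0.0, p_left + p_right - 1.0)   # lower Frechet bound
    p_both_max = min(p_left, p_right)               # upper Frechet bound
    return (p_both_min - indep) / denom, (p_both_max - indep) / denom


def is_feasible(rho, p_left, p_right):
    """True if a fixed correlation rho is attainable for these marginals."""
    lo, hi = phi_range(p_left, p_right)
    return lo <= rho <= hi


# Hypothetical studies: a balanced one and a strongly imbalanced one
for p_left, p_right in ((0.20, 0.20), (0.05, 0.30)):
    lo, hi = phi_range(p_left, p_right)
    print(f"p_L={p_left:.2f}, p_R={p_right:.2f}: phi in [{lo:.3f}, {hi:.3f}], "
          f"rho=0.5 feasible: {is_feasible(0.5, p_left, p_right)}")
```

In this example a fixed correlation of 0.5 is attainable for the balanced marginals but lies outside the attainable range for the imbalanced ones, which is the inconsistency that a feasibility-based index avoids.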
The magnitude of the induced correlation under maximal and midway dependence assumptions varies substantially across prevalence scenarios (Supplementary Table S2). In particular, marginal imbalance markedly constrains the attainable correlation range, even for rare variants.
Sensitivity analyses further demonstrate that laterality estimates are highly responsive to dependence assumptions near the upper feasibility boundary, whereas bilateral prevalence remains comparatively stable (Supplementary Table S3). These patterns persist across representative prevalence regimes.
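The contrast between the two outcomes can be illustrated with a short Python sketch that reconstructs the joint distribution from marginals for a grid of λ values. It assumes that λ = 0 corresponds to independence and λ = 1 to maximal concordance, and that laterality is summarized by the ratio of left-only to right-only proportions. The orientation of that ratio, the function name, and the example prevalences are illustrative assumptions.

```python
import math

TOL = 1e-12  # tolerance for treating a discordant proportion as zero


def reconstruct(p_left, p_right, lam):
    """Bilateral prevalence and discordant ratio implied by marginals and lambda."""
    indep = p_left * p_right
    upper = min(p_left, p_right)
    p_both = indep + lam * (upper - indep)      # linear in lambda
    left_only = p_left - p_both
    right_only = p_right - p_both
    if right_only <= TOL:
        # Denominator vanishes at the boundary: ratio diverges (or is undefined)
        ratio = math.nan if left_only <= TOL else math.inf
    else:
        ratio = max(left_only, 0.0) / right_only
    return p_both, ratio


# Hypothetical marginals: 30% left, 20% right
for lam in (0.0, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0):
    p_both, ratio = reconstruct(0.30, 0.20, lam)
    print(f"lambda={lam:.2f}  bilateral={p_both:.4f}  left/right discordant ratio={ratio:.2f}")
```

Bilateral prevalence increases in equal steps with λ, while the discordant ratio changes slowly at moderate λ and grows without bound as λ approaches 1, because the smaller discordant proportion tends to zero.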
In osteoarchaeological contexts, differential preservation of skeletal elements may distort observed side-specific prevalences. Taphonomic processes, including soil chemistry, burial position, fragmentation, and recovery bias, can produce unequal probabilities of recovering left and right elements. As a result, the observed marginal prevalences $p_{L}$ and $p_{R}$ may not coincide with true population prevalences.
The feasibility bounds developed in Section 2 are defined relative to the observed marginals. In archaeological datasets, therefore, the admissible dependence region characterizes the joint distribution conditional on preservation rather than the underlying biological population. If preservation mechanisms operate independently of pathology status, marginal imbalance primarily reflects differential recovery probabilities rather than intrinsic biological asymmetry. However, if preservation probabilities vary according to lesion presence (for example, when pathological elements are more fragile or more readily identified), selection effects may also distort the apparent concordance between sides.
Differential preservation thus affects inference at two interconnected levels. First, unequal preservation rates ($r_{L} \neq r_{R}$) may induce an artificial imbalance in the observed marginals ($p_{L} \neq p_{R}$), thereby shifting the feasible dependence region itself. Second, if bilateral cases are differentially preserved relative to unilateral cases, the observable concordance structure may diverge from the population-level dependence pattern. Because feasibility bounds are explicit functions of the marginals, any distortion in marginal prevalences necessarily propagates into reconstructed dependence measures. Importantly, however, this does not invalidate the λ-parameterization; rather, λ continues to index concordance within the preservation-conditioned observable space.
To evaluate whether observed marginal imbalance could plausibly be explained by preservational artifacts alone, several sensitivity assessments may be informative. Side-specific completeness can be examined by estimating preservation rates $r_{L}$ and $r_{R}$ from element presence counts independent of pathology; substantial differences would suggest that marginal imbalance may arise from differential preservation. Observed prevalences may then be adjusted by these estimated preservation rates, for example, by computing $p_{L}/r_{L}$ and $p_{R}/r_{R}$ (truncated to the interval [0, 1]) and reassessing whether the imbalance persists after adjustment. In addition, feasibility regions can be recalculated under hypothetical equal preservation ($r_{L} = r_{R}$) to determine whether reconstructed laterality conclusions remain stable under plausible correction magnitudes. Finally, laterality inferences may be re-evaluated across the full admissible range of λ using preservation-adjusted marginals to assess robustness of concordance-related conclusions.
If laterality patterns remain stable across reasonable preservation adjustments, the observed asymmetry is unlikely to be attributable solely to taphonomic processes. Conversely, marked sensitivity to preservation corrections would indicate that biological interpretations should be made cautiously.
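A preservation sensitivity check of this kind might be sketched as follows. The code assumes that the adjustment is a simple ratio of observed prevalence to side-specific preservation rate, truncated to [0, 1], and that λ runs from independence (0) to maximal concordance (1). The ratio form of the adjustment, the function names, and all numerical inputs are illustrative assumptions rather than a prescribed correction.

```python
import math


def adjust_prevalence(p_observed, preservation_rate):
    """Ratio adjustment of an observed prevalence, truncated to [0, 1] (assumed form)."""
    if preservation_rate <= 0:
        raise ValueError("preservation rate must be positive")
    return min(1.0, max(0.0, p_observed / preservation_rate))


def discordant_ratio(p_left, p_right, lam, tol=1e-12):
    """Left-only to right-only ratio implied by the marginals and lambda."""
    indep = p_left * p_right
    p_both = indep + lam * (min(p_left, p_right) - indep)
    left_only = p_left - p_both
    right_only = p_right - p_both
    if right_only <= tol:
        return math.nan if left_only <= tol else math.inf
    return max(left_only, 0.0) / right_only


# Hypothetical observed prevalences and preservation rates
p_left_obs, p_right_obs = 0.30, 0.20
r_left, r_right = 0.90, 0.60

adj_left = adjust_prevalence(p_left_obs, r_left)
adj_right = adjust_prevalence(p_right_obs, r_right)

for lam in (0.0, 0.25, 0.5, 0.75):
    observed = discordant_ratio(p_left_obs, p_right_obs, lam)
    adjusted = discordant_ratio(adj_left, adj_right, lam)
    print(f"lambda={lam:.2f}  observed ratio={observed:.2f}  adjusted ratio={adjusted:.2f}")
```

In this constructed example the apparent left-sided excess disappears after adjustment at every value of λ, which is the pattern that would call for cautious biological interpretation. A ratio that stays clearly away from 1 after adjustment would instead suggest that the asymmetry is not solely preservational.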
4.6. Methodological Contribution, Strengths, and Limitations
Beyond the specific analytical results, the principal contribution of this study lies in reframing the problem of laterality inference from marginal data as one of structural non-identifiability rather than model choice. Most existing approaches to meta-analysis of binary outcomes—including generalized linear mixed models, copula-based formulations, and variance-stabilizing transformations—are designed for settings in which the joint distribution or within-study association structure is observed or estimable, and therefore primarily address variance estimation, heterogeneity, or distributional assumptions. In contrast, many anatomical prevalence studies systematically lack joint left–right information, so that within-individual dependence cannot be estimated at all and must instead be represented explicitly as an assumption.
The feasibility-based parameterization introduced here provides a unified and internally coherent way to represent this uncertainty by spanning the full set of joint distributions compatible with the reported marginals, thereby avoiding infeasible or implicitly contradictory dependence assumptions such as fixed correlations applied across heterogeneous prevalence settings. Within this framework, the midway dependence hypothesis offers a neutral and reproducible working reference that avoids the boundary artifacts associated with extreme assumptions while remaining interpretable and suitable for structured sensitivity analysis.
A further methodological contribution is the demonstration—through analytical derivations, exact feasibility constraints, and simulation studies—that laterality and bilateral prevalence respond in fundamentally different ways to uncertainty in within-individual dependence. This distinction clarifies why methods developed for independent or unpaired proportions cannot fully address the inferential issues posed by paired anatomical data. Taken together, these elements provide a transparent and adaptable framework for meta-analysis of paired binary outcomes under incomplete reporting, allowing anatomically meaningful quantities to be analyzed while explicitly acknowledging the limits imposed by the available data.
The primary strength of this work lies in its explicit integration of feasibility constraints, analytical results, and simulation evidence into a unified framework that is both mathematically coherent and practically interpretable. By mirroring the structure of earlier work on paired continuous outcomes, the approach provides conceptual continuity across outcome types and clarifies how missing joint information constrains inference in paired binary data.
Several limitations should nevertheless be acknowledged. The dependence parameter λ is a working index rather than an estimable quantity when joint left–right data are unavailable, and its interpretation should remain descriptive rather than biological. In addition, the present analyses focus on laterality and bilateralism as binary paired outcomes; extensions to multi-site, multi-category, or higher-dimensional anatomic configurations will require further methodological development.
The midway dependence hypothesis in particular requires careful interpretation. Most fundamentally, the extent of left–right dependence is not identifiable from marginal side-specific prevalences alone. When joint counts are unreported, the true within-individual association cannot be estimated without additional assumptions, regardless of the statistical framework employed. In this setting, the choice is not between estimation and assumption, but between explicit and implicit assumptions. Treating left and right sides as independent corresponds to the lower boundary of the feasible dependence range, while assuming maximal bilateral concordance corresponds to the upper boundary; neither represents a neutral default in anatomical data.
Accordingly, the midway hypothesis does not assert that the true dependence equals λ = 0.5. Instead, it uses the midpoint of the admissible range as a transparent reference when no information is available to favor either extreme. In this sense, λ = 0.5 functions as a feasibility-based working assumption rather than a biological parameter or point estimate.
A related limitation is that correlation measures implied by a given value of λ, such as the phi coefficient, are not invariant across prevalence settings or marginal imbalance. This behavior is not a deficiency of the framework but a consequence of the geometry of paired binary data. As demonstrated analytically and through simulation, laterality measures become intrinsically unstable as maximal feasible concordance is approached, whereas bilateral prevalence remains well behaved across the admissible dependence range. For rare variants and imbalanced sides, the attainable correlation scale itself contracts, leading to smaller values of ϕ even under identical values of λ.
For these reasons, sensitivity analyses toward the feasibility boundaries should be interpreted as stress tests of inferential stability rather than as representations of plausible anatomical scenarios. Within these constraints, the midway hypothesis occupies a region of numerical and inferential stability and provides a consistent reference point across a wide range of realistic anatomical configurations.
5. Conclusions
Unreported within-subject dependence in paired binary anatomical data constitutes a problem of structural non-identifiability rather than a technical issue of statistical estimation. When joint left–right outcomes are unavailable, neither laterality nor bilateralism can be determined from marginal side-specific prevalences alone. Infinitely many joint distributions are compatible with the same marginals, and inference therefore depends necessarily on assumptions about the unobserved joint structure. These constraints arise from the geometry of paired binary data and persist regardless of sample size, modeling strategy, or inferential framework.
Within this structurally constrained setting, laterality and bilateralism behave fundamentally differently under uncertainty about within-individual dependence. Laterality, quantified by the paired odds ratio, depends exclusively on discordant outcomes and becomes intrinsically unstable as feasible dependence approaches its upper boundary and discordant counts vanish. Bilateral prevalence, by contrast, varies smoothly and linearly across the admissible dependence range and remains comparatively robust. Analytical derivations and simulation studies confirm that laterality inference is generally stable under moderate assumptions about dependence, with instability confined to regions near the feasibility boundary.
The feasibility-based framework developed here provides a principled way to parameterize this structural uncertainty. By indexing the admissible joint range through a scalar dependence parameter λ, the approach separates observed marginal information from unobserved joint structure and makes the underlying assumption explicit, constrained, and testable. The midway dependence hypothesis offers a neutral feasibility-based reference when joint information is missing, facilitating interpretable inference and structured sensitivity analysis without asserting biological symmetry or fixed correlation.
By reframing unreported dependence as a bounded structural uncertainty rather than a nuisance parameter, this framework clarifies how pairing affects inference in meta-analysis of anatomical prevalence data. It delineates the conditions under which bilateral prevalence can be interpreted robustly from marginal data alone and identifies the regions where laterality estimates require explicit feasibility-based sensitivity analysis. More broadly, it illustrates how transparent parameterization of structural uncertainty can strengthen inference in paired binary meta-analysis beyond anatomical applications.