Theoretical Framework for the Study of Genetic Diseases Caused by Dominant Alleles

We propose a theoretical basis for analyzing several features of genetic diseases caused by dominant alleles, including disease prevalence, genotype penetrance, and the relationship between causal genotype frequency and disease frequency. In addition, we provide a theoretical framework for accurate diagnosis and clinical approaches for disease study, including two examples in which inaccurate and incomplete diagnoses affect the estimates of disease prevalence. First, the disease iceberg effect shows that disease prevalence is often underestimated due to errors introduced by inaccurate diagnosis. Second, because lifetime risk of disease is cumulative, and therefore an increasing function of age, measurements of prevalence are inaccurate if people of all ages are not included. Finally, we discuss the aggregation of genetic diseases. We identify theoretical and computational deficiencies associated with using the sibling recurrence-risk ratio as a measure of familial aggregation. We develop an alternative concept of aggregation and propose an associated measure that does not suffer from these deficiencies. Throughout, we provide clinicians and researchers with practical implications of our theoretical framework.


Introduction
Determining the genetic basis for diseases is an important part of population genetics and epidemiology, as disorders can be caused both by a person's genetic predisposition and by environmental influences. The accurate allocation of the cause between genes and the environment allows a better understanding of disease mechanisms and promotes techniques for diagnosing and combating disease [1,2].
The analysis of genetic diseases has a long history. Garrod [3] first drew attention to the relation between inheritance of recessive alleles and the appearance of alkaptonuria in human families. This work ultimately led to the understanding that body characteristics (phenotype) are primarily determined by cellular proteins and that genes (genotype) specify these proteins (e.g., enzymes). Genetic diseases are phenotypes; thus, a genetic disease is similar to any phenotype specified by a genotype [4]. Though different genetic diseases may have different biochemical bases, their transmission processes are identical, and each can be characterized as being caused by recessive or dominant alleles. We focus on single-gene disorders caused by dominant alleles and assume an autosomal "two-allele" model for the genotype-phenotype relationship; consequently, we will not discuss multi-gene or sex-linked diseases.
Our purpose is to clarify the relationship between disease-causing genotypes and the presence of the disease, as well as to clarify the role of accurate diagnosis. We identify theoretical and computational deficiencies associated with the current measure of familial aggregation and propose an alternative concept of aggregation and its measure. Our intention is therefore to describe the theoretical issues clearly, to show why accurate diagnosis is lacking in some cases, and to provide replacements for commonly used approaches that suffer from theoretical and computational deficiencies. Throughout, we provide clinicians and researchers with practical implications of our theoretical framework.
In developing our theoretical framework, we will use: probability as a relative frequency; a set theoretic approach to probability; partitions and the law of total probability; conditional probabilities and their properties; population parameters and their estimators; and large-sample-size confidence intervals. As the background for the underlying probability and statistical concepts used, we recommend References [5,6].
Unbiased clinical studies can provide accurate estimates of population parameters (e.g., allele frequency, genotype penetrance, or disease prevalence), which are required for meaningful inferences about disease characteristics. Readers interested in specific protocols for obtaining unbiased clinical studies may see [7] for an in-depth discussion of clinical study design, including strategies for minimizing biases, the statistical analysis of the data, and ethical issues. In addition, we suggest two clinically oriented works that give additional perspectives on specific genetic diseases [1,2].

Disease-Causing Genotypes and Prevalence
We discuss disease-causing genotypes and their relationship to the presence of the associated disease caused by a dominant allele, including their role in determining the disease's frequency in the population.
Traits (phenotypes) are divided into categories determined by genotypes written for convenience as if they consisted of only two alleles [8]. Indeed, most treatments of population genetics [9] focus on a two-allele model, while acknowledging a more complete treatment recognizes that genes have multiple alleles. Nonetheless, even genotype models describing more than two alleles [9] can be reduced to two-allele models if allele contribution is expressed in terms of the functions of the proteins synthesized by each allele.
Let D denote the event that an individual in the population has the disease caused by a dominant allele. Let P(D) denote the probability that any individual in the population has the disease. In the literature, P(D) is sometimes referred to as: (a) the frequency of the disease in the population; (b) the risk of the disease for an individual in the population; (c) the likelihood an individual in the population has the disease; or (d) the prevalence of the disease in the population [10,11].
In our two-allele model, we denote the alleles by C and c and define them as the only two options, where a C allele synthesizes a functioning protein and a c allele makes a non-functioning protein. The C allele is called a dominant allele, and the c allele is called a recessive allele. We will use the following notation for the frequency of these alleles in the population: Let p = P(C) denote the frequency (probability) of the C allele in the population; let q = P(c) denote the frequency (probability) of the c allele in the population. Obviously, p + q = 1, since C and c are the only options in our two-allele model.

Penetrance and Environmental Influence
Penetrance refers to the frequency (probability) of the disease D, given a particular genotype CC, Cc, or cc [8,12,13,14]. Specifically, the penetrance of a particular genotype is the corresponding conditional probability: The penetrance of CC is P(D|CC); the penetrance of Cc is P(D|Cc), which is the same as the penetrance of cC; the penetrance of cc is P(D|cc). For example, P(D|CC) is the frequency of those in the population with genotype CC who have the disease D.
In agreement with some authors [15], we say a specific genotype has full penetrance provided its penetrance is one; for example, P(D|CC) = 1 corresponds to the genotype CC having full penetrance. A specific genotype has partial penetrance provided its penetrance is less than one; for example, P(D|CC) < 1 corresponds to the genotype CC having partial penetrance.
Penetrance is often presented in an imprecise manner [8], which may lead to misunderstanding; our probability-based quantitative description is unambiguous. Indeed, because genotype penetrance indicates the frequency of those people with a particular genotype who have the disease, penetrance is not a measure of disease severity. This means genotype penetrance does not influence whether a person has a severe, moderate, or mild form of the disease. For example, P(D|Cc) = 0.5 means that, of those people with genotype Cc, about 50% are identified with the disease; it does not mean a diseased person with genotype Cc has a moderate form of the disease. Disease severity is instead related to the concepts of "complete/incomplete dominance" and "expressivity" [8].
The concept of penetrance is one way to include an environmental component in the genotype-phenotype correlation. The estimate of penetrance may include a suspected environmental effect on gene expression (e.g., eating gluten is necessary for the onset of Celiac disease [16]). Even so, it is not always possible to accurately identify a disease phenotype, though the genotype might be known. Griffiths et al. [8] describe this as an aspect of penetrance leading to the "subtlety" of the mutant phenotype; we add that incomplete diagnosis can masquerade as partial penetrance (Section 3).
In order to use genotype frequencies to accurately estimate disease prevalence, it is essential that penetrance be accurately estimated (Section 2.2). With that in mind, it is important to note that using clinical studies to estimate the penetrance of a particular genotype requires: (i) the use of a genetic test to identify whether a person has the genotype; (ii) the identification of the disease's phenotypes; and (iii) the use of a diagnostic test to determine whether such a person with the genotype has the disease (i.e., exhibits the disease's phenotypes). Thus, the accuracy of diagnosis plays a critical role in estimations of penetrance (Section 3).
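The estimation step described above can be sketched numerically. The following minimal Python sketch assumes a hypothetical unbiased sample of individuals who carry a given genotype, some of whom are diagnosed with the disease, and computes the penetrance estimate with a large-sample (normal-approximation) confidence interval. The function name, the sample counts, and the choice of the Wald interval are illustrative assumptions and are not taken from the text.

```python
import math


def penetrance_estimate(n_diseased, n_genotype, z=1.96):
    """Estimate P(D | genotype) with a large-sample (Wald) confidence interval.

    Assumes an unbiased random sample of individuals carrying the genotype;
    z = 1.96 corresponds to an approximate 95% interval.
    """
    if n_genotype <= 0 or not 0 <= n_diseased <= n_genotype:
        raise ValueError("need n_genotype > 0 and 0 <= n_diseased <= n_genotype")
    p_hat = n_diseased / n_genotype
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_genotype)
    # Clip to [0, 1] because a probability cannot leave that range.
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical counts: 40 of 100 sampled Cc carriers are diagnosed with the disease.
estimate, lower, upper = penetrance_estimate(40, 100)
print(f"penetrance ~ {estimate:.3f}, 95% CI ~ ({lower:.3f}, {upper:.3f})")
```

The interval is only as trustworthy as the genetic and diagnostic tests behind the counts: misdiagnosed carriers shift the estimate itself, which the interval width does not capture.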

Prevalence of Diseases Caused by Dominant Alleles
A person with either genotype CC or Cc might be affected with a condition sometimes called a dominant disorder [15]. This may occur where the genotype cc produces the wildtype phenotype, but mutation from c to C generates a new version of the c protein that may impair cellular function.
We introduce a parameter that describes the relationship between the penetrance of CC and Cc. The parameter r is the ratio of the penetrance of Cc to the penetrance of CC (Section 2.1); that is,

$$r = \frac{P(D \mid Cc)}{P(D \mid CC)}, \tag{1}$$

where 0 < r ≤ 1 because 0 < P(D|Cc) ≤ P(D|CC). We will use the parameter r in clarifying a theoretical framework for the prevalence of diseases caused by dominant alleles. Because the genotypes CC, Cc, cC, and cc form a partition of the population, prevalence can be written in the form

$$P(D) = p^2\,P(D \mid CC) + 2pq\,P(D \mid Cc) + q^2\,P(D \mid cc). \tag{2}$$

Equation (2) describes the prevalence for diseases (P(D)) in terms of the frequencies of the alleles (p and q) and in terms of the penetrance of the genotypes (P(D|CC), P(D|Cc), and P(D|cc)). The derivation of Equation (2) is provided in Appendix A. For a disease caused by a dominant allele, P(D|cc) = 0, P(D|CC) > 0, and P(D|Cc) > 0. In this case, Equation (2) becomes $P(D) = p^2\,P(D \mid CC) + 2pq\,P(D \mid Cc)$; in other words, disease prevalence (P(D)) in principle equals the sum of the homozygote dominant and heterozygote genotype population frequencies ($p^2$ and $2pq$), where each frequency is rescaled according to its associated penetrance (P(D|CC) and P(D|Cc)). This allows us to introduce a new formulation for P(D). Substituting Equation (1) yields

$$P(D) = p^2\,P(D \mid CC) + 2pq\,r\,P(D \mid CC) = P(D \mid CC)\,(p^2 + 2pqr),$$

which we write in the form

$$P(D) = P(D \mid CC)\,p\,\bigl(2r + (1 - 2r)p\bigr). \tag{3}$$

Incidentally, in the above derivation of Equation (3), we demonstrate that

$$p^2 + 2pqr = p\,\bigl(2r + (1 - 2r)p\bigr);$$

in other words, the expression p(2r + (1 − 2r)p) is simply another way to write the sum of the homozygote dominant and (rescaled by r) heterozygote population frequencies ($p^2$ and $2pqr$). The advantages of using this expression will become apparent in the following discussion. Equation (3) completely characterizes the theoretical prevalence of such a disease by describing it in terms of only three parameters: the penetrance of the genotype CC (P(D|CC)); the parameter r (Equation (1)); and the frequency of the C allele in the population (p = P(C)). It thus identifies the roles of the three important parameters in determining disease prevalence.
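As a numerical check of the two ways of writing prevalence discussed above, the following minimal Python sketch computes P(D) once from the penetrance-weighted genotype frequencies (p^2, 2pq, q^2) and once from the factored form P(D|CC) p (2r + (1 − 2r)p), with r = P(D|Cc)/P(D|CC). The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def prevalence_from_genotypes(p, pen_CC, pen_Cc, pen_cc=0.0):
    """P(D) as penetrance-weighted genotype frequencies p^2, 2pq, q^2."""
    q = 1.0 - p
    return p**2 * pen_CC + 2.0 * p * q * pen_Cc + q**2 * pen_cc


def prevalence_from_ratio(p, pen_CC, r):
    """P(D) = P(D|CC) * p * (2r + (1 - 2r) p), with r = P(D|Cc) / P(D|CC)."""
    return pen_CC * p * (2.0 * r + (1.0 - 2.0 * r) * p)


# Hypothetical values, not taken from any study.
p, pen_CC, pen_Cc = 0.1, 0.8, 0.4
r = pen_Cc / pen_CC  # ratio of heterozygote to homozygote penetrance

print(prevalence_from_genotypes(p, pen_CC, pen_Cc))
print(prevalence_from_ratio(p, pen_CC, r))
```

Both calls should agree up to floating-point rounding for any allele frequency p between 0 and 1, which is the algebraic identity p^2 + 2pqr = p(2r + (1 − 2r)p) evaluated numerically.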
In particular, prevalence has a different structure as a function of p in each of the three cases for r:
(i) If 1/2 < r ≤ 1, then P(D) has a concave down parabolic relationship in terms of p.
(ii) If r = 1/2, then P(D) has a linear relationship in terms of p.
(iii) If 0 < r < 1/2, then P(D) has a concave up parabolic relationship in terms of p.
Figure 1 illustrates how Equation (3) uses allele frequency and the penetrance of the genotype CC to determine disease prevalence, where graphs for the three cases of r are shown: (i) The blue shaded region corresponds to 1/2 < r ≤ 1, where the solid blue curve is r = 1, and the dotted blue curve is an illustrative example (r = 3/4). (ii) The black line corresponds to r = 1/2. (iii) The red shaded region corresponds to 0 < r < 1/2, where the dashed red curve is the lower limit r = 0, which cannot be achieved because r must be positive for diseases caused by dominant alleles. The dotted red curve is another illustrative example (r = 1/4). An advantage of Equation (3) (and Figure 1) over other expressions for prevalence (e.g., Equation (2)) is that it clearly identifies the critical role r plays in determining the prevalence's different structure as a function of p in each of the three cases mentioned. Incidentally, the parameter r has an important role in our alternative new concept of disease aggregation (Section 4).
Figure 1. Prevalence of a disease caused by a dominant allele as a function of p (Equation (3)): (i) the blue shaded region corresponds to 1/2 < r ≤ 1, where the solid blue curve is r = 1, and the dotted blue curve is an illustrative example (r = 3/4); (ii) the black line corresponds to r = 1/2; (iii) the red shaded region corresponds to 0 < r < 1/2, where the dashed red curve is the lower limit r = 0, which cannot be achieved because r > 0 for dominant diseases. The dotted red curve is another illustrative example (r = 1/4). The theoretical prevalence of any disease caused by a dominant allele must be above the dashed red curve and, at most, the solid blue curve. Numerical values on the vertical axis can be assigned once a value of P(D|CC) is known. Note that the largest possible value of P(D) is P(D|CC), which occurs at p = 1, where all three cases coalesce.
An important property for the prevalence of diseases caused by dominant alleles illustrated in Figure 1 is: The theoretical prevalence of any disease caused by a dominant allele must be greater than the dashed red curve (r = 0) and, at most, the solid blue curve (r = 1). That is, P(D) always satisfies

$$P(D \mid CC)\,p^2 < P(D) \le P(D \mid CC)\,p\,(2 - p).$$

Thus, if clinicians obtain an estimate of disease prevalence, $\hat{P}(D)$, that lies outside this interval, it should suggest to them that there likely are diagnostic errors (Section 3) in how P(D) has been estimated.
Moreover, if a disease is thought to be caused by a dominant allele, then clinicians should find that prevalence estimated from diagnostic tests will be close to P(D) described in Equation (3). If it is not, then that should alert clinicians that the diagnostic test is possibly not accurate (Section 3.2).
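The plausibility check described above can be sketched in a few lines of Python. The sketch assumes the limiting curves are obtained by setting r = 0 and r = 1 in P(D) = P(D|CC) p (2r + (1 − 2r)p), which gives P(D|CC) p^2 and P(D|CC) p (2 − p), respectively. The function names and example numbers are hypothetical.

```python
def prevalence_interval(p, pen_CC):
    """Return (lower, upper) limits for P(D) of a dominant-allele disease.

    lower is the r -> 0 limit (not attainable, since r > 0);
    upper is the r = 1 value.
    """
    lower = pen_CC * p**2
    upper = pen_CC * p * (2.0 - p)
    return lower, upper


def estimate_is_plausible(estimated_prevalence, p, pen_CC):
    """True if the estimate lies in the interval (lower, upper]."""
    lower, upper = prevalence_interval(p, pen_CC)
    return lower < estimated_prevalence <= upper


# Hypothetical inputs: allele frequency 0.1 and P(D|CC) = 0.8.
print(prevalence_interval(0.1, 0.8))
print(estimate_is_plausible(0.08, 0.1, 0.8))  # inside the interval
print(estimate_is_plausible(0.20, 0.1, 0.8))  # above the r = 1 curve
```

In practice p and P(D|CC) are themselves estimates, so an estimated prevalence slightly outside the interval is a prompt to re-examine the diagnostic test and the inputs rather than proof of error.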

Necessary and/or Sufficient Genotypes
We develop the theoretical framework characterizing when the disease-causing genotypes are necessary and/or sufficient for the presence of the disease. Let G denote the disease-causing genotypes for a disease caused by a dominant allele; specifically,

$$G = CC \cup Cc \cup cC.$$

To define the logical concepts of "necessary" and "sufficient", we frame the discussion in terms of the events G and D representing the disease-causing genotypes and the presence of the disease, respectively. However, the concepts apply to any two events; for example, in Section 3.2, we discuss whether a positive result in a diagnostic test (denoted by T) is necessary and/or sufficient for the presence of the disease (again, denoted by D).
We say that G is necessary for D provided D ⇒ G. That is, the occurrence of D implies the occurrence of G. In other words, (in this context) if a person has the disease, then the person will (likely) have the disease-causing genotype.
We say that G is sufficient for D provided G ⇒ D. That is, the occurrence of G implies the occurrence of D. In other words, (in this context) if a person has the disease-causing genotype, then the person will (likely) have the disease.
Conditional probability formulations. We now develop equivalent conditional probability formulations for the concepts of "necessary" and "sufficient" discussed above. The formulations apply to any two events, but we will frame the discussion in terms of G and D as above (see Section 3.2 for another example). Observe that P(G|D) = 1 is equivalent to saying that "G is necessary for D". Also, observe that P(D|G) = 1 is equivalent to saying that "G is sufficient for D". The details for the equivalence of these formulations are established in Appendix B.
We now use the formulations to clearly identify when the disease-causing genotypes are necessary and/or sufficient for the presence of a disease caused by a dominant allele. In addition, we include implications for clinicians as the context. For a disease caused by a dominant allele, $P(D \mid \bar{G}) = 0$. Now,

$$P(D) = P(D \cap G) + P(D \cap \bar{G}) = P(D \cap G) + P(D \mid \bar{G})\,P(\bar{G}) = P(D \cap G),$$

which implies

$$P(G \mid D) = \frac{P(D \cap G)}{P(D)} = 1;$$

therefore, G is necessary for D. Moreover,

$$P(D \mid G) = \frac{P(D \cap G)}{P(G)} = \frac{P(D)}{P(G)};$$

hence, G is sufficient for D if and only if P(D) = P(G). Recall that the frequency of the disease-causing genotypes is $P(G) = P(CC) + P(Cc \cup cC) = p^2 + 2pq$; therefore, by Equation (2) (since 0 < P(D|CC) ≤ 1 and 0 < P(D|Cc) ≤ 1), we conclude P(D) = P(G) ⇔ P(D|CC) = 1 and P(D|Cc) = 1.
In summary, the disease-causing genotypes CC and Cc are always necessary for D; they are sufficient for D if and only if the disease-causing genotypes are fully penetrant (P(D|CC) = 1 and P(D|Cc) = 1).
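The summary above can be illustrated numerically. The following minimal Python sketch assumes genotype frequencies p^2 and 2pq for CC and Cc and zero penetrance for cc, and returns the pair (P(G|D), P(D|G)). The function name and the input values are hypothetical.

```python
def genotype_disease_conditionals(p, pen_CC, pen_Cc):
    """Return (P(G|D), P(D|G)) for a dominant-allele disease with P(D|cc) = 0.

    Requires p > 0 and positive penetrances so that P(D) > 0.
    """
    q = 1.0 - p
    prob_G = p**2 + 2.0 * p * q                             # P(G): carriers of CC or Cc
    prob_D_and_G = p**2 * pen_CC + 2.0 * p * q * pen_Cc     # P(D and G)
    prob_D = prob_D_and_G                                   # no disease outside G
    return prob_D_and_G / prob_D, prob_D_and_G / prob_G


# Partial penetrance (hypothetical values): necessary, but not sufficient.
print(genotype_disease_conditionals(0.1, 0.8, 0.4))

# Full penetrance: necessary and sufficient.
print(genotype_disease_conditionals(0.1, 1.0, 1.0))
```

The first component equals one in both cases because every diseased individual carries a disease-causing genotype; the second reaches one only when both penetrances equal one.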
An implication for clinicians is that if they believe the disease-causing genotypes are "necessary, but not sufficient" for the presence of the disease, then P(D|CC) < 1 and/or P(D|Cc) < 1. Two explanations are: there could be other components (e.g., environmental) affecting the presence of the disease, resulting in CC and/or Cc not being fully penetrant; or it could be that the associated diagnostic test lacks the accuracy (Section 3.2) to correctly predict that the genotypes are fully penetrant. Consequently, it is essential that clinicians not use their belief that a disease-causing genotype is partially penetrant as the justification for relying on an inaccurate diagnostic test. In all of these scenarios, it is imperative that clinicians continue their investigations, ultimately seeking a thorough understanding and explanation of the actual relationship between P(D) and P(G).
In Section 3, we provide a similar analysis with D and a diagnostic test's positive result, which we denote by T. Specifically, we demonstrate that accurate diagnosis is equivalent to T being necessary and sufficient for D. This allows us to develop, in Sections 2 and 3, a unified theoretical framework for identifying a genetic disease caused by a dominant allele, as summarized in Section 5.

The Role of Diagnostic Tests
We provide three fundamental concepts for obtaining accurate estimates of disease prevalence: (1) identifying the genetic basis for the disease (Section 3.1); (2) achieving an accurate diagnosis via appropriate tests (Section 3.2); and (3) viewing disease prevalence as a cumulative lifetime risk [11] (Section 3.3).
Before discussing the three fundamental concepts, it is important to recognize that the prevalence of genetic diseases is commonly underestimated [17,18,19,20,21]. This general underdiagnosis of diseases occurs because of inattention to the three fundamental concepts, and specifically because of the difficulty of identifying people with genetic diseases that are either non-lethal or that have symptoms similar to those of other diseases. Last [17] conceived of the analogy of a disease iceberg to describe this general disparity between the perceived and actual prevalence of a disease in the population. In his model, the entire iceberg represents the proportion of the population with the disease (actual prevalence); the "above water portion" of the iceberg corresponds to the diagnosed portion of the population with the disease (perceived prevalence); the "below water portion" of the iceberg corresponds to the portion of the population with the disease, but as yet undiagnosed (Figure 2A).
Theoretical framework. Let D denote the event that an individual from the population has the disease. Let A denote the event that an individual from the population has been diagnosed with the disease. The complement of A (denoted by $\bar{A}$) will therefore be the event that an individual from the population has not been diagnosed with the disease for whatever reason. Consider a disease with a significant iceberg effect, that is to say, the above-water portion of the iceberg is significantly smaller than the below-water portion of the iceberg. For the diseases studied by Last [17], the undiagnosed cases were 2 to 10 times the diagnosed cases. In other words, $P(D \cap A) \ll P(D \cap \bar{A})$, and because A and $\bar{A}$ are mutually exclusive, $P(D) = P(D \cap A) + P(D \cap \bar{A})$. Hence, for a disease experiencing a significant iceberg effect, $P(D \cap A) \ll P(D)$, demonstrating that the perceived prevalence (P(D ∩ A)) consisting of those thought to be affected by the disease will significantly underestimate the actual disease prevalence (P(D)).
Figure 2. An extended disease iceberg analogy differentiating between various levels of identifying a disease based on a particular diagnostic test. Each rectangle (an iceberg) represents the proportion of the population with a given disease (P(D)) and is the same in each panel. The differences between the panels represent the various abilities that particular diagnostic tests may have in identifying the disease. The white region (above water portion) in each rectangle denotes the proportion of the population with the disease and a positive test result (P(D ∩ T)), while the blue region (below water portion) in each rectangle denotes the proportion of the population with the disease, but unknown because they have a negative test result ($P(D \cap \bar{T})$).
The disease iceberg effect is common among diseases caused by dominant alleles and can be significant; indeed, disease prevalence can be underestimated by close to 90% [17,19,22]. Moreover, knowing the ratio of diagnosed-to-undiagnosed cases allows researchers and clinicians to more accurately estimate the actual disease prevalence P(D) [17,19,22], which we illustrate with an example.
Consider a disease with a perceived prevalence of 3.6% (P(D ∩ A) = 0.036). In addition, suppose it is reported that 90% of those with the disease are undiagnosed; that is, $P(D \cap \bar{A}) = 0.9\,P(D)$. Using this information, researchers and clinicians can give a more accurate estimate of the actual disease prevalence P(D). Indeed, since $P(D) = P(D \cap A) + P(D \cap \bar{A}) = 0.036 + 0.9\,P(D)$, it follows that 0.1 P(D) = 0.036, so P(D) = 0.36; thus, the actual disease prevalence is more accurately estimated as 36%, which is 10 times the perceived prevalence.
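The arithmetic in this example generalizes directly. A minimal Python sketch follows, assuming the perceived prevalence and the fraction of diseased individuals who remain undiagnosed are both known; the function name is hypothetical.

```python
def actual_prevalence(perceived_prevalence, undiagnosed_fraction):
    """Solve P(D) = perceived + undiagnosed_fraction * P(D) for P(D).

    perceived_prevalence: proportion of the population diagnosed with the disease.
    undiagnosed_fraction: share of diseased individuals not yet diagnosed (0 <= f < 1).
    """
    if not 0.0 <= undiagnosed_fraction < 1.0:
        raise ValueError("undiagnosed_fraction must satisfy 0 <= f < 1")
    return perceived_prevalence / (1.0 - undiagnosed_fraction)


# Values from the example above: 3.6% perceived prevalence, 90% undiagnosed.
print(round(actual_prevalence(0.036, 0.9), 6))
```

With these inputs the result is approximately 0.36, matching the estimate of 36% given in the text. When no cases are missed (fraction 0), the function returns the perceived prevalence unchanged.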
In Section 3.1, we extend the iceberg analogy and explain that the disease iceberg effect can be reduced by better: (i) disease identification; (ii) knowledge of disease-causing genotypes; and (iii) diagnosis (Section 3.2).

Identifying a Genetic Disease
Identifying a genetic disease requires two key approaches: (i) the assignment of a disease to a particular genotype; and (ii) the performance of accurate diagnostic tests.
The assignment of a disease to a particular genotype. Each person with the disease caused by a dominant allele has a particular genotype (CC or Cc). This genotype can be inferred from a family pedigree, and it can be directly determined by laboratory genotype tests. The genotype can be correlated via other laboratory tests with known symptoms and signs of the disease in order to discover (structurally, immunologically, or physiologically) why the particular genotype generates the disease phenotype. A combination of genetic tests and diagnostic tests is used; these tests must each be sensitive (very high true-positive rate) and specific (very high true-negative rate) for an accurate assignment (Section 3.2). If the various tests are appropriate and accurate, they should all agree with each other within reasonable error bounds. If different tests give different results regarding disease presence, clinicians should determine why the tests differ. These tests plus careful clinical examination should lead to an accurate diagnosis that minimizes the likelihood of misidentification.
The performance of accurate diagnostic tests. Clinical studies are used to estimate disease prevalence (Section 2.2), to determine which symptoms and signs are the most relevant, and to correlate these with the genotypes of disease carriers. Medical diagnoses (e.g., physical biopsies, tests for antibodies, and observation of symptoms) are combined with genotype determination [23].
Theoretical framework. Let T denote the event that a particular diagnostic test yields a positive result for the disease, which can be used to decompose P(D) as

$$P(D) = P(D \cap T) + P(D \cap \bar{T}).$$

We develop an extended disease iceberg analogy to differentiate between various levels of identifying a disease based on a particular diagnostic test. In Figure 2, the rectangle (an iceberg) in each panel represents the proportion of the population with a given disease. In our analysis, both the disease and P(D) are the same in both panels. The differences between the panels represent the various abilities that particular diagnostic tests may have in identifying the disease. The white region (above water portion) in each rectangle denotes the proportion of the population with the disease and a positive test result (P(D ∩ T)), while the blue region (below water portion) in each rectangle denotes the proportion of the population with the disease, but who are unknown because they have a negative test result ($P(D \cap \bar{T})$).
More precisely, we have the following levels of a diagnostic test identifying a disease:
(A) Not well-identified: $P(D) \approx P(D \cap \bar{T})$, which is equivalent to $P(D \cap T) \approx 0$. This implies the prevalence of the disease will be significantly underestimated by the diagnostic test and is equivalent to Last's [17] concept of the disease iceberg effect (Figure 2A).
(B) Well-identified: $P(D) \approx P(D \cap T)$, which is equivalent to $P(D \cap \bar{T}) \approx 0$. This implies the diagnostic test will yield an accurate estimator, via an unbiased clinical study based on the diagnostic test, for the prevalence of the disease (Figure 2B).
The clinical understanding of diseases has progressed over time based on improvements in the understanding of disease mechanisms and also on the development of new diagnostic tools. Thus, we suggest that the panels for the hypothetical disease in Figure 2 should illustrate the progression from "not well-identified" to "well-identified" in an actual disease as diagnostic tests improve in disease identification. In Sections 3.2 and 3.3, we develop a theoretical framework for achieving this, as well as include suggestions/implications for researchers and clinicians.
Dominant fatal diseases, such as Huntington's disease, have a clear genotype-phenotype relationship and straightforward diagnostic approaches; they should, therefore, show minimal iceberg effects, that is, they are "well-identified" diseases (Figure 2B). For others, such as prion diseases [13], the genotype-phenotype relation is not as well identified (Figure 2A). Prion diseases are rare disorders in which abnormally folded proteins cause neural disabilities. An example is Creutzfeldt-Jakob disease [24], in which the disease-causing protein originates from an alteration in allele sequence or is obtained from an exogenous source (e.g., the diet). Only the genetic version of the disorder is relevant here.

Accurate Diagnosis
Again, we let D be the event that an individual in the population has the disease and let T be the event that a diagnostic test yields a positive result for the disease. For example, a diagnostic test might be: (i) a biopsy; (ii) a test for blood-borne substances, such as antibodies associated with the disease; or (iii) a test based on the presence of symptoms associated with the disease [2].
Recall that D and T can be used to partition a group of individuals (e.g., the population as a whole or a clinical study corresponding to a random sample of a population under consideration) of size n as shown in Table 1, where: $n_{11}$ = the number with D and T; $n_{12}$ = the number with D and $\bar{T}$; $n_{21}$ = the number with $\bar{D}$ and T; $n_{22}$ = the number with $\bar{D}$ and $\bar{T}$; and $n = n_{11} + n_{12} + n_{21} + n_{22}$. In addition, recall that the accuracy of the diagnostic test is defined to be

$$\text{Accuracy} = \frac{n_{11} + n_{22}}{n},$$

which measures the frequency of those individuals in the clinical study that are correctly diagnosed. The closer the ratio is to one, the more accurate the diagnostic test is. Only a diagnostic test with $n_{12} \approx 0$ and $n_{21} \approx 0$ will provide an accurate diagnosis (Accuracy ≈ 1). We now discuss the properties of such a test.
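The accuracy calculation can be sketched in Python as follows. The sketch assumes the four cell counts are: diseased and test-positive (n11), diseased and test-negative (n12), not diseased and test-positive (n21), and not diseased and test-negative (n22), with accuracy taken as the fraction of correctly diagnosed individuals. The function name and the counts are hypothetical.

```python
def accuracy(n11, n12, n21, n22):
    """Fraction of individuals correctly classified by the diagnostic test.

    n11: disease present, test positive   n12: disease present, test negative
    n21: disease absent,  test positive   n22: disease absent,  test negative
    """
    n = n11 + n12 + n21 + n22
    if n == 0:
        raise ValueError("the study must contain at least one individual")
    return (n11 + n22) / n


# Hypothetical clinical study of 1000 individuals.
print(accuracy(90, 10, 5, 895))

# A diagonal partition (no false negatives, no false positives).
print(accuracy(100, 0, 0, 900))
```

Note that accuracy alone can look high when the disease is rare, even if many diseased individuals are missed, which is one reason the text goes on to examine the off-diagonal cells separately.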

Necessary and Sufficient Diagnostic Tests
We show that accurate diagnosis is equivalent to a positive test result being both necessary and sufficient for the presence of the disease. Establishing this equivalence leads to several new advances: (i) we will be able to describe the theoretical mechanism for developing an accurate diagnosis (Section 3.2.2); (ii) we will be able to develop a theoretical framework for cumulative lifetime risk and its role in accurate diagnosis (Section 3.3); (iii) together with Section 2.3, we will have a unified theoretical framework for identifying a genetic disease by understanding the relationships between D, G, and T as summarized in Section 5.

Necessary diagnostic tests.
An essential property of a diagnostic test is that it be effective at detecting the disease when the test is administered to an individual having the disease. More precisely, it should be the situation that P(T|D) ≈ 1; otherwise, this particular test should not be used as a diagnostic tool. Sometimes, P(T|D) is referred to as the true-positive rate, as well as the sensitivity of the diagnostic test [25].
Recall that P(T|D) = 1 is equivalent to saying that T is necessary for D (details of the equivalency are in Section 2.3 with G replaced by T); that is, "T is necessary for D" is equivalent to the diagnostic test having high sensitivity. Similarly, one can show that P(T|D) = 1 is equivalent to saying that the false-negative rate is zero ($P(\bar{T} \mid D) = 0$). Therefore, "T is necessary for D" (i.e., the diagnostic test has high sensitivity or has a small false-negative rate) means that: if a person has the disease, then the person will almost always test positive for the disease. When T is necessary for D, the population is partitioned as shown in Table 1 with $n_{12} \approx 0$.

Sufficient diagnostic tests.
A diagnostic test becomes a useful way of identifying those with the disease if P(D|T) ≈ 1. Sometimes, P(D|T) is referred to as the positive predictive rate [25].
Recall that P(D|T) = 1 is equivalent to saying that T is sufficient for D (details of the equivalency are in Section 2.3 with G replaced by T); that is, "T is sufficient for D" is equivalent to the diagnostic test having a high positive predictive rate. Similarly (assuming $P(\bar{D}) \neq 0$), one can show that P(D|T) = 1 is equivalent to saying that the false-positive rate is zero ($P(T \mid \bar{D}) = 0$), and equivalently that $P(\bar{T} \mid \bar{D}) = 1$. Sometimes, $P(\bar{T} \mid \bar{D})$ is called the true-negative rate, as well as the specificity of the diagnostic test [25]. Therefore, "T is sufficient for D" (i.e., the diagnostic test has a high positive predictive rate or a small false-positive rate, or high specificity) means that: if a person receives a positive test, then the person will almost always have the disease. When T is sufficient for D, the population is partitioned as shown in Table 1 with $n_{21} \approx 0$.

Accurate diagnosis: A necessary and sufficient diagnostic test.
The goal of any diagnostic test is for a positive test result to be both necessary and sufficient for an individual to be identified with the disease; that is, T and D partition the population as a diagonal partition (Table 1 with $n_{12} \approx 0$ and $n_{21} \approx 0$), and those individuals in the population under consideration with the disease are precisely those individuals who receive a positive result from the diagnostic test. Only if both sensitivity and specificity are high in a clinical study can clinicians be confident their analyses are accurate.
In summary, the result of the foregoing is that accurate diagnosis depends on T being both necessary and sufficient for D. When this is the case, P(T) = P(D). Thus, an estimator for P(T) based on a clinical study should be close to an estimator for P(D) described by Equation (3).
An implication for clinicians is that if they choose to use a diagnostic test with a positive test result being "not necessary" for the occurrence of the disease, then that is equivalent to them accepting a significant iceberg effect and a large underestimation of the actual prevalence of the disease. Another implication for clinicians is that if they believe a diagnostic test's positive test result is "necessary, but not sufficient" for the occurrence of the disease, then that is equivalent to them accepting that the diagnostic test does not accurately predict whether a person has the disease or not. Instead, we suggest that it is imperative that clinicians continue their investigations, ultimately seeking a diagnostic test that does yield P(T) = P(D).
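The rates discussed in this section can be computed from the same four counts of a clinical study. The following minimal Python sketch assumes the cells are diseased and test-positive (n11), diseased and test-negative (n12), not diseased and test-positive (n21), and not diseased and test-negative (n22), and that every row and column total is nonzero. The function name, dictionary keys, and counts are hypothetical.

```python
def diagnostic_rates(n11, n12, n21, n22):
    """Sample estimates of the conditional probabilities relating T and D."""
    n = n11 + n12 + n21 + n22
    return {
        "sensitivity": n11 / (n11 + n12),               # P(T | D), near 1 when T is necessary
        "false_negative_rate": n12 / (n11 + n12),       # P(not T | D)
        "specificity": n22 / (n21 + n22),               # P(not T | not D)
        "false_positive_rate": n21 / (n21 + n22),       # P(T | not D)
        "positive_predictive_rate": n11 / (n11 + n21),  # P(D | T), near 1 when T is sufficient
        "P_T": (n11 + n21) / n,                         # proportion testing positive
        "P_D": (n11 + n12) / n,                         # proportion with the disease
    }


# Hypothetical study: some false negatives and a few false positives.
for name, value in diagnostic_rates(90, 10, 5, 895).items():
    print(f"{name}: {value:.4f}")
```

Comparing the last two entries shows how far P(T) is from P(D); for a diagonal partition (n12 = n21 = 0) the two coincide and sensitivity, specificity, and positive predictive rate all equal one.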

Estimating Prevalence via a Diagnostic Test
To create a diagnostic test that yields P(T) ≈ P(D), a clinician should begin with a diagnostic test for which T is necessary for D (Table 1 with n 12 ≈ 0). Indeed, if T is not necessary for D, then the diagnostic procedure ought to be rejected outright. When diagnostic tests are first developed, they are likely to have difficulty distinguishing those with the disease from those without it (Table 1 with n 21 not close to 0 and, therefore, n 11 is underestimated). A clinician's goal is therefore to refine the diagnostic test, while keeping in mind accepted clinical study design protocols [7], so that it also ensures T is sufficient for D (Table 1 with n 12 ≈ 0 and n 21 ≈ 0). When this is achieved, clinicians will have created a diagnostic test that accurately predicts disease presence (i.e., the test is ready for use as a diagnostic tool), and P(T) will be close to P(D).
The preceding intuitive discussion connects our theory to a clinician's practice. To our knowledge, we are the first to make this discussion rigorous by developing the theoretical mechanism for how P(T) approaches P(D) as the diagnostic test is refined. We demonstrate that when T is necessary for D (Section 3.2.1), P(T) can be used to provide lower and upper bounds for P(D); moreover, we show that as the false-positive rate P(T|D^c) (the probability of a positive test given the absence of disease) approaches zero, the lower and upper bounds force P(T) to approach P(D). Thus, T will be both necessary and sufficient for D, and consequently, P(T) ≈ P(D). Specifically, the theoretical mechanism is described by

$$\frac{P(T) - \alpha_0}{1 - \alpha_0} \;\le\; P(D) \;\le\; P(T), \tag{4}$$

where α 0 is an upper bound for P(T|D^c); in other words, the false-positive rate is at most α 0 (0 ≤ P(T|D^c) ≤ α 0 ). The derivation of Equation (4) is provided in Appendix C.
Reducing α 0 improves the diagnostic test's accuracy. Moreover, Equation (4) describes the theoretical mechanism by which P(T) approaches P(D) as α 0 becomes smaller (because the lower bound in Equation (4) approaches P(T) as α 0 approaches zero), resulting in the partition of the population induced by T and D approaching a diagonal partition, at which point, T will be both necessary and sufficient for D. The implication is crucial: As the false-positive rate becomes smaller, the probability increases that a positive result in the corresponding diagnostic test will more accurately predict prevalence of the disease.
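The following is a brief sketch of why bounds of this kind hold; it is not a reproduction of Appendix C. It assumes only that T is necessary for D, written as P(T|D) = 1, that the false-positive rate satisfies 0 ≤ P(T|D^c) ≤ α_0 < 1, and that P(T) ≥ α_0. Here D^c denotes the complement of D, and the symbol a is introduced purely for this illustration.

```latex
% Law of total probability with P(T \mid D) = 1 and a = P(T \mid D^c):
\begin{align*}
P(T) &= P(T \mid D)\,P(D) + P(T \mid D^c)\,\bigl(1 - P(D)\bigr)
      = P(D) + a\,\bigl(1 - P(D)\bigr) \;\ge\; P(D), \\
P(D) &= \frac{P(T) - a}{1 - a}.
\end{align*}
% For fixed P(T) \le 1 the map a \mapsto (P(T) - a)/(1 - a) is non-increasing,
% since its derivative is (P(T) - 1)/(1 - a)^2 \le 0. Hence a \le \alpha_0 gives
\[
\frac{P(T) - \alpha_0}{1 - \alpha_0} \;\le\; P(D) \;\le\; P(T).
\]
```

As α_0 tends to zero, the lower bound tends to P(T), so the interval collapses onto P(T).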
Estimation procedure. The above theoretical development suggests the following four-step procedure for clinicians wanting to use a diagnostic test to accurately estimate disease prevalence:
(i) Begin with a diagnostic test for which T is necessary for D. A corresponding clinical study should consist of data resembling Table 1 with n 12 ≈ 0.
(ii) Estimate P(T). Use Table 1 to compute the estimate of P(T) as (n 11 + n 21 )/n.
(iii) Estimate the maximum value of the false-positive rate, which is denoted by α 0 . Use Table 1 to compute, for example, a 95% confidence interval [5,6] for the false-positive rate, and take α 0 to be the maximum of the interval.
(iv) Substitute the estimators of P(T) and α 0 into Equation (4), which yields an interval estimate for P(D).
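The four steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the table layout (n11 = D and T, n12 = D and not T, n21 = not D and T, n22 = not D and not T) is inferred from the estimate of P(T) in step (ii), the Wald interval is one arbitrary choice of 95% confidence interval, the lower bound is taken to be (P(T) − α0)/(1 − α0), and the counts are invented for the example.

```python
import math

def prevalence_interval(n11, n12, n21, n22, z=1.96):
    """Interval estimate for P(D) from a 2x2 table, assuming T is necessary for D.

    Assumed layout: n11 = D and T, n12 = D and not T,
                    n21 = not D and T, n22 = not D and not T.
    """
    n = n11 + n12 + n21 + n22
    p_t = (n11 + n21) / n                     # step (ii): estimate of P(T)
    non_diseased = n21 + n22
    fpr = n21 / non_diseased                  # observed false-positive rate
    # Step (iii): upper end of a Wald 95% interval (an assumed choice).
    alpha0 = min(1.0, fpr + z * math.sqrt(fpr * (1 - fpr) / non_diseased))
    # Step (iv): lower bound (P(T) - alpha0) / (1 - alpha0), clamped at zero.
    lower = max(0.0, (p_t - alpha0) / (1 - alpha0)) if alpha0 < 1 else 0.0
    return p_t, alpha0, lower, p_t

# Hypothetical study of 1000 people in which no diseased person tests negative.
p_t, alpha0, lower, upper = prevalence_interval(n11=350, n12=0, n21=10, n22=640)
print(f"P(T) = {p_t:.3f}, alpha0 = {alpha0:.4f}, {lower:.3f} <= P(D) <= {upper:.3f}")
```

The lower bound is clamped at zero because the formula becomes negative whenever α0 exceeds the estimate of P(T), in which case the data carry no useful lower bound.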

Example 1.
As context, consider a disease caused by a dominant allele with p = 0.2, r = 1 and the genotype CC fully penetrant. Then P(D) = 0.36 (Equation (3)). In principle, an accurate diagnostic test should yield P(T) ≈ P(D) ≈ 0.36. To achieve this, begin with a diagnostic test for which T is necessary for D (Step (i)). Using a corresponding clinical study resembling Table 1 with n 12 ≈ 0, obtain the estimator P(T) ≈ 0.36 (Step (ii)). Figure 3 is an illustration of Equation (4), where the lower bound is the blue curve and the upper bound is the black horizontal line (at P(T) ≈ 0.36). The disease prevalence P(D) lies inclusively between the two bounds, and interval estimates for P(D) (indicated in red) are shown for α 0 = 0.3, 0.2, 0.1, and 0.02. Depending on the diagnostic test and how it is interpreted, false-positive results may generate uncertainty regarding P(D); for example, if the false-positive rate is as high as 0.3 (i.e., α 0 = 0.3), then P(D) is estimated as being inclusively between 0.086 and 0.36 (Steps (iii) and (iv); Figure 3). An interval estimate with such a large spread makes any P(D) estimate unreliable (e.g., the interval does not support claiming P(D) ≈ 0.09). Indeed, such uncertainty should alert clinicians that the diagnostic test is not accurate (T is necessary, but not yet sufficient for D). However, as α 0 is reduced, the test's accuracy improves; at values α 0 ≤ 0.1, the disease prevalence will be estimated more accurately (Figure 3 with α 0 = 0.1 and 0.02), and T will become both necessary and sufficient for D, resulting in P(T) ≈ P(D) ≈ 0.36, as desired.
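The interval estimates in Example 1 can be reproduced numerically. The sketch below assumes the bounds take the form (P(T) − α0)/(1 − α0) ≤ P(D) ≤ P(T), which matches the reported lower limit of 0.086 at α0 = 0.3; the function name is hypothetical.

```python
def prevalence_bounds(p_t, alpha0):
    """Lower and upper bounds for P(D), given P(T) and a cap alpha0 on the
    false-positive rate (assumes T is necessary for D and alpha0 < 1)."""
    lower = max(0.0, (p_t - alpha0) / (1.0 - alpha0))
    return lower, p_t

# Example 1: P(T) is approximately 0.36; shrink the false-positive cap step by step.
for alpha0 in (0.3, 0.2, 0.1, 0.02):
    lower, upper = prevalence_bounds(0.36, alpha0)
    print(f"alpha0 = {alpha0:<4}: {lower:.3f} <= P(D) <= {upper:.2f}")
```

The width of the interval shrinks from about 0.27 at α0 = 0.3 to about 0.01 at α0 = 0.02, which is the narrowing that Figure 3 is described as showing.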
Incidentally, our development of accurate diagnosis applies to any disease, whether it is genetically based or not.

Accurate Diagnosis Requires Cumulative Lifetime Risk
For many disorders, disease prevalence is a cumulative lifetime risk; that is to say, disease prevalence is the likelihood a person from the population will be accurately diagnosed as having the disease at some point during their lifetime. For certain disorders, in particular those caused by dominant alleles, symptoms and the probability of testing positive for the disease (P(T)) show a peak in middle age. This leads to a steady accumulation of cases (of a particular disease) in the population [8,11,13,26,27]. Diagnostic tests for such diseases are administered to people thought to have the disease-causing genotype; these tests yield a result at a specific moment in each person's lifetime. For some disorders (e.g., Huntington's Disease (HD) [28]), the probability of a positive test result (P(T)) increases with age, so young people with the disease-causing genotype may not test positive for the disease. In non-fatal dominant diseases, these negative results are often misinterpreted to mean that such people will never test positive for the disease. Our analysis will make clear that this interpretation is unwarranted and is a source of underestimates of P(D). Figure 4 shows the cumulative lifetime feature of disease prevalence for people with HD. Figure 4A illustrates data for 84 people (ranging in age from 10 to over 80 years old) who at some point developed HD. The maximum proportion was diagnosed at approximately age 50, and by age 80 nearly all of those who would develop HD had been diagnosed. Figure 4B illustrates the corresponding cumulative distribution of diagnosis, indicating that it takes about 80 years for most people with the disease-causing genotype for HD to be identified. This cumulative mechanism means that a negative diagnostic test result at any age below, say 70, does not preclude either a positive diagnostic test result or actual disease itself at a later time. 
Therefore, HD prevalence cannot be accurately estimated by studying only those younger than age 70. This cumulative pattern of diagnosis applies to prion diseases [13] and amyotrophic lateral sclerosis [29], and in general has implications for the estimation of the prevalence of diseases that are detected only later in life.
Figure 4. (A) Age at diagnosis for 84 people who developed HD. (B) Corresponding cumulative distribution of diagnosis. Constructed from data in [28].
Genetic tests at any time will show the presence or absence of the disease-causing genotypes. For a disease such as HD, the CC genotype is unlikely to be found in living people because most individuals with the CC genotype die before birth. The presence of the Cc genotype suggests that the disease will develop in severity over the lifetime of the individual and the true prevalence P(D) is not accurately estimated until all ages have been accounted for [13]. Thus, for individuals with the Cc genotype, the variable appearance of HD over a lifespan is not necessarily a measure of the penetrance of the disease-causing genotype Cc, as disease prevalence may also depend on how carefully clinicians have diagnosed the condition (i.e., how likely it is to obtain a positive diagnostic test result may depend on disease severity and the diagnostic test's ability to detect mild forms of the disease).
Cumulative lifetime risk is best understood as an investigation of the accuracy of diagnosis and the identification of all people who might have the disease. Recall that an accurate diagnosis can be framed in terms of a positive diagnostic test result being both necessary and sufficient for the presence of the disease (Section 3.2.1). The implications are crucial for understanding population disease prevalence. We will show that cumulative lifetime risk is formally and actually equal to population-wide disease prevalence, P(D).
Theoretical framework. The following is a theoretical framework for cumulative lifetime risk analysis. It describes the accuracy of a diagnosis as a function of subject age in terms of two measures of cumulative diagnosis, which we call the cumulative age-true positive rate and the cumulative age-positive predictive rate. The former is an index of the diagnostic test's true-positive rate, and thus of the degree to which the diagnostic test is necessary for demonstrating the disease; the latter is an index of the diagnostic test's positive predictive rate, and thus of the degree to which the diagnostic test is sufficient for demonstrating the disease. For simplicity, we assume that the maximum lifetime of individuals in the population is 100 years.
We define the age-true positive rate, denoted by f tpr (i), to be the conditional probability a person receives a positive test result at age i years old (i = 1, 2, . . . , 100), given the person has the disease; that is to say,

$$f_{tpr}(i) = P\bigl((T \cap \{\text{age } i \text{ years old}\}) \mid D\bigr) \qquad (i = 1, 2, \ldots, 100).$$

Thus, the true-positive rate is the accumulation of all age-true positive rates,

$$P(T \mid D) = \sum_{i=1}^{100} f_{tpr}(i).$$

We define the age-positive predictive rate, denoted by f ppr (i), to be the conditional probability a person has the disease at age i years old, given the person receives a positive test result (i = 1, 2, . . . , 100); that is to say,

$$f_{ppr}(i) = P\bigl((D \cap \{\text{age } i \text{ years old}\}) \mid T\bigr) \qquad (i = 1, 2, \ldots, 100).$$

Thus, the positive predictive rate is the accumulation of all age-positive predictive rates,

$$P(D \mid T) = \sum_{i=1}^{100} f_{ppr}(i).$$

Here are the properties that both the age-true positive rate and the age-positive predictive rate satisfy (to simplify the notation, the function f (i) stands for both f tpr (i) and f ppr (i)):
(i) 0 ≤ f (i) ≤ 1 for each i, because f (i) is a probability.
(ii) The values f (i) sum to one over i = 1, 2, . . . , 100, which is a consequence of the diagnostic test satisfying P(T|D) = 1 (T is necessary for D) and P(D|T) = 1 (T is sufficient for D).
(iii) The function f (i) is bell-shaped, but is not necessarily symmetric. That is, f (i) attains its maximum at some age denoted by m; f (i) will be an increasing function for i < m and a decreasing function for i > m. For diseases with later-in-life detection (e.g., many diseases caused by dominant alleles), m typically occurs during middle age. Figure 5A provides a graph of a typical f (which stands for both f tpr and f ppr ) for diseases with later-in-life detection. For convenience, the function f has been extended to a continuous function defined for all times 0 ≤ t ≤ 100. Indeed, the function f (t) can be thought of as a "best fit curve" using the values f (i) for i = 1, 2, . . . , 100, and f (0) = 0.
Figure 5. (A) Graph of a typical f , which stands for both the age-true positive rate ( f tpr ) and the age-positive predictive rate ( f ppr ). See the text for their descriptions. The function f (t) is bell-shaped, but is not necessarily symmetric, and attains its maximum at some age denoted by m. For a disease with later-in-life detection, m typically occurs during middle age. (B) Graph of a typical F, which stands for both the cumulative age-true positive rate (F tpr ) and the cumulative age-positive predictive rate (F ppr ). See the text for their descriptions. For a disease with later-in-life detection, F is close to one only after middle age.
We define the cumulative age-true positive rate of the disease at age i, denoted by F tpr (i), to be the sum of the age-true positive rates for ages at most i; that is to say,

$$F_{tpr}(i) = \sum_{j=1}^{i} f_{tpr}(j) \qquad (i = 1, 2, \ldots, 100).$$

We define the cumulative age-positive predictive rate of the disease at age i, denoted by F ppr (i), to be the sum of the age-positive predictive rates for ages at most i; that is to say,

$$F_{ppr}(i) = \sum_{j=1}^{i} f_{ppr}(j) \qquad (i = 1, 2, \ldots, 100).$$

Here are properties that both the cumulative age-true positive rate and the cumulative age-positive predictive rate satisfy (to simplify the notation, the function F(i) stands for both F tpr (i) and F ppr (i)):
(i) F(i) is nondecreasing in i, because each f (i) is nonnegative.
(ii) F(100) = 1 when the diagnostic test satisfies P(T|D) = 1 and P(D|T) = 1.
(iii) F(i) will be concave up (increasing at an increasing rate) for 1 ≤ i < m, and will be concave down (increasing at a decreasing rate) for m < i ≤ 100. Figure 5B provides a graph of a typical F (which stands for both F tpr and F ppr ) for diseases with later-in-life detection. For convenience, the function F has been extended to a continuous function defined for all times 0 ≤ t ≤ 100. Indeed, the function F(t) can be thought of as a "best fit curve" using the values F(i) for i = 1, 2, . . . , 100, and F(0) = 0.
In summary, accurate diagnosis (Section 3.2) in the context of a cumulative lifetime risk corresponds to F tpr (100) = 1 and F ppr (100) = 1.
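To make the cumulative rates concrete, the sketch below builds a hypothetical bell-shaped set of age-true positive rates peaking at m = 50 and accumulates them. The Gaussian shape, the peak age, and the spread are assumptions chosen only for illustration; they are not fitted to the HD data in [28].

```python
import math

def cumulative(rates):
    """Running sums F(i) = f(1) + ... + f(i) for rates = [f(1), ..., f(100)]."""
    total, out = 0.0, []
    for value in rates:
        total += value
        out.append(total)
    return out

# Hypothetical bell-shaped age-true positive rates with maximum at age m = 50.
ages = range(1, 101)
raw = [math.exp(-((i - 50) ** 2) / (2 * 12.0 ** 2)) for i in ages]
f_tpr = [v / sum(raw) for v in raw]   # normalise so the rates sum to one

F_tpr = cumulative(f_tpr)
for age in (30, 50, 70, 100):
    print(f"F_tpr({age}) = {F_tpr[age - 1]:.3f}")
```

With these made-up rates, roughly half of the eventual positive results have accumulated by the peak age, so a study that stops at middle age would miss a large share of cases.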
Framing accurate diagnosis as a cumulative lifetime risk has implications for clinicians regarding a diagnostic test's result. For diseases with later-in-life detection (e.g., many diseases caused by dominant alleles), clinicians should be aware of three important and related concepts:
(i) A negative diagnostic test result up to middle age does not indicate that the person will never be accurately diagnosed with the disease during their lifetime. For example, a person may actually have an early form of the disease that is not detected by the diagnostic test; consequently, inadequate testing may prevent treatment for the person during their lifetime. Indeed, because F tpr (t) ≈ 1 and F ppr (t) ≈ 1 only later in life, it is essential to continue testing a person with the disease-causing genotype who receives a negative diagnostic test result well beyond middle age (Figure 5).
(ii) Clinical studies exclusively using people from a specific age group (e.g., only those from 20-30 years old) will suffer from ascertainment bias; hence, such studies will not produce meaningful inferences regarding population disease prevalence (Figure 5). Moreover, clinical studies consisting of people only up to middle age will suffer from ascertainment bias and result in an underestimation of the prevalence of diseases with later-in-life detection. For example, HD prevalence would be underestimated by about 30% if only people up to age 55 were included in the data in [28] (Figure 4B).
(iii) A positive diagnostic test result at any age (in a person with the disease-causing genotype) may also be a false positive and may suggest treatments that will not be necessary. The chances of false positives should thus be minimized at all ages (Figure 5).
In summary, it is important to view the accuracy of diagnosis as a function of subject age in order to ensure that a positive diagnostic test result precisely identifies those individuals who have the disease. That is, the goal of any diagnostic test should be for P(T) to accurately estimate P(D), where P(D) is given by Equation (3).

Familial and Offspring-Group Aggregation
The current approach to investigating the prevalence of genetic diseases in various families relies on the concept of familial aggregation, in which the frequency of a disease may be higher in particular family groupings than in the general population. An initial grouping was the hereditary family, consisting of genetic relatives from the same family tree: grandparents, parents, siblings, cousins, etc. [8,11]. A more precise grouping is first-degree relatives (parents, offspring, and siblings [30]), which form a subset of the hereditary family. However, a person's genetic disease risk is not directly influenced by a non-parent in a hereditary family. Because current approaches assess a person's genetic disease risk via imprecise measures of familial aggregation, we propose they be replaced by a measure determined solely by parental genotypes; thus, we introduce a new approach that we call offspring-group aggregation. The advantages of this approach will become apparent below.
Throughout, we use standard human pedigree analysis terminology; for example, "parents" refers to genetic parents, and "siblings" refers to offspring with the same genetic parents [8].
Offspring-groups. Consider a two-allele model for a genetic disease. Table 2 illustrates all possible parental genotypes and their offspring. The entries in the individual cells are the frequencies of the corresponding offspring.
Constructing all the possible matings using the parents in Table 2, we observe that there are precisely six partition subsets of the general population, which we denote by F i (for i = 1, 2, . . . , 6), and have the following probabilities:

$$P(F_1) = p^4, \quad P(F_2) = 4p^3q, \quad P(F_3) = 2p^2q^2, \quad P(F_4) = 4p^2q^2, \quad P(F_5) = 4pq^3, \quad P(F_6) = q^4, \tag{5}$$

corresponding to the parental genotypes CC × CC, CC × Cc, CC × cc, Cc × Cc, Cc × cc, and cc × cc, respectively. In Figure 6, we illustrate the possible offspring genotypes within each subset F i (for i = 1, 2, . . . , 6). We refer to F i as an offspring-group, which consists of all people (offspring) whose parents have the genotypes that determine the partition F i . For example, F 2 consists of all people (offspring) in the general population whose parents have genotypes CC × Cc. Consequently, because a person's genotype is dependent on their parents, siblings belong to the same offspring-group. Moreover, an offspring-group will include people who are not necessarily siblings; indeed, two people who are not siblings could each have parents with the same genotypes and thus be members of the same offspring-group.
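The six offspring-group frequencies can be checked numerically. The sketch below assumes Hardy-Weinberg genotype frequencies (p², 2pq, q²) and random mating, with the groups ordered as CC × CC, CC × Cc, CC × cc, Cc × Cc, Cc × cc, cc × cc; the function name and dictionary labels are hypothetical.

```python
def offspring_group_frequencies(p):
    """Frequencies of the six parental mating types for allele frequency p,
    assuming Hardy-Weinberg proportions and random mating."""
    q = 1.0 - p
    return {
        "F1 (CC x CC)": p**4,
        "F2 (CC x Cc)": 4 * p**3 * q,
        "F3 (CC x cc)": 2 * p**2 * q**2,
        "F4 (Cc x Cc)": 4 * p**2 * q**2,
        "F5 (Cc x cc)": 4 * p * q**3,
        "F6 (cc x cc)": q**4,
    }

freqs = offspring_group_frequencies(0.2)
for name, value in freqs.items():
    print(f"{name}: {value:.4f}")
print("total:", round(sum(freqs.values()), 12))
```

Because the six terms are the expansion of (p² + 2pq + q²)², they sum to one for any p, which is what makes the offspring-groups a partition of the population.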
Incidentally, which offspring-group a parent belongs to is determined by the genotypes of their parents; a parent might not belong to the same offspring-group as their children. For example, suppose you and your mate have genotypes CC × Cc; then your offspring belong to F 2 . In addition, suppose your parents have genotypes Cc × Cc; then you belong to F 4 . Figure 6 shows that some offspring-groups may have high P(D), while others may have low or zero P(D).
Clinical studies involving pairs of siblings report the likelihood that a sibling has the disease, given the other sibling has the disease. This statistic, called sibling risk, is presented as if it were a clinical characteristic of the disease. Disease risk is instead determined by the structures of the offspring-groups (Figure 6), the penetrance of disease-causing genotypes, and the frequency of the disease-causing genotypes. We will address this idea in Section 4.2. Familial aggregation is currently measured with the sibling recurrence-risk ratio, denoted by λ s , which refers to the ratio of sibling risk to the population-wide disease prevalence (Section 4.1). An estimated value much greater than one (λ s ≫ 1; e.g., occasionally obtained from clinical studies) is often used as an indication that a particular disorder has familial aggregation [10,11,31]. However, as we will show, the current measure of familial aggregation is biased because it ignores a large part of the population and because it is affected by (often mistaken) estimates of population disease prevalence. Indeed, we provide several arguments that, in principle, the theoretical sibling recurrence-risk ratio is always equal to one (λ s = 1); this gives the surprising result that any estimate of λ s that is not close to 1 should be viewed with suspicion. Therefore, we propose that λ s is in need of replacement.
Our new concept focuses on the six offspring-groups (Figure 6) instead of hereditary families. Because each offspring-group has its own disease risk, "familial risk" should not be represented by a population parameter with a single value such as λ s . After demonstrating the unsuitability of λ s , we propose an alternative that depends on the allele frequency and penetrance of disease-causing genotypes; thus, our measure differs among the six possible offspring-groups of the general population (Equation (9)). We also discuss why our new measure is likely to yield an unbiased estimator based on clinical studies, unlike estimators for the sibling recurrence-risk ratio (Section 4.2).

Sibling Recurrence-Risk Ratio
Sibling risk is defined as the probability that an individual has a disease, given that a sibling has the same disease [11,32,33]. More precisely, let S 1 and S 2 denote two (nonidentical) siblings with the same parents, let D 1 denote the event that S 1 has the disease, and let D 2 denote the event that S 2 has the same disease. In the literature [10,11,33], sibling risk is often denoted by K s ; thus, K s = Sibling risk = P(D 2 |D 1 ).
In addition, the population risk (frequency, prevalence, probability) of the disease in the population is often denoted by K. In particular, P(D 1 ) = K and P(D 2 ) = K. The literature in this field [10,11] defines the sibling recurrence-risk ratio, λ s = K s /K, for use in the explanation of familial aggregation, as well as for hypothesizing a need for additional genes to describe the dependence of disease prevalence on genotype. Misunderstanding and different interpretations of the definition of K s have led to various approaches for (inaccurately) estimating λ s , making valid inferences and hypotheses problematic [32]. Our approach to this issue is based on the alleles of offspring being dependent on their parents, as well as on the small number of possible offspring-group types in a population and the membership of two siblings in the same offspring-group. Observe that while the siblings S 1 and S 2 are from the same offspring-group, the definition of K s as currently used does not specify to which of the six offspring-groups the siblings belong (Figure 6). Thus, K s is not defined as a conditional probability with respect to an offspring-group, forcing the general population to become the focus for determining K s . Therefore, the heterogeneity of offspring-groups means λ s is not an enlightening measure of familial aggregation.
Our analysis develops several biologically based probabilistic arguments leading to the demonstration that K s = K for a genetic disease; that is, λ s = 1 (Sections 4.1.1 and 4.1.2).
Following this demonstration, we will explore its implications for the calculations of estimators for K s and K. We also discuss why the estimator of λ s experiences computational deficiencies, incorrectly predicting λ s > 1. In addition, we discuss the implications of K s = K and the misuse of λ s as the justification for additional gene hypotheses (Section 4.1.3).
The genotypes of offspring are dependent on the parents, not on the siblings; consequently, whether S 1 has a particular allele is not affected by whether S 2 has the allele, and genetic events regarding S 1 and S 2 will be independent of each other. In particular, with respect to genetic diseases, D 1 and D 2 are independent events. Therefore, P(D 1 ∩ D 2 ) = P(D 1 )P(D 2 ), which implies

$$K_s = P(D_2 \mid D_1) = \frac{P(D_1 \cap D_2)}{P(D_1)} = P(D_2) = K;$$

hence, we conclude that λ s = 1. This means that λ s = 1 for any disease in which disease status is independent in each sibling. Incidentally, the independence of D 1 and D 2 may not hold for certain types of disorders; for example, two siblings living in the same household will likely not be independent of each other with respect to non-genetic contagious disease status [32].
As another approach showing λ s = 1, we note that Risch [33] writes λ s in terms of the covariance between siblings,

$$\lambda_s = 1 + \frac{\operatorname{Cov}(D_1, D_2)}{K^2}.$$

Because D 1 and D 2 are independent events, Cov(D 1 , D 2 ) = 0 [5,6] and we again conclude that λ s = 1.
As a third approach showing λ s = 1, we note Risch [11] defines φ s as the probability that two siblings share zero marker alleles and states that φ s = 1/4. Let Z = {S 1 and S 2 share zero alleles}, and observe that P(Z) = φ s = 1/4. Recall {S 1 and S 2 have the disease} = D 1 ∩ D 2 . As indicated in [11],

$$P(Z \mid (D_1 \cap D_2)) = \frac{\phi_s}{\lambda_s}.$$

As described in [10], the expected proportion of affected sibling pairs sharing zero alleles is 0.25; that is, P(Z|(D 1 ∩ D 2 )) = 0.25 = φ s . Hence, φ s = φ s /λ s , and we again conclude that λ s = 1.

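Siblings Are from the Same Offspring-Group: λ s = 1
We define the offspring-group risk for a specific offspring-group F i to be the probability of an individual having the disease, given that the individual is an offspring in F i . That is, offspring-group risk is P(D|F i ) (for i = 1, 2, . . . , 6). From Figure 6, using P(D|cc) = 0 for a disease D caused by a dominant allele (Section 2.2), we compute the offspring-group risk for each of the six offspring-groups:

$$\begin{aligned}
P(D \mid F_1) &= P(D \mid CC); & P(D \mid F_2) &= \tfrac{1}{2}\bigl[P(D \mid CC) + P(D \mid Cc)\bigr]; & P(D \mid F_3) &= P(D \mid Cc); \\
P(D \mid F_4) &= \tfrac{1}{4}P(D \mid CC) + \tfrac{1}{2}P(D \mid Cc); & P(D \mid F_5) &= \tfrac{1}{2}P(D \mid Cc); & P(D \mid F_6) &= 0.
\end{aligned} \tag{6}$$

We are now ready to compute sibling risk using the offspring-group risks. Because the six offspring-groups form a partition of the population and because siblings are from the same offspring-group, we can write

$$K_s = \sum_{i=1}^{6} P(D_2 \mid F_i)\,P(F_i).$$

Using the offspring-group frequencies (Equation (5)), we have that

$$K_s = P(D_2 \mid F_1)\,p^4 + 4P(D_2 \mid F_2)\,p^3q + 2P(D_2 \mid F_3)\,p^2q^2 + 4P(D_2 \mid F_4)\,p^2q^2 + 4P(D_2 \mid F_5)\,pq^3 + P(D_2 \mid F_6)\,q^4. \tag{7}$$

Substituting the offspring-group risks (Equation (6)) into Equation (7) gives the following representation:

$$K_s = P(D_2 \mid CC)\,p^4 + 2\bigl[P(D_2 \mid CC) + P(D_2 \mid Cc)\bigr]p^3q + 2P(D_2 \mid Cc)\,p^2q^2 + \bigl[P(D_2 \mid CC) + 2P(D_2 \mid Cc)\bigr]p^2q^2 + 2P(D_2 \mid Cc)\,pq^3.$$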
Finally, combining similar terms (and noting that p + q = 1), using Equation (1) and Section 2.2, and using Equation (3) yields

$$K_s = P(D_2 \mid CC)\,p^2 + 2P(D_2 \mid Cc)\,pq = P(D_2) = K.$$

Thus, we again conclude that K s = K. This last argument has the additional utility that it provides the underlying structure for developing a new measure of aggregation (based on offspring-groups instead of hereditary families), which we discuss in Section 4.2.
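The algebra can be checked numerically. The sketch below assumes the offspring-group frequencies p⁴, 4p³q, 2p²q², 4p²q², 4pq³, q⁴ and the Mendelian offspring-group risks for the matings CC × CC, CC × Cc, CC × cc, Cc × Cc, Cc × cc, cc × cc, and it takes prevalence to be P(D|CC)p² + 2P(D|Cc)pq. The function and parameter names are hypothetical.

```python
import math

def sibling_risk(p, pen_cc_hom, pen_het):
    """Sum of offspring-group risk times offspring-group frequency.

    pen_cc_hom = P(D|CC), pen_het = P(D|Cc); P(D|cc) is taken to be 0.
    """
    q = 1.0 - p
    freqs = [p**4, 4 * p**3 * q, 2 * p**2 * q**2, 4 * p**2 * q**2, 4 * p * q**3, q**4]
    risks = [
        pen_cc_hom,                      # CC x CC: all offspring CC
        (pen_cc_hom + pen_het) / 2,      # CC x Cc: half CC, half Cc
        pen_het,                         # CC x cc: all offspring Cc
        pen_cc_hom / 4 + pen_het / 2,    # Cc x Cc: 1/4 CC, 1/2 Cc, 1/4 cc
        pen_het / 2,                     # Cc x cc: half Cc, half cc
        0.0,                             # cc x cc: all offspring cc
    ]
    return sum(f * r for f, r in zip(freqs, risks))

def prevalence(p, pen_cc_hom, pen_het):
    """Population prevalence: P(D|CC) p^2 + 2 P(D|Cc) p q."""
    q = 1.0 - p
    return pen_cc_hom * p**2 + 2 * pen_het * p * q

for p, a, b in [(0.2, 1.0, 1.0), (0.2, 1.0, 0.5), (0.02, 0.8, 0.8)]:
    print(p, a, b, round(sibling_risk(p, a, b), 6), round(prevalence(p, a, b), 6))
```

For every parameter choice the two columns agree, which is the numerical counterpart of collecting terms with p + q = 1.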
Even though the values of K s and K are identical, certain offspring-groups (and hereditary families) may have more members with a disease than other groups and may also have a higher or lower P(D) than the population as a whole. The equality of K s and K simply means that the sibling recurrence-risk ratio is not an appropriate measure of aggregation among offspring-groups or hereditary families. Before we propose an alternative measure that avoids the challenges associated with λ s , we discuss why estimators of λ s appear to be greater than one.

Estimating the Sibling Recurrence-Risk Ratio
There are two main reasons for errors in the traditional statistical construction of the estimator of λ s : (i) the prevalence of the disease, K, is almost always underestimated; (ii) sibling risk, K s , is almost always overestimated.
Having already discussed the underestimation of K (Section 3), we now discuss the overestimation of K s . Recall that

$$K_s = P(D_2 \mid D_1) = \frac{P(D_2 \cap D_1)}{P(D_1)}.$$

Using data from a clinical study consisting of pairs of siblings, an estimator of P(D 2 ∩ D 1 ) will likely overestimate P(D 2 ∩ D 1 ) because the clinical study will almost never include siblings from offspring-group F 6 , for which P(D|F 6 ) = 0 (Equation (6)). Hence, ascertainment bias will cause K s to be overestimated. Incidentally, the contribution of offspring-group F 6 can be significant. For example, when p ≤ 0.2, more than 40% of all population members are in this offspring-group; thus, the same proportion (more than 40%) of the population is likely not included in computing an estimator for K s (though F 6 is likely to be included in computing an estimator for K).
In addition, we point out that the sibling recurrence-risk ratio is particularly sensitive to underestimates of K. Indeed, observe that

$$\lambda_s = \frac{K_s}{K} = \frac{P(D_1 \cap D_2)}{K^2}.$$

Because the exponent for K is two, while P(D 1 ∩ D 2 ) has exponent one, λ s will be more sensitive to underestimates of K than to overestimates of P(D 2 ∩ D 1 ). Similarly, an estimator for K s based on a conditional probability approach is also almost always an overestimate. Consider a clinical study consisting of pairs of siblings with one of the siblings known to have the disease. An estimator of K s will be an estimator of P(D 2 |D 1 ). In this case, the clinical study will likely consist mostly of individuals participating from offspring-groups with high offspring-group risks (Equation (6)) [32]; that is, the clinical study will suffer from ascertainment bias. Hence, the calculated value will likely overestimate K s .
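The asymmetry between the two sources of error can be illustrated numerically. The sketch assumes λ s is computed as P(D1 ∩ D2)/K², starts from the independent case P(D1 ∩ D2) = K² (so the true ratio is one), and applies the same relative error either to K or to the joint probability. The value K = 0.36 is borrowed from Example 1; everything else is hypothetical.

```python
def lambda_s_estimate(joint_prob, prevalence):
    """Sibling recurrence-risk ratio computed as P(D1 and D2) / K**2."""
    return joint_prob / prevalence**2

K_TRUE = 0.36
JOINT_TRUE = K_TRUE**2   # independence: the true ratio equals 1

for rel_error in (0.1, 0.3):
    from_low_k = lambda_s_estimate(JOINT_TRUE, K_TRUE * (1 - rel_error))
    from_high_joint = lambda_s_estimate(JOINT_TRUE * (1 + rel_error), K_TRUE)
    print(f"error {rel_error:.0%}: K too low -> {from_low_k:.2f}, "
          f"joint too high -> {from_high_joint:.2f}")
```

Underestimating K by a fraction u inflates the ratio by 1/(1 − u)², whereas overestimating the joint probability by the same fraction inflates it only by 1 + u.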
Despite the reality that in principle K s = K, several studies [10,11,31,34] have used estimators of K s and K derived from clinical studies to suggest λ s > 1 and propose that a more complicated genetic model is required to explain the causes of certain genetic disorders. However, as we have shown that λ s = 1, it appears that equations using λ s with a value other than 1 should not be used to propose alternative genetic hypotheses.
As an illustration, we now discuss an example where using λ s is problematic. The contribution of the Human Leukocyte Antigen (HLA) region (denoted by λ sHLA ) to the sibling recurrence-risk ratio is the "expected proportion of affected sibling pairs sharing zero haplotypes identical-by-descent (IBD) (0.25) divided by the observed proportion [of affected sibling pairs sharing zero haplotypes IBD]" [10]; that is,

$$\lambda_{sHLA} = \frac{0.25}{P(Z \mid (D_1 \cap D_2))},$$

where Z = {S 1 and S 2 share zero haplotypes}. Assuming a multiplicative model [11], the percentage of the HLA's contribution to the sibling recurrence-risk ratio (denoted by % λ sHLA ) is calculated [10] using the equation

$$\%\,\lambda_{sHLA} = \frac{\log \lambda_{sHLA}}{\log \lambda_s} \times 100,$$

which obviously requires λ s ≠ 1 (otherwise, the denominator is zero). However, because of our earlier discussion that λ s = 1 (Sections 4.1.1 and 4.1.2), we conclude that this equation has a theoretical deficiency: it always produces an undefined result, assuming the true value of λ s is used.
In addition to the already-discussed issues with the estimator of λ s , it appears that estimating λ sHLA is also problematic; indeed, the above equation for % λ sHLA is often used with an estimated value of λ s satisfying λ s > 1 and an estimated value of λ sHLA also satisfying λ sHLA > 1 [10,11,31,34]. For example, Table 3 in [10] includes several clinical studies that can be used to construct λ sHLA , where the individual studies produce values of P(Z|(D 1 ∩ D 2 )) ranging from a low of 0 (also the median and mode) to a high of 0.50. These values correspond to λ sHLA ranging from undefined (infinite) to 0.50. Combining all of the data in the clinical studies produces P(Z|(D 1 ∩ D 2 )) = 0.07, but due to the large spread of the data, it is not likely that this single value is meaningful (as was pointed out by the authors of the study) [10]. In any event, even if researchers wrongly use λ s > 1 and λ sHLA > 1, they will still be able to compute the quantity % λ sHLA . However, inferences and hypotheses should not be based on such a calculated value of % λ sHLA because of the previously discussed issues with the estimator of λ s and because of difficulties associated with the estimator of λ sHLA . We do not dispute that, in principle, there may exist a percentage of HLA's contribution to disease risk; we are simply proposing that using % λ sHLA as an indicator is suspect.
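The range of λ sHLA values quoted above follows directly from the quoted definition (0.25 divided by the observed proportion of affected sibling pairs sharing zero haplotypes). A minimal sketch, with a hypothetical function name and with an observed proportion of zero mapped to infinity:

```python
import math

def lambda_s_hla(observed_zero_sharing, expected=0.25):
    """Expected proportion of affected sibling pairs sharing zero haplotypes IBD
    (0.25) divided by the observed proportion; infinite when the latter is 0."""
    if observed_zero_sharing == 0:
        return math.inf
    return expected / observed_zero_sharing

# Observed proportions mentioned in the text: 0 (low), 0.07 (pooled), 0.50 (high).
for observed in (0.0, 0.07, 0.50):
    print(f"P(Z | D1 and D2) = {observed:.2f} -> lambda_sHLA = {lambda_s_hla(observed):.2f}")
```

The pooled proportion 0.07 gives a value of about 3.6, while the individual studies span everything from 0.5 to an undefined ratio, which illustrates how unstable the estimator is.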
In summary, our analysis shows that λ s experiences theoretical and computational deficiencies; in addition, its definition is often misunderstood and subject to misinterpretation [32]. These attributes lead to estimators of λ s being greatly inflated (much greater than 1); thus, drawing conclusions based on λ s is suspect. In particular, we propose that λ s neither accurately indicates familial aggregation nor provides insight into the general genotype-disease relationship.

Offspring-Group Aggregation and Its Measure
To better account for the fact that each offspring-group has its own disease risk, we propose replacing the concept of familial aggregation with what we call offspring-group aggregation, which describes the aggregation of genetic diseases among the six offspring-groups (instead of among hereditary families). In addition, we propose a new measure that precisely describes the frequency distribution of genetic diseases among the six offspring-groups and yields estimators of the offspring-group aggregation of genetic diseases.
To do this, we define the offspring-group recurrence-risk ratio as the ratio of the offspring-group risk to the disease prevalence; specifically,

$$\mu_i = \frac{P(D \mid F_i)}{P(D)} \qquad (i = 1, 2, 3, 4, 5, 6).$$

It measures the likelihood that a person from offspring-group F i has the disease, relative to a person from the general population. For example, µ i = 2.5 means that a person from F i is about 2.5 times as likely to have the disease as a person from the general population.
Using Equations (1) and (6), we obtain the following representations of offspring-group risk (Section 4.1.2) in terms of r and P(D|CC): which we collectively write in the form where the functions β i (r) are: Using Equations (3) and (8), we obtain We propose that the values of µ i are an appropriate way to measure the degree of offspringgroup aggregation across all offspring-groups in the general population.
In Table 3, we provide illustrative examples of the offspring-group recurrence-risk ratio (Equation (9)): (i) a C allele with p = 0.2 and r = 1; (ii) a C allele with p = 0.2 and r = 0.5; (iii) a C allele with p = 0.02 and r = 1. Table 3 illustrates several key features regarding the ability of µ i to measure offspring-group aggregation: (i) The disparate values of µ i show that each offspring-group has its own contribution to offspring-group aggregation. For example, when p = 0.2 and r = 1, members of offspring-groups F 1 , F 2 , and F 3 are approximately three times as likely to have the disease as members of the general population, while offspring-group F 6 will have no members with the disease. (ii) The distribution of offspring-group aggregation is influenced by the frequency of the dominant allele C. For example, when r = 1, the positive values of µ i increase markedly as p changes from p = 0.2 to p = 0.02. (iii) The distribution of offspring-group aggregation is influenced by the parameter r. For example, when p = 0.2, the offspring-group aggregation is more concentrated in offspring-groups F 1 and F 2 for r = 0.5 than for r = 1.
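The three scenarios listed for Table 3 can be recomputed with a short script. The following is a minimal sketch, not the authors' code: it assumes that P(D|Cc) = r P(D|CC), that P(D|cc) = 0, that offspring genotypes follow Mendelian ratios, and that the offspring-groups are labeled F 1 = CC × CC, F 2 = CC × Cc, F 3 = CC × cc, F 4 = Cc × Cc, F 5 = Cc × cc, and F 6 = cc × cc. The function names are illustrative only.

```python
# Sketch only. Assumed model (not quoted from the paper's equations):
# P(D|Cc) = r * P(D|CC), P(D|cc) = 0, Mendelian offspring ratios.

def beta(r):
    """Offspring-group risk divided by P(D|CC), for F1..F6.

    Assumed labeling: F1 = CC x CC, F2 = CC x Cc, F3 = CC x cc,
    F4 = Cc x Cc, F5 = Cc x cc, F6 = cc x cc.
    """
    return [1.0, (1 + r) / 2, r, (1 + 2 * r) / 4, r / 2, 0.0]


def recurrence_risk_ratios(p, r):
    """Return [mu_1, ..., mu_6] for allele frequency p and ratio r."""
    q = 1 - p
    scaled_prevalence = p**2 + 2 * p * q * r  # P(D) / P(D|CC)
    return [b / scaled_prevalence for b in beta(r)]


# The three (p, r) scenarios named in the text for Table 3.
for p, r in [(0.2, 1.0), (0.2, 0.5), (0.02, 1.0)]:
    ratios = recurrence_risk_ratios(p, r)
    print(f"p={p}, r={r}:", [round(m, 2) for m in ratios])
```

Under these assumptions, the case p = 0.2 and r = 1 gives ratios that round to 2.78 for F 1 , F 2 , and F 3 , to 2.08 for F 4 , and to 1.39 for F 5 , which agrees with the values quoted in the clinical scenario of the Discussion. P(D|CC) cancels in the ratio, so it is not an input.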
An important property of the values of the offspring-group recurrence-risk ratio µ i is that their weighted sum is equal to 1, where the individual weights are the frequencies of the corresponding offspring-groups. Indeed, writing Equation (7) in terms of the offspring-group recurrence-risk ratios yields
$$K_s = P(D_2)\left[p^4\mu_1 + 4p^3q\,\mu_2 + 2p^2q^2\mu_3 + 4p^2q^2\mu_4 + 4pq^3\mu_5\right].$$
Recalling that K = P(D 2 ), we obtain the following decomposition of the sibling recurrence-risk ratio λ s in terms of the offspring-group recurrence-risk ratios µ i :
$$\lambda_s = \frac{K_s}{K} = p^4\mu_1 + 4p^3q\,\mu_2 + 2p^2q^2\mu_3 + 4p^2q^2\mu_4 + 4pq^3\mu_5.$$
Because λ s = 1 (Sections 4.1.1 and 4.1.2), it follows that
$$p^4\mu_1 + 4p^3q\,\mu_2 + 2p^2q^2\mu_3 + 4p^2q^2\mu_4 + 4pq^3\mu_5 = 1, \tag{10}$$
where the coefficients of µ i are the corresponding frequencies of offspring-group F i given by Equation (5).
Another key feature of the offspring-group recurrence-risk ratio is that, unlike λ s , Equation (10) precisely describes the frequency distribution of offspring-group aggregation of the disease among the six offspring-groups (recall that µ 6 = 0 for offspring-group F 6 ). Writing Equation (10) in the form
$$\sum_{i=1}^{6} P(F_i)\,\mu_i = 1$$
emphasizes that each term in the sum, P(F i )µ i , is the offspring-group proportion; that is, the proportion of those with the disease who are in offspring-group F i , where P(F i ) is given by Equation (5). Table 4 illustrates the offspring-group proportions when p = 0.2 and r = 1. The implication of the values is straightforward; for example, of those people with the disease, approximately 57% are from offspring-group F 5 . Moreover, notice that the sum of the values equals 1, as required by Equation (10).
Table 4. Offspring-group proportions when p = 0.2 and r = 1.
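The offspring-group proportions and the requirement that they sum to 1 can be checked numerically. The sketch below is illustrative rather than the authors' computation: it assumes the offspring-group frequencies p^4, 4p^3 q, 2p^2 q^2, 4p^2 q^2, 4pq^3, and q^4 (the first five appear as the coefficients in the decomposition of K s ; q^4 for F 6 is inferred), together with P(D|Cc) = r P(D|CC) and P(D|cc) = 0. The function name is hypothetical.

```python
# Sketch only: offspring-group proportions P(F_i) * mu_i under an assumed
# model (P(D|Cc) = r * P(D|CC), P(D|cc) = 0, random mating).

def offspring_group_proportions(p, r):
    """Return [P(F_1)*mu_1, ..., P(F_6)*mu_6]."""
    q = 1 - p
    # Assumed offspring-group frequencies for F1..F6.
    frequencies = [p**4, 4 * p**3 * q, 2 * p**2 * q**2,
                   4 * p**2 * q**2, 4 * p * q**3, q**4]
    # Offspring-group risk divided by P(D|CC), for F1..F6.
    beta = [1.0, (1 + r) / 2, r, (1 + 2 * r) / 4, r / 2, 0.0]
    scaled_prevalence = p**2 + 2 * p * q * r  # P(D) / P(D|CC)
    return [f * b / scaled_prevalence for f, b in zip(frequencies, beta)]


proportions = offspring_group_proportions(0.2, 1.0)
for i, share in enumerate(proportions, start=1):
    print(f"F{i}: {share:.4f}")
print("sum:", round(sum(proportions), 10))
```

With p = 0.2 and r = 1, the F 5 entry rounds to 0.57, matching the 57% quoted in the text, and the entries sum to 1 up to floating-point error.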
We point out that, for diseases in which the genotype CC is lethal prior to birth or shortly thereafter (e.g., Huntington's disease and Marfan syndrome [35,36]), offspring-groups F 1 , F 2 , and F 3 will not appear in the (living) population. In this case, the offspring-group risk ratios µ 4 and µ 5 and the offspring-group proportions P(F 4 )µ 4 and P(F 5 )µ 5 are the most relevant.
In summary, our theoretical framework proposes replacing familial aggregation with offspring-group aggregation and replacing λ s with the offspring-group recurrence-risk ratio µ i , which has these advantageous properties: (i) it quantifies the clustering of the genetic disease within different offspring-groups and thus does not assume a single value of aggregation that applies across the general population; (ii) it depends on the parameters p and r, which can be estimated using unbiased clinical studies (Section 2); (iii) unlike λ s , it does not explicitly depend on K, which is often underestimated (Section 3); (iv) it can be used to precisely describe the frequency distribution of offspring-group aggregation (Equation (10)), which cannot be done with λ s . This emphasizes the importance for parental-sibling clinical studies of determining from which of the six offspring-groups each subject comes.
In Section 5.3, we provide a scenario illustrating how a clinician may use the theoretical framework for offspring-group aggregation as a clinical tool.

Discussion: Integration of Results
Researchers and clinicians who want to identify a genetic disease, including its genotype-phenotype relationship, benefit from being attentive to the three topics we have developed: (1) the relationship between the disease-causing genotypes and the presence of the associated disease (Section 2); (2) the role of diagnostic tests and their ability to identify the disease (Section 3); and (3) the frequency distribution of offspring-group aggregation among the six offspring-groups (Section 4). Figure 7 provides an organizational diagram of our unified theoretical framework of these three topics. Recall that G, D, and T denote the events that an individual from the general population has the disease-causing genotypes, has the disease, and receives a positive test result from a diagnostic test, respectively. Their possible relationships (logical implications) are illustrated by the blue and red arrows: Section 2 discusses when G is necessary and/or sufficient for D (i.e., when the disease-causing genotypes identify the disease); Section 3 discusses when T is necessary and/or sufficient for D (i.e., when a diagnostic test identifies the disease). Section 4 investigates the frequency distribution of offspring-group aggregation among the six offspring-groups (summarized by $\sum_{i=1}^{6} P(F_i)\mu_i = 1$), which is affected by G, D, and T, as indicated by the green arrows.
Figure 7. Organizational diagram of our unified theoretical framework of the three main topics for identifying a genetic disease. Recall that G, D, and T denote the events that an individual from the general population has the disease-causing genotypes, has the disease, and receives a positive test result from a diagnostic test, respectively. The possible relationships between G, D, and T are illustrated by the blue and red arrows (the arrows are the notation for the logical concept "implies"). The frequency distribution of offspring-group aggregation among the six offspring-groups is summarized by the equation $\sum_{i=1}^{6} P(F_i)\mu_i = 1$, which is affected by G, D, and T, as illustrated by the green arrows.

Relationship between G and D (Section 2)
Fundamental to identifying a genetic disease is determining the relationship between the disease-causing genotypes and the presence of the associated disease. For a disease caused by a dominant allele: G is always necessary for D; G is sufficient for D if and only if the disease-causing genotypes are fully penetrant. This is illustrated in Figure 7: D ⇒ G and the corresponding blue arrow always occurs; G ⇒ D and the corresponding red arrow occurs if and only if P(D|CC) = 1 and P(D|Cc) = 1.
In other words, the relationship between disease prevalence and the frequencies of the disease-causing genotypes is always P(D) ≤ P(G), and P(D) = P(G) only when P(D|CC) = 1 and P(D|Cc) = 1.
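One way to see this inequality is to subtract the prevalence from the frequency of the disease-causing genotypes. The following worked step is a sketch under stated assumptions: P(D|cc) = 0, and genotype frequencies P(CC) = p^2 and P(Cc) = P(cC) = pq with q = 1 − p, as in the partition used in Appendix A.

```latex
\begin{align*}
P(G) &= p^2 + 2pq, \\
P(D) &= P(D \mid CC)\,p^2 + 2\,P(D \mid Cc)\,pq, \\
P(G) - P(D) &= \bigl[1 - P(D \mid CC)\bigr]\,p^2
             + 2\,\bigl[1 - P(D \mid Cc)\bigr]\,pq \;\ge\; 0.
\end{align*}
```

Both bracketed factors are nonnegative, so for 0 < p < 1 the difference vanishes exactly when both penetrances equal 1.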
The theoretical framework presented in Section 2 provides guidance to researchers and clinicians with regard to determining the relationship between the disease-causing genotypes and the presence of the associated disease. In particular, if they believe "G is necessary, but not sufficient for D", then we propose that researchers and clinicians continue their investigations, being aware of the associated consequences and responsibilities (Section 2.3), with the goal of characterizing the relationship between G and D. Even so, it is essential that clinicians not use their belief that a disease-causing genotype is partially penetrant as justification for using an inaccurate diagnostic test; that is, for using a diagnostic test with low sensitivity and/or low specificity (Section 5.2).

Relationship between T and D (Section 3)
The theoretical framework presented in Section 3 provides guidance to researchers and clinicians with regard to understanding the relationship between a positive diagnostic test result and the presence of the associated disease. In summary, we recommend that researchers and clinicians:
(i) Ensure diagnostic tests have T that is both necessary and sufficient for D. Figure 7 illustrates the desired relationship: T ⇔ D and the corresponding blue and red arrows both occur. When this is the case, P(T) = P(D), where P(D) is described in Section 2. If clinicians think that a diagnostic test's positive result is "necessary, but not sufficient" to confirm the presence of the disease, then that is equivalent to accepting a diagnostic test that is actually inadequate at identifying the disease. The test should be either refined or replaced. We suggest it is imperative that clinicians continue their investigations, ultimately seeking a diagnostic test that consistently identifies the disease (Section 3.2).
(ii) Treat P(T) as a cumulative lifetime risk. Framing accurate diagnosis as a cumulative lifetime risk has implications for clinicians considering the usefulness of a diagnostic test result, as well as for developing long-term clinical studies (Section 3.3).
These two essential features make it more likely that unbiased clinical studies produce an estimator P(T) that is close to the estimator P(D) described in Section 2.2.
In order to be useful in diagnosis, all diagnostic tests must, within reasonable error bounds, give the same diagnostic information. At present, antibody tests, pregnancy tests, and blood tests for particular substances are examples of diagnostic tests for which high sensitivity and specificity determinations are standard. This standard should be applied to all tests (e.g., tissue biopsies) that are part of the diagnostic system. Even so, for some genetic diseases, not all subjects with the disease-causing genotype will appear to have the disease. This may be because of partial penetrance, but it should also be considered that incomplete diagnosis may be at fault or that people may tend to ignore their symptoms or ascribe them to other causes. Those persons should be more carefully followed up with additional investigations and perhaps different types of diagnostic tests.
Finally, we mention that when G and T are both necessary and sufficient for D (all blue and red arrows in Figure 7 occur), then P(G) = P(D) = P(T), and clinical studies should produce estimators for P(G) and P(T) that are close; that is, the estimate of P(G) should approximately equal the estimate of P(T). Because genetic tests are less likely to have errors than are diagnostic tests, a discrepancy between the estimators more than likely suggests that the estimate of P(T) is not accurate, indicating that further investigation is warranted, rather than concluding simply that G is not sufficient.

Offspring-Group Aggregation (Section 4)
The general population can be partitioned into six offspring-groups denoted by F i (for i = 1, 2, . . . , 6), and a specific offspring-group F i is determined by parental genotypes (Figure 6). We provide a theoretical framework for describing a genetic disease's offspring-group aggregation (i.e., disease aggregation among the six offspring-groups).
We discuss the theoretical and computational deficiencies of the sibling recurrence-risk ratio, whose definition often is misunderstood and subject to differing and inconsistent interpretations. This ratio typically is used as an indicator of familial aggregation even though it ignores the six offspring-groups (Section 4.1).
We propose replacing familial aggregation with offspring-group aggregation and adopting an alternative measure that does not experience the deficiencies and precisely describes the frequency distribution of offspring-group aggregation among the six offspring-groups (Section 4.2). In summary, our proposed measure is the offspring-group recurrence-risk ratio (denoted by µ i ), which is defined in Equation (9). It measures the likelihood that a person from offspring-group F i has the disease, relative to a person from the general population. The frequency distribution of offspring-group aggregation is described by the equation
$$\sum_{i=1}^{6} P(F_i)\,\mu_i = 1,$$
where P(F i )µ i is the offspring-group proportion of those with the disease who are in offspring-group F i .
Finally, we note that µ i and P(F i ) depend on understanding the disease-causing genotypes and the presence of the disease (Section 2), as well as accurate diagnosis of the disease (Section 3). Thus, our theoretical framework for offspring-group aggregation fundamentally relies on an understanding of the relationships between G, D, and T, as communicated by the green arrows in Figure 7.
Offspring-group aggregation as a clinical tool. We conclude with a scenario illustrating how a clinician may use the theoretical framework for offspring-group aggregation as a clinical tool. Consider a disease caused by a dominant allele with p = 0.2, r = 1, and P(D|CC) = 1. Then, P(D) = 0.36 (Equation (3)). Suppose a person visits a clinician wanting to know the likelihood that they have the disease, given that they have a sibling known to have the disease. While the clinician may not know to which offspring-group the siblings belong, it is known that they are not in offspring-group F 6 . As illustrated in Table 3, the clinician predicts that the person is either 1.39, 2.08, or 2.78 times as likely to have the disease as a member of the general population, for whom the likelihood is 0.36. Using this information, the clinician predicts that the likelihood the person has the disease is approximately 0.50, 0.75, or 1.00, respectively, and the person's offspring-group determines which of the three values applies. However, even if the clinician does not know the person's offspring-group, it is still possible to estimate the likelihood that the person has the disease. Indeed, based on Table 4, the clinician notices that, of those people with the disease, F 5 has the highest percentage (in fact, higher than the sum of all other offspring-groups); thus, the clinician may choose to use only the F 5 information and predict that the likelihood the person has the disease is about (1.39) × (0.36) = 0.50. Alternatively, the clinician may choose to use a weighted average that incorporates all the information in Tables 3 and 4, namely 0.57(0.50) + 0.21(0.75) + 0.22(1.00) = 0.66, as a prediction of the likelihood that the person has the disease. Whichever value the clinician chooses (0.50 or 0.66), the clinician concludes that the person is at a higher risk than a member of the general population (0.36).
This information can be used to frame a discussion between the clinician and the patient regarding the next steps to pursue (e.g., whether to test the person for the disease-causing genotypes or administer accurate diagnostic tests).
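The arithmetic in this scenario can be retraced with a few lines of code. This is a hedged sketch, not a clinical tool: the variable names are illustrative, and the model (P(D|Cc) = r P(D|CC), P(D|cc) = 0, and offspring-group frequencies p^4, 4p^3 q, 2p^2 q^2, 4p^2 q^2, 4pq^3, q^4) is assumed from the framework rather than quoted from it.

```python
# Sketch only: retraces the scenario with p = 0.2, r = 1, P(D|CC) = 1.
# Assumed model: P(D|Cc) = r * P(D|CC), P(D|cc) = 0, random mating.
p, r, pen_CC = 0.2, 1.0, 1.0
q = 1 - p
prevalence = pen_CC * (p**2 + 2 * p * q * r)  # P(D)

# Assumed offspring-group frequencies P(F_1), ..., P(F_6).
freq = [p**4, 4 * p**3 * q, 2 * p**2 * q**2,
        4 * p**2 * q**2, 4 * p * q**3, q**4]
beta = [1.0, (1 + r) / 2, r, (1 + 2 * r) / 4, r / 2, 0.0]

risk = [b * pen_CC for b in beta]  # offspring-group risk P(D | F_i)
share = [f * k / prevalence for f, k in zip(freq, risk)]  # P(F_i) * mu_i

# Option 1: use only the offspring-group holding the largest share of cases.
most_common = max(range(6), key=lambda i: share[i])
single_group_estimate = risk[most_common]

# Option 2: average the group risks, weighted by each group's share of cases.
weighted_estimate = sum(s * k for s, k in zip(share, risk))

print(f"prevalence: {prevalence:.2f}")
print(f"largest share: F{most_common + 1}, risk {single_group_estimate:.2f}")
print(f"weighted estimate: {weighted_estimate:.2f}")
```

Under these assumptions the script reproduces the figures in the text: a prevalence of 0.36, an F 5 -only estimate of 0.50, and a weighted estimate of 0.66.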
We recommend that researchers and clinicians consider using the theoretical framework for offspring-group aggregation discussed in Section 4 and summarized in Section 5.3.
To place our analysis in the context of the current state of research, it is still epidemiologically valid to say that if one person in a hereditary family has a genetic disease, other family members are at risk and should be carefully evaluated, and appropriate precautions should be taken. Though other hereditary family members often are at higher risk than are members of the population as a whole, this does not mean K s > K in the general population. We suggest this mistaken idea be replaced by an approach that carefully uses diagnostic tools to accurately evaluate K and that describes genetic disease aggregation in terms of the offspring-groups F i and the offspring-group recurrence-risk ratio µ i .
Acknowledgments: The authors thank J. Christopher Gaiser, Linfield University Department of Biology, for the helpful advice and discussion regarding population genetics; and Nadine Grzeskowiak, Celiac Nurse Consulting, Salem, Oregon, for insightful discussions regarding clinical applications of gene-disease relationships. The authors also thank the Editor for the valuable assistance, as well as the Reviewers for their helpful comments/feedback, which improved the exposition.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Derivation of Equation (2)
Consider the partition of the population in terms of the genotypes CC, Cc, cC, and cc. Because P(D ∩ Cc) = P(D ∩ cC), we obtain P(D) = P(D ∩ CC) + 2P(D ∩ Cc) + P(D ∩ cc).
By the definition of the probability of an intersection, P(D) = P(D|CC)P(CC) + 2P(D|Cc)P(Cc) + P(D|cc)P(cc), which can be written in the form shown in Equation (2).
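The decomposition above can be evaluated numerically. The sketch below assumes random-mating genotype frequencies P(CC) = p^2, P(Cc) = P(cC) = pq, and P(cc) = q^2 with q = 1 − p; these frequencies and the function name are assumptions made for illustration, not statements of Equation (2) itself.

```python
# Sketch only: prevalence from genotype penetrances, assuming
# P(CC) = p^2, P(Cc) = P(cC) = p*q, P(cc) = q^2 (random mating).

def prevalence(p, pen_CC, pen_Cc, pen_cc=0.0):
    """P(D) = P(D|CC)P(CC) + 2 P(D|Cc)P(Cc) + P(D|cc)P(cc)."""
    q = 1 - p
    return pen_CC * p**2 + 2 * pen_Cc * p * q + pen_cc * q**2


# Fully penetrant dominant allele with p = 0.2.
print(round(prevalence(0.2, 1.0, 1.0), 4))
# Heterozygotes half as penetrant as CC homozygotes.
print(round(prevalence(0.2, 1.0, 0.5), 4))
```

The first call corresponds to the fully penetrant case used in the clinical scenario (p = 0.2, P(D|CC) = P(D|Cc) = 1), for which the text reports P(D) = 0.36.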

Appendix B. Necessary and Sufficient as Conditional Probabilities
We now develop equivalent conditional probability formulations for the concepts of "necessary" and "sufficient". The formulations apply to any two events, but we will frame the discussion in terms of G and D (Section 2.3).
Furthermore, observe that P(D|G) = 1 is equivalent to saying that "G is sufficient for D." Indeed: P(D|G) = 1 ⇔ the occurrence of G implies the occurrence of D ⇔ G is sufficient for D.