The Genetic Architecture of Fluctuating Asymmetry of Mandible Size and Shape in a Population of Mice: Another Look

Fluctuating asymmetry (FA), typically measured by variation in the differences between right and left sides of bilateral traits, is commonly used to assess developmental instability (DI) in populations. A previous quantitative trait locus (QTL) investigation using an F2 intercross mouse population found little evidence of individual loci affecting FA in mandible size, but an abundance of epistatic interactions between loci. Here we extend this work by testing whether these patterns replicate in an F3 population derived from the same intercross. Using a large number of molecular markers genotyped in over 1200 mice, we uncovered significant interactions between loci (QTLs) affecting FA in mandible size (and shape). Epistasis contributed roughly 20% of the variation in FASIZE and 19% of the variation in FASHAPE at the 0.0001 probability level alone, and was comparable to that previously estimated for the F2 mice, and much greater than that generated from the few single-locus QTLs affecting the mandible FA traits. The positions of the single-locus and epistatic QTLs for FA that we discovered suggested that logical candidate genes for DI are those controlling size or shape in the traits themselves, and that they may be interacting with genes for heat shock proteins.


Introduction
Fluctuating asymmetry (FA), typically measured by non-directional variation in the differences between left and right sides of bilateral traits, is a widely-used measure of developmental instability (DI). DI is generated from any of a number of internal or external stressors that perturb the normal developmental pathway of structures and result in random developmental noise [1][2][3]. As a consequence, it has been assumed that stressed populations will exhibit greater levels of DI (as measured by FA) than control or less-stressed populations [4,5]. Although results from a number of studies using a wide variety of environmental stressors (toxins, noise, parasites, radiation, etc.) have supported this expectation, it has not been universally found [6].
In contrast to the environmental origin of development noise, it has long been thought that there is a genetic basis for buffering this noise and thus promoting stability during the developmental process [1,[7][8][9]. It therefore is not surprising that FA also has been the subject of a large number of genetic studies. Even after years of investigation, however, the genetic basis for DI remains obscure [6]. It is still not entirely clear, for example, whether DI is even heritable since many heritability estimates made for FA in various traits have been low in magnitude and non-significant [6]. On the other hand, several investigators [10,11] have argued that this is partly because FA is difficult to measure, and sample sizes and experimental designs used in many of these studies may not have been optimal. And in fact using a selection regime with large sample sizes, Carter and Houle [12] were able to demonstrate statistical significance for estimates of the heritability of FA in Drosophila wing lengths that were less than 0.01.
The general finding of low heritabilities for FA [10,13] led Leamy and Klingenberg [6] to suggest that nonadditive epistatic rather than additive effects might be major contributors to the genetic variance in DI. This possibility also is consistent with the coadaptation hypothesis for the origin of FA in which selection ultimately results in harmonious (epistatic) interactions of genes that increase developmental stability [14][15][16][17]. Generally these coadapted complexes are difficult to detect until they are broken down as is thought to occur in hybrids formed from different species or subspecies that tend to show greater FA than their parents [14][15][16][17][18]. If it is generally found that FA does have an epistatic genetic basis, this has several implications for its evolution and use in studies comparing DI in stressed versus non-stressed populations [6].
Although detection of epistasis especially for FA is statistically and experimentally difficult [19], some progress has been made in recent years using a quantitative trait locus (QTL) approach [6]. Leamy et al. [20] used an F2 population of mice generated from an intercross of the Large (LG/J) and Small (SM/J) inbred strains and found an abundance of QTL interactions affecting FA in mandible size in spite of there being very little evidence for single-locus QTLs affecting this trait [21,22]. Mandible size and shape, but not their asymmetries, were analyzed in a much larger sample of mice from an independent (F3) intercross of these strains , and therefore this population afforded an opportunity to conduct a more intensive search with greater statistical power to detect single-and two-locus (epistatic) effects on FA not only in mandible size, but also in mandible shape. In addition, this previous QTL analysis of mandible FA used a physiological epistasis model [20], and it was important to discover whether our use of the more conventional orthogonal model (see below) would yield similar results. Here we performed a search using this F3 mouse population to discover whether we might again find an epistatic genetic basis for FA. If found, we were interested to see whether the epistatic combinations would replicate those from the F2 generation, and whether the locations of the FA QTLs might suggest potential candidate genes for developmental stability.

The Mouse Population
We used mice from the F3 generation produced from an original intercross of the Large (LG/J) and Small (SM/J) inbred strains [23][24][25]. Approximately 40% of the F3 mice were cross-fostered by reciprocally exchanging half of the pups from pairs of litters born on the same day [26]. A total of 171 litters were involved in these exchanges. All mice were sacrificed at 70 days of age or after having reared their offspring to three weeks of age. After sacrifice, DNA was extracted and used to genotype 353 polymorphic autosomal SNP markers in all available F3 mice via the Illumina Golden-Gate assay [25]. These markers effectively covered all 19 autosomes with an average interval between markers of 4-5 cM except for several regions in the genome where there was little polymorphism between the LG/J and SM/J strains [27].
Skeletons were prepared by exposure to dermestid beetles and left and right sides of the mandibles in each mouse were separated at the mandibular symphysis, placed under a microscope, and their images scanned into a computer. From these images, we recorded x and y coordinates for 15 landmarks located around the outline of the mandible [23]. After all scans were digitized, we performed a second round of digitization using the same images to assess the magnitude of measurement error. Some mice had broken mandibles on one or both sides and thus were unusable, and we also eliminated some outliers and all mice for which genotyping was not accomplished. Replicate measures for both left and right mandibles were available for a total of 1233 F3 mice.

Mandible Size and Shape
To estimate FA in the mandibles, we first calculated mandible size and shape traits using a Procrustes superimposition technique on the 15 landmark points for both left and right mandibles [28]. The overall measure of mandible size, centroid size, was obtained from the square root of the sum of the squared distances between each landmark and the center of gravity (mean x and y coordinates) for the entire configuration. Thirty (15x and 15y) mandible shape variables were generated in the Procrustes procedure from the original x, y coordinates by a series of steps that eliminated effects of size, location, orientation and reflection [28] in the mandibles.
Once mandible size and shape traits were obtained, we assessed the magnitude of measurement error and tested to see whether significant FA was present. This was done with a conventional mixed model two-way analysis of variance (ANOVA) for mandible centroid size [29,30] and a two-way Procrustes ANOVA for the shape variables [28]. Sides (fixed) and individuals (random) were the two factors in these analyses, with the side-by-individual interaction term reflecting FA and the residual reflecting measurement error [30]. Variance components also were calculated for the three random factors (individuals, side-by-individual, and error), and each was expressed as a percentage of the total to assess their relative contributions.

Mandible Size and Shape Asymmetry
To calculate measures of mandible size and shape FA, we made use of the mean of the two replicates of left and right sides for both centroid size and for each of the 30 shape variables. This reduced the measurement error contribution to the total phenotypic variation by one-half and thus promoted more precise FA measures.
For mandible size FA, we first calculated signed, right minus left differences of centroid size for each mouse, and subtracted the overall mean of these differences from each of the signed differences to set their mean to 0 and thus adjust for directional asymmetry present [30]. The absolute values of these adjusted differences then provided measures of FA of centroid side (FASIZE). The distribution of these absolute differences between sides was half-normal as expected [30], so we transformed these values to be consistent with the approach of Leamy et al. [20] by taking them to the 0.44 power, and this was effective in promoting normality (p = 0.10 in a Kolmorogov-Smirnov normality test). We also tested for the potential effects of both sex and litter size on the transformed FASIZE values, but neither of these variables reached statistical significance (p > 0.05). Finally, we regressed the FASIZE values on centroid size (mean of left and right sides), but these results also were non-significant, so no adjustment was made for potential scaling effects.
We calculated a single measure of mandible shape FA with a procedure suggested by Klingenberg and Monteiro [31]. This was accomplished by first obtaining the signed differences of left and right sides for all 30 shape variables and adjusting them for any directional asymmetry in the same manner as described for centroid size. A multivariate regression of these signed differences on the signed differences for mandibular centroid size was significant (p < 0.01), so we adjusted each variable for this allometric association by obtaining the residuals from this regression. A principal components analysis then was run on the covariance matrix of these residuals, and standardized component scores were calculated for the first 26 components (the last 4 components had zero eigenvalues) by dividing each score by the square root of its appropriate eigenvalue. Klingenberg and Monteiro [31] provide a full justification for the origin and use of this measure of shape FA. The log of the square root of the sums of squares of these standardized PC scores, taken to promote normality, provided our measure of FA of mandibular shape (FASHAPE) for each individual mouse.

Single-Locus QTL Analyses
We tested for single-locus QTL effects separately for FASIZE and for FASHAPE with the Haley-Knott interval mapping option in R v. 3.01 [32] using the QTLRel package [33,34]. We used this program because it adjusted for the structured relatedness among individuals in families in our advanced intercross population by calculating condensed identity coefficients [35] from the pedigree information we provided. This program also imputed 223 additional genotypic index values between markers, resulting in a total of 576 sites with an average intermarker distance of 2.44 cM. At each of these sites, QTLRel fit a mixed linear model to evaluate potential additive and dominance fixed effects on the FA traits. The model included a random classification factor (nurse) to adjust for any non-genetic post-natal maternal effects that can arise from the common environment shared by family members with the same nurse mother [36,37]. With this model, the QTLRel program produced likelihood ratio values at each of the 576 sites throughout the genome that were converted into logarithm of odds (LOD) scores.
We used a permutation method [38] in QTLRel to establish threshold levels of significance for the LOD scores generated for the two FA traits. This method shuffled the genotypic data 1000 times, and recorded the highest LOD scores from the analyses run on each sample. We used the 95th percentile value among these 1000 LOD scores as the 5% experimentwise significance threshold (3.79 for FASIZE, 3.71 for FASHAPE). In addition, the 95th percentile values for the highest LOD score on each of the 19 autosomes were used as the 5% chromosomewise thresholds. These thresholds were very similar for FASIZE (mean = 2.33; range 2.17-2.52) and FASHAPE (mean = 2.34; range 2.11-2.54).
We considered potential QTLs to be present at the sites of all LOD scores on each chromosome that reached the chromosomewise threshold levels of significance. Confidence intervals for each QTL were defined by 1.5 LOD drops on each side of the peak position. QTLRel estimated additive (a) and dominance genotypic values (d) at each QTL site by partial regressions, and tested them for significance (p < 0.05) with individual t-tests. The additive genotypic value is defined as one-half the difference between the values for the two homozygotes and the dominance genotypic value is defined as the difference between the mid-homozygous and the heterozygous values [39]. We standardized the a and d estimates by dividing them by the standard deviation of the FA trait (FASIZE or FASHAPE) to facilitate comparison among QTLs. QTLRel also estimated the percentage of the total phenotypic variation for the FA traits explained by each QTL.

Epistasis Analysis
We also conducted scans for each chromosomal pair to test for potential effects of epistasis on FASIZE and FASHAPE. To accomplish this, we first converted the genotypic probabilities generated by QTLRel at each of the 576 sites to orthogonal additive and dominance genotypic index values [40]. Then at each pair of sites, we used the MIXED procedure in SAS (software version 9.2; SAS Institute, Cary, NC, USA) to fit the following full model: Here y is the trait, and the fixed variables are the additive (Xa1 and Xa2) and dominance genotypic index values (Xd1 and Xd2) and the Xa1Xa2, Xa1Xd2, Xd1Xa2, and Xd1Xd2 terms that represent their pairwise epistatic products. The model included dam and nurse as random classification factors to adjust for any non-genetic pre-and post-natal maternal effects. These analyses generated −2ln likelihood values that we subtracted from the comparable values in a reduced model that did not include the four interaction terms. We evaluated the difference between these two values via a chi-square test with 4 degrees of freedom, and transformed its associated probability into a LOD score = log10(1/probability).
To visualize the epistatic patterns for FASIZE and FASHAPE generated in these tests, we plotted the pairs of positions associated with probabilities of 5% or less (LOD scores of 1.3 or greater). For each pair of chromosomes, the two positions with the lowest probability within clusters of probabilities were taken to represent a potential epistatic combination. Where these probability clusters clearly were separated, each cluster was viewed as a potential separate instance of epistasis. For both FASIZE and FASHAPE, we counted the number of clusters for which the lowest probability was less than 1%.
To assess the many hundreds of epistasis tests conducted for each of the two FA traits, it first was necessary to calculate the number of these tests expected to reach significance because of chance alone. We therefore estimated the effective number of independent markers on each chromosome from the method proposed by Li and Ji [41]. We did not adjust for family relatedness because preliminary genetic analyses showed that family differences did not significantly affect either FASIZE or FASHAPE. The effective numbers varied from 5 to 11, all being less than the actual number of markers on each chromosome because of linkage disequilibrium among loci [42]. The total number of independent epistasis tests then was estimated by the sum of the cross products of the effective number of markers for all 171 pairs of chromosomes. This sum was 8739, suggesting that by chance alone, we might expect about 87.4 tests to reach significance at the 1% level (LOD = 2.0), 8.7 at the 0.1% level (LOD = 3.0), and 0.87 at the 0.01% level (LOD = 4.0). Epistasis therefore was indicated if the number of pair-wise QTL combinations reaching these probability levels significantly exceeded the appropriate numbers (using chi-square tests each with 1 d.f.).
We also used the expected numbers of independent tests of epistasis to assess individual instances of epistasis. Specifically, epistasis was considered to be significant at the experimentwise level when a given probability from the F test reached the 0.05 Bonferroni threshold level of significance [20,43], 0.05/8739 (LOD = 5.24). A suggestive threshold (5%) also was obtained by dividing 0.63 [44] by 8739, generating a LOD score of 4.14. For epistasis scans involving many markers, however, Holland [45] has argued that a more appropriate suggestive threshold level is 0.05 divided by the number of chromosome pairs. For the 171 pairs of chromosomes, this threshold would be 0.05/171 = 3.534. We compromised and considered any interaction reaching the 0.01% level (LOD = 4.0) as suggestive of epistasis.
For any epistatic combinations reaching this 0.01% significance level, we estimated the four orthogonal epistatic components (aa, ad, da, dd) from regression coefficients and tested them for significance (p < 0.05) in t-tests. The additive by additive (aa) mode of epistasis occurs when the single-locus additive genotypic value at a given locus (A) differs depending on the genotype at another locus (B) and vice versa. The additive by dominance (ad) mode of epistasis occurs when the single-locus additive genotypic value for a locus A differs depending on the genotype at another locus B whereas the single-locus dominance genotypic value at B differs depending on the locus A genotype (and vice versa for da epistasis). Dominance by dominance (dd) epistasis occurs when the single-locus dominance genotypic value at locus A differs depending on the genotype at locus B and vice versa.

Results
The results of the ANOVA of mandible size and shape are summarized in Table 1. Both traits show highly significant differences among individuals as well between left and right sides, suggesting the presence of directional asymmetry. More importantly, the significant individuals by side interactions for both traits indicate that significant FA is present as well. The FA contribution to the total variation is less than 3% for mandible size but is much higher (31.8%) for mandible shape. Measurement error is small (less than 1%) for mandible size but amounts to 8.5% of the total variation of mandible shape. The results of the single-locus QTL analysis of FASIZE and FASHAPE are given in Table 2. For FASIZE, only one QTL on chromosome 13 had a LOD score that reached the chromosomewise level of significance, which is approximately what is expected by random chance. Two QTLs on chromosome 7 and 17 were found for FASHAPE, both with significant additive and especially dominance effects. The QTL on chromosome 7 has greater statistical support since its LOD score of 3.53 approaches the threshold (3.71) for experimentwise significance. Table 2. QTLs affecting FA in mandible size and shape. Shown are the locations in mega base pairs (Mb) on each chromosome (Chr) and the confidence intervals (CI) for those QTLs with logarithm of odds (LOD) scores that reached chromosomewise significance for FA in size and shape. Also given for each QTL are its additive (a) and dominance (d) genotypic values and percentage contribution (%) to the total variance of the FA trait. * = p < 0.05; ** = p < 0.01. For FASIZE, the epistasis analyses produced 184 interactions that had LOD scores of 2.0 or higher (p < 0.01), and this was significantly different (p < 0.0001) from the 87.4 expected by chance alone. A total of 34 epistatic interactions reached the 0.001 (LOD > 3.0) probability level, however, and this also was significantly different (p < 0.0001) from the 8.7 expected by chance. A similar trend was seen for FASHAPE, with 193 epistatic interactions with LOD > 2.0 (p < 0.0001), and 31 instances of epistasis at the 0.001 probability level (p < 0.0001). At both the 0.01 and 0.001 probability levels, therefore, there is evidence for significant epistasis affecting both FA traits.
Four individual epistatic combinations affecting FASIZE, and three affecting FASHAPE, had LOD scores reaching the 0.0001 level of significance (Table 3). For FASIZE, the QTLs involved in these interactions reside on 7 different chromosomes, including two sites on chromosomes 6. Those on chromosomes 17 and 18 generate an interaction with the highest LOD score. Ten epistatic components, especially da and dd types, are statistically significant. Multiple regression of FASIZE on these ten components (adjusting for nurse and dam effects) explained 6.05% of the total variation. The QTLs involved in the three interactions affecting FASHAPE are on six different chromosomes at different locations than the epistatic QTLs for FASIZE. These interactions affecting FASHAPE also show a different pattern, with eight components significant, including all three aa and ad effects. A multiple regression model with these 8 components explained 4.1% of the total variation in FASHAPE. Table 3. Epistatic combinations for mandible FA with LOD scores reaching the 0.0001 probability level. The locations of each epistatic combination are shown in terms of their sites (Mb) on each pair of chromosomes (Chr). Also shown are the LOD scores for these combinations and the four orthogonal epistatic components. * = p < 0.05; ** = p < 0.01.  Figure 1A illustrates how an epistatic interaction involving QTLs on chromosomes 17 and 18 affect FASIZE. The three lines connect the FASIZE means for the chromosome 18 genotypes (SS, LS, and LL) at each of the genotypic means for the chromosome 17 genotypes. With no epistasis, we would expect these lines to be roughly parallel which they are not. Note that mice with the SS genotype for the QTL on chromosome 17 exhibit greater asymmetry with heterozygotes compared to homozygotes for the QTL on chromosome 18 whereas the reverse is true for chromosome 17 LL homozygotes. A different epistatic pattern is shown in Figure 1B for QTLs on chromosome 12 and 18 affecting FASHAPE. The most noticeable feature is that LL homozygotes for the chromosome 18 QTL exhibit the highest and lowest levels of asymmetry, respectively, when combined with LL and SS homozygotes at the chromosome 12 locus.

Evidence for Epistasis
The results of this study support the earlier findings from an independent study [20] that demonstrated that the genetic architecture of FA is dominated by epistasis. The evidence for this was seen in the significant excess of epistatic combinations affecting FASIZE at both the 0.01 and 0.001 probability levels beyond the number expected from chance alone. At the 0.001 level, there were nearly four times the numbers of observed epistatic interactions (34) as expected (8.7). This corresponds to a false discovery rate (FDR; [46] of 25.7%; which suggests that nearly three-quarters of these interactions represent true instances of epistasis. This finding provides strong support for the notion that epistatic interactions contribute a source of genetic variation to FA. Leamy et al. [20] previously found fewer (30) epistatic combinations affecting FA of mandible size in the F2 mice that reached the 0.001 probability level; but only 2 (1.85) were expected by chance, resulting in a much lower FDR of 6.2%. Far fewer (75) molecular markers were used in that study compared to the 353 we used here, however, and this resulted in a lower estimate for the number of independent epistasis tests (1850) and interactions expected at each of the probability levels [20].
Our estimate of the number of independent epistasis tests for FASIZE (8739) generated a high threshold for experimentwise significance (LOD = 5.24), and thus it was not surprising that no LOD scores for the epistatic interactions reached this level. In previous studies with the F2 generation mice, several epistatic combinations affecting FA in both mandible size [20] and tooth size [47] exceeded the 5% experimentwise significance level of 5.41 × 10 −5 (LOD = 4.27). On the other hand, testing for epistasis in the F2 mice was done using a physiological epistasis model [48] which was criticized by Zeng et al. [49]. We used an orthogonal epistasis approach, so our results are not strictly comparable to those for the F2 mice. Other studies using the orthogonal approach generally have detected some epistatic combinations for various quantitative traits with LOD scores reaching experimentwise significance [50,51], although this was not the case for epistasis affecting three physical activity traits measured in mice from an F2 intercross (C57L/J × C3H/HeJ) mouse population [52].
We also discovered epistasis affecting FASHAPE, although again, no epistatic effects for FASHAPE were found with LOD scores reaching the experimentwise threshold. A significant excess of epistatic combinations affecting FASHAPE occurred at the 0.001 probability level, but the observed number of epistatic interactions (31) was somewhat lower than that for FASIZE, generating a slightly higher FDR of 28.2%. This may partly be a consequence of the greater difficulty in measuring mandible shape compared to mandible size. Our calculated measurement error for mandible shape of 8.5% is over 9 times higher than that of 0.9% for mandible size whereas measurement error for mandible shape estimated by Klingenberg et al. [22] in the F2 mice was 4.9%, 6.4 times that for size. Leamy et al. [20] did not analyze FA of mandible shape in their previous study, but did find comparable levels of epistatic effects for both FA of size and FA of shape in molars in the same F2 mouse population. In that study, the level of measurement error for molar shape (22%) was quite high, but was only 2.5 times that (8.9%) for molar size [47].

Genetic Impact on Mandible FA
Although epistasis significantly affected FA of mandible size, assessing its true impact is problematic. This is because it is difficult to sort out true versus false instances of epistasis especially given that none of the interactions reached experimentwise significance. A multiple regression model suggested that the four epistatic combinations shown in Table 3 affecting FASIZE account for about 6% of its total variation, but at the 0.0001 probability level, one of these four is expected to be a false positive. At the 0.001 level, the significant epistatic components account for 27.8% of the total variation in FASIZE. Given that 74.3% of these epistatic interactions are expected to be true positives, this suggests that epistasis accounts for about 20.6% of the FASIZE variation. Although we cannot know the actual impact of epistasis on FASIZE in the F3 mice, it does appear that it is comparable to that previously found for FA of mandible size [20] or molar size [47] in the F2 mice.
The impact of epistasis on FASIZE in the F3 mice clearly was greater than that generated from single-locus effects. Thus we discovered only one QTL affecting FASIZE that accounted for less than 1% of the variation. This QTL also is not well supported statistically since it only reached the suggestive threshold for which one false positive result is expected from chance alone. This is in contrast to the 14 QTLs we [23] found for centroid size (mean of the two sides) itself in the F3 mice. Klingenberg et al. [22] also found only a single QTL affecting FA in mandible size in the F2 mice, although its contribution to the variance was somewhat higher (2.4%). Leamy et al. [53] found no QTLs for FA of mandible size in a backcross population of mice. These results also are consistent with the very low, non-significant heritabilities estimated by Leamy [54] for FA in 10 different mandible dimensions in a randombred mouse population.
We discovered two QTLs for FASHAPE, but their joint contribution (from a multiple regression procedure) was very small, less than 1% (0.83%). Thus there also appears to be little additive genetic variance for FA in shape. Both QTLs did exhibit significant dominance, however, and this fits well with previous predictions of a non-additive (including dominance) genetic basis for FA [6]. The epistasis contribution to FASHAPE (0.001 level), adjusted for the proportion of expected true positives, is about 19.3%. Thus the impact of epistasis on FASHAPE is considerably higher than that for single-locus effects. Leamy et al. [47] also found that epistatic effects contributed far more than single-locus effects to the variation in FA of molar shape in the F2 mice. Clearly there is very little additive genetic variation for FA in these mandible traits, and this continues to be the general expectation for FA in a number of other traits as well [10,47]. On the other hand, nonlinear developmental models for the origin of developmental instability suggest that nonadditive genetic effects, including epistasis, are expected to be an important part of the genetic architecture of FA [6].

Single-Locus and Epistatic Replication
The locations of the single-locus and two-locus (epistatic) QTLs affecting FASIZE in the F3 mice did not replicate those found in the F2 mice. Our only QTL affecting FASIZE was on chromosome 13 whereas the one QTL found by Klingenberg et al. [22] in the F2 mice was on chromosome 8. And none of the 10 epistatic interactions affecting FASIZE in the F2 mice that reached significance at the 0.1 experimentwise level [20] matched the positions for the QTL pairs in these interactions we found in the F3 mice. In addition, only one of these 10 interactions also matched an epistatic interaction in the F3 mice that reached the nominal 5% level of significance. This low level of replication is not surprising, however, because the power to detect individual loci is low, especially given the small proportion of variation they explain and the very high significance threshold even in an optimal experimental design with a large sample size as we have used here. There may well be a number of loci underlying variation in the FA traits, and if so, we would expect to uncover different loci in separate experiments [55]. Even slight changes in the frequencies of these loci from the F2 to the F3 generation also may have contributed to the lack of replication of the epistatic interactions. FA in mandible shape was not previously analyzed in the F2 mice, so no direct replication test for FASHAPE is possible. We can compare both single-locus and epistatic interactions affecting mandible FASHAPE that we found (Tables 2 and 3) to those for molar FASHAPE in the F2 mice [47], however, and again these comparisons showed no congruence. This result especially for epistasis is not unexpected given that Wolf et al. [56] tested for epistatic pleiotropy in this same F2 mouse population and found no evidence for this among three early-developing cranial traits and a limited number of common epistatic combinations affecting late-developing skull traits. The early-and late-developing skull traits also were positively and moderately correlated phenotypically (mean r = +0.47, +0.47) and genetically (mean r = +0.66, +0.69) whereas the correlation between FASIZE and FASHAPE is a rather low +0.09.

Nature of FA Genes
There have been two schools of thought regarding the nature of the genes affecting developmental stability. One school has suggested that genes for DS as assessed by FA in a given trait most likely are among those affecting the trait itself during its development [57]. An excellent example of this is provided by the Notch locus in Drosophila that is involved in bristle formation. Various Notch mutants are known that produce elevated asymmetry in bristle number, but only in the specific bristle types that they normally affect [58]. On the other hand, Takahashi et al. [59] found 92 deficiencies in the Drosophila melanogaster genome that affected FA in one or more morphological traits they measured, and some, but not all of these, also affected the mean of the traits themselves. For mouse mandibles, Leamy et al. [54] also found that many, but not all, of the QTLs for 10 different size dimensions mapped in the same regions as those for FA in these dimensions. Thus some genes affecting the size or the shape of the mandible might represent reasonable candidates for the single-locus or epistatic QTLs for FA. We used the Mouse Genome Informatics (MGI) website [60], and found 11 genes affecting the mandible that map to similar locations as some of our FA QTLs (Table 4). In addition, we found 10 QTLs previously mapped for centroid size (CENT) and shape (SH) in this same F3 mouse population [23] that colocalize with the FA QTLs, and have listed these in Table 4 as well.
It has also been suggested that there are general buffering genes that when altered result in higher levels of DI (increased FA or phenotypic variation) among a number of different traits [61]. Debat et al. [62], for example, found that overexpression of CycG (Cyclin G) dramatically increased FA in wing size and shape in their population of Drosophila melanogaster. Thus this gene may be a key player in the genetic network regulating developmental stability [62]. The orthologous gene in mice, Ccng1 (cyclin G1), occurs on chromosome 11 at 40.7 Mb, and none of the FA QTLs that we found (Table 4) are located on this chromosome. However, Ccng1 belongs to a class of genes producing cyclins, proteins involved in the cell cycle, and it seemed worthwhile to see if any of the genes in this family colocalized with our FA QTLs. We found 86 mouse cyclin genes in the MGI website [60], 6 of which mapped to within 5 Mb of the 14 epistatic QTLs in Table 4. Using this 5 Mb criterion and taking into account any QTLs less than 5 Mb from the ends of the chromosomes, the 14 epistatic QTLs comprise 127 Mb, or 5.4% of the 2354 Mb total in the genome. We might therefore expect 5.4% × 86 = 5.4 of these cyclin genes to map close to our epistatic QTLs by chance alone, and the six we found do not deviate significantly from this expectation. Another possibility for general buffering genes are those that code for heat shock proteins, molecular chaperones that bind to other proteins such as transcription factors and kinases to enable them to function properly [63]. When organisms are stressed or heat shock genes are mutated, it is expected that protein functioning will be hindered and that this will result in increased instability [64]. In fact this has been demonstrated in flies ( [65,66]; but see [67,68]), Arabidopsis [69] and in zebrafish [70]. It therefore is possible that one or more heat shock genes may be candidates for some of the QTLs we have discovered for FASIZE or FASHAPE. Using the MGI website, we found 62 mouse heat shock genes, 5 that colocalize with our FA QTLs where only 5.4% × 62 = 3.35 are expected by chance alone. These 5 heat shock genes we found do not differ significantly from random expectations (p = 0.35, chi-square test), although we list these in Table 4 because of the literature suggesting heat shock genes influence FA in some cases. In addition, it may be that only one or two heat shock genes regulate developmental stability, notably hsp90 [65,66], and it is suggestive that in fact this is one (Hsp90aa1) of the five genes we found that maps close to an FA QTL (Table 4).
If we presume that heat shock genes can alter FA in mandible size or shape, it may be that this occurs through their interactions with the structural and/or regulatory genes controlling mandible size/shape. Using recombinant inbred lines of Arabidopsis, Sangster et al. [69] found that QTLs controlling hypocotyl length colocalized with those for developmental stability in hypocotyl length. What was most interesting, however, is that these QTLs were uncovered only after the seedlings were treated with a heat shock protein (HSP90) inhibitor. This suggests the presence of HSP90-responsive QTLs that can affect both trait means and DS but that would normally be hidden unless HSP90 is altered [69]. On the other hand, Sangster et al. [69] also found some QTLs that affected DS and not trait means, suggesting that there may be genes whose sole function is to stabilize phenotypes.

Conclusions
Our analysis demonstrated that the genetic basis for FA in the mandible traits is primarily epistatic, with little evidence for single-locus effects. This result is consistent with all other genetic studies on mandible FA in mouse populations [20,53,54,71], and suggests that mandibles are highly canalized structures. If so, a more productive approach for uncovering genetic variation of FA in traits such as mandibles might be to conduct a QTL study in mice subjected to some sort of stress. Perhaps this would uncover additional single-locus or epistatic QTLs for FA whose locations would reveal underlying genes for DS. Our results suggest that logical candidate genes for DS are those controlling the traits for which FA is measured, but studies by Sangster et al. [69] and others also point to a potential role for heat shock genes in mediating the effects of these trait genes. With additional studies, we may eventually discover the nature of DS genes and how they function in promoting stability and in influencing the level of FA in populations, especially those subjected to stress.