A Tool for Sheep Product Quality: Custom Microarrays from Public Databases

Milk and dairy products are an essential food and an economic resource in many countries. Milk component synthesis and secretion by the mammary gland involve expression of a large number of genes whose nutritional regulation remains poorly defined. The purpose of this study was to gain an understanding of the genomic influence on milk quality and synthesis by comparing two sheep breeds with different milking attitude (Sarda and Gentile di Puglia) using sheep-specific microarray technology. From sheep ESTs deposited at NCBI, we have generated the first annotated microarray developed for sheep with a coverage of most of the genome.


Introduction
Milk and dairy products are an essential food and economic resource in many countries. Milk provides the primary source of nourishment for mammals' offspring before their adult diet and contains the principal nutrients plus a huge number of micronutrient molecules, some of them with still unknown properties [1,2]. Therefore, the quality of milk and its control is becoming increasingly important. Milk component synthesis and secretion by the mammary gland varies dramatically across species and involves the expression of a large number of genes whose nutritional regulation remains poorly defined [3]. Nutritional genomics is an integrated science which studies gene expression to identify genetic and nutritional effects of a diet (the nutrient influence) on a single individual; while nutrigenetics seeks to understand the individual genetic differences which affect response to diet [4]. Knowledge of mammary uptake of nutrients, biosynthesis pathways, and the relation between diet and milk composition have been achieved in many studies [5][6][7][8]. Although much is known about the biochemistry of milk synthesis, the regulatory and cellular signaling systems of mammary gland are not well understood [9]. In dairy animals, mammary gland undergoes huge functional and metabolic adaptation to prepare lactogenesis. In all mammals, lactogenesis is characterized by two stages [10][11][12][13]. During the first stage (stage 01), which starts few weeks before parturition, the mammary gland differentiates for secreting colostrum and milk proteins. After parturition (stage 02), the metabolic activity increases the levels of milk production. Milk yield significantly rises during the first few weeks of lactation. During this period a well-studied set of genes, involved in milk synthesis, also increases its expression [12,[14][15][16]. After the lactation peak, milk synthesis and qualified gene expression gradually decrease [14,15]. The end of milking activates the involution of the mammary gland which is characterized by epithelial cell death and by the mammary adipose tissue remodeling [17,18]. In dairy animals the nonlacting period, commonly referred to as the dry period between two lactations, is very important for milk production. A dry period of 40-60 days is necessary for optimal milk production during the next lactation [19].
In Italy, sheep is the second species in economic importance as a milk supply. Milk yield and composition, as well as lactation length, can fluctuate between breeds and within breeds. In normal sheep milk, fat ranges from 6% to 9%, protein from 4% to 7%, total solids from 17% to 21% and lactose from 4% to 6% [20]. Also other milk components implicated in human health vary considerably among breeds. Recently, Signorelli and collaborators [21] analyzed milk quality parameters and milk fatty acid profiles of three Italian breeds, Altamurana, Gentile di Puglia and Sarda, finding significant differences between breeds. The lowest content of saturated fatty acids (SFAs) was estimated in Gentile di Puglia breed, while mono-unsaturated FAs (MUFAs) were lowest in the Altamurana. No differences between breeds were found for conjugated linoleic acid (CLA) and poly-unsaturated FAs (PUFAs). Cheese quality is expected to be influenced by the differences between breeds in milk fatty acid contents [21].
The comparative analysis of some sheep breeds with different attitude to milk production could demonstrate the association between genetic variants and milk quality [21]. Candidate genes responsible for milk composition were intensively analyzed by Moioli and collaborators [22] to identify the molecular mechanisms underlying sheep milk quality. Among milk protein genes, the major effects were assessed for the αs1-casein, k-casein, β-lactoglobulin. Other important genes are those implicated in fatty acid metabolism, such as ACACA, SCD, LPL and DGAT1 [22]. However, in order to improve the overall picture, many more genes need to be deeply investigated.
Microarray technology is a powerful tool that helps to explore an organism transcriptome by measuring, in a particular cell or tissue, the expression levels of thousands of genes simultaneously. In livestock species, the microarray technology was discussed and reviewed as potential nutrigenomics tools, in the context of its economic benefits and improvement of food quality and safety in dairy and meat industries [23][24][25]. However, microarrays have been designed for very few livestock species. Moreover, the few devices so far developed, feature a largely incomplete coverage of the genome [26].
The objective of this study was to evaluate temporal changes in mammary gene network expression profiles by comparing two sheep breeds with different milking attitude. Gentile di Puglia (or Merino di Puglia, Pugliese Migliorata, Merino d'Italia, Merino Gentile) is a fine wooled breed from southern Italy. Development of this breed began in the 15th century, but the primary improvement began from the 18th century onwards. The breed was developed by crossing Spanish Merino with the local breeds. Today the selection objective of Gentile is focused onto meat production. Sarda is an Italian breed with high usefullness in milk production. It is widespread, mostly in Sardinia and in Central Italy, and representing 40% of the Italian ovine population.
In this study, we used a sheep-specific microarray chip technology covering most of the species' transcriptome, representing the first annotated microarray developed for sheep with a covering of 50% of the genome [26]. The chip was generated from sheep ESTs deposited at NCBI and carries 21,743 non-redundant features in quadruplicate, 73.4% of which are fully annotated and corresponding to 10,190 genes. We analyzed the mammary transcriptome using biopsies from individuals of Gentile di Puglia and Sarda at two lactation stages to assess the differences between breeds, with the aim to identify genes controlling milk composition and their metabolic pathways.

Results and Discussion
We succefully hybridized eight microarray slides (four slides per lactation stage). Since every spot was replicated four times, for each lactation stage we performed 16 gene replicates (see Table 1).
In wet lab experiments 213 genes resulted differentially expressed between the two breeds at stages 01 ( We performed an analysis to show the most represented KEGG pathways among the differentially expressed genes (Figures 1, 2), in order to identify molecular differences in milk synthesis between breeds and to identify genes controlling milk production and correlated metabolic pathways.  At stage 01, among the 143 upregulated genes in Sarda, we could recognize caseins αS2, β and K. In addition to milk protein genes, we identified genes involved in processes linked to both lactation and mammary involution, as oxidative metabolism, apoptosis, cell cycle control, oncogenes, ubiquitination pathway and cell communication (focal adhesion adherens junction), endocrine system (insulin signaling pathway, adipocytokine signaling pathway) [27]. The KEGG pathways (like amino acid and carbohydrate metabolism, glycan biosynthesis, cell communication, cell growth and death and the immune system) were significantly (p < 0.05) enriched. The molecular events underlying mammary development during pregnancy, lactation and involution are incompletely understood. The processes of lactation include the development of mammary tissue, as well as the synthesis and secretion of milk. After parturition, the proliferation and differentiation of mammary secretory cells lead to an increase in milk secretion, whereas after lactation peak, milk production declines largely because of apoptotic mammary cell death, which exceeds cell proliferation. The development of mammary gland is spatially regulated by the communication of the mammary epithelium with the extracellular matrix (ECM) through a family of adhesion receptors called integrins. Integrins, in response to both hormones and growth factors, support cells in proliferation, accurate morphological organisation, as well as in milk secretion. Cell adhesion to the ECM plays a key role in alveolar survival, morphogenesis and function [28]. In this context, we could observe a significant difference, between the two breeds, in expression of genes involved in extracellular matrix formation and cell adhesion (TJP1 upregulated in Gentile, CDH5 and TNXB upregulated in Sarda). Remarkably, during stage 01, the expression of the oncogene VAV3 is higher in Gentile, while one of the initiators of apoptosis CFLAR (CASP8 and FADD-like apoptosis regulator) increases in Sarda. Apoptosis, in fact, occurs during involution of mammary gland in cattle [29,30], and an overexpression of many apoptosis-related genes during lactation was recently reported [31]. At stage 01 (early lactation), we found a differential expression of genes, like USP9X, involved in the ubiquitination pathway in Sarda. The protein ubiquitination pathway is the most significantly enriched pathway during both lactation and involution [32]. Another category of genes found differentially expressed between the two breeds encompasses genes involved in oxidoreductase activity, like cytochrome C oxidase, NADH dehydrogenase and ferritin. The activity of cytochrome C oxidase was found to increase from late pregnancy to the first days of lactation [33,34]. The overall expansion of oxidative metabolism is a response to the increased energy demands of the lactation period. At stage 01 we observed an upregulation of cytochrome C oxidase, NADH dehydrogenase and ferritin in Gentile di Puglia. This may reflect the different lactation persistence, which is lower in Gentile di Puglia (60-150 days) as compared to Sarda (210 days).  Table   60% 30% 10%

Cellular Processes Environmental Information Processing Metabolism
At stage 02, some interesting genes as those encoding casein K, and proteins involved in oxidoreductase activity (like TGOLN2 and FTH1) and in ECM-interaction (like COL1A2), resulted overexpressed in Sarda. Finally, we can observe in Sarda an overexpression of genes implicated in lipolysis, like lipase (DAGLB) and phospholipase (PLD3). Several studies on different kind of cheeses have demonstrated that the fatty acid (FA) profile of raw milk influences cheese characteristics [22]. Lipolysis is particularly important in sheep cheeses due to the high fat content and lipase activity [35]. In this perspective, we like to stress that in the World the main output of sheep husbandry is cheese making.

Animals and Sampling
Whole mammary gland tissue samples were collected from four lactating individuals of two sheep (Ovis aries) breeds, Gentile di Puglia and Sarda (Figure 3). Lactating mammary tissue were taken at two lactation stages (first record, stage 01: 6 days after lambing; second record, stage 02: 44 days after lambing) in both breeds. Tissues from mammary gland were immersed in RNAlater (Sigma) and stored at −20 °C.

RNA Extraction
Tissues were subjected to RNA extraction with ice-cold TRIzol (Invitrogen) and using RNeasy Midi Kit columns (Qiagen). RNA integrity was assessed by electrophoretic analysis of 28S and 18S rRNA subunits. The purity of RNA and preliminary concentration were assessed with a spectrophotometer (GeneQuantpro). A260/A280 ratio was >1.9.

RNA Amplification and Labeling
RNA was quantified using a DTX fluorimeter (Beckman Coulter) using the Quant-iT kit (Invitrogen). Aliquots of 1 μg were amplified and Cy3/Cy5 labeled using the Kreatech Diagnostics kit.

Microarray Study Design and Hybridization
We designed an oligonucleotide chip from sheep ESTs deposited at NCBI. The oligonucleotide microarray platform is electrochemically synthesized and contains 21,743 non-redundant features in quadruplicate, 73.4% of which are fully annotated corresponding to 10,190 genes. A genome assembly for Ovis aries is not yet available, but considering the number of genes in the bovine genome (22,000), we estimate to have a 50% coverage of the sheep genome. We achieved very good technical outcomes, as reproducible patterns of differentially expressed genes (in each slide, replicates show a coefficient of variation <0.25 for differentially expressed genes with P < 0.01) [26]. Oligos were generated in situ on the chip using the Combimatrix (Seattle, WA, USA) equipment. Platform and microarray data have been deposited in the NCBI GEO database (Platform Accession GPL9461; Series Accession GSE18619).
The labeled aRNA was fragmented into 35-200-base fragments and then hybridized onto the slide according to Combimatrix's instructions. Hybridization was performed overnight at 50 °C. After hybridization, arrays were washed and scanned with a ScanArray Lite (Perkin Elmer) laser scanner. Microarray Imager 5.9.3 software was used to extract feature data from microarray fluorescence images.
The microarray study was designed as described in Table 1. Separate microarrays were used for individual samples. At each stage, RNA of from one lactating individual of each breed was labeled in dual color using Cy3 and Cy5 fluorochromes. RNA aliquots from the same stage of distinct breeds, labeled with different fluorocromes, were hybridized on the same microarray slide. We performed two technical microarray replicates per stage (four slides). The entire microarray experiment was repeated starting from a new RNA extraction of the same tissue sample (see Table 1), for a total of eight slides.

Microarray Data Analysis
Single microarray step Saturated (foreground median intensity, FMI, over the limits) and bad spots were flagged using the software Microarray Imager 5.9.3 (CombiMatrix). For each channel (Cy5, red and Cy3, green) the mean of empty spots FMI (E) was calculated. Non-empty spots with FMI to E ratio below 1.5 at least in one channel were filtered, together with flagged spots. We calculated the M and the A value (M = log 2 (R/G), A = (log 2 (R*G))/2, R = red FMI, G = green FMI) for each spot in order to obtain a measure of the differential expression for the two conditions analysed (M) and the log-mean of the spot intensity (A).

Paired dye-swap microarrays step and normalization
Systematic bias in the data was removed by applying the dye-swap normalization, that makes use of the reverse labelling in the two microarray replicates in order to remove the intrinsic difference of the two fluorochromes output [36]. In particular we paired each microarray with its dye-swap by calculating a new M value, M D = (M 1 − M 2 )/2, where M 1 is the M value for an experiment and M 2 is the M value calculated for the correspondent experiment with inverse fluorochromes, and a new A value, A D = (A 1 + A 2 )/2, where A 1 is the A value for an experiment and and A 2 is the A value calculated for the correspondent experiment with inverse fluorochromes. In order to remove intensity based bias we also applied the lowess normalization [37], obtaining a new M value (M L ) from M D .

Significance analysis
For each lactation stage, we performed one sample t-test to establish, for each gene, if the mean of the M L values (uM L ) was significantly different from 0, and corrected the p-value for multiple comparisons with the Benjamini and Hochberg False Discovery Rate [38]. Finally, only genes with a satisfactory effect (absolute value of the fold change |FC*| > 1.3, FC = 2^ uM L , if FC<0, FC* = −1/FC, if FC > 0, FC* = FC) and a significant p-values were considered. The statistical significance of the enrichment for the KEGG pathways of interest was computed using the hypergeometric test [39].

Conclusions
Sheep farming is very important for cheese production and the fatty acid and protein composition of raw milk is crucial in the cheese making process. The fatty acid profile of raw milk has been demonstrated to affect cheese characteristics and differentiate new types of cheese [21,40]. The genetic differences between breeds on milk quality are likely to affect also cheese quality and could be a marker to carry out genetic improvement plans of local and endangered sheep breeds. However, the number of studies on gene expression analysis between breeds aimed at understanding how genetic variations affect milk quality is quite limited in sheep as compared to cow. The main difficulty has been, to date, the absence of devices like microarrays. In this paper we have proven that a homologous chip, generated from sheep ESTs, is a valuable tool which can be employed in gene expression analysis. Furthermore, this approach can be easily extended to other species of which genetic sequences are present in public databases [26].