1. Introduction
Forests are much more than a large area of land covered with trees. They represent one of life’s support systems on Earth, providing essential resources for a range of ecosystems. Furthermore, forests supply various products and services, generating a huge number of economic and social benefits. Due to the high commercial value of wood products, maritime pine (Pinus pinaster Ait.) is one of the main conifer species in southwestern Europe, covering approximately four million hectares in this region [1]. In Portugal, maritime pine is one of the predominant tree species, and by far the most widespread, mainly in the regions of Atlantic influence, covering more than 700 thousand hectares, which corresponds to 23% of the total forest surface [2].
In recent years there has been a worrying decline in a large number of forest species around the world, with maritime pine being one of the most affected [3]. This alarming decrease is caused by abiotic and biotic factors, of which the pine wood nematode (PWN), Bursaphelenchus xylophilus Steiner & Buhrer, 1934 (Nickle, 1970), is one of the main biotic factors [4].
PWN is a quarantine organism in the European Union (Directive 77/93 EEC) and the causal agent of pine wilt disease (PWD), which may kill a host tree within a short period of time after infection [5]. Mostly due to this pathogen, the total area occupied by P. pinaster suffered an abrupt decline in Portugal, with losses of 263,000 hectares between 1995 and 2010 [2]. As a result, P. pinaster went from being the main forest species, in terms of distribution and area, to the third, behind eucalyptus and cork oak. Recently, it was classified as an endangered species by the IUCN red list of threatened species [6].
PWN was reported for the first time in Portugal in 1999 [5], and in less than 10 years the whole P. pinaster area had been affected. PWN is transported between host trees by an insect vector, a longhorn cerambycid beetle (Monochamus galloprovincialis Oliv.) [7]. Transmission may occur in two ways: (i) by oviposition, where female beetles lay their eggs under the bark of trees stressed or recently killed by PWN, and the nematodes migrate to the pupae just before the adult beetles emerge, ensuring the survival of the parasite. Because of its low frequency and efficiency in susceptible trees, such as P. pinaster, transmission by oviposition represents a secondary inoculation route [8]; (ii) by feeding, considered the most common transmission pathway in susceptible trees, which occurs through beetle feeding wounds (primary transmission). Nematodes carried by beetles move into the wounds and breed in the xylem, but their survival is not guaranteed [9,10]. This close relationship between PWN and its vector beetle results in the epidemiological cycle of PWD [11].
PWD expression depends not only on the pathogenicity of PWN and the susceptibility of host trees but also on environmental conditions such as high temperature and high soil moisture, the optimal conditions for PWN proliferation [9]. The symptoms caused by PWD are common to other diseases and can therefore easily be confused with them. A typical early symptom is needle discoloration: needles turn grayish green, then tan, and finally brown. Then, resin flow ceases and the wood is dry when cut [4].
The defensive mechanisms of host trees can be divided into early and advanced stages [12]. In the first stage, a defensive response occurs in both susceptible and resistant trees, while the late response is found only in susceptible trees [12]. Within the same species, trees with different levels of susceptibility have been observed, some of which survive the infection, thus constituting an opportunity for selective breeding. This has been the approach in breeding programs developed in China and Japan in the early sixties [13].
Transcriptome analysis based on next-generation sequencing data provides information about all transcriptional activity in a cell or organism. It is now the most commonly used approach, and has been applied to disease pathogenesis studies and identification of biomarkers [14]. For non-model organisms like P. pinaster, for which there is no reference genome sequence available, RNA-Seq is an efficient means to generate functional genomic data [15].
In order to understand the pathogenic mechanisms and reduce the damage caused by PWD in Portuguese forests and their ecosystems, several studies have been performed [16,17,18,19]. However, to our knowledge, the analysis of the maritime pine molecular response based on RNA-Seq data has been reported in only one study [20]. That study pointed out a set of candidate genes potentially involved in the response to PWN, mainly related to terpenoid metabolism, defense against pathogen attack and oxidative stress.
This work is an approach to PWD, using RNA-Seq data to characterize the maritime pine transcriptomic profile in the response to infection with Bursaphelenchus xylophilus, over three different time points after inoculation, by determining the differentially expressed (DE) genes, regulatory networks and pathways, with the purpose of identifying potential candidate genes that may later on be used in the selection of P. pinaster trees displaying resistance against PWD.
2. Materials and Methods
2.1. Biological Material, Pine Wood Nematode Inoculation and Sampling
A total of fourteen potted three-year-old Pinus pinaster trees were used in this study. These plants were derived from seeds and maintained in natural environmental conditions during the assay.
Bursaphelenchus xylophilus culture was grown on PDA (potato dextrose agar) with Botrytis cinerea. After significant growth, a suspension of nematodes was transferred to test tubes with 5 mL of water and previously autoclaved barley grains. The tubes were then incubated for a week at 25 °C and a relative humidity of 70%, which represent optimal conditions for nematode growth. Before inoculation, nematodes were extracted from the test tubes using the Baermann funnel technique [21]. Then, the culture was placed at 4 °C to stop multiplication and the transition from the juvenile to the adult stage.
Inoculation with PWN was conducted following the method of Futai and Furuno [22]. Briefly, a suspension with 2000 nematodes was pipetted into a small vertical wound (1 cm) made on the upper part of the main pine stem with a sterile scalpel. A sterilized piece of gauze was placed around the wound site and fixed with parafilm to maintain the optimal humidity level. This procedure was performed on twelve P. pinaster plants, while the two remaining plants were used as controls (inoculation with water).
Four sampling time points were established: 6 h, 24 h, 48 h and 7 days after inoculation. For each time point, a set of three P. pinaster plants was collected. Briefly, a small piece of the stem above the inoculation point was cut and flash frozen at −80 °C for further RNA extraction.
2.2. RNA Extraction, cDNA Synthesis, Library Preparation and Sequencing
All collected samples were ground in liquid nitrogen and a total RNA extraction was performed from 2 g of plant material, according to an optimized method from Provost and colleagues [23]. Then, a DNase treatment was carried out following the instructions of the manufacturer (Kit TURBO DNA-free by Life Technologies, Hong Kong, China).
Approximately 1 µg of total RNA was used for cDNA synthesis, following the protocol of the ImProm-II™ Reverse Transcription System kit (Promega, Madison, WI, USA). Before sequencing, four pools of cDNA were constructed (pool 1: control; pool 2: 6 + 24 h; pool 3: 48 h; pool 4: 7 days).
cDNA libraries were constructed with the Ion Total RNA-Seq Kit v2 (Life Technologies, Hong Kong, China). Briefly, mRNA was fragmented with RNase III. After short fragment removal, RNA adapters were ligated and the first and second cDNA strands were synthesized. The cDNA was then amplified by PCR with specific barcoded primers, and the resulting fragments were size-selected with magnetic beads.
Finally, the positive spheres from the four libraries were loaded into an Ion PI chip v2 and the transcriptomes were sequenced as single-end reads in the Ion Proton System (Thermo Fisher Scientific, Waltham, MA, USA) at Biocant (Cantanhede, Portugal). All procedures were carried out according to manufacturer’s instructions.
2.3. Pre-Processing RNA-Sequencing Data and Transcriptome Assembly
The quality of the RNA-Seq reads from the four sequenced libraries was checked using FastQC Version 0.11.5 [24], a quality control tool for high-throughput sequence data. Based on the FastQC results, the thresholds for minimum average read quality and read length were established as 12 and 80 bp, respectively. These parameters were used to run Sickle Version 1.33 [25], which trimmed poor-quality bases and adapter sequences from the raw reads and produced a set of processed reads that were then used in the downstream analyses.
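The two thresholds can be illustrated with a minimal Python sketch. It assumes Phred+33 encoded quality strings and applies the thresholds as a simple per-read filter; Sickle itself trims with a sliding window, so this is only an illustration of what the two numbers mean, not a reimplementation of the tool. The function names are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores.

    Assumes Phred+33 encoding (an assumption, not stated in the text).
    """
    return [ord(ch) - offset for ch in quality_string]


def passes_thresholds(quality_string, min_mean_quality=12, min_length=80):
    """Return True if a read meets both thresholds used in this study.

    min_mean_quality=12 and min_length=80 are the values reported above;
    applying them as a whole-read filter is a simplification.
    """
    scores = phred_scores(quality_string)
    if len(scores) < min_length:
        return False
    return sum(scores) / len(scores) >= min_mean_quality
```

For instance, a 100-base read whose bases all have quality 40 passes, while the same read truncated to 50 bases is rejected on length alone.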
Since no reference genome sequence is available for P. pinaster, it was necessary to perform a de novo transcriptome assembly. The processed reads from all libraries were assembled into contigs using Trinity 2.1.1 [26]. The contigs generated with the Trinity assembly were used as input for a run with CAP3 [27]. The resulting assembly was the basis for the next procedures, being used as the reference transcriptome assembly.
2.4. Prediction of Candidate Coding Regions
The sequences from the reference transcriptome were analyzed with TransDecoder-2.0.1 [28] to identify the open reading frames (ORF). The ORF transcripts identified were further scanned for homology to known proteins against the Swiss-Prot [29] and Pfam [30] databases by running BlastP [31] and HmmScan [32], respectively. In the end, TransDecoder provided the final set of candidate coding regions, namely the predicted genes representing the basis for their annotation.
2.5. Mapping and Differential Expression Analysis
Mapping the reads against the transcriptome assembly was performed using RapMap [33]. Before performing a differential gene expression analysis, it is common to determine the number of uniquely mapped reads, which was accomplished with SAMtools-1.3 [34]. Only the reads that mapped to a unique location in the reference transcriptome were used for downstream analyses.
The edgeR package [35] of Bioconductor was used to identify transcripts that were differentially expressed between the conditions. To adjust for library sizes and skewed expression of transcripts, the estimated abundance values were normalized using the Trimmed Mean of M-values normalization method [36] included in the edgeR package. As our experiment did not have replicates, the biological variability could not be estimated from the data and had to be assumed. Thus, in accordance with the edgeR guidelines, a BCV (biological coefficient of variation) of 0.1 was assigned [37]. This procedure has been used successfully in other studies for which biological replicates were also not available [38]. After the identification of the differentially expressed (DE) genes, correction for multiple testing was performed by applying the Benjamini-Hochberg method [39] to the p-values, to control the false discovery rate (FDR). The final list of differentially expressed genes was generated after applying a threshold of 0.01 to the FDR.
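As an illustration of the multiple-testing step, the following Python sketch implements the standard Benjamini-Hochberg adjustment and applies the 0.01 threshold. It is a generic implementation written for this explanation, not the code used in the study (where the adjustment was performed within the Bioconductor workflow); the function names and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_indices(p_values, fdr_threshold=0.01):
    """Indices of tests whose adjusted p-value is below the FDR threshold."""
    adjusted = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(adjusted) if q < fdr_threshold]


# Hypothetical p-values for five genes.
example = [0.042, 0.001, 0.041, 0.008, 0.039]
print(benjamini_hochberg(example))
print(significant_indices(example))
```

With these made-up values only the second gene survives the 0.01 cut-off, whereas two genes would survive at the more usual 0.05.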
2.6. qPCR Validation
To validate the RNA-Seq data, five DE genes were selected (water deficit inducible (Wdip), WRKY transcription factor 1 (WRKY), PR10 protein (Pr10), MYB-like transcriptional factor (Myb) and TIR/NBS/LRR disease resistance protein (LRR)), and primers were designed using Primer3 software [40]. cDNA synthesis was performed using the ImProm-II™ Reverse Transcription System Kit (Promega) with 1 µg of total RNA, following the manufacturer's instructions. Relative expression quantification was performed with Rotor-Gene Q software 1.7 (Qiagen, Venlo, The Netherlands) using the SsoFast™ Eva Green SuperMix 1x (SYBR based system, Bio-Rad, Hercules, CA, USA), with 250 nM of each primer and 1 μL of cDNA in a final volume of 20 μL. All samples were run in triplicate, and a no template control (NTC) and a housekeeping gene were used for every primer pair. PCR cycling conditions were 95 °C for 3 min, followed by 40 cycles at 95 °C for 10 s, 60 °C for 60 s and 72 °C for 30 s. A melting curve was generated for each reaction to confirm the specificity of the primers and to check for the presence of primer-dimers. Primer efficiencies were assessed using a serial dilution of cDNA stock. Elongation factor-1 alpha was used as the housekeeping gene for normalization of the expression of each gene.
To compare the RNA-Seq and qPCR results, a Pearson correlation was calculated using the Log2 of the normalized expression values.
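The comparison can be sketched in a few lines of Python. This is a minimal illustration of a Pearson correlation on log2-transformed values, assuming strictly positive normalized expression values; the function names and the input numbers are hypothetical and do not come from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def log2_correlation(rnaseq_values, qpcr_values):
    """Correlate the log2 of two sets of normalized expression values."""
    log_rnaseq = [math.log2(v) for v in rnaseq_values]
    log_qpcr = [math.log2(v) for v in qpcr_values]
    return pearson(log_rnaseq, log_qpcr)


# Hypothetical normalized expression values for five genes.
print(log2_correlation([1.5, 3.0, 0.4, 8.2, 2.1], [1.2, 3.6, 0.5, 6.9, 2.4]))
```

A coefficient close to 1 would indicate that the two platforms rank and scale the fold changes of the selected genes in a similar way.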
2.7. Transcriptome Annotation
The ORF transcripts identified by TransDecoder were used for transcriptome annotation, which was performed using InterProScan [41,42]. The protein domains, gene ontology (GO) terms [43] and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways [44] associated with the annotated genes that encode enzymes were identified. A custom Python script was run to filter GO terms and KEGG pathways from the InterProScan output. Categorizer [45] was used for the analysis of the GO terms. The list of GO IDs belonging to each of the GO categories, Biological process (BP), Cellular component (CC) and Molecular function (MF), was classified into the corresponding subcategories against the GO Slim plant database [45]. The number of GO IDs was counted within each subcategory, and its percentage over the total set of GO IDs provided was reported.
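The counting step described above can be sketched as follows. This is only an illustration of the arithmetic, not the tool's implementation: the function name, the placeholder GO IDs and the ID-to-subcategory mapping are all hypothetical, and in practice the mapping would come from the GO Slim classification.

```python
from collections import Counter


def subcategory_percentages(go_ids, go_to_subcategory):
    """Percentage of the provided GO IDs that fall in each subcategory.

    IDs without a subcategory are not counted in any subcategory but
    still contribute to the total used as the denominator.
    """
    counts = Counter(
        go_to_subcategory[go_id] for go_id in go_ids if go_id in go_to_subcategory
    )
    total = len(go_ids)
    return {sub: 100.0 * n / total for sub, n in counts.items()}


# Placeholder identifiers and mapping, for illustration only.
mapping = {
    "GO:x1": "metabolic process",
    "GO:x2": "metabolic process",
    "GO:x3": "response to stress",
}
print(subcategory_percentages(["GO:x1", "GO:x2", "GO:x3", "GO:x4"], mapping))
```

Using the full list of provided IDs as the denominator mirrors the description above, where each percentage is reported over the total set of GO IDs.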
Regarding the functional annotation of differentially expressed genes, the contigs were annotated against the non-redundant National Center for Biotechnology Information (NCBI) plants database (version of August 2015) using BlastP (e-value 1 × 10⁻⁵).
2.8. Biological Networks Analysis
In this study, Cytoscape [46] was used for visualization of molecular interaction networks. This type of analysis provided insight into resistance mechanisms at the molecular level. Cytoscape analysis procedures began by establishing interactions between DE genes associated with KEGG pathways and GO terms. From the large number of plugins and features available in Cytoscape to perform different types of studies, BiNGO [47] was selected to identify which GOs were statistically overrepresented in the sets of DE genes. Moreover, the Enrichment Map plugin [48] was also used with the results from BiNGO to visualize enrichment of specific functions. The statistical analysis was carried out with customized default values recommended by user’s guidelines.
2.9. SNP Calling
Variant calling was performed with the GATK toolkit [49], which offers a variety of tools for variant discovery. As in the differential expression analysis, only the uniquely mapped reads were used for SNP calling. A first set of variants was identified using the UnifiedGenotyper tool available in the GATK toolkit. This initial set of variants was then filtered, using the SelectVariants option with the parameters SNP quality (QUAL ≥ 60), individual coverage (DP ≥ 15) and genotype quality (GQ-phred quality ≥ 40), in order to produce the final set of high-confidence SNPs. Finally, SnpEff was used to annotate and predict the effects of the filtered SNPs.
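The three filtering criteria can be restated as a small Python predicate. This sketch is not GATK syntax and does not reproduce the SelectVariants command line; it assumes each variant has already been parsed into a dictionary with hypothetical keys QUAL, DP and GQ, and only shows how the thresholds combine.

```python
def is_high_confidence(variant, min_qual=60, min_dp=15, min_gq=40):
    """Return True if a variant meets all three thresholds reported above.

    `variant` is assumed to be a dict with numeric QUAL, DP and GQ values.
    """
    return (
        variant["QUAL"] >= min_qual
        and variant["DP"] >= min_dp
        and variant["GQ"] >= min_gq
    )


def filter_variants(variants):
    """Keep only the variants that pass every threshold."""
    return [v for v in variants if is_high_confidence(v)]


# Made-up records for illustration.
candidates = [
    {"id": "snp_a", "QUAL": 75.0, "DP": 20, "GQ": 50},
    {"id": "snp_b", "QUAL": 90.0, "DP": 10, "GQ": 60},
    {"id": "snp_c", "QUAL": 60.0, "DP": 15, "GQ": 40},
]
print([v["id"] for v in filter_variants(candidates)])
```

A variant has to satisfy all three conditions at once, so a call with excellent quality but low coverage, like the second made-up record, is still discarded.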
2.10. Data Archiving Statement
The raw sequences used in this work were submitted to the Sequence Read Archive (SRA) with the BioProject accession number PRJNA378402 and accession name “Pinus pinaster Transcriptome sequencing”.
4. Discussion
In this study, an RNA-Seq based approach was used to determine the transcriptomic profile of maritime pine in different stages after inoculation with PWN, to identify candidate genes associated with the response mechanisms to the infection.
One of the main challenges in RNA-Seq studies of non-model organisms like maritime pine is producing a de novo transcriptome assembly. This crucial step can yield erroneously assembled contigs, both because high-throughput sequencing reads can contain sequencing errors and because the algorithms used for de novo assembly can generate assembly artifacts, i.e., contigs that do not represent true regions of the transcriptome. These factors may have had an impact on this study, since the rate of predicted genes from the set of assembled contigs was low (83,468 genes were predicted from 355,287 assembled contigs). Even though genomic information for several conifer species has been generated and compiled in the TreeGenes database [60], the number of genomic resources for maritime pine remains limited, which is another relevant factor that may have contributed to the low rate of predicted genes. When a reference genome is not available, the genes contained in the assembled transcripts can only be identified by homology, that is, if their protein products have homologs in the protein databases searched. Hence, when this type of analysis is applied to a species that expresses a distinct set of genes, with little or no homology to the sequences deposited in the databases commonly used in these studies, it is possible to end up with a significant percentage of contigs for which no gene could be predicted. Of the genes predicted in this study, 70,646 (84.6%) were annotated, providing a genomic resource for further study of the genes involved in the transcriptomic response of maritime pine to infection with PWN. However, 25,545 annotated genes had an “Unknown” description, mainly associated with Picea sitchensis, which reinforces the need to strengthen the genomic resources for maritime pine, ideally with the availability of a fully sequenced and annotated reference genome.
The comparison of sequence data from all libraries revealed a total of 17,533 DE genes, a number obtained using an FDR threshold of 0.01, which is more stringent than the traditionally used FDR threshold of 0.05. Despite this stringency, the unavailability of replicates may have increased the number of false-positive results. Functional annotation of the predicted genes with GO terms resulted in 38,762 (46.4%) unigenes with at least one assignment to one of the three GO categories (BP, MF and CC). In each of the GO categories, the GO terms fell mainly into two or three subcategories. The GO subcategories with the most evidence are in accordance with other reports [20], and may represent a typical gene expression profile for P. pinaster after infection with PWN.
Most plant defensive responses to pathogens have evolved into a complex system, simultaneously combining several mechanisms and pathways. To identify possible pathways involved in defense against PWN, a KEGG analysis of our set of predicted genes was performed. The different KEGG pathways associated with the predicted genes are in agreement with the Physiome Project Models [61] for P. pinaster. The most prevalent pathways were purine and pyrimidine metabolism. Purine and pyrimidine nucleotides, the subunits of nucleic acids, are major energy carriers and precursors for the synthesis of nucleotide cofactors such as NAD and SAM [62].
In this study, the identification of several DE genes related to biotic and abiotic stresses, hormonal regulation and cell wall defense further validates the hypothesis that these mechanisms play a crucial role in the plant defense system involved in the response to PWD.
4.1. Infection Leads to de novo Transcription of Genes Involved in Biotic Stress Response, Phenylpropanoid/Terpenoid Metabolisms and Hormonal Regulation
The genes induced only after inoculation with PWN are important for understanding the mechanisms activated when the plant is infected by this pathogen. Indeed, a set of genes over-expressed in all conditions after inoculation (Pp02: 6 h + 24 h, Pp03: 48 h and Pp04: 7 days) was identified, including the GDSL esterase/lipase, which is potentially involved in defensive reactions [63,64], and a translationally-controlled tumor protein homolog, which participates in important cellular processes such as the protection of cells against various stresses and apoptosis [65]. Moreover, the pattern of over-expression after inoculation was also detected for the gene that encodes a jacalin-related lectin 3 protein (JRL), which is often associated with biotic and abiotic stimuli. JRL proteins have been described as a component of the plant defense system, although their role is not yet well understood, due to their structural diversity [66]. In general, the results obtained in the present study agree with the sets of candidate genes related to the response to PWN infection reported in other studies [20,67]. For instance, the relevance to maritime pine defense against PWN of terpenoid metabolism, which is important in the resin production process, was pinpointed by Santos and colleagues [20]. In that study, the (E)-4-hydroxy-3-methylbut-2-enyl diphosphate (HMB-PP) reductase gene was found to be highly expressed after inoculation, an expression pattern that was also identified in Pp03 and Pp04 for HMB-PP synthase. Likewise, the thaumatin-like protein (TLP) was found to be deeply involved in the plant defense system as a response to biotic and abiotic stress. Various studies imply its importance in plant resistance [68,69], even though its role remains unclear. For example, under pathogen attack and stress, these proteins confer tolerance and induce stress resistance [68]. In our work, the gene encoding this protein was highly induced immediately after inoculation, and its expression levels decreased slightly over time. The difference in the expression pattern of this gene relative to the study carried out by Xu and colleagues [67] in Pinus massoniana may indicate a response specific to maritime pine.
Genes involved in hormonal regulation, such as cytokinin dehydrogenase 6-like and auxin-induced cell wall protein, were identified as highly expressed after infection with PWN. In our study, the expression of cytokinins supports the possibility that they are involved in signaling defense responses after a pathogen attack, improving the resistance to pathogens [70].
Phytohormones are responsible for various important physiological processes, from plant growth to plant defense [71]. The response to an insect attack is mediated by plant hormones, which act as the primary signal in the regulation of plant defense. Salicylic acid (SA), ethylene (ET), and jasmonic acid (JA) and its derivatives are the major defense hormones. Differentially expressed genes encoding important enzymes in jasmonate biosynthesis, such as 12-oxophytodienoate reductase 3-like (OPR3), which is fundamental to this biosynthesis, acyl-CoA oxidase and a multifunctional protein (MFP) associated with wound induction, were identified across the different comparisons. Genes encoding transcription factors involved in hormone signaling in local and distant tissues were also found to be differentially expressed after inoculation. Most of the genes identified had higher levels of expression in the last stages post inoculation. A faster recognition of the pathogen is important for an efficient response, since the timing of the hormonal production can determine whether the plant becomes more resistant to the nematode.
As already mentioned, the highest number of DE genes was identified between the control sample (Pp01) and Pp02, which clearly indicates an immediate response to PWN after inoculation. This observation is in accordance with previous results obtained in Pinus thunbergii Parl., which suggested an early response to PWN in both susceptible and resistant trees [72]. Within this early stage of response, and when compared with the control sample, several genes potentially involved in the defensive response were detected, evidencing the activation of the defense mechanisms in the infected plant. These genes included the mildew resistance locus 6 calmodulin binding protein gene, which triggers a defensive response when an infection is caused by a foreign body [73]. The processes used by PWN to invade Pinus pinaster tissues are likely to resemble the mechanism used by powdery mildew; hence, these results provide further support for the involvement of the mildew resistance locus 6 calmodulin binding protein gene in the initial response of plants to infections. The sucrose synthase gene was also over-expressed at the Pp02 time point. This gene codes for an enzyme that provides metabolites for the synthesis of cellulose and callose, and plays an important role in secondary cell wall synthesis [74,75]. Genes encoding TMV resistance protein N-like and nucleotide-binding site leucine-rich repeat (NBS-LRR) disease resistance proteins were identified in the control sample. The high expression of these proteins, which are relevant in the plant immune response, suggests that the plant recognizes and triggers the defense system [76]. Comparing the control with all stages after inoculation, the NBS-LRR proteins showed lower expression in the inoculation stages. These proteins are involved in pathogen recognition and should act at an early stage for an effective response. In our work, their expression only increased in the last stage after inoculation, which may be too late for the plants to recover and combat the nematode.
Cytoscape analysis showed relevant interactions between the DE genes and KEGG pathways. Some enzymes appear to be directly associated with plant defense against biotic and abiotic stresses, forming an important connection between relevant pathways such as terpenoid backbone biosynthesis and thiamine metabolism. In our work, phenylpropanoid biosynthesis may be a relevant pathway, not only because of its connection with phenylalanine metabolism through PAL, but also because of the high number of peroxidase genes over-expressed in Pp02. The higher expression of PAL genes, responsible for the first step in this pathway, may be a response to the wounds caused by the vector.
4.2. Infection Leads to a Reprogramming of Cell Wall Metabolism Putatively Involved in Cell Wall Reinforcement
The over-expression of peroxidase genes during PWD may be related to the oxidation of phenolic residues, likely the substrates of lignin and suberin, into cell wall polymers in the infected tissues [51].
Enzymes involved in cell wall modifications were found to be differentially expressed, with endoglucanase being the most highly expressed at the second and third time points. Endoglucanase catalyzes the conversion of cellodextrin into cellobiose, the smallest subunit of cellulose, and is also related to cell wall modifications in infected plants [77]. Thus, the over-expression of this enzyme in response to infection suggests that not only proteins related to defensive mechanisms are used to fight the infection, since some mechanisms are also activated to repair the cell damage caused by PWN.
Additionally, when comparing Pp02 with Pp03, a gene encoding a laccase was found to be highly expressed in Pp03. These proteins are involved in lignin biosynthesis and plant pathogenesis [78]. Lignin forms important structural materials in the support tissues of vascular plants. Hence, it makes sense that one of the mechanisms activated is the reinforcement of cell walls, especially in wood and bark.
4.3. Late Responses to Infection Seem to Be Involved in the Mitigation of Stress Caused by an Inefficient Early Response
A previous study reported a late response to infection with PWN in susceptible trees [72]. This late response was observed in our study, and may occur approximately one week after inoculation, given the large number of DE genes identified between Pp02 and Pp04. Measuring differences between early and late responses can elucidate the different mechanisms activated. When comparing the two stages, the higher expression of the dehydrin gene in Pp02 is relevant, since it has been associated with plant response and adaptation to abiotic stress such as water stress, being involved in a mechanism commonly developed in these stages [59]. Thus, considering that PWD results in the destruction of parenchyma cells surrounding xylem resin ducts, causing a dysfunction of the water-conducting system, the involvement of this gene in preventing water loss makes sense. The pathogenesis related type 10 gene was also highly expressed at this stage. This gene has already been found in conifers, displaying a transient accumulation in needles of drought-stressed trees [79]. As a consequence of the water stress, the needles become drought stressed, which is one of the most characteristic symptoms of PWD. Stress proteins, such as heat shock proteins, were encoded by genes over-expressed in Pp04. Under stressful conditions they protect cells by stabilizing unfolded proteins, giving the cell time to repair damaged proteins. Heat shock proteins are highly conserved among different organisms [63]. Although their precise role in Pinus pinaster is unclear, these proteins seem to be involved in the plant’s effort to combat PWN. The gene coding for the light harvesting complex protein was highly expressed in Pp04. This protein is involved in light energy transfer to one chlorophyll-a molecule at the reaction center of a photosystem. Although it is not directly related to the defensive mechanism, it may play an important role in maximizing the production of energy, which could be essential in supporting the cellular systems triggered within the response. Genes encoding NBS-LRR proteins were found to be highly expressed at this time point. NBS-LRR proteins are capable of recognizing a wide variety of pathogens and initiating a hypersensitive response (HR), resulting in cell death. The NBS-LRR proteins are divided into two major groups, involved in downstream specificity and signaling regulation [80], both of which showed high levels of expression in Pp04. The behavior of NBS-LRR proteins across the different time point comparisons points to one more difference between the early and the late phases of the response, and to a possible reason why the early response is not effective.
When comparing the Pp02 and Pp03 time points, the auxin-induced protein 1 was found to be over-expressed in Pp02. Auxins regulate and control vital mechanisms, being involved in growth, development and defense via signaling that involves different molecular interactions [81]. This protein seems to have an important role in the first stage of the response against the infection.
Lastly, regarding the comparison between Pp03 and Pp04, a phospholipase D alpha 1-like and a tau class glutathione S-transferase were over-expressed in Pp04. The former plays an important role in various cellular processes, including response to stress [82], while the latter has been associated with the oxidative stress response mechanism [83].
Recent studies related to PWD reported a set of genes associated to response to PWN infection. Several biotic-stress resistance genes were identified after PWN inoculation by Shin and colleagues in Japanese red pine [
72], including the pinosylvin synthase and iron superoxide dismutase genes. The pinosylvin synthase was found over-expressed only in Pp04, while the iron superoxide dismutase genes were over expressed immediately after inoculation. Pinosylvin, which belongs to stilbenoids family, has been associated to phytoalexin induction, mainly in young pine trees exposed to biotic stress [
84]. Moreover, pinosylvin is a key metabolite that can kill nematodes [
85]. The fact that pinosylvin was observed only in the late response against PWD is an important feature that must be further analyzed, namely in
Pinus species that are tolerant or resistant to PWN, since it may be one of the reasons why the first response to the infection is inefficient. The iron superoxide dismutase gene plays an important role in cell protection against oxidative stress [
86].
The response displayed by maritime pine against PWN infection is clearly very complex and dynamic. In the end, however, it still fails to prevent plant death as a result of the infection. This study offers possible explanations for why the response of maritime pine to PWN infection is so inefficient. The comparison with the response of a tolerant Pinus species at the same time points after infection (study in progress) will help to unveil the main genes and pathways associated with resistance to PWD.
The SNP calling analysis performed in this study yielded a total of 36,295 SNPs, of which 69.2% were identified in exons and 30.6% were located in intergenic regions. The analysis of SNP effects by functional class, performed only for the 15,263 SNPs located in exons and transcripts, revealed that more than 50% had a silent effect, meaning that the SNP does not change the protein sequence. However, about 48.5% displayed a missense effect, in which the altered codon encodes a different amino acid and the sequence of the protein encoded by the gene therefore changes. Such substitutions may occur between amino acids with markedly different properties, which in turn can affect the catalytic activity of an enzyme or the secondary and tertiary structure of the protein, among other effects. Additionally, from a total of 4061 SNPs identified within the 17,533 DE genes, 15 were found in the sequences of genes discussed in more detail in the present work. Eleven SNPs displayed a missense effect in the exon regions of the genes that encode the GDSL esterase/lipase, auxin-induced cell wall and dehydrin proteins. Moreover, the genes encoding the phenylalanine ammonia-lyase and auxin-induced cell wall proteins each present a SNP in their 3’ UTR. This type of variation might affect transcription and translation and could be responsible for differences in gene expression. The two remaining SNPs, identified in the genes that encode the pathogenesis-related 10 and jacalin-related lectin 3 proteins, generate a new stop codon and are known as nonsense SNPs. This type of SNP changes a coding codon into a stop codon, resulting in the inactivation of the respective gene [
87]. Hence, these are very important SNPs whose effect on potential resistance to PWD can be tested in larger pine tree populations for which phenotypic data on resistance might be available for genome-wide association studies (GWAS).
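The three functional classes mentioned above (silent, missense and nonsense) can be illustrated with a minimal sketch that translates a reference codon and its variant using the standard genetic code and compares the resulting amino acids. This is only an illustration of the classification logic and not the annotation tool used in this study; the function name `classify_snp`, its arguments and the example codons are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, position, alt_base):
    """Classify a single-base change inside a coding codon.

    ref_codon: three-letter reference codon (hypothetical input, DNA alphabet).
    position:  0-based position of the SNP within the codon.
    alt_base:  alternative base observed at that position.
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:position] + alt_base.upper() + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "silent"      # same amino acid, protein sequence unchanged
    if alt_aa == "*":
        return "nonsense"    # coding codon changed into a stop codon
    if ref_aa == "*":
        return "stop_lost"   # stop codon changed into a coding codon
    return "missense"        # a different amino acid is encoded


if __name__ == "__main__":
    # Illustrative codons only, not SNPs reported in this work.
    for codon, pos, alt in [("CTT", 2, "C"), ("GAA", 0, "A"), ("TGG", 2, "A")]:
        print(codon, pos, alt, classify_snp(codon, pos, alt))
```

In this sketch, CTT to CTC keeps leucine (silent), GAA to AAA replaces glutamate with lysine (missense), and TGG to TGA turns tryptophan into a stop codon (nonsense). SNPs in UTRs or intergenic regions fall outside this codon-based scheme and would need to be handled separately.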
5. Conclusions
Currently, PWD, caused by Bursaphelenchus xylophilus, is the deadliest maritime pine disease. This study establishes a new approach for understanding the molecular response of maritime pine, which is susceptible to PWN, at different time points after inoculation with PWN. Clear insights into the defense mechanisms of Pinus pinaster against PWN were obtained. The functional annotation of the predicted genes revealed the complexity of the system involved in the response against PWN, which combines a number of mechanisms and pathways simultaneously. As pointed out in previous studies, two phases of response against PWN were identified from the results of the differential expression analysis: an early response that occurs immediately after infection, and a late response that develops approximately seven days after infection. Moreover, the high number of DE genes found between the early and the late responses suggests that these two phases may differ significantly at the molecular level. Future studies will focus on comparing P. pinaster with a tolerant Pinus species at the same time points, in order to understand which response is more effective in preventing the progression of the pathogen after infection. The set of candidate genes identified at the different time points after inoculation will be a useful resource in future studies and in breeding programs to select plants with lower susceptibility to PWD. In addition, the SNPs identified in this study will be available to be tested in larger populations with phenotypic records for resistance to PWD, thereby enabling the identification of molecular markers linked with this very important economic and biological trait in maritime pine.