Real-World Diagnostic Accuracy and Use of Immunohistochemical Markers in Lung Cancer Diagnostics

Objectives: Accurate and reliable diagnostics are crucial as histopathological type influences selection of treatment in lung cancer. The aim of this study was to evaluate real-world accuracy and use of immunohistochemical (IHC) staining in lung cancer diagnostics. Materials and Methods: The diagnoses and IHC stains used for small specimens with lung cancer on follow-up resection were retrospectively investigated for a 15-month period at two major sites in Sweden. Additionally, 10 pathologists individually suggested diagnostic IHC staining for 15 scanned bronchial and lung biopsies and cytological specimens. Results: In 16 (4.7%) of 338 lung cancer cases, a discordant diagnosis of potential clinical relevance was seen between a small specimen and the follow-up resection. In half of the cases, there was a different small specimen from the same investigational work-up with a concordant diagnosis. Diagnostic inaccuracy was often related to a squamous marker not included in the IHC panel (also seen for the scanned cases), the case being a neuroendocrine tumor, thyroid transcription factor-1 (TTF-1) expression in squamous cell carcinomas (with clone SPT24), or poor differentiation. IHC was used in about 95% of cases, with a higher number of stains in biopsies and in squamous cell carcinomas and especially neuroendocrine tumors. Pre-surgical transthoracic samples were more often diagnostic than bronchoscopic ones (72–85% vs. 9–53% for prevalent types). Conclusions: Although a high overall diagnostic accuracy of small specimens was seen, small changes in routine practice (such as consistent inclusion of p40 and TTF-1 clone 8G7G3/1 in the IHC panel for non-small cell cancer with unclear morphology) may lead to improvement, while reducing the number of IHC stains would be preferable from a time and cost perspective.


Introduction
In lung cancer, histopathological type influences the selection of predictive testing and treatment. Targetable mutations and fusions are essentially found in adenocarcinomas (AC) and guidelines do not recommend testing of all squamous cell carcinomas (SqCC) [1]. AC and SqCC are both tested for programmed death-ligand 1 (PD-L1) expression, while pemetrexed and bevacizumab are not used as treatment for SqCC [2,3]. Neuroendocrine tumors, consisting of small cell lung carcinoma (SCLC), large cell neuroendocrine carcinoma (LCNEC), and carcinoid tumors, are not subject to predictive testing, and the choice of chemotherapy may differ from that for non-small cell carcinomas (NSCC) [4]. Hence, accurate and reliable diagnostics are vital, but may be challenging since most lung cancers are not treated surgically due to the advanced stage at diagnosis [5]. While surgical resections provide a solid base for histopathological diagnosis, morphology is less often clear in small biopsies and cytological material, and the addition of immunohistochemical (IHC) markers is often needed to support the diagnosis.
In a previous study, we found a moderate interobserver concordance among 20 pathologists for the diagnosis of non-selective lung and bronchial biopsies based on morphology, thyroid transcription factor-1 (TTF-1), and p40 [6] (the recommended panel for NSCC without distinct features [7,8]). Similar results have been shown in a study on tissue microarrays with a panel of four IHC markers [9].
The main conclusions from our study [6] were that the suboptimal specificity of TTF-1 clone SPT24 may cause diagnostic problems, that neuroendocrine morphology is sometimes missed, and that overuse (as well as occasional underuse) of IHC staining occurs. However, it was not possible in the same study to evaluate real-world diagnostic accuracy when any IHC marker could be ordered.
Although inevitably leading to selection bias, as some lung cancer types are rarely treated surgically, the most reliable method to evaluate diagnostic accuracy in small specimens should include cases with follow-up resections. Such a study would also provide information on usefulness of various sample types. While there is substantial literature on the diagnostic accuracy for detecting a malignant tumor in the lungs for different sampling methods [10][11][12][13], the accuracy for histopathological typing has rarely been assessed [14,15].
The aim of the present study was to evaluate real-world accuracy of lung cancer diagnostics, at two major sites in Sweden, for small specimens with follow-up resection as gold standard, as well as the use of diagnostic IHC staining.

Study Population
Consecutive lung resections with a malignant tumor from 1 January 2019 to 31 March 2020 were identified from the pathology departments' clinical databases in Lund and Stockholm, Sweden. The study population consisted of 268 resections for primary lung carcinomas in Lund and 270 in Stockholm.
During the same period, there were 104 resections in Lund with primary non-epithelial malignancies (n = 6), metastatic carcinomas (n = 75, including 3 recurrent lung cancers and 44 metastatic colorectal cancers), or metastatic non-epithelial malignancies (n = 23). Correspondingly, there were 90 resections in Stockholm with primary non-epithelial malignancies (n = 7), metastatic carcinomas (n = 59, including 2 recurrent lung cancers and 44 metastatic colorectal cancers), metastatic non-epithelial malignancies (n = 23), and one case for which it was unclear whether it was metastasis or primary lung cancer.
For the resected primary lung cancers and corresponding pre-surgical specimens (for both cohorts partly sampled at regional hospitals), data were collected from the pathology departments' clinical databases, including diagnosis and applied IHC staining. If more than one pre-surgical sample had been stained, the specimen with the highest number of applied markers was used for the count of IHC stains. Predictive IHC markers (e.g., ALK, ROS1, PD-L1) were excluded from all calculations. For two cases in the Lund cohort and one case in the Stockholm cohort, part of the clinical work-up was performed outside the region, so it was unknown whether additional pre-surgical sampling had been performed. To ensure diagnostic accuracy of the resected tumors, the slides were reevaluated by the principal investigator (H.B.) for 214 consecutive and 21 additional selected resections.

Scanned Cases
To further investigate the use of IHC markers among consultants working with thoracic pathology, one representative hematoxylin-eosin-stained slide for each of 11 bronchial and lung biopsies, and two representative slides stained with Papanicolaou (ThinPrep®) and May-Grünwald-Giemsa for each of 4 bronchial brushes, were scanned on a Hamamatsu NanoZoomer S360 (used for diagnostic scanning at the pathology department in Lund) in 40× mode.
The cases were selected from the period of investigation to represent typical cases of primary AC with lepidic, acinar, mixed papillary and micropapillary, or mucinous growth pattern, SqCC, NSCC without distinct features (with IHC supporting AC or SqCC), SCLC, LCNEC, metastasis of breast cancer, and metastasis of colorectal cancer.
The participating pathologists were informed that all cases were malignant tumors and were given the gender, age, and previous malignancy for each case. They were asked to state exactly which diagnostic IHC markers they would order. Twelve pathologists in Lund and Stockholm were invited, of whom 10 agreed to participate (all co-authors, except H.B., who selected the cases).

Statistics
Number of IHC stains was compared between groups using the Mann-Whitney U test and the Kruskal-Wallis test and analyzed with multiple regression analysis including diagnosis (AC, SqCC, neuroendocrine tumor), sample type (cytology, biopsy), and pathology department (Lund, Stockholm) as variables. A p-value < 0.05 was considered statistically significant. The analyses were performed with MedCalc 14.12.0 (MedCalc Software bvba, Ostend, Belgium).
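As an illustration of the two-group comparison described above, the following is a minimal sketch of a two-sided Mann-Whitney U test in plain Python. It is not the MedCalc implementation used in the study: it assumes the normal approximation with tie correction and no continuity correction, and the function names and the stain counts in the usage example are hypothetical.

```python
import math
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) using a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    # Tie correction: sum of t^3 - t over groups of tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all observations identical, no evidence of a difference
        return u1, u2, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, u2, p


# Hypothetical toy data: number of IHC stains per case in two groups.
cytology_counts = [0, 1, 1, 2, 1]
biopsy_counts = [3, 4, 4, 5, 6]
u_cyt, u_bio, p_value = mann_whitney_u(cytology_counts, biopsy_counts)
```

For small samples an exact test is usually preferable to the normal approximation; the approximation is used here only to keep the sketch short.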

Cohort Characteristics
Characteristics of the resected primary lung cancers are seen in Table 1. In the Lund cohort, 42 (16%) of the 268 cases had a synchronous primary lung cancer with lower stage in addition to the main tumor, while three (1%) had a synchronous metastasis to the lungs. The stage IV cases both had pleural metastasis, where one case was known but surgery was performed due to persistent pneumothorax while the other proved to be metastasized at surgery. In the Stockholm cohort, 24 (9%) of the 270 cases had a synchronous primary lung cancer with lower stage, while one (0.4%) had a synchronous metastasis to the lungs. One of the stage IV cases presented with a single brain metastasis treated with curative intent while the other proved to be metastasized at surgery.

Diagnostic Accuracy
As evident from Table 1, there was a pre-surgical diagnosis in 164 (61%) of 268 and 174 (64%) of 270 cases in the Lund and Stockholm cohorts, respectively. All cases with a different diagnosis in pre-surgical specimens compared to the resection are summarized in Table 2. Cases with a discrepant diagnosis but both the resection and pre-surgical specimen diagnosed as AC, adenosquamous carcinoma (AdSq), pleomorphic carcinoma with an AC component, or NSCC (not otherwise specified) are presented separately in the table as these diagnoses are essentially handled the same way in the clinical setting. Overall, the most common diagnostic discordance was NSCC on cytology and AC on biopsy and resection with deliberately no IHC staining of the cytological sample or samples (to avoid costly parallel staining), with reference to the biopsy for specified diagnosis pre-surgically. Although a definite diagnosis of AdSq, pleomorphic carcinoma, or combined LCNEC with a NSCC component requires resection, such a diagnosis was (accurately) suggested in four biopsies.

Table 2. Discordant pre-surgical diagnosis for 164 (Lund) and 174 (Stockholm) cases with resection as reference diagnosis.

Discordance                                  Lund     Stockholm
With potential clinical relevance
  Cytology AC, resection SqCC                0        4 (2)
  Cytology NSCC, resection SqCC              4 (4)    0
  Cytology AC, resection mixed SCLC/LCNEC    0        1
  Cytology NSCC, resection LCNEC             1        0

As evident from Table 2, diagnostic discrepancy of potential clinical relevance was seen in eight (4.9%) of 164 cases in the Lund cohort, but in four of these cases there was a different sample with the same diagnosis as on the resection; in all these cases, there was deliberately no IHC staining of the cytology. The two cases in the Lund cohort with NSCC as diagnosis on biopsy but SqCC on resection were poorly differentiated and negative for cytokeratin 5, TTF-1, and napsin A, while p40 was negative in one and partly positive (20-25%) in the other biopsy. The corresponding number in the Stockholm cohort was eight (4.6%) of 174, with a different sample with correct diagnosis in four of these cases. In all the four cases in the Stockholm cohort where AC was suggested on cytology, but SqCC was seen in the resection, the cytological specimens were stained with TTF-1 clone SPT24, and two were partly (but significantly) positive. In two of the cases other markers were used, but in all four cases no marker for squamous differentiation was included. An example of a case is presented in Figure 1.
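The discordance rates quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the percentages from the reported counts and, as an addition not present in the study, attaches a Wilson score interval to the pooled rate to show the uncertainty around a proportion of this size. The helper names `percent` and `wilson_interval` are hypothetical.

```python
import math
from statistics import NormalDist


def percent(k, n, digits=1):
    """Proportion k/n expressed as a percentage, rounded to `digits` decimals."""
    return round(100 * k / n, digits)


def wilson_interval(k, n, level=0.95):
    """Wilson score interval for a binomial proportion k/n."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Counts reported in the text: (clinically relevant discordances, cases with
# a pre-surgical diagnosis) per cohort.
discordant = {"Lund": (8, 164), "Stockholm": (8, 174)}

pooled_k = sum(k for k, _ in discordant.values())
pooled_n = sum(n for _, n in discordant.values())

for site, (k, n) in discordant.items():
    print(site, percent(k, n))            # 4.9 for Lund, 4.6 for Stockholm
print("Pooled", percent(pooled_k, pooled_n))  # 4.7 (16 of 338)

# Illustrative only: the interval is not reported in the study.
low, high = wilson_interval(pooled_k, pooled_n)
```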

Use of Diagnostic IHC Markers
The median number of IHC stains for the 164 and 174 cases with pre-surgical diagnosis in the Lund and Stockholm cohorts was 4 and 2, respectively; the distribution is found in Table 3. The difference between the cohorts was statistically significant (Mann-Whitney U test, p = 0.003) but not in multiple regression analysis (p = 0.54) including sample type (cytology, biopsy) and diagnosis (AC, SqCC, neuroendocrine tumor). At least one double staining was performed in 91 (55%) and 69 (40%) cases in the Lund and Stockholm cohorts, respectively, and the number of actual slides used for IHC staining (median 3 and 2, respectively) is found in Supplementary Table S1.
In the Lund cohort, there were 42 cases with pre-surgical diagnosis on both biopsy and cytology (as shown in Table 1). In 15 (36%) of these cases, IHC staining was performed on more than one sample. The corresponding number for the Stockholm cohort was 31 (91%) of 34 cases.
The median number of IHC stains for cases with pre-surgical diagnosis on cytological samples only (n = 124) compared to cases with diagnosis on biopsy samples only (n = 138) from both cohorts was 1 and 4, respectively. The difference was statistically significant (Mann-Whitney U test, p < 0.0001). The distribution is found in Supplementary Table S2.
The median number of IHC stains for diagnostic pre-surgical specimens where the final diagnosis on resection was AC (n = 241), SqCC (n = 58), or neuroendocrine tumor (n = 30) from both cohorts was 3, 4, and 8, respectively. The difference was significant with the Kruskal-Wallis test (p < 0.0001) as well as with pairwise Mann-Whitney U tests (all p < 0.006). The distribution is found in Supplementary Table S3.
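The three-group comparison above can be sketched as a Kruskal-Wallis test in plain Python. This is an illustration under stated assumptions, not the MedCalc procedure used in the study: the function name `kruskal_wallis_three` is hypothetical, the statistic is tie-corrected, and the p-value uses the chi-square approximation, which for exactly three groups (2 degrees of freedom) reduces to exp(-H/2). The stain counts in the usage example are invented toy data.

```python
import math
from collections import Counter


def kruskal_wallis_three(a, b, c):
    """Return (H, p) for three independent groups, with tie correction."""
    groups = (a, b, c)
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    counts = Counter(pooled)

    # Assign each distinct value the average rank of its tied block.
    rank_of = {}
    ranked_so_far = 0
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = ranked_so_far + (t + 1) / 2
        ranked_so_far += t

    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    # Tie correction factor; zero means every observation is identical.
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0
    h /= correction

    # Chi-square survival function with 2 degrees of freedom is exp(-H/2).
    return h, math.exp(-h / 2)


# Hypothetical toy data: IHC stains per case for three diagnosis groups.
ac_counts = [2, 3, 3, 4]
sqcc_counts = [3, 4, 5, 4]
neuroendocrine_counts = [7, 8, 9, 8]
h_stat, p_value = kruskal_wallis_three(ac_counts, sqcc_counts, neuroendocrine_counts)
```

With small groups the chi-square approximation is rough, so a significant overall result would typically be followed by pairwise tests, as was done in the study.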
Both sample type (cytology, biopsy) and diagnosis (AC, SqCC, neuroendocrine tumor) remained statistically significant factors (both p < 0.0001) for number of IHC stains performed in multiple regression analysis also including pathology department (Lund, Stockholm).
The IHC markers suggested by the 10 participating pathologists for the scanned cases are found in Table 4. All pathologists suggested at least one neuroendocrine marker for the three SCLC/LCNEC cases. In the three NSCC cases without distinct morphology, 1-3 pathologists did not include a marker for squamous differentiation (also true for the case with SqCC on cytology), and in one of the cases, one pathologist did not include a marker for adenocarcinomatous differentiation.

Sample Usefulness
Table 5 presents how often the various pre-surgical sampling procedures were attempted in the retrospective cohorts and how often each procedure established a malignant diagnosis. As can be seen, transthoracic procedures (fine needle aspirations [FNA] and core biopsies) exhibited a higher diagnostic rate than bronchoscopic procedures among prevalent sampling methods for the surgically treated cases. Moreover, bronchoscopic biopsies had a slightly higher diagnostic rate than bronchoscopic cytology, and, for example, suction catheter and bronchoalveolar lavage were the sole diagnostic sample in only two cases each. As evident from Table 5, a malignant diagnosis was established in 84 (28%) of 304 bronchial brush samples in the two cohorts. In 25 of these, the diagnosis was NSCC (not otherwise specified), of which 18 had a specified diagnosis in a different pre-surgical sample.

N2 Metastases
As evident from Table 1, there were 495 cases with one or more surgically sampled N2 stations in the two cohorts. Endobronchial ultrasound (EBUS)-guided lymph node sampling was performed pre-operatively in 193 of these cases, and there were five and two cases with metastases to one and two N2 stations, respectively, without morphological confirmation of the metastases in pre-surgical specimens. Thus, a metastasis was missed with EBUS in seven (3.6%) cases. In two of these cases, the N2 station with metastasis was not sampled with EBUS (both including station L5), but in the remaining five cases, the EBUS was reported to be representative and without malignancy. Correspondingly, in the 302 cases with no EBUS, there were 11, five, and one case with metastases to one, two, or three N2 stations, respectively. Thus, for the group with no EBUS, N2 metastases were seen in 17 (5.6%) cases (a single metastasis to station L5 in five cases).
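The N2 figures above follow from simple bookkeeping, which can be sketched as below. The counts are taken from the text; the dictionary layout and the helper name `n2_rate` are assumptions made for illustration.

```python
# Cases with surgically confirmed N2 metastases that lacked pre-surgical
# morphological confirmation, keyed by number of involved N2 stations.
with_ebus = {1: 5, 2: 2}            # among 193 cases sampled with EBUS
without_ebus = {1: 11, 2: 5, 3: 1}  # among 302 cases with no EBUS

EBUS_CASES = 193
NO_EBUS_CASES = 302


def n2_rate(cases_by_station_count, group_size):
    """Return (number of cases, percentage of the group, one decimal)."""
    cases = sum(cases_by_station_count.values())
    return cases, round(100 * cases / group_size, 1)


missed_with_ebus = n2_rate(with_ebus, EBUS_CASES)        # (7, 3.6)
found_without_ebus = n2_rate(without_ebus, NO_EBUS_CASES)  # (17, 5.6)

# The two groups together account for all cases with sampled N2 stations.
total_cases = EBUS_CASES + NO_EBUS_CASES  # 495
```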

Discussion
Although originating from pathology, this study is also highly relevant for pulmonologists, radiologists, and lung oncologists. Diagnostics of lung tumors involve a team effort relying on sampling (pulmonologists and radiologists) as well as assessment of morphology and ordering of ancillary markers (pathologists). Our study highlights limitations and pitfalls of this diagnostic procedure in daily practice, which should be of interest for oncologists as receivers of pathology reports.
The results support a good overall diagnostic accuracy for lung cancer subtypes for small specimens. However, we can confirm our previous findings that neuroendocrine morphology is sometimes missed, and that TTF-1 clone SPT24 occasionally causes diagnostic problems [6]. Problems relating to TTF-1 clone SPT24 were seen in the Stockholm cohort, while in the Lund cohort, clone 8G7G3/1 was used for histological specimens and clone SPT24 was only used for cytology (due to weaker staining in CytoLyt®/PreservCyt®-fixed Cellient™ cell blocks). However, based on our results, not including a marker for squamous differentiation and poor differentiation of the tumor may be more important than TTF-1 clone.
Furthermore, slightly more discordant diagnoses were seen for cytology than for biopsies. Inaccuracy with potential clinical relevance was seen in 14 (7%) of 200 cytological specimens and four (2%) of 214 biopsies. However, some cytologies (n = 4) were deliberately not stained with IHC as staining was performed on a different specimen. Thus, while these cases were discordant compared to the resection, they cannot be fully defined as inaccurate. Application of both p40 and TTF-1 (preferably clone 8G7G3/1) would most probably have resulted in accurate specified diagnosis in these cases as well as in the cases with a diagnosis of AC on cytology and SqCC on resection. Thus, similar diagnostic accuracy should be possible for cytology and biopsies if there is enough material for IHC staining.
The participating pathologists tended to prefer p40 as a squamous marker, though staining for CK5 was also suggested, less often (Table 4). We believe p40 is slightly superior to CK5, in line with the opinion of the WHO group [8]. However, this is only supported by some studies [16][17][18][19], while several reports have shown an essentially equal [20][21][22][23][24][25][26] or inferior [27] sensitivity/specificity profile for p40. A better specificity for TTF-1 clone 8G7G3/1 compared to clone SPT24 has consistently been shown [28][29][30][31][32][33]; we hope the present study encourages pathology departments in Sweden and elsewhere to change to clone 8G7G3/1 if a less specific one is being used. Given the limited specificity of neuroendocrine markers [34], routine inclusion of such markers is not recommended in NSCC cases [7]. However, both our present and previous study [6] support a need to improve diagnostics, especially of LCNEC.
While not all pathologists included markers for both squamous and adenocarcinomatous differentiation in cases with NSCC without distinct morphology, the retrospective data and the scanned cases at the same time support that unnecessary IHC markers are used in a significant proportion of routine cases. Large panels more often lead to deviant IHC profiles, often without obvious diagnostic gain [23], and result in increased cost and more time spent by the staff at the pathology department. For example, in the present study CK7 was used more often than is recommended [7,8], and sometimes additional markers were included based on patient history even if morphology strongly opposed that diagnosis. In Sweden, the recommendation is to stain with TTF-1 also in cases that are obvious non-mucinous AC, which also contributes to a high proportion of cases with IHC staining. However, the balance is difficult as use of IHC clearly leads to more precise diagnoses [35][36][37].
Concerning specimen types, transthoracic samples were more often diagnostic than bronchoscopic samples for our cohorts of surgically treated (i.e., predominantly peripheral) lung cancers, also seen in recent investigations using EBUS-guided bronchoscopy [38,39]. As in our study, a slightly higher sensitivity for transthoracic biopsies than FNA (92% vs. 75%) has been reported [14]. A meta-analysis from 2015 reported a 91% sensitivity for transthoracic needle sampling with similar numbers for CT and ultrasound-guided specimens [40]. We did not have complete information regarding CT or ultrasound guidance and could not compare the two in our study.
Although not the main purpose of the study, our investigation also provided data for the value of EBUS-guided lymph node sampling for detection of N2 metastases, also previously reported [41]. International guidelines recommend combined EBUS and esophageal ultrasound (EUS)-guided lymph node sampling [42]. For our cohorts, we did not have complete information on EBUS vs. EBUS combined with EUS (i.e., EBUS could mean EBUS plus EUS for the cases in our report), thus we could not analyze any difference between the two procedures.
Some additional limitations of the study need to be addressed. Diagnostic accuracy of non-malignant conditions and metastases to the lungs was not evaluated, and this may be of interest in future investigations for an overall accuracy of pulmonary samples. Moreover, we only included cases with follow-up resection (to evaluate diagnostic accuracy), and some of our results may not be applicable to advanced cases. For example, repeated sampling may be performed more often to acquire a specified diagnosis in metastatic disease and bronchoscopic sampling probably has a higher sensitivity in central tumors. Furthermore, no N2 lymph nodes were sampled at surgery in some of our cases (more often in very small lesions or ground glass opacities), although this is unlikely to affect our main results.
In conclusion, our results confirm that TTF-1 clone SPT24 may occasionally cause a diagnostic problem and highlight that both p40 and TTF-1 should always be performed in NSCC without clear morphology. Neuroendocrine and poorly differentiated tumors may also be diagnostically challenging; however, how to deal with these problems is less clear, and it is important to be aware of these limitations in pathology. In practice, more IHC stains are often used than is recommended or needed. Furthermore, in these cohorts of predominantly peripheral tumors, transthoracic procedures were better at confirming a malignant diagnosis. In addition, the use of EBUS-guided lymph node sampling leads to more pre-surgically detected N2 metastases.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/biom11111721/s1, Table S1: Number of slides used for diagnostic immunohistochemical staining for the 164 (Lund) and 174 (Stockholm) pre-surgical samples with diagnosis, Table S2: Number of diagnostic immunohistochemical markers for the pre-surgical samples with diagnosis on cytology only (n = 124) or biopsy only (n = 138), Table S3: Number of diagnostic immunohistochemical markers for the pre-surgical samples with diagnosis of adenocarcinoma (n = 241), squamous cell carcinoma (n = 58), or neuroendocrine tumor (n = 30) on follow-up resection.
Author Contributions: Conceptualization, Methodology, Data Curation, Formal Analysis, Visualization, Writing-Original Draft Preparation, Project Administration, and Funding Acquisition, H.B.; Investigation and Writing-Review and Editing, all authors. All authors have read and agreed to the published version of the manuscript.

Funding:
The study was supported by the Regional Agreement on Medical Training and Clinical Research (ALF; grant 13733) and the Swedish Cancer Society (grants 2017/997 and 2020/0786). The funding sources were not involved in study design, data collection or analysis, or writing the report etc.

Institutional Review Board Statement:
The study was conducted in adherence to the Declaration of Helsinki and was approved by the regional ethical review boards in Lund (Dnr 2020-00256) and Stockholm (Dnr 2021-01967), respectively.
Informed Consent Statement: Patient consent was waived as it was not considered necessary by the ethical review boards.
Data Availability Statement: Data is available from the corresponding author upon reasonable request.

Acknowledgments:
The authors wish to thank Sara Issa at the Department of Genetics and Pathology, Lund, for scanning the cases.

Conflicts of Interest:
The authors declare no conflict of interest.