1. Introduction
Biological tissue identification in the operating room is critical in many situations, particularly in tight volumetric spaces with bleeding and illumination artifacts. In these scenarios the identification cannot be reliably made by sight. Furthermore, an incorrect identification of a biological tissue can result in nerve or blood vessel resection for instance, with undesired partial paralysis or hemorrhage. It is then essential to provide healthy tissue type feedback in the operating room to avoid collateral damage.
Addressing this problem requires a noninvasive, high-contrast technique together with powerful classification algorithms that provide accurate results. Regarding the technique, optical radiation enables noninvasive, nonionizing and even noncontact procedures. Optical techniques for the diagnosis of biological tissues have demonstrated great potential. Among others, different types of microscopy, spectroscopy [1] or fluorescence [2] have been widely used. Even tomography [3], increased contrast tomography [4] or increased contrast by polarization [5] are possible by means of advanced optical techniques. Healthy tissue classification requires a fast technique with high biochemical contrast that is relatively easy to implement, because it will probably be used in conjunction with other surgical tools. Optical diffuse reflectance spectroscopy has already shown great capabilities as a noninvasive optical technique for the study of biological tissues. Light propagation in biological tissues is strongly wavelength-dependent, and the optical properties of these tissues are specific to the tissue type. The main optical properties to be considered in propagation are usually the absorption ($\mu_a$) and reduced scattering ($\mu_s'$) coefficients, together with the anisotropy of scattering and the refractive index. The strong dependency of optical diagnostic techniques on optical properties is reflected in the tables for different tissues that appear in the literature, for in vivo, ex vivo or in vitro human or animal samples [6,7].
Many studies have reported the feasibility of diffuse reflectance spectroscopy to determine the pathological state of biological tissues, and to provide information regarding tissue morphology, functionality, and/or biochemical composition. Using this technique, abnormalities in tissues can be detected, and the technique then serves as an optical biopsy tool. Optical spectroscopy has been applied for the detection of abnormal cells or cancerous tissues in different tissue types and organs, such as liver [8], bladder [9], colon [10], esophagus [11], bronchial tree [12], breast [13], brain [1], skin [14,15], epithelium [16], bone [17], muscle [18], fat [19] or nerve [20], among others. The technique has been shown to have great potential as an inspection and diagnosis procedure. As previously said, spectroscopy does not require ionizing radiation that harms the tissue, and it can be implemented in optical fiber, characteristics that greatly facilitate the development of a clinical instrument.
The clinical studies just described use optical systems to illuminate the particular organ or tissue under study, and collect the spectral characteristics of the backscattered light for a particular wavelength range. According to the usual application of distinguishing pathological, mainly tumoral, and healthy tissues, the diagnostic analysis is based on two classes, either healthy or pathological tissue. The distinction is in the end a typical class discrimination problem [21]. Apart from other considerations, this kind of analysis usually requires a large number of measurements, also from different specimens, in order to report statistically significant conclusions.
The problem that is addressed in this work is slightly different, as we are interested in healthy tissue discrimination, so as to be sure of the tissue type that is in front of the surgical tool in the operating room. Previous studies have shown the feasibility of this analysis for some tissue types [21]; have shown, specifically for adipose tissue, that intraclass variability can be an issue in the classification problem depending on the deviation of the measured specimens and experimental conditions [22]; and have examined the influence of experimental conditions [16]. However, an in-depth analysis of several classification algorithms that try to address the problem of identifying more tissue types, mainly bone, muscle, nerve, fat or skin, is missing. This is an unsolved clinical problem, and it needs a reasonably simple and affordable technique to be implemented in a clinical device. In order to test this possibility, ex vivo biological tissues of different kinds are first obtained. Afterwards, a specific optical setup is built in order to measure the spectra. The measured spectra present high dimensionality, which makes the classification problem difficult to cope with. Several approaches were implemented in order to extract relevant characteristics that could afterwards be introduced into final classification algorithms to test the discrimination power. Of all of them, one based on characteristics extraction, specifically particular gradients according to spectral characteristics, and another more general one based on principal component analysis are shown. The accuracy results demonstrate the potential of diffuse reflectance spectroscopy for healthy tissue discrimination, an issue of critical relevance in several clinical interventions. For example, neck surgery could greatly benefit from this approach, as failing to distinguish a blood vessel or a nerve that is then resected could have severe consequences for the patient.
In this work, ex vivo porcine biological tissues of several types are extracted, adequately processed, and measured by a diffuse reflectance spectroscopy setup. The optical properties of porcine biological tissues have been shown to be close to those of human tissues [23], so the approach could be extended to clinical applications. The samples come from different specimens, and are obtained on different days. Several spectra are captured from each sample, so as to reduce noise and variability that does not depend on tissue type. Spectral measurements represent high-dimensional data, and as a consequence the classification problem is computationally complex. Several approaches, based on characteristics extraction or components extraction, mainly principal component analysis (PCA), are employed. Afterwards, classification approaches based on k-nearest neighbors (k-NN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA) or Naïve Bayes (NB) are applied. The results from these approaches are compared by statistical analysis such as ANOVA, and conclusions are extracted regarding the best approach for a potential clinical application.
Section 2 of this article describes the materials and methods employed. It first shows the details of the biological porcine samples and processing approaches. Afterwards the experimental setup for diffuse reflectance spectroscopy is described. The criteria for obtaining the spectra are also explained. The algorithms for data dimensionality reduction, noise reduction or misalignment corrections are also described.
Section 3 presents the results of analyzing the measured spectra by means of several approaches, mainly characteristics extraction and principal component analysis. The particular characteristics are described, applied, and evaluated, including the classification algorithms for class assignment. The same approach is implemented with the principal components, which are further classified and evaluated. The statistical approaches for the evaluation are described and applied.
Section 4 contains the conclusions of the work.
3. Results
The spectra were collected as previously stated from the extracted samples, and the preprocessing steps were carried out. The appearance of typical spectra, coming from each of the samples, is shown in Figure 4. As in the usual spectroscopic measurements, the locations of peaks and valleys are particularly relevant, because they are related to biochemical information from the sample. In the case of biological tissues, hemoglobin, water or even proteins or other pigments can dominate the spectral response [27]. Hemoglobin peaks and valleys are usually around 425 and 555 nm for the deoxygenated state, and around 410, 540 and 575 nm for the oxygenated one [7], proteins present another peak at 280 nm [28], water shows peaks in the IR at 970 or 1197 nm [29], and lipids at 930 or 1210 nm [19]. Spectra coming from biological tissues usually present a decrease that starts around 650 nm [30].
As can be seen in Figure 4, there seemed to be differences between the biological tissue types, but they were not evident from a simple visual inspection of the spectra. In fact, the location of most peaks and valleys was common to all of them, as previously said regarding the main components that influence the spectra, which most of the tissues contained. For instance, the presence of peaks or valleys at 415 or 575 nm could be attributed to deoxygenated hemoglobin, as they were ex vivo samples, or the 970 nm valley to water, also present in all the tissues.
Therefore, classification analysis was needed in order to provide further discrimination information for the intended application. One of the first issues to be solved was the high dimensionality of the data, which could make the computation of classification algorithms difficult. As not all the spectral information seemed to contribute to tissue classification, as just discussed, dimensionality reduction was applied. Two main approaches were carried out. The first one was based on the extraction of spectral characteristics, while the second one started from PCA. After the dimensionality of the problem was reduced, classification algorithms were applied, and statistical conclusions were extracted.
3.1. Classification Algorithms Based on Characteristics Extraction
Several tests were made with different characteristics from the spectra, and the conclusions showed that gradients based on spectral characteristic points were the most significant ones. The position of peaks and valleys on the spectra is then critical for adequately defining the gradients.
Table 1 contains the nine specific points considered in the analysis. These points were located in each particular spectrum and taken as the references for the gradients. Null first derivative means that the first derivative of the curve was calculated, and a zero-valued point was then searched for in the vicinity of the listed wavelengths.
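The search for a null-first-derivative point and the gradient between two characteristic points can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the 15 nm search window, the finite-difference derivative and the synthetic spectrum are all assumptions made for illustration.

```python
def find_extremum_near(wavelengths, reflectance, target_nm, window_nm=15.0):
    """Return (wavelength, reflectance) of the extremum closest to target_nm.

    An extremum is taken as a sample where the finite-difference first
    derivative changes sign (or is zero). The window size is an assumption.
    """
    # First derivative between consecutive samples.
    deriv = [
        (reflectance[i + 1] - reflectance[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(wavelengths) - 1)
    ]
    best = None
    for i in range(1, len(deriv)):
        inside = abs(wavelengths[i] - target_nm) <= window_nm
        if inside and deriv[i - 1] * deriv[i] <= 0.0:
            if best is None or abs(wavelengths[i] - target_nm) < abs(wavelengths[best] - target_nm):
                best = i
    if best is None:
        # No sign change inside the window: fall back to the nearest sample.
        best = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))
    return wavelengths[best], reflectance[best]


def gradient_between(p, q):
    """Slope of the straight line joining two (wavelength, reflectance) points."""
    (w1, r1), (w2, r2) = p, q
    return (r2 - r1) / (w2 - w1)


# Synthetic spectrum (illustrative only): a single valley at 542 nm.
wavelengths = [float(w) for w in range(500, 601)]
reflectance = [0.2 + (w - 542.0) ** 2 / 1000.0 for w in wavelengths]

valley = find_extremum_near(wavelengths, reflectance, target_nm=540.0)
edge = (wavelengths[-1], reflectance[-1])
slope = gradient_between(valley, edge)
```

With measured spectra, some smoothing before differentiation would probably be needed, since noise produces spurious sign changes in the derivative.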
In order to take into account the evolution of these points, the analysis was made with the gradients of the curve formed by the previously mentioned points at their respective wavelengths. A first visual analysis employed a dispersion diagram with two of the gradients, the one relating points 4 and 5, and the other one joining points 5 and 6, for three tissue types: skin, muscle and bone. This diagram appears in Figure 5.
Looking at Figure 5, it seems that, at least for the example shown, several samples of some of the classes could be classified by these two gradients, as they are separated in the diagram. However, the classification approach needed a complete classification of all the tissues, and also statistically significant parameters for comparison.
The first statistical analysis that could be made is by means of a boxplot diagram. This boxplot represented the median of each parameter for each sample according to its class (tissue type), as well as the first and third quartiles, and the maximum and minimum values. Figure 6 contains this information for the previously analyzed gradients 4–5 and 5–6.
Although gradient 5 to 6, Figure 6b, shows medians that are more widely spread between classes, it also presents larger variations within the same class, and several outliers that could complicate the classification. Regarding gradient 4 to 5, Figure 6a, the spread of data values is smaller, but the median values seem to be closer to each other, something that could also complicate classification.
To statistically quantify these impressions, an ANOVA analysis was performed. In this analysis the mean values of each class were compared, taking their equality as the null hypothesis of the test. The Snedecor F statistic, which should be significantly high if the null hypothesis is false (that is, if the mean values of each class are statistically different), was calculated for all the gradients. The resulting F values, normalized by the maximum value, appear in Figure 7.
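As an illustration of this ranking step, the following sketch computes the one-way ANOVA F statistic of each feature across classes and normalizes by the maximum. The feature names and numeric values are invented for the example and are not data from this work.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (one list of values per class)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of the class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of the samples around their own class mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical gradient values for three tissue classes (illustrative only).
features = {
    "gradient_a": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]],
    "gradient_b": [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [2.0, 3.0, 4.0]],
}
f_values = {name: anova_f(groups) for name, groups in features.items()}
f_max = max(f_values.values())
normalized = {name: f / f_max for name, f in f_values.items()}
ranking = sorted(normalized, key=normalized.get, reverse=True)
```

Keeping only the top entries of such a ranking corresponds to selecting the gradients with the largest F.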
As can be seen in Figure 7, there were significant differences between the different gradients. This meant that the ones with lower values would not be very appropriate for the classification problem, as the mean values of the different classes would be statistically similar. Nevertheless, the results of this test gave us information about the general classification potential, but not about the classification accuracy for each particular class. The analysis of this information required classification algorithms as a final step.
Five different classification algorithms were applied to the gradients, either including all of them (81), or just the 14 with the largest F according to Figure 7. One fifth of the data was used for training the algorithm, and the rest was employed as test data for obtaining the results. Linear discriminant analysis (LDA) divides the input domain into several regions, defined by linear hyperplanes. This algorithm shows the best performance when the input data are linearly separable, and can be formulated as:
$$y_k(\mathbf{x}) = \mathbf{w}_k^{T}\mathbf{x} + w_{k0}$$
In this equation $\mathbf{w}_k$ is the weight vector and $w_{k0}$ is the bias for each class. A point $\mathbf{x}$ is assigned to class $k$ if $y_k(\mathbf{x})$ is larger than the corresponding value for all the other classes. If the discrimination boundaries are allowed to be of higher order, that is, curves on the graph, the algorithm is called quadratic discriminant analysis (QDA). It is also possible to classify the data by a Naïve Bayes (NB) algorithm, in which statistical independence is assumed between the characteristics employed for the classification. Each input is then classified so as to maximize the following expression, the probability of the multidimensional data $\mathbf{x} = (x_1, \ldots, x_D)$ belonging to class $C_k$:
$$p(C_k \mid \mathbf{x}) \propto p(C_k) \prod_{i=1}^{D} p(x_i \mid C_k)$$
The k-nearest neighbors (k-NN) algorithm classifies an input based on the previous classification of the k inputs that are nearest to it in the input space, usually by a constant distance criterion. The input is assigned to the class that maximizes the following probability, the ratio of the number of neighbors that belong to class $C_k$, $K_k$, to the total number of neighbors $K$:
$$p(C_k \mid \mathbf{x}) = \frac{K_k}{K}$$
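A minimal sketch of the k-NN rule just described is given below. The Euclidean distance, the value of k, the function names and the two-dimensional toy data are assumptions for illustration, not the configuration used in this work.

```python
import math
from collections import Counter


def knn_posteriors(train_x, train_y, x, k=3):
    """Fraction of the k nearest training points that belong to each class."""
    # Sort training points by Euclidean distance to x and keep the k nearest.
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: count / k for label, count in votes.items()}


def knn_predict(train_x, train_y, x, k=3):
    """Assign x to the class with the largest fraction of neighbors."""
    posteriors = knn_posteriors(train_x, train_y, x, k)
    return max(posteriors, key=posteriors.get)


# Toy two-feature data (illustrative only, not measured gradients).
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["fat", "fat", "fat", "muscle", "muscle", "muscle"]

label_a = knn_predict(train_x, train_y, (0.2, 0.1))
label_b = knn_predict(train_x, train_y, (5.5, 5.2))
```

Because the decision depends only on local neighbors, this rule makes no assumption about the shape of the class boundaries, in contrast to the linear or quadratic discriminants.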
Classification and regression trees (CART) divide the classification space according to boundaries that are parallel to the main axes of the space. These dimensions are segmented and form a tree structure, by means of which each input goes along the tree until it reaches a node. The structure of the tree is built according to a maximization of the p-value of the predictor.
The first algorithm tested was LDA, whose confusion matrix is the following:
The values obtained for sensitivity and specificity for each tissue type appear in Table 2.
A similar analysis was performed with QDA, kNN, CART and NB. The results for the classification accuracy when using the best 14 gradients, defined as the ratio between correctly classified samples and the total number of samples, appear in Table 3.
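For reference, per-class sensitivity and specificity and the overall accuracy can be derived from a confusion matrix as in the sketch below. The matrix values are invented for the example and are not the results of this work; the sketch assumes rows are true classes, columns are predicted classes, and every class has at least one sample.

```python
def sensitivity_specificity(cm):
    """Per-class (sensitivity, specificity) from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp  # other classes predicted as k
        tn = total - tp - fn - fp
        results.append((tp / (tp + fn), tn / (tn + fp)))
    return results


def accuracy(cm):
    """Ratio between correctly classified samples and the total number of samples."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


# Hypothetical three-class confusion matrix (illustrative only).
cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
per_class = sensitivity_specificity(cm)
overall = accuracy(cm)
```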
From Table 3 it could be seen that the best classifier was kNN, with almost 95% accuracy for the spectra employed, followed by CART. From the results it seemed that the gradients of the different classes (or tissue types) did not fit a linear (LDA) or quadratic (QDA) model, or at least not as well as they fitted a classification based on neighboring measurements (kNN) in the gradients space. The CART model was quite close in accuracy to the best kNN approach, which made sense as this approach was specifically optimized for p-value. The low accuracy of NB could be expected, as the assumption of independence of the gradients was not necessarily true.
3.2. Classification Based on Principal Component Analysis
As previously stated, another alternative for dimensionality reduction is PCA [31,32]. The PCA algorithm transforms a group of possibly correlated variables into an equal or smaller number of uncorrelated, or orthogonal, variables. These transformed variables are called principal components. Let $\mathbf{X}$ be an $N \times D$ matrix whose rows are the $N$ observations. Each observation or spectrum $\mathbf{x}_n$ belongs to a D-dimensional space. The algorithm projects the data on a space of lower dimension while the variance of the projected data is maximized. Let $\mathbf{u}_1$ be a unit vector which defines the direction of the new space. The projected data variance can be expressed as:
$$\frac{1}{N}\sum_{n=1}^{N}\left(\mathbf{u}_1^{T}\mathbf{x}_n - \mathbf{u}_1^{T}\bar{\mathbf{x}}\right)^{2} = \mathbf{u}_1^{T}\mathbf{S}\,\mathbf{u}_1 \qquad (5)$$
where $\bar{\mathbf{x}}$ is the mean of the dataset, $\mathbf{S}$ the data covariance matrix and superscript $T$ denotes transpose. The maximization of expression (5) is done with respect to $\mathbf{u}_1$, subject to the constraint $\mathbf{u}_1^{T}\mathbf{u}_1 = 1$. It can be demonstrated that the optimal projection is obtained when the new space is defined by the first eigenvectors $\mathbf{u}_i$ of the covariance matrix $\mathbf{S}$, that is, those with the largest eigenvalues.
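The variance maximization described above can be illustrated with a small sketch that estimates the first principal direction by power iteration on the covariance matrix (normalized by the number of observations). Power iteration is used here only to keep the example dependency-free; it is not necessarily the method used in this work, and the function name, iteration count and toy data are assumptions.

```python
import math


def first_principal_component(data, iters=200):
    """Unit vector of maximum projected variance, and that variance.

    Assumes the data are not all identical and that the starting vector is
    not orthogonal to the leading eigenvector of the covariance matrix.
    """
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    # Covariance matrix: average outer product of the centered observations.
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(d)] for a in range(d)]
    u = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        # Multiply by the covariance matrix and renormalize to unit length.
        v = [sum(cov[a][b] * u[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        u = [x / norm for x in v]
    variance = sum(u[a] * cov[a][b] * u[b] for a in range(d) for b in range(d))
    return u, variance


# Toy data lying exactly on the line y = 2x (illustrative only).
data = [(-2.0, -4.0), (-1.0, -2.0), (0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
direction, variance = first_principal_component(data)
```

For spectra with many wavelengths, a library eigendecomposition or singular value decomposition routine would normally be preferred, since it returns all components at once.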
Figure 8 shows the results of the first three components of the PCA. It could be seen that several peaks and valleys were reinforced, as they were supposed to contribute to signal variability between tissue types.
Regarding the number of principal components needed for adequate spectrum reconstruction, that is, for retaining the significant signal information, the first component alone presents on average around 82% similarity, while with just 40 components the similarity increases to over 99%. This makes the dimensionality reduction evident, as only the desired number of components needs to be used, instead of the whole spectrum.
Figure 9 shows the results of just the first two PCA components as a function of tissue type. Although several tissue types were not clearly distinguished, the potential of the analysis with only two components seemed promising.
As in the previous case regarding characteristic parameters in Section 3.1, the same classification algorithms, LDA, QDA, kNN, CART and NB, are applied to only the first 40 PCA components. In the case of the application of LDA, the confusion matrix is as follows:
In this case, again for LDA, the values of specificity and sensitivity appear in Table 4.
These results clearly show a great classification power of the discriminant. As in the previous case of characteristics, other classification algorithms were applied, and the accuracy results are shown in Table 5.
Large differences in accuracy can be appreciated in Table 5, although several classification algorithms are over 95%. These results imply that a high classification accuracy can be obtained from these data.
4. Discussion
In this work two main classification approaches have been implemented and analyzed, one based on characteristics extraction, and the other one on principal component analysis. Both of them were intended for dimensionality reduction, as the large amount of data in each spectrum makes it difficult to provide an easily computable approach. Several classification approaches have been tried for each alternative.
The first option, the one based on spectral characteristics, focused on relevant peaks and valleys in the spectra, as suggested by knowledge of the main biochemical components of biological tissues, which usually condition spectroscopic measurements. This prior information is expected to increase the efficiency of the approach. As said, not only these points but also the gradients they form are considered. According to the initially proposed points, a total of 81 gradients should be included in the analysis. As these points must be searched for automatically, by calculating the first derivative of the spectrum, and the analysis must be made with 81 values for each spectrum, it can be computationally intensive. The analysis concluded that just 14 gradients were the most significant ones, greatly reducing the dimensionality of the problem. The subsequent application of classification algorithms, whose main results are in Table 3, showed a maximum accuracy of almost 95% for the k-NN approach. The next one was CART, with around 93%, followed by LDA, which dropped to almost 82%. A classification accuracy of around 95% therefore seemed quite promising for the final application.
Regarding the second option, based on principal component analysis, the results showed that it was possible to employ just 40 components in order to obtain very significant results. PCA is a quite common algorithm that can be implemented relatively easily. In contrast to the previous approach, the selection of the data is made on purely mathematical grounds, and not from prior knowledge of the biochemical components of the tissues. This could perhaps make the procedure more reliable in a situation with unknown samples or other elements that alter the spectra of the samples. The application of the same classifiers as in the previous case, as shown in Table 5, gave accuracies of even more than 99% for the QDA and LDA approaches, while kNN and CART were over 94%, and only NB dropped to 80%. The results were therefore better than those of the previous approach, as an almost perfect classification seemed to be possible. The interest of the results lies in the final classification accuracy obtained over a large number of measurements, which demonstrates the potential feasibility of the technique for the clinical application of healthy tissue discrimination.
The results are very promising for solving the issue presented at the beginning, namely that discriminating tissue types is a clinical need. However, despite the results, some considerations for clinical translation should be made. On the one hand, the samples were specifically extracted and treated so as to ensure a high purity regarding tissue type, which allows good training of the classification algorithms. In real clinical setups, there would be occasions in which more than one tissue type is in front of the operating area, and the response of the algorithms would probably change. Nevertheless, an appropriate classification would first need the analysis made in the present work. The spectra measured in this work constitute a core part of any classification algorithm for multilayered biological tissues, as they are obtained from known samples of one specific tissue type. Monte Carlo simulations regarding penetration depth and volumetric performance, as a function of tissue type and wavelength, show penetration depths of between 1 and 2 mm. Several relevant tissue types of interest in clinical practice have thicknesses of that order of magnitude or even larger. A precise model of the reflectance spectrum, even including a Lambertian model of the emission, could perhaps add more relevant information for the classification. However, such a model would rely on specific assumptions about the distance to the sample, the angular incidence, and the biological tissue structure, which could not be guaranteed in clinical practice with an optical probe at uncontrolled distances from the sample. The term diffuse reflectance spectroscopy is widely used in the field for spectral reflectance measurements, although sometimes the measured spectra might not strictly follow a theoretical diffuse model. A complete experimental analysis of multilayered tissue classification would be the next step in this work.
On the other hand, several other components, such as additional blood, rinse water or other surgical elements, could be in the operating area at the time of the measurement, and those elements would definitely alter the measured spectra. All these considerations should be taken into account in the next steps, as they would certainly decrease the classification accuracy.
The main aim of the manuscript regarding the clinical application is to provide first a simple and affordable tool for timely tissue identification, by means of a fiber probe. This would allow the surgical cutting procedure even to automatically stop when an undesired biological tissue is in front of the instrument, such as a nerve or a blood vessel. The technique could be extended to 2D measurements if we would like to obtain a tissue type map of the area of interest. That scenario would be possible by a fiber scanner, on one side, or even by a multispectral camera. Both approaches present challenges, as a fiber scanner could be expensive, difficult to stabilize and not fast enough, and a multispectral camera is expensive and provides limited spectral resolution.