Breath Analysis Using eNose and Ion Mobility Technology to Diagnose Inflammatory Bowel Disease—A Pilot Study

Early diagnosis of inflammatory bowel disease (IBD), including Crohn’s disease (CD) and ulcerative colitis (UC), remains a clinical challenge with current tests being invasive and costly. The analysis of volatile organic compounds (VOCs) in exhaled breath and biomarkers in stool (faecal calprotectin (FCP)) show increasing potential as non-invasive diagnostic tools. The aim of this pilot study is to evaluate the efficacy of breath analysis and determine if FCP can be used as an additional non-invasive parameter to supplement breath results, for the diagnosis of IBD. Thirty-nine subjects were recruited (14 CD, 16 UC, 9 controls). Breath samples were analysed using an in-house built electronic nose (Wolf eNose) and commercial gas chromatograph–ion mobility spectrometer (G.A.S. BreathSpec GC-IMS). Both technologies could consistently separate IBD and controls [AUC ± 95%, sensitivity, specificity], eNose: [0.81, 0.67, 0.89]; GC-IMS: [0.93, 0.87, 0.89]. Furthermore, we could separate CD from UC, eNose: [0.88, 0.71, 0.88]; GC-IMS: [0.71, 0.86, 0.62]. Including FCP did not improve distinction between CD vs. UC; eNose: [0.74, 1.00, 0.56], but rather, improved separation of CD vs. controls and UC vs. controls; eNose: [0.77, 0.55, 1.00] and [0.72, 0.89, 0.67] without FCP, [0.81, 0.73, 0.78] and [0.90, 1.00, 0.78] with FCP, respectively. These results confirm the utility of breath analysis to distinguish between IBD-related diagnostic groups. FCP does not add significant diagnostic value to breath analysis within this study.


Introduction
Inflammatory bowel disease (IBD) is a chronic condition of unknown aetiology, which includes Crohn's disease (CD) and ulcerative colitis (UC) [1]. Both conditions involve inflammation of the gut and are particularly unpleasant. While UC only affects the colon (large intestine), CD can affect any part of the digestive system from the mouth to the anus [2]. IBD is a common condition in the Western world, affecting over 250,000 people in the UK and 28 million worldwide [3]. The estimated annual cost of treatment, per patient, is approx. €30,000 with an average of 20% loss of working productivity [1,4]. This is due to the relapsing nature of the disease: there may be times when the symptoms are severe (flare-ups), followed by long periods when there are few or no symptoms at

Subjects
A total of 39 subjects were recruited for this pilot study, as part of the larger 'Famished' study. Ethical approval was obtained from the Warwickshire research ethics committee (IRAS ref: 18717). Thirty patients had histologically confirmed IBD (14 CD, 16 UC), and nine were healthy control volunteers. IBD patients were recruited from dedicated IBD clinics at University Hospitals Coventry and Warwickshire (UHCW), UK. Details of medication and disease activity were recorded, and the simple colitis activity index (SCAI) for UC and the Harvey Bradshaw index (HBI) for CD were calculated at the time of recruitment. Healthy controls were volunteers who did not report any overt gastrointestinal symptoms and were not on routine oral medication or recovering from any recent illnesses. An overview of the demographic data of IBD patients and healthy controls is shown in Table 1. The mean age of the IBD cohort was 49.7 years (standard deviation 17.5) and there were 18 males and 12 females.
As shown in Table 1, inflammation parameters such as C-reactive protein (CRP) and FCP were recorded for the IBD cohort. FCP is a good indicator of inflammation in the bowel [24]. FCP levels in CD and UC patients have been shown to differ by over 55 µg/g (higher in those with UC) [25]. In our IBD cohort, mean FCP differs between CD and UC patients by almost 300 µg/g. Mean scores of 117 and 414 µg/g, respectively, indicate that the CD group is in remission and the UC group has active disease. A box plot of FCP scores for CD and UC is shown in Figure 1.

Electronic Nose (eNose)
The term 'eNose' describes an instrument formed from an array of sensors with overlapping sensitivity [26]. Electronic noses are used in food and drink-related industries [27], as well as environmental monitoring [28]. Recent advancements in eNose technologies, such as improvements in gas-sensor design and pattern-recognition algorithms, have led to increased applicability of eNoses in the medical domain [29]. The Wolf eNose system (Warwick OLFaction: Wolf) was built in-house at the School of Engineering, University of Warwick [30]. The device consists of 13 sensors, employing a range of different sensor technologies, including eight amperometric electro-chemical sensors (Alphasense Ltd., Essex, UK), two non-dispersive infra-red (NDIR) optical devices (Clairair Ltd., Essex, UK) and a single photo-ionisation detector (Mocon, Minneapolis, USA). The sensors deployed in the Wolf eNose are summarised in Table 2. Since the Wolf eNose was designed and developed in-house, there is a high level of freedom allowing the instrument to be tailored to specific applications. A custom sample injection system was developed for this study, to enhance the capabilities of the Wolf eNose to analyse breath samples.
Alveolar breath samples, for Wolf eNose analysis, were collected using a commercially available breath sampling device, the Bio-VOC (Markes Int., Llantrisant, UK). Alveolar breath refers to the last portion (350 mL) of exhaled breath, expelled from within the lungs and the lower airways, which has undergone gaseous exchange with the blood in the alveoli [31]. A healthy adult expires approximately 500 mL of air with each breath, of which the first 150 mL consists of dead-space air (no transfer of oxygen) from the upper airways and nasopharynx [32]. Subjects were asked to perform a single slow vital capacity breath into a Bio-VOC unit, in order to trap the last 129 mL of exhaled breath [33]. Subjects were supplied with a disposable cardboard mouthpiece, and the Bio-VOC was cleaned thoroughly using antibacterial, alcohol-free sanitary wipes after every sample. Collected breath samples were injected into the Wolf eNose inlet port using a custom linear-actuator injection system. The injection system was manufactured using 5 mm acrylic sheets to support a 12 V, 200 mm linear actuator motor (JS-TGZ, Jianshun, Shenzhen, China). The Bio-VOC was secured into the structure, as shown in Figure 2. Thereafter, the plunger was compressed automatically, at a constant rate, over a 30-s injection period. Figure 3 shows the typical sensor responses from the Wolf eNose to the exhaled breath sample of a CD patient. The chemical names refer to the target gas each sensor is sold to detect.

Gas Chromatography-Ion Mobility Spectrometry (GC-IMS)
A more recent, alternative approach to eNoses has been the use of portable GC-IMS analysers, which have demonstrated capabilities in medical diagnostics [34]. The BreathSpec GC-IMS (G.A.S., Dortmund, Germany) is a commercially available instrument, consisting of a gas chromatograph (GC) and an ion mobility spectrometer (IMS), collectively known as GC-IMS. This instrument uses a GC as a pre-separator (based on chemical interactions with the column), followed by an IMS detector. In this case, the BreathSpec is equipped with an SE54 mid-range polarity column. The IMS uses a drift tube where the time taken for molecules to traverse the tube against a buffer gas (in this case nitrogen) is measured. This buffer gas is generated using a Nitrostation 50LC (Leman Instruments, Geneva, Switzerland) with 99.999% purity. The gas slows down the ions, with larger ions being slowed more than smaller ones. Ions are then collected on the detector (Faraday plate), to deliver a time-dependent signal that corresponds with ion mobility [35]. This technique can measure substances in the low parts-per-billion (ppb) range and delivers measurement results in less than 10 min.
The Bio-VOC was not required for G.A.S. BreathSpec GC-IMS analysis. Subjects were provided with a disposable plastic mouthpiece, which pushes into the mouthpiece holder/sample inlet and connects directly to the side-panel of the instrument. This sampling procedure also collects end-tidal breath, since only the last four seconds of exhaled breath are collected for analysis [36]. The G.A.S. BreathSpec GC-IMS instrument is shown in Figure 4. The obtained sample is represented as a topographic map, whereby each datapoint is characterised by the retention time in the chromatographic column (seconds), the drift time (milliseconds) and the intensity of ion current signal (millivolts), indicated by colour. Laboratory Analytical Viewer (LAV) software (v2.2.1, G.A.S., Dortmund, Germany) was used to analyse the chromatograms.
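The link between drift time and ion mobility can be illustrated with the standard drift-tube relation: an ion that crosses a tube of length L in time t_d under a voltage V has mobility K = L² / (V · t_d). The following is a minimal sketch of that relation only. The tube length, voltage and drift times are hypothetical values chosen for illustration and are not specifications of the BreathSpec instrument.

```python
def ion_mobility(length_cm: float, voltage_v: float, drift_time_ms: float) -> float:
    """Return ion mobility K in cm^2/(V s) from drift-tube parameters.

    Uses K = L^2 / (V * t_d), which follows from drift velocity
    v = L / t_d and field strength E = V / L, with v = K * E.
    """
    if length_cm <= 0 or voltage_v <= 0 or drift_time_ms <= 0:
        raise ValueError("all parameters must be positive")
    drift_time_s = drift_time_ms / 1000.0
    return length_cm ** 2 / (voltage_v * drift_time_s)


# Hypothetical values for illustration only (not instrument specifications).
TUBE_LENGTH_CM = 10.0
DRIFT_VOLTAGE_V = 5000.0

# A shorter drift time corresponds to a higher mobility (typically a smaller ion).
fast_ion = ion_mobility(TUBE_LENGTH_CM, DRIFT_VOLTAGE_V, drift_time_ms=8.0)   # 2.5
slow_ion = ion_mobility(TUBE_LENGTH_CM, DRIFT_VOLTAGE_V, drift_time_ms=12.0)  # about 1.67
```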

Data Analysis
Both the Wolf eNose and G.A.S. BreathSpec GC-IMS require an initial feature extraction step before classification. The aim of feature extraction is to select robust information from the characteristic response; for example, selecting the maximum value of the original sensor response curve [37]. Supervised feature selection and class prediction were performed using a k-fold cross-validation method, where k = 10. This method involves partitioning the original data set into 10 equally-sized subsets. Of the 10 subsets, a single subset is retained as the validation data for testing the model, and the remaining subsets are used as training data. This process is repeated 10 times (number of folds), with each subset used once as validation data. The 10 results are then combined to produce a single estimation. This method of cross-validation is commonly used in breath analysis, to ensure robustness and avoid 'false-positive' errors (also known as type I errors) [38]. These occur when a test result indicates that a condition is true, when it is known to be false [39]. In our case, this would refer to a test result that indicates that a subject has IBD, when they are in fact from the healthy control group. A Wilcoxon rank sum test was used to calculate p-values for each feature, with the most informative features used for classification. This is undertaken inside the fold, to avoid potential overfitting. Class predictions and sensitivity/specificity calculations were performed using five classification algorithms, specifically: support vector machine (SVM), sparse logistic regression (SLR), Gaussian process, neural network, and random forest (RF).
In addition to this analysis, the G.A.S. BreathSpec GC-IMS can potentially identify unknown VOCs that contribute significantly to the classification analysis. Using GC-IMS Library Search software (v1.0.1, G.A.S., Dortmund, Germany), we can identify compounds based on gas chromatographic retention times and ion mobility drift times, by referring to a NIST database. The database includes about 400,000 annotated retention indices and an estimated 83,000 compound entries [40]. To identify unknown compounds, GC-IMS files were loaded into the GCxIMS software and VOC identification was performed by simply clicking on the region of interest. The software then refers to the NIST database to generate a list of likely compound matches. A retention time range is provided, to indicate whether the suggested compound matches the expected retention time on the topographic map. Compounds with a close match in retention time and chemical structure were chosen as the identified compound.
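The cross-validation scheme described above, with feature ranking carried out inside each fold, can be sketched as follows. This is an illustrative outline under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the rank sum p-value uses a normal approximation without tie correction, a simple nearest-centroid rule stands in for the five classifiers, and the data are synthetic.

```python
import math
import random


def rank_sum_p(a, b):
    """Two-sided Wilcoxon rank sum p-value (normal approximation, no tie correction)."""
    pooled = sorted((v, g) for g, grp in enumerate((a, b)) for v in grp)
    rank_sum_a, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the tied ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for _, g in pooled[i:j] if g == 0)
        i = j
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return math.erfc(abs((rank_sum_a - mean) / sd) / math.sqrt(2.0))


def k_fold_indices(n, k, seed=0):
    """Shuffle the sample indices and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def cross_validate(X, y, k=10, n_features=2, seed=0):
    """Return one out-of-fold prediction (0 or 1) per sample."""
    n = len(X)
    preds = [None] * n
    for test_idx in k_fold_indices(n, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        # Rank features on the training portion only, so the held-out
        # samples never influence which features are selected.
        pvals = []
        for f in range(len(X[0])):
            pos = [X[i][f] for i in train_idx if y[i] == 1]
            neg = [X[i][f] for i in train_idx if y[i] == 0]
            pvals.append((rank_sum_p(pos, neg), f))
        chosen = [f for _, f in sorted(pvals)[:n_features]]
        # Nearest-centroid rule: a stand-in for SVM, SLR, RF, etc.
        centroids = {}
        for label in (0, 1):
            rows = [X[i] for i in train_idx if y[i] == label]
            centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in chosen]
        for i in test_idx:
            dist = {label: sum((X[i][f] - c) ** 2 for f, c in zip(chosen, cen))
                    for label, cen in centroids.items()}
            preds[i] = min(dist, key=dist.get)
    return preds


# Synthetic data: 30 "IBD" and 9 "control" samples; only feature 0 is informative.
rng = random.Random(1)
y = [1] * 30 + [0] * 9
X = [[rng.gauss(3.0 * label, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for label in y]
preds = cross_validate(X, y, k=10, n_features=1)
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
```

Selecting features on the full data set before splitting would let the validation samples influence the model, which is the overfitting risk that in-fold selection is meant to avoid.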

Quality Assurance and Control
For quality assurance, the position and quality of the reactant ion peak (RIP) on the GC-IMS was regularly checked for signs of contamination. The RIP refers to the constant peak in the spectrum, which results from the carrier gas being always present in the measurement process. Moreover, samples were collected in the same setting, by the same operator, throughout the entire study. This is an important factor to ensure consistent sampling procedures, since the collection process is manually triggered by the operator while the subject exhales through the mouthpiece. Furthermore, the GC-IMS instrument was normalised using a standard ketone mix (2-butanone, 2-pentanone, 2-hexanone, 2-heptanone, 2-octanone and 2-nonanone), to match the GC-IMS Library Search software with the equipped column. For Wolf eNose calibration, the headspace gas from several chemical standards was tested (ketones, esters, alcohols, alkanes and aromatics). These experiments revealed various relationships between sensor responses and concentrations. Furthermore, testing the different compounds individually produced responses from different sets of sensors, which confirms a degree of selectivity [30]. In addition to this, quality control procedures were implemented. This involved collecting regular room air samples to monitor changes in ambient air and identify possible exogenous VOCs, i.e., compounds which do not originate from within the body.

Confounding Factors
In a recent study, Blanchet et al. [41] explored factors that influence the VOC content in human breath and stated that any application of exhaled air for diagnostics should consider possible confounders. For this IBD study, the following confounders were considered: body mass index (BMI), smoking habits and gender. BMI categories include: underweight (<18.5 kg/m²), normal weight (18.5-24.9 kg/m²), overweight (25.0-29.9 kg/m²) and obese (≥30.0 kg/m²) [42]. To simplify the analysis, underweight and normal weight were combined into a single category, as were overweight and obese. Smokers can be broadly defined as individuals who have smoked at least 100 cigarettes in their lifetime [43]. Thus, never smokers are defined as adults who have never smoked or have smoked fewer than 100 cigarettes in their lifetime. These definitions were used to categorise smokers and non-smokers. Gender groups were divided into male and female; this factor is of particular importance, since it was not possible to attain a gender-balanced UC group during recruitment. The confounding factor groups are shown in Table 3, with roughly evenly matched groups and various combinations of CD, UC and control subjects.
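The grouping rules above can be written down compactly. The sketch below is illustrative only: the function names and group labels are hypothetical, and the thresholds are the BMI cut-off of 25.0 kg/m² and the 100-cigarette lifetime definition stated in the text.

```python
def bmi_group(bmi: float) -> str:
    """Collapse the four BMI categories into the two groups used for the analysis."""
    if bmi <= 0:
        raise ValueError("BMI must be positive")
    # Underweight and normal weight (< 25.0 kg/m^2) form one group;
    # overweight and obese (>= 25.0 kg/m^2) form the other.
    return "under/normal weight" if bmi < 25.0 else "overweight/obese"


def smoking_group(lifetime_cigarettes: int) -> str:
    """Classify by the 100-cigarette lifetime definition."""
    if lifetime_cigarettes < 0:
        raise ValueError("cigarette count cannot be negative")
    return "smoker" if lifetime_cigarettes >= 100 else "never smoker"


# Hypothetical subject: BMI 27.3 kg/m^2, about 40 cigarettes smoked in total.
example = (bmi_group(27.3), smoking_group(40))  # ("overweight/obese", "never smoker")
```

Note that under this definition an ex-smoker is still placed in the 'smoker' group, since the rule depends only on lifetime consumption.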

Results
Analysis results are presented as receiver operating characteristic (ROC) curves. The associated area under the curve (AUC) is a measure of how well parameters can distinguish between diagnostic groups. In our case, the groups were IBD vs. controls, and CD vs. UC. Generated ROC curves for G.A.S. BreathSpec GC-IMS and Wolf eNose, IBD vs. controls and CD vs. UC, are shown in Figures 6 and 7, respectively. The RF algorithm consistently performed best. The analysis results are summarised in Tables 4 and 5.
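For readers unfamiliar with these metrics, the sketch below shows how AUC, sensitivity and specificity can be computed from classifier outputs. It is a generic illustration with made-up scores, not the study's data or software; the AUC is calculated as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, which is equivalent to the area under the ROC curve.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = disease."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up classifier scores for three patients and three controls.
patient_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2]
example_auc = auc(patient_scores, control_scores)  # 8 of 9 pairs ranked correctly

# Thresholding the same scores at 0.5 gives hard class predictions.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1 if s >= 0.5 else 0 for s in patient_scores + control_scores]
sens, spec = sensitivity_specificity(y_true, y_pred)  # 2/3 and 2/3
```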

Chemical Identification
VOC analysis indicates that two compounds play a crucial role in distinguishing between IBD and controls. Chemical identification for the BreathSpec instrument, using the GC-IMS Library Search software, suggests that the best matches for the identified compounds include: butanoic acid (2-methyl-, propyl ester) and ethanoic acid (3-methyl-1-butyl ester).
Specific VOCs cannot be identified using the Wolf eNose. However, in an attempt to identify the chemical groups that contribute most to the separation between diagnostic groups, we have analysed the normalised average change in Wolf eNose outputs, per group. A radar plot of the responses is shown in Figure 8. Ammonia, sulphur dioxide (SO2) and nitrogen dioxide (NO2) sensors showed the greatest changes in sensor outputs and thereby contributed most to the separation between diagnostic groups. Significant changes in ammonia were observed for both UC and CD patients. Changes in NO2 were marginally greater in CD than in UC and controls, while UC is associated with increased sulphur dioxide. The other sensor outputs show small variations between groups, such as increased ethylene oxide in UC over controls.
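One way to compute a normalised average change per group is sketched below. The exact normalisation used for Figure 8 is not stated, so this is an assumption: each sensor's group means are divided by the largest absolute group mean for that sensor, and the change in a response curve is taken as its peak value minus its first (baseline) reading. Function names and numbers are hypothetical.

```python
def response_change(curve):
    """Change in sensor output: peak value minus the first (baseline) reading."""
    return max(curve) - curve[0]


def normalised_group_means(changes_by_group):
    """Average each sensor's change per group, then scale each sensor to [-1, 1].

    changes_by_group maps a group name to a list of subjects, each subject
    being a list with one change value per sensor.
    """
    means = {g: [sum(col) / len(col) for col in zip(*rows)]
             for g, rows in changes_by_group.items()}
    n_sensors = len(next(iter(means.values())))
    out = {g: [] for g in means}
    for s in range(n_sensors):
        # Assumed normalisation: divide by the largest absolute group mean.
        scale = max(abs(m[s]) for m in means.values()) or 1.0
        for g in means:
            out[g].append(means[g][s] / scale)
    return out


# Made-up changes for two sensors and two subjects per group.
changes = {
    "CD": [[2.0, 1.0], [4.0, 1.0]],
    "UC": [[1.0, 4.0], [2.0, 2.0]],
    "control": [[0.5, 1.0], [1.0, 0.0]],
}
radar_values = normalised_group_means(changes)
peak_change = response_change([1.0, 1.2, 3.5, 2.0])  # 2.5
```

Scaling per sensor keeps sensors with very different output ranges comparable on a single radar plot.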

Confounding Factors
The analysis previously conducted on IBD and control groups was repeated, using the same analytical techniques and algorithms, on the confounding factor groups, i.e., BMI: under- & normal weight vs. overweight & obese; smoking: smokers vs. never smokers; gender: male vs. female. The analysis results are summarised in Tables 6 and 7. Tables 6 and 7 demonstrate that the possible confounding factors of BMI, smoking and gender have an insignificant influence on breath content. This is particularly true for BMI and smoking, as they achieve an AUC of around 0.5 for both technologies. Gender seems to have the most influence on breath content, with an AUC of around 0.6.

Faecal Calprotectin (FCP)
The demonstrated Wolf eNose analysis for CD vs. UC was repeated using FCP as an additional feature, on a reduced dataset of 20 samples (11 CD, 9 UC) to match the availability of FCP scores. This analysis could not be repeated on the G.A.S. BreathSpec GC-IMS, because the features from this device are made up of clusters with numerous data points and are thus not directly compatible with single feature values, such as FCP.
The combined breath with FCP analysis was compared to breath without FCP, for the same dataset. The analysis results are summarised in Table 8. In addition to this, we investigated whether the combined analysis of breath with FCP could better distinguish between CD vs. controls and UC vs. controls, as shown in Table 9.

Discussion
In recent years, numerous studies have investigated the efficacy of breath VOCs to diagnose IBD [16,44,45]. At least three studies utilised selected ion flow tube mass spectrometry (SIFT-MS) to distinguish IBD from healthy controls, as well as to separate UC from CD. Hicks et al. [44] identified 6 VOCs (hydrogen cyanide, ammonia, dimethyl sulphide, hydrogen sulphide, butanal, and nonanal) which differed significantly in concentration between diagnostic groups. Another study identified 3 specific VOCs (1-octene, 1-decene, (E)-2-nonene) as relevant for predicting the presence of IBD (AUC 0.96), but did not identify any significant difference in VOCs between CD and UC [16].
While the GC-TOF-MS study by Smolinska et al. was able to exceed the diagnostic performance achieved in this pilot study, the Wolf eNose and G.A.S. BreathSpec GC-IMS are a fraction of the price of GC-TOF-MS and SIFT-MS technologies (10-20% of the cost). Furthermore, the G.A.S. BreathSpec instrument is user-friendly, since it does not require trained operators, and is easy to move when mounted on a trolley. These practical advantages, in combination with chemical identification abilities, provide key advantages for GC-IMS technology in a clinical setting.
In this study, we demonstrated that both eNose and GC-IMS were consistently able to separate those with IBD from healthy control volunteers, regardless of disease activity as reflected by the FCP scores. Additionally, both technologies were able to provide some separation between CD and UC (GC-IMS p = 0.026; eNose p = 0.0001). The results indicate that the G.A.S. BreathSpec GC-IMS is better suited towards distinguishing between IBD and controls, while the Wolf eNose can better separate CD from UC. These results are expected to be further improved by increasing the number of recruited subjects. The sensor array deployed in the Wolf eNose is focused towards inorganic gases, which most likely accounts for the differences in diagnostic accuracy achieved by the employed technologies. Ammonia, SO2 and NO2 sensors contributed significantly to the analysis of the Wolf eNose. Ammonia has an established link to IBD breath [16,44,47], since it is one of the intermediaries generated from bacterial fermentation of proteins [48]. UC was associated with higher SO2 levels in our study, which has been observed previously [49]. It has been suggested that residential exposures to SO2 and NO2 may increase the risk of early onset of CD and UC [50].
Increases in the aforementioned VOCs, butanoic acid and ethanoic acid, contributed significantly to the efficacy of our analysis for G.A.S. BreathSpec GC-IMS. These VOCs have been recently identified as important discriminatory volatile organic metabolites for IBD [51]. Short-chain fatty acids, such as butyric, propionic and acetic acids, are produced in the colon by fermentation of fibre. In particular, butanoic acid (also known as butyric acid) is a key component for health in the colon [52] and is the main energy substrate for colonocytes [53]. Butanoic acid has therefore been suggested to play an important role in the prevention and treatment of distal UC [54] and CD [55]. The variations in the identified compounds between IBD subjects and controls may be crucial for diagnostic purposes and need to be further investigated.
Analysis results of possible confounding factors show that BMI and smoking habits have insignificant influence on breath content. Smoking would generally be expected to influence breath content. However, it is worth noting that 19 of the 21 'smoker' subjects consider themselves to be ex-smokers. It is therefore unsurprising that greater differentiation was not possible in this case. Gender showed a more significant influence on breath content than the other two factors [AUC: 0.66, sensitivity: 0.68, specificity: 0.65]. While IBD generally affects men and women equally, some studies from North America show that UC is more common in men than women [56]. The unbalanced gender counts in the UC group (11M: 4F) could therefore be responsible for strengthening the separation between males and females for the confounding factor analysis. In addition to this, gender affects metabolism, which can lead to differences in breath content [41]. However, this factor was not significant enough to create two distinct groups or undermine the IBD-related analysis. Age was not considered in the confounder analysis; however, this factor is known to have limited effect on breath content, with some studies showing no statistically significant associations between age and common breath gas metabolites [57]. Moreover, unlike many other diseases, IBD can occur at any age (most likely between . Age is therefore unlikely to have had a significant effect on the conducted analysis. Lastly, the effect of medication could not be considered in the confounding factor analysis, due to the number and variety of medications and treatments prescribed to each IBD subject. It is possible that the strong results distinguishing between IBD and controls, as well as CD and UC, could be related to the effect of medication; however, the same class of drugs was proportionally present in both IBD groups, so this factor is less likely to be a confounder.
Combining breath analysis with FCP reduced specificity, PPV and AUC, compared to breath analysis without FCP. While the difference in mean FCP scores between CD and UC is almost 300 µg/g (414.1 vs. 116.9 µg/g), this feature is prone to misclassification because all CD scores fall within the lower range of UC scores. Thus, including FCP with breath analysis does not improve distinction between CD vs. UC within this study. However, as shown in Table 9, FCP did improve separation of CD vs. controls and UC vs. controls. This is likely due to the significantly higher FCP scores in both CD and UC, when compared with the normal reference range of a healthy individual (<50 µg/g) [58]. Nonetheless, this specific application of FCP has little practical or clinical value, since the same conclusions can be derived using FCP scores alone. Moreover, there would be added costs (£18 per test) to conduct FCP tests alongside breath analysis. These additional costs cannot be justified without a significant improvement in diagnostic performance in distinguishing between CD and UC. GC-IMS and eNose technologies therefore show the greatest potential as non-invasive diagnostic tools for IBD.

Conclusions
The results from this pilot study confirm the utility of breath VOC analysis to distinguish between IBD and healthy control volunteers, and CD from UC. To the best of our knowledge, this study was the first breath-based investigation of IBD utilising GC-IMS and eNose technology. Both technologies consistently showed the ability to separate those with IBD and controls [AUC, sensitivity, specificity], eNose: [0.81, 0.67, 0.89]; GC-IMS: [0.93, 0.87, 0.89]. The G.A.S. BreathSpec GC-IMS is better suited towards distinguishing between IBD and controls, while the Wolf eNose can better separate CD from UC. Compound analysis has identified two breath VOCs, which are likely to have a direct link to IBD: butanoic acid and ethanoic acid. These compounds played a crucial role in separating those with IBD from controls. Analysis of possible confounding factors indicates that BMI, smoking habits and gender have insignificant influence on breath content. Wolf eNose analysis was repeated on a reduced dataset, with FCP scores serving as an additional feature. This resulted in a poorer separation of CD and UC, which indicates that the efficacy of breath analysis is reduced when supplemented with FCP [AUC (95% confidence interval), sensitivity, specificity]: [0.85 (0.63-1.00), 1.00, 0.67] without FCP, and [0.74 (0.50-0.98), 1.00, 0.56] with FCP. The inclusion of FCP was able to improve diagnostic performance for CD vs. controls and UC vs. controls; however, this application has limited clinical value. Thus, the G.A.S. BreathSpec GC-IMS and Wolf eNose instruments offer the greatest potential as non-invasive, high-throughput, real-time diagnostic and screening tools for IBD, in point-of-care use. Moreover, since breath testing using these technologies could be undertaken during routine consultation appointments, it has the potential to fundamentally change the current clinical diagnostic and assessment pathways for IBD.
Author Contributions: R.P.A. and J.A.C. conceptualized and designed the study. The Wolf was designed by J.A.C. Patients were recruited by J.K. Samples were collected by A.T. and J.K. Data analysis was conducted by A.W. and A.T. Original draft preparation, review and editing of the manuscript were completed by A.T., R.A.P. and J.A.C.