Skin Pigmentation Influence on Pulse Oximetry Accuracy: A Systematic Review and Bibliometric Analysis

Pulse oximetry has become the standard in primary and intensive care units, especially as a triage tool during the current COVID-19 pandemic. Hence, a deeper understanding of the measurement errors that can affect its readings is a key element in clinical decision-making. Several factors may influence the accuracy of pulse oximetry, such as skin color, body temperature, altitude, or patient movement. The effect of skin pigmentation on pulse oximetry accuracy has long been studied, with some contradictory conclusions. Recent studies have shown a positive bias in oxygen saturation measurements in patients with darkly pigmented skin, particularly under low saturation conditions. This review aims to study the literature that assesses the influence of skin pigmentation on the accuracy of these devices. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement to conduct a retrospective systematic review of the literature up to February 2022 using the WOS, PubMed, and Scopus databases. We found 99 unique references, of which only 41 satisfied the established inclusion criteria. A bibliometric and scientometric approach was used to examine the outcomes of an exhaustive survey of the thematic content and trending topics.


Introduction
New non-invasive methods for accurate monitoring of health variables, applicable in laboratory tests on patients or healthy adults such as athletes, have been developed in recent years. These methods include heart rate determination by electrocardiography, photoplethysmography (PPG) [1,2], light spectroscopy [3], and pulse oximetry [4][5][6]. PPG is a simple, inexpensive, and non-invasive optical method that detects arterial pulsation and measures changes in blood volume from the amount of incident light reflected or transmitted [7,8].
Pulse oximeter devices incorporating PPG are employed to evaluate peripheral blood oxygen saturation based on the measurement of red and infrared light absorption levels, which are directly influenced by hemoglobin levels [4,5]. The arterial oxygen saturation (SaO2) represents the percentage of binding sites on hemoglobin carrying oxygen. As the partial pressure of oxygen (pO2) rises, more O2 molecules are available to bind with hemoglobin [9,10]. In clinical practice, oxygen saturation by pulse oximetry (SpO2) gives information about the amount of oxygen available in the tissues by measuring how many of these binding sites are bound to oxygen.
During exercise stress tests or respiratory rehabilitation therapies, pulse oximetry is a valuable method to determine the limit of cardiopulmonary stress, which is characterized by a relevant decrease in oxygen saturation [11]. This technique has become the standard in primary and intensive care units and other clinical settings, such as anesthesiology, which require constant and frequent monitoring of vital signs [1]. In addition, pulse oximetry has been widely used in rehabilitated COVID-19 patients at risk of developing respiratory distress [12][13][14][15][16][17][18]. Indeed, a recent retrospective cohort study of high-risk patients with COVID-19 pneumonia found that mortality was 48% lower in patients who used a pulse oximeter to monitor SpO2 than in those without one [19]. Therefore, the current World Health Organization COVID-19 management guideline recommends that symptomatic people with COVID-19 use a home pulse oximeter for SpO2 self-monitoring "as part of a package of care" [20].
Traditional pulse oximeters are composed of two principal components: a light-emitting diode (LED) with two or more wavelengths and a photodetector [21]. The penetration depth of the light is determined by the wavelength and the distance between the light source and the photodetector. Long wavelengths, such as red (660-700 nm) and infrared light, are suitable for measuring deep-tissue blood flow [22]. Conventional red and infrared pulse oximeters are reliable at rest and in normoxic or hyperoxic conditions, but their accuracy worsens in hypoxic conditions [23]. Different factors have been examined to determine their influence on the accuracy of pulse oximetry, such as skin pigmentation [24], body temperature [25], altitude [26], barometric pressure [27], nail polish or henna [28], or motion artifact [29]. Under intense exertion, motion artifacts complicate the use of the pulse oximeter and the accurate interpretation of the data [29][30][31], especially at heart rates higher than 150 beats per minute [32]. Regarding skin pigmentation, several studies have also reported inaccurate SpO2 and heart-rate measurements by PPG in dark-skinned subjects because melanin interferes with the quality of the reflected signal [33][34][35][36][37].
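As an illustration of how red and infrared absorption levels are turned into a saturation value, the following minimal sketch computes the widely used "ratio of ratios" from two PPG traces and maps it to SpO2 with a linear calibration. The coefficients (110 and 25) are a textbook approximation assumed here for illustration only, not the calibration of any device discussed in this review; the function names and the synthetic signals are likewise hypothetical.

```python
import math

def ratio_of_ratios(red, infrared):
    """Return R = (AC_red / DC_red) / (AC_ir / DC_ir) for two PPG traces."""
    def ac_dc(signal):
        ac = max(signal) - min(signal)   # pulsatile (peak-to-peak) component
        dc = sum(signal) / len(signal)   # non-pulsatile baseline (mean level)
        return ac, dc
    ac_red, dc_red = ac_dc(red)
    ac_ir, dc_ir = ac_dc(infrared)
    return (ac_red / dc_red) / (ac_ir / dc_ir)

def estimate_spo2(r, a=110.0, b=25.0):
    """Assumed linear calibration SpO2 = a - b * R, clipped to 0-100 %."""
    return min(100.0, max(0.0, a - b * r))

# Synthetic one-beat traces (hypothetical amplitudes, arbitrary units).
t = [i / 200 for i in range(201)]
red = [1.0 + 0.010 * math.sin(2 * math.pi * x) for x in t]
infrared = [1.0 + 0.020 * math.sin(2 * math.pi * x) for x in t]

r = ratio_of_ratios(red, infrared)
spo2 = estimate_spo2(r)
print(f"R = {r:.2f}, estimated SpO2 = {spo2:.1f} %")
```

In practice the calibration curve is obtained empirically against arterial blood gas measurements, which is why the composition of the calibration population matters for the accuracy questions examined in this review.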
Using multi-wavelength devices improves accuracy because they are less sensitive than the classical devices to variations in hemoglobin saturation [38]. Analyses of PPG signals at different wavelengths have shown that not only is the accuracy of SpO2 improved, but it is also possible to extract more information about skin pathologies at various tissue depths because other substances in the blood (such as methemoglobin and carboxyhemoglobin) can be detected [39]. Green-wavelength pulse oximeters display the greatest modulation depth with pulsatile blood absorption, i.e., a greater absorbance for both deoxyhemoglobin and oxyhemoglobin than infrared light [40].
The impact of skin pigmentation on the accuracy of SpO2 monitoring has long been studied [24,[41][42][43][44][45][46][47][48][49][50][51]. One of the earliest studies on this topic, in 1976, already reported reading errors in dark-skinned patients, reflecting lower blood oxygen saturation values [41]. More recent studies have also revealed that dark skin pigmentation influences the accuracy and performance of pulse oximeter devices, resulting in overestimations [17,[49][50][51][52][53] and an increased risk of occult hypoxemia (SaO2 < 88% despite a normal SpO2 > 92%) [54,55]. This bias is a matter of major concern since drops of only 2% are of particular importance in respiratory rehabilitation, studies of sleep apnea, and athletes performing physical efforts because they can lead to severe consequences for the patient, requiring an external oxygen supply or even hospitalization.
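The occult hypoxemia criterion quoted above can be written as a simple check on paired readings. The sketch below is only an illustration: the thresholds are the ones stated in the text, while the function name and the sample pairs are hypothetical and do not come from any of the reviewed studies.

```python
def is_occult_hypoxemia(sao2, spo2, sao2_limit=88.0, spo2_limit=92.0):
    """True when arterial saturation is low although the oximeter reads normal."""
    return sao2 < sao2_limit and spo2 > spo2_limit

# Hypothetical paired readings: (SaO2 from arterial blood gas, SpO2 from oximeter), in %.
pairs = [(96.0, 97.0), (86.5, 93.0), (87.0, 90.0), (91.0, 94.0)]

flags = [is_occult_hypoxemia(sao2, spo2) for sao2, spo2 in pairs]
prevalence = 100.0 * sum(flags) / len(flags)
print(flags, f"{prevalence:.1f} % of pairs flagged")
```

Only the second pair is flagged: in the third pair the arterial saturation is also low, but the oximeter reading (90%) would already prompt clinical attention, so the hypoxemia is not hidden.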
Conversely, some older studies did not identify a considerable bias related to skin pigmentation [44][45][46]56] suggesting that dark skin pigmentation does not affect the quality of the signal [47]. However, these findings may be misleading because of a small sample size of the dark-skinned population or a weak correlation with skin pigmentation by using uncertain descriptors of ancestry to classify the population.
Due to this increasing evidence, several health organizations and healthcare professionals have raised growing concerns about the clinical accuracy of pulse oximetry in dark-skinned patients [20,[57][58][59][60][61], particularly in detecting occult hypoxemia [54,55]. A recent comprehensive review of different studies comparing SpO2 measurements with arterial blood gas analysis SaO2 in different patients and conditions has reported accurate but imprecise measurements across all levels of skin pigmentation [61]. Although not all the reported studies specifically assess the influence of skin pigmentation, low-certainty evidence of overestimations in dark-skinned patients was reported. This systematic review aims to summarize the available literature in the main scientific databases that examines the impact of skin pigmentation on pulse oximetry, addressing the following research questions:
• RQ1: What are the most significant publications and the ongoing research trends for prospect analysis on this topic?
• RQ2: How does skin color affect the accuracy of pulse oximeter devices incorporating photoplethysmography?
• RQ3: On which human populations have studies been conducted to verify these discrepancies and what methods have been employed to classify skin pigmentation?
Section 2 describes the search method employed to identify all relevant references by defining a suitable search query. The inclusion and exclusion criteria are set out in Section 2.2, and the analysis and data extraction process is presented in Section 2.3. Next, Section 3 introduces the obtained results individually. Section 4 presents the corresponding discussion and answers the three research questions. Finally, the conclusions are presented in the closing section.

Methodology
We performed a systematic review based on bibliometric analysis to build a fair portrayal of the current research trends on this topic. Systematic reviews provide an exhaustive synthesis of the relevant literature to a specific research question [62][63][64]. The bibliometric analysis permits us to analyze a large amount of information extracted from a scientific database, performing a quantitative evaluation of the research topics [65,66]. Therefore, we have performed a scientometrics study to assess whether there is a lack of accuracy in pulse oximetry devices related to dark skin pigmentation.

Literature Search
We conducted a retrospective systematic review of electronic databases between January 1975 and February 2022. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement was applied to pre-identify the selected studies indexed in the PubMed, Web of Science (WOS), and Scopus databases. The research questions were based on the Patient, Intervention, Comparison, and Outcome (PICO) model, taking into consideration the following aspects: population (patient), pulse oximetry method (intervention), comparison (comparison), and outcomes. Two authors (AMC and MFG) independently performed the bibliographic search, while the search algorithm was reviewed and discussed with the rest of the authors (PM, KL or DL). To construct a suitable query and maximize the sensitivity of the search strategy, the research question included the following keywords and Medical Subject Headings (MeSH) combined terms and their related topics:
• "oximetry" according to MeSH terms.
• "pulse oximetry" OR "oximet*" OR "oxygen saturation" to include all the related references.
• "photoplethysmography" according to MeSH terms.
• "photoplethysmography" OR "PPG" as a generic term that refers to the optical imaging technique for detecting arterial pulsation.
• "skin" OR "pigmentation" OR "racial" OR "race" OR "ethnic*" to include all the related references.
• "accuracy" OR "precision" OR "error" OR "reliability" OR "bias" to find all related references.
The final analysis contained primary research about the references that assess skin pigmentation's impact on pulse oximetry. A literature hand search supplemented these results. We display the full query in Table 1.
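To make the structure of the keyword groups listed above concrete, the following sketch assembles them into a single Boolean string. It is an assumption that the groups are joined with AND; the exact combination and the database-specific field tags are those given in the full query of Table 1, which this sketch does not reproduce. The variable and function names are hypothetical.

```python
def or_group(terms):
    """Join quoted terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"

# Term groups taken from the list above; MeSH field tags are omitted.
groups = {
    "oximetry": ["oximetry", "pulse oximetry", "oximet*", "oxygen saturation"],
    "ppg": ["photoplethysmography", "PPG"],
    "skin": ["skin", "pigmentation", "racial", "race", "ethnic*"],
    "accuracy": ["accuracy", "precision", "error", "reliability", "bias"],
}

# Assumption: every group must match, so the groups are combined with AND.
query = " AND ".join(or_group(terms) for terms in groups.values())
print(query)
```

Each database (PubMed, WOS, Scopus) uses its own syntax for field restrictions and truncation, so a string built this way would still need adapting before being submitted to a given search interface.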

Inclusion and Exclusion Criteria
Research based on publications that describe the aforementioned topic was included. Any prospective or retrospective study, including full publications, reviews, conference proceedings, technical notes, and reports, was examined. We included studies that explored the accuracy of any type of pulse oximeter regarding skin pigmentation. Skin pigmentation classified either by an objective measurement or scale or by ancestry descriptors was accepted. If the information in any study was insufficient to fulfill the inclusion criteria, we sought more information by contacting the authors. We excluded the studies for which no response was obtained. The exclusion criteria were:
• Non-human focused studies;
• Skin pigmentation influence not evaluated;
• References that do not focus on pulse oximetry.

Data Extraction and Analysis
A suitable query was designed to maximize the results from Scopus, PubMed, and WOS. The five authors performed the quality and eligibility assessment of the relevant references. The following information was obtained: author, year of publication, journal, source, citations, author's keywords, and research outputs. We also analyzed the bibliography of the retrieved references to add associated studies on the topic. First, titles, abstracts, and author's keywords were independently analyzed by two authors (AMC and MFG, or PM) to obtain enough information about eligibility. Second, the five authors carefully examined the full text to check whether the references satisfied the inclusion criteria. The chosen references were ordered by the total citations received across the three databases. Figure 1 describes the eligibility screening process at each stage of the search. After examining titles and abstracts, 54 references were potentially appropriate for full-text reading. We excluded references that did not satisfy the inclusion criteria. The principal reasons for exclusion were: studies that did not focus on humans (5); skin pigmentation influence not evaluated (4); oxygen saturation not evaluated (4).
[Figure 1. Eligibility screening flow: records screened by title and abstract, n = 99; records that did not meet the inclusion criteria after assessing titles and abstracts, n = 45; records retained after title and abstract screening, n = 54; full-text articles excluded, n = 13 (non-human focused studies, n = 3; skin pigmentation influence not assessed, n = 6; oxygen saturation not evaluated, n = 4); studies included in our systematic review, n = 41.]
The Bibliometrix R package and its web interface, Biblioshiny, were employed to address the proposed research questions [67]. VOSviewer (version 1.6.16) was also used to visualize a knowledge map of the topics and perform a cluster analysis based on keywords and co-occurrence analysis. The co-occurrence technique is usually employed to analyze dynamic patterns and trends of publications related to a specific topic [68].
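The screening counts reported for Figure 1 can be cross-checked with a few lines of arithmetic. The sketch below only uses the stage totals stated in the text (99 records, 45 excluded on title and abstract, 13 excluded on full text); the function name is hypothetical.

```python
def prisma_flow(identified, excluded_on_abstract, excluded_on_full_text):
    """Return the number of records remaining after each screening stage."""
    full_text_read = identified - excluded_on_abstract
    included = full_text_read - excluded_on_full_text
    return {"full_text_read": full_text_read, "included": included}

# Stage totals as reported in the text and in Figure 1.
flow = prisma_flow(identified=99, excluded_on_abstract=45, excluded_on_full_text=13)
print(flow)
```

The result matches the 54 references read in full and the 41 studies finally included.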

Results
We identified 41 studies that fulfilled the inclusion criteria from 31 different sources between 1976 and 2022. We present an overview of this information in Table 2. Table 3 presents the selected references sorted by year of publication. The first author's name, publication year, size of the sample, number of dark-skinned subjects, gender, type of participant, and type of oximeter employed are shown. Table 4 summarizes the information extracted from the included studies. Since not all the studies report the selected characteristics, we indicate in brackets the number of studies that report such information. Each percentage is calculated relative to the total number of studies that report such information.

Assessment of Risk of Bias
Although there is no standard risk of bias tool for methods-comparison studies, we followed the QUADAS-2 protocol based on four domains: patient selection, index test, reference standard, and flow and timing of patients [90]. We considered 35/41 studies for the risk of bias assessment of the individual studies shown in Figure 2. We excluded the reviews, letters, and the studies that did not report the clinical or experimental procedure or the data required to answer the signaling questions that assist in judgments about the risk specified in the QUADAS-2 study checklist [91]. First, one author (AMC or MFG) independently assessed the risk of bias for the selected references. Second, another author (PM, KL or DL) checked the assessment. Discrepancies between assessments were resolved via discussion. Last, we used the robvis tool to visualize the risk of bias assessment [92].
We found 9/35 studies (25.71%) to be at high risk of bias for one domain [17,21,44,45,51,79,82,87,88]. The main reason for this bias is attributed to the selection of patients, where the skin pigmentation levels were classified by an unstandardized or qualitative judgment such as "dark", "black", "light" or "white". Additionally, 21/35 studies (60.00%) were classified as being at unclear risk of bias for at least two domains [17,24,30,[41][42][43][44]47,50,53,56,72,75,76,[79][80][81][82]87,88,93]. The major issues in these domains were related to possible bias in the interpretation or conduct of the reference standard, and to flow and timing in terms of patient flow and of all patients receiving the same reference standard. We also found that 6 of these studies were in both groups (at high risk of bias for one domain and unclear risk of bias for at least two domains) [17,44,79,82,87,88]. Finally, 11/35 (31.43%) studies were found to be at low risk of bias for at least three domains [23,38,46,48,[69][70][71]77,78,84,89]. Furthermore, we have also included a column on the right of Figure 2 showing the qualitative final conclusion reported by the selected studies regarding the existence of inaccuracies related to skin pigmentation. The studies are ordered by year of publication. In recent years, most of the selected studies reported inaccuracies related to skin pigmentation. Indeed, from the 11/35 studies classified as being at low risk of bias 8/11 (72.73%) reported inaccuracies related to skin pigmentation, 3/11 (27.27%) reported no demonstrable loss of accuracy caused by skin pigmentation [46,48,70], and 1/12 (8.33%) assessed that further studies with other clinical and patient-reported data are required to understand the effects of skin pigmentation on the accuracy [77]. We have analyzed the most influential references and the progression of available publications to address this question.
Table 5 presents the top 10 cited references per author and year, comparing total global citations (TC) from the whole bibliographic database, TC per year, local citations (LC), which are the citations received by a document from other documents included in our selection, and the percentage LC/TC, which gives the ratio between the LC within our selection and the global TC of a document in the whole bibliographic database. The three most influential papers were M. Kumar et al. [78] with 236 TC and an average of 29.50 cites per year, Y. Mendelson et al. [21] with 181 TC and an average of 5.17 cites per year, and A.C. Ralston et al. [86] with 149 TC and an average of 4.66 cites per year. However, the most cited reference within our selection was P.E. Bickler et al. [23] with 10 LC and an LC/TC ratio of 7.63%. Kumar et al. proposed a significant improvement in the accuracy estimation by developing a distance PPG method with a considerable reduction of the errors, comparing the algorithm on people with different skin pigmentation classified from pale/white to brown/dark [78]. Because of the recent development of non-contact methods such as remote photoplethysmography (rPPG) or image photoplethysmography (iPPG) [94,95], this study paved the way for the advancement of new contact-less and safe methods for vital sign monitoring. Although some methods such as iPPG are susceptible to artifacts such as surface reflections, motion artifacts, shadowing, or skin tone [96], a computer vision-based oximeter using a digital camera to measure SpO2 could be convenient, since it is a contact-less, secure, and inexpensive method.
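The citation indicators reported for Table 5 can be reproduced with the short sketch below. Two points are assumptions on our part rather than statements from the source: the per-year average is taken over the years from publication through 2022 inclusive (a convention that reproduces the reported averages for the 1988 and 1991 papers), and the total of 131 citations used for the LC/TC example is inferred from the reported 10 LC and 7.63% ratio. The function names are hypothetical.

```python
def tc_per_year(total_citations, publication_year, reference_year=2022):
    """Average citations per year, counting the publication year itself."""
    years = reference_year - publication_year + 1
    return total_citations / years

def lc_tc_ratio(local_citations, total_citations):
    """Local citations as a percentage of total global citations."""
    return 100.0 * local_citations / total_citations

mendelson = round(tc_per_year(181, 1988), 2)   # Mendelson et al., 1988 [21]
ralston = round(tc_per_year(149, 1991), 2)     # Ralston et al., 1991 [86]
bickler = round(lc_tc_ratio(10, 131), 2)       # 131 TC is an inferred value
print(mendelson, ralston, bickler)
```

A high TC with a low LC/TC ratio indicates a paper that is influential mainly outside the set of studies selected here, whereas a high ratio points to a reference that is central to this specific topic.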

Mendelson et al. in 1988 studied different factors to consider in the design of a skin reflectance sensor for non-invasive SpO2 measurements [21]. The separation between the source and the detector and the skin heating effect can improve the photoplethysmographic waveforms. Indeed, the larger the distance between the photodiode and the LEDs, the larger the photoplethysmogram detected. They also found that skin heating can increase the pulse amplitude of the red and infrared photoplethysmograms. In addition, they asserted that skin pigmentation does not seem to affect the device's accuracy. However, they performed the study in a small light-skinned population (seven Caucasian volunteers), which is not representative of different levels of skin pigmentation. The review of A.C. Ralston et al. in 1991 reported a slight decrease in accuracy related to dark skin pigmentation [86]. The most cited reference in our selection, by P.E. Bickler et al., came to the same conclusion, finding overestimations in a study of 3 different oximeters with a more diverse population of 21 subjects (11 dark-skinned and 10 light-skinned) [23]. Hospital comparing measurements of SpO2 and arterial blood gas of a sample of over 10,000 people with 14-20% of dark-skinned patients in both cohorts. An important finding of this study is that the probability of dark-skinned patients with an SpO2 reading between 92% and 96% having an arterial oxygen saturation of less than 88% was 3 times higher (11.7%) than that of light-skinned patients (3.6%) with the same SpO2 measurement. They also found overestimations in dark-skinned patients with an increased risk of occult hypoxemia [50]. There was a considerable upsurge of publications in 2021.
The main reason for this surge may be attributed to the increased use of this technique for patient triage because of the current COVID-19 pandemic [12][13][14][15][16][17], and also to the recent concern about the accuracy of these devices across all skin types [20,[57][58][59][60][61]. A further insight into the trending topics in terms of keyword co-occurrences is presented in Figure 4. Author's keywords are usually associated with the publication content [97]. The total number of keywords was 413, which were too many to fit on a chart. Hence, we set a minimum word frequency of 4, while the time span was set at 1976 to 2022. However, to analyze the current trending topics over the last years, we only show the keywords from 2002 to 2022. We also used a thesaurus file to group the related keywords, which gives us the 30 most cited terms related to the topic. A circle represents each keyword; the larger the circle, the more a keyword has been co-cited in our selection of publications. The distance between the circles denotes the proximity of the keywords. The lines describe the co-occurrence links between two keywords. The thicker the line, the more often two keywords are mentioned together [98]. The keyword "pulse oximetry" has the highest link strength and sits in the middle of the chart, connected with the rest of the keywords such as "oxygen saturation" and "patient monitoring". The bottom right panel is a color bar showing the dynamics of the author's keywords. The first studies in skin pigmentation and oximetry are denoted by blue colors around 2000. The orange-red colors indicate the most recent publications.
The keyword "COVID-19" appears in 2020 near the keywords "hypoxemia", "oxygen-therapy", "intensive-care", "diagnostic accuracy", and "dark-skin population", among others. We found that 6 of the 41 references studied in this review examined the performance of pulse oximetry during the COVID-19 pandemic [17,30,50,51,69,71], that is, 50% of the references published since 2019. Since almost 40% of patients recovering from COVID-19 have respiratory effects derived from post-residual fibrosis after lung involvement, pulmonary rehabilitation is crucial. In this framework, "telemedicine" or telehealth, defined as the provision of health care that is offered remotely through any telecommunication tool, has recently had an enormous impact on research [99]. The limitations of contact pulse oximeter devices and the need for triage of COVID-19 patients have encouraged research into alternatives such as camera sensors [8,70,72,77,100] to predict SpO2. The latter is reflected in the keyword "camera-based oximeter", which appears before 2020. A recent study of 46 subjects at center wavelengths of 840 nm (near-infrared), 675 nm (red), and 580 nm (green), which evaluated the feasibility of calibrating a camera-based SpO2 oximeter using red and green light under normoxic and hypoxic conditions, found significant bias at lower temperatures [72]. However, the authors did not stratify their results according to the level of skin pigmentation, which can introduce considerable errors.
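The keyword map described above rests on two counts: how often each keyword occurs and how often each pair of keywords occurs in the same publication. The following sketch shows that counting step on a toy example. It is not the VOSviewer implementation, and the keyword lists are hypothetical; only the idea of a minimum frequency threshold is taken from the text (where it was set to 4).

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_lists, min_frequency=1):
    """Count keyword frequencies and pairwise co-occurrence links."""
    # Each keyword is counted once per publication.
    freq = Counter(k for kws in keyword_lists for k in set(kws))
    kept = {k for k, n in freq.items() if n >= min_frequency}
    links = Counter()
    for kws in keyword_lists:
        present = sorted(set(kws) & kept)        # sorted so pairs have a fixed order
        links.update(combinations(present, 2))   # one link per pair per publication
    return freq, links

# Hypothetical author keywords for four publications.
docs = [
    ["pulse oximetry", "oxygen saturation", "skin pigmentation"],
    ["pulse oximetry", "oxygen saturation", "covid-19"],
    ["pulse oximetry", "skin pigmentation", "hypoxemia"],
    ["pulse oximetry", "oxygen saturation"],
]
freq, links = cooccurrence(docs, min_frequency=2)

# Total link strength of a keyword: sum of the weights of its links.
strength = sum(n for pair, n in links.items() if "pulse oximetry" in pair)
print(links.most_common(), strength)
```

In the resulting map, the frequency would set the circle size, the pair count would set the line thickness, and the keyword with the highest total link strength would tend to sit at the center.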

RQ2: How Does Skin Color Affect the Accuracy of Pulse Oximeter Devices Incorporating Photoplethysmography?
PPG pulse oximeters depend on the detection of arterial pulsation; therefore, to obtain a large and stable photoplethysmogram from the back-scattered light, both the optical absorption spectrum of the blood and the opacity of the skin should be considered [21]. Melanin attenuates the incident light and limits its penetration into the subcutaneous tissue because it is highly light-absorbing [80,101]. Because dark skin contains more melanin, several studies assert that pulse oximetry may not be reliable in dark-skinned patients [21,23,24,50,51,53,69,73,75,76,78,80,[87][88][89]93].
Studies on healthy adults with different skin pigmentation showed that red and green wavelengths could estimate oxygen saturation with good agreement and lower error ratio compared to the traditional pulse oximeter [30,80,102]. Indeed, Fallow et al. found a better resolution using a green-light wavelength at rest and a green or blue-light wavelength in motion across all skin types [80].
A pioneering study comparing two pulse oximeters on 152 patients showed a statistically significant loss of accuracy in SpO2 readings for dark-skinned patients for one device [43]. Although the authors attributed the difference in accuracy observed in dark-skinned patients to the subjects' wide range of pigmentation levels, they also pointed out that these differences can be related to the LED employed in each device. Another study to determine the effect of varying LED intensity on pulse oximeter accuracy showed that in low saturation conditions, a 10:1 variation in LED intensity can lead to an error of 2.5% [103].
Accuracy errors in pulse oximeter readings have been associated with SpO2 overestimations, especially at low saturation conditions [23,24,73,93]. A review of 13 models of pulse oximeters suggested that a high percentage of carboxyhemoglobin or methemoglobin is related to failures in saturation readings and can hide hypoxemia [86]. The effect of skin pigmentation on pulse oximeter readings under hypoxic conditions was tested in 11 dark-skinned male subjects and 10 light-skinned males, who were made to breathe an air-nitrogen-carbon dioxide mixture. The three pulse oximeters used showed an overestimation in dark-skinned individuals that increases linearly as SpO2 levels decrease [23].
A diagnostic accuracy study under the Standards for Reporting of Diagnostic Accuracy Studies [104] evaluated 10 of the most purchased pulse oximeters with arterial blood gas measurement of SaO2 as the reference standard and found a poorer SpO2 performance in 5 of the 10 pulse oximeters. The authors advise that darker skin pigmentation affects the reliability of these pulse oximeters and that, when they are used by a patient for home monitoring, confirmation with a medical-grade oximeter is required [71].
Feiner et al. concluded that the pulse oximeter inaccuracy attributable to dark skin pigmentation is reasonably small at SpO2 > 80% [24]. However, a bias of up to 8% was detected at lower saturation conditions in dark-skinned individuals. Lee et al. also observed this bias in an older study, finding that the estimation of arterial oxygen saturation, SaO2, varied significantly (p < 0.05) in multi-ethnic individuals (22 Chinese, 6 Malay and 5 Indian). The results suggested that the difference between SpO2 and SaO2 appeared to increase with darker skin pigmentation, with overestimations more pronounced under conditions of hypoxia and jaundice [85].
Accuracy improvements were found with a crosstalk-free sensor designed by Baek et al., who performed desaturation experiments from 100% down to 60% on healthy adults with different skin pigmentation [75]. The conventional sensors showed a large error in dark-skinned subjects, while the sensor that prevented optical crosstalk did not present SpO2 measurement errors related to skin color.
A higher prevalence of occult hypoxemia in dark-skinned subjects has recently been confirmed in three studies, showing that dark-skinned patients may have over three times the risk of experiencing occult hypoxemia during hospitalization compared to light-skinned patients [53][54][55]. This finding is of particular importance since patients with hidden hypoxemia have higher rates of in-hospital mortality and organ dysfunction [55]. A study on the use of a pulse oximeter in the titration of the fractional inspired O2 concentration in ventilator-dependent patients revealed that an SpO2 reading of 92% was reliable when titrating supplemental O2 in light-skinned patients receiving mechanical ventilation. However, in dark-skinned patients, the same SpO2 value was commonly associated with significant hypoxemia, and a higher SpO2 reading of 95% was needed to provide a tolerable level of oxygenation [88].
The skin pigmentation effect was also tested during exercise [83,87]. A study of the reliability of two ear oximeters under normoxic and hypoxic conditions in 33 healthy male subjects (mostly with dark skin pigmentation) showed unacceptable readings for SpO2 values of less than 85% for one device and less than 90% for the other [87]. Although the main potential errors during exercise are due to motion artifact, which complicates the use of the pulse oximeter and the accurate interpretation of the data [29][30][31], overestimations were attributed to high levels of carboxyhemoglobin [83].
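The comparisons above rest on the paired difference between the oximeter reading and the arterial reference. A minimal sketch of the usual summary, assuming the common convention that bias is the mean of SpO2 - SaO2 (so a positive value is an overestimation) and precision is the standard deviation of those differences, is given below. The paired readings are hypothetical and the function name is ours; individual studies in this review may define these quantities differently.

```python
from statistics import mean, stdev

def bias_and_precision(spo2, sao2):
    """Return mean difference, its standard deviation, and 95 % limits of agreement."""
    diffs = [p - a for p, a in zip(spo2, sao2)]   # positive = oximeter overestimates
    bias = mean(diffs)
    precision = stdev(diffs)                      # sample standard deviation
    limits = (bias - 1.96 * precision, bias + 1.96 * precision)
    return bias, precision, limits

# Hypothetical paired readings (%), with the gap widening at low saturation.
spo2 = [97.0, 95.0, 92.0, 90.0, 88.0]
sao2 = [96.0, 93.0, 90.0, 87.0, 84.0]

bias, precision, limits = bias_and_precision(spo2, sao2)
print(f"bias = {bias:.2f}, precision = {precision:.2f}, limits = {limits}")
```

Computing these statistics separately for each pigmentation group and each saturation range is what allows a study to show whether the bias grows with darker skin or with deeper hypoxia.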

Rq3: On Which Human Populations Have Studies Been Conducted to Verify These Discrepancies and What Methods Have Been Employed to Classify Skin Pigmentation?
The FSP scale is commonly used to describe skin pigmentation according to the skin tanning response to UVR [105][106][107]. This scale is divided into six phototypes (SPT) from the lightest (tanning-resistant) phototype-I (SPT-I) to the darkest phototype-VI (SPT-VI). Fallow et al. classified 23 healthy subjects (11 males and 12 females, 20-59 years old) finding that phototype-V (dark brown) skin type has a significant signal-to-noise ratio than all other skin types [80]. The FSP scale was also employed to classify 404 intensive care unit patients as light (phototype I or II), medium (phototype III or IV), or dark (phototype V or VI) [76]. A small but statistically significant difference between SaO 2 and SpO 2 in light and dark skin phototypes [76] was found. In the study of the 10 most purchased pulse oximeters, 5 of 35 patients (14.3%) had a dark phototype (IV-VI) showing less accurate SpO 2 measurements [104].
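The three-level grouping of phototypes used for the intensive care unit patients in [76] can be expressed as a simple lookup. The sketch below encodes only that grouping (I-II light, III-IV medium, V-VI dark); the function name and the integer encoding of phototypes are our own illustrative choices.

```python
# Grouping of the six phototypes into three levels, as described for [76].
PHOTOTYPE_GROUP = {
    1: "light", 2: "light",
    3: "medium", 4: "medium",
    5: "dark", 6: "dark",
}

def pigmentation_group(phototype):
    """Map a phototype given as an integer 1-6 to light, medium, or dark."""
    try:
        return PHOTOTYPE_GROUP[phototype]
    except KeyError:
        raise ValueError(f"phototype must be an integer from 1 to 6, got {phototype!r}")

sample = [1, 3, 5, 6, 2]                      # hypothetical cohort
groups = [pigmentation_group(p) for p in sample]
print(groups)
```

Note that the grouping is study-specific: the boundaries between categories differ between studies, so results grouped in this way are not directly comparable across all the references reviewed here.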
In contrast, a study conducted in 298 patients classified according to the Munsell color system [108] as light group (51%), intermediate group (37%), and dark group (12%) conclude that skin pigmentation does not affect the bias or precision of pulse oximetry. They showed sub-optimal pulse oximeter function more frequently among dark-skinned patients but attributed this result to the observer bias. However, they also recognize that the study has some limitations as interrater variability or the classification by the Munsell color tile system, which requires user judgment [47]. Two more old studies, one in 1997 with a sample of 50 children, 15 of whom were dark-skinned [56] and another in 1996 with 100 dark-skinned patients [46], did not find any difference in accuracy between dark and light-skinned subjects. These studies may be misleading because of a weak correlation with skin pigmentation by using uncertain descriptors of ethnicity to classify the population as "Black" or "White". Besides, these pulse oximeters have probably been calibrated on a population sample with light-skin pigmentation, which miss relevant interaction effects of confounding variables and skin pigmentation [23,33,109].
A multi-ethnic population study composed of 22 Chinese, 6 Malay and 5 Indian patients was also studied, finding that overestimations were more pronounced in hypoxic conditions, jaundice and darker-skinned patients [85]. The same feature was observed in the data presented in the study by Zeballos and Weisman in 33 healthy, young, nonsmoking males [87]. Although they did not refer to a specific scale of skin pigmentation degree, they also found overestimations in the two compared oximeters in darkly pigmented subjects.
Feiner et al. did not refer to any pre-established scale either [24]. They categorized each subject's skin as light (Caucasian), dark (African American), or intermediate (Hispanic, Indian, Filipino, and Vietnamese). They advised that a significant bias in dark-skinned patients with saturation below 80% should be considered. A similar classification was used by Baek et al., who recruited 3 dark-skinned subjects classified as "African American" and 9 light-skinned subjects classified as "Caucasian" or "Asian" to compare two pulse oximeters, one that prevented crosstalk and a conventional one [75]. They found large errors in the dark-skinned subjects with the pulse oximeter that did not prevent crosstalk.
Even the most cited reference, by Bickler et al., did not refer to any scale and classified the population by ancestry, with "African-American" as dark-skinned and "Northern European" as light-skinned [23]. A more recent study of 26,603 patients across four self-identified racial groups, namely "White", "Black" (i.e., "African", "African American", "Black"), "Asian" (i.e., "Asian", "Indian", "Cambodian", "Chinese", "Filipino", "Japanese", "Korean", "Laotian", "Pakistani", "Taiwanese", "Thai", "Vietnamese"), and "American Indian" (including "Alaskan Natives"), also found that the accuracy of pulse oximetry for the detection of occult hypoxemia is not consistent across these self-identified ethnic groups, with SpO2 measurements overestimating arterial blood gas-derived oxygen saturation by 1.57% on average [53]. Besides, a subjective assessment of skin pigmentation was also employed, using a scale of I to III to classify the populations as light, medium, and dark pigment levels, respectively [43].
Female volunteers who had henna on their hands or feet were also tested [79,81,84]. These studies reported that black henna produces more errors in SpO2 readings than red henna [84], which is to be expected since black henna darkens the skin more. They suggest that an alternative site (such as the ear) should be chosen to monitor arterial oxygen saturation [81].
Pulse oximetry is also the preferred method of SpO2 monitoring when examining newborns and infants [110,111]. Therefore, some studies have also investigated whether there is a skin pigmentation disparity in accuracy in neonatology [48,51,74]. Although one study asserted that there is no significant difference in systematic bias based on skin pigment for pulse oximetry [48], its findings lacked statistical significance because of a small sample size and stringent inclusion criteria that were not typical for premature infants. The other two studies found a small but consistent skin pigmentation disparity in SpO2 measurements, with values that may be falsely high in dark-skinned preterm infants [51,74].
The risks associated with significant hypoxemia have long been studied. A study performed in 1990 in ventilator-dependent patients during titration of supplemental O2 reported inaccurate measurements in dark-skinned patients [88]. However, since the authors used a subjective assessment of the degree of skin pigmentation, classified as light, moderately dark, and very dark, a bias related to the selection of patients should be considered. Besides, another study in 187 pulmonary patients undergoing a stress test reported technical problems in patients with slightly less accurate readings [89].
These results have important consequences because pulse oximetry is essential for monitoring patients recovering from COVID-19 pneumonia, who require frequent and accurate SpO2 monitoring [14,73,112]. Stell et al. have recently performed a study to evaluate the characteristics of five portable pulse oximeters and their suitability for home use. They selected 50 patients, classified according to the FSP scale, from respiratory wards and an intensive care unit. They reported that skin pigmentation had a significant effect on measurement bias, with substantial discrepancies in two models [69]. Another study, performed in patients with acute COVID-19 pneumonia to examine the relationship between patient ancestry and the accuracy of pulse oximetry, found a small bias between arterial blood gas analysis SaO2 and SpO2 measurements of 0.28%, −0.33% and −0.75% for 194 patients of "White" (135), "Asian" (34) and "Black" (19) ethnic origin, respectively [17]. A negative value indicates that SpO2 measurements are higher than SaO2 values.
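The sign convention used for the bias values above can be made concrete with a short sketch. It assumes that bias is the mean of the paired differences SaO2 − SpO2 and that precision is the standard deviation of those differences, which is a common convention but is not spelled out in the cited study; the function name and the readings are hypothetical and are not data from any reference.

```python
from statistics import mean, stdev


def bias_and_precision(sao2, spo2):
    """Return (bias, precision) for paired saturation readings in percent.

    Assumed convention: bias = mean(SaO2 - SpO2), so a negative bias means
    the pulse oximeter reads higher than the arterial blood gas value.
    Precision is taken as the sample standard deviation of the differences.
    """
    if len(sao2) != len(spo2):
        raise ValueError("SaO2 and SpO2 must be paired")
    if len(sao2) < 2:
        raise ValueError("need at least two paired readings")
    differences = [a - p for a, p in zip(sao2, spo2)]
    return mean(differences), stdev(differences)


# Hypothetical paired readings, for illustration only.
sao2 = [92.0, 88.0, 95.0, 90.0]
spo2 = [94.0, 89.0, 96.0, 92.0]

bias, precision = bias_and_precision(sao2, spo2)
overestimates = bias < 0  # True here: SpO2 is higher than SaO2 on average
```

With this convention, an oximeter that overestimates saturation yields a negative bias, which matches the direction reported for the groups with negative values.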
We remark that a more objective method of grouping the study population should be employed in future studies. Controversial and ambiguous terms such as "race" or "ethnicity" do not accurately characterize skin tone. Several studies use the term "race" for patient self-identification as "Hispanic", "Asian", "American Indian", "African American", "Malay", "Native Hawaiian", "Pacific Islander" or even "White" and "Black". This categorization is as broad as it is inaccurate, and does not describe the entire spectrum of skin pigmentation.
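As an illustration of grouping by phototype rather than by self-identified race, the following minimal sketch maps FSP phototypes to the light, medium, and dark groups used in the intensive care study discussed above [76]. The mapping follows the grouping stated in the text; the function name, the integer encoding of phototypes I to VI as 1 to 6, and the sample cohort are assumptions made for this example.

```python
from collections import Counter

# Grouping as described in the text: I-II light, III-IV medium, V-VI dark.
PHOTOTYPE_GROUP = {
    1: "light", 2: "light",
    3: "medium", 4: "medium",
    5: "dark", 6: "dark",
}


def pigmentation_group(phototype: int) -> str:
    """Map an FSP phototype (1 to 6, i.e. I to VI) to a pigmentation group."""
    if phototype not in PHOTOTYPE_GROUP:
        raise ValueError(f"phototype must be between 1 and 6, got {phototype!r}")
    return PHOTOTYPE_GROUP[phototype]


# Hypothetical cohort of assessed phototypes, for illustration only.
cohort = [1, 2, 2, 3, 4, 5, 6, 6]
group_counts = Counter(pigmentation_group(p) for p in cohort)
```

Note that other studies draw the boundaries differently, for example by treating phototypes IV to VI as dark [104], so the mapping should be stated explicitly whenever results are compared.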

Final Remarks
New non-invasive methods have been developed for accurate monitoring and evaluation of health variables. Pulse oximetry is a simple, low-cost, non-invasive technique for detecting oxygen saturation. However, there is growing evidence that pulse oximeters are less accurate in dark-skinned individuals at lower saturation (<80%), resulting in overestimations. Overestimation is an issue of major concern, since patients may seem healthier than they are, with the corresponding risk of adverse health effects from diseases like COVID-19.
Given the central role of pulse oximetry in the management of COVID-19, relying on it to triage patients in communities with dark-skinned populations requires integrating both strategic and practical recommendations into current regulations. Biomedical sensors for vital signs monitoring should work across all levels of skin pigmentation.
Although some older studies did not find any bias related to dark skin pigmentation, later studies have shown that pulse oximeter devices have some limitations with dark-skinned subjects, especially at low saturation or under hypoxemic conditions. Since occult hypoxemia is associated with increased mortality risk, there is a need to quantify such errors precisely.
It is also worth noting that, to date, the effects of skin pigment on pulse oximeter performance have been studied for only a relatively small number of pulse oximeter models, and most of them have been calibrated using light-skinned individuals. To obtain reliable data and overcome the shortcomings of commercial oximeters regarding dark skin pigmentation, more studies with other clinical and patient-reported data are required. If possible, the device algorithm related to light absorption should also be redesigned with a calibration based on dark-skinned patients.
Furthermore, a more accurate method for classifying research subjects by degree of skin pigmentation should be employed in these studies. Using uncertain descriptors of ancestry, ethnicity or even race to define patients with dark skin pigmentation is ambiguous and problematic on many levels. Therefore, scales such as the FSP scale or other standardized and less subjective methods employed by the dermatology community can better describe the spectrum of skin pigmentation. This would decrease the probability of inappropriate escalations in the dark-skinned population.
A holistic understanding of these results may help to develop new applied work that supports sensor design and healthcare decision-making, avoiding potential health risks across populations with different skin pigmentation. However, further statistical analysis of how the accuracy of pulse oximetry is affected by dark skin pigmentation, based on a proper classification, should be performed.