Article

Panel and Panelist Performance in the Sensory Evaluation of Black Ripe Olives from Spanish Manzanilla and Hojiblanca Cultivars

by Antonio López-López *, Antonio Higinio Sánchez-Gómez, Alfredo Montaño, Amparo Cortés-Delgado and Antonio Garrido-Fernández
Food Biotechnology Department, Instituto de la Grasa (CSIC), Campus Universitario Pablo de Olavide, Edificio 46, Ctra. Utrera km 1, 41013 Sevilla, Spain
* Author to whom correspondence should be addressed.
Foods 2019, 8(11), 562; https://doi.org/10.3390/foods8110562
Submission received: 17 October 2019 / Revised: 4 November 2019 / Accepted: 6 November 2019 / Published: 8 November 2019
(This article belongs to the Special Issue Research on Characterization and Processing of Table Olives)

Abstract

There is vast experience in the application of sensory analysis to green Spanish-style olives, but ripe black olives (≈1 × 10⁶ kg for 2016/2017) have received scarce attention, and panelists have less experience in evaluating this presentation. Therefore, the study of their performance during its assessment is critical. Using a previously developed lexicon, ripe olives from the Manzanilla and Hojiblanca cultivars from different origins were sensory analysed according to Quantitative Descriptive Analysis (QDA). The panel (eight men and six women) was trained, and the QDA tests were performed following recommendations similar to those for green olives. The data were examined using SensoMineR v.1.07, programmed in R, which provides a diversity of easy-to-interpret graphical outputs. The repeatability and reproducibility of the panel and panelists were good for product characterisation. However, the panel performance investigation was essential in detecting details of panel work (detection of panelists with low discriminant power, those who interpreted the scale differently from the whole panel, and the identification of panelists who required training in several or specific descriptors). Besides, the study identified the descriptors that were hard to evaluate (skin green, vinegar, bitterness, or natural fruity/floral).

1. Introduction

World table olive production was around 2.6 × 10⁶ tons in the 2016/2017 season according to the last consolidated balance of the International Olive Oil Council [1]. Approximately 40% of this was processed as black ripe table olives (Californian style). This style was first developed in the USA, which is still one of the most relevant contributors, with a current production of about 80 × 10³ tons [1], but other countries, like Spain, Greece, Turkey, or Egypt, are progressively increasing their production. Black ripe table olive processing includes a storage phase, which is usually accomplished by immersing the fruits in brine or an acidified solution, followed by a darkening step, which consists of the application of one (or several) lye treatments and subsequent immersion in tap water to remove the excess alkali. During this oxidation phase, air is also bubbled through the suspension to accelerate browning. The colour is then fixed by a ferrous gluconate solution, after which the olives are packed and the cans sterilised [2]. Owing to their numerous treatments in aqueous solutions, the products usually offer a rather plain organoleptic profile, which has favoured their introduction in new markets. In fact, according to the Trade Standards Applying to Table Olives [3], the only requisites for these olives are sensory characteristics and texture in agreement with their processing system.
Over the last decade, the International Olive Council developed a method for the sensory evaluation of table olives. However, it was mainly focused on the green Spanish style, since most of the descriptors included in the evaluation sheet are exclusively related to this product (e.g., abnormal fermentation, acidity, or bitterness) [4]. Methods for the evaluation and classification of black ripe olives, in turn, were developed in California, where this processing has a long tradition [5].
On the other hand, Quantitative Descriptive Analysis (QDA) is widely used for studying the sensory profile of diverse foods ([6,7,8], among many others). Recently, researchers have applied QDA with a list of 33 descriptors to the sensory comparison of American black ripe table olives with those imported from other countries (Spain, Egypt, or Morocco) [9]. Similar descriptors were used to study the sensory profile of black ripe table olives from the Spanish Manzanilla and Hojiblanca cultivars, and they successfully distinguished among cultivars, farming origins, and storage periods [10]. López-López et al. [11] have developed an entirely new lexicon for the application of QDA to Spanish-style green table olives; the results showed relevant differences between cultivars and origins. Therefore, the application of QDA to black ripe table olives from the most important Spanish cultivars devoted to this style is relevant.
Traditionally, the sensory analysis of table olives, regardless of style, has been mainly devoted to the characterization of products [12,13,14,15,16], but panelist and panel performances were rarely studied in detail. However, over the last two decades, different authors have developed methodologies for evaluating the reliability of the panel [17,18,19,20,21,22]. Their application to the panel performance and the discrimination power of descriptors for diverse green and black ripe table olives, following the COI/OT/MO No. 1/Rev. 2 methodology, has been recently published [17]. Nevertheless, the performance of a panel and panelists devoted to the sensory analysis of black ripe table olives using QDA has never been studied.
This work aims to apply Quantitative Descriptive Analysis to black ripe table olives from the Spanish Manzanilla and Hojiblanca cultivars, focusing on panel and panelist performance as a tool for improving their training and reliability.

2. Materials and Methods

2.1. Olives and Their Processing

The olives were of the Manzanilla and Hojiblanca cultivars, harvested at the green maturation stage in October 2016. Their origins were Aljarafe (Sevilla) and Lora de Estepa (Sevilla) for Manzanilla, and Lora de Estepa (Sevilla) and Alameda (Málaga) for Hojiblanca. The samples were identified as MAL, ML, HL, and HA, according to cultivar (initial letter) and growing area (remaining letter/s).
Freshly harvested olives from each cultivar and origin were directly brined in 25 L (15 kg olives) PVC (polyvinyl chloride) fermenters in an acidified (2.4% acetic acid) solution. After three months of storage, the fruits were subjected to the darkening process. For this purpose, horizontal stainless steel cylindrical containers (0.4 m diameter, 0.7 m length) were used. The fruits were treated with a 3% lye solution until the alkali reached the pit. After removing the alkali, the olives were washed to lower the pH to 8.0 units. During both operations, an oxygen-saturated environment was maintained in the suspension by bubbling air through a perforated tube lying along the bottom of the oxidation vessels. Subsequently, the black colour developed was fixed using a 0.1% ferrous gluconate solution with the pH adjusted to 4.5 to prevent the precipitation of the element as hydroxide. Afterwards, the darkened olives were introduced in glass jars (145 g of olives), together with 170 mL of 3.5% NaCl cover solution, which also contained 0.2 g ferrous gluconate/L and had the pH adjusted to 4.5 with acetic acid. Finally, the jars were closed and sterilised at 130 °C for 20 min [23].
The sensory analysis of the above-prepared black ripe olives was carried out after storage at room temperature for 30 days (to allow complete olive flesh/brine equilibrium) and 210 days (estimated maximum normal period of the product on the shelves before replacement). The new codes were those previously mentioned, plus 1 (one-month storage) or 2 (seven-month storage). Therefore, the codes of the final samples were MAL1, MAL2, ML1, ML2, HL1, HL2, HA1, and HA2, in which the successive letters and figures indicate cultivar, growing area, and storage period, respectively.
A panel composed of eight men and six women, making a total of 14 panelists (average age of 40 years), performed the analysis. They all belonged to the Instituto de la Grasa (IG) staff and had vast experience in sensory studies due to their participation in the development of the Sensory Analysis Method for Table Olives [4] and their permanent involvement in diverse IG table olive sensory projects (e.g., [10,11]). Before the tests, the panelists were trained for one hour twice a week for two months to familiarise them with the QDA techniques and the black ripe olive descriptors, using industrially processed black ripe olives from Spanish cultivars. The samples were always presented in the standard glasses [24], which were coded with three randomly chosen digits. After each test, the mouth was rinsed with tap water, freely available in each booth. In this way, the panelists were progressively familiarised with the product, the sensory descriptors included in the evaluation sheet, and informal tentative evaluations and, finally, were allowed to practise with the unstructured scale (1, complete absence; 11, strongest perception) of the evaluation sheet for another month. After these periods, they were considered ready for the evaluation of the real samples because of their previous expertise in sensory testing. The assessed descriptors included appearance (skin red, skin green, skin sheen, flesh red, flesh yellow, and flesh green), aroma (briny, mushroom, earth/soil, oak/barrel, nutty, artificial fruity/floral, natural fruity/floral, vinegary, alcohol, fishy smell/ocean, and cheese smell), taste (sourness, bitterness, and saltiness), flavor (ripeness, buttery, metallic, rancid, soapy smell/medicinal, and gassy smell), and texture/mouthfeel (firmness, fibrousness, moisture release, mouth coating, chewiness, astringency, and residual). Their definitions and references may be found elsewhere [10].
For the tests, the black ripe olive samples were presented to the panelists at ambient temperature (20 ± 1 °C) in a panel room equipped with individual booths under incandescent white lighting and free from any odors. The panelists were asked to mark the intensity of the different descriptors on the evaluation sheets. The scores of the attributes were measured to one decimal place and the results tabulated.

2.2. Data Analysis

The data were mainly studied using the SensoMineR v.1.07 software (Agrocampus Ouest, Rennes, France) [25], a package designed and programmed in the R language [26]. It combines classical sensory statistical methods with others conceived directly in the developers’ laboratory. In this way, SensoMineR provides a synthesis of the results of the usual analysis of variance (ANOVA) models, as well as a diversity of easy-to-interpret graphical outputs. Notably, the package includes several options for panel evaluation, such as multivariate analysis and the generation of virtual panels by bootstrapping techniques, which allow for the estimation of the corresponding confidence limits. XLSTAT [27] was also applied in specific analyses and tests.

3. Results and Discussion

The data matrix consisted of the following variables: sample-storage period (simply "sample" from now on), panelist, session, and the 33 descriptors, making a total of 36 columns. Additionally, sample, panelist, and session had 8, 14, and 3 levels, respectively, making a total of 336 rows. Therefore, the overall number of cells was 12,096. The generated database was already used for product characterization [10], but, in this work, the analysis is focused on the panel and panelist performance as an exercise for improving their evaluation and training.
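The dimensions quoted above follow directly from the design. A minimal arithmetic check, using only the counts given in the text (the variable names are illustrative):

```python
# Design factors and descriptor count as stated in the text.
n_samples, n_panelists, n_sessions = 8, 14, 3
n_descriptors = 33

rows = n_samples * n_panelists * n_sessions   # one row per sample/panelist/session
columns = 3 + n_descriptors                   # three design columns plus the descriptors
cells = rows * columns

print(rows, columns, cells)  # 336 36 12096
```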

3.1. Overview of Results

After checking the dataset for possible outliers and typing errors, the data were subjected to a first overview (frequency histograms and boxplots), which indicated that several descriptors received low scores and were hardly noticed; however, others were perceived by the panelists, were distributed along the scale, and allowed for discrimination among samples (data not shown). Further details can be found elsewhere [10].

3.2. Panel Performance

Numerous techniques are available for assessing panel and panelist performance, with ANOVA and multivariate analysis being the most common. Kermit and Lengard Almli [19] presented univariate and multivariate data analysis methods to assess the individual and group performances in a sensory panel. Notably, Husson et al. [25] developed SensoMineR, which includes several innovative tools with this objective.

3.2.1. Effect of Sample (Power of Discrimination)

The evaluation of the panel performance is an essential premise not only for obtaining reliable results in sensory analysis, but also for improving the selection of panelists and their training. In this work, the panelperf function from SensoMineR, with the appropriate models and the corresponding analysis of variance, was used. The ANOVA was fitted to the following full model:
Score = sample + panelist + session + sample·panelist + sample·session + panelist·session
where score stands for the expected evaluation value, and sample, panelist, and session are the predictive variables, with the effect of storage being included as levels of the variable sample. The panelist and the session were both studied as random effects, but the sample was considered to be fixed [28].
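For readers who prefer indexed notation, the full model can be written as below. The symbols (μ, α, β, γ, ε) and the indices i (sample, 1 to 8), j (panelist, 1 to 14), and k (session, 1 to 3) are assumed here for illustration and are not taken from the original work.

```latex
Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k
        + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}
        + \varepsilon_{ijk}
```

Here α is the fixed sample effect, β and γ are the random panelist and session effects, the bracketed terms are the three two-way interactions in the same order as in the model above, and ε is the residual.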
The results regarding performance (Table 1) showed that the panel was able to discriminate the samples based on skin green, flesh green, skin sheen, flesh red, firmness, fibrousness, flesh yellow, skin red, vinegary, moisture release, fishy smell/ocean, and saltiness. Good segregation among the samples or products by panelists is systematically reported in numerous publications ([6,17,28,29,30], among others).

3.2.2. Effect of Panelist

The significant effect of the panelist, with very low p-values regardless of descriptor, indicates a different interpretation of the scales. Such an effect is not desirable, but it is usually observed. However, its presence does not prevent appropriate conclusions from being reached, since the panelists’ variance can be eliminated by the ANOVA and by centring the data with respect to panelists [31]. The assessors’ performance will be studied in detail later.
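Centring with respect to panelists can be illustrated with a minimal sketch. The function name, the data layout, and the scores below are hypothetical; this is not the SensoMineR implementation, only the idea of subtracting each panelist's own mean so that differences in scale level no longer separate them.

```python
from statistics import mean

def centre_by_panelist(scores):
    """scores maps panelist -> list of scores for one descriptor.
    Returns the same layout with each panelist's own mean subtracted."""
    centred = {}
    for panelist, values in scores.items():
        own_mean = mean(values)
        centred[panelist] = [v - own_mean for v in values]
    return centred

# Hypothetical scores: A2 uses a higher region of the scale than A1,
# but both order the three samples identically.
scores = {"A1": [2.0, 3.0, 4.0], "A2": [6.0, 7.0, 8.0]}
centred = centre_by_panelist(scores)
print(centred)  # {'A1': [-1.0, 0.0, 1.0], 'A2': [-1.0, 0.0, 1.0]}
```

After centring, the two hypothetical panelists give identical profiles, which is why a pure level shift does not compromise the comparison of samples.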

3.2.3. Effect of Session

The effect of the session was not significant for any descriptor (Table 1), which indicates an overall good panelist performance over time (the samples were assessed in the same way from one session to another), an appropriate and desirable situation. Consequently, no further comments on this aspect are required.

3.2.4. Sample·Panelist Interaction

In the case of a total consensus among the members of the panel when assessing the descriptors in all samples, this interaction should not be significant. However, in this work, there were numerous significant cases (Table 1). The interaction is usually evaluated by the coefficients of the ANOVA, defined as the difference between the expected mean score by all panelists and that given by a specific one. It is tedious to reproduce their meaning for all descriptors, so only the cases of skin red and flesh red are shown as examples (Figure 1). The effect might be significant because of two circumstances: (i) the panelists do not rank the samples in the same order and (ii) they do not use the scale in the same way. Both situations were found in this work. As an example of different ranking, for skin red, panelist 1 gave the highest score to HA1, but panelist 2 ranked it second from the bottom; a similar behaviour occurred for flesh red regarding panelist 5 with respect to panelist 6 (Figure 1).
On the other hand, for skin red, panelist 1 used a narrower scale than panelist 6; the same trend can be observed for flesh red by panelist 1 and panelist 12 (Figure 1). Therefore, to improve panel performance, further training in the scoring of some attributes and in the amplitude of their scales will be required.
The corresponding coefficients of each panelist in the ANOVA model were assessed to identify the panelists who mainly contributed to the interaction [19]. With this aim, the difference between the expected score and that given by a specific panelist, over all sessions and samples, represents how far that panelist scores the sample from the product mean of the whole panel. No significant differences were usually observed (panelists had, in general, good reproducibility), but some peculiarities were noticed. For example, panelist A12 scored skin green (Figure 2A) appreciably higher than any other panelist; consequently, this panelist was critical to the significance of this interaction. Additionally, panelist A3 tended to score skin red, skin sheen, and flesh red above the panel average (Figure 2A).
Another way of observing the sample·panelist interaction and measuring the panelists’ reproducibility is by plotting the mean per panelist against the mean of the whole panel according to samples. In agreement with previous comments, some panelists gave high scores to several descriptors and, in this line, panelist A12 overscored skin green in samples HL2, HA2, MAL2, and ML2 (Figure 2B). These high scores were due to a tendency of this panelist to evaluate several descriptors (flesh yellow and briny, data not shown) higher than other panel members. Similarly, outstanding scores were observed for panelist A5 in vinegary, alcohol, and sourness, and for panelist A8 in mouth coating, chewiness, astringency, and residual (data not shown). However, most of the panelists scored only one descriptor differently, like A4 in grassy smell, A10 in cheesy smell, A3 in buttery, or A6 in rancid, to mention a few cases. Therefore, no panelist systematically contributed to the interaction, but the above-mentioned results could indicate that the panel performance would be improved by the further training of some panel members (A12, A5, and A8 on several descriptors, or A4, A10, A3, or A6 only regarding specific ones). Kermit and Lengard Almli [19] also found several assessors who showed poor performance in some attributes, such as mealiness or fruity flavor.
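The interaction coefficients discussed in this section can be sketched for a balanced table of mean scores. The following is an illustrative calculation using the usual two-way definition (cell mean minus sample mean minus panelist mean plus grand mean); the function name and the scores are hypothetical, and the exact estimates reported by SensoMineR may differ.

```python
from statistics import mean

def interaction_coefficients(table):
    """table[sample][panelist] is the mean score over sessions for one descriptor.
    Returns coef[sample][panelist] = cell - sample mean - panelist mean + grand mean."""
    samples = list(table)
    panelists = list(table[samples[0]])
    grand = mean([table[s][p] for s in samples for p in panelists])
    sample_mean = {s: mean([table[s][p] for p in panelists]) for s in samples}
    panelist_mean = {p: mean([table[s][p] for s in samples]) for p in panelists}
    return {
        s: {p: table[s][p] - sample_mean[s] - panelist_mean[p] + grand
            for p in panelists}
        for s in samples
    }

# Hypothetical scores in which the two panelists rank the samples in opposite order.
table = {"HA1": {"A1": 8.0, "A2": 3.0}, "ML1": {"A1": 4.0, "A2": 5.0}}
coef = interaction_coefficients(table)
print(coef["HA1"]["A1"], coef["HA1"]["A2"])  # 1.5 -1.5
```

Coefficients far from zero, as in this deliberately contradictory example, flag the panelist and sample combinations that drive a significant interaction.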

3.2.5. Sample·Session Interaction

These interactions refer to the variation of the mean of each sample from one session to another and should not be confused with the session effect, which applies to the mean of all samples between sessions. In this study, the sample·session interaction was only significant in two cases: saltiness (which was an important descriptor for sample discrimination) and metallic (Table 1). For saltiness, the significant interaction was mainly produced by the different scoring of samples HA2, HL1, HA1, MAL1, and MAL2 in session S1 (Figure 3), while, in the case of metallic, it was due to the abnormally high score of MAL1 in session S1 (Figure 3).

3.2.6. Panelist·Session Interaction

If significant, this interaction means that one or more panelists do not grade all of the products similarly from one session to another. There were several significant panelist·session interactions. Among the descriptors that contributed to discrimination, mushroom, oak/barrel, cheesy smell, sourness, chewiness, bitterness, and saltiness had significant interactions (Table 1). The contribution of panelists to this interaction might also be evaluated by their respective coefficients, estimated as described above. Figure 4 shows examples.
The panelists who contributed most to the differences in scores between sessions, according to descriptors, included A13 for skin red, flesh red, and flesh green. Regarding other descriptors, A12 actively contributed to vinegary and A5 to natural fruity/floral, alcohol, and earth/soil (data not shown). However, most of the panelists had homogeneous contributions in most of the descriptors (skin green, skin sheen, flesh yellow, or briny, Figure 4). Moreover, no panelist showed a systematic trend for all descriptors, with only a few exceptions for particular descriptors, like A12 for skin sheen and flesh red or A7 for mushroom (Figure 4). Consequently, the interaction was mainly due to the contribution of a reduced number of panelists (frequently only one), with limited influence on the panel repeatability.
The panelist·session interaction might also be presented as a plot of the mean per session against the mean over all sessions, according to panelists (Figure 5). Ideally, the points should follow a line, regardless of session. In general, the panelists followed a similar trend over sessions (Figure 5 for some descriptors), with only occasional exceptions, like panelist A6 for rancid. Other cases were related to panelists A4, A12, and A8 for bitterness, due to the abnormally low scores given by them (data not shown).
Finally, the plot of the different coefficients over sessions is the most common evaluation of the panelist·session interaction (Figure 6, for flesh red as an example). In this case, the problems that could be observed are, again, different ranking in successive sessions or different amplitude of scale over sessions. In Figure 6, panelist A13 assigned an excessively high score in the first session, while in the second session the score was low. Additionally, the amplitude of the scale for this descriptor was wider in the first session than in the second. In saltiness, the situation was different: A12 had a very low contribution (coefficient), but the scale amplitude was similar among sessions. In firmness and fibrousness, panelist A13 was the only one who had an excessively high score and, consequently, a high contribution to the interaction, while, on the contrary, this panelist had a low contribution in saltiness. Therefore, the detailed analysis of this interaction allowed for detecting some weaknesses in panel performance and a lack of coherence in some panelists. Personalized training would then be advisable.

3.3. Panelist Performance

When a panelist can discriminate among samples and is repeatable and reproducible (that is, scores the same product consistently and agrees with the rest of the panel), the panelist is considered to be reliable according to Rossi [18]. There are several techniques for evaluating these panelist performance parameters. Tomic et al. [20] developed a series of graphs for easy visualisation of the sensory profiling data for performance. Kermit and Lengard Almli [19] mentioned consonance analysis with PCA, full ANOVA model and notation, assessor sensitivity, assessor reproducibility, or agreement test as appropriate to evaluate the assessor and panel performance. Lanza and Amoruso [17] mentioned the repeatability index (RIt) and deviation index (DIt) to evaluate how assessors perform against themselves over time and with respect to the whole panel, respectively. In this work, the diverse tools proposed by Husson et al. [31] for studying the panelists' work are followed in particular.

3.3.1. Discrimination Power of Each Panelist

The individual efficiency of panelists was evaluated with the model: score = sample + session. The p-values (Table 2) associated with the F-test of the sample effect for each panelist are, then, the appropriate parameter to measure this discrimination power. Their values, with rows and columns sorted by the median estimated over them (Table 2), showed that most of the panelists were able to discriminate the black ripe table olive samples based on several of the descriptors developed by Lee et al. [9] and used later by López-López et al. [10]. Their efficiencies, in decreasing order, were: A14, A4, A2, A3, A6, A5, A8, A1, A12, A13, and A7, while only A11, A10, and A9 had no discriminant power (Table 2). Skin green was the only descriptor that received an overall significant median; however, mouth coating, flesh red, briny, flesh green, or skin red were among the attributes most differently perceived in the samples (Table 2). On the contrary, soapy smell/medicinal, fishy smell, cheesy smell, alcohol, or metallic were among the most similarly perceived; however, this does not necessarily mean that the panelists were not able to differentiate samples, but that these attributes were present in very low intensity or even completely absent (Table 2). There is controversy about the p-value that could be used as a cut-off level to consider a panelist acceptable. Stone et al. [32] proposed p ≤ 0.5, but the problem was that, when evaluating tea, there were so many p-values below 0.5 that almost any laboratory would retain the corresponding panelists. Powers [33] pointed out that the real question was establishing the number of attributes with significant performance necessary for a judge to be an acceptable assessor. However, no agreement on this aspect was achieved.
In this work, in general, the panelists were not systematically excellent in all descriptors, but most of them were good at some descriptors (significant p-value), and their overall performance was reasonable; however, panelists A11, A10, and A9 are, according to these results, candidates for further training or even removal from the panel if their performance does not sufficiently improve. Kermit and Lengard Almli [19] also identified an assessor with further need for training in the attributes pea flavor, sweetness, fruity, and off flavor.
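The per-panelist model score = sample + session used in this section can be sketched as a two-way ANOVA without replication for one panelist and one descriptor. The function name and the scores below are hypothetical; the sketch returns only the F statistic and its degrees of freedom, and the p-value would then be read from the F distribution (for instance with scipy.stats.f.sf, not used here to keep the example dependency-free). The function assumes a nonzero residual sum of squares.

```python
from statistics import mean

def sample_f_statistic(scores):
    """scores[i][j] is the score of sample i in session j for ONE panelist
    and ONE descriptor. Fits score = sample + session and returns
    (F for the sample effect, df_sample, df_residual)."""
    a, b = len(scores), len(scores[0])
    grand = mean([x for row in scores for x in row])
    sample_means = [mean(row) for row in scores]
    session_means = [mean([scores[i][j] for i in range(a)]) for j in range(b)]
    ss_sample = b * sum((m - grand) ** 2 for m in sample_means)
    ss_resid = sum(
        (scores[i][j] - sample_means[i] - session_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    df_sample, df_resid = a - 1, (a - 1) * (b - 1)
    f_value = (ss_sample / df_sample) / (ss_resid / df_resid)
    return f_value, df_sample, df_resid

# Hypothetical panelist who separates three samples clearly over three sessions.
f_value, df1, df2 = sample_f_statistic([[2, 3, 2], [5, 5, 6], [8, 9, 8]])
print(round(f_value, 2), df1, df2)  # 81.0 2 4
```

A large F (small p-value) for a descriptor indicates that the panelist discriminates the samples on that descriptor, which is the quantity tabulated per panelist and descriptor in Table 2.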

3.3.2. Panelist Repeatability

The panelists’ repeatability is the ability to consistently score the same product for a given attribute [18] and was evaluated by the standard deviation (SD) of the measurements of a descriptor from each panelist on each sample. It was considered that, when the residual of the ANOVA model for each panelist and descriptor (Table 3) was ≤ 1.96 (p ≤ 0.95), the panelist scored the samples in a narrow range through the successive sessions, and only panelists with residuals above this limit scored differently between sessions. In this work, there were no panelists who systematically graded the descriptors differently from one session to another (SD ≥ 1.96, in bold); however, several of them showed residuals above the limit for one or several descriptors, but not by a large margin. Therefore, in general, the panelists showed acceptable repeatability.
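As a rough illustration of this repeatability criterion, the sketch below pools the within-sample variation of one panelist's scores across sessions and compares the resulting SD with the 1.96 limit quoted above. The pooling formula is an assumption made for illustration (it ignores any session term), and the function name and scores are hypothetical.

```python
from math import sqrt
from statistics import mean

def pooled_repeatability_sd(scores):
    """scores[i] lists one panelist's scores for sample i across sessions
    (one descriptor). Returns the pooled within-sample standard deviation."""
    ss = 0.0
    df = 0
    for row in scores:
        row_mean = mean(row)
        ss += sum((x - row_mean) ** 2 for x in row)
        df += len(row) - 1
    return sqrt(ss / df)

# Hypothetical scores for two samples over three sessions.
sd = pooled_repeatability_sd([[2.0, 3.0, 4.0], [5.0, 5.0, 5.0]])
print(round(sd, 4), sd <= 1.96)  # 0.7071 True
```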

3.3.3. Panelist Reproducibility

The panelist agreement with the panel, which is associated with reproducibility [18], was assessed by the correlation between the panelists’ scores and the adjusted means of the panel (estimated by the ANOVA model) according to descriptors.
The procedure is similar to that used by Nyambaka et al. [30] to study the sensory changes in dehydrated cowpea leaves. The data are presented in a table, in which both panelists (in columns) and descriptors (in rows) are sorted from the highest to the lowest marginal median (Table 4). The panelists’ agreement with the panel (significant correlation, in black) was, in descending order of their medians, A6, A8, A14, A5, A1, A7, A13, A10, A9, A3, A2, A4, A12, and A11, while the negative correlations (in black and italic), which indicate disagreement with the panel (divergent behaviour), were distributed more or less evenly. The inconsistency of some panelists when evaluating cowpea leaves was attributed to particular preferences of assessors [30], which could also be possible in table olives for some attributes, like firmness or fibrousness.
Overall, the descriptors that had the best agreement between panelists and panel, sorted by the median, were (in decreasing order of relationship) skin green, skin sheen, flesh red, firmness, flesh green, fibrousness, flesh yellow, and moisture release (Table 4). They were also among the descriptors with the most discriminant power. On the contrary, those with more discrepancies among the panelists were residual, artificial fruity/floral, metallic, rancid, sourness, or soapy smell/medicinal (Table 4), all of them with no discriminant influence.
These results show that the overall behaviour of the panelists was reasonable, although there was still room for some improvement in their performance, particularly regarding those panelists with a strongly opposed correlation to the mean of the panel. Alternatively, they could be candidates for removal from the panel.
Lanza and Amoruso [17] used line plots according to the attribute and the deviation index (DIt) to evaluate the agreement between panelists and the whole panel. Their results are in line with those described above, since they also found some panelists who clearly deviated from the consensus. According to these authors, this type of result helps the panel leader to identify repeatability problems of specific assessors as compared to the whole panel and to correct the deviation by the corresponding training.
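The agreement measure used in this section, the correlation between one panelist's scores and the panel means, can be sketched with a plain Pearson coefficient. The function name and the scores are hypothetical, and raw means are used instead of the ANOVA-adjusted means mentioned above, which is a simplification.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical panel means for four samples and two panelists' scores.
panel_means = [2.0, 4.0, 6.0, 8.0]
agreeing = [1.0, 3.5, 5.0, 9.0]      # follows the panel's ranking
divergent = [8.0, 6.0, 4.0, 2.0]     # reverses the panel's ranking

r_agree = pearson(panel_means, agreeing)
r_diverge = pearson(panel_means, divergent)
print(round(r_agree, 3), round(r_diverge, 3))
```

A coefficient close to 1 corresponds to agreement with the panel, while a negative coefficient corresponds to the divergent behaviour highlighted in Table 4.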

3.4. Multivariate Study of Panelists and Panel

3.4.1. Clustering

A first multivariate approach to the similarity among panelists was achieved by hierarchical clustering analysis based on the scores given to the sample descriptors by each of them. The study was performed in XLSTAT using Ward’s aggregation criterion [28]. Three groups of panelists were formed when comparing the panelists’ behaviour (Figure 7A). The greatest dissimilarity was found between the group formed by A4 and A6 and the other panelists. The dissimilarity within the groups of the other panelists was appreciably lower, leading to three groups. Two of them were composed of four and seven panelists, while the third only included panelist A8, who had a peculiar behaviour. Therefore, in this case, the cluster analysis, which considers the overall panelist performance, showed that the panelists followed a somewhat similar trend when evaluating the black ripe olive samples, but did not reveal their peculiarities. In line with this result, hierarchical classification is more usually applied to the classification of products or the study of associations among descriptors. Francois et al. [28] used this technique for assessing the astringency of different beers, while Pense-Lheritier et al. [29] applied it to link the sensory changes induced by the addition of drugs to different beverages. Alasalvar et al. [6] found similarity among the flavor of natural and roasted Turkish hazelnut cultivars. Clustering was also used to segregate different consumer segments according to their overall liking scores [34].

3.4.2. Panelist Reproducibility

The multivariate study of the agreement among panelists and the whole panel [18], using bootstrapping, was made in SensoMineR by considering the results of a virtual panel obtained by taking successive samples (500 simulations) from the real data and applying Principal Component Analysis. Only two eigenvalues ≥1 were found, and they accounted for ~42 and 26% of the variance, respectively. The analysis was made using the function panellipse.session. The resampling technique has been described in detail elsewhere [31].
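The idea of a virtual panel can be illustrated with a deliberately simplified, univariate sketch: panelists are resampled with replacement, and the mean of each virtual panel is recorded. This is not the SensoMineR procedure, which works on the principal component coordinates of the products; the function names, the seed, and the scores are hypothetical, and only the number of simulations (500) is taken from the text.

```python
import random
from statistics import mean

def virtual_panel_means(panelist_scores, n_sim=500, seed=0):
    """panelist_scores holds one score per panelist for a product and descriptor.
    Draws n_sim virtual panels (panelists resampled with replacement)
    and returns the mean score of each virtual panel."""
    rng = random.Random(seed)
    n = len(panelist_scores)
    return [mean(rng.choices(panelist_scores, k=n)) for _ in range(n_sim)]

def percentile_interval(values, level=0.95):
    """Simple percentile interval from the sorted simulated means."""
    ordered = sorted(values)
    lower = ordered[int((1 - level) / 2 * len(ordered))]
    upper = ordered[int((1 + level) / 2 * len(ordered)) - 1]
    return lower, upper

# Hypothetical scores from five panelists for one product.
scores = [3.0, 4.0, 4.5, 5.0, 6.0]
simulated = virtual_panel_means(scores)
low, high = percentile_interval(simulated)
print(len(simulated), low <= high)  # 500 True
```

The spread of the simulated means plays the role of the confidence region drawn around each product: the wider it is, the less the panelists agree on that product.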
The closeness between the answers of the whole panel and those of the panelists was evaluated by projecting them onto the first two PCs. A PCA on the consensus allows for visualizing the strength of the consensus and the global discrimination of the products; besides, identification of the treatments shows the observed differences between the products [35]. In this work, agreement was assessed by the distance from each panelist's projection (circles labelled with the panelist's acronym, in the colour of the corresponding sample) to the position of that sample for the whole panel (squares, with a different colour for each sample) (Figure 7B).
PC1 was highly efficient at segregating samples from Manzanilla (on the left) and Hojiblanca (on the right) and could be associated with cultivar, while PC2 was able to distinguish samples as a function of growing area and storage. In general, the projections of the panelists for each sample were situated around that of the whole panel (sample shown in the same colour), although some of them were far from their respective samples. The discrepant panelists (as identified by the corresponding acronyms) were the same as those already mentioned in previous sections, mainly: A12 and A8 for HL2; A8 for HA2; A13, A12, A8, and A6 for HA1; A12, A7, A9, A6, A3, and A2 for MAL2; A12, A7, A6, and A2 for ML2; A13, A11, A9, A8, A7, A5, and A1 for MAL1; and A12, A8, A7, A6, and A2 for ML1. The panelist who most often scored the samples differently was A12, followed by A8, A7, and A6. Fewer discrepancies were observed for A2, A9, A13, A3, and A5. However, these represent just a few cases of divergence, while most of the panelists’ scores are jointly distributed around their corresponding samples. Additionally, panelists showed greater ability (closeness to the sample average) when evaluating long stored Hojiblanca samples (HL2 and HA2) than any other sample. In conclusion, this plot identified the panelists who will require particular training, although the performance of the others will also benefit from training. Our results are in agreement with those presented by Tomic et al. [21], who also found underperforming panelists and emphasized the need for a detailed study of their behavior using established statistical methods for the evaluation. Lanza and Amoruso [17] studied the performance of panelists against the whole panel using eggshell plots and concluded that a few panelists ranked some of the descriptors quite differently from the consensus, while there was good agreement for others, like hardness.
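
In the study the discrepant panelists were identified visually on Figure 7B. A numerical counterpart could look like the following Python sketch, which flags panelist projections that lie farther than a chosen distance from the whole-panel position of the same sample. The coordinates, the threshold, and the function name `discrepant_projections` are hypothetical and are not taken from the paper.

```python
import math

def discrepant_projections(panelist_xy, panel_xy, threshold):
    """Return (panelist, sample, distance) for projections on the PC1-PC2
    plane that lie farther than `threshold` from the whole-panel position."""
    flagged = []
    for (panelist, sample), (x, y) in panelist_xy.items():
        cx, cy = panel_xy[sample]
        distance = math.hypot(x - cx, y - cy)  # Euclidean distance
        if distance > threshold:
            flagged.append((panelist, sample, round(distance, 2)))
    # Largest discrepancies first.
    return sorted(flagged, key=lambda item: item[2], reverse=True)

# Hypothetical whole-panel positions of two samples.
panel_xy = {"S1": (-2.0, 0.5), "S2": (2.0, -0.5)}
# Hypothetical individual projections of two panelists.
panelist_xy = {
    ("P1", "S1"): (-1.8, 0.6),
    ("P2", "S1"): (1.0, 0.5),
    ("P1", "S2"): (2.1, -0.4),
    ("P2", "S2"): (2.0, 1.5),
}
flagged = discrepant_projections(panelist_xy, panel_xy, threshold=1.0)
print(flagged)
```

Counting how often each panelist appears in such a list would give a ranking comparable to the one reported above, where A12 was the most frequently discrepant panelist.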

3.4.3. Panel Repeatability

Study by Variables Projection on the Correlation Circle According to Sessions

The analysis was carried out using the virtual panel described above [31]. A first approximation to the panel repeatability was obtained by projecting the descriptors (only the most relevant ones, contribution >0.20) onto the first two PCs according to sessions. Close positions of a descriptor in the correlation circle for the different sessions indicate good repeatability. The panel was particularly repeatable among sessions for some descriptors, like skin green, astringency, flesh green, moisture release, fibrousness, flesh red, skin sheen, or flesh yellow. However, others showed appreciable distances from one session to another, like fishy smell/ocean, saltiness, or chewiness (Figure 8A). Because of these oscillations in the projections of the variables, the interpretation of the relationships among variables is not straightforward. Nevertheless, it is possible to establish overall associations, mainly among those variables with high repeatability among sessions. For example, firmness, fibrousness, or chewiness are opposed to moisture release, ripeness, or flesh green. Additionally, black ripe olives with high astringency could also present flesh yellow or skin green notes, but low vinegar or ripeness scores.
Galán Soldevilla et al. [14] associated bitter, sour, and wood with Green, Cured, and Traditional Aloreña de Málaga table olives, respectively. In black ripe olives, discrimination among samples from different origins was mainly based on the 2nd and 3rd PCs, which were the components linked to aroma and flavour characteristics; however, the more linear behaviour of panelists was related to a textural dimension that was strongly connected to PC1 [9]. Kinesthetic sensations were also critical for the segregation between defective and non-defective samples by PCA [12].

Study by Sample Projections According to Sessions

The analysis was also carried out using the virtual panel described above. In this case, the median scores of the virtual panel's perception of the samples (the same as those of the real panel) were projected onto the plane of the first two PCs according to sessions. Subsequently, the closest 95% of the points in each generated cloud were used to draw the confidence ellipses (p-value = 0.05), which were built according to the procedure described by Husson et al. [31] (Figure 8B). The repeatability of the panel across sessions can be assessed by the displacement of the sample centres. In general, the separation between the sample centres due to session was limited, indicating good panel agreement between sessions, which is also corroborated by the overlap of their confidence ellipses. Incidentally, the plot also indicates that the long stored fruits showed lower dispersion among sessions than the just processed fruits (one-month storage).
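
The selection of the closest 95% of the points can be illustrated with the Python sketch below. It is a simplification made under stated assumptions: closeness is measured as Euclidean distance to the centroid of the cloud, no ellipse is fitted, and the points and the function name `closest_fraction` are hypothetical. The actual construction follows Husson et al. [31] as implemented in SensoMineR.

```python
import math

def closest_fraction(points, fraction=0.95):
    """Return the centroid of a 2-D cloud and the `fraction` of points
    closest to it (the points that would delimit a confidence region)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    ranked = sorted(points, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    keep = round(fraction * len(points))
    return (cx, cy), ranked[:keep]

# Hypothetical cloud: 19 virtual-panel points near the origin and one outlier.
points = [(i * 0.1, 0.0) for i in range(-9, 10)] + [(10.0, 10.0)]
centre, kept = closest_fraction(points)
print(centre, len(kept))
```

Comparing the centres obtained for the same sample in different sessions, and checking whether the retained clouds overlap, mirrors the visual reading of Figure 8B described above.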

4. Conclusions

Usually, the study of panel performance is a preliminary, but superficial, task during the sensory evaluation of products. However, a detailed investigation of panel and panelist performance is a convenient tool for uncovering the details of their evaluation. In this work, such a study allowed the performance of the panel as a whole to be assessed, detected the panelists with the lowest discriminant power and those who interpreted the scale differently from the panel and, therefore, require further training, and even revealed that the stored black ripe olive products are perceived more similarly by the panelists over sessions. Besides, the study identified the descriptors that were hard to evaluate (skin green, vinegar, bitterness, or natural fruity/floral). Therefore, panelists would require particular training on them or, if they do not reach the appropriate level of discrimination, should be replaced by others with higher sensitivity. In summary, the work has confirmed that such studies are an essential tool for appropriate panel control and training, which should be a permanent concern of the panel leader.

Author Contributions

Conceptualization: A.L.-L. and A.G.-F.; Methodology: A.L.-L. and A.H.S.-G.; Software: A.G.-F.; Validation: A.L.-L. and A.G.-F.; Formal analysis: A.G.-F. and A.L.-L.; Investigation: A.C.-D., A.H.S.-G., A.M., and A.L.-L.; Resources: A.L.-L., A.H.S.-G., and A.M.; Data curation: A.L.-L. and A.G.-F.; Writing – original draft preparation: A.L.-L. and A.G.-F.; Writing – review and editing: A.L.-L. and A.G.-F.; Visualization: A.L.-L. and A.G.-F.; Supervision: A.L.-L.; Project administration: A.M. and A.L.-L.; Funding acquisition: A.M. and A.L.-L.

Funding

This research was funded in part by the Ministry of Economy and Competitiveness from the Spanish government through Project AGL2014-54048-R, partially financed by the European Regional Development Fund (ERDF).

Acknowledgments

We thank Elena Nogales Hernández for her technical assistance.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. International Olive Council (IOC). Online Reference Included in World Table Olives Figures: Production. 2018. Available online: http://www.internationaloliveoil.org/estaticos/view/132-world-table-olive-figures (accessed on 30 April 2019).
  2. Garrido-Fernández, A.; Fernández-Díez, M.J.; Adams, R.M. Table Olive Production and Processing; Chapman & Hall: London, UK, 1997.
  3. International Olive Council (IOC). Trade Standards Applying to Table Olives. IOC/OT/NC No. 1; International Olive Council: Madrid, Spain, 2004.
  4. International Olive Council (IOC). Method for the Sensory Analysis of Table Olives. COI/OT/MO No. 1/Rev.2 November 2011; International Olive Council: Madrid, Spain, 2011; Available online: http://www.internationaloliveoil.org/estaticos/view/70-metodos-de-evaluacion (accessed on 30 April 2018).
  5. Department of Agriculture. United States Standards for Grades of Ripe Olives; Agricultural Marketing Order, Department of Agriculture: Colombia, WA, USA, 1983.
  6. Alasalvar, C.; Pelvan, E.; Bahar, B.; Korel, F.; Ölmez, H. Flavour of natural and roasted Turkish hazelnut varieties (Corylus avellana L.) by descriptive sensory analysis, electronic nose and chemometrics. Int. J. Food Sci. Technol. 2012, 47, 122–131.
  7. Dabbou, S.; Issaoui, M.; Brahmi, F.; Nakbi, A.; Chelab, H.; Mechri, R.; Hammani, M. Changes in volatile compounds during processing of Tunisian-style table olives. J. Am. Oil Chem. Soc. 2012, 89, 347–354.
  8. Heyman, H.; Hopfer, H.; Bershaw, D. An exploration of the perception of minerality in white wines by projective mapping and descriptive analysis. J. Sens. Stud. 2014, 29, 1–13.
  9. Lee, S.M.; Kitsawad, K.; Sigal, A.; Flynn, D.; Guinard, J.X. Sensory properties and consumer acceptance of imported and domestic sliced black ripe olives. J. Food Sci. 2012, 77, 439–448.
  10. López-López, A.; Sánchez-Gómez, A.H.; Montaño, A.; Cortés-Delgado, A.; Garrido-Fernández, A. Sensory characterisation of black ripe table olives from Spanish Manzanilla and Hojiblanca cultivars. Food Res. Int. 2019, 116, 114–125.
  11. López-López, A.; Sánchez-Gómez, A.H.; Montaño, A.; Cortés-Delgado, A.; Garrido-Fernández, A. Sensory profile of Green Spanish-style table olives according to cultivar and origin. Food Res. Int. 2018, 108, 347–356.
  12. Lanza, B.; Amoruso, F. Sensory analysis of natural table olives: Relationships between appearance of defect and gustatory-kinaesthetic sensation changes. LWT-Food Sci. Technol. 2016, 68, 365–372.
  13. Yilmaz, E.; Aydeniz, B. Sensory evaluation and consumer perception of some commercial green table olives. Br. Food J. 2012, 114, 1085–1094.
  14. Galán Soldevilla, H.; Ruiz Pérez-Cacho, P.; Hernández Campuzano, J.A. Determination of the sensory profiles of Aloreña table olives. Grasas Aceites 2013, 64, 442–452.
  15. Marsilio, V.; Campestre, C.; Lanza, B.; De Angelis, M.; Russi, F. Sensory analysis of green table olives fermented in different saline solutions. Acta Hortic. 2002, 586, 617–620.
  16. Lombardi, S.J.; Macciola, V.; Iorizzo, M.; De Leonardis, A. Effect of different storage conditions on the shelf life of natural green table olives. Ital. J. Food Sci. 2018, 30, 414–427.
  17. Lanza, B.; Amoruso, F. Panel performance, discrimination power of descriptors, and sensory characterization of table olive samples. J. Sens. Stud. 2019, e12542.
  18. Rossi, F. Assessing sensory panelist performance using repeatability and reproducibility measures. Food Qual. Pref. 2001, 12, 467–479.
  19. Kermit, M.; Lengard Almli, V. Assessing the performance of a sensory panel-panellist monitoring and tracking. J. Chemom. 2005, 19, 154–161.
  20. Tomic, O.; Nilsen, A.; Martens, M.; Naes, T. Visualization of sensory profiling data for performance monitoring. LWT-Food Sci. Technol. 2007, 40, 262–269.
  21. Tomic, O.; Forde, C.; Delahunty, C.; Naes, T. Performance indices in descriptive sensory analysis-A complimentary screening tool for assessor and panel performance. Food Qual. Pref. 2013, 28, 122–133.
  22. Sipos, L.; Ladányi, M.; Gere, A.; Kókai, Z.; Kovács, S. Panel performance monitoring by Poincaré plot: A case study on flavoured bottled waters. Food Res. Int. 2017, 99, 198–205.
  23. López-López, A.; Rodríguez-Gómez, F.; Cortés-Delgado, A.; Montaño, A.; Garrido-Fernández, A. Influence of ripe table olive processing on oils characteristics and composition as determined by chemometrics. J. Agric. Food Chem. 2009, 57, 8973–8981.
  24. International Olive Council (IOC). Sensory Analysis of Olive Oil Standard Glass for Oil Tasting. COI/T20/Doc No. 5; International Olive Council: Madrid, Spain, 1987.
  25. Husson, F.; Lê, S. SensoMineR: Sensory Data Analysis with R. R Package Version 1.07. 2007. Available online: http://agrocampus-rennes.fr/math/SensoMinR (accessed on 08 November 2019).
  26. R Development Core Team. R: A Language and Environment for Statistical Computing; The R Foundation for Statistical Computing: Vienna, Austria, 2011.
  27. XLSTAT. Data Analysis and Statistical Solution for Microsoft Excel; Addinsoft: Paris, France, 2017.
  28. François, N.; Guyot-Declerck, C.; Hug, B.; Callemien, D.; Govaerst, B.; Collin, S. Beer astringency assessed by time-intensity and quantitative descriptive analysis: Influence of pH and accelerated aging. Food Qual. Pref. 2006, 17, 445–452.
  29. Pense-Lheritier, A.-M.; Vallet, T.; Aubert, A.; Courne, M.-A.; Lavarde, M. Descriptive analysis of a complex product space: Drug-beverage mixtures. J. Sens. Stud. 2016, 31, 101–113.
  30. Nyambaka, H.; Ryley, J. Multivariate analysis of the sensory change in the dehydrated cowpea leaves. Talanta 2004, 64, 23–29.
  31. Husson, F.; Lê, S.; Pagés, J. Confidence ellipse for the sensory profiles obtained by Principal Components Analysis. Food Qual. Pref. 2005, 16, 245–250.
  32. Stone, H.; Sidel, J.; Oliver, S.; Woolsey, A.; Singleton, R.C. Sensory evaluation by quantitative descriptive analysis. Food Technol. 1974, 28, 24–34.
  33. Powers, J.J. Current practices and application of descriptive methods. In Sensory Analysis of Foods; Piggott, J.R., Ed.; Elsevier Applied Science: London, UK, 1988.
  34. Kim, M.K.; Lee, Y.-J.; Kwat, H.S.; Kang, M.W. Identification of sensory attributes that drive consumer liking of commercial orange juice products in Korea. J. Food Sci. 2013, 78, 1451–1458.
  35. Rodrigue, N.; Guillet, M.; Fortin, J.; Martin, J.F. Comparing information obtaining for ranking and descriptive tests of four sweet corn products. Food Qual. Pref. 2000, 11, 47–54.
Figure 1. Panel performance. Sample·panelist interaction coefficients for selected descriptors (skin red and flesh red).
Figure 2. Panel performance. Sample·panelist interaction as assessed by (A) the panelist’s contributions (coefficients) for selected descriptors (skin red, skin green, skin sheen, and flesh red), and (B) means of panelists over the whole panel according to samples.
Figure 3. Panel performance. Sample·session interaction. Mean per session of panelists, according to samples, over the sample means of the whole sessions for significant descriptors: (A) saltiness, and (B) metallic.
Figure 4. Panel performance. Panelist·session interaction. Contribution (coefficients) of panelists to the interaction for selected descriptors (skin red, skin green, skin sheen, flesh red, flesh yellow, flesh green, briny, and mushroom).
Figure 5. Panel performance. Panelist-session interaction. Means per session according to panelists over means of the whole sessions for selected descriptors (ripeness, buttery, metallic, and rancid).
Figure 6. Panel performance. Panelist-session interaction. Detail of the coefficients through the three sessions for the flesh red descriptor.
Figure 7. Panelist performance as assessed by multivariate analysis. (A) Clustering of panelists according to their performance. (B) Projection of panelists’ loads (individual description) and samples’ scores onto the first two Principal Components.
Figure 8. Panel repeatability as assessed by multivariate analysis, using bootstrapping. (A) Projection of the descriptors’ loads on the correlation circle onto the first two Principal Components, and (B) Projection of the samples’ scores and confidence ellipses according to sessions onto the first two Principal Components.
Table 1. Overall panel performance as assessed by analysis of variance (ANOVA) sorted by sample p-values, including main effects, and interactions. Panelist, session, and their interactions were considered as random, while the sample was studied as fixed factor/variable.
Sensory Attribute | Sample | Panelist | Session | Sample·Panelist | Sample·Session | Panelist·Session | Median
Skin green | 2.792 × 10^−10 | 9.069 × 10^−26 | 2.177 × 10^−1 | 2.404 × 10^−8 | 6.211 × 10^−1 | 9.423 × 10^−2 | 4.712 × 10^−2
Flesh green | 1.720 × 10^−9 | 2.186 × 10^−15 | 2.377 × 10^−1 | 2.602 × 10^−4 | 7.443 × 10^−1 | 6.387 × 10^−1 | 1.190 × 10^−1
Skin sheen | 1.900 × 10^−6 | 1.742 × 10^−28 | 2.497 × 10^−1 | 1.011 × 10^−4 | 6.362 × 10^−1 | 2.689 × 10^−1 | 1.249 × 10^−1
Flesh red | 8.603 × 10^−6 | 2.654 × 10^−34 | 6.326 × 10^−1 | 1.796 × 10^−9 | 5.180 × 10^−1 | 3.250 × 10^−3 | 1.629 × 10^−3
Firmness | 1.033 × 10^−4 | 4.141 × 10^−35 | 3.320 × 10^−1 | 1.881 × 10^−3 | 9.960 × 10^−2 | 3.305 × 10^−2 | 1.747 × 10^−2
Fibrousness | 9.292 × 10^−4 | 2.088 × 10^−39 | 4.165 × 10^−1 | 1.397 × 10^−3 | 4.263 × 10^−1 | 9.392 × 10^−3 | 5.394 × 10^−3
Flesh yellow | 1.328 × 10^−2 | 7.265 × 10^−20 | 2.752 × 10^−1 | 1.752 × 10^−4 | 2.046 × 10^−1 | 2.332 × 10^−1 | 1.090 × 10^−1
Skin red | 1.760 × 10^−2 | 3.535 × 10^−51 | 6.585 × 10^−1 | 1.511 × 10^−10 | 2.695 × 10^−1 | 1.342 × 10^−1 | 7.590 × 10^−2
Vinegary | 1.833 × 10^−2 | 3.683 × 10^−30 | 1.613 × 10^−1 | 1.794 × 10^−3 | 6.612 × 10^−2 | 2.908 × 10^−2 | 2.370 × 10^−2
Moisture release | 2.978 × 10^−2 | 4.515 × 10^−33 | 9.058 × 10^−1 | 9.558 × 10^−6 | 5.027 × 10^−1 | 6.828 × 10^−2 | 4.903 × 10^−2
Fishy smell/Ocean | 3.060 × 10^−2 | 1.312 × 10^−12 | 3.342 × 10^−1 | 3.912 × 10^−1 | 7.507 × 10^−1 | 7.249 × 10^−1 | 3.627 × 10^−1
Saltiness | 3.117 × 10^−2 | 1.680 × 10^−48 | 3.191 × 10^−1 | 2.575 × 10^−3 | 3.940 × 10^−3 | 1.068 × 10^−3 | 3.258 × 10^−3
Astringency | 2.232 × 10^−1 | 1.599 × 10^−45 | 9.491 × 10^−1 | 6.062 × 10^−14 | 6.593 × 10^−1 | 1.447 × 10^−1 | 1.839 × 10^−1
Ripeness | 2.614 × 10^−1 | 1.586 × 10^−44 | 8.095 × 10^−1 | 4.834 × 10^−5 | 6.614 × 10^−1 | 6.012 × 10^−3 | 1.337 × 10^−1
Soapy smell/Medicinal | 2.636 × 10^−1 | 2.478 × 10^−51 | 4.467 × 10^−1 | 4.560 × 10^−1 | 6.792 × 10^−1 | 2.451 × 10^−2 | 3.552 × 10^−1
Bitterness | 2.710 × 10^−1 | 4.434 × 10^−38 | 2.556 × 10^−1 | 5.075 × 10^−1 | 1.940 × 10^−1 | 2.364 × 10^−3 | 2.248 × 10^−1
Chewiness | 3.334 × 10^−1 | 1.538 × 10^−39 | 7.989 × 10^−1 | 4.862 × 10^−11 | 2.964 × 10^−1 | 2.947 × 10^−3 | 1.497 × 10^−1
Briny | 3.567 × 10^−1 | 1.708 × 10^−37 | 7.944 × 10^−1 | 4.671 × 10^−6 | 8.067 × 10^−1 | 2.152 × 10^−2 | 1.891 × 10^−1
Natural fruity/Floral | 4.102 × 10^−1 | 2.781 × 10^−30 | 1.840 × 10^−1 | 4.931 × 10^−3 | 6.946 × 10^−1 | 3.302 × 10^−1 | 2.571 × 10^−1
Rancid | 4.867 × 10^−1 | 1.815 × 10^−41 | 3.093 × 10^−1 | 2.110 × 10^−1 | 6.873 × 10^−1 | 2.669 × 10^−2 | 2.601 × 10^−1
Nutty | 4.892 × 10^−1 | 4.653 × 10^−26 | 3.041 × 10^−1 | 3.637 × 10^−3 | 2.203 × 10^−1 | 1.108 × 10^−2 | 1.157 × 10^−1
Buttery | 5.223 × 10^−1 | 7.225 × 10^−42 | 3.749 × 10^−1 | 1.292 × 10^−9 | 4.572 × 10^−1 | 6.488 × 10^−3 | 1.907 × 10^−1
Oak barrel | 5.496 × 10^−1 | 4.336 × 10^−44 | 3.501 × 10^−1 | 7.740 × 10^−3 | 9.681 × 10^−2 | 2.796 × 10^−6 | 5.227 × 10^−2
Metallic | 5.778 × 10^−1 | 1.374 × 10^−24 | 1.115 × 10^−1 | 1.010 × 10^−1 | 5.508 × 10^−6 | 8.859 × 10^−1 | 1.062 × 10^−1
Alcohol | 6.690 × 10^−1 | 9.806 × 10^−60 | 8.765 × 10^−2 | 2.337 × 10^−1 | 7.464 × 10^−1 | 1.730 × 10^−1 | 2.033 × 10^−1
Mushroom | 6.795 × 10^−1 | 5.867 × 10^−26 | 4.712 × 10^−1 | 3.910 × 10^−6 | 1.256 × 10^−1 | 6.242 × 10^−7 | 6.280 × 10^−2
Mouth coating | 6.925 × 10^−1 | 2.358 × 10^−56 | 7.926 × 10^−1 | 5.014 × 10^−17 | 2.719 × 10^−1 | 1.033 × 10^−3 | 1.365 × 10^−1
Sourness | 7.219 × 10^−1 | 1.227 × 10^−24 | 6.145 × 10^−1 | 1.206 × 10^−2 | 8.290 × 10^−1 | 8.804 × 10^−4 | 3.133 × 10^−1
Earthy/Soil | 7.335 × 10^−1 | 1.713 × 10^−28 | 2.620 × 10^−1 | 2.050 × 10^−1 | 8.907 × 10^−2 | 3.771 × 10^−1 | 2.335 × 10^−1
Artificial fruity/Floral | 8.387 × 10^−1 | 8.279 × 10^−37 | 2.036 × 10^−1 | 1.018 × 10^−4 | 9.372 × 10^−1 | 8.575 × 10^−1 | 5.211 × 10^−1
Residual | 8.937 × 10^−1 | 4.326 × 10^−43 | 1.212 × 10^−1 | 1.784 × 10^−9 | 6.901 × 10^−1 | 9.118 × 10^−2 | 1.062 × 10^−1
Cheesy smell | 9.075 × 10^−1 | 2.015 × 10^−27 | 6.441 × 10^−1 | 8.088 × 10^−3 | 3.692 × 10^−1 | 2.652 × 10^−4 | 1.886 × 10^−1
Gassy smell | 9.389 × 10^−1 | 3.652 × 10^−33 | 5.786 × 10^−1 | 1.249 × 10^−1 | 9.484 × 10^−1 | 7.670 × 10^−1 | 6.728 × 10^−1
Note: Significant values at p ≤ 0.05 are indicated in bold.
Table 2. Discriminant power of panelists as assessed by the p-value of the F test.
Descriptor | A14 | A4 | A2 | A3 | A6 | A5 | A8 | A1 | A12 | A13 | A7 | A11 | A10 | A9 | Median
Skin green | 0.0007 | 0.0119 | 0.0011 | 0.0010 | 0.0016 | 0.0687 | 0.0002 | 0.1243 | 5.2 × 10^−7 | 0.0402 | 0.0655 | 0.5177 | 0.6086 | 0.1772 | 0.0261
Mouth coating | 0.0491 | 0.0188 | 0.2445 | 0.0023 | 0.0162 | 0.0063 | 4.8 × 10^−7 | 0.0927 | 0.7896 | 0.0042 | 0.2880 | 0.4278 | 0.2185 | 0.8250 | 0.0709
Flesh red | 0.0103 | 0.0294 | 0.448 | 0.0019 | 3.9 × 10^−6 | 0.0073 | 0.6432 | 0.3536 | 0.0409 | 0.1384 | 2.4 × 10^−5 | 0.1661 | 0.4706 | 0.1965 | 0.0897
Briny | 0.0912 | 0.0490 | 0.0074 | 0.3023 | 0.0138 | 0.1940 | 0.0679 | 0.5585 | 0.0370 | 0.1132 | 0.4739 | 0.2358 | 0.1762 | 0.0419 | 0.1022
Flesh green | 0.0564 | 0.0049 | 0.0885 | 0.0792 | 0.0001 | 0.2135 | 0.1192 | 0.1599 | 0.0240 | 0.1850 | 5.3 × 10^−7 | 0.3497 | 0.1588 | 0.1216 | 0.1039
Skin red | 6.7 × 10^−6 | 0.1042 | 0.0225 | 1.0 × 10^−6 | 0.0063 | 0.5103 | 0.1245 | 0.3750 | 0.0251 | 0.0046 | 0.2455 | 0.0883 | 0.4822 | 0.2204 | 0.0963
Residual | 0.1602 | 0.0005 | 0.4853 | 0.1151 | 0.0480 | 0.0976 | 4.9 × 10^−10 | 0.0514 | 0.5184 | 0.6976 | 0.3556 | 0.2728 | 0.7313 | 0.0727 | 0.1376
Oak barrel | 0.1427 | 0.0006 | 0.9411 | 0.1950 | 0.2170 | 0.8693 | 0.4858 | 0.0194 | 0.1548 | 0.2937 | 0.4778 | 0.1655 | 0.1373 | 0.1540 | 0.1802
Fibrousness | 0.1065 | 0.0220 | 6.4 × 10^−5 | 0.3311 | 0.1991 | 0.3859 | 0.0055 | 0.1004 | 0.0410 | 0.5118 | 0.0525 | 0.6702 | 0.4188 | 0.6076 | 0.1528
Ripeness | 0.1602 | 0.0005 | 0.4853 | 0.4853 | 0.0480 | 0.0976 | 4.9 × 10^−10 | 0.0514 | 0.5184 | 0.6976 | 0.3556 | 0.2728 | 0.7313 | 0.0727 | 0.1376
Firmness | 0.1345 | 0.0402 | 0.0444 | 0.0022 | 0.1130 | 0.7759 | 0.1552 | 0.4107 | 0.2639 | 0.1660 | 0.0009 | 0.3727 | 0.2481 | 0.8867 | 0.1606
Skin sheen | 0.2700 | 0.2860 | 0.0323 | 0.0003 | 0.3672 | 0.0109 | 0.0269 | 7.7 × 10^−5 | 0.1841 | 0.0009 | 0.6877 | 0.1887 | 0.1489 | 0.8764 | 0.1665
Flesh yellow | 0.1231 | 0.0116 | 0.081 | 0.1643 | 0.7031 | 0.2925 | 0.0878 | 0.5186 | 0.0442 | 0.5177 | 6.6 × 10^−5 | 0.3582 | 0.0754 | 0.2687 | 0.1437
Bitterness | 0.0100 | 0.6474 | 0.0056 | 0.2079 | 0.5114 | 0.7813 | 0.2312 | 0.0501 | 0.2547 | 0.2341 | 0.9994 | 0.2390 | 0.1831 | 0.5097 | 0.2365
Astringency | 0.5481 | 0.0240 | 0.0374 | 0.2538 | 0.6887 | 0.2064 | 1.5 × 10^−8 | 0.0159 | 0.1436 | 0.7140 | 0.0508 | 0.9816 | 0.6175 | 0.4225 | 0.2301
Moisture release | 0.1299 | 0.2333 | 0.2658 | 0.0004 | 0.0052 | 0.6328 | 0.0480 | 0.3207 | 0.0604 | 0.0330 | 0.102 | 0.5576 | 0.8638 | 0.2983 | 0.1816
Nutty | 0.0392 | 0.0793 | 0.3598 | 0.2546 | 0.1677 | 0.0824 | 0.2438 | 0.2683 | 0.4706 | 0.8304 | 0.4658 | 0.1908 | 0.3928 | 0.1083 | 0.2492
Buttery | 7.6 × 10^−7 | 0.2047 | 0.2191 | 6.1 × 10^−6 | 0.6094 | 0.1268 | 0.8115 | 0.0356 | 0.2662 | 0.1992 | 0.0634 | 0.5658 | 0.3017 | 0.9222 | 0.2119
Mushroom | 0.8962 | 0.0003 | 0.6540 | 0.0006 | 0.6046 | 0.3799 | 0.7030 | 0.0476 | 0.4706 | 0.5158 | 0.0077 | 0.1854 | 0.5196 | 0.136 | 0.4253
Chewiness | 0.0003 | 0.4448 | 0.7152 | 0.0156 | 0.2636 | 0.6584 | 6.5 × 10^−9 | 0.0003 | 0.0422 | 0.2922 | 0.1441 | 0.6937 | 0.9195 | 0.6294 | 0.2779
Sourness | 0.2963 | 0.6765 | 0.6530 | 0.4706 | 0.0155 | 0.1844 | 0.4783 | 0.2397 | 0.3977 | 0.2461 | 0.3597 | 0.8106 | 0.0778 | 0.7095 | 0.3787
Saltiness | 0.5401 | 0.6428 | 0.4611 | 0.1292 | 0.0715 | 0.2307 | 0.0075 | 0.0147 | 0.1889 | 0.9875 | 0.7995 | 0.7577 | 0.4706 | 0.1714 | 0.3459
Vinegary | 0.1315 | 0.6339 | 0.3024 | 0.3830 | 0.2282 | 0.1108 | 0.4706 | 0.3442 | 0.0716 | 0.5815 | 0.3609 | 0.4651 | 0.4706 | 0.5146 | 0.3719
Nat. fruity/floral | 0.0598 | 0.7439 | 0.7778 | 0.0112 | 0.4959 | 0.4959 | 0.4706 | 0.3475 | 0.6836 | 0.2703 | 0.6252 | 0.3119 | 0.0554 | 0.7819 | 0.4091
Earthy soil | 0.3278 | 0.0136 | 0.1162 | 0.6321 | 0.9603 | 0.2985 | 0.4706 | 0.561 | 0.4706 | 0.3367 | 0.5492 | 0.4013 | 0.5284 | 0.1496 | 0.4360
Rancid | 0.09477 | 0.2697 | 0.2386 | 0.5123 | 0.3915 | 0.6038 | 0.4706 | 0.3878 | 0.4706 | 0.4822 | 0.5793 | 0.1436 | 0.5866 | 0.6142 | 0.4706
Art. fruity/floral | 4.4 × 10^−5 | 0.4568 | 0.3301 | 0.5810 | 0.0916 | 0.0457 | 0.0279 | 0.2441 | 0.4706 | 0.0880 | 0.4715 | 0.4706 | 0.4876 | 0.7257 | 0.3934
Metallic | 0.2645 | 0.5882 | 0.0167 | 0.4980 | 0.5781 | 0.8309 | 0.5770 | 0.3776 | 0.4706 | 0.4706 | 0.3236 | 0.3508 | 0.4074 | 0.5429 | 0.4706
Alcohol | 0.3289 | 0.4027 | 0.1303 | 0.5559 | 0.7270 | 0.2080 | 0.5836 | 0.4118 | 0.4706 | 0.3147 | 0.3841 | 0.4706 | 0.4706 | 0.7604 | 0.4412
Cheesy smell | 0.5312 | 0.5468 | 0.1270 | 0.6647 | 0.3881 | 0.7116 | 0.4706 | 0.3594 | 0.4706 | 0.7706 | 0.1850 | 0.5121 | 0.0275 | 0.7174 | 0.4914
Fishy smell | 0.7693 | 0.1763 | 0.1578 | 0.5570 | 0.2083 | 0.7899 | 0.4706 | 0.3009 | 0.4706 | 0.0289 | 0.4288 | 0.1892 | 0.7001 | 0.5644 | 0.4497
Soapy smell/med | 0.6524 | 0.5964 | 0.1828 | 0.5373 | 0.7019 | 0.6411 | 0.6237 | 0.8154 | 0.4706 | 0.0145 | 0.4853 | 0.5051 | 0.7512 | 0.4998 | 0.5669
Median | 0.1299 | 0.1763 | 0.1828 | 0.1950 | 0.2170 | 0.2307 | 0.2312 | 0.2583 | 0.2662 | 0.2703 | 0.3556 | 0.3582 | 0.4706 | 0.5097 | 0.2448
Note: Significant values at p ≤ 0.05 are indicated in bold.
Table 3. Panelist repeatability as assessed by the ANOVA residuals according to descriptors.
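Descriptor | A1 | A10 | A11 | A12 | A13 | A14 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | A9
Skin red | 1.63 | 2.27 | 0.88 | 1.85 | 1.91 | 1.03 | 0.60 | 1.06 | 0.94 | 2.74 | 1.69 | 2.06 | 1.41 | 1.03
Skin green | 1.50 | 2.06 | 1.26 | 1.47 | 1.96 | 1.37 | 0.31 | 1.30 | 0.73 | 1.75 | 1.23 | 2.02 | 1.03 | 2.42
Skin sheen | 0.98 | 1.57 | 1.27 | 3.20 | 1.61 | 1.15 | 0.51 | 1.43 | 1.03 | 1.39 | 1.82 | 1.81 | 0.77 | 1.63
Flesh red | 1.74 | 0.06 | 1.41 | 2.89 | 3.10 | 1.23 | 0.84 | 1.64 | 0.94 | 1.58 | 1.11 | 1.61 | 1.86 | 1.27
Flesh yellow | 1.22 | 1.90 | 1.38 | 2.77 | 1.16 | 1.16 | 0.67 | 0.21 | 0.67 | 1.55 | 2.41 | 0.78 | 1.75 | 0.91
Flesh green | 1.76 | 2.05 | 1.46 | 2.76 | 2.64 | 1.65 | 0.54 | 1.06 | 0.88 | 2.88 | 1.40 | 1.11 | 1.85 | 2.08
Briny | 1.37 | 1.58 | 1.09 | 2.32 | 1.34 | 0.91 | 0.46 | 1.00 | 0.39 | 1.41 | 1.45 | 1.74 | 1.27 | 1.75
Mushroom | 0.58 | 0.91 | 1.06 | 0.09 | 1.35 | 1.09 | 0.75 | 1.13 | 0.53 | 1.08 | 2.19 | 1.65 | 1.41 | 0.88
Earthy soil | 0.78 | 1.66 | 2.00 | 0.09 | 0.98 | 0.29 | 0.54 | 0.68 | 0.86 | 1.21 | 1.86 | 0.59 | <0.01 | 0.77
Oak barrel | 0.41 | 1.93 | 0.65 | 1.33 | 1.03 | 0.36 | 0.89 | 0.53 | 0.59 | 1.63 | 1.88 | 1.06 | 1.93 | 0.79
Nutty | 0.21 | 1.19 | 0.80 | 0.10 | 0.64 | 0.40 | 0.54 | 0.77 | 0.54 | 0.51 | 1.07 | 0.96 | 2.06 | 1.33
Artificial fruity/floral | 0.18 | 0.93 | 0.02 | 0.12 | 0.72 | 0.36 | 0.59 | 0.15 | 0.78 | 1.14 | 1.19 | 1.19 | 0.79 | 0.96
Natural fruity/floral | 1.13 | 1.74 | 0.07 | 1.03 | 1.06 | 0.87 | 0.87 | 0.79 | 0.80 | 1.61 | 1.95 | 1.50 | <0.01 | 1.05
Vinegary | 0.24 | <0.01 | 0.28 | 1.55 | 1.20 | 0.38 | 0.53 | 0.39 | 0.53 | 2.31 | 1.98 | 0.59 | 0.41 | 0.61
Alcohol | 0.20 | 0.13 | 0.07 | 0.14 | 0.61 | 0.67 | 0.55 | 0.26 | 0.60 | 2.03 | 1.73 | 1.09 | 1.04 | 1.00
Fishy smell/ocean | 0.48 | 0.85 | 0.15 | 0.14 | 0.96 | 1.08 | 0.31 | 0.38 | 0.84 | 1.37 | 1.46 | 1.71 | 0.35 | 1.58
Cheese smell | 0.44 | 1.00 | 0.09 | 0.20 | 0.86 | 0.09 | 0.29 | 0.09 | 0.47 | 0.76 | 1.39 | 0.55 | 0.02 | 0.15
Sourness | 0.14 | 0.56 | 0.53 | 1.28 | 0.35 | 1.89 | 0.14 | 0.20 | 0.70 | 1.98 | 0.84 | 1.65 | 0.65 | 0.28
Bitterness | 0.80 | 1.06 | 0.30 | 1.27 | 1.54 | 0.87 | 0.42 | 0.55 | 0.87 | 1.99 | 1.07 | 2.73 | 1.59 | 0.37
Saltiness | 0.78 | 0.13 | 0.65 | 1.75 | 0.76 | 0.27 | 0.30 | 0.40 | 0.86 | 1.19 | 1.19 | 2.35 | 1.33 | 0.93
Ripeness | 1.27 | 2.81 | 0.52 | 2.12 | 1.95 | 1.73 | 0.45 | 0.98 | 0.73 | 1.43 | 1.49 | 2.07 | 1.69 | 1.04
Buttery | 0.71 | 2.26 | 0.68 | 1.41 | 1.71 | 0.91 | 0.53 | 1.18 | 0.77 | 1.09 | 1.87 | 1.57 | 1.26 | 2.12
Metallic | 1.62 | 1.37 | 0.54 | 0.25 | 0.82 | 1.45 | 0.52 | 1.09 | 0.95 | 1.84 | 2.00 | 1.69 | 0.82 | 0.80
Rancid | 0.36 | 0.73 | 0.04 | <0.01 | 0.45 | 0.56 | 0.37 | 0.33 | 0.74 | 0.70 | 2.01 | 0.26 | <0.01 | 0.05
Soapy smell/medicinal | 0.30 | 1.37 | 0.46 | 0.42 | 1.21 | 0.05 | 0.30 | 0.85 | 0.81 | 0.91 | 2.50 | 1.14 | 0.66 | 0.71
Gassy smell | 0.12 | 0.59 | 0.05 | 0.71 | <0.01 | <0.01 | 0.12 | 0.47 | 0.70 | 0.12 | 1.41 | 0.13 | 0.022 | 0.07
Firmness | 1.29 | 1.36 | 0.64 | 1.12 | 2.36 | 1.68 | 0.38 | 1.14 | 0.82 | 1.85 | 1.83 | 1.10 | 1.03 | 1.01
Fibrousness | 1.02 | 0.77 | 0.62 | 1.25 | 2.02 | 1.43 | 0.25 | 1.51 | 0.71 | 1.14 | 1.56 | 1.35 | 1.53 | 1.17
Moisture release | 1.36 | 2.02 | 0.61 | 1.20 | 1.85 | 1.34 | 0.29 | 1.01 | 0.67 | 0.99 | 1.41 | 1.32 | 1.03 | 1.42
Mouth coating | 0.85 | 1.73 | 0.60 | 0.65 | 1.30 | 1.32 | 0.52 | 1.20 | 0.64 | 0.72 | 1.15 | 1.00 | 0.84 | 0.92
Chewiness | 0.75 | 1.46 | 0.64 | 1.10 | 1.94 | 1.11 | 0.36 | 1.10 | 1.26 | 1.36 | 1.73 | 1.04 | 0.84 | 0.92
Astringency | 0.05 | 1.60 | 0.53 | 1.19 | 1.99 | 0.07 | 0.12 | 0.91 | 0.62 | 0.54 | 1.37 | 1.45 | 0.91 | 0.10
Residual | 0.62 | 2.14 | 0.13 | 0.13 | 1.41 | 1.89 | 1.36 | 0.34 | 0.95 | 0.82 | 1.35 | 1.28 | 1.04 | 0.55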
Notes: Significantly higher values at p ≤ 0.05 are indicated in bold, while agreement is shown in bold and italic.
Table 4. Panelist agreement with panel as assessed by the correlation coefficient.
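Descriptor | A6 | A8 | A14 | A5 | A1 | A7 | A13 | A10 | A9 | A3 | A2 | A4 | A12 | A11 | Median
Skin green | 0.674 | 0.828 | 0.792 | 0.810 | 0.660 | 0.603 | 0.932 | 0.547 | 0.814 | 0.808 | 0.853 | 0.874 | 0.791 | 0.860 | 0.800
Skin sheen | 0.762 | 0.787 | 0.840 | 0.962 | 0.809 | 0.869 | 0.677 | 0.820 | 0.050 | 0.220 | 0.092 | 0.883 | 0.762 | 0.323 | 0.744
Flesh red | 0.907 | 0.904 | 0.499 | 0.841 | 0.667 | 0.956 | 0.190 | −0.322 | 0.816 | 0.268 | 0.759 | 0.524 | 0.722 | 0.420 | 0.695
Firmness | 0.335 | 0.678 | 0.600 | 0.296 | 0.754 | 0.758 | 0.789 | 0.696 | 0.500 | 0.216 | 0.720 | 0.875 | 0.335 | 0.803 | 0.687
Flesh green | 0.960 | 0.601 | 0.152 | 0.198 | 0.752 | 0.550 | 0.727 | 0.615 | 0.505 | 0.751 | 0.344 | 0.764 | 0.097 | 0.572 | 0.637
Fibrousness | 0.644 | 0.639 | 0.152 | 0.198 | 0.752 | 0.550 | 0.727 | 0.615 | 0.505 | 0.751 | 0.344 | 0.833 | 0.965 | −0.658 | 0.577
Flesh yellow | 0.309 | 0.673 | 0.293 | 0.529 | 0.635 | 0.598 | 0.801 | 0.556 | 0.653 | 0.306 | 0.649 | 0.538 | 0.226 | 0.504 | 0.558
Moist. release | 0.781 | 0.425 | 0.579 | 0.588 | 0.667 | 0.605 | 0.355 | −0.078 | 0.653 | 0.306 | 0.649 | 0.538 | 0.226 | 0.504 | 0.558
Fishy smell | −0.018 | 0.777 | 0.885 | 0.434 | 0.336 | 0.366 | 0.723 | 0.216 | 0.650 | 0.886 | −0.532 | 0.886 | 0.774 | 0.245 | 0.542
Nutty | −0.206 | 0.831 | 0.606 | 0.401 | −0.049 | −0.445 | 0.674 | −0.262 | 0.640 | 0.706 | 0.665 | 0.566 | −0.157 | −0.580 | 0.484
Astringency | −0.387 | 0.921 | 0.488 | 0.769 | 0.470 | 0.850 | −0.412 | 0.576 | −0.254 | 0.497 | 0.730 | 0.373 | −0.249 | −0.471 | 0.479
Briny | −0.250 | 0.478 | −0.043 | 0.470 | 0.620 | 0.560 | 0.386 | −0.477 | 0.530 | 0.513 | −0.133 | 0.183 | 0.609 | 0.735 | 0.474
Ripeness | 0.420 | 0.093 | 0.674 | 0.454 | 0.760 | −0.309 | 0.690 | 0.559 | 0.618 | −0.106 | 0.832 | −0.057 | 0.444 | 0.373 | 0.473
Buttery | 0.036 | 0.065 | 0.691 | −0.266 | 0.600 | 0.405 | 0.751 | 0.537 | 0.407 | −0.353 | 0.607 | 0.309 | 0.480 | 0.582 | 0.444
Skin red | 0.862 | 0.356 | 0.551 | −0.063 | 0.676 | 0.257 | 0.444 | 0.154 | 0.585 | 0.387 | 0.517 | 0.228 | 0.776 | 0.223 | 0.416
Chewiness | 0.135 | 0.269 | −0.045 | 0.485 | 0.235 | 0.568 | 0.646 | 0.476 | 0.311 | 0.414 | 0.046 | 0.677 | 0.406 | 0.811 | 0.410
Vinegary | 0.989 | 0.255 | 0.798 | 0.853 | 0.392 | 0.079 | 0.422 | - | −0.277 | −0.723 | −0.511 | −0.696 | 0.982 | 0.594 | 0.392
Oak barrel | 0.696 | 0.401 | 0.556 | 0.511 | 0.250 | 0.117 | −0.058 | 0.367 | 0.546 | 0.118 | 0.640 | 0.334 | −0.426 | −0.209 | 0.351
Bitterness | −0.145 | 0.754 | 0.433 | −0.312 | 0.590 | 0.533 | 0.735 | 0.127 | 0.158 | 0.740 | 0.245 | 0.053 | −0.039 | 0.450 | 0.339
Saltiness | 0.682 | 0.850 | 0.061 | 0.397 | 0.440 | 0.607 | −0.033 | 0.792 | 0.531 | 0.060 | −0.312 | 0.244 | 0.314 | 0.170 | 0.279
Earthy soil | 0.703 | - | −0.843 | 0.013 | 0.462 | −0.133 | 0.478 | 0.544 | 0.328 | 0.239 | −0.283 | −0.067 | −0.392 | 0.540 | 0.329
Mouth coating | 0.770 | 0.770 | −0.558 | 0.172 | −0.189 | 0.283 | −0.143 | −0.511 | 0.406 | 0.870 | 0.434 | 0.817 | −0.393 | −0.255 | 0.228
Natural fruity/floral | −0.514 | - | 0.773 | 0.765 | 0.546 | 0.210 | −0.700 | 0.685 | −0.482 | 0.952 | 0.464 | −0.013 | 0.055 | −0.798 | 0.210
Mushroom | −0.064 | 0.473 | 0.023 | 0.834 | 0.386 | 0.408 | 0.286 | 0.152 | 0.076 | 0.343 | −0.070 | 0.054 | 0.003 | 0.229 | 0.190
Cheese smell | 0.237 | 0.050 | 0.258 | −0.058 | −0.037 | 0.379 | −0.241 | 0.830 | −0.403 | 0.846 | 0.478 | −0.652 | 0.090 | 0.358 | 0.164
Gassy smell | 0.695 | 0.537 | - | −0.099 | 0.155 | −0.132 | - | 0.138 | −0.037 | 0.104 | −0.464 | 0.192 | 0.341 | 0.180 | 0.146
Alcohol | −0.294 | −0.119 | 0.526 | 0.805 | −0.192 | 0.684 | 0.735 | −0.306 | 0.389 | 0.011 | −0.292 | −0.423 | 0.798 | 0.260 | 0.135
Soapy smell/medicinal | 0.134 | −0.387 | −0.256 | 0.865 | 0.279 | 0.795 | 0.928 | −0.370 | 0.354 | 0.865 | −0.099 | −0.056 | 0.075 | 0.070 | 0.105
Sourness | 0.605 | −0.037 | 0.532 | 0.540 | 0.320 | −0.213 | 0.123 | −0.149 | 0.061 | 0.241 | −0.056 | −0.414 | 0.667 | −0.287 | 0.092
Rancid | 0.932 | - | −0.407 | 0.694 | 0.038 | 0.038 | −0.108 | 0.423 | −0.397 | 0.568 | 0.589 | −0.132 | - | 0.077 | 0.058
Metallic | 0.466 | −0.117 | −0.169 | 0.307 | 0.700 | 0.915 | −0.601 | 0.543 | −0.311 | −0.061 | −0.153 | 0.776 | 0.087 | −0.527 | 0.028
Artificial fruity/floral | 0.597 | 0.579 | 0.310 | 0.359 | 0.441 | −0.337 | −0.058 | 0.103 | −0.217 | −0.219 | −0.137 | 0.517 | −0.054 | −0.054 | 0.024
Residual | 0.680 | 0.842 | −0.859 | 0.068 | −0.500 | −0.204 | −0.731 | 0.522 | −0.727 | 0.760 | −0.680 | 0.876 | −0.384 | −0.234 | −0.217
Median | 0.597 | 0.558 | 0.493 | 0.470 | 0.441 | 0.408 | 0.404 | 0.395 | 0.385 | 0.387 | 0.344 | 0.334 | 0.324 | 0.245 | 0.400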
Notes: Significant agreement is indicated in bold while opposed behavior is shown in bold and italic.
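
For readers who wish to reproduce an agreement index of the kind reported in Table 4, the Python sketch below computes the Pearson correlation coefficient between one panelist's sample means and the whole-panel sample means for a single descriptor. The values and the function name `pearson` are hypothetical, and the exact computation performed by SensoMineR may differ.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical sample means of one descriptor over four samples.
panel_means = [1, 2, 3, 4]
agreeing_panelist = [2, 4, 6, 8]  # same ranking, different use of the scale
opposed_panelist = [8, 6, 4, 2]   # reversed ranking

r_agree = pearson(agreeing_panelist, panel_means)
r_opposed = pearson(opposed_panelist, panel_means)
print(r_agree, r_opposed)
```

A coefficient close to 1 indicates that the panelist orders the samples as the panel does, even when using a different part of the scale, whereas a negative coefficient corresponds to the opposed behavior mentioned in the table notes.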
