This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

The granite processing sector of the northwest of Spain handles many varieties of granite with specific technical and aesthetic properties that command different prices in the natural stone market. Hence, correct granite identification and classification from the outset of processing to the end-product stage optimizes the management and control of stocks of granite slabs and tiles and facilitates the operation of traceability systems. We describe a methodology for automatically identifying granite varieties by processing spectral information captured by a spectrophotometer at various stages of processing using functional machine learning techniques.

A current trend in granite processing plants located in northwest Spain is the implementation of traceability systems to better control and manage stocks of slabs and end products resulting from the processing of a wide range of lithological materials, all considered as granite varieties from a commercial perspective. Cut and processed blocks have different mineralogical characteristics and origins, and this enormously complicates control over end products. Blocks are initially identified in the quarry by marks indicating type and origin and the edges of slabs, once sawn, are colour coded for identification purposes.

The market, however, demands end products of specific sizes and shapes (square and rectangular) and finishes with arrises and perpendicular edges. Consequently, the marks made on the edges of slabs are inevitably lost when these are cut using a diamond-disk saw.

We describe an expert system for identifying different types of granite by spectrophotometer-based colour characterization applied in all the processing phases until perfectly shaped and squared slabs are obtained, with the ultimate aim of improving the current discontinuous control and management system used in plants. The effectiveness of this approach in terms of analysing and characterizing stone types on the basis of colour has already been reported by other authors [

The classical methodology for classifying and identifying different varieties of granite is to analyse textural aspects in direct petrographic studies of thin sections. Other approaches are to study photomicrographs of rock in thin sections using digital image processing and texture analysis [

Our identification methodology is based on: (1) objectively characterizing stone colour using a spectrophotometer; (2) transforming the discrete reflectance data (collected by the spectrophotometer sensors in various sections of the visible-light region) into spectral curves in a smoothing process; and (3) resolving the classification problem using machine learning techniques for functional data. Our approach ensures objectivity and minimizes possible human error in the identification process associated with different perceptions of colour, observation times and object sizes.

The functional spectral information was processed using a functional linear regression model and a functional support vector machine (SVM) with a PUK kernel (see [

The article is laid out as follows. Section 2 describes the theory underlying the functional classification techniques used to handle spectral information collected by a spectrophotometer. Section 3 details both the methodology for automated granite rock identification and the stone processing phases to be integrated in the system. Section 4 describes the data processing results obtained for each implemented algorithm. Finally, Section 5 describes our main conclusions.

The resolution of classification, regression and principal component problems using statistical techniques is typically scalar or vectorial. The analysis of functions assumes a finite set of values [

In FDA, the first step is to perform smoothing to fit curves to a set of functional data. This process is described immediately below and the rest of the section describes the two FDA techniques used in our research to identify granite varieties from surface colour.

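Given a set of observations $x(t_j)$ at $n_p$ points $t_j$, the observations are treated as discrete values of a function $x(t)$ belonging to a functional space spanned by a set of basis functions $\{\phi_1, \ldots, \phi_{n_b}\}$, where $\phi_k$ is the $k$-th basis function and $n_b$ is the number of basis functions.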

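The smoothing problem now consists of determining the solution to the following regularization problem:

$$\min_{x} \sum_{j=1}^{n_p} \{z_j - x(t_j)\}^2 + \lambda \Gamma(x)$$

where $z_j = x(t_j) + \varepsilon_j$ is the result of observing $x$ at the point $t_j$, $\Gamma$ is an operator that penalizes the complexity of the solution and $\lambda$ is the regularization parameter. Writing $x(t) = \sum_{k=1}^{n_b} c_k \phi_k(t)$, the problem can be expressed as

$$\min_{c} \; (z - \Phi c)^T (z - \Phi c) + \lambda\, c^T R\, c$$

where $z = (z_1, \ldots, z_{n_p})^T$, $c = (c_1, \ldots, c_{n_b})^T$, $\Phi$ is the $n_p \times n_b$ matrix with elements $\Phi_{jk} = \phi_k(t_j)$, $R$ is the $n_b \times n_b$ matrix with elements $R_{kl} = \langle D^2 \phi_k, D^2 \phi_l \rangle$, and $D^{2}$ is the second-order differential operator.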

Of possible families of basis functions, we can mention the polynomials, the splines and, in the specific case of the Fourier family of functions, orthonormal basis functions, where the matrix
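The smoothing step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes an orthonormal Fourier basis, a second-derivative roughness penalty (which is then diagonal) and the closed-form solution $c = (\Phi^T \Phi + \lambda R)^{-1} \Phi^T z$. The function names, the wavelength grid, the value of `lam` and the synthetic reflectance curve are all hypothetical.

```python
import numpy as np

def fourier_basis(t, n_basis, period):
    """Evaluate an orthonormal Fourier basis (constant, sin, cos, ...) at points t."""
    t = np.asarray(t, dtype=float)
    omega = 2.0 * np.pi / period
    Phi = np.empty((t.size, n_basis))
    Phi[:, 0] = 1.0 / np.sqrt(period)
    for k in range(1, n_basis):
        m = (k + 1) // 2                      # harmonic number: 1, 1, 2, 2, ...
        wave = np.sin if k % 2 == 1 else np.cos
        Phi[:, k] = np.sqrt(2.0 / period) * wave(m * omega * t)
    return Phi

def roughness_penalty(n_basis, period):
    """Diagonal penalty matrix: R_kk = <D^2 phi_k, D^2 phi_k> = (m * omega)^4."""
    omega = 2.0 * np.pi / period
    m = (np.arange(n_basis) + 1) // 2
    return np.diag((m * omega) ** 4)

def smooth_spectrum(t, z, n_basis=23, lam=1e-8):
    """Return basis coefficients c and fitted values Phi @ c for one spectrum."""
    t = np.asarray(t, dtype=float)
    z = np.asarray(z, dtype=float)
    # Assumption: the period is one sampling step longer than the sampled range,
    # so the first and last points are not forced to take the same value.
    period = (t[-1] - t[0]) * t.size / (t.size - 1)
    Phi = fourier_basis(t - t[0], n_basis, period)
    R = roughness_penalty(n_basis, period)
    c = np.linalg.solve(Phi.T @ Phi + lam * R, Phi.T @ z)
    return c, Phi @ c

# Hypothetical 40-point wavelength grid (nm) and a synthetic reflectance curve (%).
wavelengths = np.linspace(360.0, 750.0, 40)
reflectance = 30.0 + 5.0 * np.sin(2.0 * np.pi * (wavelengths - 360.0) / 400.0)
coeffs, fitted = smooth_spectrum(wavelengths, reflectance)
print(coeffs.shape)
```

Each spectrum is then represented by its coefficient vector, which is what the classifiers operate on. A Fourier basis implies periodicity, so for real spectra the fit near the ends of the wavelength range should be checked.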

The classic formulation of a linear regression model is given as:

To estimate the function _{β}

Each observed function can be expressed as a function of other basis functions

Therefore, the prediction _{φψ}

In a classification problem, the response variable takes values in a finite set of values,
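One common way of using a linear regression model for classification is to regress class-indicator variables on the basis coefficients of each curve and assign a new curve to the class with the largest fitted value. The sketch below illustrates that idea only; it is an assumption about how the step could be done, not a reproduction of the formulation used in this work. With an orthonormal basis, the inner product between a curve and the coefficient function reduces to a dot product of coefficient vectors, which is why ordinary least squares on the coefficients suffices here. The function names and the small `ridge` term are hypothetical.

```python
import numpy as np

def fit_indicator_regression(C, y, n_classes, ridge=1e-8):
    """Least-squares fit of one-hot class indicators on basis coefficients.

    C: (n, n_b) array of basis coefficients, one row per curve.
    y: integer class labels in 0..n_classes-1.
    """
    C = np.asarray(C, dtype=float)
    y = np.asarray(y)
    X = np.hstack([np.ones((C.shape[0], 1)), C])      # intercept + coefficients
    Y = np.eye(n_classes)[y]                          # one-hot indicator matrix
    A = X.T @ X + ridge * np.eye(X.shape[1])          # small ridge for stability
    return np.linalg.solve(A, X.T @ Y)                # shape (n_b + 1, n_classes)

def predict_indicator_regression(B, C):
    """Assign each curve to the class with the largest fitted indicator."""
    C = np.asarray(C, dtype=float)
    X = np.hstack([np.ones((C.shape[0], 1)), C])
    return np.argmax(X @ B, axis=1)

# Synthetic example: three well-separated classes in a 2-coefficient space.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
labels = np.repeat(np.arange(3), 20)
coeffs = centres[labels] + 0.1 * rng.standard_normal((labels.size, 2))
B = fit_indicator_regression(coeffs, labels, n_classes=3)
predicted = predict_indicator_regression(B, coeffs)
```

Because the decision boundaries produced this way are linear in the coefficients, the approach can be expected to struggle when classes are not linearly separable.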

SVMs for classification [

Given a typical classification problem of two classes and a sample of data
_{i}_{i}_{i}

The above problem is quadratic with linear constraints, and so the Kuhn-Tucker optimality conditions are necessary and sufficient. The solution, which can be obtained from the dual problem, is a linear combination of a subset of sample points denominated support vectors (s.v.) as follows:

The classification rule is ŷ(_{w,b}

If the input space is included in a functional Hilbert space spanned by a set of basis functions, _{1},...,_{nb}

If the kernel has the general form _{k}x^{k}φ_{k}_{i}_{i}_{kl}_{k}_{l}

Of the many possibilities for selecting the kernel function [_{0} of the peak, and

The parameters
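For reference, the Pearson VII universal kernel (PUK) is usually written as $K(x_i, x_j) = 1 / \left[1 + \left(2\,\lVert x_i - x_j \rVert \sqrt{2^{1/\omega} - 1}\,/\,\sigma\right)^2\right]^{\omega}$, where $\sigma$ controls the half-width of the peak and $\omega$ its tailing. The following sketch evaluates that expression with NumPy; the function name and default parameter values are hypothetical. When curves are represented by coefficients in an orthonormal basis, the Euclidean distance between coefficient vectors equals the $L^2$ distance between the curves, so the same function can be applied to functional data.

```python
import numpy as np

def puk_kernel(X, Z, sigma=1.0, omega=1.0):
    """Pearson VII universal kernel between the rows of X and the rows of Z."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    # Squared Euclidean distances, clipped at zero to absorb rounding error.
    sq_dist = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Z ** 2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    dist = np.sqrt(np.maximum(sq_dist, 0.0))
    scaled = 2.0 * dist * np.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega

# Small synthetic check: Gram matrix of five random 23-dimensional vectors.
rng = np.random.default_rng(0)
samples = rng.standard_normal((5, 23))
gram = puk_kernel(samples, samples, sigma=2.0, omega=1.5)
```

The kernel equals 1 for identical inputs and falls to 0.5 at a distance of $\sigma/2$, whatever the value of $\omega$. A function of this shape can be passed as a callable kernel to SVM libraries that accept one, such as scikit-learn's `SVC`.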

The colour of granite varieties was characterized using a colour reflectance measurement instrument. As well as showing numeric information on colour in standard colour spaces (CIE L*a*b* [

The spectral information of the stone was recorded as a set of discrete points by a spectrophotometer. Parameters were specifically configured to enable optimal capture of the colour peculiarities of each sample of granite analysed.

A Konica-Minolta CM-700d/600d spectrophotometer was used, equipped with CM-S100w SpectraMagic NX software, D65 illuminant, 10° observer and target diameter 8 mm. The spectrophotometer recorded an integrated colour, the product of the reflectances of the different colours reflected in the same measurement and a direct function of the colour of minerals and grain size. The equipment (

A total of 48 specimens, each with a surface area of 50 cm^{2}, were used for data capture. They represented 16 varieties of ornamental granite (three specimens per variety) that are widely traded in the sector and differ in origin, colour and texture.

Granite is characterized by a heterogeneous surface in terms of colour and texture. When colour is captured with a contact measurement instrument such as a spectrophotometer, which spatially averages the light reflected from a fixed measurement area corresponding to the measurement aperture, measurements must be made at several points of the specimen to assess the overall colour of the stone. The ideal approach would be to choose the smallest possible measurement area whose colour is representative of the overall rock colour (the result of the contributions of different minerals in different proportions), with the limitation, however, that the measurement area is determined by the measurement aperture used [

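A total of 160 colour measurements were made at random points on the 48 specimens, yielding a sample $\{(x_i, y_i)\}_{i=1}^{160}$, where $x_i$ is the vector of reflectance values recorded in measurement $i$ and $y_i$ is the corresponding granite variety.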

In our research, the number of measurements analysed for each rock surface was not in line with the recommendations of Prieto ^{2} of rock surface when an 8-mm diameter measurement aperture is used. This was because granite varieties were not classified directly from the spectral information collected; rather, this information was processed by machine learning techniques for functional data, which, after suitable training and learning, acquired the ability to optimally solve classification problems with small samples. In particular, spectral information was collected for a set of 10 different measurement points, distributed randomly in the three available specimens, in order to classify each of the 16 types of granite to be identified. In the different measurements, each of the spectrophotometer sensors calculated the percentage of light reflectance in a strictly defined region of the visible-light wavelength range. Recorded data included, therefore, a great number of discrete variables.

Information processing was simplified by adopting a functional approach to the classification problem. Due to the nature of the information collected, considered a set of observations for a function in a finite set of values, it was necessary to perform a smoothing pre-process consisting of fitting the data to the nearest function representing them. This procedure simplified processing, with no loss of information, by reducing the number of state variables to 23.

The use of the Fourier orthonormal series as the basis functions in the smoothing pre-process (with

The sample generated in the smoothing process can be represented as the set $\{(x_i, y_i)\}_{i=1}^{160}$, where $x_i \in \mathbb{R}^{23}$ indicates the spectral functions and $y_i$ the corresponding granite variety.

To construct, compare and select the optimal algorithms for processing the spectral information, the sample was divided into a training set of 144 items, which underwent cross-validation to determine the optimal algorithm parameters, and a validation set of 16 randomly selected items representing each granite variety, used for the final validation of the system.
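A split of this kind can be sketched as follows. The example assumes that the validation set holds exactly one randomly chosen measurement per variety, which is one reading of the description above; the function name is hypothetical and the labels are synthetic.

```python
import numpy as np

def holdout_one_per_class(y, rng):
    """Return (train_idx, val_idx) with one randomly chosen item per class held out."""
    y = np.asarray(y)
    val_idx = np.array(
        [rng.choice(np.flatnonzero(y == c)) for c in np.unique(y)]
    )
    train_idx = np.setdiff1d(np.arange(y.size), val_idx)
    return train_idx, val_idx

# Synthetic labels: 16 varieties with 10 measurements each (160 items in total).
labels = np.repeat(np.arange(16), 10)
train_idx, val_idx = holdout_one_per_class(labels, np.random.default_rng(0))
print(train_idx.size, val_idx.size)
```

With 160 labelled measurements this yields 144 training items and 16 validation items, matching the sizes given above.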

For the 10-fold cross-validation, the entire training set was randomly divided into 10 disjoint sets; nine sets were used to train the model (for each range of variation in the internal parameters of the algorithm in question) and the remaining set was used to test the model. The optimal model resulting from the cross-validation process was selected on the basis of the average error rate for the 10 test sets generated in the process.
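The cross-validation loop can be illustrated with a short sketch. It is written for an arbitrary pair of `fit` and `predict` functions; the nearest-centroid classifier used here is only a stand-in so that the example runs on its own, and is not one of the models compared in this work. All names and the synthetic data are hypothetical.

```python
import numpy as np

def k_fold_indices(n, k, rng):
    """Randomly partition the indices 0..n-1 into k disjoint folds."""
    return np.array_split(rng.permutation(n), k)

def cross_val_error(fit, predict, X, y, k=10, seed=0):
    """Average misclassification rate over the k held-out folds."""
    folds = k_fold_indices(len(y), k, np.random.default_rng(seed))
    errors = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx], y[train_idx])
        errors.append(np.mean(predict(model, X[test_idx]) != y[test_idx]))
    return float(np.mean(errors))

# Stand-in classifier (nearest class centroid), used only to make the sketch runnable.
def fit_centroids(X, y):
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def predict_centroids(model, X):
    classes, centroids = model
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dist, axis=1)]

# Synthetic training set: 144 items, 16 well-separated classes of 9 items each.
rng = np.random.default_rng(1)
y = np.repeat(np.arange(16), 9)
X = np.column_stack([10.0 * y, np.zeros(y.size)]) + 0.1 * rng.standard_normal((y.size, 2))
cv_error = cross_val_error(fit_centroids, predict_centroids, X, y, k=10)
```

To tune internal parameters, the same function would be called once per candidate parameter value and the value with the lowest average error retained.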

Optimal processing of the data collected makes no sense unless the automated characterization application can be implemented in the industrial granite production process. To do this, it was first necessary to identify the different phases where data would be captured and processed so as to be able to characterize, on an ongoing basis, the different rock types handled in a plant.

The protocol for automatically identifying different granite types covers spectral information capture from each of the granite types to be characterized using a spectrophotometer and subsequent processing by the expert system on a laptop computer connected to the measuring equipment.

After the various possible treatment processes aimed at improving the visual appearance, texture and functionality of the stone (that is, polishing, bush-hammering, honing, flaming or sandblasting), initial spectral information for the slabs was collected that recorded the colour and brightness characteristics conferred on the rock by the different treatments. The expert system then directly classified the stone, provided that sufficient suitable spectral information was available. If the granite type had not previously been characterized by the system, the learning process and optimal selection of model parameters would have to be readjusted. Note that in this first phase of automated identification the degree of adjustment of the system could be checked at any time against the information provided by manually applied colour codes, which effectively acted as an expert supervisor of the system.

Up to the cutting phase, slabs are clearly identified through the manual marking system. Hence, our methodology is a complementary method at this stage, redundant with respect to the traditional system but necessary to adjust, update and provide feedback to the automated classification system on the basis of reliable information (colour codes), so as to ultimately be able to identify the lithology of each end product in subsequent processing stages when marks are lost.

The proposed expert system is particularly important for end-product identification and control at the packaging, storage and sale stages, as its power, flexibility and portability enable proper management of the product in the final stages of the supply cycle.

To ensure optimum operation of the automated identification system and to study how it could be applied in practice in the granite industry, we implemented and compared the two FDA techniques described in Sections 2.3 and 2.4 above (functional linear regression and SVMs for functional data) for classifying the 16 granite varieties represented by 48 specimens. More specifically, we analysed the results obtained by applying functional linear regression and a functional SVM (FSVM) with a PUK kernel to the functions obtained in the smoothing pre-process carried out on the original spectral information collected from the granite specimens.

A set of 100 basis functions was selected to perform the smoothing process, as this was the minimum number that provided a 99% fit between the 40 discrete points and the function.

The sample generated in the smoothing process, {_{i}_{i}_{i}

The poor functional linear regression results would indicate that the granite identification problem is not linear in nature, thereby justifying the use of non-linear and more complex functional machine learning techniques. The cross-validation methodology and the selection of a PUK kernel to implement the FSVM improved on the results obtained in previous research [

The low error rate obtained using the FSVM model highlights the great predictive power of the algorithm, its flexibility and adaptability to the resolution of non-linear problems [

The correct identification of different types of granite in processing plants right through to the end-product stage optimizes the management and control of slabs and tiles and facilitates the implementation of new traceability systems in the sector.

The traditional colour codes used to initially identify slabs are inevitably lost in subsequent processing phases. We have described a methodology that uses functional machine learning techniques to automatically classify rock at various stages of processing on the basis of spectral information captured by a spectrophotometer. Making the problem a functional one by smoothing the captured data enables all the information captured by the spectrophotometer to be analysed and evaluated and simplifies resolution of the granite classification or identification problem.

The good results obtained in processing spectral information using an FSVM with a PUK kernel indicate that this model is well suited for inclusion in the granite production process as a system for automatically identifying granite varieties. In addition to its great predictive power, the algorithm is highly flexible and can be updated to take new data into account.

At the industry level, it would be useful to implement a feedback and automatic application update procedure to make the adjustments necessary for the system to be able to correctly identify new varieties of stone and distinguish between stone types with similar mineralogical, textural and colour characteristics.

The main advantages of the proposed system are its functionality, flexibility and portability and the overall mixed-system approach to granite identification. The methodology developed in this research seeks to overcome the difficulties associated with the traditional method of manually marking slab edges for the purpose of characterization in subsequent processing stages. However, it complements, rather than overrides or replaces, this easy and rapid approach.

Vectorial spectral information captured for two granite specimens for 40 points corresponding to the visible area of the spectrum.

Reflectance curves resulting from the smoothing process for three granite specimens. Reflectance values are indicated as red squares.

Flowchart depicting the granite production process and indicating phases where manual colour coding can be complemented by automated characterization.

Granite slabs with manually applied colour codes (boxed area) used to identify slabs.

Mean training and validation error rates (percentage of poorly classified observations) for the two models.

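| Model | Training error (%) | Validation error (%) |
|---|---|---|
| Functional linear regression | 15.35 | 26.43 |
| FSVM with PUK kernel | 0 | 0.82 |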