A Novel Active Imaging Model to Design Visual Systems: A Case of Inspection System for Specular Surfaces

The use of visual information is a very well known input from different kinds of sensors. However, most of the perception problems are individually modeled and tackled. It is necessary to provide a general imaging model that allows us to parametrize different input systems as well as their problems and possible solutions. In this paper, we present an active vision model considering the imaging system as a whole (including camera, lighting system, object to be perceived) in order to propose solutions to automated visual systems that present problems that we perceive. As a concrete case study, we instantiate the model in a real application and still challenging problem: automated visual inspection. It is one of the most used quality control systems to detect defects on manufactured objects. However, it presents problems for specular products. We model these perception problems taking into account environmental conditions and camera parameters that allow a system to properly perceive the specific object characteristics to determine defects on surfaces. The validation of the model has been carried out using simulations providing an efficient way to perform a large set of tests (different environment conditions and camera parameters) as a previous step of experimentation in real manufacturing environments, which more complex in terms of instrumentation and more expensive. Results prove the success of the model application adjusting scale, viewpoint and lighting conditions to detect structural and color defects on specular surfaces.


Introduction
Scientific models and theories aimed at explaining the behaviour of specular reflections are sufficiently consolidated. However, the automatic processing, using artificial vision techniques, of scenes where there are specular surfaces, has problems that have not been solved yet. Vision systems designed to deal with specular objects must cope with the optical difficulties associated with this type of material [1]. The surfaces have a high reflection coefficient which causes undesired reflections and shine, concealing, in some cases, chromatic, morphological and topographical information about the object.
Most artificial vision techniques ignore specular reflections and focus on the diffuse component of the interaction of light with objects. Thus, these techniques, which are conceived to deal with Lambertian surfaces, produce wrong results when other surfaces are in the visual scene [2,3]. The solutions put forward in the literature adopt opposite ways of approaching the problem: developing methods to detect specularities in images, in order to take advantage of existing artificial vision techniques; and designing new vision techniques that explicitly deal with these surfaces.
I(x, y) := F(ρ), ρ = (ρ 1 , ρ 2 , ρ 3 , ..., ρ n ) ∈ P The components ρ i of the vector of scene magnitudes are measurable physical values involved in the process of image formation. They could be, in practice: scale, viewpoint, light intensity, frequency, saturation, etc. Each component could be modeled as a function depending on three inputs Equation (2): the subject of interest, m, in the scene (e.g., the object to be inspected), the environment, e, in which the subject is placed and, finally, the camera, c, that captures images from the scene.
The contribution of each element (m, e and c) can be expressed as three vectors made up of magnitudes: µ i related to the object Equation (3), i related to the environment Equation (4) and γ i related to the camera Equation (5). Intensity and wavelength of light sources, medium of transmission, relative position between scene and vision device, are examples of environment variables i . Regarding the camera contribution γ i , the variables are related to the sensor characteristics, optical and electronic elements: zoom, focus, diaphragm, size of the sensor, signal converters, etc. Finally, the reflectance, colour, shape, topography of the object are examples of object variables µ i . The values of each vector establish elements of the set M for the object, E for the environment and Γ for the camera. In Diagram 1 an outline of the magnitudes can be found.
To define our model, it is worth to remember the sensitivity as an important static characteristic of a sensor. The sensitivity is the slope of the calibration curve (see Figure 1). First, we can define the calibration curve as the function that maps a physical scene magnitude and its representation in the image space. Depending on the camera parameters the calibration curve will have a different function for different scene magnitudes. For instance, a camera with large depth of field will have a smoother calibration curve for intensity measurement (along a larger set of scene magnitudes for intensity, the camera will be able to distinguish between them) than a camera with short depth of field which will have a very abrupt calibration curve for the same scene magnitude (only a small set of intensity values will be distinguishable). With this function, sensitivity indicates the detectable output change for the minimum change of the measured magnitude. As a naive example, the detectable output could be the ability to perceive two different color that are actually different in the real scene. If the color change is too small, the sensor would return the same value of intensity. In our case, the detectable output in F for the minimum change of the scene magnitudes: Usually, the camera parameters are calibrated to a set of values γ i so that the sensitivity is optimized for all variables of ρ simultaneously with respect to a single metric (not necessarily maximized for each variable ρ i ). For example, a camera could be calibrated using a specific zoom, focus and sensor size optimizing the sensitivity of the system to perceive the colour of a subject in a wide range of object distances and viewpoints but the system is not optimized separately for distances or viewpoints. However, a given camera c has maximum sensitivity for a value of each ρ i . For convenience, we are going to define the tuning point as the corresponding point ρ s in the scene magnitudes space P for each camera of the set Γ in which the sensitivity of the VAS is the optimum (see Figure 1) . The sensitivity decreases in general for values of ρ differently from ρ s . In the same way, since the VAS depends on the contribution of the environment e, for each value of the set E, the corresponding point ρ t in the scene magnitudes will be named as the working point. The detected output of this point is related to the sensitivity curve of the VAS for each of the magnitudes ρ i because it restricts the limits where the system can work. In average conditions or in simple approaches to vision problems, its effect is usually considered to be negligible because perception takes place close to ρ s . This simplification is unacceptable in the case of adverse conditions as it occurs, for example, in dark environments, remote objects from the camera, or in the presence of specular objects that limit the capability of the VAS to perceive the scene.
Acquisition of the scene and representation on a plane carried out by F Equation (1) cause situations, related to vision in adverse conditions, where it is not possible to distinguish between the different scene magnitude vectors ρ that contribute to an image. The capacity to discern elements of P is related to the measurement in the image I of the magnitudes m, e, and/or c, that contribute to each of the components ρ i . A specific application could be interested in knowing the intensity lighting of a scene from the environment e, or the focal length from the camera c, for calibration purposes. However, generally, the contribution of the object m to the image, and, therefore, its magnitudes µ i , is the aim of the measurement. For example, as we can see in Figure 2, a VAS used to perceive color (µ 2 ) of objects is able to distinguish the different objects in the scene. However, if it is used to perceive shape (µ 1 ) using the same camera (including all variables γ i ), neither objects m 8 and m 9 nor m 10 and m 11 could be distinguished each other (as we mentioned above, a given camera has maximum sensitivity for a specific value of each ρ i ). Hence, capacity to discern magnitudes that contribute to the vector ρ can be delimited to distinguish elements of set M in the image (e.g., the colour of a region of an object, or the shape of a surface, or both, etc.). It is important to understand that the scene magnitudes are continuous, not discrete. The representation in the Figure 2 shows a set of magnitudes that are distinguishable in the space of the image I = F(ρ), but it is an example of the certain magnitudes in the continuous space of magnitudes P.
An object could be distinguishable from another one in the VAS when they are distinguishable in the measurement performed in F (i.e., in the vertical axis of Figures 1 and 2A). Let Ω m i be the set of objects that can be distinguished for an specific object m i , and let χ the minimum difference perceptible by the system (sensitivity), then Ω m i could be established as: Figure 2B, shows the measurement performed in F analysed from the point of view of the object space M considering just colour and shape. Objects close to m 1 , in the yellow area, are not distinguishable, for that object in this example. However, they are distinguishable for the object m 2 (light green area). Following this example, m 10 ∈ Ω m 9 but m 10 ∈ Ω m 11 , if the measurement performed in F takes into account both colour and shape.
A VAS has to deal with different situations that are a consequence of the subsets of P delimited by the problem to be solved (e.g., delimited by objects or by their characteristics to be analysed, or by environments, or by the cameras, etc.). A minimum value of sensitivity χ can be established in which any object is distinguishable from another considered in the acquisition (see Figure 1). The values of the vector of magnitudes ρ in which sensitivity is higher than a threshold χ conform the subset S of P defined by Equation (8).
There are no perception difficulties for situations of the subset S ⊂ P. They involve values of environment and camera, in addition to characteristics of objects, that make up magnitudes ρ of the set S Equation (8). In other words, it is possible to distinguish the images of two different objects (m i and m j in Equation (9)) from the acquisition performed by F if the VAS is working on points of S ( m i ρ denote a scene magnitude vector whose component m is the object m i ). Vision systems working on scene magnitudes of the complementary of S, S c , (P \ S) present the aforementioned adverse conditions to distinguish characteristics of objects from images. Hence, solutions have to be provided to achieve distinct images from different objects that the camera perceives as the same one. These solutions should be able to compensate the low sensitivity in S c (sensitivity less than χ in Figure 1). Among the three variables that provide values to the scene magnitudes, object is a constant due to them being the subject of interest. However, environment conditions and camera characteristics could be modified to set up scene magnitudes ρ of the subset S. Thus, the VAS is going to be able to have different images from different objects in the scene in order to distinguish them Equation (9). For this purpose, we propose two complementary alternatives (Diagram 2): • System Calibration (calibrating the system). This alternative tries to minimize the distance between the working point ρ t and the tuning point ρ s . • Measurement Enhancement (conditioning the measurement). Enhance the target measurements or parameters. This alternative can be considered as conditioning the measurement performed by the VAS.
System Calibration consists of shifting one of the points so that the working point be an element of the set S. The goal could be generating a new image of the object m i by a transformation Υ S Equation (10) in which the working point (ρ m i k ) be close enough to the tuning point. Do not confuse calibrating the system with calibrating the camera. The alternative presented here could be calibrating the sensor as well as changing other parameters of environment or object of interest. Hence, to carry out this alternative it is necessary to adjust the environment to shift the working point, for example moving the object closer to the camera, adjusting lighting conditions, etc. (Figure 3 shows an outline of this process). On the other hand, in order to shift the tuning point (see Figure 4), the camera could be recalibrated or new acquisition equipment can be used (traditionally this is done by replacing the camera with a more suitable one). In this case, subset S of P changes for the new camera from S o to S n .
For the second alternative, Measurement Enhancement aims to directly influence the sensitivity curve of the acquisition system. In a nutshell, it tries to somehow highlight or enhance the parameters ρ which want to be perceived. It can be carried out by means of two new alternatives. First, the output signal of the perception system can be amplified. That is, the classic conception of measurement system amplification at the signal conditioning step. Limitations of this technique are related to increasing the amplitude of the signal in ranges of minimal sensitivity (both minimum and maximum of the range of the sensor because the contribution of the object in the output signal is insignificant, the signal-to-noise ratio is very low). In this way, acquisition system improvements are limited because they are only applied in ranges of intermediate sensitivity.  The other alternative of enhancing the target measurement is increasing the differences of the values of the input magnitudes. This is to operate with large differences (η) of the input (ρ) to increase the differences of the output F for different objects (m i and m j ) until the differences are perceptible at the output (m j ∈ Ω m i ). Figure 5 schematically shows this concept. Elements used as input magnitudes of the set P to get large differences of the input make up the subset A in Equation (11). The goal is to reduce the number of possibilities that the VAS has to deal with; for example, restricting camera positions, viewpoints or lighting characteristics (in this paper, a lighting pattern has been used to inspect specular surfaces).
We model the transformation able to increase the differences of the values of the input magnitudes as Υ C :    The techniques are not exclusive and can be used together for designing vision system in which images of different objects can be distinguished. An example of the use could be a system which needs to perceive two colliding objects separately through a sensor with a fisheye lens. First, a Υ S transformation by means of calibrating the camera to reduce the distortion could be applied. After, Υ C could mean to colorize the objects to enlarge the perceived difference between them.

Method for Inspecting Specular Surfaces
In this section, we are going to use the previous model to specify a method for inspecting specular surfaces. The objective of an automated visual inspection system, AVI, aims to determine if a product differs from the manufacturer's specifications. This implies the AVI has to measure the object magnitudes in the scene in order to compare them with values of the magnitudes established in the design step of the product (e.g., reflectance, colour, shape, topography).
Two sets of objects are considered for modelling the AVI: M P and M I . The first one is composed of objects that are made up by magnitudes µ i defined in the manufacturing specifications. The set M I is composed by objects to be inspected looking for any deviation from objects of the set M P . These deviations cause, depending on the magnitude, different defects: morphological, chromatic, topographic defects, etc. The union of M P and M I is the subset M S = {M P ∪ M I } ⊂ M Equation (3) of the possible objects that the inspection system must consider.
The inspection goal could be modelled as in Equation (13). The AVI has to decide whether an object m i of the set M I could be distinguished (9) In order for the deviations (η) of the object magnitudes µ m j i from µ m k i to be detected in the image Equation (13), contributions from the environment and the camera to ρ must allow a suitable sensitivity to the AVI.
Since it was previously shown that the sensitivity of the VAS (AVI in this specific case) is optimized simultaneously for all variables of ρ, AVI sensitivity is not necessarily maximized for each variable ρ i . It is not maximized for each object m and each of its variables µ i . Moreover, generally an AVI is designed for performing a measurement of a subset of the object magnitudes µ i in the image (i.e., colour, shape). Therefore, the sensitivity of the AVI can be very low for some of the magnitudes µ i to be measured. That is, it is possible that the perception be suitable for measuring the surface colour or shape but it could not be suitable for both together. In other words, the intended measurement determines the perception capacity of the AVI. Calibration parameters or environment conditions must be adjusted to adequately perceive magnitudes µ i . This process requires great knowledge of the problem and accuracy for the solution.
For specular surfaces, the difficulty to perceive in scene magnitudes of S c is a consequence of the surface reflectance. The environment conditions produce an effect in the perception of the object being more important than other types of surfaces (e.g., Lambertian surfaces). For example, if it is considered that the calibration parameters γ i are the same for two different images, the difficulty to perceive with different environment conditions is given by:

•
The mirror itself: a given camera can be confused so that it cannot distinguish between the environment and the object. The spatial modulation of the environment contribution creates the illusion of the objects in a scene.

•
The lighting of the environment can cause shine on the surface and confuse the two images with different objects. For example, the image formed by a grey surface with a high reflection coefficient illuminated by white lights can be confused with the image of a lighter surface object.
The specular reflection causes the working point ρ t be easily located at the limit of the range of the VAS, at the maximum value of ρ that can be measured. Sensitivity is very low for the perception of objects in these conditions because specularity saturates the camera sensor.
The viable solution for compensating the lack of sensitivity produced in the ranges of the scene magnitudes ρ (in order to obtain an improved image) is related to the System Calibration that performs the perception system (see Section 2). It is necessary to increase the differences in the values of the input magnitudes, ∆ ρ , to raise the differences of the output magnitude until they can be measured (Equation (12)).
In other words, using the resolution (another important sensor characteristic that indicates the smallest change in the magnitude being measured that the sensor can detect, e.g., the smallest feature size of an object or the smallest change in colour that the VAS can distinguish), since the resolution in those ranges is very low, it is necessary to force large differences at the input. Thus, it is necessary to work using a subset of scene magnitudes A ⊂ P in order the perception will be suitable in the points around the tuning point and it will be facilitated in the ranges of sensor saturation. Differences in F will enable the perception among objects using the set S A in Equation (14). The elements do not necessarily correspond to those of subset S Equation (8).
In addition, if the Measurement Enhancement is not sufficient to discern the defects in the inspection, the distance between the working point ρ t and the tuning point ρ s must be minimized using the Equation (10). In this case, minimization is performed on the input magnitudes of the set A. Therefore, the transformation Υ S operates with the values of the input magnitudes S A Equation (14). In consequence, Equation (15) models the proposed solution for inspecting specular surfaces combining the two proposed transformations Υ S and Υ C according to the objects to be inspected and the possible defects to be detected (see Figure 6).

Conditioning the measurement
Minimizing distance between tuning and working point Objects Defects Image Compuer vision tasks

AVI Method Controlling Environmental Parameters
As it was previously shown, increasing the differences of the input magnitude values ∆ ρ can be performed by means of the control of the environmental conditions or of the camera parameters. These are two input parameters of the components ρ i Equation (2) that contribute to the formation of the image I Equation (1) and are variables that can be affected by the system. The third input parameter, the object, is considered as a constant because it is the object to be inspected.
It is known that the camera, as a photoelectric transducer, provides a measurement related to the scene radiance (see Figure 7). It is a function of environment and object magnitudes. The contribution of interest to the radiance is the radiance coming from the inspected object. This radiance, L R , is related to the object reflectance, f R [66] and the irradiance E incident on the surface according to Equation (16).
Reflectance f R contributes the necessary information about the behaviour of the light interacting surface of the object. Irradiance, the second factor of Equation (16), is related to the environment variables, in fact, to be precise, to the electromagnetic radiation that reaches the surface of the object. These magnitudes affect the contribution of the object in the camera.
Controlling the luminous energy of the environment, a function of the lighting sources and the transmission modulations, is a way of working directly with input magnitudes of the system. Without considering any other perception characteristics, irradiance is the key. Structuring the energy that reaches the object is a method of affecting the environment or the object. It enables areas of the surface of the object to be isolated, the contrast in the camera to be increased or decreased, etc. Thus, this variable allow us to force large differences at the input to perceive changes in the output image. Measurement Enhancement is the task of the transformation Υ C Equation (15). In this paper, we propose a instance Υ Φ of Υ C focused on lighting conditions of the environment. The transformation Υ Φ structures the lighting in order to establish regions on the object radiated by different spectral powers Φ. Areas of different radiances are formed in this way. The regions can be formed by means of spatial modulation, forming a grid. Also, a sequence of lightings using temporal modulation or a combination of both can be used. The regions contribute with different radiances to the input of the camera establishing independent areas in the image. The increasing of the differences at the input of the system is produced in space or time domain Equation (12).
In this paper, the transformation Υ Φ is carried out by means of spatial modulation (space domain). The different areas in the image are formed with characteristics proportional to the irradiance E. The object irradiance E actively affects the perception process. The environment parameters are established so that great lighting gradients on the object are formed. The projection of the radiance L, which arrives at the optics, forms an image with regions. The projection of the radiance L, which arrives at the optics, forms an image with regions. The gradient of the image is a function of the one generated on the object. The greater the gradient of the pattern, the greater the gradient formed on the photodetectors. Then, a larger difference among adjacent photodetectors is obtained. Therefore, each photodetector has a spectral power associated in an instant (that is the function of irradiance and object characteristics).
Structuring the lighting enables the projection of a pattern on the surface object. This pattern is deformed by the object characteristics. It may be considered that the irradiance E is modulated by the object. Then, any other object modulates the generated pattern in a different way, and, therefore, modulates the spectral power, which is received in the space or the time by each photodetector, in a different way (in terms of inspection systems, any object with defects will modulate the generated pattern in a different way than the same object without defects). Controlling the input values of the perception system, for each photodetector, enables a reduction in the elements of the set of scene magnitudes that the system has to deal with. In addition, the pattern has to be configured so that the differences of the output are perceptible according to the magnitudes of the object (shape, colour, topology, etc.) to be perceived.
Control of energy that reaches the surface of the object is needed in order to design a certain pattern. The task depends on the number of environment lights, the spatial distribution of lighting, the wavelengths that conform each of the sources, the time, the modulations of transmission, etc. Moreover, the pattern of spectral power Φ could be different in order to inspect a specific magnitude µ i of the object m. Then to this purpose, the transformation Υ Φ will determine the spectral power Φ ∈ P as a function of µ i ∈ U and m ∈ M: For practical considerations, it is interesting that the function of energy establishing the spectral power Φ is established in terms of the field radiance L f by considering four parameters: s ∈ S, ∆ ∈ D, ξ ∈ X and t ∈ R (see Figure 8).
Hence, the transformation Υ Φ could be defined by: The regions of lighting ROL on the surface of the object can be determined from the parameter s. It is a function that establishes the morphology of the regions of the pattern formed on the surface. Parameter ∆ determines the set of lighting characteristics that radiates each of the established regions ROL. The function ξ determines the spatial configuration of energy reached by the object. The task of the function is to distribute the lighting characteristics of the set ∆ over each of the regions determined by s. Finally, the function depends on time t. If the structured lighting is temporal, it is necessary to generate a sequence of patterns. Moreover, for practical reasons, we define ROG as the regions of lighting on the source. In the same way as ROL, the ROG is the morphology of the regions of the pattern conformed on the lighting source, in this case (see Figure 8).

Experiments
In this section, experiments performed by simulations to validate the model to deal with specular surfaces are presented. Specifically, the objective is to prove whether the transformation Υ Φ is able to compensate for the lack of sensitivity that takes place in certain values of the magnitudes of the scene. In other words, probing whether the transformation Measurement Enhancement, by actively controlling the lightning pattern, is able to detect object characteristics (e.g., defects in visual inspection) that are not detected in other conditions. The experiments are mainly based on the scale of perception and the point of view as magnitudes of the scene.
We propose the use of simulation based on virtual imaging as an efficient way to perform a large battery of tests (different environment conditions and camera parameters) as a previous step of experimentation in real manufacturing environments, more complex in terms of instrumentation, and thus more expensive. Then, in this section, experiments performed by simulations to validate the model to deal with specular surfaces are presented.

Experimental Setup
An extensive explanation of the experimental setup is shown. It integrates the simulator, the subject of interest, and the environmental parameters that allow to design the lighting patterns.

Simulator
A simulator has been implemented considering the model presented in Section 2. It is able to synthesize images by F(ρ) Equation (1). The components of ρ i Equation (2) of the scene magnitudes (scale, viewpoint, light intensity, etc.) are modelled by the characteristics of the subject of interest, m, the environment, e, and, finally, the camera, c (see Diagram 1).
According to the characteristics µ i of the subject of interest m, the simulator is basically interested in the reflection characteristics of the surfaces. The light reflection on an object depends on the atomic nature of the surface material and on the scattering and dispersion that takes place when the light contacts it. Consequently, in order to determine the reflected flow densities, the spatial configuration of the surface µ S and its electromagnetic properties, which determine the type of material, are taken into account. These properties provide the refraction index µ n of the surface, which represents the macroscopic optical properties required.
Regarding the environment e, the problem of specifying it is limited to the specific application of inspection. Therefore, in the process of transmitting the light signal from generating sources to the image, the different lighting sources are considered together with the modulations that the signal experiences, with the exception of those referring to the object to be inspected and those of the acquisition system. As environment variables, a distinction will be made between those related to lighting sources (that make up the fundamental element in the environment magnitudes) and the relative positions of the different elements involved in the scene. In order to carry out the study of local light reflection, the Bidirectional Reflectance Distribution Function (BRDF) presented by Cook and Torrance [67] has been used because different studies show its great capacity to adapt to reflectance extracted from objects [68].
Finally, variables related to the characteristics of the perception system include both the functions that define the optics and those related to the sensor itself. The aim of this research is to study the influence of environmental conditioning on the perception of the object. Then, controlled experimentation aimed to measure the radiance coming from the subject of interest is performed. Therefore, an ideal perception system is considered (images without noise, constant sensitivity for all spectral powers, constant diaphragm, etc.). It influences the radiometric measurement only with geometric variables like the focal distance and the size of the sensor. The other variables have been set to be constant.

Subject of Interest
Plane objects are considered in the experiments. They permit a simple control of the angle formed by the elements involved in the scene on all regions of the object. Control is necessary due to that fact that one of the aims of experimentation is to carry out a thorough test of the viewpoint.
The set M P of the objects defined in the manufacturing specifications is made up of planes of 12 mm × 12 mm. The function µ S establishes the points in a coordinate system that is local to the object independently of microscopic shape. The surface roughness is calculated by means of the Beckmann distribution model [69] by coherence with the BRDF model used. The root mean square is assigned to 0.1 (rms, or m in [67]). According to electromagnetic characteristics, there are two types of plane: dielectric and metallic. Dielectric planes have a refraction index µ n of 1.6 and metallic ones (base on chromium) have a refraction index µ n of 2.8 and an extinction index µ n i of 3.2 using the Fresnel equations. Also, other variables of the Cook-Torrance model such as a specularity coefficient (s in [67]) of 0.75 and a constant diffuse component (R s ) associated to a RGB of (0.6,0.6,0.6) have been considered.
The set M P to be inspected is made up of objects with the same characteristics of M P but including defects. Three types of defects have been distributed over these planes (see Figure 9): two changes in topography (a 0.6 mm-diameter crack or crater in Figure 9a,b and a change in colour (an area of 0.6 mm × 0.6 mm in Figure 9c). In this last case, a constant diffuse component R s associated to a RGB of (0.4,0.4,0.4) on a surface measuring 0.6 mm × 0.6 mm is established. In total, for each plane in M P , 3 planes (1 per type of defect) with 25 defects have been considered in M P . Table 1 summarizes the data used in the experiments.

Environment
Environment characteristics play an important role in validating the transformation Υ Φ Equation (19). Specifically, the goal is to study the transformation that configures the lighting of the scene to establish regions on the object surface radiated by different spectral powers by means of spatial modulation. Therefore, we have considered different spatial configurations of the energy that reaches the object surface according to different gradients by considering different s, ∆ and ξ.
Since the objects are planes, the regions of lighting ROL formed by s are established on a plane. In the experiments, the parameters are established from the lighting source domain using the ROG (see Figure 8). The set of lights conform a grid so that the whole lighting extension can be defined for different conditions. Different areas of the regions of grid ROG are considered 0.1 mm × 0.1 mm, 0.2 mm × 0.2 mm, 0.3 mm × 0.3 mm and 0.6 mm × 0.6 mm to determine the influence of the size of the regions inside the areas of defects. Then, the ratios between the size of the region and the size of the defect in one of the dimensions (0.6 mm) are: 6 (0.6/0.1), 3 (0.6/0.2), 2 (0.6/0.3) and 1 (0.6/0.6).
The set ∆ Equation (20) is made up of lighting sources using the same characteristics (polarization, power, etc.) except the wavelengths, δ i , to conform different spectral powers, Φ δ . The wavelengths used are from the visible electromagnetic spectrum. Also, lighting sources only radiate for specific wavelengths (monochromatic lights).
Finally, the function ξ determines the spatial configuration of energy emitted by the lighting. The lighting characteristics of the set ∆ are distributed over each of the regions ROL determined by s establishing different gradients: spatial and amplitude. In the experiments, for practical considerations, the ROG composed of a squared grid has been used ( Figure 10).
In this paper, four different configurations of lighting are considered: two for spatial gradients (ξ x and ξ xy ) and two for amplitude gradients (ξ L and ξ I ). Regarding the former two, the function ξ establishes spectral powers of ∆ Equation (20) into two different spatial distributions. Specifically, in the experiments the spatial gradient established by ξ is organized in one direction, ξ x , and in two directions, ξ xy , (see Figure 11). Let ROG(x, y) be the region of column x and row y of the lighting grid and let N x , N y be the column and row of neighbouring regions of the grid. A function ξ will be defined as ξ x if it only sets up different lighting characteristics in adjacent positions of an axis of the grid (Figure 10a) and the same ones in adjacent positions of the other axis of the grid. Then, any region in the grid ROG is assigned an element of the set ∆ such that: For the second spatial gradient, a function ξ will be defined as ξ xy if sets up different lighting characteristics in all adjacent positions of a region of the grid (Figure 10b). Any region in ROG is assigned to an element of the set ∆ Equation (20) such that: Regarding the amplitude gradients, two configurations are also considered for ξ: Linear, ξ L , and Interlaced, ξ I . A function ξ will be defined as ξ L if sets up lighting characteristics with close wavelengths in adjacent positions N of the regions of the lighting grid ROG (see Figure 11a) whereas the Interlaced, ξ I , configuration maximize the differences among wavelengths in these positions (see Figure 11b). Then, in case of ξ L , any region in ROG is assigned to an element of the set ∆ Equation (20) such that: In case of any function that is considered as an Interlaced function, ξ I , the ROG is assigned such that: Figure 11. Amplitude distribution considered for function ξ in the experiments.
Different transformations Υ Φ are formed using combinations of functions ξ accomplishing different spatial and amplitude gradients: Linear X (ξ L,x ), Interlaced X (ξ I,x ), Linear XY (ξ L,xy ), and Interlaced XY (ξ I,xy ) to carry out the experiments (see Table 2).
The function Linear X (ξ L,x ) is made up by using a function meeting a spatial gradient in one axis of the grid Equation (21) and an amplitude gradient formed by Linear configuration Equation (23). This function uses an ordered sequence ∆ L of the set ∆. Let Φ δ i be an element of ∆ that radiates energy with the wavelenght δ i . Let δ i and δ e be the minimum and maximum values and let δ n be the number of wavelengths considered in the sequence, then ∆ L is: An element of the sequence ∆ L is assigned to the region (x,y) of the grid lighting ROG by the function ξ x : The differences of wavelengths among neighbouring regions are constant in one of the axes of the grid.
The function Interlaced X(ξ I,xy ) is made up by a function that accomplishes a spatial gradient in one axis of the grid Equation (21) and an amplitude gradient formed by the 'Interlaced' configuration Equation (24). In this paper, this function uses an ordered sequence ∆ E of the set ∆ in which a maximum difference of the wavelengths between adjacent positions is established. Let ∆ t and ∆ b be the top half and bottom half of ∆ L Equation (25).  Table 2. Lighting configuration using combinations of the spatial and amplitude gradients.

X (ξ x ) XY (ξ xy )
Amplitude gradient Then, the sequence ∆ E combines ∆ t and ∆ b in an interlaced manner: An element of the sequence ∆ E is assigned to the region (x,y) of the grid by the function ξ x Equation (26).
The function 'Linear XY' accomplish a spatial gradient in two axes of the grid Equation (22) and an amplitude gradient formed by the 'Linear' configuration Equation (23). This function uses the ordered sequence ∆ L Equation (25) of the set ∆. An element of the sequence ∆ L is assigned to the region (x,y) of the lighting grid by the function ξ xy : Finally, the function 'Interlaced XY' is defined as meeting a spatial gradient in two axes of the grid Equation (22) and an amplitude gradient formed by the Interlaced configuration Equation (24). This function uses the ordered sequence ∆ E Equation (28) of the set ∆. An element of the sequence ∆ L is assigned to the region (x,y) of the grid by the function ξ xy Equation (29).
Also, a reference lighting configuration is defined. It permits the comparison of the improvement produced by enhancing the target parameters to measure. In this case, all the regions of the grid have the same characteristics. The set ∆ is built with one element (monochromatic or polychromatic lights).
The parameters used in the transformations Υ Φ are summarized below (see Table 3). The lighting covers the whole surface (12 mm × 12 mm). The areas of ROLs generated by s are 0.1 mm × 12 mm, 0.2 mm × 12 mm, 0.3 mm × 12 mm and 0.6 mm × 12 mm for the functions 'Linear X' and Interlaced X. The 'Linear XY' and 'Interlaced XY' functions from ROL have areas of 0.1 mm × 0.1 mm, 0.2 mm × 0.2 mm, 0.3 mm × 0.3 mm and 0.6 mm × 0.6 mm. According to lighting characteristics ∆, 'Linear X' and 'Linear XY' use the sequence ∆ L Equation (25) with 10 elements for all cases whereas Interlaced X and 'Interlaced XY' use the sequence ∆ E Equation (28) with 120, 60, 40 and 20 wavelengths uniformly distributed from the visible electromagnetic spectrum (380 nm to 780 nm). Finally, the function ξ xy is used for 'Linear XY' and 'Interlaced XY'. It establishes (120 × 120), (60 × 60), (40 × 40) and (20 × 20) ROLs on the plane. 'Linear X' and Interlaced X are made up using the function ξ x establishing 120, 60, 40 and 20 ROLs according to the areas of the regions considered. Figure 12 shows the different lighting patterns used in the experiments.

Performance Results
In order to obtain the performance results, the components ρ i Equation (2) considered are scale ρ E , angle ρ θ and intensity lighting ρ I . They are the most influential scene variables in the image formation and in the characteristics of the visual inspection systems.
The system is tuned to a set of scale values ρ E : 1 pixel/mm, 2 pixels/mm, 5 pixels/mm, 10 pixels/mm and 15 pixels/mm. The set of the component angle ρ θ , as the angle formed by the surface normal and the camera normal, is formed by the sequence from 0 • to 90 • with an increase of 10 • (10 different angles in total). Finally, the intensity lighting ρ I is the basic parameter measured by the camera and is related to the solution proposed for increasing the perception capacity of the system. A large set of camera, environment and object variables take part in the formation of this magnitude. The goal of the tests is to study the intensity lighting according to environment values related to the proposed transformations Υ Φ and not to influence any other variables. However, the angle formed by the surface normal and the lighting plane normal is very important in the design of inspection systems. It affects the intensity lighting ρ I . Therefore, it is considered as this parameter. The values contemplated in the tests are from 0 • to 90 • with differences of 10 • (10 different angles in total).
In order to measure the effectiveness of the transformation Υ Φ to inspect, the performance is calculated as the number of different pixels between the image of an object without defects and the image of the same object with defects. Specifically, this difference divided by the estimated number of pixels containing the defects, for a specific resolution, measures the success rate. The calculation of this rate is performed in all the possible combinations of the variables ρ i contemplated previously. A total of 68,000 images have been synthesized. That is, 17,000 images from objects without defects M P and 51,000 (17,000 × 3 types of defects) images from inspection objects M I with 3 types of defects (2 in topography and 1 in colour). The 17,000 images are the images of 17 lighting configurations (4 Υ Φ functions of environment condition using 4 different lighting areas defined by s and a reference lighting configuration Equation (30)) conditioning the measure of 2 objects (dielectric and metallic) for 500 values of vector ρ (5 ρ E , 10 ρ θ and 10 ρ I ).
The experimental results of the average success rates for scale ρ E , angle ρ θ , lighting ρ I and the influence of the size of lighting s regions are detailed in the next sections.

Scale of Perception
The study of the perception scale makes discerning the influence of the size of the defects in the image possible. Figure 13a shows the success rates according to the dielectric object. The function 'Interlaced XY' offers the best success rates whereas the 'Linear X' offers the minimum of the transformations Υ Φ . The average difference between the best function and the reference is 4.33%. The data from functions ('Linear XY' and 'Interlaced X') that uses only one of the maximum proposed gradients, scale or amplitude, is very similar. The average difference is only 0.16%.
The differences in success rates according to metallic object are more noticeable (see Figure 13a). The function 'Interlaced XY' shows similar success rates to the case of dielectric material. The values are practically independent of the type of material. However, the success rates of the reference lighting significantly decrease to values between 46% and 57.9%. This is an average difference of 5.85% lower than the success rate of the dielectric case. Therefore, the improvement of the capacity of perception of the system is better using the alternative Measurement Enhancement. It differs more than 10% using the best function (Interlaced XY). Also, the results show an increase in sensitivity using spatial distributions organized in two directions (ξ xy ).
It is interesting to consider the shape of the curve in Figure 13a. The graph represents the success rate of the system as a function of the scale. This is pixels per millimetre and not pixels per defect. The defects used in the tests have a maximum size of 0.6 mm for one of their three dimensions. Therefore, the scale values ρ E correspond to 0.6, 1.2, 3, 6 and 9 pixels per dimension of the defect. For the minimum scale (1 pixel/mm), a crack or crater defect is projected into 0.28 pixels 2 and a chromatic defect is projected into 0.36 pixels 2 . This corresponds to an area of 7.065 and 9 pixels 2 respectively in the image of a defective plane (25 defects, see Figure 9). Then, the differences of 1 pixel in the image suppose that success rates vary 11.11% or 14.15%. The ratio of success rate to pixel is very large. Conditions for the next scale (2 pixels/mm) are similar. In this case, differences of 1 pixel in the image means the detection or not of a defect. The topographic defects are projected into an area of 1.13 pixels 2 and the chromatic ones into an area of 1.44 pixels 2 . Differences of 1 pixel for the rest of the scale values mean a lesser impact on the detection. ρ E (pixels/mm) In short, the average success rates show higher scale ρ E , more capacity of perception of the defects both for dielectric and metallic materials, avoiding short scales (1 pixel/mm and 2 pixels/mm) due to the characteristics mentioned in the previous paragraph.
Comparing the functions Υ Φ and the reference function, according to the capacity of perception, the data shows that a greater adjustment of the scale is required using the latter lighting. For example, if the minimum threshold χ for determining the defect in the dielectric material (this parameter will depend on the type of application) is 55%. Using the reference lighting, approximately more than 2 pixels/mm are necessary to perceive the defects whereas with 'Interlaced XY' lighting only 1 pixel/mm is needed. In the case of metals, greater adjustment of the system is required. For the same threshold χ, more than 9 pixels/mm would be necessary using reference lighting and only 2 pixel/mm for each of the functions Υ Φ proposed. The best lighting, 'Interlaced XY', makes a system tune to only 1 pixel/mm to detect the possible defects.

Angle of Perception
The success rates of the different transformations Υ Φ according to the perception angle ρ θ are shown in Figure 13c,d, for the dielectric and metallic objects respectively.
The study of the average of the success rates shows that the best conditions to perceive according to ρ θ are in the interval [0 • , 40 • ] for all lighting configurations. The function 'Linear X' offers the worst results of the transformations Υ Φ even though their differences are minimum, an average of 2.85%. The functions 'Linear XY' and 'Interlaced X' present a similar behaviour. The average difference between them is 0.77% and it reaches a maximum of 1.12% for an angle of 20 • .
The best results are provided by the function 'Interlaced XY'. The improvement in the capacity of perception reaches the maximum differences, an average difference of 11.4%, in the interval [10 • , 40 • ] compared to using the reference lighting. According to the material type in this interval, the differences are 6.84% and 16.05% for the dielectric and metallic case respectively. This is due to the fact that the success rates significantly decrease from inspecting dielectrics materials to metallic one, an average of 8.99% in the interval. Also, the differences in the success rates between the transformations Υ Φ that light the metallic objects are more noticeable.
An angle of 20 • formed by the surface normal and the camera normal sets the maximum success rate. Using the function 'Interlaced XY' for this angle, similar results are obtained according to the type of material: 95.05% for dielectric materials and 95.19% for metallic one. Furthermore, for the metallic object in this angle, the difference between the 'Interlaced XY' and the lighting reference is 16.53%.
The success rate and the differences between the transformations and reference lighting decrease from 40 • to reach 8.2% at 90 • . In the latter case, it is only possible to detect convex defects.
Comparing the functions Υ Φ and the reference function, according to the capacity of perception, the data shows that the increase in the number of the magnitudes of a scene which the system is able to perceive is more significant than in the study of the scale. For example, if the minimum threshold χ for determining the defect in the dielectric material is 80% using the reference lighting, the angle ρ θ must be less than 30 • to perceive the defects. In this case, using the 'Interlaced XY' lighting, the angle can increase by up to approximately 43 • . Hence, the increase is more than 10 • . Assuming a threshold of 75%, the angle formed by the surface normal and the camera normal can be established in the interval [10 • -25 • ] for determining the defect in the metallic material using the reference lighting. Using any transformation Υ Φ , the angle tuning can be established in the interval [0 • -45 • ]. Then, the increase is about 30 • .

Intensity Lighting
Regarding the intensity lighting ρ I as the angle formed by surface normal and lighting plane normal, the success rates of the different lighting configurations are shown in the Figure 13e,f, for the dielectric and metallic objects respectively. The function 'Interlaced XY' provides the maximum success rate and the greatest increase in the capacity of perception of the system.
The study of the inspection capacity considering the material type shows the limited sensitivity for perceiving metallic objects using angles ρ I close to 0 • and the reference lighting. The characteristics of the lighting sources and of the metallic reflection cause sensor saturation in the angle interval [0 • , 10 • ]. The decreasing of the success rate, compared to the dielectric material, is considerably marked with values close to 20% (concretely 19.72% for 0 • and 20.77% for 10 • ). Also, transformations using minimum spatial differences show differences of more than 3% between metallic and dielectric in that interval: 'Interlaced X' 3.30% (0 • ) and 4.36% (10 • ), 'Linear X' 3.47% (0 • ) and 3.30% (10 • ). However, if the transformations with spatial distributions in two dimensions are used, sensor saturation is not important. The differences are 0.33% (0 • ) and 0.45% (10 • ) for the function 'Interlaced XY' and 0.73% (0 • ) and 0.37% (0 • ) for 'Linear XY'. The success rate is independent of the material and the differences are not significant for the rest of the lighting angles.
The success rate shows a low gradient for the intensity lighting ρ I according to lighting angle. The magnitude has an almost constant behaviour in the interval [0 • , 60 • ]. The standard deviation is lower than 2 for the transformations Υ Φ whereas the value is 4.21 for the reference lighting for all angles due to the behaviour in the initial values of the metallic plane (the standard deviation for the dielectric plane is 1.44).

Size of Lighting Regions
The influence of the size of the lighting regions ROL defined by s on the capacity of perception according to the scale is indicated on the left of Figure 14 and according to the angle on the right of the Figure 14. The size of ROL is represented by the minimum dimension of the region. Increasing the size of ROL decreases the rate of the regions inside the defect. This fact decreases the success rates of inspecting topographic defects using the transformations Υ Φ for all perception scales and angles. The average differences vary from 4% to 5% in the scale case. According to angle, the average differences are about 7% for the crack defect (Figure 14e) in the interval of the maximum sensitivity [10 • , 40 • ]. The behaviour for the crater defect (Figure 14d) is analogous; it establishes an average difference of sensitivity in that interval of 5%. The differences decrease as the perception angle increases until they are 0 for 90 • .
The capacity to modulate irradiance in the defect is inversely proportional to its size. In the extreme case where the size of the bands is greater than the size of the defects, the irradiance function will not be able to modulate the reflectance of the object, so it would behave in the same way as a the reference lighting.
The analysis of the chromatic defect shows that the size of lighting regions is independent of the success rate (see Figure 14c,f). In this case, it is practically constant. Therefore, a considerable increase in the perception capacity of the system is obtained.

Case Study: Automobile Logo
Simulation, as a previous step of experimentation in real manufacturing environments, allows a preliminary study that can be used to discern the best acquisition path planning or the equipment characteristics to perform the inspection of the specular surfaces. Finally, in this section an example of the application of the experimental results of the scale ρ E and angle ρ θ of perception on specular surface acquisition for inspecting a 100 mm diameter Mercedes Benz metallic logo is shown. The area to be inspected is 4676.15 mm 2 . Given that the minimum threshold χ depends on the application, in this case a value of 80% is assumed.
The inspection conditions are restricted to only 5 values of the scene magnitudes ρ i for a metallic object with an unstructured lighting (reference lighting). As can be seen in Figure 15     The choice of the conditions (scale, angles, etc.) to capture the whole object for inspection is a complex problem. It is necessary to take into account the particularities of each solution. In this case as example, the scale ρ E is established at 15 pixels/mm. Therefore, according to the previous scene magnitudes, the perception angle can deviate by up to 20 • using the reference lighting and by up to 40 • using the function 'Interlaced XY'. In other words, the angle formed by the vector normal to the camera and the normal vector of the surface to be inspected must be from 0 • up to 20 • using the reference ligthing and from 0 • up to 40 • in case of 'Interlaced XY' lighting is used.
In consequence, in order to apply the results, any point on the surface can be viewed as a point on a plane whose normal vector is the normal of the surface at that point. For example, Figure 16a shows an outline of this assumption. In this way, the surface can be analysed as a set of planes (a plane per point on the surface). Then, the experimental results (calculated for planes) can be extrapolated to calculate the perception scale and angle of surfaces of different curvature for any point on the surface. The use of planes for any point on the surface could be computationally expensive. Hence, as the function µ S establishes the points on the surface in a coordinate system that is local to the object, in practice the function µ S is defined as a triangle mesh: a collection of triangles that defines the surface shape of a polyhedral object in 3D computer graphics. The use of a triangle mesh allows the system to discretize the surface geometry as a reduced collection of planes. The number of planar faces will be determined by the geometry of the surface and the resolution used in the experiments (in this case more than 4000 polygons, although less triangles are enough, high details are not needed due to angle perception is discretized each 10 • ).  Since the scene magnitudes are a function of the characteristics of the object, the environment and the camera, it is required to make decisions about the appropriate magnitudes to infer to provide them. In other words, it is necessary to decide which magnitudes of the environment or the camera have to be modified to establish the adequate values of scale and angles calculated in the experiments. The scale will be determined by the number of pixels available to the sensor by setting the camera position at a focus distance and at a constant focal length. The conditions of the angle of perception will be established by the movement on the X and Y axis of the origin of coordinates located in the center of the of the object: the Yaw and Pitch movements of the camera (see Figure 16b). Due to select the variables is a complex problem, an approximation to the optimum solution is proposed in order to determine the appropriate angles between camera and object surface. This increases the captured area of the logo and reduces the images to be captured. For this purpose, a search tree was designed using a branch and bound algorithm, in which the solutions space of each node is reduced to a maximum of five children and a maximum temporal processing level is established.
The logo inspection requires 26 captures (see Figure 17a and Table 4) using the lighting reference and a camera of 1452 × 1452 pixels to acquire a surface area of 4619.96 mm 2 (98.79% of the logo). Some captures can be made using a lower resolution of up to 510 × 510 pixels. According to the function 'Interlaced XY', only 9 images are necessary (see Figure 17b and Table 5). The camera resolution varies between 826 × 826 and 1455 × 1455 in order to cover a total surface area of 4675.05 mm 2 (99.9%). The first capture obtains 45% of the total inspection of the logo whereas only 8.17% is obtained using the reference lighting. This fact shows how the lighting system allow to perceive in more scene magnitudes. Specifically, more points on the surface accomplish the angle of perception ρ θ .  Figure 17. Images of a metallic logo to be inspected using the reference lighting (a) and the function 'Interlaced XY' (b).

Conclusions
A novel active imaging model able to increase the perception capacity of the visual system is presented. The model provides solutions to vision problems in which it is difficult to perceive. It describes the parameters involved in a visual sensorization system in three main aspects: target object to perceive, environmental conditions, and sensor parameters. Each of them is individually parameterized as object size, color, etc.; environment parameters as light, objects-sensor orientations,...; and sensor parametrization such as focal length, monochrome/RGB/3D, etc. Moreover, the model describes the perception capabilities and the limits, and present solutions to these limits. In particular, the model is instantiated for the specular object inspection, as an example of a challenging situation for visual sensor perception. The specular surfaces problem means the device must operate in the intervals of low perception, related to reflections and shine. Traditionally, automated visual inspection requires a thorough analysis of the problem where solutions include everything from the acquisition equipment to the algorithm to recognize the possible defects. As a consequence, these vision systems are oriented to concrete applications and cannot be generalized. The model presented here deals with this lack by providing a general representation of vision systems and solutions for the perception limitations.
The solution proposal of the problem related to specular surfaces provides a normalization of the image in which different objects perceived as the same can be distinguished. First, Measurement Enhancement increasing the differences of the input magnitudes is performed. In this paper, the enhancement of measurements is carried out using environment variables, concretely controlling the lighting conditions. This enhancement spatially structures the lighting in order to set up regions on the surface of the object radiated using different spectral powers forming a grid. Finally, the system is tuned in order for the magnitudes of the scene (scale, angles, intensity lighting, etc.) to be properly perceived.
According to thorough analysis of the problem characteristics, use of virtual imaging simulations as a preliminary method step for validating hypotheses of the visual inspection systems is proposed. The validation of the conditions (point of view, scale, lighting, etc.) in which the inspection has to be performed, can be carried out in a flexible and low cost manner justifying the use of simulations in an early warning step of the system viability. Hence, the model can be pre-validated before the system is developed.
The use of simulations and knowledge bases and the generalist approach of the transformation provide a general solution that can be systematically applied. The method can be applied to the resolution of different inspection problems adapting the contents of the knowledge bases, avoiding a new design solution for each problem.
A realistic simulator has been designed to carry out the experimentation. This simulator recreates the conditions of the image formation and permits the validation of the inspection systems based on the model. The test shows the use of the transformations Υ Φ improves the capacity of perception of the system compared to a homogeneous environment (only one wavelength or colour). This is both for dielectric and metallic materials with regard to the tuning of the scale, angle or lighting. The use of transformation considering maximum amplitude gradient and maximum spatial differences ('Interlaced XY') obtains the best results to perceive the defects for all cases. In contrast, the function with minimum amplitude gradient and minimum spatial differences ('Linear X') offers the worst results of the transformations considered to perform the Measurement Enhancement. The functions ('Linear XY' and 'Interlaced X') that use only one of the maximum proposed gradients, scale or amplitude, generally present similar behaviour. The function that performs a homogeneous lighting of the object surface, which is used as reference, obtains the minimum success rate in all cases. Hence, the results prove the improvement in the capacity of perception for different conditions of scale, angle and intensity lighting. The proposed method enables the detection of surface defects in a greater number of values of scale, angle of perception and lighting conditions than in normal conditions using uniform lighting. The immediate repercussion is that a smaller number of captures of the scene is needed by the system. The research should continue by studying the Measurement Enhancement by means of the extension of the input magnitudes: using other transformations based on structured lighting (with different patterns and using the time domain) and using other parameters like variables of the capture system to provide great gradients in the image.
The simulation confirms the hypotheses. It would be wise to advance to physical experiments according to the concrete industry in which the system is to be technologically developed. Nowadays, the 'Interlaced XY' function (in this case, the pattern is made up using the ordered sequence ∆ E of grey levels instead of using wavelengths) is being tested for increasing the perception capacity of an inspection system aimed to detect shape defects on the surfaces of ceramic tiles.