3.2.1. A Hybrid Matchmaking Method
The proposed method, named , calculates a similarity between the service consumer’s requirements and the descriptions of service instances with regard to functionality and I/O parameters by combining a non-logical method and a logical method.
For describing geospatial web services, we use domain ontologies. These structures provide links that represent semantic information derived from the path lengths of knowledge networks.
For the non-logical method, the common similarity measure of Wu and Palmer [25] has been selected because of its adoption in recent research [34,35] and its efficiency and simplicity of implementation, without loss of expressiveness.
In this context, the measure can be used to calculate the semantic similarity between two concepts based on the hierarchical structure of the ontology. The method is defined as follows:
Given an ontology formed by a set of concepts and a root concept R, the similarity is calculated between two of its concepts. The computation is based on three distances, which separate the root concept R and each of the two concepts from their closest common ancestor (see Figure 3). The function that assigns the resulting score is defined in Equation (1).
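For reference, the Wu and Palmer measure is usually written in the form below. The symbols are notational assumptions chosen to mirror the description above and may differ from those of Equation (1): $C_1$ and $C_2$ are the two concepts, $N$ is the distance between the root concept R and the closest common ancestor, and $N_1$ and $N_2$ are the distances between that ancestor and $C_1$ and $C_2$, respectively.

```latex
\mathrm{Sim}_{WP}(C_1, C_2) = \frac{2N}{N_1 + N_2 + 2N}
```

For instance, two sibling concepts ($N_1 = N_2 = 1$) whose parent lies two levels below the root ($N = 2$) obtain a score of $4 / 6 \approx 0.67$.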
The measure obtained with this method depends only on the concepts’ depth. Because most ontologies have a limited depth compared to their number of concepts, the measure can be computed in an acceptable time frame. However, it cannot be used directly to match web services since it is symmetrical (i.e., the score of two concepts does not depend on the order in which they are compared), whereas the concepts to match must be differentiated according to their role in the service description. Moreover, for comparing service parameters, using such a method alone may introduce a bias, because it tends to give priority to neighboring concepts (concepts having the same parent) rather than to concepts belonging to the same hierarchy (a concept class and its sub-classes).
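The two limitations discussed above (symmetry and the preference for neighboring concepts) can be illustrated with a minimal sketch of the Wu and Palmer measure. The toy ontology, its concept names and the helper functions are hypothetical and serve only as an illustration; they are not taken from the service ontology used in this work.

```python
# Toy ontology as a child -> parent map (all concept names are hypothetical).
PARENT = {
    "Service": None,                   # root concept R
    "Processing": "Service",
    "Portrayal": "Service",
    "Geoprocessing": "Processing",
    "Geocoding": "Geoprocessing",
    "AddressGeocoding": "Geocoding",
    "PlaceNameGeocoding": "Geocoding",
}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root (inclusive)."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def wu_palmer(c1, c2):
    """Wu and Palmer similarity, written as 2N / (N1 + N2 + 2N)."""
    path1, path2 = path_to_root(c1), path_to_root(c2)
    # Closest common ancestor: first concept on c1's path that is also on c2's.
    common = next(c for c in path1 if c in path2)
    n1 = path1.index(common)            # edges from c1 to the common ancestor
    n2 = path2.index(common)            # edges from c2 to the common ancestor
    n = len(path_to_root(common)) - 1   # edges from the common ancestor to the root
    total = n1 + n2 + 2 * n
    # Convention assumed here: comparing the root with itself yields 1.0.
    return 2 * n / total if total else 1.0


sibling = wu_palmer("AddressGeocoding", "PlaceNameGeocoding")
ancestor = wu_palmer("AddressGeocoding", "Processing")
print(sibling, ancestor)  # 0.75 0.4
```

In this toy hierarchy the two sibling concepts score 0.75, while a concept and one of its own ancestors score only 0.4, and swapping the arguments leaves both scores unchanged.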
Therefore, to avoid this effect, we propose combining this method with a logical matching method.
Logical methods (i.e., logical-based matchmaking) have been used in several research studies to check whether the I/O parameters of a service are compatible with the I/O parameters of a request [22]. A common approach to logical-based matchmaking is to define a set of rules (filters) that dictate which kinds of logical relationships are acceptable between the I/O parameters of a service and the I/O parameters of a request [27].
This kind of matching takes into account the entire I/O signature, so the degree of correspondence between a service and a request cannot be calculated. According to [26], a more flexible approach is required to be able to assess the degree of a match between a service and a request.
Consequently, we propose a logical-based matchmaking method based on individual links between the parameters of the service and those of the request. The links are used for both functionality and I/O discovery purposes. Given a set of concepts in an ontology, the logical link between a service parameter concept and a request parameter concept belongs to one of the five filter categories detailed in Table 3.
Finally, in order to calculate the matching score between a service parameter and a request parameter, we propose a hybrid method that combines the two previous ones (logical and non-logical). We define a function (see Equation (2)) whose result is a score obtained after evaluating two other functions (the non-logical similarity and the logical link) and taking into account the type of the parameter to match (functionality, input or output).
The score assigned to an output parameter type or a functionality type depends on the request. We consider service functionalities (and outputs) that are the same as or more specific than those mentioned in the request. Therefore, to match these parameters, the method accepts only the two filters of Table 3 that express these relationships. Inversely, since the score assigned to the input parameter type depends on the service, we consider service inputs that are the same as or more generic than the ones mentioned in the request. Therefore, the method accepts only the two corresponding filters.
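The type-dependent acceptance rule described above can be sketched as follows. This is only an illustration under stated assumptions: the toy ontology, the function names, the stub similarity and the choice of returning 0 for a rejected link are all hypothetical, and the exact combination defined in Equation (2) may differ.

```python
# Toy ontology as a child -> parent map (hypothetical concept names).
PARENT = {
    "Service": None,
    "Geoprocessing": "Service",
    "Geocoding": "Geoprocessing",
    "AddressGeocoding": "Geocoding",
}


def same_or_more_specific(a, b):
    """True if concept `a` is `b` itself or one of its descendants."""
    while a is not None:
        if a == b:
            return True
        a = PARENT[a]
    return False


def hybrid_score(kind, service_concept, request_concept, similarity):
    """Logical acceptance test followed by a non-logical similarity score."""
    if kind in ("functionality", "output"):
        # The service concept must be the same as or more specific than the request's.
        accepted = same_or_more_specific(service_concept, request_concept)
    elif kind == "input":
        # The service concept must be the same as or more generic than the request's.
        accepted = same_or_more_specific(request_concept, service_concept)
    else:
        raise ValueError(f"unknown parameter type: {kind}")
    # Assumption: a rejected logical link yields a score of 0.
    return similarity(service_concept, request_concept) if accepted else 0.0


def stub_similarity(a, b):
    """Stand-in for the non-logical measure (hypothetical values)."""
    return 1.0 if a == b else 0.8


print(hybrid_score("output", "AddressGeocoding", "Geocoding", stub_similarity))  # 0.8
print(hybrid_score("input", "AddressGeocoding", "Geocoding", stub_similarity))   # 0.0
```

The same pair of concepts is thus accepted as an output match but rejected as an input match, which is the asymmetry that the non-logical measure alone cannot express.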
As mentioned earlier, a hybrid matching approach provides the benefits of both logical and non-logical matching. The hybrid approach presented in [21] combines non-logical similarity matching, based on a particular category of semantic relations in ontologies, with logical matching based on a definition of specific logical filters (subsumes, plugin, etc.). From a methodological point of view, we have adopted a similar approach. The fundamental difference between the two approaches lies in the ontological properties retained for the calculation of similarity. In [21], the similarity matching is based on the semantic relationships that may exist between the terminology used to identify services and that used by users. For this purpose, the authors used some well-known linguistic principles, such as synonymy, hyperonymy, hyponymy, etc. The matching score is computed as an aggregation of numerical scores, between 0 and 1, depending on the evaluated relations and their relevance to the matching process. In the current version of our proposal, we have preferred to base the evaluation of the similarity on the semantic subsumption relations that exist in the ontology. The Wu and Palmer similarity measure that we use computes a numerical score that represents the semantic similarity between two concepts of an ontology as a function of their depth. It therefore computes the semantic similarity in a way that is faithful to the hierarchical representation of knowledge in the ontology. We can thus compute a similarity between two concepts even if there is no obvious semantic relationship defined between them in the sense of [21].
For example, let us consider “Geosemantic analysis” as a part of the requester requirements in terms of the desired categories of service, and two concepts representing two classes of GWS categories linked by a subsumption relation in the service ontology: “Geosemantic_analysis_service” and “NERC_service” (NERC stands for named entity recognition and classification) (see Figure 7). According to our approach, the similarity score will be 0.88. By contrast, from a terminological point of view, it appears rather difficult to obtain one of the semantic relations in the sense of [21] for the NERC service, so the score under that approach should be close to 0. In fact, the two approaches do not address the same problem, and it would be very interesting in the long term to make them cooperate. Since both works use the same methods to compute similarity on functionality and on I/O parameters, the same observation could be made when considering two concepts from the data ontology, e.g., the concepts “offset” and “DistanceCategory”.
3.2.2. Three-Step Matching Process
The first step is called semantic functionality matching. When a request is submitted, the service instances matching the requested functionality concept are then discovered. Only services whose score is equal to or greater than a given threshold are retained.
The matching is based on the proposed hybrid function, restricted to the functionality type. For example, if the geocoding concept is requested, all service instances annotated with a concept whose matching score with the geocoding concept reaches the threshold are retained at this step. The threshold value can be chosen from a number of matching categories (e.g., strict, medium and fuzzy), each associated with its own value. The functionality-based matching aims to quickly exclude large amounts of unrelated services.
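The threshold-based retention of this first step can be sketched as follows. The numeric thresholds attached to the strict, medium and fuzzy categories, the service names and their scores are hypothetical values chosen for illustration only.

```python
# Hypothetical thresholds for the three matching categories.
THRESHOLDS = {"strict": 1.0, "medium": 0.8, "fuzzy": 0.6}


def functionality_step(scores, category):
    """Keep the services whose functionality score reaches the chosen threshold."""
    threshold = THRESHOLDS[category]
    return [name for name, score in scores.items() if score >= threshold]


# Hypothetical functionality scores of three service instances for one request.
scores = {"geocoder_a": 1.0, "geocoder_b": 0.85, "map_renderer": 0.3}

print(functionality_step(scores, "strict"))  # ['geocoder_a']
print(functionality_step(scores, "medium"))  # ['geocoder_a', 'geocoder_b']
```

Lowering the category from strict to medium widens the candidate set passed on to the I/O matching step, while clearly unrelated services are excluded in both cases.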
The second step, called semantic I/O matching, is then applied to refine the result. The service descriptions are browsed to determine if their I/O properties meet the I/O properties defined by the service requester. We propose a method that allows for optimized matching according to the maximum score of the function and the type of parameter. As in the previous step, only services whose score is equal to or greater than a given threshold are retained. The threshold value can be chosen from a number of matching categories.
Given an ontology, a service S and a request R, and the finite sets of linked concepts of the input and output parameters of S and R, respectively, a function is defined to calculate the I/O matching score (see Equation (3a)).
The score assigned to an input matching is based on the number of service inputs rather than the number of request inputs. The priority here is to ensure that the inputs required by the service are met. However, it remains acceptable if one of the inputs specified by the request is not used. Therefore, the maximum matching score for each service input is summed and then divided by the number of input parameters required by the service (see Equation (3b)).
Inversely, the score assigned to an output matching depends on the number of outputs specified in the request. The priority here is to ensure that the outputs required by the request are met. Even in cases where the service generates some additional outputs, this remains acceptable. Therefore, the maximum matching score for each request output is summed and then divided by the number of output parameters required by the request (see Equation (3c)).
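The two averaging rules described above for inputs and outputs can be sketched as follows. The function names, the pairwise score table and the convention used for empty parameter sets are assumptions made for the illustration; how Equation (3a) combines the two partial scores is not reproduced here.

```python
def input_score(service_inputs, request_inputs, score):
    """Average, over the service inputs, of the best match among request inputs."""
    if not service_inputs:
        return 1.0  # assumption: a service requiring no input is always satisfied
    best = [max((score(s, r) for r in request_inputs), default=0.0)
            for s in service_inputs]
    return sum(best) / len(service_inputs)


def output_score(service_outputs, request_outputs, score):
    """Average, over the request outputs, of the best match among service outputs."""
    if not request_outputs:
        return 1.0  # assumption: a request asking for no output is always satisfied
    best = [max((score(s, r) for s in service_outputs), default=0.0)
            for r in request_outputs]
    return sum(best) / len(request_outputs)


# Hypothetical pairwise scores between (service concept, request concept).
TABLE = {("Address", "Address"): 1.0, ("Point", "Point"): 1.0}


def pair_score(service_concept, request_concept):
    return TABLE.get((service_concept, request_concept), 0.0)


# The service needs two inputs but the request provides only one of them.
print(input_score(["Address", "Language"], ["Address"], pair_score))  # 0.5
# The service returns an extra output, which does not lower the score.
print(output_score(["Point", "Confidence"], ["Point"], pair_score))   # 1.0
```

The asymmetry is visible in the two calls: an unmet service input halves the input score, whereas an additional service output leaves the output score unchanged.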
The third step, called non-functional matching, extends the process for matching GWS by integrating contextual information (i.e., non-functional properties). In order to achieve this goal, we define a non-functional matching function. This function searches among the set of candidate service descriptions selected in the previous steps and uses the non-functional properties proposed in the service description meta-model (see Section 3.1.1). Equation (4) calculates the non-functional score value of candidate services. In this equation, we use a weight in order to increase or decrease the relative consideration given to the objective and subjective properties.
Finally, after the three matching steps, the final assessment values of the services are determined. Here, we refer to the final assessment as the basic recommendation score. Its computation uses both the functional score (the result of the I/O matching function) and the non-functional score (the result of the non-functional matching function), as described in Equation (5). In this equation, we use a weight in order to increase or decrease the consideration given to functional matching. The weight lets users parameterize the final score, since their interest in the non-functional aspect differs: some prefer to take it into account and others do not.
Nevertheless, taking into account the non-functional properties, even in a basic way, gives a real added value in the final proposal for the ranking (in order of preference) of the discovered services.
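The role of the two weights can be illustrated with a short sketch. Writing both scores as convex combinations is an assumption made for the illustration, as are the function and parameter names; the exact forms of Equations (4) and (5) may differ.

```python
def non_functional_score(objective, subjective, weight):
    """Blend objective and subjective property scores (assumed convex combination)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * objective + (1.0 - weight) * subjective


def basic_recommendation_score(functional, non_functional, weight):
    """Blend functional and non-functional scores (assumed convex combination)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * functional + (1.0 - weight) * non_functional


# Hypothetical scores for one candidate service.
nf = non_functional_score(objective=0.6, subjective=0.2, weight=0.5)

# A user who ignores non-functional aspects sets the functional weight to 1.
print(basic_recommendation_score(0.8, nf, weight=1.0))  # 0.8
# A user who values both aspects equally sets it to 0.5.
print(round(basic_recommendation_score(0.8, nf, weight=0.5), 2))  # 0.6
```

Under this assumption, a functional weight of 1 reproduces the purely functional ranking, while lower values let the non-functional properties reorder services with similar functional scores.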