IoT Service Clustering for Dynamic Service Matchmaking

With the adoption of service-oriented paradigms in the IoT (Internet of Things) environment, real-world devices open their capabilities through service interfaces, which enable other functional entities to interact with them. In an IoT application, it is indispensable to find suitable services that satisfy users' requirements or replace unavailable services. However, from a performance perspective, it is inappropriate to search the service repository directly online. Instead, it is necessary to cluster services offline according to their similarity and to perform online matchmaking or discovery within a limited number of clusters. This paper proposes a multidimensional model-based approach to measure the similarity between IoT services. Density-peaks-based clustering is then employed to group similar services according to the measured similarity. Based on the service clusters, the algorithms for dynamic service matchmaking, discovery, and replacement can be performed efficiently. Experiments are conducted to validate the performance of the proposed approaches, and the results are promising.


Background
The Internet of Things (IoT) integrates user requirements, cyberspace, and physical space, which enables seamless human-machine-thing cooperation. SOC (Service-Oriented Computing) provides techniques for the provision, selection, discovery, and composition of Web services, and organically integrates heterogeneous and complicated software entities [1,2]. With the adoption of service-oriented paradigms in the IoT environment [3], real-world devices open their capabilities through service interfaces, which enable other functional entities to interact with them. In an IoT application, it is indispensable to find suitable services that satisfy users' requirements or replace unavailable services.
With the rapidly growing number of IoT services, discovering and selecting among numerous services in the dynamic, large-scale IoT environment is becoming a crucial task. Several middleware solutions have been proposed for integrating the physical world with the Web, such as OpenIoT [4], GSN (Global Sensor Networks) [5], and Xively [6]. These solutions act as service platforms that manage millions of services around the world and enable people to share and monitor environmental data from objects connected to the Web. However, most leading middleware solutions provide only limited service discovery and selection functions. Service matchmaking techniques are an effective way to discover services [7,8]. From a performance perspective, however, it is unreasonable to discover services directly from online repositories at IoT scale [9,10]. If services are instead classified or clustered offline according to their similarity, the online examination of services can be confined to a small number of clusters, and the performance of finding desired services online improves accordingly [11,12]. Besides, because of the dynamic nature of IoT, selected services may become unavailable or unfit for the current context, so it is necessary to re-select services and replace them with similar services from the same cluster. Therefore, techniques that cluster services offline according to their similarity are critical for dynamic service matchmaking, discovery, and replacement [13].

Motivation
As discussed in [14], nearly 12,000 Web services are active on the Web. Even at that scale, the similarity measurement and clustering of Web services has become a challenging problem. The same problem for IoT services is considerably harder because of the scale and complexity of IoT. An IoT service acts on a ternary space (i.e., user, cyber, and physical space), whereas a Web service exists only in cyberspace, so the context of an IoT service is more complex than that of a Web service. IoT services carry multidimensional semantics, for instance, the physical quantity observed by the service, the observation capabilities of the service, the observation area of the service, and so on. This information should be taken into consideration when clustering services or measuring the similarity between them.
A number of approaches for measuring the similarity between Web services have been proposed in recent years. Basically, they can be divided into information-content-based approaches [15,16] and semantic-network-structure-based approaches [17,18]. However, it is inappropriate to apply the existing approaches directly to IoT services. Existing similarity measurements focus on the hierarchy and inheritance relations between services in the semantic model. They ignore the relation types, relation contexts, and relation restrictions, which carry meaningful semantic information for distinguishing services. Besides, service nodes in a semantic model are defined with datatype and object-type properties. These properties should not be ignored either when computing the similarity between services.
Moreover, existing service models mix multiple feature dimensions of IoT services into one complicated model. A dimension-mixed model cannot provide a well-defined taxonomy structure, so semantic-structure-based algorithms applied to it will not achieve satisfactory accuracy. Besides, the property and restriction descriptions of multiple dimensions interfere with each other when similarity is measured based on service descriptions. Therefore, measuring IoT services with the existing algorithms does not conform to the equivalence soundness and disjointedness incompatibility principles of similarity measurement, and the results cannot reflect the real similarity between services [19]. Without accurate similarity measurement, it is impossible to obtain satisfactory clustering results, which in turn degrades the subsequent matchmaking and discovery of services.

This paper proposes a multidimensional semantic model for describing IoT services. Each dimension constructs a semantic model that includes a well-defined service classification, service properties, and property constraints. Based on this multidimensional service model, we propose an MDM (Multiple Dimensional Measuring) algorithm that calculates the similarity between services on each dimension by taking both the model structure and the model description into consideration. The similarity on each dimension is measured concurrently. If the context of a service changes, MDM only needs to re-measure the similarity on the changed dimensions, whereas existing approaches must re-measure the whole similarity. Thus, compared with dimension-mixed approaches, MDM is more accurate and efficient. Based on the result of similarity measurement, we then employ density-peaks-based clustering [20] to divide services into clusters according to the distribution of their similarity.

The clusters of similar services are generated automatically, without manual estimation of parameters (e.g., cluster size or number of clusters). Different services have personalized cluster sizes, which takes the heterogeneity of service contexts into consideration. After clustering, agile service matchmaking and discovery become possible. In particular, this paper makes the following contributions:

1. This paper proposes the MDM algorithm for measuring the similarity between IoT services based on a multidimensional service model. MDM outperforms the dimension-mixed approaches in both accuracy and efficiency.

2. Based on the MDM results, a density-peaks-based clustering approach is employed to gather similar services together according to the actual distribution of services. It avoids the complicated process of estimating or optimizing parameters.

3. To evaluate the applicability of the proposed approaches, we use a combined data set including real and synthetic data. The experimental results indicate that the proposed approaches perform well enough to be applicable to real-life scenarios.

Preliminaries: Multidimensional Service Model and Model Vectorization
A series of works have been proposed to formally describe IoT services in ontology models, such as references [21][22][23][24][25][26][27][28]. However, existing models mix multiple feature dimensions of IoT services into one complicated model, so multiple classification criteria are referenced within a single model hierarchy. A dimension-mixed model cannot provide a well-defined taxonomy structure, and the distances and positional relationships between its nodes do not meaningfully reflect the similarity between services. Therefore, using semantic-structure-based algorithms to measure the similarity between services will not achieve satisfactory accuracy. Besides, the restrictions and property descriptions of different dimensions interfere with each other when similarity is measured based on service descriptions.
In the multidimensional service model proposed in our previous work [29], the service classification, service properties, and property constraints of each dimension are well defined. The MDM algorithm discussed in Section 3 can therefore calculate the similarity between services on each feature dimension accurately and concurrently. To reflect the perspectives of different users, the per-dimension similarity values are aggregated using users' personalized weights. In this section, four representative dimensions are described to demonstrate the idea of the multidimensional model, as shown in Figure 1. The other parts of the service model and a detailed discussion of the problems of existing models are presented in [29].

Figure 1a shows the dimension of observation principle, which is based on the standard definition of observation physical principles described in [30]. Because a sensor is a converter that transforms nonelectrical effects into electrical signals, several steps are needed before the electrical signal is output. For example, the measurement principle of a capacitive water-level sensor is the dielectric constant. This sensor is fabricated in the form of a coaxial capacitor in which the surface of each conductor is coated with an isolating layer. As the water level rises, water occupies more and more of the space between the coaxial conductors, thereby changing the capacitance. The model of this dimension helps to discover suitable services according to users' application scenarios. For instance, magnetic sensors are unfit for environments with magnetic interference.

Figure 1b depicts the dimension of observation quantity type. It defines the physical quantities that are measured by the IoT services. The quantity type model is a key criterion for service matchmaking because it avoids ambiguous representation of physical quantities. For instance, services for body temperature and for environment temperature have a similar type of output. Without an exact definition of quantity type, a body temperature service may be offered to a user who requires observation of the ambient temperature. The model is constructed based on the W3C CF (Climate and Forecast) feature ontology [31], which provides standard definitions for commonly observed physical quantities. For instance, it has more than 50 quantity types to express temperature, such as surface air temperature, canopy temperature, and dry-bulb temperature.

Figure 1c indicates the dimension of application domain, derived from reference [30]. This dimension helps users to choose the services that fit their application domain. For instance, a service selected to measure gas concentration in the coal mine domain must be dedicated to coal mines and "intrinsically safe".

Figure 1d shows the measurement capability dimension. This model is derived from the capability model of the W3C SSN (Semantic Sensor Network) ontology [32]. Because the performance of an IoT service may be influenced by its operating environment, this model expresses the measurement capabilities of services under certain conditions, covering the concepts of sensitivity, frequency, drift, accuracy, and so on. It can be used to check whether a service has been properly used or to determine how a service will perform in a particular environment. It is also an important criterion for service matchmaking. For instance, the capability of a temperature observation service may be stated as follows: from −200 to 500 °C the accuracy is ±1.0 °C, while from 500 to 800 °C it is ±0.5%.
Sensors 2017, 17, 1727

Based on the formal multidimensional model, the semantic similarity between IoT services can be measured. Before measuring, the model of each service should be vectorized, that is, the model description of a service is transformed into a tuple of terms. The model conversion approach of [33] is adopted. After vectorization, a semantic concept C is denoted as a tuple, as defined in Equation (1), where, in OWL-annotated semantic documents, C is the name (or URI) of the concept; each [] is a property term consisting of a property and its restriction, for datatype properties and object properties alike; $o_j$ is an object property of the concept C; $\gamma_{o_j}$ is a restriction for the object property $o_j$; $C^{x}_{o_j}$ (x = 1 … k) is a concept related by the object property $o_j$; and $\lambda^{y}_{o_j}$ is a Boolean operation between the concepts $C^{x}_{o_j}$. After model vectorization, the semantic description of an IoT service i can likewise be represented as a tuple.

We use a simplified model structure (shown in Figure 2) to demonstrate the process of model vectorization. In this model, C1 to C5 are classes that form the inheritance structure, and s1 to s4 are service instances (i.e., objects that belong to different classes). P1 and P2 are object properties that denote the relationships between service instances. Assume that we want to measure the similarity between s1, s2, and s3. Before similarity measurement, the models of these services are vectorized into tuples according to the above discussion.

MDM Similarity Measurement
Before clustering IoT services, the similarities (or distances) between services should be measured with MDM. MDM matches both the structure information of the model hierarchy and the description of service properties, relations, and restrictions. For structure information, it employs Li's approach [34], a hybrid semantic similarity model based on a nonlinear function. For service descriptions, based on the model vectorization discussed in Section 2, it adapts TF-IDF (Term Frequency and Inverse Document Frequency) [35] and cosine similarity to calculate the similarity of service tuples. By combining the similarity of structure and description, MDM can measure the similarity on every dimension accurately and concurrently. The overall similarity is then generated by aggregating the similarities of the individual dimensions according to users' preferences, for instance, by allocating different weights to different dimensions.

Structure Similarity
A series of algorithms for measuring structure similarity have been proposed, considering information content [15,16], depth in the hierarchy [36,37], semantic density [34,38], and shortest path length [39,40]. To achieve a good similarity measure, Li [34] investigated the effectiveness of a variety of strategies that use the available structure information. That work argues that comparing results against human common sense is the only way to evaluate the quality of a method for calculating concept similarity: the closer the result is to human judgment, the better the method. It also confirmed the hypothesis that human judgment of similarity is a nonlinear process. Its measurement algorithm, which models the shortest path length and the depth as nonlinear functions and combines them by multiplication, obtains a marked improvement over previous methods. We employ this approach to calculate the structure similarity of services. Given services a and b, the structure similarity on dimension i between their classes $C_{ai}$ and $C_{bi}$ is measured by Equation (2):

$$sim_C(C_{ai}, C_{bi}) = e^{-\alpha l} \cdot \frac{e^{\beta h} - e^{-\beta h}}{e^{\beta h} + e^{-\beta h}} \qquad (2)$$

where h denotes the depth of the subsuming class of $C_{ai}$ and $C_{bi}$, l is the shortest path length between $C_{ai}$ and $C_{bi}$, and α and β scale the contributions of l and h, respectively. Li [34] reports the optimal parameters as α = 0.2 and β = 0.6. Under these parameters, the correlation coefficient between this measurement and human similarity judgments is 0.8914, while the correlation between different people is 0.9015. This indicates that the measurement performs nearly at the level of human replication.
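The structure similarity can be illustrated with a minimal sketch. It assumes the commonly cited form of Li's measure [34], in which the depth factor equals tanh(βh); the function name `structure_similarity` and its argument names are hypothetical, and only the default values α = 0.2 and β = 0.6 come from the text.

```python
import math


def structure_similarity(path_length, subsumer_depth, alpha=0.2, beta=0.6):
    """Sketch of a Li-style structure similarity (assumed form).

    path_length    -- l, shortest path length between the two classes
    subsumer_depth -- h, depth of the class that subsumes both classes
    """
    if path_length < 0 or subsumer_depth < 0:
        raise ValueError("path length and depth must be non-negative")
    # Longer paths reduce similarity exponentially.
    length_term = math.exp(-alpha * path_length)
    # (e^{bh} - e^{-bh}) / (e^{bh} + e^{-bh}) is the hyperbolic tangent of b*h,
    # so deeper (more specific) subsumers increase similarity.
    depth_term = math.tanh(beta * subsumer_depth)
    return length_term * depth_term
```

Under this form, two classes that are close in the hierarchy and share a deep subsumer score higher than classes that are far apart or only share a shallow subsumer.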

Service Description Similarity
Assume that S is the candidate service set $S = \{s_1, s_2, \ldots, s_i, \ldots, s_m\}$. According to Equation (1), S can be represented as $S = \{tuple_1, tuple_2, \ldots, tuple_i, \ldots, tuple_m\}$. We then construct the feature vector of each service using TF-IDF.

TF-IDF is the product of two statistics: term frequency (TF) and inverse document frequency (IDF). The former is the frequency of a term in a document, while the latter reflects how rare the term is across all documents. IDF is obtained by dividing the total number of documents by the number of documents containing the term and then taking the logarithm of that quotient. The higher the TF-IDF of a term, the more important the term is for a document. In our study, the corpus is the service set, a document is a service tuple, and a term is a description term. The TF of a term in a service tuple is:

$$TF = \frac{f}{|tuple|}$$

where |tuple| is the number of terms in the tuple and f is the occurrence frequency of the term in this tuple. The IDF of the term is measured by:

$$IDF = \log \frac{|S|}{|\{tuple \in S : term \in tuple\}|}$$

where |S| is the cardinality of the service set S, and $|\{tuple \in S : term \in tuple\}|$ is the number of tuples that include the term. Thus, TF-IDF is calculated by:

$$TFIDF = TF \times IDF$$

A vector for each service is then obtained by calculating the TF-IDF of the terms in its tuple. For a service s with tuple $tuple_s = \{term_1, term_2, \ldots, term_i, \ldots, term_k\}$, the vector is $vector_s = \{TFIDF_1, TFIDF_2, \ldots, TFIDF_i, \ldots, TFIDF_k\}$.

The similarity between two vectors can be measured by cosine similarity. The IDF not only strengthens the effect of terms that occur rarely, but also weakens the effect of frequent terms. For instance, the property subClassOf: Thing occurs in most ontology concepts, so its IDF is close to zero. Terms with a low IDF value therefore have a weak impact on the cosine similarity measurement. The description similarity on dimension d between two services i and j is measured by:

$$sim_P(i, j) = \frac{vector_i \cdot vector_j}{\|vector_i\| \, \|vector_j\|} \qquad (3)$$
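The description similarity can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the vectors are built over a vocabulary shared by all tuples so that their components align (an assumption), and the natural logarithm is used because the text does not specify a base.

```python
import math
from collections import Counter


def tfidf_vectors(tuples):
    """Build one TF-IDF vector per service tuple over a shared vocabulary."""
    vocabulary = sorted({term for t in tuples for term in t})
    # Number of tuples that contain each term at least once.
    doc_freq = Counter(term for t in tuples for term in set(t))
    vectors = []
    for t in tuples:
        counts = Counter(t)
        vector = []
        for term in vocabulary:
            tf = counts[term] / len(t) if t else 0.0
            idf = math.log(len(tuples) / doc_freq[term])
            vector.append(tf * idf)
        vectors.append(vector)
    return vocabulary, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)
```

A term that appears in every tuple, such as a shared superclass, receives an IDF of zero and therefore contributes nothing to the cosine value, which matches the behaviour described for subClassOf: Thing.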

Multidimensional Aggregation
The similarity on the i-th dimension between two services a and b is calculated by combining $sim_C$ (Equation (2)) and $sim_P$ (Equation (3)), as defined in Equation (4), where δ is the impact parameter that indicates the effect of structure information on the similarity measurement. The similarity values of the individual dimensions are then aggregated by weights according to users' preferences, as defined in Equation (5), where n is the number of dimensions of the semantic service model.
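One plausible reading of the combination and aggregation steps is sketched below. Both forms are assumptions, since the exact equations are not reproduced here: the per-dimension similarity is taken as a convex combination controlled by δ, and the overall similarity as a weighted average over dimensions. The function names are hypothetical.

```python
def dimension_similarity(sim_structure, sim_description, delta=0.5):
    """Assumed form: delta * sim_C + (1 - delta) * sim_P on one dimension."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return delta * sim_structure + (1.0 - delta) * sim_description


def overall_similarity(dimension_sims, weights):
    """Assumed form: weighted average of the per-dimension similarities."""
    if len(dimension_sims) != len(weights):
        raise ValueError("one weight is required per dimension")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total lets callers pass unnormalized preference weights.
    return sum(w * s for w, s in zip(weights, dimension_sims)) / total
```

Because each dimension is combined separately, a change in one dimension only requires recomputing that dimension's value before re-aggregating, which is the incremental behaviour the text attributes to MDM.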

IoT Service Clustering
This paper employs density-peaks-based clustering [20] to divide services into clusters according to the underlying density distribution of the similarity between services. Density-peaks-based clustering is a fast and accurate clustering approach for large-scale data. The clusters of similar services are generated automatically, without manual determination of parameters. The distance between two services is calculated by Equation (6).
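As a hedged illustration of how a similarity matrix could be turned into the distance matrix used for clustering, the sketch below assumes the simple complement dist = 1 − sim for similarities in [0, 1]. This form is an assumption, not a confirmed statement of Equation (6), and the function name is hypothetical.

```python
def similarity_to_distance(sim_matrix):
    """Assumed conversion: distance = 1 - similarity, with a zero diagonal."""
    n = len(sim_matrix)
    return [
        [0.0 if i == j else 1.0 - sim_matrix[i][j] for j in range(n)]
        for i in range(n)
    ]
```

With this convention, highly similar services lie close together, so dense regions of the distance space correspond to groups of mutually similar services.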

Local Density and Distance Calculating
The density-peaks algorithm is based on the assumptions that cluster centers are surrounded by neighbors with lower local density, and that they are at a relatively large distance from any point with higher density. Assume that $S = \{s_i\}_{i=1}^{N}$ is the service set to be clustered, $s_i$ is a service of S, and $I_S = \{1, 2, \ldots, N\}$ is the index set. For each service $s_i$ in S, two quantities are defined: its local density $\rho_i$ and its distance $\theta_i$ from services of higher density. The local density of service i is defined as:

$$\rho_i = \sum_{j \in I_S \setminus \{i\}} \chi(d_{ij} - d_c) \qquad (7)$$

where $d_c$ is a cutoff distance, and χ(x) = 1 if x < 0 and χ(x) = 0 otherwise. $\theta_i$ is the smallest distance between service i and any other service with higher density than i:

$$\theta_i = \min_{j:\, \rho_j > \rho_i} d_{ij} \qquad (8)$$

For the service with the highest density, the distance is defined as $\theta_i = \max_j d_{ij}$. Note that $\theta_i$ is much larger than the typical nearest-neighbor distance only for services that are local or global maxima in the density.

Algorithm 1 describes the procedure for calculating the clustering distance. First, the densities are sorted in descending order; $\{q_i\}_{i=1}^{N}$ is the resulting index sequence, where i is the rank in descending order and $q_i$ is the original index of the service at that rank. Then, the clustering distance $\theta_i$ of service $s_i$ is calculated as $\theta_i = Dist(i, j)$, where $s_j$ is the service that has larger density than $s_i$ and is closest to $s_i$. We use $n_i$ to denote the index of $s_j$ in S, namely $n_i = j$. $\{n_i\}_{i=1}^{N}$ is defined as:

$$n_{q_i} = \begin{cases} \arg\min_{q_j,\, j < i} dist(s_{q_i}, s_{q_j}), & i \geq 2 \\ 0, & i = 1 \end{cases} \qquad (9)$$

The clustering distance of the point with the largest density is defined as $\max_j d_{ij}$, the maximum value over all data points, and its index is $n_{q_1} = 0$.

Input:
    F: the matrix of distances between services;
    ρ: the local density of each service;
Output:
    θ: the clustering distance of each service;
    n_{q_i}: the index of the service that has larger density than s_{q_i} and is closest to s_{q_i};

// Sort in descending order by density ρ
{q_i} ← index of the descending order of density;
// Distance assignment of θ
{n_i} (i = 1 … N) ← 0;
for i := 1 to N do
    θ_{q_i} ← d_max;
    for j := 1 to i − 1 do
        if dist(s_{q_i}, s_{q_j}) < θ_{q_i} then
            θ_{q_i} = dist(s_{q_i}, s_{q_j});
            n_{q_i} = q_j;
        end if
    end for
end for
θ_{q_1} = max_{j ≥ 2} θ_j;
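The local density and the distance assignment of Algorithm 1 can be sketched as follows. This is a minimal sketch with hypothetical function names; it uses 0-based indices, marks "no higher-density neighbor" with -1 instead of 0, and initializes the running minimum with infinity rather than d_max so that a neighbor is always recorded when one exists. These are deliberate deviations from the listing.

```python
import math


def local_density(dist, d_c):
    """rho_i: number of other services closer than the cutoff distance d_c."""
    n = len(dist)
    return [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]


def clustering_distance(dist, rho):
    """Return (theta, nearest, order) in the spirit of Algorithm 1."""
    n = len(rho)
    # order plays the role of q: service indices sorted by descending density.
    # The sort is stable, so ties keep their original relative order.
    order = sorted(range(n), key=lambda i: -rho[i])
    theta = [math.inf] * n
    nearest = [-1] * n
    for rank, i in enumerate(order):
        # Only services ranked earlier count as having higher density.
        for j in order[:rank]:
            if dist[i][j] < theta[i]:
                theta[i] = dist[i][j]
                nearest[i] = j
    if n > 1:
        # The densest service takes the largest theta found among the others.
        theta[order[0]] = max(theta[i] for i in order[1:])
    elif n == 1:
        theta[order[0]] = 0.0
    return theta, nearest, order
```

For six services placed on a line at 0, 1, 2, 10, 11, and 12 with d_c = 1.5, the two middle services of each group obtain the highest density, and the second of them keeps a large θ because its only denser neighbor lies in the other group.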

Cluster Center Selecting
For the services $\{s_i\}_{i=1}^{N}$ in S, the local density and clustering distance $\{(\rho_i, \theta_i)\}_{i=1}^{N}$ can be calculated as described above. Cluster centers are the services that have both large ρ and large θ. To eliminate the difference in magnitude, the ρ and θ of each service are normalized to [0,1]. Then, a value that considers ρ and θ jointly is calculated for each service:

$$\gamma_i = \rho_i \theta_i$$

The higher the value of γ, the more likely the service is a cluster center. The values $\{\gamma_i\}_{i=1}^{N}$ are sorted in descending order and drawn on a coordinate plane whose horizontal axis is the index of γ and whose vertical axis is the value of γ, as shown in Figure 3. This coordinate plane is called the decision graph. A number of services are then taken from the front of the sorted sequence as the cluster centers. The decision graph shows that the γ values of cluster centers are large and discrete, while those of non-center services are continuous and smooth. The transition of γ from the cluster centers to the non-center services shows a significant "jump", which can be detected by a numerical detection method [41]. Therefore, the cluster centers of the dataset S are determined according to the decision graph and the numerical detection method.

Figure 3. Decision graph for determining cluster centers. $\gamma_i = \rho_i \theta_i$ is the combination of the local density $\rho_i$ and the clustering distance $\theta_i$ of service i. n is the index of services after they are sorted in descending order by γ.
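The center selection can be sketched as below. Two choices are assumptions made only for illustration: min-max scaling is used for the normalization to [0,1], and the "jump" is located at the largest gap between consecutive sorted γ values, which stands in for the numerical detection method of [41] rather than reproducing it. The function names are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def select_centers(rho, theta):
    """Return (center indices, gamma) using a largest-gap heuristic."""
    gamma = [
        r * t
        for r, t in zip(min_max_normalize(rho), min_max_normalize(theta))
    ]
    # Rank services by descending gamma, as on the decision graph.
    ranked = sorted(range(len(gamma)), key=lambda i: -gamma[i])
    if len(ranked) < 2:
        return ranked, gamma
    gaps = [
        gamma[ranked[k]] - gamma[ranked[k + 1]]
        for k in range(len(ranked) - 1)
    ]
    # Everything before the largest drop is treated as a cluster center.
    cut = max(range(len(gaps)), key=lambda k: gaps[k])
    return ranked[: cut + 1], gamma
```

A largest-gap rule is easy to reason about but can be fooled when several jumps of similar size exist, so in practice the decision graph would still be inspected or a more robust detector used.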

Cluster Assignment
After the center of every cluster is determined, the next step is to assign the non-center services to clusters. Algorithm 2 describes the procedure of cluster assignment. Services are assigned in descending order of density, that is, layer by layer from the cluster centers to the cluster core services and then to the cluster halo services.

Suppose that $n_c$ is the total number of cluster centers; naturally, the number of clusters is also $n_c$. $\{m_j\}_{j=1}^{n_c}$ is the index of the service corresponding to each cluster center, i.e., service $s_{m_j}$ is the center of the j-th cluster. $\{c_i\}_{i=1}^{N}$ is the cluster that each service belongs to, i.e., service $s_i$ belongs to cluster $c_i$. According to the definition of $\{n_i\}_{i=1}^{N}$ in Equation (9), $n_i$ is the index of the service that has larger density than the i-th service ($s_i$) and is closest to $s_i$.

Input:
    n_c: the total number of clusters (centers);
    n_{q_i}: the index of the service that has larger density than s_{q_i} and is closest to s_{q_i};
Output:
    {c_i} (i = 1 … N): the cluster that each service belongs to, i.e., s_i belongs to c_i;

    ...
    if c_{q_i} = −1 then
        c_{q_i} = c_{n_{q_i}};
    end if
end for

If the dataset has more than one cluster, each cluster can be further divided into two parts: the cluster core and the cluster halo. The cluster core, with higher density, is the central part of a cluster. The cluster halo, with lower density, is the edge part of a cluster. The procedure for determining the cluster core and cluster halo is described in Algorithm 3. We define the border region of a cluster as follows: the border region of cluster $c_1$ consists of the services $s_i$ that belong to $c_1$ and whose distance to some service $s_j$ belonging to another cluster $c_2$ is less than $d_c$. An average density bound $\rho^{b}_{c_i}$ is defined for each cluster $c_i$. If the density ρ of a service s is larger than $\rho^{b}_{c_i}$, then s belongs to the core part of cluster $c_i$; otherwise, it belongs to the halo part of cluster $c_i$.
After clustering, the similar service neighbors are generated automatically without the estimation of parameters. Moreover, different services have personalized neighbor sizes according to the actual density distribution, which may avoid the inaccurate matchmaking caused by constant neighbor size.
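The assignment step can be sketched as follows, under the assumption that unassigned services start with the label -1, cluster centers receive consecutive cluster ids, and every other service inherits the label of its nearest higher-density neighbor. The function and argument names are hypothetical; `order` and `nearest` are 0-based counterparts of q and n.

```python
def assign_clusters(order, nearest, centers):
    """Label every service with the id of the cluster it joins.

    order   -- service indices sorted by descending density
    nearest -- nearest[i] is the closest service denser than i, or -1
    centers -- indices of the services chosen as cluster centers
    """
    labels = [-1] * len(order)
    for cluster_id, center in enumerate(centers):
        labels[center] = cluster_id
    # Visiting in density order guarantees the neighbor is labelled first.
    for i in order:
        if labels[i] == -1:
            if nearest[i] == -1:
                raise ValueError("densest service must be a cluster center")
            labels[i] = labels[nearest[i]]
    return labels
```

A single pass suffices because each service only depends on a denser service, which has already been processed by the time it is reached.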

Algorithm 3. Determining cluster core and cluster halo.
Input: $F$: the matrix of distances between services; $d_c$: cut-off distance; $\{c_i\}_{i=1}^{N}$: the cluster that each service belongs to;
Output: $\{h_i\}_{i=1}^{N}$: the flag indicating whether service $s_i$ belongs to the core or the halo;
3: for $i := 1$ to $N-1$ do
4: for $j := i+1$ to $N$ do
5: if $c_i \neq c_j$ and $\mathrm{dist}(s_i, s_j) < d_c$ then
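The core/halo split can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code. The pairwise scan over border services follows the loop structure of Algorithm 3. The rule for the bound, the largest mean density of any border pair touching the cluster, is an assumption borrowed from the original density-peaks formulation, since the paper's own bound equation is not reproduced here. The function name `split_core_halo` and the string labels are hypothetical.

```python
def split_core_halo(dist, density, cluster, d_c):
    """Label each service as 'core' or 'halo' of its cluster.

    dist    -- N x N matrix of distances between services
    density -- local density of each service
    cluster -- cluster index c_i of each service (0-based)
    d_c     -- cut-off distance
    """
    n = len(cluster)
    bound = [0.0] * (max(cluster) + 1)  # density bound per cluster

    # Scan every pair once; a pair lies in a border region when the two
    # services are in different clusters but closer than d_c.
    for i in range(n - 1):
        for j in range(i + 1, n):
            if cluster[i] != cluster[j] and dist[i][j] < d_c:
                # Assumed bound rule: keep the largest pairwise mean density.
                avg = (density[i] + density[j]) / 2.0
                bound[cluster[i]] = max(bound[cluster[i]], avg)
                bound[cluster[j]] = max(bound[cluster[j]], avg)

    # A service denser than its cluster's bound is core, otherwise halo.
    return ["core" if density[i] > bound[cluster[i]] else "halo"
            for i in range(n)]


# Toy example: four services on a line, two clusters.
pos = [0.0, 1.0, 1.5, 2.5]
dist = [[abs(a - b) for b in pos] for a in pos]
flags = split_core_halo(dist, [3.0, 2.0, 1.0, 4.0], [0, 0, 1, 1], d_c=0.6)
```

In the toy data only services 1 and 2 form a border pair, so the bound of both clusters is 1.5 and only service 2 falls into the halo.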

Experimental Evaluation
In this section, we evaluate the performance of the proposed MDM measurement and service clustering. We use a combined data set of real and synthetic data, which collects services from multiple sources and adds essential service instances and descriptions. The data sources of the combined service set are shown in Table 1. In this paper, 510 real sensor services are collected from 6 sensor sets, including indoor and outdoor sensors. The number of services is then expanded to 1000, and essential semantic service descriptions are supplemented for similarity measurement. The experimental evaluation is performed on 64-bit Windows 7 Professional with Java 7, an Intel Xeon E5-2650 2.3 GHz processor, and 32 GB RAM. Section 5.1 discusses the performance of MDM, and Section 5.2 discusses the performance of service clustering.

Performance of Similarity Measurement
To evaluate the performance of similarity measurement, we employ the most widely used performance metrics from the information retrieval field. The performance metrics in this experiment are defined as follows:

Precision. Precision measures the preciseness of a search system. Precision for a single service is the proportion of matched and logically similar services among all services matched to this service, which can be represented by the following equation:

$$\mathrm{Precision} = \frac{|A \cap B|}{|B|}$$

where $A$ is the set of logically similar services and $B$ is the set of matched services calculated by MDM.

Recall. Recall measures the effectiveness of a search system. Recall for a single service is the proportion of matched and logically similar services among all services that are logically similar to this service, which can be represented by the following equation:

$$\mathrm{Recall} = \frac{|A \cap B|}{|A|}$$

F-measure. F-measure is employed as an aggregated performance scale for a search system. In this experiment, F-measure is the harmonic mean of precision and recall, which can be represented as:

$$F\text{-measure} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

When the F-measure reaches its highest value, the combined level of precision and recall is at its best.
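As a worked illustration, the three metrics can be computed from two sets of service identifiers. The sketch below assumes that the F-measure is the usual harmonic mean from information retrieval; the function name `precision_recall_f` and the service identifiers are hypothetical.

```python
def precision_recall_f(similar, matched):
    """Compute precision, recall and F-measure for one query service.

    similar -- identifiers of the logically similar services (set A)
    matched -- identifiers of the services matched by the measure (set B)
    """
    similar, matched = set(similar), set(matched)
    hits = len(similar & matched)  # matched AND logically similar

    precision = hits / len(matched) if matched else 0.0
    recall = hits / len(similar) if similar else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # Assumption: F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Four logically similar services, three matched, two in common.
p, r, f = precision_recall_f({"s1", "s2", "s3", "s4"}, {"s2", "s3", "s5"})
```

Here two of the three matched services are logically similar, so precision is 2/3, recall is 2/4, and the F-measure is 4/7.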
To filter out dissimilar services with low similarity values, an optimal threshold value needs to be estimated. The aggregate F-measure metric is used as the primary benchmark for estimating it. In addition, the parameter δ weights the impact of description similarity and structure similarity in the similarity measurement; to obtain the best performance, an optimal δ value should also be estimated. Both parameters are initially set to 0 and increased in steps of 0.1 up to 1.0.
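The parameter sweep described above amounts to an exhaustive grid search. A minimal sketch follows, assuming a caller-supplied function `evaluate(delta, threshold)` that returns the F-measure for one parameter pair; both that function and the name `grid_search` are hypothetical, and the toy scoring function stands in for a real evaluation run.

```python
def grid_search(evaluate, step=0.1):
    """Try every (delta, threshold) pair on a grid over [0, 1].

    evaluate -- callable returning a score (e.g., F-measure) for a pair
    Returns (best_delta, best_threshold, best_score).
    """
    steps = int(round(1.0 / step))
    best = (None, None, float("-inf"))
    for a in range(steps + 1):
        for b in range(steps + 1):
            # Integer counters avoid accumulating floating-point drift.
            delta = round(a * step, 10)
            threshold = round(b * step, 10)
            score = evaluate(delta, threshold)
            if score > best[2]:
                best = (delta, threshold, score)
    return best


# Toy scoring function with its maximum at delta = 0.7, threshold = 0.7.
def toy_score(delta, threshold):
    return 1.0 - (delta - 0.7) ** 2 - (threshold - 0.7) ** 2


best_delta, best_threshold, best_score = grid_search(toy_score)
```

With a step of 0.1 the grid has 11 × 11 = 121 points, so an exhaustive search is cheap compared with a single evaluation run.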
Figures 4 and 5 show how the F-measure values of the dimension-mixed and multidimensional models vary as these two parameters change. The point at which the F-measure reaches its maximum gives the best performance and determines the optimal values of the threshold and δ. As Figures 4 and 5 indicate, δ = 0.5 and threshold = 0.8 are the optimal values for the dimension-mixed model, giving an F-measure of 40; meanwhile, δ = 0.7 and threshold = 0.7 are the optimal values for the multidimensional model, giving an F-measure of 63. Moreover, the overall F-measure values of the multidimensional model are higher than those of the dimension-mixed model.
The performance comparison between the multidimensional and dimension-mixed models is shown in Figure 6. As the results indicate, similarity measurement based on the multidimensional model outperforms the dimension-mixed approach. The reason is that, with the multidimensional model, both description similarity and structure similarity can be measured accurately. For structure similarity, each dimension has a well-defined semantic structure in which the distances and positional relationships between nodes meaningfully reflect the similarity between services. For description similarity, each dimension focuses only on the descriptions that contribute to expressing the features of that dimension. Conversely, the dimension-mixed approach mixes the semantic structures and descriptions of all dimensions into one complicated model, so the measurement can only obtain an overall similarity value.

Performance of Service Clustering
In this section, we evaluate the performance of clustering. The 1000 services to be clustered carry essential semantic descriptions and structure, as Table 1 describes. The cut-off distance $d_c$ for calculating the local density of services is set to 0.03. As Figure 7 shows, although the data distribution of the service set is highly overlapped, the proposed approach successfully detects the cluster structure. The services are grouped into 5 clusters, the borders between clusters are clear, and each cluster is dense and compact.

The size of the service set and the number of feature dimensions are two important factors in evaluating the efficiency of the proposed clustering approach. Figure 8 shows the clustering time as the number of services increases from 100 to 1000. Clustering 1000 services takes 3.2 s. The results show that the clustering time is linear with respect to the number of IoT services to be clustered, and the clustering time for hundreds of services stays within a few seconds. Figure 9 shows the clustering time as the number of dimensions of the service model increases from 1 to 10, with the number of services fixed at 1000. The minimum clustering time is 2.8 s, with four feature dimensions; the maximum is 3.3 s, with eight. The results show that the clustering time does not increase with the number of dimensions. This is because MDM measures the similarity of each dimension concurrently, so the total time for measuring the similarity of all dimensions equals the time of the slowest single dimension. Besides, the clustering operates on the measurement result of MDM (the distances between services), so it is not influenced by the number of dimensions. Therefore, the proposed approaches improve the accuracy of similarity measurement and service clustering without increasing the computation time.
The experimental results demonstrate that the proposed clustering approach is able to cluster hundreds of IoT services in a reasonable amount of time. In the application domains of the IoT SOC paradigm, the number of services usually does not exceed several thousand. Besides, if the scale of services is very large, the service clustering can be performed offline. Thus, the performance of the proposed clustering approach is adequate for real application scenarios.

Conclusions
This paper proposes a multidimensional model-based approach to measure the similarity between IoT services. Density-peaks-based clustering is then employed to gather similar services together according to the result of the similarity measurement. A combined data set, which collects services from multiple sources and adds essential service instances and descriptions, is used to evaluate the proposed approaches. The experimental results demonstrate that the performance of the proposed approaches is promising and applicable to real-life scenarios.
Currently, the experiments are conducted on a single centralized dataset, and the size of the test set is limited. Our future work includes extending the experiments to distributed datasets and enlarging the service set. Moreover, we plan to propose a quantitative model to diagnose the quality of service clustering, in order to determine when the clustering structure becomes unacceptable and re-clustering is required as services evolve.