3.2.1. Problem Formulation
We formulate the countryside ecological patterns recommendation problem as follows. In this scenario, we have a set of $M$ patterns $P = \{p_1, p_2, \ldots, p_M\}$ and a set of $N$ countrysides $C = \{c_1, c_2, \ldots, c_N\}$. The pattern–countryside interaction matrix $Y \in \mathbb{R}^{M \times N}$ is defined according to the real situation, where $y_{pc} = 1$ indicates that countryside $c$ is suitable for pattern $p$; otherwise, $y_{pc} = 0$. Additionally, we have a countryside knowledge graph $G$, which is composed of entity–relation–entity triples $(h, r, t)$. Here $h$, $r$, and $t$ denote the head, relation, and tail of a triple, respectively. For example, the triple (Xiaonan Countryside, countryside.locate.district, Lvshunkou District) states the fact that Xiaonan Countryside is located in Lvshunkou District.
Given the pattern–countryside interaction matrix $Y$ and the countryside knowledge graph $G$, we seek to predict whether countryside $c$ matches pattern $p$. Our aim is to learn a prediction function $F$ that minimizes the difference between the predicted value and the true value:
$$\hat{y}_{pc} = F(p, c \mid \Theta, Y, G) \qquad (1)$$
where $\hat{y}_{pc}$ denotes the probability that countryside $c$ will match pattern $p$, and $\Theta$ denotes the parameters of function $F$. The meanings of the symbols in KGCN4CEPR are shown in Table 1.
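The inputs of the formulation can be sketched in a few lines of Python. This is only an illustration under assumptions: the placeholder names (`pattern_0`, `countryside_1`), the single label set in the matrix, and the helper `build_neighbors` are hypothetical, and only the Xiaonan triple comes from the text.

```python
import numpy as np
from collections import defaultdict

# Placeholder names for illustration; only the Xiaonan triple is from the text.
patterns = ["pattern_0", "pattern_1"]
countrysides = ["Xiaonan Countryside", "countryside_1"]

# Interaction matrix: rows are patterns, columns are countrysides.
Y = np.zeros((len(patterns), len(countrysides)), dtype=np.int8)
Y[0, 0] = 1  # illustrative label only, not real data

# Knowledge graph stored as (head, relation, tail) triples.
triples = [
    ("Xiaonan Countryside", "countryside.locate.district", "Lvshunkou District"),
]

def build_neighbors(triples):
    """Map each entity to its (relation, neighbor) pairs, treating edges as undirected."""
    neighbors = defaultdict(list)
    for head, relation, tail in triples:
        neighbors[head].append((relation, tail))
        neighbors[tail].append((relation, head))
    return dict(neighbors)

neighbors = build_neighbors(triples)
```

Storing the neighbor index once makes the later neighborhood lookups a dictionary access rather than a scan over all triples.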
3.2.2. Implementation of KGCN4CEP Recommendation Method
This subsection first describes the framework of KGCN4CEPR and then its four key implementation steps, as shown in Figure 5.
Step 1. Weighted adjacency matrix: Convert the original heterogeneous countryside knowledge graph into a weighted countryside knowledge graph, and learn the edge weights to obtain an adjacency matrix.
Step 2. Entity vector construction: Mine the geospatial adjacency features of countrysides with graph convolutional networks, embed the geospatial adjacency information into the countryside entities to obtain the neighborhood vector $\mathbf{c}_{S(c)}^{p}$, and finally aggregate it into a countryside entity vector $\mathbf{c}^{p}$.
Step 3. Iteration: Run forward propagation and backpropagate the error gradient to update the parameters with the gradient descent algorithm, repeating the training several times so that the difference between the predicted value and the true value is minimized.
Step 4. Recommendation: Based on the minimized result in step 3, predict whether the countryside $c$ matches pattern $p$.
In step 1, the edge weights of the countryside knowledge graph are calculated with Formula (2):
$$\pi_{r_i}^{p} = g(\mathbf{p}, \mathbf{r}_i) \qquad (2)$$
where $\mathbf{p}$ denotes the countryside ecological pattern vector, and $\mathbf{r}_i$ denotes the $i$th neighborhood relationship vector. The function $g$ calculates the inner product of vector $\mathbf{p}$ and vector $\mathbf{r}_i$, and this inner product is the weight $\pi_{r_i}^{p}$. Take the industrial development pattern as an example: according to the scoring of this pattern in the pattern–countryside interaction matrix, we can obtain the countryside ecological pattern vector $\mathbf{p}$. Correspondingly, we can find the neighborhood relationship vector $\mathbf{r}_i$ on the basis of the relationships linked with this pattern in the knowledge graph. In general, $\pi_{r_i}^{p}$ indicates the importance of relation $r_i$ to pattern $p$. The importance is pattern-specific, since each calculation of $\pi_{r_i}^{p}$ takes into account the influence of relation $r_i$ on pattern $p$ instead of using a constant influence for the relation.
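Next, according to Formula (3), we perform softmax normalization on $\pi_{r_i}^{p}$:
$$\tilde{\pi}_{r_i}^{p} = \frac{\exp\left(\pi_{r_i}^{p}\right)}{\sum_{e_j \in N(c)} \exp\left(\pi_{r_j}^{p}\right)} \qquad (3)$$
where $N(c)$ denotes the set of all entities directly connected to the countryside entity $c$, and $r_j$ is the relation linking $c$ to its neighbor $e_j$.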
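The scoring and normalization of Formulas (2) and (3) can be illustrated with a short NumPy sketch. The function names `relation_scores` and `softmax`, the embedding dimension, and the random vectors are assumptions made for the example; they stand in for learned embeddings and are not values from the study.

```python
import numpy as np

def relation_scores(pattern_vec, relation_vecs):
    """Inner product of the pattern vector with each neighborhood relation vector."""
    return relation_vecs @ pattern_vec

def softmax(scores):
    """Numerically stable softmax over the scores of one entity's neighbors."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

# Random stand-ins for learned embeddings (illustrative only).
rng = np.random.default_rng(0)
pattern_vec = rng.normal(size=4)          # one pattern, dimension 4
relation_vecs = rng.normal(size=(3, 4))   # relations to three neighbors

raw = relation_scores(pattern_vec, relation_vecs)
weights = softmax(raw)
```

Subtracting the maximum score before exponentiating leaves the result unchanged and avoids overflow for large inner products.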
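Then, in step 2, we calculate the linear combination of $c$'s neighborhoods to obtain the neighborhood vector $\mathbf{c}_{N(c)}^{p}$:
$$\mathbf{c}_{N(c)}^{p} = \sum_{e_i \in N(c)} \tilde{\pi}_{r_i}^{p}\, \mathbf{e}_i \qquad (4)$$
where $\mathbf{e}_i$ denotes the vector representation of the $i$th neighborhood entity, and $N(c)$ denotes the range of countryside entity selection.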
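As a small worked example of this linear combination, the sketch below weights three neighbor vectors by already normalized weights. The helper name `neighborhood_vector` and the numbers are made up for illustration.

```python
import numpy as np

def neighborhood_vector(norm_weights, neighbor_vecs):
    """Weighted sum of neighbor entity vectors (rows of neighbor_vecs)."""
    return norm_weights @ neighbor_vecs

# Illustrative values: three neighbors with 2-dimensional embeddings.
norm_weights = np.array([0.5, 0.3, 0.2])
neighbor_vecs = np.array([[1.0, 0.0],
                          [0.0, 1.0],
                          [1.0, 1.0]])

v = neighborhood_vector(norm_weights, neighbor_vecs)
# 0.5*[1, 0] + 0.3*[0, 1] + 0.2*[1, 1] = [0.7, 0.5]
```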
According to the First Law of Geography, the closer two objects are, the greater the correlation between them. Therefore, the countryside ecological patterns of geographically adjacent countrysides tend to be similar. When a countryside entity $c$ has too many neighbors in the countryside knowledge graph, aggregating all of them brings heavy computational pressure and non-essential neighborhood information. Therefore, we limit the spatial scale over which each countryside entity's neighborhood is visited, instead of visiting all path-reachable entities $N(c)$. Each countryside entity visits only $K$ neighborhood nodes and obtains the subset $S(c)$, referring to Formula (5). This keeps the aggregation of neighborhood entities from expanding outward indefinitely while retaining enough highly relevant information:
$$S(c) = \{\, e \mid e \in N(c) \,\}, \quad |S(c)| = K \qquad (5)$$
where $S(c)$ is a subset of $N(c)$.
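A fixed-size neighbor subset can be drawn as in the following sketch. It assumes uniform random sampling, and it assumes sampling with replacement when an entity has fewer neighbors than the requested size; the text does not specify either choice, and the function name `sample_neighbors` is hypothetical.

```python
import random

def sample_neighbors(neighbors, k, seed=None):
    """Return exactly k neighbors (or an empty list if there are none).

    Assumption: sample without replacement when enough neighbors exist,
    otherwise sample with replacement so the result still has size k.
    """
    if not neighbors:
        return []
    rng = random.Random(seed)
    if len(neighbors) >= k:
        return rng.sample(neighbors, k)
    return [rng.choice(neighbors) for _ in range(k)]

subset = sample_neighbors(["a", "b", "c", "d"], k=2, seed=0)
padded = sample_neighbors(["a"], k=3, seed=0)
```

Keeping the subset size constant means every entity contributes a tensor of the same shape, which simplifies batching.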
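In step 2, the aggregation from $\mathbf{c}_{S(c)}^{p}$ to $\mathbf{c}^{p}$ is shown in Formula (6):
$$\mathbf{c}^{p} = \sigma\left(W \cdot \left(\mathbf{c} + \mathbf{c}_{S(c)}^{p}\right) + b\right) \qquad (6)$$
where $W$ is the linear transformation matrix, $b$ is the bias, vector $\mathbf{c}$ is generated by the countryside entity in the previous iteration, $\mathbf{c}_{S(c)}^{p}$ is the neighborhood vector computed over the sampled subset $S(c)$, and $\sigma$ is the nonlinear function, whose result is the final vector representation $\mathbf{c}^{p}$ of the countryside entity.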
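The aggregation step can be sketched as follows, assuming a sum-style aggregator (the entity vector and the neighborhood vector are added before the linear transformation) and `tanh` as the nonlinearity. Both choices, and the name `sum_aggregate`, are assumptions for illustration rather than details confirmed by the text.

```python
import numpy as np

def sum_aggregate(entity_vec, neigh_vec, W, b, activation=np.tanh):
    """Combine an entity vector with its neighborhood vector.

    Assumed form: activation(W @ (entity_vec + neigh_vec) + b).
    """
    return activation(W @ (entity_vec + neigh_vec) + b)

# Illustrative values: identity transform and zero bias.
entity_vec = np.array([0.2, -0.1])
neigh_vec = np.array([0.3, 0.4])
W = np.eye(2)
b = np.zeros(2)

out = sum_aggregate(entity_vec, neigh_vec, W, b)
```

With the identity matrix and zero bias, the output reduces to the activation of the element-wise sum, which makes the role of each term easy to check by hand.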
Finally, a loss function is used to calculate the difference between the predicted value $\hat{y}_{pc}$ and the true value $y_{pc}$, as described in steps 3 and 4, referring to Formula (7).
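One common way to measure this difference for binary match labels is cross-entropy with an L2 penalty on the parameters. The sketch below assumes that form; the function name `bce_loss`, the averaging over samples, and the regularization coefficient `l2` are assumptions and may differ from the loss actually used in KGCN4CEPR.

```python
import numpy as np

def bce_loss(y_true, y_pred, params=(), l2=0.0, eps=1e-12):
    """Mean binary cross-entropy plus an optional L2 penalty (assumed form)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions so that log(0) never occurs.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    ce = -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    reg = l2 * sum(float(np.sum(w ** 2)) for w in params)
    return float(ce + reg)

uninformed = bce_loss([1, 0], [0.5, 0.5])   # equals ln 2
confident = bce_loss([1, 0], [0.99, 0.01])  # much smaller
```

Gradient descent in step 3 would then update the embeddings, the transformation matrix, and the bias to reduce this value.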
In the following, we take the recommendation of a countryside ecological pattern for the specified countryside Hancunhe as an example to illustrate the implementation of KGCN4CEPR, which consists of four steps.
In step 1, we first calculate the weight of each edge adjacent to Hancunhe in the countryside knowledge graph. The weight represents the preference of the industrial development pattern for each relationship.
In step 2, the number of neighborhood aggregation iterations is 2, where the second layer contains “Beijing city”, “road length” and “area”. We normalize and weight these entities to obtain the aggregated vector of “Fangshan District”. The first layer contains “Fangshan District”, “area”, “population” and “industrial development pattern”. Similarly, we obtain the neighborhood vector $\mathbf{c}_{S(c)}^{p}$ of the central location “Hancunhe Countryside”. Then, we perform a message aggregation of the neighborhood vector $\mathbf{c}_{S(c)}^{p}$ of “Hancunhe Countryside” and the countryside entity vector $\mathbf{c}$ of “Hancunhe Countryside” to obtain the final vector representation $\mathbf{c}^{p}$ of “Hancunhe Countryside”.
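The two-layer aggregation for Hancunhe can be traced with a toy NumPy sketch. The entity names follow the example above, but everything else is an assumption made for illustration: the embeddings are random rather than learned, a sum-style aggregator with `tanh` is used, the two "area" properties share one vector for brevity, and the final sigmoid score is only one plausible way to turn the representation into a match probability.

```python
import numpy as np

rng = np.random.default_rng(42)
dim = 8

# Neighbor lists taken from the example; vectors below are random stand-ins.
graph = {
    "Fangshan District": ["Beijing city", "road length", "area"],
    "Hancunhe Countryside": ["Fangshan District", "area", "population",
                             "industrial development pattern"],
}
names = set(graph) | {t for tails in graph.values() for t in tails}
entity = {name: rng.normal(size=dim) for name in sorted(names)}
relation = {(h, t): rng.normal(size=dim) for h, tails in graph.items() for t in tails}

pattern_vec = rng.normal(size=dim)      # stands in for the pattern embedding
W = 0.1 * rng.normal(size=(dim, dim))   # linear transformation (untrained)
b = np.zeros(dim)

def aggregate(center, center_vec, neighbor_vecs):
    """Score edges, softmax-normalize, combine neighbors, then transform."""
    scores = np.array([relation[(center, t)] @ pattern_vec for t in graph[center]])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    neigh = weights @ np.stack(neighbor_vecs)
    return np.tanh(W @ (center_vec + neigh) + b)

# Second layer: update "Fangshan District" from its own neighbors.
fangshan = aggregate("Fangshan District", entity["Fangshan District"],
                     [entity[t] for t in graph["Fangshan District"]])

# First layer: update "Hancunhe Countryside", using the updated Fangshan vector.
first_layer = [fangshan if t == "Fangshan District" else entity[t]
               for t in graph["Hancunhe Countryside"]]
hancunhe = aggregate("Hancunhe Countryside", entity["Hancunhe Countryside"], first_layer)

# Assumed scoring: sigmoid of the inner product with the pattern vector.
score = 1.0 / (1.0 + np.exp(-(pattern_vec @ hancunhe)))
```

Because the parameters here are untrained, the resulting score carries no meaning; the sketch only shows the order in which the two layers are aggregated.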
Figure 6 describes the above aggregation operation of KGCN4CEPR. In Figure 6, there are two kinds of objects, i.e., entities and properties. The circles represent entities; the rectangles represent properties; the arrows indicate the direction of the aggregation operation; objects with the same color are involved in the same aggregation operation.
In step 3, the minimum value of the difference between the predicted value and the true value of “Hancunhe Countryside” is iteratively obtained.
Finally, in step 4, the pattern corresponding to the minimized result, i.e., the industrial development pattern, is recommended for “Hancunhe Countryside”.