1. Introduction
In today’s digital era, vast quantities of semi-structured and unstructured textual data are continuously generated on the internet. Amidst this deluge of information, it is imperative to extract the core, critical details and organize them into a suitable data structure. Google introduced the groundbreaking concept of the knowledge graph in 2012 [1], which has revolutionized the way we perceive and utilize data. Currently, entity relation extraction stands as one of the pivotal techniques for constructing large-scale knowledge graphs. Through feature modeling of the text, it transforms unstructured textual information into structured relational triplets, bridging the gap between raw data and meaningful insights.
The purpose of entity relation extraction is to extract subjects, objects, and the relationships between them from text, thereby forming relational triplets. Early entity relation extraction models primarily adopted a pipeline approach [2,3], which divides the task into two separate subtasks: entity recognition and relation classification. However, this method is prone to error propagation, which degrades model performance. To address this issue, subsequent research adopted a joint extraction approach [4,5], which extracts complete relational triplets through semantic feature modeling and therefore relies heavily on the performance of the semantic feature extraction model. Early feature modeling methods based on classical machine learning could not meet the requirements of entity relation extraction tasks; later research significantly improved performance by leveraging deep neural networks [6,7] to extract relational triplets.
Despite the remarkable progress achieved by joint extraction methods, current mainstream models still face challenges such as multiple-triplet extraction and triplet overlapping, which hinder further improvements in performance.
Figure 1 illustrates the overlapping issue in entity relation extraction tasks, as described by Miwa and Bansal [8] and Zeng et al. [9]. The entity pair overlap (EPO) issue refers to the presence of multiple implicit relationships between a subject and an object. Single entity overlap (SEO), on the other hand, refers to a situation where a subject has relational information with multiple objects. Traditional joint extraction methods have difficulty addressing these overlapping issues effectively. Related models use the Seq2Seq framework to generate all relational triplets and thereby address overlapping [8,9]. In subsequent research, Fu et al. [10] proposed constructing a graph convolutional neural network over a graph data structure to address overlapping. Wei et al. [11] proposed the CasRel framework, which divides the entity relation extraction task into subject identification and object-relation identification. This approach enables the matching of a single subject with multiple objects, thereby addressing the overlapping issue. The CasRel framework has since been widely applied to entity relation extraction tasks and has achieved significant progress.
While the aforementioned methods have addressed the overlapping issue to a certain extent, most existing models overlook the dependency relationships that exist among subjects, objects, and subject–object pairs. In addition, mainstream models often focus solely on extracting semantic features or optimizing extraction methods, neglecting the impact of positional features in the text on entity relation extraction. Therefore, this paper proposes a method for entity relation extraction based on subject position complex exponential embedding, named SPECE (subject position encoder in complex embedding for relation extraction). Firstly, we utilize the BERT pre-trained model and a residual dilated convolutional neural network to extract semantic features. Then, a complex exponential embedding approach is proposed to combine the positional encoding features with the textual features. Compared to previous models, our model incorporates subject information into the entity relation extraction task; furthermore, it exploits the positional information of the subject and introduces a novel embedding approach to achieve feature fusion.
The main contributions of this paper are as follows:
(1) Designing an encoding layer based on BERT and DGCNN to extract semantic features, alleviating the issue of long-distance dependencies in the model and enhancing its performance when dealing with long texts.
(2) Designing a subject position-based encoding method and proposing a complex exponential embedding technique to fuse subject features with textual semantic features.
(3) Compared to other baseline models, our proposed SPECE model demonstrates significant improvements in the F1 score on both datasets. Additionally, we conducted ablation experiments to verify the effectiveness of the subject position encoding and complex exponential embedding techniques.
This article is divided into six sections.
Section 1 is an introduction to relationship extraction.
Section 2 is related work, introducing recent issues and methods related to triplet extraction.
Section 3 introduces the various parts of the model.
Section 4 presents the completed relevant experiments and their results.
Section 5 is an analysis of the experimental results.
Section 6 is a conclusion and outlook on this method.
2. Related Works
Early models used a pipeline approach to divide the task of entity relation extraction into two parts: entity recognition and relation classification. However, this method suffers from error propagation, which limits the performance improvement of the model. To solve this problem, scholars have proposed a joint extraction approach that utilizes an encoding layer to capture semantic feature information and designs a decoding layer to obtain complete relational triplets. Zheng et al. [12] designed the NovelTagging annotation structure and employed a joint extraction method to achieve relation extraction. The NovelTagging annotation structure is inspired by the BIEOS (begin, inside, end, other, single) tagging approach commonly used in named entity recognition, and additionally introduces “1” and “0” labels to distinguish between the subject and object entities within a relational triplet. This method transforms the complex task of entity relation extraction into a relatively simpler label classification task.
As the joint extraction method requires the direct prediction of complete relational triplets, models that employ this approach rely heavily on semantic feature extraction. In recent years, scholars have widely adopted large-scale language models, such as ELMo (embeddings from language models) and BERT (bidirectional encoder representations from transformers), to extract semantic features. These models accurately capture meaningful information from text, significantly enhancing the performance of various natural language processing tasks, including entity and relation extraction. Huang et al. [13] introduced the BERT pre-trained model as an encoding layer to capture semantic information from sentences, using a pipeline approach to achieve relation extraction. Wadden et al. [14] utilized the BERT pre-trained model as the encoding layer for the entity relation extraction task; they dynamically generated entity-relation graphs based on textual features and introduced graph convolutional networks (GCNs) to achieve relation extraction.
Despite the significant performance improvements achieved by entity relation extraction models based on language models, most of them overlook the issues of multiple-triplet extraction and overlapping. Certain end-to-end entity relation extraction models perform poorly when predicting multiple triplets; consequently, the overlapping problem significantly hinders the optimization of model performance. Therefore, Zeng et al. [15] employed a Seq2Seq model and introduced a copy mechanism to predict the subject, relation, and object information in sequence. Fu et al. [10] proposed utilizing a graph data structure to record entity information and employed graph convolutional neural networks to achieve entity relation extraction. Ye et al. [16] proposed a generative approach based on the Transformer [17] architecture for predicting relational triplets. To address the challenge of predicting multiple triplets with end-to-end models, Wei et al. [11] proposed the CasRel framework, which transforms entity relation extraction into subject recognition, object identification, and relation identification. This approach allows a single subject to be matched with multiple objects, enabling the extraction of multiple triplets. In addition, Li et al. [18] proposed a parallel relation extraction framework named LAPREL, which integrates label information into the sentence-embedding process through a label-aware mechanism; an entity recognition module further refines fuzzy entity boundaries, resulting in more accurate extraction of relational triplets. Liao et al. [19] proposed SCDM, based on span representations and cascaded dual decoding. Lai et al. [20] proposed a joint entity and relation extraction model called RMAN, in which two multi-head attention layers incorporate additional semantic information and enhance effectiveness. Both the SCDM and RMAN models have demonstrated the significance of semantic features in addressing the triplet overlapping issue.
Indeed, the models mentioned above primarily focus on extracting semantic features from the text to enhance their performance in entity-relation extraction tasks. However, they tend to overlook the significance of subject features and positional information in this process. In this paper, we propose a subject position complex exponential embedding-based entity-relation extraction model (SPECE). This model utilizes BERT and DGCNN as its encoding layers. The decoding end consists of a subject recognition module, a position encoding embedding module, and an object and relation recognition module. In the following sections, we will delve into the details of these modules.
3. Model
This section provides a detailed introduction to the various modules and structural characteristics of the SPECE model.
Figure 2 shows the complete structure of our model. The goal of the entity relation extraction task is to extract the relational triplets $(s, r, o)$ contained in unstructured text, in which $s$ represents the subject, $r$ represents the relation, and $o$ represents the object. Firstly, we transform the input text into semantic feature information $\mathbf{h}$ through an encoding layer. Then, based on the semantic features $\mathbf{h}$, the model designs a subject recognition module to predict the start and end position information of the subject. In addition, the input text is encoded according to the subject’s location information to obtain the subject position code $PE$. Then, we use an embedding layer to extract the positional feature information. Our model proposes a complex exponential embedding method to obtain the deeper-level feature information $\mathbf{h}^{pos}$. Finally, we design an object and relation recognition module to identify the start and end positions of the object and the corresponding relation $r$.
3.1. Encoder
3.1.1. BERT
Our model uses the word segmentation and embedding layers provided by BERT [21] to map the input text into vectors and obtains the semantic feature information $\mathbf{h}$ through a multi-layer Transformer model.
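As a concrete illustration, the following is a minimal sketch of this encoding step using the HuggingFace transformers library; the checkpoint name and example sentence are our assumptions for illustration, not details fixed by the paper.

```python
# Minimal sketch: obtain token-level semantic features h from BERT.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")  # assumed checkpoint
encoder = BertModel.from_pretrained("bert-base-cased")

text = "Jackie R. Brown was born in Washington."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    h = encoder(**inputs).last_hidden_state  # (1, seq_len, 768) token features
```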
3.1.2. DGCNN
At present, natural language processing models typically utilize temporal (sequential) networks to extract semantic features from text. However, temporal networks face challenges such as gradient vanishing and gradient explosion. In response to this issue, Zeng et al. [6] proposed a convolutional neural network based on position embedding to extract relational triplets. It is worth noting that our model uses dilated convolutions instead of traditional convolutions: a dilated convolution expands the receptive field of the convolution kernel by inserting several untrained zeros between the kernel weights, thereby perceiving a larger range of the input. In addition, considering the information transmission problem of deep neural networks, we propose a dilated convolutional neural network based on a residual threshold mechanism. The overall network structure is shown in Figure 3.
The DGCNN calculation is as follows:

$$\sigma = \mathrm{Sigmoid}\big(\mathrm{Conv1D}_2(X)\big),$$
$$Y = \mathrm{Conv1D}_1(X) \otimes \sigma + X \otimes (1 - \sigma),$$

where $X \in \mathbb{R}^{L \times d}$ is the input feature, and $L$ and $d$ are the length and dimension of the input feature. Inspired by Dauphin et al. [22], the DGCNN model uses a threshold mechanism to control the degree of feature retention and forgetting.

Firstly, the DGCNN model introduces a dilated convolutional branch $\mathrm{Conv1D}_1$ to capture long-distance textual feature information and incorporates a second dilated convolutional branch $\mathrm{Conv1D}_2$ to extract the threshold feature $\sigma$. The threshold feature controls the retention degree of the first branch’s output features and the forgetting degree of the input features. The two dilated convolution components in the DGCNN model have kernel parameters of the same form, with weight $W \in \mathbb{R}^{k \times d \times d}$ and bias vector $b \in \mathbb{R}^{d}$, where $d$ is the input feature dimension and $k$ is the convolutional kernel size. Then, the Sigmoid function is used as the nonlinear activation of the threshold feature $\sigma$, which serves as the memory coefficient of the dilated convolutional feature $\mathrm{Conv1D}_1(X)$. In addition, the forgetting coefficient $1 - \sigma$ of the dilated convolution is regarded as the memory coefficient of the input feature $X$. Finally, the DGCNN model multiplies each of the two features by its respective memory coefficient and sums them to obtain the semantic feature $Y$ as the output of the DGCNN model.
The DGCNN network utilizes dilated convolutions to extract temporal feature information from distant input text. This approach not only addresses the problem of distant dependencies in temporal models but also surpasses the limitations of standard convolutional networks. These networks are limited by the size of convolution kernels and struggle to capture distant textual feature information. On the other hand, the DGCNN model introduces a residual threshold mechanism to enhance model performance by regulating the information flow within the model. The residual mechanism allows each piece of feature information to flow to the downstream network, enhancing the model’s convergence ability and reducing overfitting.
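A minimal PyTorch sketch of this residual gated dilated convolution is given below; the class name, default kernel size, and dilation value are illustrative assumptions rather than the authors’ released implementation.

```python
# Minimal sketch of a DGCNN-style block: two dilated Conv1d branches,
# one producing features and one producing the threshold (gate) sigma,
# mixed with the input through a residual gate.
import torch
import torch.nn as nn

class DGCNNBlock(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        pad = (kernel_size - 1) // 2 * dilation  # keep sequence length unchanged
        self.conv = nn.Conv1d(dim, dim, kernel_size, padding=pad, dilation=dilation)
        self.gate = nn.Conv1d(dim, dim, kernel_size, padding=pad, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x.transpose(1, 2)                          # (batch, dim, seq_len) for Conv1d
        sigma = torch.sigmoid(self.gate(h))            # threshold / memory coefficient
        y = self.conv(h) * sigma + h * (1.0 - sigma)   # Y = Conv(X)*sigma + X*(1-sigma)
        return y.transpose(1, 2)                       # back to (batch, seq_len, dim)
```

Stacking several such blocks with increasing dilation rates (e.g., 1, 2, 4) widens the receptive field rapidly, while the residual path lets input information flow to the downstream network.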
3.2. Decoder
3.2.1. Subject Tagger
This article uses the CasRel [11] framework as the decoding end of the model. This framework uses a “1” to mark the start and end positions of the subject and a “0” to mark all other positions in the input text. Based on this annotation structure and the output feature $\mathbf{h}$ of the encoding layer, we construct a binary classifier to predict the probability of each input character being the start or end position of the subject. The specific operation is as follows:

$$p_i^{start\_s} = \mathrm{Sigmoid}\big(W_{start}\,\mathbf{h}_i + b_{start}\big), \qquad p_i^{end\_s} = \mathrm{Sigmoid}\big(W_{end}\,\mathbf{h}_i + b_{end}\big),$$
where $p_i^{start\_s}$ and $p_i^{end\_s}$ represent the probabilities of the $i$-th position being the start and end position of the subject, respectively. During training, the model uses the following maximum likelihood function to calculate the loss value for the subject recognition task:

$$p_{\theta}(s \mid \mathbf{x}) = \prod_{t \in \{start\_s,\, end\_s\}} \prod_{i=1}^{L} \big(p_i^{t}\big)^{\mathbb{1}\{y_i^{t} = 1\}} \big(1 - p_i^{t}\big)^{\mathbb{1}\{y_i^{t} = 0\}}, \tag{7}$$

where $L$ is the length of the input text sequence, $y_i^{start\_s}$ and $y_i^{end\_s}$ are the label values at the $i$-th position, and $p_i^{t}$ is the predicted value at that position. The indicator $\mathbb{1}\{z\}$ takes the value 1 when $z$ is a positive sample, and 0 otherwise. The semi-joint extraction method uses the maximum likelihood function shown in Equation (7) to calculate the loss value and optimize it.
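The following is a minimal sketch (shapes assumed) of these two pointwise binary classifiers; maximizing Equation (7) is equivalent to minimizing binary cross-entropy over the 0/1 start/end labels.

```python
# Minimal sketch of the subject tagger: two per-token binary classifiers
# over the encoder features h of shape (batch, seq_len, dim).
import torch
import torch.nn as nn

class SubjectTagger(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.start = nn.Linear(dim, 1)
        self.end = nn.Linear(dim, 1)

    def forward(self, h: torch.Tensor):
        p_start = torch.sigmoid(self.start(h)).squeeze(-1)  # P(token = subject start)
        p_end = torch.sigmoid(self.end(h)).squeeze(-1)      # P(token = subject end)
        return p_start, p_end
```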
3.2.2. Subject Position Encoder
Considering the correlated information that exists between the subject features and the text features, our model incorporates subject features into the entity relation extraction task. Mainstream models tend to extract the semantic features of the subject by constructing a semantic feature network while ignoring the positional information of the subject. Dufter et al. [23] argue that without position information, the meaning of a sentence is not well defined. Ke et al. [24] hold that the information conveyed by a character is closely correlated with its position within the text, and suggest that extracting the correlation information between characters through position embedding and related methods can enhance a model’s feature extraction capability. Raffel et al. [25] observe that the positional relationships between characters also affect the expression of their semantic features, and that extracting relative position information can capture the correlation information between texts.
Based on the theoretical analysis of the location features mentioned above, this paper proposes a model for entity relation extraction that relies on the location features of the subject and utilizes the complex exponential embedding method to fuse location features with text semantic features. First, we obtain the start and end location information of the subject through the subject recognition module. Then, we perform location encoding on the input text according to the absolute distance between each character and the subject text. Specifically, characters located before the subject are marked with the distance between the current character and the first character of the subject, while characters located after the subject are marked with the distance between the current character and the last character of the subject. It is worth noting that this annotation structure marks the location codes of all subject characters as 0. However, it is difficult to directly apply this location encoding to entity relation extraction models. On the one hand, location encoding is influenced by the text length, which can result in large encoded values whose direct use may cause gradient explosion. On the other hand, the location encoding is only one-dimensional, making it challenging to capture rich semantic features. Therefore, our model introduces an embedding layer to map the integer location codes to multidimensional feature vectors. The embedding layer introduces a trainable matrix, enabling the model to iteratively optimize its parameters and thus fit the location feature information more accurately.
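A minimal sketch of this subject-relative location encoding is shown below; the function name and conventions are ours for illustration.

```python
# Minimal sketch: tokens before the subject are coded by their distance to
# the subject's first token, tokens after by their distance to its last
# token, and subject tokens themselves get 0.
def subject_position_code(seq_len: int, sub_start: int, sub_end: int) -> list[int]:
    codes = []
    for i in range(seq_len):
        if i < sub_start:
            codes.append(sub_start - i)   # distance to first subject token
        elif i > sub_end:
            codes.append(i - sub_end)     # distance to last subject token
        else:
            codes.append(0)               # inside the subject span
    return codes

# e.g. seq_len=8 with the subject spanning tokens 3..4 -> [3, 2, 1, 0, 0, 1, 2, 3]
```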
3.2.3. Complex Embedding
The position-encoded features generated from trainable matrices can effectively extract spatial information from text. However, this method can only optimize the parameters of a single spatial feature during training and cannot learn the correlation information between position-encoded features. In addition, common feature fusion methods do not consider the correlation between semantic features and spatial features. Inspired by Wang et al. [26], our model proposes a feature embedding method based on complex exponentials. Firstly, to extract the relative position information of the spatial encoding features, the method utilizes Equation (8) as the mapping function for the position-encoded features, which adheres to the equality relations specified in Equations (9) and (10):

$$g(pos) = e^{\,i\,\omega\, pos}, \tag{8}$$
$$g(pos + k) = T_k \cdot g(pos), \tag{9}$$
$$T_k = e^{\,i\,\omega k}. \tag{10}$$
According to Equations (9) and (10), in the complex exponential-based position encoding feature embedding method, the position encoding feature of the $(pos+k)$-th position can be obtained by a linear mapping of the $pos$-th position through the operator $T_k$. The value of the operator $T_k$ is determined only by the relative position $k$ between texts, so this embedding method can reflect the relative position feature information between position encoders. Our model expands the mapping function in Equation (8) to obtain Equation (11):

$$g(pos) = A\, e^{\,i\,(\omega\, pos + \theta)} = A\big(\cos(\omega\, pos + \theta) + i\,\sin(\omega\, pos + \theta)\big). \tag{11}$$
As shown in Equation (11), the mapping function based on the complex exponential includes three parameters that can be optimized: the amplitude $A$, the initial phase $\theta$, and the frequency $\omega$ (which determines the period). In order to enhance the performance of the model, we use trainable parameters to fit these three feature parameters, enabling the model to extract more accurate parameter information during iterative training. Our model utilizes the encoding layer features and the position encoding features: it introduces a linear layer to produce the amplitude information $A$, utilizes the word segmentation information of the input text $\mathbf{x}$ with an embedding layer to produce the frequency parameter information $\omega$, and utilizes the position encoding information $PE$ with an additional embedding layer to produce the initial phase parameter information $\theta$. Finally, the position encoding features are combined with the text semantic features to obtain the output features $\mathbf{h}^{pos}$ of the position encoding feature embedding method, as shown in Equations (12)–(14):

$$A = W_A\,\mathbf{h} + b_A, \tag{12}$$
$$\omega = \mathrm{Emb}_{\omega}(\mathbf{x}), \qquad \theta = \mathrm{Emb}_{\theta}(PE), \tag{13}$$
$$\mathbf{h}^{pos} = A \otimes e^{\,i\,(\omega\, \otimes\, PE + \theta)}. \tag{14}$$
The position encoding embedding technique based on complex exponential not only utilizes the characteristics of exponential operations to extract relative spatial feature information about positions but also addresses the limitation of training-based position encoding features that can only be trained independently. In addition, the method introduces trainable parameters based on the characteristics of complex exponential functions. This allows it to optimize parameter information through iterative training, thereby more accurately extracting positional feature information and enhancing the model’s generalization ability.
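A minimal PyTorch sketch of this fusion step is given below; the shapes, layer names, and the cosine (real-part) realization of the complex exponential are our assumptions for illustration, not the authors’ implementation.

```python
# Minimal sketch of the complex exponential embedding: amplitude A from a
# linear layer over encoder features, frequency w from a token-id embedding,
# phase theta from a position-code embedding; the real part of
# A * exp(i * (w * pos + theta)) fuses position with semantics.
import torch
import torch.nn as nn

class ComplexPositionEmbedding(nn.Module):
    def __init__(self, vocab_size: int, max_dist: int, dim: int):
        super().__init__()
        self.amp = nn.Linear(dim, dim)             # amplitude A from h     (Eq. 12)
        self.freq = nn.Embedding(vocab_size, dim)  # frequency w from token ids
        self.phase = nn.Embedding(max_dist, dim)   # phase theta from position codes

    def forward(self, h, token_ids, pos_codes):
        # h: (batch, seq_len, dim); token_ids, pos_codes: (batch, seq_len) long
        A = self.amp(h)
        w = self.freq(token_ids)
        theta = self.phase(pos_codes)
        angle = w * pos_codes.unsqueeze(-1).float() + theta
        return A * torch.cos(angle)  # real-valued realization of A * exp(i*angle)
```

Taking the cosine (real) component keeps the output in real space so it can feed the downstream taggers directly.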
3.2.4. Relation-Object Tagger
Based on the output feature $\mathbf{h}^{pos}$ of the complex exponential embedding method, our model constructs the same structure as the subject recognition module for each predefined relation, to predict whether corresponding object information exists under that relation. The specific calculation process is as follows:

$$p_i^{start\_o} = \mathrm{Sigmoid}\big(W_{start}^{r}\,\mathbf{h}_i^{pos} + b_{start}^{r}\big), \tag{15}$$
$$p_i^{end\_o} = \mathrm{Sigmoid}\big(W_{end}^{r}\,\mathbf{h}_i^{pos} + b_{end}^{r}\big), \tag{16}$$
where $p_i^{start\_o}$ and $p_i^{end\_o}$ represent the probabilities of the $i$-th position being the start and end positions of the object when predicting the $r$-th relation. Our model also uses the maximum likelihood function in Equation (17) to calculate the loss values for the object and relation prediction tasks:

$$p_{\phi_r}(o \mid s, \mathbf{x}) = \prod_{t \in \{start\_o,\, end\_o\}} \prod_{i=1}^{L} \big(p_i^{t}\big)^{\mathbb{1}\{y_i^{t} = 1\}} \big(1 - p_i^{t}\big)^{\mathbb{1}\{y_i^{t} = 0\}}. \tag{17}$$

Combining the prediction results of the subject recognition module and the object and relation recognition module, we use the following loss function to calculate the total loss value:

$$J(\Theta) = \sum_{(\mathbf{x}_j,\, T_j) \in D} \Big[ \sum_{s \in T_j} \log p_{\theta}(s \mid \mathbf{x}_j) + \sum_{(r,\, o) \in T_j \mid s} \log p_{\phi_r}(o \mid s, \mathbf{x}_j) \Big],$$
where $D$ represents the training dataset, $\mathbf{x}_j$ is the $j$-th sentence in the dataset, and $T_j$ represents the set of subjects $s$ and relations $r$ (with their corresponding objects $o$) contained in the $j$-th sentence.
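A minimal sketch (shapes assumed) of this relation-specific object tagger follows: one start/end classifier pair per predefined relation, applied to the fused features $\mathbf{h}^{pos}$.

```python
# Minimal sketch of the relation-object tagger: per-relation start/end
# probabilities over the fused features h_pos (batch, seq_len, dim).
import torch
import torch.nn as nn

class RelationObjectTagger(nn.Module):
    def __init__(self, dim: int, num_relations: int):
        super().__init__()
        self.start = nn.Linear(dim, num_relations)
        self.end = nn.Linear(dim, num_relations)

    def forward(self, h_pos: torch.Tensor):
        p_start = torch.sigmoid(self.start(h_pos))  # (batch, seq_len, num_relations)
        p_end = torch.sigmoid(self.end(h_pos))
        return p_start, p_end
```

During decoding, any relation whose start/end probabilities exceed a threshold for some span yields a triplet with the current subject, which is how one subject can be matched with multiple objects.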
5. Discussion
Introducing subject position encoding features and using the corresponding embedding algorithm can effectively enhance the model’s performance. On both the NYT and WebNLG datasets, the F1-score of the SPECE model improved: in comparison to the SCDM model, SPECE increased the F1-score by 0.3% and 0.7% on the respective datasets. These results indicate that utilizing the location feature information of the subject can significantly enhance model performance compared to models designed with complex structures such as RMAN, LAPREL, and SAHT. Our model utilizes complex exponential embedding to fuse positional features with text semantic features; a trainable matrix is then introduced into the embedding layer, and its parameters are obtained through iterative optimization, so that the positional information of the subject features can be fitted more accurately. This confirms that our model achieves better performance in entity relation extraction tasks.
Meanwhile, in the ablation experiments, we found that incorporating subject position encoding features into the entity relation extraction model can enhance performance. The REPEemb model, which introduces subject position encoding features through linear addition, achieved a modest improvement in F1-score on the NYT dataset; however, its performance did not change significantly on the WebNLG dataset. These results indicate that introducing spatial feature information alone is not sufficient to enhance the model’s capability to extract relational triplets: a well-designed position embedding method is needed to effectively utilize this feature information. In addition, comparing the experimental results of the SPECE model with those of its ablated variant demonstrates that introducing trainable parameters for position encoding feature extraction can effectively enhance the model’s generalization ability and the effectiveness of the complex exponential embedding method.
In addition, we experimentally analyzed the performance of the model when dealing with entity-overlapping sentences and sentences containing multiple relational triplets. On the NYT dataset, the SPECE model performs significantly better than the other models on normal sentences, and its F1-score also improves when dealing with the EPO and SEO problems. On the WebNLG dataset, the SPECE model performs best on normal sentences and on the SEO problem. In dealing with the multi-triplet problem, better performance is obtained because the method fits the positional feature information more accurately. These experiments demonstrate the applicability of the proposed SPECE model to both the entity overlapping problem and the multi-triplet problem.