Decentralized Federated Learning-Enabled Relation Aggregation for Anomaly Detection

Abstract: Anomaly detection plays a crucial role in data security and risk management across various domains, such as financial insurance security, medical image recognition, and Internet of Things (IoT) device management. Researchers rely on machine learning to address potential threats and enhance data security. In the financial insurance industry, enterprises tend to leverage the relation mining capabilities of knowledge graph embedding (KGE) for anomaly detection. However, auto insurance fraud labeling relies heavily on manual labeling by experts. The efficiency and cost of labeling keep auto insurance fraud detection a small-sample detection challenge. Existing schemes, such as transfer learning and data augmentation methods, are susceptible to local characteristics, which leads to poor generalization performance. To improve generalization, the recently emerging Decentralized Federated Learning (DFL) framework provides new ideas for mining more fraud through the joint cooperation of companies. Based on DFL, we propose a federated framework named DFLR for relation embedding aggregation. This framework trains the private KGE of each auto insurance company locally on the client and dynamically selects servers for relation aggregation, with the aim of privacy protection. Finally, we validate the effectiveness of the proposed DFLR on a real auto insurance dataset. The results show that the cooperative approach provided by DFLR improves the client's ability to detect auto insurance fraud compared to single-client training.


Introduction
The development of Internet technology has made digitized data and information easy to transmit and analyze, and the subtle connections between data and information easier to mine [1]. At the same time, hidden crises and potential risks, such as abnormal data and fraudulent behavior, are mixed in. Fraud detection in the financial field, device quality monitoring in the IoT industry, disease diagnosis in the healthcare field, and intrusion detection in network security all rely on anomaly detection to ensure system reliability and data integrity. In every one of these sectors, anomalies involve serious economic losses and crises of trust. Therefore, the research and development of effective detection mechanisms for the management and analysis of digitized information have become crucial.
However, anomaly samples are still rare, and traditional auto insurance fraud detection relies directly on expert manual review. This results in extremely inefficient fraud detection [2]. To reduce human error and missed inspections, insurance companies have started to leverage the automated intelligence of machine learning [3]. In addition to using unsupervised learning, semi-supervised learning, and other methods to improve the performance of anomaly detection models, researchers also use the following methods: (1) Data generation: generating abnormal data through transformation, expansion, or learning from existing data for sample expansion; for example, Zhang et al. utilized MetaGAN, based on a Generative Adversarial Network, to generate images that strengthen the performance of sample-level image classification [4]. (2) Transfer learning [5]: learning anomaly detection models from the original domain dataset and transferring them to the target domain. (3) Active learning [6]: improving model performance by intelligently selecting which samples should be labeled, thereby reducing dependence on labeled data. (4) Cooperative training [7]: collaborating between different data holders to jointly build anomaly detection models, which can also help solve the problem of data scarcity.
The centralized training of data can indeed improve the efficiency of detection, but it completely disregards privacy concerns [8]. Especially in the financial insurance industry, where a substantial amount of customer information is involved, data sharing must be carried out under the precondition of privacy protection.
In the insurance industry, with the rapid increase in the number of insured motor vehicles and the swift development of the auto insurance business, auto insurance fraud has become one of the biggest threats to the industry [9]. There are numerous ways to commit fraud, including faking the scene of a car accident, using fake license plates, and colluding with repair shops to exaggerate damages [10]. According to the annual property and casualty insurance claims service report [11] released by the China Life Insurance (Group) Company in 2022, annual claims amounted to 59.23 billion CNY, of which 40.79 billion CNY was paid out for auto insurance.
For the struggling auto insurance industry, cracking down on fraud is urgent. In order to protect the legitimate rights and interests of consumers and insurance companies, different countries have formulated a large number of rules and regulations. For instance, the "Regulations on Compulsory Insurance for Motor Vehicle Liability Insurance" issued by China provide a basis for handling insurance disputes in all aspects, from insurance coverage and compensation to penalties. To actively fight against auto insurance fraud, the United States has set up the National Insurance Crime Bureau, armed with a series of bills such as the "Pre-Claims Underwriting Inspection Act" and the "Motor Vehicle Claims Information Collection Act".
In addition to solidifying the institutional foundation and strengthening the construction of the insurance industry team, various companies are promoting the in-depth development of anti-insurance-fraud work with technology-enabled fraud detection [12]. The means are evolving from traditional detection methods to efficient technology-assisted ones, in the following respects:

• Reduce over-reliance on high-quality expert experience. To save labor and time costs in the loss determination, pricing, and compensation process, these companies construct empirical information bases from expert knowledge and combine them with image recognition technology to help identify damaged parts and types of damage and to assess the extent of damage.

• Improve detection efficiency using machine learning. Insurance companies have extremely high requirements for the timeliness of auto insurance claims. To ensure high accuracy in fraud detection, the insurance company mines the features of historical cases and utilizes machine learning algorithms such as decision trees [13], random forests [14], logistic regression [15], and XGBoost [16,17] to train AI models that can be applied to reporting and surveying.

• Utilize graph structure to mine team fraud. Because insurance fraud in the current auto insurance industry is team-based and industrialized, inspectors may struggle to swiftly unravel the implicit connections of team fraud between cases. Therefore, companies leverage the network structure of graphs to mine additional feature factors from perspectives such as the incident location and reporting time. By identifying abnormal nodes, edges, and neighboring combinations, they aim to enhance the efficiency and success rate of team fraud detection [18].
We consider adopting collaborative learning to jointly combat fraud, subject to privacy protection requirements and limitations. However, as shown in Figure 1, the reality is that auto insurance data are usually distributed among different insurance companies and organizations. Since the data contain the policyholder's personal information and company confidential information, it is quite challenging for companies to put traditional centralized data inspection methods into effect in the real world. The auto insurance industry is urgently seeking an inspection method that balances the depth of cooperation with the efficiency of inspection and protects data privacy. The complex relations between auto insurance data naturally draw researchers' attention to the ability of the Knowledge Graph (KG) to mine hidden relations [19]. At the same time, the rise of Decentralized Federated Learning provides a new paradigm for data protection [20]. Inspired by this, our work aims to combine the privacy protection advantages of DFL with the high-quality representations of KGE. Our proposed DFL framework, DFLR, is a new solution designed for auto insurance fraud detection. The contributions of our work are summarized as follows:

• We propose a framework, Decentralized Federated Learning-enabled Relation aggregation (DFLR), based on relation embedding aggregation for auto insurance anomaly detection. Insurance companies participating in this framework take either the client role or the server role throughout the training. A client executes KGE training locally, while a server performs average aggregation on relation embeddings to enhance the expressive capability of the embeddings.

• We design a dynamic server selection mechanism that continuously reorganizes federation groups based on data dissimilarity among clients. This approach improves training efficiency and quality while ensuring privacy protection.

• We conduct experiments on a real auto insurance dataset. The effectiveness of DFLR is demonstrated through comparative experiments. In particular, the ComplEx+SVM model utilizing DFLR improves the average prediction precision to 0.6575, which is 0.0898 higher than the result of training only on private data.
The remainder of the paper is organized as follows: Section 2 describes related research on DFL and knowledge graph embedding; Sections 3 and 4 elaborate on the algorithm of the proposed DFLR framework; Section 5 mainly analyzes the results of comparative experiments; and Section 6 provides a summary of the paper as well as an outlook for future research.

Related Work

Decentralized Federated Learning
Artificial Intelligence is steadily developing, and data remain its underlying technical support. The quantity, quality, and dimension of data have become some of the most important factors constraining the progress of science and technology. Because data owners need to consider data security protection, competitive relations, and legal regulations when exchanging and sharing data, "data silos" arise between enterprises and industries [21]. How to share data safely and effectively has become a popular research topic.
In 2017, Google first proposed and constructed a Federated Learning (FL) framework to realize the idea of updating models locally [22]. They aimed to improve the accuracy of next-input prediction for Android users typing on their mobile terminals. Subsequently, a large number of scholars conducted more in-depth research on data security and personalized models. In 2019, Google released the first FL framework in the world, the TensorFlow Federated framework. In the same year, Professor Yang Qiang and his team open-sourced the first FL framework in China, named Federated AI Technology Enabler [23], as a secure computing framework to support the Federated AI system.
FL not only breaks through the "data silos" and "small data" limitations during the training process, but also ensures a certain degree of data privacy and security while benefitting all participants [24]. Because of this, it has attracted considerable attention from researchers in various fields. The framework has a wide range of applications, including medical image processing [25], auto plate recognition [26], and air handwriting recognition [27].
However, because FL depends heavily on the central server, it cannot cope with a single point of failure at that server, and thus DFL emerged [28]. A more decentralized federated aggregation is achieved through communication and interaction between participants. This framework has already been applied in several fields. Lu et al. [29] extracted medical patient features more securely by constructing a DFL model that conforms to realistic cooperation. In addition, Kalapaaking et al. [30] combined blockchain with FL to improve the security of the system by using the traceable and tamper-resistant characteristics of blockchain. They used blockchain with a Trusted Execution Environment to replace the central server and improve the fault tolerance and attack resistance of the system.
Compared with traditional FL, DFL has no central communication bottleneck, but it generates a large client-to-client communication overhead. To deal with this problem, Liu et al. [31] pioneered the application of the Lloyd-Max algorithm to DFL. They utilized the exchange of model information between neighboring nodes to adaptively adjust the quantization level, and succeeded in improving communication efficiency by reducing the amount of model parameter data transmitted in the federation. Sun et al. [32] investigated Decentralized FedAVG with Momentum (DFedAvgM), based on the FedAVG paradigm, which reduces the communication overhead by combining mixing matrices, momentum, multiple local iterations of client training, and quantization of the transmitted models.

Knowledge Graph Embedding
A Knowledge Graph, as a kind of mesh-structured database, manages loose multi-source heterogeneous data through a standardized structural organization of triples (head entity, relation, tail entity) [33]. With the advantages of a graph structure for reflecting and managing information and for supporting accurate positioning and searching, a KG provides strong underlying support for specific downstream applications such as Internet semantic search, personalized recommendation, intelligent question answering systems, and big data decision-making.
However, such a triple structure is difficult to process directly because its underlying symbolic representation has low portability. Therefore, researchers apply knowledge graph embedding to ensure computational simplicity by embedding entities and relations into a continuous low-dimensional vector space while preserving the structural information of the KG [34]. The entities and relations are reduced in dimension and stored in the low-dimensional space in the form of vectors, matrices, or tensors.
The training of KGE involves the semantic understanding level. It is necessary to consider how to extract relations and entities from non-aligned heterogeneous data, and how to understand the real meaning of different relations and entities for the alignment task [35].
Existing knowledge graph embedding methods are mainly categorized into three types: (i) Translational Distance Models; (ii) Semantic Matching Models; and (iii) Neural Network Models. Many models have been derived from these three types.
The Translational Distance Models are based on TransE [36] and extend to models such as TransR [37], RotatE [38], and HAKE [39]. This type of method defines the scoring function by modeling the relation as a transformation from the head entity to the tail entity and measuring the resulting distance, such as the Euclidean distance in TransE and the rotation transformation in RotatE. Semantic Matching Models construct score functions by measuring the plausibility of triples at the semantic level, and mainly include bilinear models such as RESCAL [40]. Because hyperbolic spaces lack a clear counterpart to the Euclidean inner product, Ivana Balažević et al. were the first to combine a bilinear model with the Poincaré ball in order to extend the calculation to hyperbolic spaces [41]. Their MuRP model can outperform Euclidean models on the link prediction task at lower dimensionality. The Neural Network Models score triples by feeding the embeddings of the head entity, relation, and tail entity into a neural network. Jiarui Zhang et al. [42] found that data-driven link prediction tasks rely on various labels and only utilize the structural information of the graph. Inspired by knowledge distillation, their DA-GCN makes use of logical rules to reduce the dependence of graph neural networks on data, and uses iterative rules to construct graph convolutional networks. The KGE trained by DA-GCN performs excellently in link prediction tasks.

Proposed Approach
Fraud in the insurance industry usually refers to the behavior of a policyholder, an insured party, or a beneficiary who obtains insurance benefits by means such as fabrication, fiction, exaggeration, and concealment [43]. Insurance fraud detection corresponds to the anomaly detection of entity nodes in a static graph [44]. Consider a scenario where a network represents various insurance claims, policyholders, accident vehicles, and insurance companies. Detecting insurance fraud is akin to identifying irregularities in the interactions between nodes. By treating entities as nodes and their relationships as edges in a graph, one can employ anomaly detection techniques to uncover unusual connections or behaviors. In this section, for the auto insurance anomaly detection task, we detail the overall framework of the proposed DFLR and introduce the method of server aggregation and the process by which clients update local embeddings. In addition, this section includes the server selection mechanism we designed for DFLR.
The training process for one round of decentralized federated training is shown in Figure 2. To facilitate the subsequent description of the algorithm and framework, we define the relevant terms and symbols in Table 1.

DFLR Framework
In the auto insurance industry, it is virtually impossible for car owners to sign up for the same car insurance policy with different insurance companies, owing to state laws and auto insurance claim rules. For example, as a rule, an owner can only carry one Compulsory Third Party Liability Insurance (CTPL) policy on a specific vehicle. Thus, there is little overlap of entities across companies, and cross-company federated aggregation of these entities makes little sense. In contrast, the relations between entities are similar across the companies' private data, such as "own", "policy holder", and "overhaul at". Most importantly, these simple relation embeddings hardly involve the details of the owner and vehicle. By sharing these relation embeddings for FL, the risk of a privacy breach is much lower. Therefore, as described in Algorithm 1, after locally training and updating the relation and entity embeddings, the client uploads the relation embeddings to the server. The server is responsible for aggregating the relation embeddings within the whole training framework.
In a real enterprise competition scenario, it is quite challenging for all auto insurance companies to agree to work closely together, and the depth of cooperation varies from company to company. Therefore, following the traditional FL framework and establishing only one central server is unrealistic when multiple companies cooperate. The introduction of the DFL model is realistic and reasonable. In DFL, the relationship between participants is freer and more flexible than the structure under the traditional FL framework. As shown in Figure 3, any client node in the DFL framework can act as a central server for federated aggregation, so its role can be either client or server. Accordingly, in order to better accomplish the auto insurance anomaly detection task, we focus on the following three issues: (i) the server aggregation algorithm; (ii) the local model of clients; and (iii) the server selection mechanism for each round of DFL.

Initialization (Algorithm 1): initialize the relation embeddings R_0 and select the server set S_0 for the first round of federation.

Server Aggregation
One of the key modules of the DFLR framework, the federated aggregation method, is based on FedAVG [45], proposed by McMahan et al. in 2017. Averaging helps the individual clients within a federation group achieve embedding consistency and cooperation. Under traditional FL, there are two types of roles: (i) client: a company or institution that participates in joint training; and (ii) server: a unique central server built after unified negotiation among all clients. However, in DFLR, the server is not uniquely determined. During the local training phase, each company maintains the client role, and during the federated aggregation phase, some companies transition from the client role to the server role through the server selection mechanism.
In the KG data of auto insurance, entities carry far more private information than relations, and the private relations of different clients overlap much more than their entities. We therefore chose relation embeddings as the data transmitted between clients and servers. Through the collection, average aggregation, and distribution of relation embeddings, joint training cooperation between clients is ultimately achieved.
During the initialization phase, the embeddings of all relations are randomly generated based on the relations owned by all clients, ensuring that the same relation has a consistent embedding across clients before training. Then, the first round of server role selection is carried out. For each client, the number of triples (?, r, ?) is counted for each relation type r, where (?, r, ?) represents the set of triples in which the relation is specified and the head and tail entities can be any entities. The cosine similarity between these statistics is used as the measure of similarity between client data. Using the K-means clustering method, the clients are divided into K groups. Subsequently, the client with the highest number of triples in each group is selected as the server for the initial round of federation, forming a set of servers S. Each server distributes relation embeddings to all clients within its federation group.
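The initial grouping described above can be sketched as follows. This is a minimal illustration rather than the exact procedure used in DFLR: it assumes each client is summarized by a vector of per-relation triple counts, approximates "K-means with cosine similarity" by a spherical K-means on unit-normalized vectors, and uses a deterministic farthest-point initialization. The names `kmeans_cosine` and `select_initial_servers` are hypothetical.

```python
import math

def unit(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def cosine(u, v):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(a * b for a, b in zip(u, v))

def kmeans_cosine(vectors, k, iters=20):
    """Spherical K-means (an assumption): assign by highest cosine similarity."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Farthest-point init: pick the vector least similar to current centroids.
        far = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(far)
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[j] = unit(mean)
    return labels

def select_initial_servers(relation_counts, k):
    """Group clients, then pick the client with the most triples per group."""
    names = sorted(relation_counts)
    labels = kmeans_cosine([unit(relation_counts[n]) for n in names], k)
    groups = {}
    for name, label in zip(names, labels):
        groups.setdefault(label, []).append(name)
    servers = {label: max(members, key=lambda n: sum(relation_counts[n]))
               for label, members in groups.items()}
    return groups, servers

# Hypothetical per-relation triple counts for four clients.
counts = {"A": [100, 5, 0], "B": [90, 10, 0], "C": [0, 5, 80], "D": [0, 10, 200]}
groups, servers = select_initial_servers(counts, k=2)
```

In this toy input, clients with similar relation distributions end up in the same federation group, and the client holding the most triples in each group takes the server role.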
The entity embeddings in the KG of the client are randomly generated locally, and the relation embeddings are distributed by the server.The client uses a KGE model to update all triple embeddings.After training with a specific number of epochs on triple embeddings, the client randomly perturbs relation embeddings.After noise processing, the client uploads relation embeddings to the server selected in the initial round.
The server is responsible for receiving the trained relation embeddings from all clients in its federation group. After receiving all of them, the server performs weighted average aggregation based on FedAVG and distributes the aggregated relation embeddings to its clients. Subsequently, the above training steps are repeated until the embedding quality no longer improves.
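A minimal sketch of this aggregation step is given below. It assumes, in the spirit of FedAVG, that each client's contribution to a relation is weighted by the number of local triples containing that relation, and that clients may lack some relations; the weighting choice and the function name `fedavg_relations` are assumptions for illustration.

```python
def fedavg_relations(uploads):
    """Weighted average of relation embeddings uploaded by the clients of one group.

    uploads: list of (embeddings, counts) pairs, where embeddings maps a
    relation name to its vector and counts maps a relation name to the
    number of local triples using it (the assumed aggregation weight).
    """
    totals, weight_sums = {}, {}
    for embeddings, counts in uploads:
        for rel, vec in embeddings.items():
            w = counts.get(rel, 0)
            if w <= 0:
                continue  # a client without triples for rel contributes nothing
            acc = totals.setdefault(rel, [0.0] * len(vec))
            for i, x in enumerate(vec):
                acc[i] += w * x
            weight_sums[rel] = weight_sums.get(rel, 0) + w
    return {rel: [x / weight_sums[rel] for x in acc]
            for rel, acc in totals.items()}

# Two hypothetical clients; only the first one holds the "insure" relation.
uploads = [
    ({"own": [1.0, 0.0], "insure": [2.0, 2.0]}, {"own": 1, "insure": 3}),
    ({"own": [3.0, 4.0]}, {"own": 3}),
]
aggregated = fedavg_relations(uploads)
```

A relation held by a single client is returned unchanged, while shared relations are pulled toward the clients that hold more triples for them.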
The servers selected in the initial round do not remain fixed. During the federated training process, the servers, as well as the federation group of each client, are regularly replaced through the server selection mechanism. The specific process is detailed in Section 3.1.3.

Client Training
The client applies the KGE training model as a way to extract features. We chose seven typical KGE models to train the triples (h, r, t). These models are listed in Table 2, along with their score functions. Additionally, we adopt the loss function in Equation (1), proposed by Zhiqing Sun et al. [38],
where (h, r, t′) means that the tail entity of a true triple is replaced with another tail entity t′ ∈ KG, γ denotes a margin hyperparameter, and [x]+ denotes the positive part of x. We construct suitable negative samples by randomly replacing tail entities with entities of the same class, for example replacing one Policy ID with a different Policy ID. The loss function tends to lower the score of a true triple. KGE training therefore widens the score gap between positive and negative samples, which improves the embedding quality of positive samples.
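A margin-based ranking loss consistent with the definitions above can be written as follows. This is a reconstruction from the surrounding description (margin γ, positive part [x]+, negative triples with a replaced tail, and a lower score for true triples), so the exact form of Equation (1) may differ; here f_r denotes the score function of the chosen KGE model, and KG′ denotes the set of negative triples.

```latex
L = \sum_{(h, r, t) \in KG} \; \sum_{(h, r, t') \in KG'}
    \left[ \gamma + f_r(h, t) - f_r(h, t') \right]_+
```

Under this form, a pair contributes no loss once the true triple scores at least γ lower than its negative counterpart.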
In the training phase, the negative samples, KG′, need to be constructed by randomly replacing the tail entity. However, negative triples generated purely at random are of poor quality: they do not help in training the true triples and can even slow down convergence. In order to effectively widen the score gap between positive and negative triple samples, it is important to conduct negative sampling strategically. The negative sampling we adopt follows the method proposed by Yongqi Zhang et al., which utilizes the concept of negative triple confidence to measure the quality of the negative triples [46]. First, negative triples generated by the uniform random method are used as negative sample candidates. Then, the candidates are ranked by confidence score, and the negative triple with the highest confidence is selected for training. According to the definition of triple confidence in CKRL, the confidence of a negative triple is calculated from a Q value that is based on the score function of the KGE model. When the client receives the relation embeddings transmitted from the server, it updates the local triple embeddings again using the above embedding training model.
After local training on the client, the relation embeddings are randomly perturbed by adding Gaussian noise to further ensure data privacy. With Gaussian noise, the embeddings received at the server become more blurred and uncertain. This strengthens data de-identification and makes it harder for attackers to use statistical information to infer the training data. The probability density function of the Gaussian distribution is defined as $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, where x is a random variable, µ is the mean value, and σ is the standard deviation. Afterwards, the client uploads the noised relation embeddings to the server.
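The perturbation step can be illustrated with a short sketch. The noise scale `sigma=0.01` and the function name `perturb_relations` are hypothetical; the text does not specify the values of µ and σ that are used.

```python
import random

def perturb_relations(embeddings, mu=0.0, sigma=0.01, seed=None):
    """Return a noised copy of the relation embeddings.

    Independent Gaussian noise N(mu, sigma^2) is added to every coordinate.
    The input dictionary is left untouched, so the client keeps its
    unperturbed embeddings for further local training.
    """
    rng = random.Random(seed)
    return {rel: [x + rng.gauss(mu, sigma) for x in vec]
            for rel, vec in embeddings.items()}

local = {"own": [0.5, -0.2], "insure": [0.1, 0.3]}
noised = perturb_relations(local, sigma=0.01, seed=7)
```

Only `noised` would be uploaded to the server. A larger `sigma` gives stronger obfuscation at the cost of noisier aggregated embeddings.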

Model Score Function
TransE [36]: TransE assumes that relations can be represented as translations between entities.It learns embeddings by minimizing the translation distance between the head and tail entities.
TransH [47]: TransH builds on TransE by introducing the concept of hyperplanes.Each relationship has a corresponding hyperplane, and entity embeddings are projected onto these hyperplanes for more flexible modeling.
TransF [48]: TransF is an improvement over TransH, using the Frobenius norm to measure the match between entities and relationships. Score function: $f = (h + r)^{\top} t + (t - r)^{\top} h$.
RotatE [38]: RotatE employs rotation operations and represents relationships as complex numbers. It models the interaction between entities by rotating them in the complex plane.
DistMult [49]: DistMult measures the match between entities and relationships by taking the dot product of their embeddings.
HolE [50]: HolE uses tensor representations to map entities and relationships to a high-dimensional space.It calculates match scores through convolution operations.
ComplEx [51]: ComplEx represents entities and relationships using complex numbers.It calculates match scores through complex multiplication, allowing for more flexible modeling of interactions.
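To make the difference between distance-based and matching-based scoring concrete, the sketch below implements the commonly cited forms of three of these models. Sign conventions vary between papers (for TransE a smaller distance means a more plausible triple, whereas for DistMult and ComplEx a larger score does), so these functions are illustrative and are not necessarily the exact entries of Table 2.

```python
import math

def transe_distance(h, r, t):
    """TransE: Euclidean norm of h + r - t (smaller means more plausible)."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def distmult_score(h, r, t):
    """DistMult: trilinear dot product sum_i h_i * r_i * t_i."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

def complex_score(h, r, t):
    """ComplEx: real part of sum_i h_i * r_i * conj(t_i) over complex vectors."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real

# A triple that TransE fits perfectly: h + r equals t.
d = transe_distance([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
s = distmult_score([1, 2], [3, 4], [5, 6])
c = complex_score([1 + 1j], [1j], [1 - 1j])
```

DistMult is symmetric in h and t, whereas the conjugate in ComplEx lets the score change when head and tail are swapped.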

Server Selection Mechanism
How to select the clients that transition from the client role to the server role is a key issue in a DFL framework.
During the initialization phase, as mentioned in the Server Aggregation section, the first round of federated server selection is based on the number of triples in each client's private KG. In the following training phase, the server selection mechanism runs automatically after every Z rounds of federation, and the specific process is as follows:
(1) Each client records the triple embeddings of the previous Z rounds of federation for subsequent traceability and comparison.
(2) If a client's highest F1-score for fraud prediction on the validation set in the latest Z rounds is lower than its highest F1-score in the previous Z rounds, this client did not benefit from its federation group. When more than P percent of the clients within one federation group have not benefited from training, all clients within that group are added to the Group Re-selected Client Sequences.
(3.a) For all clients in the Group Re-selected Client Sequences, the client with the highest computing power calculates the Euclidean distance between the highest-quality relation embeddings of the previous Z rounds for K-means grouping. After the federation groups are constructed, the server within each group is again selected based on the number of triples for the next round of federated training.
(3.b) If client training stagnation occurs only within a single federation group, each client in it is assigned to one of the other groups. The Euclidean distance is also used as the grouping basis. Here, the distance is calculated between the relation embeddings of the clients in the Group Re-selected Client Sequences and the relation embeddings of the clients that hold server roles in the other groups.
One can perform federation group updates and server selection through the above steps.
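Steps (2) and (3.b) can be sketched as follows. The sketch assumes that each client keeps a list of validation F1-scores with at least 2Z entries, that P is expressed as a fraction between 0 and 1, and that relation embeddings are flattened into a single vector per client before distances are computed; all function names are hypothetical.

```python
import math

def clients_not_benefiting(f1_history, z):
    """Clients whose best F1 in the latest z rounds is below the previous z rounds."""
    stalled = set()
    for client, scores in f1_history.items():
        latest_best = max(scores[-z:])
        previous_best = max(scores[-2 * z:-z])
        if latest_best < previous_best:
            stalled.add(client)
    return stalled

def groups_to_reselect(groups, f1_history, z, p):
    """Collect all members of groups where more than a fraction p has stalled."""
    stalled = clients_not_benefiting(f1_history, z)
    reselect = []
    for members in groups.values():
        fraction = sum(m in stalled for m in members) / len(members)
        if fraction > p:
            reselect.extend(members)
    return reselect

def nearest_server(client_vec, server_vecs):
    """Step (3.b): join the group whose server embedding is closest (Euclidean)."""
    return min(server_vecs, key=lambda s: math.dist(client_vec, server_vecs[s]))

groups = {0: ["A", "B"], 1: ["C", "D"]}
history = {
    "A": [0.30, 0.40, 0.35, 0.38],  # stalled: 0.38 < 0.40
    "B": [0.20, 0.30, 0.25, 0.28],  # stalled: 0.28 < 0.30
    "C": [0.20, 0.30, 0.35, 0.40],  # still improving
    "D": [0.50, 0.40, 0.45, 0.30],  # stalled: 0.45 < 0.50
}
sequence = groups_to_reselect(groups, history, z=2, p=0.5)
target = nearest_server([0.0, 0.0], {"S1": [1.0, 1.0], "S2": [0.1, 0.0]})
```

In this toy example only the first group exceeds the threshold, so its clients enter the re-selection sequence, and each of them would then be assigned to the group whose server is nearest in embedding space.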

Evaluation Metrics and Data
For this binary classification task of anomaly detection, the experimental evaluation indicators are listed in Table 3 according to the confusion matrix. They include the true positive case (TP), in which an actual fraudulent auto insurance claim is judged as fraudulent by the model; the false positive case (FP), in which a normal auto insurance claim is judged as fraudulent by the model; the true negative case (TN), in which a normal auto insurance claim is judged as normal by the model; and the false negative case (FN), in which a fraudulent auto insurance claim is judged as normal by the model.
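From these four counts, the usual definitions of precision, recall, F1-score, and accuracy follow directly. The short sketch below uses the standard formulas; returning 0.0 when a denominator is zero is a convention chosen here for illustration, and the counts in the example are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1-score, and accuracy from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# Hypothetical counts: 30 frauds caught, 10 false alarms, 10 frauds missed.
metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
```

Because fraudulent claims are rare, accuracy can stay high even when few frauds are caught, which is why precision, recall, and F1-score are reported alongside it.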
The experimental evaluation metrics are the classical indicators for binary classification problems: precision, recall, F1-score, and accuracy. The dataset used in the experiment is from a cooperating auto insurance company. By extracting the entities in auto insurance cases, the graph is organized as shown in Figure 4, which includes five categories, 14 types of entities, and some of the main relations between entities. Owing to the transitivity of relations, some relations are not shown in Figure 4. Through a series of matching and MD5 encryption operations, preliminary privacy protection was applied to data related to the personal information of the policyholder, plate numbers, driver's licenses, etc. After data pre-processing, the dataset consisted of 725,388 triples, 11 relations, and 81,381 entities.
To simulate the private data differences among clients, we divided the dataset by relation among 6 and 12 clients, and the average statistics of the resulting client KGs are shown in Table 4. We also randomly deleted some relations so that the relations are not completely consistent between clients. In addition, in order to ensure the quality of the embeddings, we split the dataset into training, validation, and testing sets in an 8:1:1 ratio.
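An 8:1:1 split of a client's triples can be sketched as follows. Whether the split is random, and how it is seeded, is not stated in the text, so the shuffling, the seed, and the function name `split_triples` are assumptions.

```python
import random

def split_triples(triples, ratios=(8, 1, 1), seed=0):
    """Shuffle the triples and split them into train/validation/test parts."""
    items = list(triples)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_valid = len(items) * ratios[1] // total
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]  # the remainder goes to the test set
    return train, valid, test

# Hypothetical triples of the form (head, relation, tail).
triples = [(f"owner_{i}", "own", f"vehicle_{i}") for i in range(100)]
train, valid, test = split_triples(triples)
```

Every triple lands in exactly one of the three parts, so validation and test triples are never seen during KGE training.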

Implementation
We set up three training modes to verify whether the DFLR method is effective: Single-Client, All-Clients, and DFLR-Clients. The meanings of these three modes are as follows:
• Single-Client: In this mode, there is no cooperation between clients, and each client is trained using only its own private data.
• All-Clients: In this mode, the data of all clients are simply centralized for training, without taking into account the differences between clients.
• DFLR-Clients: In this mode, clients train locally and cooperate through DFLR. In the federation phase, the server selection mechanism is performed first, and some of the clients switch to the server role. After collecting the relation embeddings uploaded by the clients in its group, the server performs average aggregation based on FedAVG and then distributes the result. Finally, each client evaluates the model on its own test dataset.
In the experiment, we constructed separate processes on GPU devices of the same type and configuration to simulate the clients' local training. An SVM anomaly detection model [52] was applied after local KGE training. We set the embedding dimension in all three modes to 256, the learning rate to 0.001, and the margin γ of Equation (1) to 1.0. For the DFLR-Clients mode, the client training batch size B was 512, the number of local training epochs E was five, and Z was four. For the SVM model, we set the C parameter to 0.01 and the gamma parameter to 0.001.
To assess the training efficiency of the federation rounds, it was set that when the accuracy rate was no longer decreasing within six rounds, the federated training was ended, and the experimental effect was evaluated using the above-mentioned metrics.The accuracy of the validation set was evaluated every 10 epochs for training modes Single-Client and All-Clients, and every 5 Federation rounds for DFLR-Clients.The model fitting effect was ensured by using the early stopping method, and the whole training session ended when the accuracy was no longer improved for 12 consecutive epochs/rounds on the validation set.The trained model was then used on the test set for metrics evaluation.
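The early stopping rule can be sketched as a small counter. One assumption is made explicit here: patience is counted in validation checks, because the text does not state whether the 12 epochs/rounds refer to training steps or to evaluations. The class name `EarlyStopping` is illustrative.

```python
class EarlyStopping:
    """Stop when validation accuracy has not improved for `patience` checks."""

    def __init__(self, patience=12):
        self.patience = patience
        self.best = float("-inf")  # best validation accuracy seen so far
        self.stale = 0             # consecutive checks without improvement

    def update(self, accuracy):
        """Record one validation accuracy; return True if training should stop."""
        if accuracy > self.best:
            self.best = accuracy
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```

A training loop would call `update` after each validation pass and break out of the loop as soon as it returns `True`, then evaluate the retained model on the test set.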

Experiments
Our experiment mainly focuses on three important questions: (i) verifying whether the DFLR training mode is effective; (ii) detecting the impact on a specific client as the number of federated participants increases; and (iii) detecting the impact on a specific client as the number of federation groups increases.

Verify the Effectiveness of DFLR Training Mode
In this experiment, we set the number of clients N to six and the number of federation groups K in DFLR-Clients mode to three. The results of the three training modes are listed in Tables 5 and 6. The scarcity of anomalous samples in the dataset resulted in relatively low F1-scores in these tables. However, the gaps between the predicted results under the different mode settings can still be used for analysis.
Table 5 shows the average results for six clients. This table shows that our proposed DFLR improved the average training performance of all clients compared to the Single-Client mode. In particular, under the DFLR framework, the TransH+SVM and HolE+SVM models performed even better than in the All-Clients mode. Across all DFLR-Clients training modes, the HolE+SVM model achieved the highest results, with 0.4822 on precision, 0.3823 on recall, and 0.4265 on F1-score. As these examples show, the DFLR-Clients mode can surpass the All-Clients mode in some cases. This may be because the DFLR-Clients mode enables clients to cooperate with clients that have higher embedding similarity. All-Clients, on the other hand, simply centralizes the clients' data and does not take into account the differences between clients. In the All-Clients mode, the results of some clients are pulled down, so the average results are relatively low.
Table 6 shows the test results of one of the clients, Client-1. The fraud detection results in DFLR-Clients mode show that the embedding quality of this client was also effectively improved through training. The ComplEx+SVM model even achieved the best result of the three modes in DFLR-Clients mode. This may be due to a significant deviation between the data distribution of Client-1 and that of the other clients in the All-Clients mode. The DFLR-Clients mode, in contrast, uses a reasonable grouping mechanism, thereby improving the prediction results of Client-1.
Based on the above results, we can verify the effectiveness of our proposed DFLR framework. The DFLR framework breaks through the limitations of single client training and benefits each participant through federated cooperation.
Considering that the premise of FL is that multiple clients choose to cooperate for mutual benefit, we conducted a comparative experiment on the number of clients N participating in the cooperation. In this experiment, we used the data split into 12 clients and set K = 2. For Client-1, we sequentially increased the number of clients participating in the DFLR. The accuracy of Client-1 is shown in Figure 5. It shows that, as N increases, the ability of a given participant to detect auto insurance fraud improves, which indicates an improvement in the quality of its KGE.
To observe the impact of K on a specific client, we used the data split into 12 clients and set K = 2, 3, 4, 6. Figure 6 shows the F1-score results for two clients. The linear fitting results indicate that increasing the number of federation groups can improve the performance of the client.
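As a quick consistency check on the HolE+SVM figures quoted above, the F1-score is the harmonic mean of precision P and recall R. Note that the reported values are averages over six clients, and the F1-score of averaged precision and recall need not equal the average of per-client F1-scores in general; here the two agree to four decimal places.

```latex
\[
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.4822 \times 0.3823}{0.4822 + 0.3823}
    = \frac{0.36869}{0.8645}
    \approx 0.4265
\]
```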
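The kind of linear fit referred to for Figure 6 can be sketched with ordinary least squares. The function name `linear_fit` is an assumption, and the y-values in the usage lines are synthetic points lying exactly on the line y = 0.5 K; they are not F1-scores from the paper.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # requires at least two distinct x values
    intercept = mean_y - slope * mean_x
    return slope, intercept


# K values used in the experiment, with synthetic y-values (not real results).
group_counts = [2, 3, 4, 6]
synthetic_scores = [1.0, 1.5, 2.0, 3.0]
slope, intercept = linear_fit(group_counts, synthetic_scores)
```

A positive slope corresponds to the trend described in the text, where performance rises as the number of federation groups grows. With only four K values, such a fit indicates a tendency rather than a statistically firm relationship.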

Conclusions and Discussion
The proposed DFLR framework takes into account how difficult it is to achieve unified cooperation among auto insurance companies in real situations, and adopts dynamic server changes to avoid the idealized cooperation assumed by traditional FL. The DFLR uses simple and efficient KGE models for information mining on the local client, and aggregates only the overlapping relations of auto insurance companies on the server to protect privacy. The experiments show that the DFLR framework achieves higher accuracy than single client training, which makes it more practical under real-world cooperative relations.
In the experiments, we did not fully consider data heterogeneity, but used data with the same structure on every client for the federated simulation. In real scenarios, different companies have different requirements for the storage format and structure of data, which largely prevents the algorithm from obtaining a qualitative improvement in real-time anomaly detection. Moreover, the heterogeneity among the data of different companies, including label shift, data volume imbalance, and so on, was not fully considered. In the future, efforts will be made to improve the efficiency of data processing. In addition, considering that cases are added in real time, auto insurance fraud detection on static data will be extended to dynamic detection on data that are updated during federation rounds, so as to meet the real-time detection requirements of auto insurance cases. It is also necessary to consider the integrity of cooperation between agencies and companies to prevent the emergence of malicious participants in the federated sharing session. In future work, we could start from game-theoretic incentives for cooperation to encourage cooperating firms and agencies to reduce destructive behavior.

Table 1. Glossary of notations.
Symbol | Definition
Round t | The complete training process for a full federation round consists of three stages: (i) the server sends the parameters down to the clients; (ii) the clients receive them for local training; (iii) the clients upload the parameters to the server for aggregation.
N, K | The number of clients; the number of federation groups.
E, B | The number of local training epochs of the client; the batch size of the client.
KG, E, R | Knowledge graph set owned by an auto insurance company; the entity embeddings of KG; the relation embeddings of KG.
S_t, S_t^j, C_j | Set of servers at round t; the server of the j-th federation group; the set of clients of the j-th federation group.
Z | The fixed number of rounds that a client needs to participate in federated training before the server selection mechanism is applied.
P | The proportion of clients within the same federation group that choose to regroup.

Figure 3. Traditional FL and Decentralized FL framework.

Figure 4. Entity and relation of auto insurance Knowledge Graph.

• All-Clients: In this mode, competition between clients is ignored and no privacy protection is adopted. All clients pool their private data together, train a single detection model uniformly, and finally test the model effect with a test dataset on each client.
• DFLR-Clients: In this mode, each client can switch roles between client and server. In the local training phase, all clients rely only on local data for training.

The Impact of the Number of Participants N on One Specific Client

Figure 5. The impact of N on clients in DFLR-Clients mode.

The Impact of the Number of Federation Groups K on One Specific Client

Figure 6. The impact of K on clients in DFLR-Clients mode.

Table 4. The average number of KGs owned by split clients.

Table 5. The average results of 6 clients in 3 different training modes.

Table 5. Cont. The bold data in the table indicate that the DFLR-Clients mode outperforms the Single-Client mode, and the underlined data indicate that the DFLR-Clients mode is the best of the three modes.

Table 6. The results of Client-1 in 3 different training modes. The bold data in the table indicate that the DFLR-Clients mode outperforms the Single-Client mode, and the underlined data indicate that the DFLR-Clients mode is the best of the three modes.