LCDAN: Label Confusion Domain Adversarial Network for Information Detection in Public Health Events

Ye, Qiaolin; Sun, Guoxuan; Chen, Yanwen; Xu, Xukan

doi:10.3390/electronics14153102

¹ School of Business, Hohai University, Nanjing 211100, China

² Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong 999077, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(15), 3102; https://doi.org/10.3390/electronics14153102

Submission received: 8 July 2025 / Revised: 31 July 2025 / Accepted: 1 August 2025 / Published: 4 August 2025

(This article belongs to the Section Artificial Intelligence)


Abstract

With the popularization of social media, information related to public health events has seen explosive growth online, making it essential to accurately identify informative tweets with decision-making and management value for public health emergency response and risk monitoring. However, existing methods often suffer performance degradation during cross-event transfer due to differences in data distribution, and research specifically targeting public health events remains limited. To address this, we propose the Label Confusion Domain Adversarial Network (LCDAN), which innovatively integrates label confusion with domain adaptation to enhance the detection of informative tweets across different public health events. First, LCDAN employs an adversarial domain adaptation model to learn cross-domain feature representation. Second, it dynamically evaluates the importance of different source domain samples to the target domain through label confusion to optimize the migration effect. Experiments were conducted on datasets related to COVID-19, Ebola disease, and Middle East Respiratory Syndrome public health events. The results demonstrate that LCDAN significantly outperforms existing methods across all tasks. This research provides an effective tool for information detection during public health emergencies, with substantial theoretical and practical implications.

Keywords: public health events; information detection; label confusion; domain adaptation

1. Introduction

In today's highly interconnected digital age, social media platforms such as Twitter and Weibo have become indispensable channels for information dissemination during public health events [1]. They not only give governments and health organizations a convenient way to issue alerts, health guidance, and epidemic updates to the public [2,3], but also facilitate rapid information flow and effective resource allocation by supporting two-way communication between affected communities and health institutions. However, the vast scale of user-generated content also presents significant challenges, particularly in accurately extracting valuable information from a large volume of redundant or irrelevant posts [4]. Although existing research has extensively explored information detection in the context of natural disasters [5], public health events have received comparatively little attention. These events often possess unique attributes such as complex semantic features and diverse dissemination paths, necessitating differentiated modeling.

Effective information detection in public health events faces multiple challenges. First, the heterogeneity of social media data makes uniform information quality difficult to achieve. User-generated text typically contains informal language, emotional expressions, and context dependency [4], which increases the complexity of understanding and classification. Second, the dynamic nature of data distribution further exacerbates the difficulty of detection. In different public health events, the semantic features and distribution patterns of information may exhibit significant variations, thereby causing notable domain shifts that pose challenges to cross-event generalization. As shown in Figure 1, we randomly selected 1000 tweets related to the COVID-19 and Ebola events, respectively, and conducted in-depth visual analysis of the semantic distribution differences between different events. Word clouds showed that “coronavirus”, “COVID19” and “death” frequently appeared in COVID-19 tweets, whereas “Ebola”, “disease” and “symptom” were more prominent in Ebola-related tweets. These differences reflect the distinct semantic emphases and information needs of each event. Further, we used the t-SNE [6] method to perform dimensionality reduction and visualization on the above tweets (as shown in Figure 2) to reveal the distribution structure of text features in the low-dimensional space. The results show that tweets related to COVID-19 (blue) and Ebola (orange) formed clearly distinguishable clusters in the two-dimensional space, with well-defined domain boundaries. This indicates that, in the context of public health events, knowledge learned by a model from one event may encounter significant generalization challenges when directly transferred to another, due to the pronounced semantic and structural differences.

These challenges indicate that traditional information detection methods based on supervised learning have significant limitations in dealing with the complexity of public health events, and their performance often declines due to the lack of sufficient labeled data and cross-event adaptability. In recent years, domain adaptation, as a transfer learning approach [7], has garnered significant attention in the field of information detection. Domain adaptation seeks to reduce distribution disparities between the source domain and the target domain by learning domain-invariant features or adjusting model parameters, thereby enhancing model generalization in the unlabeled target domain. In this context, adversarial learning [8] has emerged as a research focus due to its superior performance in feature alignment, whereby a domain discriminator is introduced to distinguish source and target domain features, enabling the feature extractor to generate domain-invariant representations and thus facilitate cross-domain knowledge transfer. Although these methods have made progress in improving detection performance, the diversity and complexity of public health events continue to pose greater demands on model generalization and adaptability, as single-domain adaptation mechanisms and one-hot vector label representations may fail to adequately capture cross-event semantic variations and differences in sample contributions. From the perspective of sample labels, tweets in public health events vary significantly in terms of quality, semantic expression, and content structure, resulting in inconsistent transfer values in different target events. Traditional methods usually use hard-coded representations for category labels and treat all source domain samples equally, failing to reflect the strength of the association between different tweets and the target domain. 
This approach may conceal some essential transfer features and introduce a considerable amount of irrelevant or weakly relevant information, leading to the issue of label confusion. Label confusion primarily arises when the model fails to accurately align the label spaces of the source domain and the target domain due to significant semantic deviations of labels across different domains, resulting in incorrect assessments. Taking the COVID-19 and Ebola events as examples, although both belong to the category of infectious diseases, the tweets “COVID-19 daily cases update” and “Ebola virus detected in DR Congo” emphasize two different semantic cores of the labels, namely epidemic statistics and virus origin. Direct transfer would lead to an information mismatch. This difference in label distribution reflects the distance between the semantic centers of labels across domains, thus highlighting the necessity of constructing a label confusion awareness mechanism.

Based on the above background, this paper proposes a new framework—Label Confusion Domain Adversarial Network (LCDAN)—for detecting effective information in public health events. The core idea of LCDAN is to introduce label-level confusion modeling on the basis of the traditional adversarial domain adaptation framework, so as to achieve dual alignment of the semantic space and the label space simultaneously. On the one hand, the model extracts domain-invariant text features by introducing an adversarial training mechanism between the domain discriminator and the feature extractor, thus alleviating the distribution shift problem caused by event specificity. On the other hand, it is difficult to solve the potential conflicts at the label level only by relying on the adversarial mechanism. Especially in cross-event tasks, the semantic centers represented by the source domain labels may deviate significantly from those in the target domain. For example, in the Ebola event, tweets in the “symptom” category are often related to virus spread, while in the COVID-19 event, similar tweets are more likely to express public emotions and policy feedback. Such differences may cause confusion in the category prediction output by the model, even in the shared feature space.

To further strengthen the unified modeling of semantic structure and label structure by the model, LCDAN introduces the label confusion distribution mechanism, which incorporates the label modeling process into the adversarial training framework for collaborative optimization. Through this mechanism, the model can actively suppress the interference of source samples with obvious domain features but inconsistent labels in the training process. It emphasizes the key samples that are consistent with the target event both semantically and in terms of labels. It also effectively alleviates the negative transfer problem caused by label confusion, and improves the information recognition ability and generalization performance of the model on new events. The innovation of LCDAN combines the advantages of adversarial learning and label confusion. Adversarial domain adaptation ensures the alignment of cross-domain features, and label confusion can better learn meta-knowledge to adapt to new events. At the same time, a dynamic sample weighting mechanism is introduced to adjust sample weights to optimize the transfer effect. The above design gives LCDAN significant advantages in dealing with the complex data characteristics of public health events. The main contributions of this paper include the following aspects:

We propose the LCDAN framework, which integrates adversarial domain adaptation and label confusion, providing a novel solution for detecting effective information in public health events.
We address the distribution shift between the texts of different public health events. Experiments conducted on multiple public health event datasets demonstrate that LCDAN can effectively distinguish relevant information from irrelevant information in public health events and exhibits strong transfer performance.
We propose a label confusion method for transfer learning in public health events. By introducing label confusion between events and learning probabilistic label distributions, this method adjusts source domain sample weights to optimize transfer performance, evaluates sample importance to the target domain, and enhances the model’s ability to identify relevant information.
We introduce the informative message detection task into the public health field. Compared with general events, the information in public health tweets is complex and time-sensitive. This study expands the application boundary of cross-domain information detection.

The rest of the paper is structured as follows. Section 2 reviews related work concerning public health events, domain adaptation, and adversarial learning. Section 3 introduces the proposed LCDAN methodology. Section 4 presents empirical results across three domains, accompanied by detailed analysis and discussion. Finally, Section 5 concludes the paper.

2. Related Work

With the widespread adoption of the Internet, social media platforms have become critical channels for disseminating information during crisis situations. However, effectively identifying informative content from a vast volume of irrelevant or redundant posts remains a significant challenge. While prior research has extensively investigated information detection in the context of natural disasters, relatively limited attention has been devoted to recognizing informative messages in the domain of public health. This section provides a systematic review of recent advances in public health event analysis, domain adaptation techniques, adversarial domain adaptation methods and label distribution, thereby establishing the theoretical foundation and practical motivation for the proposed approach.

2.1. Public Health Incidents

Public health incidents include major infectious disease outbreaks, mass unexplained illnesses, major food and occupational poisoning, and other events that cause or are likely to cause serious damage to public health. Such events are usually characterized by a high degree of uncertainty and rapid evolution, with far-reaching impacts on social stability and public security [9]. In the past decades, there have been several major public health events worldwide, such as Severe Acute Respiratory Syndrome (SARS) in 2003, the Ebola outbreak in 2014, and the COVID-19 outbreak in late 2019.

With the popularization of the Internet and social media, platforms such as Twitter and Weibo have become important channels for information dissemination during public health events [10]. These platforms enable rapid dissemination of alerts, official statements and public feedback, providing real-time data support to government agencies and health organizations. In addition, social media supports two-way communication between affected communities and decision makers, facilitating rapid information sharing and public participation. However, the explosion of social media data also poses a significant challenge: the amount of extraneous information embedded in the data.

To address this issue, researchers have proposed various methods to identify effective information on social media [11,12,13]. Early methods primarily relied on manual annotation and rule-based text analysis, but these approaches were inefficient for handling large-scale data and struggled to adapt to contextual variations across different events. In recent years, the advancement of machine learning technology has provided new possibilities for automated detection [14]. However, traditional machine learning methods typically require large amounts of labeled data, and in the early stages of public health events [15], the acquisition of labeled data is often constrained by time and resource limitations. Furthermore, semantic feature differences between events further increase the detection difficulty. These methods are typically designed for specific events, with limited generalization capability across events. Ghafarian et al. [4] proposed a Support Measurement Machine (SMM) based on distribution learning, enhancing the accuracy of effective information detection through word distribution modeling. Qu et al. [16] proposed a Class-Imbalanced Adversarial Neural Network (CADNN), addressing unsupervised domain adaptation and class imbalance issues in effective information detection, integrating adversarial domain adaptation and cross-domain interpolation models to optimize cross-domain feature representations. Consequently, developing methods for effective information detection that can generalize to new events has become a key research direction. However, research on effective information detection for public health events remains limited, necessitating the development of methods with strong generalization and adaptability to dynamic semantics.

2.2. Domain Adaptation

Domain adaptation focuses on solving the knowledge migration problem when the source and target domains have different marginal probability distributions but have the same conditional probability distribution [17]. In machine learning, the ideal scenario is to have richly labeled training examples that have the same distribution as the test data [18]. However, the data distributions of different events often have significant differences, which makes it difficult to directly apply the model trained on one event to another event [19], the so-called “domain drift” problem. Domain adaptation can be broadly categorized into traditional domain adaptation and deep domain adaptation. Traditional domain adaptation methods are mainly divided into two categories: instance weighting and feature alignment. The instance weighting method reduces the difference with the target domain by assigning weights to the source domain samples [20], while the feature alignment method maps the features of the source and target domains into a shared space so that the distributions match in the space. In recent years, significant progress has been made in deep learning-based domain adaptation methods. These methods utilize the nonlinear mapping capability of deep networks to automatically learn feature representations that are shared across domains, providing higher flexibility and adaptability compared to traditional methods. These deep domain adaptation approaches roughly fall into three categories: discrepancy-based methods, adversarial learning-based methods and multi-domain methods.

Our research mainly focuses on adversarial domain adaptation methods. Adversarial domain adaptation, inspired by generative adversarial networks, has been an important direction in domain adaptation in recent years. This method enables the model to learn cross-domain shared feature representations by introducing an adversarial training mechanism. Its core idea is to transform the domain adaptation problem into a game process: the feature extractor tries to generate domain-invariant features, while the domain discriminator tries to distinguish whether the features come from the source domain or the target domain. Through this minimax optimization, the model can achieve feature alignment between the source domain and the target domain. Domain Adversarial Neural Networks [21] (DANNs) implement adversarial training through a Gradient Reversal Layer (GRL): the feature extractor tries to maximize the classification error of the domain discriminator, while the domain discriminator tries to minimize its error. This approach has demonstrated its effectiveness in various tasks, including cross-lingual sentiment classification and cross-dataset image recognition. In Adversarial Discriminative Domain Adaptation (ADDA) [22], the feature extractors of the source and target domains are not shared, and the parameters of the target model are initialized by the source model. Event Adversarial Neural Network (EANN) [23] combines an event classification task and domain adversarial training to successfully detect false information in social media. EANN improves the performance of the model on new events by introducing an event discriminator that separates event-specific features from shared features. The advantage of adversarial domain adaptation is its ability to automatically learn feature alignment without explicitly defining a distributional difference metric. However, the method has limitations. 
First, EANN assumes that the feature spaces of the source and target domains are fully shared, but in real-world scenarios, different events may contain a large number of event-specific features that may interfere with the effectiveness of adversarial training. Second, the instability of adversarial training may lead to difficulties in model convergence, especially when the amount of data is limited.

Despite considerable theoretical and practical advances in domain adaptation, its application in public health event scenarios remains relatively underexplored and is subject to notable limitations. Public health events often involve highly dynamic contexts, rapidly evolving semantics, and diverse information sources, resulting in significant domain discrepancies that conventional adaptation techniques may fail to address effectively. Furthermore, most existing methods pay limited attention to sample-level variability, overlooking the fact that the relevance and transferability of social media posts can vary greatly across different public health events. This may increase the risk of negative transfer and hinder the reliable detection of informative instances, underscoring the need for more fine-grained and adaptive solutions.

2.3. Label Distribution

In information detection, as a fine-grained label representation method, label distribution can capture the complex relationship between samples and labels more comprehensively compared with traditional one-hot encoding. Geng [24] first proposed Label Distribution Learning (LDL) to address the issue of label non-independence, which has been widely applied in sentiment analysis and text classification. Chen et al. [25] combined label distribution with non-negative matrix tri-factorization, proposing a semi-supervised cross-domain sentiment classification and emotion distribution learning framework, achieving label distribution transfer through stable associations between document and word clusters. Due to the complexity of public health events, traditional binary classification struggles to describe their multi-dimensional features. Label distribution provides a more detailed classification basis by quantifying the association between labels and samples.

Despite the potential of domain adaptation methods in addressing cross-domain tasks, there remains a significant lack of research focused on detecting information related to new public health events, indicating a need for deeper exploration. In contrast to existing work, this paper centers on the challenge of information detection for new public health events. To address this, we propose a label confusion domain adversarial network, which evaluates the importance of samples to the target domain based on a label confusion distribution and employs a weighting mechanism to calculate the importance weight of each social media post. This design aims to alleviate the negative transfer problem, thereby enhancing the model's ability to identify genuinely effective public health information amidst noisy information streams. A comparison between the proposed model and the most relevant studies is provided in Table 1.

3. Methodology

3.1. Framework

As a central platform for information dissemination and public engagement in public health events, social media generates text data characterized by high heterogeneity, significant context dependency, and strong event specificity. These characteristics render the accurate identification of effective information from vast social media data a highly challenging task. To address this issue, a novel detection framework—Label Confusion Domain Adversarial Network (LCDAN)—is proposed, integrating adversarial learning with label confusion-aware mechanisms to enhance both information identification capabilities and cross-domain generalization performance in multi-event contexts. The design of the LCDAN framework is based on the following key considerations. First, social media text, typically unstructured and informal in expression, requires a feature extraction module capable of capturing deep semantic information to replace the reliance of traditional methods on shallow information. Second, to enhance cross-event transfer capabilities, event-specific feature interference must be suppressed through adversarial training, enabling the model to learn more generalizable representations. However, relying solely on feature alignment is insufficient to address differences in cross-domain label spaces. Due to significant semantic shifts in labels across different public health events, the transfer value of source domain tweets to the target task remains uncertain. If one-hot labels are uniformly applied and source samples are treated with equal weight, label confusion issues are likely to arise, thereby leading to negative transfer. Therefore, LCDAN introduces a label confusion distribution modeling mechanism to finely construct the semantic association between source samples and the target task in the label space. By dynamically weighting and regulating the weights of training samples, transfer optimization is achieved at the label level.

Specifically, the LCDAN framework comprises the following modules: The feature extractor maps social media texts to the semantic space based on a pretrained text model and extracts context-related features; the informative detector performs supervised classification on the extracted features to determine whether a tweet contains effective information; the domain discriminator, utilizing a gradient reversal mechanism, implements adversarial training to align source and target domain distributions in the feature space, reducing event-specific biases; the most crucial label confusion distribution module combines the label distribution and semantic center distance to measure the potential value of source samples for the target domain and adjusts their contributions during the training process through a dynamic weight mechanism, enhancing the model’s perception ability of transfer-related samples and alleviating the label confusion problem caused by semantic mismatch. Through the synergistic operation of these modules, LCDAN establishes a detection framework that integrates feature alignment and label optimization, thereby significantly improving the model’s generalization across multi-source heterogeneous events. The introduction of the label confusion mechanism not only enhances the discriminability of source sample selection but also provides an optimization approach for cross-event knowledge transfer based on semantic alignment, thereby distinguishing LCDAN from existing adversarial models. The process is illustrated in Figure 3.

The notations used frequently in this article are summarized in Table 2.

3.2. Feature Extractor

When identifying informative messages related to public health events on social networks, the text feature encoder encodes the original text data from different domains. Tweets typically contain numerous entities related to events and their domains, which vary across domains, thereby significantly impacting the classification performance of the classifier. Traditional text representation methods, such as bag-of-words models or static word embeddings, are inadequate due to their inability to sufficiently capture context dependency and event specificity. To address this, BERT [29] is adopted for text feature extraction, whereby latent semantic and contextual information is captured through a multi-layer bidirectional Transformer encoder, producing high-quality semantic representations. Given an input text sequence $x = [x_{1}, x_{2}, \dots, x_{n}]$, where $n$ denotes the text length, BERT first converts each token $x_{i}$ into a token representation through a tokenizer. Subsequently, each token is transformed into an input representation through three types of embeddings: token embeddings $E_{w}^{token}$, positional embeddings $E_{w}^{position}$, and segment embeddings $E_{w}^{segment}$. Specifically, for a token $w$, its input vector $E_{w}$ is computed as

$$E_{w} = E_{w}^{token} + E_{w}^{position} + E_{w}^{segment} \tag{1}$$

The input vector sequence $E = [E_{1}, E_{2}, \dots, E_{n}] \in \mathbb{R}^{n \times d}$, where $d$ denotes the embedding dimension, is fed into BERT’s multi-layer Transformer encoder, which, through a self-attention mechanism, produces context-dependent output representations $H = [h_{1}, h_{2}, \dots, h_{n}] \in \mathbb{R}^{n \times d}$:

$$h_{n} = G_{f}(x) = \mathrm{Transformer}_{n}(x) \tag{2}$$

where $G_{f}$ denotes the BERT feature extractor, and $h$ represents the fixed-dimensional feature vector for the text $x$.

To obtain a global feature representation of the text, the output vector of the [CLS] token, $h_{CLS} \in \mathbb{R}^{d}$, is typically extracted and mapped to the target feature space through a fully connected layer:

$$z = \mathrm{ReLU}(W_{f} h_{CLS} + b_{f}) \tag{3}$$

where ReLU denotes the activation function, $W_{f} \in \mathbb{R}^{d_{h} \times d}$ represents the weight matrix (so that the product $W_{f} h_{CLS}$ lies in $\mathbb{R}^{d_{h}}$), $b_{f} \in \mathbb{R}^{d_{h}}$ represents the bias vector, $d$ denotes the BERT output dimension, and $d_{h}$ denotes the target feature dimension.

The module takes the raw text sequence as input and produces the feature $z$, providing semantically rich input for the information detection classifier and domain discriminator. The BERT Transformer encoder, owing to its bidirectional context modeling and self-attention mechanism, effectively captures the semantic features of public health event texts, thereby enhancing classification performance.
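As a rough illustration of Equations (1) and (3), the following standard-library Python sketch sums the three embeddings of a token and applies the ReLU projection to a stand-in [CLS] vector. It is only a toy under stated assumptions: the function names `embed` and `project_cls`, the toy sizes, and the random values are hypothetical, the weight matrix is stored as $d_{h}$ rows of length $d$, and no actual BERT encoder is involved.

```python
import random

def embed(token_emb, pos_emb, seg_emb):
    """Eq. (1): element-wise sum of the three embeddings of one token."""
    return [t + p + s for t, p, s in zip(token_emb, pos_emb, seg_emb)]

def project_cls(h_cls, W_f, b_f):
    """Eq. (3): z = ReLU(W_f h_CLS + b_f).

    W_f is assumed to be a list of d_h rows, each of length d.
    """
    z = []
    for row, bias in zip(W_f, b_f):
        pre_activation = sum(w * h for w, h in zip(row, h_cls)) + bias
        z.append(max(0.0, pre_activation))  # ReLU
    return z

# Toy sizes and random values (hypothetical; real BERT-base uses d = 768).
rng = random.Random(0)
d, d_h = 8, 4
h_cls = [rng.uniform(-1, 1) for _ in range(d)]  # stand-in for BERT's [CLS] output
W_f = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d_h)]
b_f = [0.0] * d_h
z = project_cls(h_cls, W_f, b_f)  # feature vector of length d_h
```

In practice the [CLS] vector would come from a pretrained encoder rather than a random generator; the sketch only makes the shapes and the non-negativity of $z$ concrete.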

3.3. Informative Detector

The informative detector $G_{y}$ outputs $\hat{y}$, which represents the probability that the post is “Informative”, based on the features $z$ output by the feature extractor. Let the label be $y \in \{0, 1\}$, where $y = 0$ represents informative and $y = 1$ represents not-informative text. The structure of $G_{y}$ includes a two-layer fully connected network, equipped with a ReLU activation function and a softmax output layer, which is expressed as

$$\hat{y} = G_{y}(z) = \mathrm{softmax}(W_{y} z + b_{y}) \tag{4}$$

where $\hat{y}$ denotes the predicted probability distribution, and $W_{y} \in \mathbb{R}^{2 \times d_{h}}$ and $b_{y} \in \mathbb{R}^{2}$ represent the network parameters. For the source domain data, the optimization objective of the informative detector is to minimize the cross-entropy loss:

$$L_{y} = -E_{(x, y) \sim p_{s}(x, y)} [y_{i} \log \hat{y}_{i} + (1 - y_{i}) \log (1 - \hat{y}_{i})] \tag{5}$$

where $y_{i}$ and $\hat{y}_{i}$ denote the true label and predicted probability, respectively, for the $i$-th sample. Effective information detection must address the generalization challenges posed by new domain events, as solely minimizing detection loss tends to capture domain-specific knowledge, making generalization difficult. Consequently, the model must be trained to learn more general feature representations to capture common features across all events.
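The classifier head of Equation (4) and the cross-entropy objective of Equation (5) can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' implementation: the helper names `softmax`, `detector`, and `cross_entropy` are hypothetical, only a single linear layer is shown (as in Equation (4)), and the loss is written as a mean over a mini-batch with `probs[i]` taken as the predicted probability that $y_{i} = 1$.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def detector(z, W_y, b_y):
    """Eq. (4): y_hat = softmax(W_y z + b_y), with W_y given as 2 rows of length d_h."""
    logits = [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(W_y, b_y)]
    return softmax(logits)

def cross_entropy(labels, probs, eps=1e-12):
    """Mini-batch estimate of Eq. (5); probs[i] is the predicted P(y_i = 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)  # negated so that lower is better
```

A confident correct prediction yields a smaller loss than a confident wrong one, for example `cross_entropy([1], [0.9]) < cross_entropy([1], [0.1])`.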

3.4. Domain Discriminator

To reduce domain-specific features, a domain discriminator is introduced, which, through adversarial training, encourages the feature extractor to generate domain-invariant shared features. The domain discriminator $G_{d}$ distinguishes whether the feature $z$ originates from the source or target domain. Let the domain label be $d \in \{0, 1\}$, where $d = 1$ denotes the source domain and $d = 0$ denotes the target domain. Its structure consists of a multi-layer fully connected network, with the output representing the probability distribution of event categories:

$$\hat{d} = G_{d}(z) = \mathrm{softmax}(W_{d} z + b_{d}) \tag{6}$$

where $W_{d} \in \mathbb{R}^{K \times d_{h}}$, $b_{d} \in \mathbb{R}^{K}$, $K$ represents the number of event categories, and $d_{h}$ denotes the intermediate layer dimension.

To implement adversarial training, a gradient reversal layer (GRL) is introduced between the feature extractor and the event discriminator. The GRL leaves the input unchanged during forward propagation but multiplies the gradient by a negative coefficient $-\lambda$ during backpropagation, thereby enabling the feature extractor to maximize the loss of the event discriminator and promoting the generation of event-invariant features. The loss function of the domain discriminator is defined as the cross-entropy loss:

$$L_{d} = E_{x \sim p_{s}(x)} \left[ \log G_{d}(G_{f}(x)) \right] + E_{x \sim p_{t}(x)} \left[ \log \left( 1 - G_{d}(G_{f}(x)) \right) \right] \tag{7}$$
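The behavior of a gradient reversal layer can be illustrated without a deep learning framework. The sketch below is a hand-written forward and backward pass in plain Python; the class name `GradientReversal` and the toy values are hypothetical, and a real model would rely on the automatic differentiation of its framework instead.

```python
class GradientReversal:
    """Identity in the forward pass, gradient scaled by -lam in the backward pass."""

    def __init__(self, lam):
        self.lam = lam

    def forward(self, z):
        # Features pass through unchanged on the way to the discriminator.
        return list(z)

    def backward(self, grad_output):
        # The gradient arriving from the discriminator is sign-flipped and
        # scaled before it reaches the feature extractor.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=1.0)
features = [0.5, -1.2, 3.0]
out = grl.forward(features)
grad_to_extractor = grl.backward([0.1, -0.4, 0.2])
```

Because the sign flip happens only in the backward pass, the discriminator is still trained to reduce its own loss while the feature extractor receives an update that increases it.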

3.5. Label Confusion Distribution

The label confusion distribution $G_{L}$ takes the feature vector generated by $G_{f}$ as input. It quantifies the importance of source domain samples to the target domain through label distributions and domain center distances, and assigns a weight to each sample to adjust its influence during training, thereby optimizing the model’s performance in the target domain.

Effective information detection in the public health domain faces challenges due to the high timeliness, diverse sources, and varied content of information. Traditional methods typically employ one-hot vectors to represent true labels. However, one-hot representations may fail to adequately capture the relationships between samples and labels [28] and are susceptible to mislabeling, leading to difficulties in distinguishing similar labels during prediction and causing label confusion issues. Particularly in cross-domain scenarios, if domain labels of source samples are uniformly assigned while ignoring inter-sample differences, the model may overfit to source domain-specific features, thereby reducing adaptability to the target domain. In this study, the objective is to extract features that maximally distinguish whether samples contain effective information while minimally identifying their domain. Consequently, by introducing label confusion to domain labels, the model’s reliance on domain-specific features can be effectively reduced, thereby enhancing the learning of general effective information features. Although achieving a true label distribution theoretically is challenging, it can be approximated by mining the semantic information underlying instances and labels.

Assume the output space is $Y$, which contains $c$ possible labels. For each source domain sample $x_{s}^{(i)}$, its label distribution is defined as a vector $p_{s}^{(i)} = [p_{s, 1}^{(i)}, p_{s, 2}^{(i)}, \dots, p_{s, c}^{(i)}]$, where $p_{s, k}^{(i)}$ denotes the probability that the sample belongs to label $k$. The domain label of each sample is represented by a two-dimensional vector $d^{(i)} = [d_{s}^{(i)}, d_{t}^{(i)}]$, whose entries correspond to the probabilities of belonging to the source and target domains, respectively, and satisfy the constraint $d_{s}^{(i)} + d_{t}^{(i)} = 1$. The domain label $d_{s}^{(i)}$ can be approximated by the output $G_{d}(z_{s}^{(i)})$ of the domain discriminator $G_{d}$, where $z_{s}^{(i)}$ is the feature vector generated by the feature extractor $G_{f}$ from the source domain sample $x_{s}^{(i)}$. For a target domain sample $x_{t}^{(j)}$, the domain label is fixed as $d_{t}^{(j)} = [0, 1]$, indicating complete affiliation with the target domain.

Public health events fall into different categories. Although all of them belong to the scope of public health, their natures, impacts, and response methods vary greatly. These differences are reflected in the semantics of the data labels, so the label distance between domains can vary significantly. To quantify and leverage these differences, the label distance is defined as the cosine distance between the central label representation of the source domain and that of the target domain. The central label representations for the source and target domains are, respectively, defined as

$$\mu_{s} = \frac{1}{n_{s}} \sum_{i = 1}^{n_{s}} v_{s}^{(i)}, \quad \mu_{t} = \frac{1}{n_{t}} \sum_{j = 1}^{n_{t}} v_{t}^{(j)} \tag{8}$$

where $v_{s}^{(i)} = f^{L}(d_{s}^{(i)})$ and $v_{t}^{(j)} = f^{L}(d_{t}^{(j)})$ are the label representations of individual source and target samples, $\mu_{s}$ and $\mu_{t}$ denote the central label representations, and $n_{s}$ and $n_{t}$ denote the number of labels for the source and target domains, respectively.

The label distance is calculated as

$$d_{L} = 1 - \cos(\mu_{s}, \mu_{t}) = 1 - \frac{\mu_{s} \cdot \mu_{t}}{\|\mu_{s}\|_{2} \, \|\mu_{t}\|_{2}} \tag{9}$$
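Equations (8) and (9) amount to averaging the label representations of each domain and taking one minus the cosine similarity of the two averages. The following Python sketch illustrates this under assumptions: the function names and the toy vectors are hypothetical, and the label representations are taken as given rather than computed by the mapping $f^{L}$.

```python
import math


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors (Equation (8))."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine_distance(u, v):
    """One minus cosine similarity (Equation (9))."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def label_distance(source_vecs, target_vecs):
    """d_L between the source and target central label representations."""
    return cosine_distance(centroid(source_vecs), centroid(target_vecs))


# Hypothetical toy label representations.
source = [[0.9, 0.1], [0.7, 0.3]]
target = [[0.2, 0.8], [0.4, 0.6]]
d_l = label_distance(source, target)
```

A value near 0 means the two domains have nearly parallel central representations; orthogonal centers give a distance of 1.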

By integrating label distributions and domain label center distances, the dynamic weight of each source domain sample is calculated. Specifically, the weight $w_{i}$ of the source domain sample $x_{i}$ is defined as

$$w_{i} = \exp\left( \gamma + (1 - \gamma) \cdot (1 - d_{s}^{(i)}) \cdot d_{L} \right) \tag{10}$$

When $d_{s}^{(i)}$ approaches 1, the sample may contain significant source domain-specific information. In that case the factor $(1 - d_{s}^{(i)})$ shrinks toward 0, so the tweet receives a lower weight, which mitigates negative transfer. The weights are applied to both the information detection loss and the domain discrimination loss.
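The weighting rule in Equation (10) is a one-line computation. In the sketch below, the function name `sample_weight` and the default `gamma=0.5` are hypothetical choices made only for illustration; this section does not state the value of $\gamma$, and it is assumed here to lie between 0 and 1.

```python
import math


def sample_weight(d_s, d_l, gamma=0.5):
    """Weight of a source sample following Equation (10).

    d_s:   estimated probability that the sample belongs to the source domain
    d_l:   label distance between the source and target centers
    gamma: balancing coefficient (assumed value, not taken from the paper)
    """
    return math.exp(gamma + (1.0 - gamma) * (1.0 - d_s) * d_l)


# A source-specific sample (d_s close to 1) versus a more target-like one.
w_specific = sample_weight(d_s=0.95, d_l=0.8)
w_shared = sample_weight(d_s=0.30, d_l=0.8)
```

Under these assumptions the source-specific sample receives the smaller weight, and a sample with $d_{s}^{(i)} = 1$ receives the minimum weight $e^{\gamma}$.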

3.6. Loss Function

LCDAN uses a weighted mechanism of label confusion to more accurately align the distributions of different domains, reduce the influence of irrelevant or abnormal source tweets on unlabeled target tweets, and more effectively detect effective information in new domains. Through dynamic weighting, the label confusion mechanism can adjust the contribution of source domain samples to the training loss and optimize the cross-domain adaptability of the model. The optimized information detection loss is

$$L_{y w} = E_{x, y \sim p_{s}(x, y)} \, w_{i} \left[ y_{i} \log \hat{y}_{i} + (1 - y_{i}) \log (1 - \hat{y}_{i}) \right] \tag{11}$$

The weighted domain discrimination loss is

$$L_{d w} = E_{x \sim p_{s}(x)} \left[ w_{i} \log G_{d}(G_{f}(x)) \right] + E_{x \sim p_{t}(x)} \left[ \log \left( 1 - G_{d}(G_{f}(x)) \right) \right] \tag{12}$$

The total loss function comprises two components: the weighted information detection loss $L_{y w}$ and the weighted domain discrimination loss $L_{d w}$. The final loss of LCDAN is formulated as a linear combination of these two losses:

$$L = L_{y w} - \lambda L_{d w} \tag{13}$$

where $\lambda$ is a hyperparameter used to balance the trade-off between effective information detection and domain discrimination.
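To make the structure of Equations (11) to (13) concrete, the sketch below combines a weighted detection loss and a weighted domain loss into one scalar. It is an illustration under assumptions rather than the authors' code: all function names and toy inputs are hypothetical, both component losses carry the conventional leading minus sign so that each is non-negative, and batch means stand in for the expectations.

```python
import math


def _safe_log(p, eps=1e-12):
    # Clamp to avoid log(0); eps is a hypothetical constant.
    return math.log(min(max(p, eps), 1.0 - eps))


def weighted_detection_loss(y_true, y_prob, weights):
    """Weighted binary cross-entropy on source samples (cf. Equation (11))."""
    terms = [w * (y * _safe_log(p) + (1 - y) * _safe_log(1.0 - p))
             for y, p, w in zip(y_true, y_prob, weights)]
    return -sum(terms) / len(terms)


def weighted_domain_loss(src_prob, tgt_prob, weights):
    """Weighted domain cross-entropy (cf. Equation (12)).

    src_prob / tgt_prob: discriminator outputs (probability of "source")
    for source and target samples; only source terms are weighted.
    """
    src = sum(w * _safe_log(p) for p, w in zip(src_prob, weights)) / len(src_prob)
    tgt = sum(_safe_log(1.0 - p) for p in tgt_prob) / len(tgt_prob)
    return -(src + tgt)


def total_loss(l_yw, l_dw, lam=1.0):
    """Linear combination in Equation (13)."""
    return l_yw - lam * l_dw


weights = [1.2, 0.8]
l_yw = weighted_detection_loss([1, 0], [0.8, 0.3], weights)
l_dw = weighted_domain_loss([0.7, 0.6], [0.4, 0.2], weights)
loss = total_loss(l_yw, l_dw, lam=1.0)
```

With this sign convention, lowering the total reduces detection error while increasing the discriminator's error, which is the direction the feature extractor is pushed in.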

A gradient reversal layer (GRL) is incorporated before the fully connected layer of $G_{d}$ to facilitate a min–max game between $G_{f}$ and $G_{d}$. During training, the model parameters are optimized by minimizing $L$, while the feature extractor is optimized to minimize $L_{y w}$ and simultaneously maximize $L_{d w}$.

4. Experiments

4.1. Datasets

This study is based on the open-source datasets provided in references [30,31]. Tweets from three typical public health events were selected to construct the informative message recognition task: COVID-19, Ebola disease, and Middle East Respiratory Syndrome (MERS). Specifically, the COVID-19 Twitter dataset was obtained from [30], while tweets on Ebola virus disease and MERS were extracted from [31]. All three events have attracted significant global attention in recent years, with notable social impact and transmissibility. COVID-19 exemplifies a global pandemic, Ebola represents one of the most severe infectious diseases worldwide, and MERS is characterized by cross-border transmission of a respiratory illness. Selecting these three events enables a comprehensive evaluation of the proposed method’s capability to identify informative messages across public health crises of diverse types and scales, thereby validating the model’s robustness and applicability. All tweets were labeled as either informative or non-informative. For example, the tweet “Correction: PA has 83 new cases of COVID-19, bringing our statewide total to 268 cases, said PA Health Secretary Dr. Rachel Levine. That is more than double the new cases Wednesday” provides an epidemic update and is considered a typical informative message. In contrast, the tweet “Starting off the weekend because I am never afraid. I am legend on AMC already feeling better. #COVID-19”, despite mentioning keywords, lacks useful information and is not considered an informative message. This labeling provides high-quality samples for subsequent model training and cross-domain transfer, facilitating the validation of the proposed method’s effectiveness in identifying informative messages in public health scenarios. For more details about these datasets, please see [30,31].

We manually deleted some unnecessary data and performed simple data augmentation on the imbalanced data. Table 3 shows the label distribution of the three public health events. Six informative message detection tasks are constructed as follows: $C \to W$, $W \to C$, $C \to M$, $M \to C$, $W \to M$, and $M \to W$. Our datasets are available at https://github.com/yyy-2200/LCDAN (accessed on 4 July 2025).
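The six transfer tasks are simply all ordered source and target pairs drawn from three domains, which the short Python sketch below enumerates. Reading the letters C, W, and M as the COVID-19, Ebola, and MERS datasets is an interpretation of the abbreviations, and the variable names are hypothetical.

```python
from itertools import permutations

# Assumed reading of the abbreviations: C = COVID-19, W = Ebola, M = MERS.
DOMAINS = ["C", "W", "M"]

# Every ordered (source, target) pair with distinct domains.
tasks = [f"{src} -> {tgt}" for src, tgt in permutations(DOMAINS, 2)]
```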

Considering the different text characteristics of the domains, we visualized domain-specific word clouds, as shown in Figure 4. The word clouds show that the tweets strongly emphasize the themes and locations of each emergency. For the COVID-19 event, words such as the number of people, death, and COVID-19 appear frequently. In the Ebola tweets, Ebola is the most frequent word, and words such as disease, treatment, and symptoms are also very common. For Middle East Respiratory Syndrome, MERS is one of the most frequent words, and words such as the Middle East, Saudi Arabia, and disease are also prominent. The word clouds thus reveal obvious differences between texts in different domains, so knowledge transfer must be considered when identifying informative messages across multiple domains.

As shown in Figure 5, the t-SNE visualization results illustrate the distribution of tweet texts from three public health events (i.e., COVID-19, Worldwide Ebola, and Middle East Respiratory Syndrome) in the semantic space, where blue nodes represent COVID-19, orange nodes represent Ebola, and cyan nodes represent Middle East Respiratory Syndrome. It can be clearly observed that tweets from different events form relatively distinct clustering structures in the embedding space, reflecting significant differences in language expression, thematic content, and user focus across different event domains. This cross-event distribution inconsistency (i.e., domain shift) indicates strong event dependency in text features, further confirming the challenge of achieving robust generalization in cross-domain informative message detection tasks when relying solely on single-event training models.

4.2. Experimental Settings

For each detection task, we split the dataset into training, validation, and test sets with a ratio of $7 : 2 : 1$. The splits were randomly sampled but fixed across all runs to ensure consistency and comparability. Additionally, all target domain data was strictly held out during training to prevent data leakage and maintain the integrity of the transfer learning evaluation. In the LCDAN parameter settings, the dropout rate is set to 0.3 to prevent overfitting. The training batch size is 32, and the learning rate is $10^{-4}$. We used metrics commonly used in classification tasks, including accuracy, precision, recall, and F1-score.
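A fixed random 7:2:1 split and the four reported metrics can be sketched in plain Python as follows. The function names, the seed value 42, the integer rounding rule, and the toy predictions are all assumptions made for illustration; the paper does not state how its splits were generated or how its metrics were computed.

```python
import random


def split_7_2_1(items, seed=42):
    """Shuffle with a fixed seed, then cut into train/validation/test parts."""
    rng = random.Random(seed)  # fixed seed keeps the split identical across runs
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs),
            "precision": precision, "recall": recall, "f1": f1}


train, val, test = split_7_2_1(range(100))
scores = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```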

We selected baseline models from different categories, including traditional feature extraction-based models, deep learning models, and deep domain adaptation models, to verify the effectiveness of the model.

4.2.1. Text Topic Model

To compare models, we initially selected topic modeling, a widely adopted and interpretable approach for short text classification. Specifically, we chose BERTopic [32], the prevalent model in this category, as our baseline for comparison.

BERTopic is a topic modeling tool based on BERT embeddings and clustering, extracting document topics through Transformer models. It integrates HDBSCAN clustering and c-TF-IDF to automatically discover topics, making it suitable for analyzing dynamic and diverse text data.

4.2.2. Deep Learning-Based Text Encoder

The proposed model was compared with traditional deep learning models, including Text Convolutional Neural Network (Text-CNN), Attention-based Recurrent Neural Network (Att-RNN), and Bidirectional Long Short-Term Memory (Bi-LSTM).

(a): Text-CNN is a deep learning model widely applied to various NLP tasks. For text classification tasks, Text-CNN utilizes convolutional neural networks to extract relevant features from text data automatically.
(b): Att-RNN is an extension of the standard RNN, incorporating an attention mechanism. This enhancement enables the model to more effectively capture dependencies, handle variable-length sequences, and provide interpretability. Additionally, a sigmoid function in the fully connected layer is used to predict the informativeness of content.
(c): Bi-LSTM offers significant advantages over LSTM in capturing bidirectional context and long-range dependencies, which is crucial for understanding text sequences. These advantages make Bi-LSTM a preferred choice for many NLP tasks where contextual understanding is critical.

4.2.3. Pretrained Models

We selected several widely used pretrained text models for fine-tuning on the information recognition task. The pretrained models included in the comparison are RoBERTa [33] and BERTweet [34].

(a): RoBERTa is an optimized NLP model that enhances BERT’s pretraining performance through larger-scale data, longer sequences, dynamic masking, and hyperparameter tuning.
(b): BERTweet, a model based on RoBERTa, is optimized for Twitter data, pretrained on a large corpus of tweets, and excels at handling short sentences and non-standard language in social media text.

4.2.4. Deep Domain Adaptation Models

The deep domain adaptation models compared in this study include EANN [23], Margin Disparity Discrepancy (MDD) [35], and BDANN [26]. Details of these models are provided below.

(a): Margin Disparity Discrepancy (MDD) is a technique used in domain adaptation and machine learning, primarily designed to address domain shift issues. MDD aims to reduce distribution differences between source and target domains in domain adaptation scenarios. This is achieved by explicitly minimizing the disparity in decision boundaries between source and target domain data points.
(b): Event Adversarial Neural Network (EANN): A multimodal feature extractor, an event discriminator, and a fake news detector are jointly trained for multimodal fake news detection. As this study focuses on the text modality, the visual modality feature extraction component was removed, and the model is denoted as EANN-text.
(c): BDANN: A BERT-based domain adaptation neural network, which eliminates event-specific dependencies through a feature extractor, a domain classifier, and a fake news detector, used for fake news detection. Similarly, the visual modality feature extraction component was removed, and the model is denoted as BDANN-text.

4.3. Informative Message Detection Results

Tables 4 and 5 present the experimental results of the baseline models and the proposed method across the six tasks. The proposed method significantly outperforms the baseline methods in all evaluated tasks. For instance, in the $C \to W$ task, the proposed method achieved an accuracy of 0.914 and an F1 score of 0.911, surpassing all baseline methods. The method performs consistently well in the other tasks. Among the transfer tasks, however, the $M \to C$ task exhibits relatively poorer performance, likely because of greater disparities between the source and target domains in this task.

Although Text-CNN is capable of extracting local features, it performs marginally worse than the proposed method, which integrates both local and global feature representations. Compared with memory-based neural networks such as Bi-LSTM and Att-RNN, the proposed method also performs better on the short texts associated with public health events. Bi-LSTM and Att-RNN are effective at modeling sequential dependencies, but they handle short, event-driven text less well than the proposed method, likely because tweets related to public health events are typically short. Additionally, LCDAN outperforms MDD, EANN-text, and BDANN-text, suggesting that using label confusion to dynamically weight samples enables finer-grained learning at the sample level and is effective for informative message detection tasks.

In conclusion, the proposed method exhibits substantial performance improvements in detecting effective information related to public health events, thereby confirming its effectiveness in cross-domain transfer scenarios.

4.4. Ablation Study

To clearly illustrate the roles of the feature extractor $G_{f}$, domain discriminator $G_{d}$, and label confusion distribution $G_{L}$, an ablation study was conducted for the six detection tasks. As shown in Table 6, w/o $G_{L}$ denotes the LCDAN without the label confusion distribution, and w/o $G_{d}$ denotes the LCDAN without the domain discriminator. When $G_{L}$ is removed, the importance of each source tweet to the target event cannot be obtained. Furthermore, when $G_{d}$ is removed alone, $G_{L}$ cannot optimize the model to reduce distribution differences between events, rendering the model equivalent to a non-transfer learning method (BERT).

Compared to the LCDAN, removal of the feature extractor (w/o $G_{f}$) results in an average accuracy decrease of 20.92%, confirming the effectiveness of the feature extractor. In the $C \to W$ task, the accuracy drops to 0.591, highlighting the critical role of this module in extracting deep semantic features across domains. The removal of the feature extractor has a particularly significant impact on tasks requiring fine-grained feature representations; for instance, in the COVID-19 to MERS task, the accuracy decreases from 0.854 to 0.631, a reduction of 26.11%. This may be attributed to the diverse linguistic patterns in public health event texts, where the feature extractor effectively captures textual features associated with public health events.
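The 26.11% figure for the COVID-19 to MERS task matches a relative reduction, that is, the accuracy drop divided by the original accuracy rather than a difference in percentage points. The short Python check below reproduces it from the two reported accuracies; the function name is hypothetical.

```python
def relative_decrease(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Accuracies reported for the COVID-19 to MERS task (LCDAN vs. w/o G_f).
drop = relative_decrease(0.854, 0.631)
```

Rounded to two decimals, `drop` equals the reported 26.11, whereas the absolute difference is 22.3 percentage points.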

Removal of the domain discriminator (w/o $G_{d}$) results in an average accuracy decrease of 5.75% compared to the LCDAN, indicating that knowledge transfer significantly enhances the detection performance of deep models on new data. In the $W \to C$ task, the accuracy decreases from 0.867 to 0.739, a significant reduction, highlighting the critical role of the domain discriminator in aligning feature distributions between source and target domains, particularly in tasks with significant domain distribution differences. In tasks with smaller domain differences, such as $W \to M$, the impact is relatively minor, with the accuracy decreasing from 0.897 to 0.884, indicating that the impact of this module is positively correlated with the degree of domain differences.

4.5. Parameter Sensitivity

The hyperparameter $\lambda$ controls the trade-off between loss terms. To investigate parameter sensitivity, experiments were conducted on two selected tasks. The experimental results for varying $\lambda$ in the set $\{0.1, 0.2, 0.5, 1.0, 1.5, 2.0\}$ are shown in Figure 6.

The model achieves optimal performance when $\lambda = 1.0$, effectively balancing task-specific classification and domain alignment and thereby enhancing cross-domain generalization capability. Deviating from this value degrades performance. When $\lambda > 1.0$, accuracy decreases rapidly, which indicates that an excessively large $\lambda$ causes the model to overemphasize domain feature alignment and neglect task-specific classification requirements. Conversely, a smaller $\lambda$ also impairs performance, though the impact is less severe, suggesting that insufficient domain adaptation leaves residual distribution differences that limit the effectiveness of transfer learning. Accuracy thus follows an inverted U-shaped curve with respect to $\lambda$, peaking at $\lambda = 1.0$, where the classification and domain alignment objectives are best coordinated.

The response of different tasks to variations in $\lambda$ further reveals its impact. In the task $C \to W$, the accuracy decreases from 0.914 at $\lambda = 1.0$ to 0.851 at $\lambda = 2.0$, indicating high sensitivity to excessive alignment. In contrast, the task $M \to W$ exhibits greater robustness, with accuracy only slightly decreasing from 0.917 at $\lambda = 1.0$ to 0.910 at $\lambda = 0.5$. This difference may stem from domain similarity, where tasks with closer domains exhibit higher tolerance to variations in $\lambda$, whereas tasks with greater domain differences require more precise hyperparameter tuning.

4.6. Case Analysis

We selected the EANN-text model for comparison with the LCDAN and analyzed its misclassified samples. The main errors of EANN-text stemmed from over-reliance on event-specific features and an insufficient understanding of the overall semantics of the text, which led to misjudgments. For example, the sample “I’ve been thinking. Maybe it would be a good idea for the CDC to focus on disease prevention instead of bike paths” was misjudged as informative by EANN-text because it contained “cdc” and “disease prevention”, although it is in fact an expression of opinion. The LCDAN reduced the weight of opinion-type texts through the label confusion mechanism and correctly identified the sample as non-informative. Conversely, the sample “Via @NatureNews: #Ebola by numbers: 1 person infected usually spreads the disease to 1–2 people”, which reports factual data, was misjudged as non-informative by EANN-text, again reflecting insufficient semantic understanding. In contrast, the model proposed in this paper introduces label confusion and sample weighting mechanisms that dynamically adjust the training weights of samples according to their label distribution and semantic features, thereby significantly improving recognition accuracy. By modeling the label distribution and semantic consistency, this mechanism reduces the model’s reliance on single keywords and better captures the context and factual content. For both examples misjudged by EANN-text, the proposed method interpreted the text correctly and avoided misclassification.

To further explore the limitations of the LCDAN in public health event information detection, we analyzed its misclassified samples and found that the main errors were concentrated in three scenarios: non-standard expressions with complex semantics, keyword misguidance, and cross-domain semantic differences. In texts related to public health events, non-standard expressions (such as irony and metaphor) increase the difficulty of semantic understanding, making it hard for the model to distinguish accurately between informative and non-informative content. For example, the sample “Easiest way to dump your girlfriend is to tell the police she has Ebola-like symptoms”, which mentions “Ebola” humorously, was misjudged as informative. In addition, the model was sensitive to public health keywords (such as “disease”) and ignored the context, so that non-informative texts were misclassified as informative. For example, “Review: The End of Plagues: The Global Battle against Infectious Disease by John Rhodes” was misjudged as informative because it contained “infectious disease”, ignoring the semantics of “review”. Although LCDAN performed well in most tasks, addressing the challenges of non-standard expressions and domain differences requires further optimization, such as enhancing context modeling, introducing sentiment analysis, and expanding multilingual training data, to improve robustness and adaptability.

By effectively detecting informative messages on social media, this study provides a valuable tool for public health event management. With the detected informative messages, the government or public health institutions can monitor the development trend of events in real time, issue early warning information in a timely manner, and prevent the spread of misleading information. Simultaneously, aggregating and analyzing the detected informative messages can provide decision makers with information from multiple perspectives and help formulate response strategies.

5. Discussion

Although LCDAN shows strong potential for cross-domain applications, it still faces several challenges and limitations. First, although the proposed application of label confusion techniques in domain adaptation for public health event detection has demonstrated effectiveness, future work can further explore advanced label confusion strategies to enhance domain adaptability. Second, the current evaluation is primarily based on three infectious disease outbreaks—COVID-19, Ebola, and MERS—due to the limited availability of other high-quality annotated public health datasets. In future research, once suitable datasets become available, we will seek to incorporate additional public health scenarios to assess the generalizability of LCDAN across diverse event types comprehensively.

Automatic information detection in health crises presents ethical challenges. The system may misclassify non-informative content as critical, or amplify unverified information, potentially impacting public perception and decision-making. Therefore, it is necessary to incorporate validation mechanisms to mitigate such risks. Additionally, practical deployment of such systems requires careful handling of personal data, adherence to privacy regulations, and effective integration with existing public health systems to provide timely and dependable support for emergency decision-making and resource coordination.

6. Conclusions

This paper proposes LCDAN, an effective information detection framework for public health events based on adversarial learning and label confusion. By combining adversarial domain adaptation with dynamic sample weighting through label confusion, it extracts transferable meta-knowledge from historical events and applies it to informative message detection for new events. The experimental results show that LCDAN performs excellently in the informative message detection tasks of different public health events, outperforming the existing benchmark methods. Especially when distribution differences are large, it significantly improves the generalization ability of the model. This study provides a feasible and efficient solution for identifying informative messages in the complex, dynamic, and highly heterogeneous social media environment, further enriching the intelligent information acquisition methods available for public health emergency management. In practice, where social media data sources are extensive, expressions are diverse, and manual annotation is costly, the proposed method helps to quickly screen the core information that is truly valuable for prevention and control decision-making and public opinion research. The results are of theoretical significance and also provide a technical reference for constructing public health information monitoring systems that cover multiple events and scenarios, with promising application prospects.

Author Contributions

Conceptualization, Q.Y. and G.S.; methodology, G.S.; software, Q.Y.; validation, G.S.; formal analysis, Y.C.; data curation, Y.C.; writing—original draft preparation, Q.Y. and Y.C.; writing—review and editing, X.X.; visualization, Y.C.; supervision, X.X.; funding acquisition, Q.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Our dataset is available at https://github.com/yyy-2200/LCDAN (accessed on 4 July 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

Wahid, J.A.; Xu, M.; Ayoub, M.; Jiang, X.; Lei, S.; Gao, Y.; Hussain, S.; Yang, Y. AI-Driven Social Media Text Analysis during Crisis: A Review for Natural Disasters and Pandemics. Appl. Soft Comput. 2025, 171, 112774. [Google Scholar] [CrossRef]
Phengsuwan, J.; Shah, T.; Thekkummal, N.B.; Wen, Z.; Sun, R.; Pullarkatt, D.; Thirugnanam, H.; Ramesh, M.V.; Morgan, G.; James, P.; et al. Use of Social Media Data in Disaster Management: A Survey. Future Internet 2021, 13, 46. [Google Scholar] [CrossRef]
Wahid, J.A.; Shi, L.; Gao, Y.; Yang, B.; Tao, Y.; Wei, L.; Hussain, S. Identifying and Characterizing the Propagation Scale of COVID-19 Situational Information on Twitter: A Hybrid Text Analytic Approach. Appl. Sci. 2021, 11, 6526. [Google Scholar] [CrossRef]
Ghafarian, S.H.; Yazdi, H.S. Identifying Crisis-Related Informative Tweets Using Learning on Distributions. Inf. Process. Manag. 2020, 57, 102145. [Google Scholar] [CrossRef]
Li, X.; Caragea, D. Domain Adaptation with Reconstruction for Disaster Tweet Classification. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 1561–1564. [Google Scholar]
Van der Maaten, L.; Hinton, G. Visualizing Data Using T-SNE. J. Mach. Learn. Res. 2008, 9, 4. [Google Scholar]
Zhang, L.; Gao, X. Transfer Adaptation Learning: A Decade Survey. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 23–44. [Google Scholar] [CrossRef]
Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. In Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, Montreal, QC, Canada, 8–13 December 2014. [Google Scholar]
Jia, Q.; Guo, Y.; Wang, G.; Barnes, S.J. Big Data Analytics in the Fight against Major Public Health Incidents (Including COVID-19): A Conceptual Framework. Int. J. Environ. Res. Public Health 2020, 17, 6161. [Google Scholar] [CrossRef] [PubMed]
Han, X.; Wang, J.; Zhang, M.; Wang, X. Using Social Media to Mine and Analyze Public Opinion Related to COVID-19 in China. Int. J. Environ. Res. Public Health 2020, 17, 2788. [Google Scholar] [CrossRef] [PubMed]
Madichetty, S.; Madisetty, S. A RoBERTa Based Model for Identifying the Multi-Modal Informative Tweets during Disaster. Multimed. Tools Appl. 2023, 82, 37615–37633. [Google Scholar] [CrossRef]
Paul, N.R.; Sahoo, D.; Balabantaray, R.C. Classification of Crisis-Related Data on Twitter Using a Deep Learning-Based Framework. Multimed. Tools Appl. 2023, 82, 8921–8941. [Google Scholar] [CrossRef]
Xie, S.; Hou, C.; Yu, H.; Zhang, Z.; Luo, X.; Zhu, N. Multi-Label Disaster Text Classification via Supervised Contrastive Learning for Social Media Data. Comput. Electr. Eng. 2022, 104, 108401. [Google Scholar] [CrossRef]
Chamola, V.; Hassija, V.; Gupta, S.; Goyal, A.; Guizani, M.; Sikdar, B. Disaster and Pandemic Management Using Machine Learning: A Survey. IEEE Internet Things J. 2020, 8, 16047–16071. [Google Scholar] [CrossRef]
Imran, M.; Ofli, F.; Caragea, D.; Torralba, A. Using AI and Social Media Multimodal Content for Disaster Response and Management: Opportunities, Challenges, and Future Directions. Inf. Process. Manag. 2020, 57, 102261.
Qu, Z.; Lyu, C. CADNN: Class-Imbalanced Adversarial Neural Network for Unsupervised Domain Adaption in Emergency Events. IEEE Trans. Comput. Soc. Syst. 2025, 1–19.
Li, D.; Yang, Y.; Song, Y.-Z.; Hospedales, T. Learning to Generalize: Meta-Learning for Domain Generalization. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A Comprehensive Survey on Transfer Learning. Proc. IEEE 2020, 109, 43–76.
Zhou, X.; Zafarani, R. A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities. ACM Comput. Surv. (CSUR) 2020, 53, 1–40.
Bruzzone, L.; Marconcini, M. Domain Adaptation Problems: A DASVM Classification Technique and a Circular Validation Strategy. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 32, 770–787.
Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; Lempitsky, V. Domain-Adversarial Training of Neural Networks. J. Mach. Learn. Res. 2016, 17, 1–35.
Tzeng, E.; Hoffman, J.; Saenko, K.; Darrell, T. Adversarial Discriminative Domain Adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7167–7176.
Wang, Y.; Ma, F.; Jin, Z.; Yuan, Y.; Xun, G.; Jha, K.; Su, L.; Gao, J. EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 849–857.
Geng, X. Label Distribution Learning. IEEE Trans. Knowl. Data Eng. 2016, 28, 1734–1748.
Chen, Y.; Rao, Y.; Chen, S.; Lei, Z.; Xie, H.; Lau, R.Y.; Yin, J. Semi-Supervised Sentiment Classification and Emotion Distribution Learning Across Domains. ACM Trans. Knowl. Discov. Data 2023, 17, 1–30.
Zhang, T.; Wang, D.; Chen, H.; Zeng, Z.; Guo, W.; Miao, C.; Cui, L. BDANN: BERT-Based Domain Adaptation Neural Network for Multi-Modal Fake News Detection. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8.
Ding, Y.; Guo, B.; Liu, Y.; Liang, Y.; Shen, H.; Yu, Z. MetaDetector: Meta Event Knowledge Transfer for Fake News Detection. ACM Trans. Intell. Syst. Technol. (TIST) 2022, 13, 1–25.
Guo, B.; Han, S.; Han, X.; Huang, H.; Lu, T. Label Confusion Learning to Enhance Text Classification Models. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 12929–12936.
Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
Nguyen, D.Q.; Vu, T.; Rahimi, A.; Dao, M.H.; Nguyen, L.T.; Doan, L. WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets. arXiv 2020, arXiv:2010.08232.
Imran, M.; Mitra, P.; Castillo, C. Twitter as a Lifeline: Human-Annotated Twitter Corpora for NLP of Crisis-Related Messages. arXiv 2016, arXiv:1605.05894.
Grootendorst, M. BERTopic: Neural Topic Modeling with a Class-Based TF-IDF Procedure. arXiv 2022, arXiv:2203.05794.
Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692.
Nguyen, D.Q.; Vu, T.; Nguyen, A.T. BERTweet: A Pre-Trained Language Model for English Tweets. arXiv 2020, arXiv:2005.10200.
Zhang, Y.; Liu, T.; Long, M.; Jordan, M. Bridging Theory and Algorithm for Domain Adaptation. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 7404–7413.

Figure 1. Word clouds of COVID-19 and Ebola disease. (a) COVID-19; (b) Ebola disease.

Figure 2. t-SNE plot of content on COVID-19 and Ebola.

Figure 3. LCDAN framework: The input text is first processed by the feature extractor to generate a low-dimensional feature representation, followed by classification performed by the informative detector based on this feature. The domain discriminator competes with the feature extractor through the gradient reversal layer to promote the generation of event-independent features. The label confusion distribution assigns weights to the source domain samples to optimize the knowledge transfer process.
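
The gradient reversal layer mentioned in the Figure 3 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `GradientReversal`, the plain-list tensors, and the value `lam=0.5` are assumptions made only for illustration. The layer is the identity in the forward pass and multiplies incoming gradients by $-\lambda$ in the backward pass, so the feature extractor is pushed to increase the domain loss that the discriminator is trying to reduce.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    lam: float = 1.0  # hypothetical trade-off value, not taken from the paper

    def forward(self, features: List[float]) -> List[float]:
        # Features reach the domain discriminator unchanged.
        return list(features)

    def backward(self, grad_from_discriminator: List[float]) -> List[float]:
        # Flipping the sign makes the feature extractor ascend the domain loss
        # while the discriminator descends it, which encourages
        # event-independent features.
        return [-self.lam * g for g in grad_from_discriminator]


grl = GradientReversal(lam=0.5)
out = grl.forward([0.2, -1.0, 3.0])
grad_to_extractor = grl.backward([1.0, -2.0, 0.5])
```

In an autodiff framework the same behaviour would normally be written as a custom differentiable function; the explicit `backward` method here only makes the sign flip visible.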

Figure 4. Word clouds for different domains. The frequency of words is represented by their size, with larger words indicating higher frequency. (a) COVID-19. (b) Worldwide Ebola. (c) Middle East Respiratory Syndrome.

Figure 5. t-SNE plot of content on COVID-19, Ebola, and MERS.

Figure 6. Experimental results for hyperparameter sensitivity (accuracy). (a) task C → W; (b) task W → M.

Table 1. Comparison between LCDAN and the most relevant approaches.

Approach	Pretrained Model Embedding	Domain Adaptation	Distance Calculation	Label Confusion	Sample Weighting
Ganin et al. [21]	-	✓	-	-	-
Wang et al. [23]	-	✓	-	-	-
Zhang et al. [26]	✓	✓	-	-	-
Ding et al. [27]	-	✓	✓	-	✓
Guo et al. [28]	✓	-	-	✓	-
LCDAN	✓	✓	✓	✓	✓

Table 2. Frequently used notations in this paper.

Notation	Description
$x_{s}^{(i)}$ / $x_{t}^{(j)}$	source/target samples
$n_{s}$ / $n_{t}$	number of source/target domain samples
$y_{i}$, $\hat{y}_{i}$	true and predicted label of sample $i$
$p_{s}^{(i)}$	label distribution vector of a source domain sample
$d^{(i)}$	domain label distribution vector
$p_{s}(x, y)$ / $p_{t}(x, y)$	source/target domain distribution
$G_{f}$	feature extractor
$G_{y}$	informative detector
$G_{d}$	domain discriminator
$\lambda$	trade-off parameter
$w_{i}$	dynamic weight
$\mu_{s}$ / $\mu_{t}$	source/target domain center label
$d_{L}$	center distance between the source and target domains
$L_{y}$	loss of the informative detector
$L_{d}$	loss of the domain discriminator

Table 3. Statistical information of cross-domain datasets.

	COVID-19	Ebola	Middle East Respiratory Syndrome
Informative	4689	1422	1107
Uninformative	5247	1000	1000

Table 4. Experimental results with tasks C → W, C → M, and M → C.

Method	C → W				C → M				M → C
Method	Acc	P	R	F1	Acc	P	R	F1	Acc	P	R	F1
BERTopic	0.781	0.862	0.745	0.749	0.779	0.831	0.765	0.763	0.722	0.722	0.722	0.722
Text-CNN	0.807	0.835	0.783	0.791	0.677	0.722	0.660	0.646	0.798	0.799	0.798	0.798
Att-RNN	0.750	0.761	0.727	0.731	0.743	0.749	0.736	0.737	0.502	0.520	0.502	0.360
Bi-LSTM	0.773	0.804	0.745	0.750	0.803	0.805	0.806	0.803	0.646	0.646	0.646	0.646
RoBERTa	0.836	0.886	0.809	0.819	0.843	0.845	0.840	0.841	0.737	0.772	0.737	0.728
BERTweet	0.861	0.879	0.845	0.853	0.823	0.829	0.828	0.823	0.729	0.787	0.729	0.715
MDD	0.823	0.860	0.798	0.807	0.817	0.830	0.810	0.813	0.759	0.764	0.759	0.758
EANN-text	0.834	0.831	0.830	0.831	0.785	0.785	0.784	0.784	0.741	0.757	0.741	0.737
BDANN-text	0.894	0.892	0.892	0.892	0.844	0.866	0.836	0.839	0.846	0.846	0.846	0.846
LCDAN	0.914	0.919	0.907	0.911	0.872	0.886	0.866	0.869	0.854	0.855	0.854	0.854

Table 5. Experimental results with tasks M → W, W → C, and W → M.

Method	M → W				W → C				W → M
Method	Acc	P	R	F1	Acc	P	R	F1	Acc	P	R	F1
BERTopic	0.822	0.826	0.805	0.811	0.651	0.672	0.651	0.640	0.787	0.848	0.708	0.724
Text-CNN	0.768	0.776	0.744	0.750	0.751	0.764	0.751	0.748	0.801	0.822	0.742	0.759
Att-RNN	0.640	0.626	0.615	0.615	0.740	0.743	0.740	0.739	0.663	0.636	0.638	0.637
Bi-LSTM	0.770	0.764	0.769	0.766	0.636	0.636	0.636	0.635	0.814	0.798	0.795	0.797
RoBERTa	0.862	0.860	0.871	0.861	0.722	0.719	0.733	0.722	0.856	0.843	0.842	0.843
BERTweet	0.877	0.883	0.863	0.870	0.716	0.787	0.716	0.697	0.877	0.899	0.837	0.856
MDD	0.878	0.874	0.881	0.876	0.649	0.702	0.649	0.624	0.883	0.879	0.863	0.870
EANN-text	0.885	0.882	0.881	0.882	0.744	0.750	0.744	0.743	0.787	0.794	0.730	0.745
BDANN-text	0.892	0.892	0.884	0.888	0.816	0.816	0.816	0.815	0.881	0.887	0.853	0.865
LCDAN	0.917	0.917	0.911	0.914	0.867	0.867	0.867	0.867	0.897	0.893	0.881	0.886

Table 6. Ablation study results for LCDAN.

Method	C → W		C → M		M → C		M → W		W → C		W → M
Method	Acc	F1	Acc	F1	Acc	F1	Acc	F1	Acc	F1	Acc	F1
w/o G_f	0.591	0.584	0.713	0.701	0.631	0.608	0.797	0.786	0.593	0.589	0.741	0.685
w/o G_d	0.861	0.851	0.780	0.763	0.824	0.824	0.888	0.886	0.739	0.736	0.884	0.876
w/o G_L	0.894	0.892	0.844	0.839	0.846	0.846	0.836	0.805	0.816	0.815	0.881	0.865
LCDAN	0.914	0.911	0.872	0.869	0.854	0.854	0.917	0.914	0.867	0.867	0.897	0.886

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
