Article

Fair Classification Without Sensitive Attribute Labels via Dynamic Reweighting

1 Department of Artificial Intelligence, Inha University, Incheon 22212, Republic of Korea
2 Department of Computer Science and Engineering, Incheon National University, Incheon 22012, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(4), 1684; https://doi.org/10.3390/app16041684
Submission received: 6 January 2026 / Revised: 1 February 2026 / Accepted: 5 February 2026 / Published: 7 February 2026
(This article belongs to the Special Issue Machine Learning and Soft Computing: Current Trends and Applications)

Abstract

Fairness-aware classification with respect to sensitive attributes, such as gender and race, is one of the most important topics in machine learning. Although numerous studies have made outstanding progress through various approaches, one key limitation is that they require additional sensitive attribute labels for training. This poses a significant challenge since sensitive attributes typically correspond to personal information. To address this, we propose a novel reweighting method that dynamically assigns greater weight to underrepresented groups across potential sensitive attributes. Without auxiliary networks or strong assumptions about sensitive attributes, the proposed method significantly improves fairness under various scenarios on benchmark datasets, outperforming existing state-of-the-art methods.

1. Introduction

The abundance of training data and the development of sophisticated deep learning models have allowed us to solve various real-world problems with high accuracy and efficiency [1,2,3,4]. This has enabled the widespread use of AI models in real-world systems but has also brought forth a new issue, namely AI fairness, which had been overshadowed by their remarkable performance [5,6,7]. For instance, AI models deployed by national institutions and global corporations, such as COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) [8], Google Photos [9], and FaceApp [10], have sometimes exhibited unethical behaviors related to race, resulting in significant social repercussions. Specifically, the COMPAS algorithm, which predicts the risk of recidivism, was found to judge black defendants as being at a higher risk of recidivism more frequently than white defendants [8]. Similarly, Google Photos and FaceApp exhibited discriminatory behavior toward certain races in their image recognition and editing services [9,10]. These issues have motivated researchers to investigate the topic of fairness in AI [11,12,13,14,15].
To this end, researchers established the definition of sensitive attributes (e.g., gender, race, age, and region) as the characteristics on the basis of which people should not be discriminated against, and introduced formal definitions of fairness (e.g., demographic parity [11], equal opportunity, and equalized odds [16]). Building on these definitions, researchers have proposed several methods to improve fairness by generating a fair dataset [17,18,19], excluding (or suppressing) sensitive attribute information in representation learning [20,21,22,23], and reweighting sample importance [24,25].
However, most of these methods require additional labels for sensitive attributes to ensure fairness, which leads to several limitations [26,27]. First, many popular benchmark datasets in the research community do not contain labels for sensitive attributes [1,28,29]. Moreover, since sensitive attributes are often related to personal information, labeling them in existing datasets can be ethically risky and may require the data providers’ consent [26]. This is a significant obstacle to the deployment of existing fairness-aware methods in the real world.
Consequently, recent studies have endeavored to alleviate model bias without demographics. Many of these works, particularly those based on debiasing approaches, tackle this challenge by capitalizing on specific assumptions about the characteristics of sensitive attributes or unwanted biases. For instance, they assume texture or color biases [30,31,32], a malignant attribute [33,34,35,36], or a high correlation with proxies [27,37,38], allowing the effective mitigation of model bias without demographics. However, since sensitive attributes can be diversely defined based on social consensus and context, they may neither be malignant attributes nor have a substantial correlation with proxies. Consequently, these assumptions impose limitations on achieving universal fairness for potential sensitive attributes. On the other hand, some studies [39,40,41,42] do not rely on such assumptions; however, they suffer from significant performance degradation [39,40], require a small validation set with sensitive attribute labels [41], or rely on a complex teacher network [42].
To address these limitations, we introduce a simple but effective reweighting method that enhances fairness without strong assumptions regarding sensitive attributes. To improve fairness, conventional reweighting approaches [25,39,43,44] utilize predefined sensitive information to identify underrepresented groups and assign higher weights to them than to others. In contrast, our method can assign higher weights to underrepresented groups without any knowledge of sensitive attributes.
In contrast to defining underrepresentation based solely on group size, we define it as performance-wise underrepresentation. For instance, a minority subgroup with clear and consistent features may achieve a low error rate and thus is not considered underrepresented under our definition. Conversely, a majority subgroup with substantial intra-group variability may exhibit poor performance and is treated as underrepresented. Motivated by this performance-based view, we propose to dynamically cluster samples into two groups, i.e., misclassified and correctly classified samples, based on the outputs of the current model, and subsequently assign a higher weight to the misclassified group. Since performance-wise underrepresented groups inherently contain a higher proportion of misclassified samples than their counterparts, a higher average weight is effectively given to such groups across any potential sensitive attributes.
On benchmark datasets for fair classification (i.e., CelebA [2], UTK Face [45], and COMPAS [8]), we extensively validate the effectiveness of the proposed method. Our method not only achieves state-of-the-art performance under diverse experimental settings but also exhibits greater improvement in more severely unfair environments. The main contributions of this paper are summarized as follows.
  • We propose a novel method that dynamically groups data samples based on the current model outputs and assigns greater weight to the group of misclassified samples.
  • Regardless of the choice of sensitive attributes, the proposed method assigns greater weight to underrepresented groups, thereby enhancing fairness with respect to unknown sensitive attributes.
  • Without strong assumptions about sensitive attributes or auxiliary networks, our method significantly outperforms state-of-the-art methods on the benchmark datasets.

2. Related Work

2.1. Fairness-Aware Classification

In the literature [20,23,46,47], it has been indicated that learning sensitive attributes during the training phase is a major cause of unfairness. To solve the problem, prior works have attempted to suppress the learning of unnecessary biases in the training process or eliminate them in the test phase. For instance, some approaches employed the gradient reversal layer (GRL) to prevent the model from learning sensitive attributes [46,48,49]. Meanwhile, others proposed to directly disentangle the representation for target and sensitive attributes in the feature space [20,21,47]. On the other hand, generative approaches have been proposed [17,18,19] that synthesize a fair dataset based on a generative adversarial network (GAN) [50]. Lastly, some works enhanced fairness by upweighting or upsampling the data of minority groups [24,25,43,44].

2.2. Fairness/Debiasing Without Bias Supervision

To mitigate model bias without prior knowledge, many studies have relied on certain assumptions about the characteristics of sensitive attributes and biases. Some of them [30,31,32] assumed specific types of biases (e.g., texture and color) and designed an auxiliary network specialized to those biases. Others [33,34,35,36] grouped biases into two categories, i.e., malignant and benign biases, based on their relationship with the target attribute. They aimed at handling only the malignant biases, while the benign ones were left untouched. Meanwhile, others [27,37,38] assumed that proxies, such as clusters and feature representations, have a high correlation with sensitive attributes, and leveraged them as surrogates for the labels.
On the other hand, several works [39,40,51] adopted minimum–maximum fairness that minimizes the risk of the worst group. While they effectively enhance fairness by mitigating the performance gaps across all groups, they may experience a degradation in performance. Just Train Twice (JTT) [41] is similar to ours in that it upweights the misclassified samples. However, it constructs a fixed dataset by oversampling the misclassified samples with a pre-trained classifier. Subsequently, a new classifier is trained fairly using this dataset. Since the dataset fails to reflect the evolving classifier, it may not guarantee assigning more weight to minority groups for potential sensitive attributes. Furthermore, it needs a small validation set with sensitive attribute labels to tune the hyperparameters. Chai, Jang, and Wang [42] improved fairness without demographics by distilling knowledge from a complex teacher. Some works [52,53] proposed new approaches to enhance fairness in self-supervised settings.
Additionally, recent studies have proposed diverse perspectives to improve fairness without explicit bias supervision. Shared Latent Space-based Debiasing (SLSD) [54] leverages sensitive attribute information in an auxiliary source domain to learn a shared latent space between the source and target domains, and applies adversarial debiasing so that sensitive information is not encoded in the target model. However, since it relies on sensitive attribute labels in the source domain, it differs from methods that assume no access to sensitive labels. VFair [55] adopts a Rawlsian perspective that aims to improve the worst group performance without demographic labels, mitigating latent group disparities by minimizing the variance of losses across samples. Nonetheless, it reports that its effectiveness may be limited in classification tasks due to quantized utility measurements. The Graph of Gradients (GoG) [56] constructs a graph of gradients to learn sample weights by propagating weights to harder instances through graph neighborhoods. However, it incurs additional overhead due to gradient extraction and graph construction, and the reported improvements in the fairness–accuracy trade-off are sometimes modest.

3. Method

  • Problem Definition. Consider a data sample $x$ comprising a target attribute $y \in \{c_1, c_2, \ldots, c_n\}$ and unknown sensitive attributes $S = \{s_1, s_2, \ldots, s_m\}$. During the training of a classifier parameterized by $\theta$ to predict $y$, the model may capture biased features associated with sensitive attributes as a shortcut to minimize the average training loss. Our goal is to train a classifier that is invariant to the sensitive attribute $s \in S$, without access to sensitive attribute labels, as follows:
    $$\arg\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}\left[\mathcal{L}(\theta(x), y)\right] \quad \text{s.t.} \quad I(\theta(x); s) = 0.$$
    Here, $\mathcal{D}$ denotes the training data distribution and $\mathbb{E}_{(x,y)\sim\mathcal{D}}[\cdot]$ denotes the expectation over samples $(x, y)$ drawn from $\mathcal{D}$. $\mathcal{L}(\cdot, \cdot)$ denotes the classification loss and $I(\cdot\,;\cdot)$ denotes mutual information.
In this work, we adopt equalized odds (EO) [16] for the definition of fairness, as follows:
$$P_{\dot{s}}(\tilde{y} = \tilde{c} \mid y = c) = P_{\ddot{s}}(\tilde{y} = \tilde{c} \mid y = c), \quad \forall \dot{s}, \ddot{s}, c, \tilde{c},$$
where $\dot{s}, \ddot{s} \in S$ and $c, \tilde{c} \in \{c_1, \ldots, c_n\}$. Here, $\tilde{y}$ denotes the predicted label of the classifier, and $P_{\dot{s}}(\cdot)$ and $P_{\ddot{s}}(\cdot)$ denote the conditional probabilities for samples in the subgroups with $\dot{s}$ and $\ddot{s}$, respectively. This condition requires that, for any two sensitive groups $\dot{s}$ and $\ddot{s}$, the conditional distribution of predictions given $y = c$ is identical across groups.
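To make this criterion concrete, the following minimal sketch (our illustration, not code from the paper) estimates the equalized-odds gap for a binary sensitive attribute from arrays of labels, predictions, and group indicators; the function name and the reduction of the group-wise differences to a single maximum gap are our assumptions.

```python
import numpy as np

def equalized_odds_gap(y_true, y_pred, s):
    """Largest gap in P(y_pred = c_tilde | y = c) between two sensitive groups.

    y_true, y_pred: integer class labels; s: binary sensitive-group indicator.
    Illustrative estimator of the EO condition above, not the authors' exact
    evaluation code.
    """
    gaps = []
    for c in np.unique(y_true):              # condition on each true class c
        for c_tilde in np.unique(y_pred):    # compare each predicted class c_tilde
            rates = []
            for group in (0, 1):             # the two sensitive subgroups
                mask = (y_true == c) & (s == group)
                if mask.sum() == 0:
                    continue
                rates.append(np.mean(y_pred[mask] == c_tilde))
            if len(rates) == 2:
                gaps.append(abs(rates[0] - rates[1]))
    return 100.0 * max(gaps)  # reported as a percentage gap

# Example: a perfectly fair predictor yields a gap of 0.
# y = np.array([0, 0, 1, 1]); p = np.array([0, 0, 1, 1]); s = np.array([0, 1, 0, 1])
# print(equalized_odds_gap(y, p, s))  # -> 0.0
```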
  • Overall Framework. As illustrated in Figure 1, we design the overall framework using Bias Pseudo-Attribute (BPA) [27] as a baseline. The major difference between the two frameworks is that BPA calculates fixed weights for samples using clusters from the pre-trained classifier, whereas ours dynamically updates the sample weights based on the outputs $\tilde{y}$ of the current training model.

3.1. Strategy for Sample Grouping

In the previous work [27], BPA clusters training samples in the feature space of a biased model to estimate sensitive pseudo-attributes, assuming a high correlation between the target and sensitive attributes. Specifically, they pre-train a biased classification model $\bar{\theta}$ with target labels $y$ and perform k-means clustering for samples with the same target class $c$, which is formulated as $k_y = \text{k-means}(\bar{\theta}, x, y)$. Subsequently, the samples are assigned into groups $g_i = \{(x, y) \mid i = y \times k_y\}$ based on the obtained clusters. Under their assumption, the groups effectively identify sensitive pseudo-attributes; however, they may fail to identify them when the target and potential sensitive attributes are not sufficiently correlated.
To overcome this limitation, we first suppose the existence of underrepresented groups $U(y, S)$ and overrepresented groups $O(y, S)$ for the target attribute $y$ and unknown sensitive attributes $S$. We define the underrepresented groups in terms of performance-wise underrepresentation, referring to subgroups that underperform under the current model, rather than groups defined solely by their sample counts. Since $U(y, S)$ exhibits lower classification performance than $O(y, S)$, $U(y, S)$ contains a higher proportion of misclassified samples. Therefore, assigning greater loss weights to misclassified samples than to the remaining samples attributes greater loss weights to $U(y, S)$ than to $O(y, S)$, regardless of the characteristics of the sensitive attributes $S$. Based on this motivation, we propose a new strategy for grouping samples. Specifically, our approach dynamically clusters samples into binary groups $g_i$ ($i \in \{0, 1\}$) based on the prediction $\tilde{y}$ of the training classification model $\theta$ as follows:
$$g_i = \{(x, y) \mid i = \mathbb{1}(\tilde{y} \neq y)\},$$
where $\mathbb{1}(\cdot)$ is the Boolean indicator function. Subsequently, we assign a higher loss weight to the group of misclassified samples (i.e., $g_1$) than to the group of correctly classified samples (i.e., $g_0$). Consequently, we can upweight the underrepresented groups $U(y, S)$ by utilizing the misclassified samples as a proxy. This enhances fairness by diminishing the accuracy gaps between $U(y, S)$ and $O(y, S)$ for any potential sensitive attributes.

3.2. Calculation of Group-Wise Weight

With regard to reweighting the grouped samples, we employ a smoothed version of Group Distributionally Robust Optimization (Group DRO) [24] proposed in [27]. When a target attribute $y$ and sensitive attributes $S$ are given, Group DRO [24] defines groups $g \in G = \{1, 2, \ldots, m\}$, where $m = |y| \times |S|$ and $|\cdot|$ denotes the cardinality of a set. It assumes that the overall data distribution $P$ is a mixture of the group distributions $\{P_g\}_{g \in G}$ and minimizes the empirical worst-group risk $R(\theta)$ as follows:
$$\hat{\theta}_{DRO} = \arg\min_{\theta} R(\theta), \quad R(\theta) := \max_{g \in G} \mathbb{E}_{(x,y)\sim \hat{P}_g}\left[\ell(x, y; \theta)\right],$$
where $\hat{P}_g$ is the empirical distribution over the training data approximating the group distribution $P_g$, and $\ell(\cdot)$ is a loss function.
On the other hand, BPA [27] cannot access $S$ during training and substitutes the clusters for the sensitive attributes. In addition, instead of minimizing the worst-group risk, it assigns different weights $w_{g_i}$ to the groups as follows:
$$\hat{\theta}_{BPA} = \arg\min_{\theta} R(\theta), \quad R(\theta) := \mathbb{E}_{(x,y)\sim P}\left[w_{g_i}\, \ell(x, y; \theta)\right].$$
The importance weight $w_{g_i}$ is proportional to the average loss of the group and inversely proportional to its size:
$$w_{g_i} = \frac{\mathbb{E}_{(x,y)\in g_i}\left[\ell(x, y; \theta)\right]}{|g_i|},$$
where $|g_i|$ denotes the cardinality of group $g_i$.
In our method, the group of misclassified samples (i.e., $g_1$) has a higher average loss than the group of correctly classified samples (i.e., $g_0$); this is intuitive, and we provide empirical support in Figure 2. In addition, as the classification model becomes more proficient during training, the size of $g_1$ shrinks relative to that of $g_0$. Both factors jointly ensure effective reweighting of the groups and assign a greater weight to $g_1$ than to $g_0$. This adaptive mechanism eliminates the need to tune the weighting ratio, which is typically infeasible when sensitive attribute labels are unavailable. Consequently, our method automatically balances the relative emphasis between groups without requiring additional hyperparameter tuning.
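As a concrete illustration of the weight defined above, the following sketch (our own, not the authors' released code) computes $w_{g_0}$ and $w_{g_1}$ from per-sample losses and a misclassification mask; the function name and any normalization beyond the equation are assumptions.

```python
import torch

def group_weights(losses, misclassified):
    """Compute w_{g_i} = (mean loss of group i) / |group i| for i in {0, 1}.

    losses: per-sample losses (1-D tensor); misclassified: boolean tensor that is
    True where the current model's prediction differs from the label.
    Illustrative sketch of the w_{g_i} equation; implementation details may differ
    from the original code.
    """
    weights = {}
    for i, mask in enumerate((~misclassified, misclassified)):  # g_0, then g_1
        size = mask.sum().item()
        if size == 0:
            weights[i] = 0.0
        else:
            weights[i] = losses[mask].mean().item() / size
    return weights  # {0: w_{g_0}, 1: w_{g_1}}
```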

3.3. Algorithm for Overall Flow

To clarify the overall procedure of the proposed method, we present Algorithm 1. Here, $D$ denotes the training dataset, $E$ is the total number of epochs, $|B|$ is the batch size, $iter$ is the mini-batch iteration index within each epoch, $\eta$ is the learning rate, and $\ell(\cdot)$ denotes the loss function. $\tilde{y}_k$ denotes the predicted label for sample $x_k$. $g_i$ ($i \in \{0, 1\}$) denotes the binary groups defined by misclassification, and $w_{g_i}$ denotes the corresponding group weight.
Algorithm 1 Overall Flow
Require: Training data $D$, classifier $\theta$, loss function $\ell$, total epochs $E$, batch size $|B|$, learning rate $\eta$
1:  Initialize $\theta$
2:  for $e = 1, 2, \ldots, E$ do
3:    while $iter \leq |D| / |B|$ do
4:      Randomly sample a batch $\{(x_j, y_j)\}_{j=1}^{|B|}$
5:      if $e = 1$ then
6:        $\theta \leftarrow \theta - \eta \nabla_{\theta} \frac{1}{|B|} \sum_{j=1}^{|B|} \ell(x_j, y_j; \theta)$
7:      else
8:        $\theta \leftarrow \theta - \eta \nabla_{\theta} \frac{1}{|B|} \sum_{j=1}^{|B|} w_{g_i}\, \ell(x_j, y_j; \theta)$
9:      end if
10:     $iter \leftarrow iter + 1$
11:   end while
12:   $g_i = \{(x_k, y_k) \mid i = \mathbb{1}(\tilde{y}_k \neq y_k)\}$ for $k = 1, \ldots, |D|$
13:   $w_{g_i} = \mathbb{E}_{(x,y)\in g_i}\left[\ell(x, y; \theta)\right] / |g_i|$
14: end for
During the first epoch, we train the classification model with the standard cross-entropy loss. With the updated model $\theta$, we group the training samples into $g_i$ and calculate the group-wise loss. Based on the average loss and the size of each group, we calculate the weight $w_{g_i}$. In the next epoch, we fairly train the classification model with the weighted loss for the groups. This process is repeated until the end of training.
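The following PyTorch-style sketch illustrates one way Algorithm 1 could be implemented. It is our illustration under stated assumptions (a generic `model`, `train_loader`, and `full_loader` over the whole training set), not the authors' released code, and it reuses the hypothetical `group_weights` helper sketched in Section 3.2.

```python
import torch
import torch.nn.functional as F

def train_dynamic_reweighting(model, train_loader, full_loader, epochs=30, lr=1e-4):
    """Sketch of Algorithm 1: plain ERM in epoch 1, group-reweighted loss afterward."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    weights = None  # no group weights are available before the first regrouping

    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            logits = model(x)
            per_sample = F.cross_entropy(logits, y, reduction="none")
            if weights is None:                        # first epoch: standard loss
                loss = per_sample.mean()
            else:                                      # later epochs: weighted loss
                wrong = logits.argmax(dim=1) != y
                w = torch.where(wrong,
                                torch.full_like(per_sample, weights[1]),
                                torch.full_like(per_sample, weights[0]))
                loss = (w * per_sample).mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # End of epoch: regroup all training samples and recompute w_{g_0}, w_{g_1}.
        model.eval()
        losses, wrong_mask = [], []
        with torch.no_grad():
            for x, y in full_loader:
                logits = model(x)
                losses.append(F.cross_entropy(logits, y, reduction="none"))
                wrong_mask.append(logits.argmax(dim=1) != y)
        weights = group_weights(torch.cat(losses), torch.cat(wrong_mask))
    return model
```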

4. Experiments

4.1. Datasets

  • CelebA contains about 200k face images with 40 facial attributes. Following the previous works [23,27,33], we set Male as the sensitive attribute for evaluation. Following the convention [27], among the remaining attributes, we exclude 5 o’clock Shadow, Bald, Rosy Cheeks, Sideburns, Goatee, Mustache, and Wearing Necktie, for which minority groups contain few or no samples.
  • UTK Face includes about 20k face images annotated with Gender, Race, and Age attributes. We set Gender as the target attribute and the others as the sensitive attributes. Following the previous work [23], we convert Race and Age into binary attributes and construct an imbalanced training set. Specifically, we construct three subsets according to the severity of data imbalance γ, where γ denotes the ratio between the majority and minority groups and ranges from two to four with a step size of one (a construction sketch is given after this list). The validation and test sets are fully balanced.
  • COMPAS includes about 7k samples with 11 attributes. Following the convention [42], we only utilize Caucasian and African-American samples and set Race and Sex as sensitive attributes.
  • Cat and Dog contains 40k images of dogs and cats. Following the previous method [22], we set species as the target attribute and color as the bias attribute, and, following the setup of the previous work [23], we construct a training set in which dogs and cats are correlated with white and black color, respectively. The validation and test sets are fully balanced.
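The sketch below illustrates one plausible way to build the imbalanced UTK Face training subsets described above for a binary target and a binary sensitive attribute. The exact subsampling protocol is not specified in the text, so the scheme (keep the majority (target, sensitive) combinations intact and subsample the minority combinations to a ratio of γ) and the function name are our assumptions.

```python
import numpy as np

def make_imbalanced_subset(y, s, gamma, majority_pairs, rng=None):
    """Return indices of a training subset where each majority (target, sensitive)
    combination outnumbers its minority counterpart by roughly a factor of gamma.

    y, s: binary target / sensitive labels; majority_pairs: set of (y, s) pairs
    treated as majority groups. Hypothetical sketch, not the paper's exact protocol.
    """
    rng = rng or np.random.default_rng(0)
    keep = []
    for c in (0, 1):
        for a in (0, 1):
            idx = np.where((y == c) & (s == a))[0]
            if (c, a) in majority_pairs:
                keep.append(idx)  # keep the majority combination intact
            else:
                # subsample so that majority : minority is approximately gamma : 1
                partner = np.where((y == c) & (s == 1 - a))[0]
                n_keep = min(len(idx), int(len(partner) / gamma))
                keep.append(rng.choice(idx, size=n_keep, replace=False))
    return np.concatenate(keep)
```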

4.2. Evaluation Metrics

We evaluate the models based on three metrics. The first is balanced accuracy (BAcc.), which measures classification performance and is defined as the average accuracy across all groups defined by both target and sensitive attributes. This metric evaluates the generalized performance of the models [22,27,33]. Regarding fairness, there is a range of conventional metrics, such as demographic parity [11], equal opportunity, and equalized odds [16]. Among them, we adopt equalized odds (EO) to consider the overall distribution of errors across diverse demographic groups. Demographic parity and equal opportunity are not chosen since they concentrate only on the predicted positive rates and true positive rates, respectively. Lastly, we also report the standard deviation of group-wise accuracy as a supplementary metric to EO. Specifically, an extremely high standard deviation reveals undesirable situations, such as the model disproportionately favoring a single class.
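As an illustration of how these metrics can be computed, the sketch below (our own; the function name is not from the paper) derives balanced accuracy and the group-wise accuracy standard deviation from predictions and the target/sensitive group assignments; the EO gap can be computed as in the sketch of Section 3.

```python
import numpy as np

def group_accuracy_stats(y_true, y_pred, s):
    """Balanced accuracy and std of accuracy over groups defined by (target, sensitive).

    y_true, y_pred: integer target labels; s: integer sensitive-attribute labels.
    Illustrative only; the grouping follows the metric description in the text.
    """
    accs = []
    for c in np.unique(y_true):
        for a in np.unique(s):
            mask = (y_true == c) & (s == a)
            if mask.sum() > 0:
                accs.append(np.mean(y_pred[mask] == y_true[mask]))
    accs = np.array(accs)
    return 100.0 * accs.mean(), 100.0 * accs.std()  # (BAcc., Std Dev.)
```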

4.3. Comparison on COMPAS

We report the comparison results on COMPAS for the sensitive attributes Race and Sex in Table 1 and Table 2, respectively. Focal Loss [57] effectively improves accuracy by upweighting hard examples. However, since it does not explicitly address performance gaps across sensitive groups, it yields only limited improvements in equalized odds (EO). Distributionally Robust Optimization (DRO) [39] and Adversarially Reweighted Learning (ARL) [40] significantly improve fairness over the baseline without strong assumptions about sensitive attributes; however, they show a worse accuracy–EO trade-off than the other fairness-oriented methods. FairRF [38], a proxy-based approach, and Chai, Jang, and Wang [42] exhibit superior trade-off performances. Without strong assumptions or auxiliary networks, ours achieves the best trade-off performance.
  • Comparison with Boosting. Boosting methods, such as Adaboost [58] and Gradient Boosting [59], are similar to ours in that they upweight misclassified samples during training. However, there are significant differences. First, our method trains a single model by dynamically reweighting samples based on the outputs of the previous training stage, whereas boosting methods iteratively train new weak learners with sample weights based on the outputs of previous learners. Second, in boosting methods, the sample weights for each weak learner are not dynamically updated. Lastly, the two approaches calculate the weights differently. We compare ours with the boosting approaches in Table 1. While boosting methods effectively improve classification accuracy over the baseline, they fail to enhance fairness.

4.4. Comparison on CelebA

To validate the proposed method under various scenarios, we conduct experiments with 32 target attributes on CelebA, excluding five attributes for reliable evaluation. In Table 3, we report the summarized results, along with the cases where EO of ResNet [3] exceeds 10 and 20 to analyze the effectiveness in an environment with severe unfairness. Group DRO [24] utilizes sensitive attribute labels during the training phase, and thus it serves as an upper bound for the other comparative methods. Learning from Failure (LfF) [33], Disentangled Feature Augmentation (DFA) [34], Bias Pseudo-Attribute (BPA) [27], and ours significantly improve both balanced accuracy (i.e., BAcc.) and equalized odds (i.e., EO) without sensitive attribute labels. Ours not only achieves the fairest results among them but also shows larger gains as unfairness becomes more severe (i.e., EO ≥ 10 and EO ≥ 20). In contrast to those, Just Train Twice (JTT) utilizes additional sensitive attribute labels during the validation process; nevertheless, ours outperforms it in terms of BAcc. and EO. In addition, we report the standard deviation (Std Dev.) to detect a shortcut, such as a model consistently predicting only one class, since it leads to extremely high standard deviation. For example, it is observed that the baseline shows extremely high standard deviation for several attributes, including Big Lips, Chubby, and Wearing Necklace. We include the full results tables in Table 4 and Table 5.

4.5. Comparison on UTKFace

In Table 6 and Table 7, we compare the effectiveness of models under different levels of data imbalance. As γ increases, ResNet shows a significant decrease in fairness. BPA and DFA significantly enhance fairness for the target attribute Age, while having a limited impact on the other target attribute Race. In contrast, LfF notably mitigates unfairness for Age; however, it suffers from a substantial performance drop. Ours largely improves fairness over the baseline and achieves the best trade-off performances in all the settings. As γ increases, ours exhibits a more significant improvement in fairness compared to the baseline. Similar to the results on CelebA, this indicates that our method has greater efficacy in addressing severe unfair scenarios.

4.6. Exploring Assumptions for Sensitive Attributes

As mentioned above, many previous works rely on assumptions about sensitive attributes to enhance fairness without demographics. In particular, it is popularly assumed that the sensitive attribute is a malignant attribute or is highly correlated with proxies. In this section, we analyze some scenarios where such assumptions do not hold. Specifically, we compare our method with the representative methods corresponding to each assumption, namely LfF [33] and BPA [27], for four target attributes: Attractive, Wearing Earrings, Chubby, and Brown Hair. Regarding Attractive, the sensitive attribute (i.e., Male) is a malignant attribute yet is less correlated with the proxy (i.e., target attribute). In contrast, for Wearing Earrings, it is not a malignant attribute and has a high correlation with the target attribute. For Chubby and Brown Hair, it is neither a malignant attribute nor highly correlated with the target attributes.
For all the scenarios, we measure the average weights of the majority and minority groups for the comparative methods. Note that all the weights are normalized for better visualization. Figure 3 shows that LfF effectively gives more weight to the minority groups for Attractive, where the sensitive attribute is a malignant attribute. However, it instead upweights the majority groups for Wearing Earrings, where it is not a malignant attribute. In contrast, BPA gives very similar weights to both the majority and minority groups for Attractive, which has a low correlation with the sensitive attribute, while it preferentially upweights the minority group for Wearing Earrings. For Chubby and Brown Hair, both LfF and BPA fail to upweight the minority groups. As claimed above, the proposed method attributes a higher average weight to the underrepresented groups regardless of the choice of sensitive attributes, and it effectively assigns greater weight to the minority groups in all the scenarios.

4.7. Effectiveness of Components

In this section, we analyze the effectiveness of our design through a comparison between different clustering strategies and an ablation study. For this purpose, we prepare several competitors using different clustering strategies.
  • BPA* is a modified version of BPA [27] that dynamically updates the clusters in the latent space of the debiased model with k-means clustering. In the first epoch, it generates clusters using k-means clustering in the latent space of the pre-trained model. In subsequent epochs, the clusters are updated with the same clustering method in the latent space of the current training model at each epoch. Based on the updated clusters, it applies the same reweighting method as BPA.
  • Random randomly clusters samples for each epoch. For each batch, it clusters the input samples into two groups randomly. For each cluster, samples are differently weighted based on the average loss and size of the respective groups, as in our method.
  • Loss sorts samples by loss and subsequently separates them into groups based on their ranking. At each epoch, it computes the training loss of all samples and separates them into two groups based on the loss, i.e., one group consists of samples with high loss, whereas the other group comprises samples with low loss. Specifically, we set the threshold to the median value of the loss and then assign weights to the samples in proportion to the average loss of the respective groups. If the groups are more finely separated, it converges toward Instance, as shown in Table 8.
  • EO first composes clusters by k-means clustering similar to BPA. In subsequent epochs, multiple candidate groups are generated by permuting samples between the groups with a probability of p = 0.3 . These candidate groups are evaluated in terms of equalized odds, and the groups with the largest equalized odds between them are selected. We then assign weights to the groups with our reweighting strategy.
  • Instance computes the training loss for all samples using the current training model and assigns a weight to each sample in proportion to the loss. These weights are then normalized by the batch so that their sum equals one.
  • Ours is an ablative version of our method. In the first epoch, it clusters samples based on whether they are correctly classified, as in the proposed method; however, it does not update the clusters in subsequent epochs.
For the comparison, we choose two target attributes that cause severe unfairness on CelebA, i.e., Attractive and Arched Eyebrows. As reported in Table 9, BPA improves balanced accuracy over the baseline for Arched Eyebrows, which is highly correlated with the sensitive attribute, but it is not very effective in terms of EO. Dynamic updating (i.e., BPA*) slightly improves both balanced accuracy and EO over BPA for both target attributes. The results of Random indicate that randomly reweighting samples can be somewhat effective in improving the generalized performance. On the other hand, the other strategies (i.e., Loss, EO, and Instance) exhibit limited improvements in terms of fairness. In particular, Instance is vulnerable to outliers, such as mislabeled data, as it assigns the largest weights to extreme examples [40], as shown in Table 10. We note that this experiment is intended to compare sensitivity to label noise between the proposed method and instance-level weighting, rather than to claim absolute robustness. In addition, the results of the ablative version of ours (without updating) suggest that dynamically updating the clusters is highly important in our method. Lastly, ours, with dynamic clustering, achieves the fairest performance among the competitors, demonstrating its effectiveness.
Table 8. Relationship between Loss and Instance. We set Arched Eyebrows and Male as the target and sensitive attributes, respectively.

Metric   2 Groups   4 Groups   8 Groups   Instance
BAcc.    76.7       75.9       74.1       71.4
EO       32.9       29.3       28.4       23.8
Table 9. Analysis on clustering strategy and ablation study. We exploit two target attributes, Attractive and Arched Eyebrows, which respectively have low and high correlation with the sensitive attribute, Male. Updating indicates whether the initial clusters are fixed or dynamically updated.

Method     Updating   Attractive (BAcc. / EO / Std Dev.)   Arched Eyebrows (BAcc. / EO / Std Dev.)
ResNet     –          76.7 / 25.6 / 15.5                   70.8 / 33.8 / 27.0
BPA        ✗          76.4 / 24.1 / 16.6                   73.6 / 34.3 / 31.9
BPA*       ✓          77.7 / 21.2 / 13.5                   77.1 / 28.5 / 21.6
Random     ✓          77.7 / 21.5 / 13.7                   74.9 / 38.9 / 23.6
Loss       ✓          78.0 / 18.2 / 10.5                   76.7 / 32.9 / 23.2
EO         ✓          77.0 / 22.0 / 13.2                   75.7 / 34.6 / 27.1
Instance   ✓          76.1 / 23.2 / 19.9                   71.4 / 23.8 / 24.2
Ours       ✗          77.4 / 23.6 / 14.2                   74.4 / 38.2 / 22.2
Ours       ✓          75.5 / 6.4 / 9.3                     75.9 / 15.3 / 20.5
Table 10. Robustness to mislabeled samples. We set Race and Gender as the target and sensitive attributes on UTKFace. We deliberately assign incorrect target labels to 10% or 20% of the training data.

Method     Clean (BAcc. / EO)   10% Noised (BAcc. / EO)   20% Noised (BAcc. / EO)
Instance   79.4 / 16.4          75.4 / 25.0               68.2 / 33.3
Ours       80.9 / 9.5           75.2 / 14.1               68.6 / 17.0

4.8. Fairness Improvement in Semi-Supervised Setting

When access to sensitive attributes is not completely prohibited, we demonstrate that a significant improvement in fairness can be achieved by annotating sensitive attribute labels for a limited number of training samples. In this semi-supervised setting, we train a classification model by applying Group DRO [24] to the data with sensitive attribute labels and our method to the unlabeled data. As shown in Table 11, our method achieves a substantial improvement in fairness with just 10% of labeled data and, as the amount of labeled data increases, reaches an EO comparable to that of the fully supervised Group DRO.
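A minimal sketch of how the two objectives might be combined in this semi-supervised setting is shown below. The paper does not specify the exact combination, so the per-batch sum of a Group DRO term on the labeled subset and our reweighted term on the unlabeled subset, as well as the helper names, are our assumptions.

```python
import torch
import torch.nn.functional as F

def semi_supervised_loss(model, labeled_batch, unlabeled_batch,
                         group_weights_dro, weights_ours):
    """Combine a Group DRO-style weighted loss on sensitive-labeled data with the
    proposed misclassification-based reweighting on unlabeled data.

    labeled_batch: (x, y, g) where g indexes (target, sensitive) groups;
    unlabeled_batch: (x, y) without sensitive labels;
    group_weights_dro: per-group weight tensor maintained by Group DRO (hypothetical);
    weights_ours: {0: w_g0, 1: w_g1} from the dynamic reweighting step.
    Illustrative sketch, not the authors' implementation.
    """
    x_l, y_l, g_l = labeled_batch
    loss_l = F.cross_entropy(model(x_l), y_l, reduction="none")
    dro_term = (group_weights_dro[g_l] * loss_l).mean()

    x_u, y_u = unlabeled_batch
    logits_u = model(x_u)
    loss_u = F.cross_entropy(logits_u, y_u, reduction="none")
    wrong = logits_u.argmax(dim=1) != y_u
    w_u = torch.where(wrong,
                      torch.full_like(loss_u, weights_ours[1]),
                      torch.full_like(loss_u, weights_ours[0]))
    ours_term = (w_u * loss_u).mean()

    return dro_term + ours_term
```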

4.9. Analysis on Cat and Dog

Differing from CelebA and UTK Face with sensitive facial attributes, Cat and Dog includes a more general bias, i.e., color. In Table 12, we demonstrate that the proposed method can effectively reduce different types of biases. For all the compared methods, both balanced accuracy and equalized odds are significantly improved over the baseline, and ours achieves the best trade-off performance. In particular, the equalized odds of the proposed method is approximately half that of the second-best method.

4.10. Implementation Details

For a fair comparison, all comparative models are implemented using ResNet-18 [3] as the encoder network and a single fully connected layer as the classifier. We utilize the cross-entropy loss as the classification loss $\ell(\cdot)$. Following the previous work [27], we adopt an improved version of LfF [33] for comparison. For DFA [34], we search the hyperparameters in the range {0.1, 1, 5, 10}. For BPA [27], we set the number of clusters $k$ to four since the performance saturates beyond that point; we follow the original implementation for the other hyperparameters. For JTT [41] and Group DRO [24], we strictly follow the original settings from the papers. Our method is built on BPA [27] and optimized by Adam [60] with a learning rate of $10^{-4}$. All models are trained for 30 epochs, and we select the checkpoint with the best balanced accuracy on the validation set.
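As a concrete reference for this setup, the sketch below builds the ResNet-18 encoder with a single fully connected classification head and the Adam optimizer with the stated learning rate; the use of torchvision and the two-class output dimension are our assumptions, not details given by the paper.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_model(num_classes=2, lr=1e-4):
    """ResNet-18 encoder + single fully connected classifier, trained with Adam (lr=1e-4).

    Illustrative sketch of the described setup; torchvision and the binary output
    dimension are assumptions.
    """
    encoder = models.resnet18(weights=None)            # randomly initialized ResNet-18
    in_features = encoder.fc.in_features
    encoder.fc = nn.Linear(in_features, num_classes)   # single fully connected classifier
    optimizer = torch.optim.Adam(encoder.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss(reduction="none")  # per-sample losses for reweighting
    return encoder, optimizer, criterion
```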

5. Limitations and Future Work

The proposed method significantly improves fairness across diverse datasets without demographic information. However, a performance gap still remains compared to fully supervised approaches that leverage sensitive attribute labels (e.g., Group DRO). In addition, since our debiasing strategy relies on whether a sample is misclassified, it can be sensitive to label noise. This limitation could be mitigated by incorporating robust learning methods for label noise. Lastly, since we dynamically regroup training samples and compute the corresponding group-wise weights at every epoch, our method incurs additional training time and computational overhead. For example, on UTKFace, the proposed method requires 1.08× the training time of the baseline (ResNet), on average. Nevertheless, our method provides a favorable trade-off by substantially improving fairness with a relatively small additional cost. Future work can explore more efficient group-wise weight update procedures to reduce this overhead.

6. Conclusions

In this paper, we aimed to enhance fairness in classification tasks without demographic information. First, we pointed out the limitations of prior approaches, including strong assumptions about the characteristics of sensitive attributes and reliance on auxiliary information or networks. To overcome these limitations, we proposed a dynamic group-wise reweighting method that enhances fairness without these requirements. Our approach assigns greater weight to underrepresented groups across potential sensitive attributes by utilizing misclassified samples as a proxy. Through extensive experiments, we empirically demonstrated the effectiveness of the proposed method under various scenarios. Moreover, our method achieved state-of-the-art performance in terms of fairness on benchmark datasets.

Author Contributions

Conceptualization, S.P.; methodology, S.P. and P.L.; investigation, S.P. and P.L.; writing—original draft preparation, S.P.; writing—review and editing, S.P. and P.L.; and supervision, S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Incheon National University Research Grant in 2024.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

This work was supported by Incheon National University Research Grant in 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. (IJCV) 2015, 115, 211–252. [Google Scholar] [CrossRef]
  2. Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep Learning Face Attributes in the Wild. In Proceedings of International Conference on Computer Vision (ICCV); IEEE: New York, NY, USA, 2015. [Google Scholar]
  3. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2016; pp. 770–778. [Google Scholar]
  4. Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive Growing of GANs for Improved Quality, Stability, and Variation. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
  5. Brandao, M. Age and gender bias in pedestrian detection algorithms. arXiv 2019, arXiv:1906.10490. [Google Scholar] [CrossRef]
  6. Kärkkäinen, K.; Joo, J. FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age. arXiv 2019, arXiv:1908.04913. [Google Scholar] [CrossRef]
  7. Burns, K.; Hendricks, L.A.; Darrell, T.; Rohrbach, A. Women also Snowboard: Overcoming Bias in Captioning Models. In Proceedings of the European Conference on Computer Vision (ECCV); IEEE: New York, NY, USA, 2018. [Google Scholar]
  8. Angwin, J.; Larson, J.; Mattu, S.; Kirchner, L. Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks. ProPublica, 23 May 2016. [Google Scholar]
  9. Dougherty, C. Google photos mistakenly labels black people gorillas. Twitter, 1 July 2015. [Google Scholar]
  10. Lomas, N. FaceApp apologizes for building a racist AI. TechCrunch, 25 April 2017. [Google Scholar]
  11. Dwork, C.; Hardt, M.; Pitassi, T.; Reingold, O.; Zemel, R. Fairness through Awareness. In ITCS ’12: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference; ACM: New York, NY, USA, 2012; pp. 214–226. [Google Scholar] [CrossRef]
  12. Kusner, M.J.; Loftus, J.; Russell, C.; Silva, R. Counterfactual Fairness. In Proceedings of the Advances in Neural Information Processing Systems 30; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 4066–4076. [Google Scholar]
  13. Wang, T.; Zhao, J.; Yatskar, M.; Chang, K.W.; Ordonez, V. Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV); IEEE: New York, NY, USA, 2019; pp. 5309–5318. [Google Scholar]
  14. Wang, Z.; Qinami, K.; Karakozis, I.C.; Genova, K.; Nair, P.; Hata, K.; Russakovsky, O. Towards Fairness in Visual Recognition: Effective Strategies for Bias Mitigation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2020. [Google Scholar]
  15. Gong, S.; Liu, X.; Jain, A. Jointly De-biasing Face Recognition and Demographic Attribute Estimation. In Computer Vision—ECCV 2020; Springer: Berlin/Heidelberg, Germany, 2020. [Google Scholar]
  16. Hardt, M.; Price, E.; Srebro, N. Equality of Opportunity in Supervised Learning. In NIPS’16: Proceedings of the 30th International Conference on Neural Information Processing Systems; ACM: New York, NY, USA, 2016; pp. 3323–3331. [Google Scholar]
  17. Ramaswamy, V.V.; Kim, S.S.Y.; Russakovsky, O. Fair Attribute Classification Through Latent Space De-Biasing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2021; pp. 9301–9310. [Google Scholar]
  18. Sattigeri, P.; Hoffman, S.C.; Chenthamarakshan, V.; Varshney, K.R. Fairness GAN: Generating datasets with fairness properties using a generative adversarial network. IBM J. Res. Dev. 2019, 63, 3:1–3:9. [Google Scholar] [CrossRef]
  19. Xu, D.; Yuan, S.; Zhang, L.; Wu, X. FairGAN: Fairness-aware Generative Adversarial Networks. In 2018 IEEE International Conference on Big Data (Big Data); IEEE: New York, NY, USA, 2018; pp. 570–575. [Google Scholar] [CrossRef]
  20. Creager, E.; Madras, D.; Jacobsen, J.H.; Weis, M.; Swersky, K.; Pitassi, T.; Zemel, R. Flexibly Fair Representation Learning by Disentanglement. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; Volume 97, pp. 1436–1445. [Google Scholar]
  21. Park, S.; Hwang, S.; Kim, D.; Byun, H. Learning Disentangled Representation for Fair Facial Attribute Classification via Fairness-awareunbiased Information Alignment. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 2403–2411. [Google Scholar]
  22. Kim, B.; Kim, H.; Kim, K.; Kim, S.; Kim, J. Learning Not to Learn: Training Deep Neural Networks With Biased Data. In Proceedings of the The IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2019. [Google Scholar]
  23. Park, S.; Lee, J.; Lee, P.; Hwang, S.; Kim, D.; Byun, H. Fair Contrastive Learning for Facial Attribute Classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2022; pp. 10389–10398. [Google Scholar]
  24. Sagawa, S.; Koh, P.W.; Hashimoto, T.B.; Liang, P. Distributionally Robust Neural Networks. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 30 April 2020. [Google Scholar]
  25. Kim, D.; Park, S.; Hwang, S.; Byun, H. Fair classification by loss balancing via fairness-aware batch sampling. Neurocomputing 2023, 518, 231–241. [Google Scholar] [CrossRef]
  26. Jung, S.W.; Chun, S.; Moon, T. Learning Fair Classifiers with Partially Annotated Group Labels. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2022; pp. 10338–10347. [Google Scholar]
  27. Seo, S.; Lee, J.Y.; Han, B. Unsupervised Learning of Debiased Representations With Pseudo-Attributes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2022; pp. 16742–16751. [Google Scholar]
  28. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755. [Google Scholar]
  29. Heilbron, F.C.; Escorcia, V.; Ghanem, B.; Niebles, J.C. ActivityNet: A large-scale video benchmark for human activity understanding. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2015; pp. 961–970. [Google Scholar] [CrossRef]
  30. Wang, H.; He, Z.; Lipton, Z.L.; Xing, E.P. Learning Robust Representations by Projecting Superficial Statistics Out. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
  31. Bahng, H.; Chun, S.; Yun, S.; Choo, J.; Oh, S.J. Learning De-biased Representations with Biased Representations. In Proceedings of the 37th International Conference on Machine Learning; PMLR: Cambridge, MA, USA, 2020. [Google Scholar]
  32. Hong, Y.; Yang, E. Unbiased Classification through Bias-Contrastive and Bias-Balanced Learning. In Proceedings of the Advances in Neural Information Processing Systems; Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2021. [Google Scholar]
  33. Nam, J.; Cha, H.; Ahn, S.; Lee, J.; Shin, J. Learning from Failure: De-biasing Classifier from Biased Classifier. In Proceedings of the Advances in Neural Information Processing Systems; Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2020; Volume 33, pp. 20673–20684. [Google Scholar]
  34. Lee, J.; Kim, E.; Lee, J.; Lee, J.; Choo, J. Learning Debiased Representation via Disentangled Feature Augmentation. In Proceedings of the Advances in Neural Information Processing Systems; Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2021. [Google Scholar]
  35. Lim, J.; Kim, Y.; Kim, B.; Ahn, C.; Shin, J.; Yang, E.; Han, S. BiasAdv: Bias-Adversarial Augmentation for Model Debiasing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2023; pp. 3832–3841. [Google Scholar]
  36. Zhang, Y.K.; Wang, Q.W.; Zhan, D.C.; Ye, H.J. Learning Debiased Representations via Conditional Attribute Interpolation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2023; pp. 7599–7608. [Google Scholar]
  37. Grari, V.; Lamprier, S.; Detyniecki, M. Fairness without the Sensitive Attribute via Causal Variational Autoencoder. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Vienna, Austria, 23–29 July 2022; Raedt, L.D., Ed.; IJCAI: Vienna, Austria, 2023; pp. 696–702. [Google Scholar] [CrossRef]
  38. Zhao, T.; Dai, E.; Shu, K.; Wang, S. Towards Fair Classifiers Without Sensitive Attributes: Exploring Biases in Related Features. In WSDM ’22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining; ACM: New York, NY, USA, 2022; pp. 1433–1442. [Google Scholar] [CrossRef]
  39. Hashimoto, T.; Srivastava, M.; Namkoong, H.; Liang, P. Fairness Without Demographics in Repeated Loss Minimization. In International Conference on Machine Learning; PMLR: Cambridge, MA, USA, 2018; pp. 1929–1938. [Google Scholar]
  40. Lahoti, P.; Beutel, A.; Chen, J.; Lee, K.; Prost, F.; Thain, N.; Wang, X.; Chi, E.H. Fairness without Demographics through Adversarially Reweighted Learning. In Advances in Neural Information Processing Systems; NIPS’20; Curran Associates, Inc.: Red Hook, NY, USA, 2020. [Google Scholar]
  41. Liu, E.Z.; Haghgoo, B.; Chen, A.S.; Raghunathan, A.; Koh, P.W.; Sagawa, S.; Liang, P.; Finn, C. Just Train Twice: Improving Group Robustness without Training Group Information. arXiv 2021, arXiv:2107.09044. [Google Scholar] [CrossRef]
  42. Chai, J.; Jang, T.; Wang, X. Fairness without Demographics through Knowledge Distillation. In Proceedings of the Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2022. [Google Scholar]
  43. Chai, J.; Wang, X. Fairness with Adaptive Weights. In Proceedings of the 39th International Conference on Machine Learning; Chaudhuri, K., Jegelka, S., Song, L., Szepesvari, C., Niu, G., Sabato, S., Eds.; PMLR: Cambridge, MA, USA, 2022; Volume 162, pp. 2853–2866. [Google Scholar]
  44. Kamiran, F.; Calders, T. Data Pre-Processing Techniques for Classification without Discrimination. Knowl. Inf. Syst. 2011, 33, 1–33. [Google Scholar] [CrossRef]
  45. Zhang, Z.; Song, Y.; Qi, H. Age Progression/Regression by Conditional Adversarial Autoencoder. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2017. [Google Scholar]
  46. Raff, E.; Sylvester, J. Gradient Reversal against Discrimination: A Fair Neural Network Learning Approach. In Proceedings of the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA); IEEE: New York, NY, USA, 2018; pp. 189–198. [Google Scholar] [CrossRef]
  47. Sarhan, M.H.; Navab, N.; Eslami, A.; Albarqouni, S. Fairness by Learning Orthogonal Disentangled Representations. In Proceedings of the Computer Vision—ECCV 2020—16th European Conference, Glasgow, UK, 23–28 August 2020; Volume 12374, pp. 746–761. [Google Scholar] [CrossRef]
  48. Zhang, B.H.; Lemoine, B.; Mitchell, M. Mitigating Unwanted Biases with Adversarial Learning. In AIES ’18: Proceedings of the Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society; ACM: New York, NY, USA, 2018; pp. 335–340. [Google Scholar] [CrossRef]
  49. Wadsworth, C.; Vera, F.; Piech, C. Achieving fairness through adversarial learning: An application to recidivism prediction. arXiv 2018, arXiv:1807.00199. [Google Scholar] [CrossRef]
  50. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2014; pp. 2672–2680. [Google Scholar]
  51. Lokhande, V.S.; Sohn, K.; Yoon, J.; Udell, M.; Lee, C.Y.; Pfister, T. Towards Group Robustness in the Presence of Partial Group Labels. In Proceedings of the ICML 2022: Workshop on Spurious Correlations, Invariance and Stability, Baltimore, MD, USA, 22 July 2022. [Google Scholar]
  52. Chai, J.; Wang, X. Self-Supervised Fair Representation Learning without Demographics. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2022. [Google Scholar]
  53. Jang, T.; Wang, X. Difficulty-Based Sampling for Debiased Contrastive Representation Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2023; pp. 24039–24048. [Google Scholar]
  54. Islam, R.; Chen, H.; Cai, Y. Fairness without Demographics through Shared Latent Space-Based Debiasing. Proc. Aaai Conf. Artif. Intell. 2024, 38, 12717–12725. [Google Scholar] [CrossRef]
  55. Wang, X.; Li, J.; Tsang, I.W.; Ong, Y.S. Towards Harmless Rawlsian Fairness Regardless of Demographic Prior. In Advances in Neural Information Processing Systems; Globerson, A., Mackey, L., Belgrave, D., Fan, A., Paquet, U., Tomczak, J., Zhang, C., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2024; Volume 37, pp. 80908–80935. [Google Scholar] [CrossRef]
  56. Luo, Y.; Li, Z.; Liu, Q.; Zhu, J. Fairness without Demographics through Learning Graph of Gradients. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, KDD 2025, Toronto, ON, Canada, 3–7 August 2025; pp. 918–926. [Google Scholar] [CrossRef]
  57. Lin, T.; Goyal, P.; Girshick, R.B.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. arXiv 2017, arXiv:1708.02002. [Google Scholar] [CrossRef]
  58. Schwenk, H.; Bengio, Y. Training Methods for Adaptive Boosting of Neural Networks. In Advances in Neural Information Processing Systems; Jordan, M., Kearns, M., Solla, S., Eds.; MIT Press: Cambridge, MA, USA, 1997; Volume 10. [Google Scholar]
  59. Badirli, S.; Liu, X.; Xing, Z.; Bhowmik, A.; Doan, K.D.; Keerthi, S. Gradient Boosting Neural Networks: GrowNet. arXiv 2020, arXiv:2002.07971. [Google Scholar] [CrossRef]
  60. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
Figure 1. Overall flow of the proposed method. The proposed method dynamically updates the groups based on the output of the classification model. Subsequently, the updated groups are utilized to fairly train the model. These procedures are iterated cyclically. The solid and dotted lines, respectively, depict the flow of the current and next training steps. All notations in the figure follow the same definitions as in the text, including y ˜ , g i , and w g i .
Figure 2. Empirical validation on the assumption. We set Age and Gender as the target and sensitive attributes, respectively, on UTK Face.
Figure 3. Group-wise average weight on CelebA. We set Attractive (a), Wearing Earrings (b), Chubby (c), and Brown Hair (d) as the target attributes. The sensitive attribute, i.e., Male, is a malignant attribute with low correlation with Attractive, whereas it is not a malignant attribute but highly correlated with Wearing Earrings. For Chubby and Brown Hair, the sensitive attribute is not a malignant attribute and has a low correlation with them.
Table 1. Results on COMPAS for Race. We report a comparison with previous methods for boosting and fairness. * indicates that the marked results are taken from the previous work [42]. For a fair comparison, we also measure conventional classification accuracy for the other methods.

Method                        Accuracy (↑)   Equalized Odds (↓)
ResNet [3] *                  64.1           38.3
Focal Loss [57]               66.4           33.7
Adaboost [58]                 65.6           31.3
GrowNet [59]                  66.1           38.9
DRO [39] *                    62.6           30.4
ARL [40] *                    63.2           29.5
FairRF [38] *                 63.3           25.7
Chai, Jang, and Wang [42] *   63.3           20.3
Ours                          63.4           16.0
Table 2. Results on COMPAS for Sex. * indicates that the marked results are taken from the previous work [42].

Method                        Accuracy (↑)   Equalized Odds (↓)
ResNet [3] *                  64.1           20.2
Focal Loss [57]               66.4           22.2
Adaboost [58]                 65.1           26.3
GrowNet [59]                  66.8           22.2
DRO [39] *                    62.7           18.8
ARL [40] *                    63.2           19.1
FairRF [38] *                 63.3           18.7
Chai, Jang, and Wang [42] *   63.4           14.3
Ours                          63.3           9.4
Table 3. Results on CelebA. We set the sensitive attribute as Male and report the summarized classification results for 32 target attributes. SA indicates whether sensitive attribute labels were used during training. The best and second-best scores are emphasized in bold and underlined, respectively.
Method       SA    All Results                           EO ≥ 10                               EO ≥ 20
                   BAcc. (↑)   EO (↓)   Std Dev. (↓)     BAcc. (↑)   EO (↓)   Std Dev. (↓)     BAcc. (↑)   EO (↓)   Std Dev. (↓)
ResNet             75.0        20.9     26.1             72.7        26.6     29.2             72.9        33.9     30.5
Group DRO          81.5        4.5      4.6              79.6        4.9      5.2              78.0        6.1      6.8
JTT                77.2        15.4     15.9             74.9        18.9     17.2             73.8        22.4     19.1
LfF                78.1        17.8     15.4             76.1        20.1     17.1             74.7        22.8     18.9
DFA                76.2        15.5     21.2             74.2        18.6     23.2             74.0        22.9     23.2
BPA                79.9        16.7     13.7             78.2        20.3     15.3             76.7        23.6     17.3
Ours               79.3        11.2     10.7             77.4        13.2     11.2             76.8        16.1     12.8
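Tables 3–5 additionally report balanced accuracy (BAcc.) and a group-wise standard deviation. As a rough illustration only, and under the assumption that BAcc. averages the per-class accuracies and that Std Dev. is taken over the accuracies of the subgroups defined by the target label and the sensitive attribute (the precise definitions are given in the main text), these quantities could be computed as follows.

```python
# Assumed readings of the summary metrics (not the paper's exact definitions):
# BAcc. = mean of per-class accuracies; Std Dev. = standard deviation of accuracy
# over subgroups formed by (target label, sensitive attribute).
import numpy as np

def balanced_accuracy(y_true, y_pred):
    classes = np.unique(y_true)
    per_class = [(y_pred[y_true == c] == c).mean() for c in classes]
    return 100.0 * float(np.mean(per_class))

def groupwise_accuracy_std(y_true, y_pred, sensitive):
    accs = []
    for c in np.unique(y_true):
        for s in np.unique(sensitive):
            mask = (y_true == c) & (sensitive == s)
            if mask.any():
                accs.append((y_pred[mask] == c).mean())
    return 100.0 * float(np.std(accs))
```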
Table 4. BAcc. on CelebA. We set Male as the sensitive attribute. The reported results for 32 target attributes are separated according to whether algorithmic bias with respect to the sensitive attribute is present (above) or not (below). We clarify that JTT [41] utilizes additional sensitive attribute labels during the validation procedure, and Group DRO [24] uses sensitive attribute labels during training.
Target Attribute        No Sensitive Labels                          Sensitive Labels Required
                        ResNet   LfF    DFA    BPA    Ours           JTT    Group DRO
Arched Eyebrows         70.8     64.0   75.9   73.6   75.9           72.7   74.9
Bags Under Eyes         73.1     72.3   69.4   75.6   72.5           61.1   75.5
Bangs                   89.2     93.5   90.9   91.2   93.0           94.3   96.8
Big Lips                58.4     61.9   62.3   70.6   66.4           51.7   63.5
Big Nose                67.3     68.4   70.3   70.4   71.0           68.6   72.5
Blond Hair              79.2     87.6   84.1   83.5   86.8           85.6   91.1
Blurry                  76.4     83.9   76.9   88.5   87.6           88.6   89.2
Brown Hair              75.4     86.3   82.1   82.5   84.0           83.9   84.0
Bushy Eyebrows          77.1     83.4   78.7   83.8   80.9           83.8   83.6
Chubby                  64.1     75.4   66.0   76.9   70.3           81.4   81.6
Double Chin             64.2     75.7   65.7   83.0   83.7           84.1   84.7
Gray Hair               74.9     83.9   77.0   83.7   80.0           89.9   93.9
Heavy Makeup            71.8     71.4   74.8   77.9   75.4           71.3   75.6
Narrow Eyes             76.2     76.2   74.5   76.4   76.6           75.6   77.1
No Beard                72.1     77.0   73.4   71.5   77.0           78.0   81.8
Oval Face               62.4     60.1   62.9   64.1   62.1           62.6   64.6
Pale Skin               71.6     86.4   79.0   89.5   86.3           87.7   91.1
Pointy Nose             62.2     64.8   63.4   65.2   68.0           60.4   68.7
Receding Hairline       74.0     82.6   78.0   82.0   79.2           84.4   66.0
Straight Hair           68.6     65.9   64.3   73.7   72.0           73.1   76.0
Wavy Hair               75.7     73.1   76.5   78.8   75.7           69.5   81.0
Wearing Hat             87.9     93.1   92.2   89.8   89.9           98.0   97.5
Wearing Earrings        74.2     70.4   73.2   81.1   81.2           61.3   83.0
Wearing Lipstick        73.1     74.6   71.1   79.1   80.0           71.0   80.7
Wearing Necklace        53.5     62.4   56.2   65.7   56.6           52.0   66.3
Young                   79.7     77.7   78.1   78.3   78.1           66.8   78.7
Attractive              76.8     76.4   75.3   76.5   75.5           72.9   78.0
High Cheekbone          83.6     84.5   82.8   84.3   83.1           75.5   83.6
Black Hair              87.2     86.8   85.9   83.5   86.8           84.0   87.6
Mouth Slightly Open     93.2     92.9   94.1   92.2   93.5           87.3   92.2
Eyeglasses              96.8     97.1   97.2   94.4   98.0           98.4   98.8
Smiling                 90.6     90.8   91.1   90.4   92.3           89.4   90.4
Average                 75.0     78.1   76.2   79.9   79.3           77.2   81.5
Table 5. EO and STD Dev. on CelebA. We set Male as the sensitive attribute. We clarify that JTT [41] utilizes additional sensitive attribute labels during the validation procedure, and Group DRO [24] uses sensitive attribute labels during training.
Target Attribute        Equalized Odds (↓)                                          Std Dev. (↓)
                        No Sensitive Labels                Sensitive Labels Required  No Sensitive Labels                Sensitive Labels Required
                        ResNet  LfF   DFA   BPA   Ours     JTT   Group DRO            ResNet  LfF   DFA   BPA   Ours     JTT   Group DRO
Arched Eyebrows         33.9    32.7  30.3  34.4  15.3     28.1  6.6                  27.0    31.9  23.2  31.9  20.5     17.4  4.2
Bags Under Eyes         44.0    21.1  16.2  20.1  15.9     37.9  6.6                  27.0    17.9  26.0  15.4  17.2     32.1  4.5
Bangs                   2.4     3.7   7.6   4.4   2.3      2.7   6.3                  10.7    2.4   7.0   8.8   1.6      3.2   3.7
Big Lips                11.7    38.9  13.2  16.0  13.8     20.9  9.8                  46.2    27.4  41.0  12.0  10.7     45.0  14.6
Big Nose                23.8    18.6  23.2  33.8  17.9     20.5  2.3                  30.8    13.5  15.8  20.7  10.9     27.3  2.5
Blond Hair              30.6    14.9  18.5  11.6  6.0      4.8   2.3                  29.8    12.9  18.7  16.0  4.2      3.2   1.6
Blurry                  6.7     6.7   4.9   5.6   5.4      4.7   3.2                  25.0    6.1   23.2  8.3   7.3      4.0   2.6
Brown Hair              18.4    7.1   4.7   9.4   4.3      4.3   1.7                  14.9    3.1   3.4   8.0   5.9      8.1   2.2
Bushy Eyebrows          23.5    6.9   14.8  9.7   5.7      7.9   3.1                  25.9    5.8   20.4  5.7   3.6      12.7  2.3
Chubby                  15.1    28.0  13.9  29.1  25.0     26.6  2.8                  41.2    21.8  37.5  17.3  14.6     15.4  2.5
Double Chin             15.2    25.0  11.4  22.4  20.0     18.2  5.1                  41.2    23.1  38.2  14.1  11.7     10.9  3.6
Gray Hair               13.1    16.9  16.1  11.2  6.5      13.7  3.8                  29.9    17.9  27.5  20.3  8.3      8.0   2.9
Heavy Makeup            44.3    47.2  45.0  39.3  38.8     27.8  25.6                 36.6    32.1  26.1  22.7  22.6     21.8  23.9
Narrow Eyes             27.1    3.6   4.8   3.4   3.9      7.5   1.9                  32.0    2.2   26.2  9.5   6.6      19.9  2.0
No Beard                44.9    32.7  39.8  50.3  32.4     37.0  14.3                 41.6    25.7  33.1  29.2  18.7     21.4  15.1
Oval Face               22.0    16.5  17.3  18.0  14.8     21.0  3.8                  32.1    21.3  21.3  14.4  12.7     14.1  5.0
Pale Skin               15.0    8.9   2.9   3.8   3.8      9.4   0.8                  33.5    12.3  21.1  8.2   12.6     5.6   0.6
Pointy Nose             22.5    24.7  21.2  29.0  10.5     33.4  2.7                  36.3    32.1  31.8  23.3  9.7      33.3  5.4
Receding Hairline       45.0    16.5  12.2  13.9  13.3     14.8  3.6                  27.9    11.2  22.1  13.4  20.4     8.7   4.3
Straight Hair           9.8     9.6   4.1   6.6   6.3      7.1   1.7                  26.2    31.1  33.0  9.6   12.4     7.0   2.0
Wavy Hair               18.4    4.2   17.8  14.9  3.3      14.3  1.6                  22.0    13.0  24.3  10.5  4.7      10.1  6.8
Wearing Hat             23.4    3.6   5.2   8.8   7.9      0.6   0.5                  15.0    4.8   7.6   5.3   5.1      0.5   1.0
Wearing Earrings        43.2    39.9  34.4  22.6  18.1     38.6  4.6                  31.3    26.8  20.7  13.4  11.9     31.5  2.7
Wearing Lipstick        47.4    41.3  38.2  36.6  25.7     33.7  8.0                  33.7    27.2  32.9  22.4  15.5     23.8  10.7
Wearing Necklace        0.8     57.7  21.1  14.7  17.4     22.0  10.8                 52.7    34.1  45.6  18.9  42.1     37.0  9.1
Young                   17.5    17.4  17.0  18.1  3.7      23.4  1.3                  16.1    10.2  17.7  12.2  6.9      19.1  1.9
Attractive              25.7    11.5  21.6  24.1  6.4      5.0   0.9                  15.5    10.1  16.5  16.6  9.3      6.1   1.6
High Cheekbone          12.9    5.0   7.3   6.7   4.6      5.1   4.8                  15.0    6.2   5.8   4.8   5.5      9.4   4.9
Black Hair              5.0     5.0   5.6   11.6  4.2      9.5   1.1                  3.0     4.3   6.7   16.0  5.2      8.4   2.1
Mouth Slightly Open     2.0     1.9   0.9   0.6   2.3      0.5   0.2                  2.9     1.6   1.2   2.9   1.5      8.0   0.1
Eyeglasses              1.3     3.0   2.0   3.4   1.3      1.3   0.8                  3.6     2.0   2.6   2.6   0.9      0.8   0.6
Smiling                 3.9     2.7   4.0   2.0   2.5      3.1   2.1                  9.0     2.3   2.8   4.2   1.7      2.5   1.3
Average                 20.9    17.8  15.5  16.7  11.2     15.4  4.5                  26.1    15.4  21.2  13.7  10.7     15.9  4.6
Table 6. Results on UTK Face for target attribute Age. We report the experimental results on three subsets of UTK Face. Severity (γ) represents the degree of data imbalance. We set Age and Gender as the target and sensitive attributes, respectively. As γ increases, the fairness improvement of our method over the baseline becomes larger.
Method     Severity (γ) = 2                        Severity (γ) = 3                        Severity (γ) = 4
           BAcc. (↑)   EO (↓)   Std Dev. (↓)       BAcc. (↑)   EO (↓)   Std Dev. (↓)       BAcc. (↑)   EO (↓)   Std Dev. (↓)
ResNet     82.1        17.7     10.3               81.9        21.0     12.3               80.3        26.0     16.1
LfF        77.7        4.1      3.4                77.2        7.3      4.4                76.8        14.4     10.4
DFA        82.1        15.8     9.3                81.8        18.1     10.4               79.6        20.2     13.8
BPA        83.1        16.6     9.6                82.0        21.8     12.6               80.3        22.6     13.1
Ours       80.3        3.1      2.2                80.5        4.0      5.3                79.2        8.2      7.1
Table 7. Results on UTK Face for target attribute Race. We set Race and Gender as the target and sensitive attributes, respectively.
Method     Severity (γ) = 2                        Severity (γ) = 3                        Severity (γ) = 4
           BAcc. (↑)   EO (↓)   Std Dev. (↓)       BAcc. (↑)   EO (↓)   Std Dev. (↓)       BAcc. (↑)   EO (↓)   Std Dev. (↓)
ResNet     81.8        10.1     6.1                81.6        15.2     9.2                79.3        21.2     12.4
LfF        81.8        4.8      2.9                81.6        9.2      6.6                80.5        12.9     9.8
DFA        81.7        6.6      10.0               81.8        12.8     10.3               81.0        13.3     10.1
BPA        81.9        6.5      3.9                81.8        10.2     10.7               80.6        11.5     8.7
Ours       82.4        4.5      3.3                82.6        5.7      3.9                80.9        9.5      5.5
Table 11. Fairness improvement through semi-supervised learning. Labeled data represents the proportion of data with sensitive attribute labels in the entire training set.
Method       Labeled Data    Balanced Accuracy    EO
Group DRO    1               74.2                 3.4
Ours         1/2             72.3                 4.2
Ours         1/4             71.6                 4.7
Ours         1/10            71.4                 5.0
Ours         0               71.0                 17.9
Table 12. Results on Cat and Dog. We set species and color as the target and sensitive attributes, respectively.
Method    Balanced Accuracy    Equalized Odds    Std Dev.
ResNet    79.9                 20.7              17.7
LfF       81.6                 14.0              8.4
DFA       86.8                 13.4              7.8
BPA       87.7                 10.7              7.9
Ours      85.9                 5.7               7.6