Search Results (8)

Search Parameters:
Keywords = low query budget

17 pages, 1467 KiB  
Article
Confidence-Based Knowledge Distillation to Reduce Training Costs and Carbon Footprint for Low-Resource Neural Machine Translation
by Maria Zafar, Patrick J. Wall, Souhail Bakkali and Rejwanul Haque
Appl. Sci. 2025, 15(14), 8091; https://doi.org/10.3390/app15148091 - 21 Jul 2025
Viewed by 621
Abstract
The transformer-based deep learning approach represents the current state of the art in machine translation (MT) research. Large-scale pretrained transformer models produce state-of-the-art performance across a wide range of MT tasks for many languages. However, such deep neural network (NN) models are often data-, compute-, space-, power-, and energy-hungry, typically requiring powerful GPUs or large-scale clusters to train and deploy. As a result, they are often regarded as “non-green” and “unsustainable” technologies. Distilling knowledge from large deep NN models (teachers) to smaller NN models (students) is a widely adopted sustainable development approach in MT as well as in broader areas of natural language processing (NLP), including speech and image processing. However, distilling large pretrained models presents several challenges. First, training time and cost increase with the volume of data used to train a student model, which can pose a challenge for translation service providers (TSPs) with limited training budgets. Moreover, the CO2 emissions generated during model training are typically proportional to the amount of data used, contributing to environmental harm. Second, when querying teacher models, including encoder–decoder models such as NLLB, the translations they produce for low-resource languages may be noisy or of low quality. This can undermine sequence-level knowledge distillation (SKD), as student models may inherit and reinforce errors from inaccurate labels. In this study, the teacher model’s confidence estimation is employed to filter from the distilled training data those instances for which the teacher exhibits low confidence. We tested our methods on a low-resource Urdu-to-English translation task under a constrained training budget in an industrial translation setting. Our findings show that confidence-estimation-based filtering can significantly reduce the cost and CO2 emissions associated with training a student model without a drop in translation quality, making it a practical and environmentally sustainable solution for TSPs.
(This article belongs to the Special Issue Deep Learning and Its Applications in Natural Language Processing)
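
A minimal sketch of the filtering idea described in this abstract, assuming average token log-probability of the teacher's own output as the sequence-level confidence score; the scoring function, threshold, and data layout are illustrative assumptions, not the paper's exact formulation:

```python
def sequence_confidence(token_logprobs):
    """Average token log-probability of the teacher's output, used here as a
    simple sequence-level confidence proxy."""
    return sum(token_logprobs) / max(len(token_logprobs), 1)

def filter_distilled_pairs(pairs, threshold=-1.0):
    """Keep only (source, teacher_translation) pairs whose confidence score
    meets the threshold; low-confidence pairs are dropped before training
    the student, which shrinks the distilled training set."""
    kept = []
    for source, translation, token_logprobs in pairs:
        if sequence_confidence(token_logprobs) >= threshold:
            kept.append((source, translation))
    return kept

# Hypothetical usage: `pairs` holds (source sentence, teacher translation,
# per-token log-probabilities) obtained by querying the teacher model
# (e.g., an NLLB-style encoder-decoder) on monolingual source text.
pairs = [("ur sentence 1", "en translation 1", [-0.2, -0.5, -0.3]),
         ("ur sentence 2", "en translation 2", [-2.1, -3.4, -1.9])]
print(filter_distilled_pairs(pairs))  # keeps only the high-confidence pair
```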

31 pages, 19893 KiB  
Article
A Low-Measurement-Cost-Based Multi-Strategy Hyperspectral Image Classification Scheme
by Yu Bai, Dongmin Liu, Lili Zhang and Haoqi Wu
Sensors 2024, 24(20), 6647; https://doi.org/10.3390/s24206647 - 15 Oct 2024
Viewed by 1385
Abstract
The cost of hyperspectral image (HSI) classification primarily stems from the annotation of image pixels. In real-world classification scenarios, the measurement and annotation process is both time-consuming and labor-intensive. Therefore, reducing the number of labeled pixels while maintaining classification accuracy is a key research focus in HSI classification. This paper introduces a multi-strategy triple network classifier (MSTNC) to address the issue of limited labeled data in HSI classification by improving learning strategies. First, we use a contrastive learning strategy to design a lightweight triple network classifier (TNC) with low sample dependence. Because triplet sample pairs are constructed, the number of labeled samples can be increased, which helps extract intra-class and inter-class features of pixels. Second, an active learning strategy is used to label the most valuable pixels, improving the quality of the labeled data. To address the difficulty of sampling effectively under extremely limited labeling budgets, we propose a new feature-mixed active learning (FMAL) method to query valuable samples. Fine-tuning is then used to help the MSTNC learn a more comprehensive feature distribution, reducing the model’s dependence on accuracy when querying samples and thereby improving sample quality. Finally, we propose an innovative dual-threshold pseudo-active learning (DSPAL) strategy that filters pseudo-labeled samples by both confidence and uncertainty. Extending the training set in this way without increasing the labeling cost further improves the classification accuracy of the model. Extensive experiments are conducted on three benchmark HSI datasets. Across various labeling ratios, the MSTNC outperforms several state-of-the-art methods. In particular, under extreme small-sample conditions (five samples per class), the overall accuracy reaches 82.97% (IP), 87.94% (PU), and 86.57% (WHU).
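
One possible reading of the dual-threshold pseudo-labeling step, sketched with softmax outputs: keep pseudo-labels whose top-class probability is high and whose predictive entropy is low. The confidence and uncertainty measures and both thresholds are illustrative assumptions, not necessarily those used by DSPAL:

```python
import numpy as np

def dual_threshold_pseudo_labels(probs, conf_thresh=0.95, entropy_thresh=0.2):
    """Select pseudo-labeled pixels whose top-class probability is high and
    whose predictive entropy is low; both thresholds are illustrative."""
    confidence = probs.max(axis=1)                          # max softmax probability
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)  # predictive entropy
    mask = (confidence >= conf_thresh) & (entropy <= entropy_thresh)
    return np.flatnonzero(mask), probs.argmax(axis=1)[mask]

# probs would be an (n_unlabeled_pixels, n_classes) array of classifier outputs.
probs = np.array([[0.97, 0.02, 0.01],
                  [0.40, 0.35, 0.25]])
idx, labels = dual_threshold_pseudo_labels(probs)  # only the first pixel passes
```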

28 pages, 3664 KiB  
Article
Multiuser Incomplete Preference K-Nearest Neighbor Query Method Based on Differential Privacy in Road Network
by Liping Zhang, Xiaojing Zhang and Song Li
ISPRS Int. J. Geo-Inf. 2023, 12(7), 282; https://doi.org/10.3390/ijgi12070282 - 15 Jul 2023
Viewed by 1378
Abstract
Existing research on k-nearest neighbor queries in road networks does not consider the incompleteness of query users’ preferences for data objects or the privacy protection of query results. To address this, this paper proposes a multiuser incomplete preference k-nearest neighbor query algorithm based on differential privacy in the road network. The algorithm is divided into four parts. The first part proposes a multiuser incomplete preference completion algorithm based on association rules: it first uses the frequent pattern tree proposed in this paper to mine frequent itemsets, then uses those itemsets to mine strong association rules, and finally completes the multiuser incomplete preferences based on the strong association rules. The second part proposes an attribute preference weight coefficient based on the users’ different preferences and clusters users accordingly. The third part compares the dominance of the query objects, filters out data with low dominance, and performs a k-nearest neighbor query. The fourth part proposes a privacy budget allocation method based on differential privacy technology; it uses the Laplace mechanism to add noise to the released results, balancing the privacy and availability of the data. Theoretical analysis and experiments show that the proposed method can better handle the multiuser incomplete preference k-nearest neighbor query and privacy protection problems in road networks.
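
A minimal sketch of the Laplace mechanism mentioned in the fourth part, assuming a uniform split of the privacy budget over the k released values; the sensitivity, budget, and distances are placeholder values, not the paper's allocation scheme:

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Add Laplace noise with scale sensitivity/epsilon so that releasing
    `value` satisfies epsilon-differential privacy."""
    rng = rng or np.random.default_rng()
    return value + rng.laplace(0.0, sensitivity / epsilon)

# Placeholder release: split a total budget uniformly over the k results.
total_epsilon, k = 1.0, 5
per_result_epsilon = total_epsilon / k
distances = [120.0, 180.5, 240.0, 305.2, 410.7]  # e.g., road-network distances
noisy = [laplace_mechanism(d, sensitivity=1.0, epsilon=per_result_epsilon)
         for d in distances]
```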

10 pages, 7441 KiB  
Article
Anti-Toothbreaker: A Novel Low-Budget Device Enabling Contactless Dental Protection and a Forbidden Technique during Direct Laryngoscopy for Endotracheal Intubation
by Sam Razaeian and Helena Kristin Liebich
Diagnostics 2023, 13(4), 594; https://doi.org/10.3390/diagnostics13040594 - 6 Feb 2023
Cited by 2 | Viewed by 2438
Abstract
Background: Iatrogenic dental injury is the most common complication of conventional laryngoscopy during orotracheal intubation. The main cause is unintended pressure and leverage forces from the hard metal blade of the laryngoscope. The aim of this pilot study was to introduce and test a novel, reusable, low-budget device that not only provides contactless dental protection during direct laryngoscopy for endotracheal intubation but also enables, in contrast to established tooth protectors, active levering with conventional laryngoscopes for easier visualization of the glottis. Methods: A constructed prototype for intrahospital use was evaluated by seven participants on an airway-management simulation manikin. Endotracheal intubation was performed with and without the device using a conventional Macintosh laryngoscope (blade size 4) and a 7.5 mm endotracheal tube (Teleflex Medical GmbH, Fellbach, Germany). The time required and first-pass success were recorded. The degree of glottis visualization with and without the device was rated by the participants according to the Cormack and Lehane (CL) classification system and the Percentage of Glottic Opening (POGO) scoring system. In addition, subjective physical effort, feeling of safety regarding successful intubation, and risk of dental injury were rated on a numeric scale from 1 to 10. Results: All participants except one stated that the intubation procedure was easier with the device than without it; on average, it was subjectively perceived as approximately 42% (range, 15–65%) easier. In addition, time to first-pass success, degree of glottis visualization, subjective physical effort, and feeling of safety regarding the risk of dental injury were clearly better with the device. Concerning the feeling of safety regarding successful intubation, there was only a minor advantage. No difference in first-pass success rate or the number of total attempts was observed. Conclusion: The Anti-Toothbreaker is a novel, reusable, low-budget device that may not only provide contactless dental protection during direct laryngoscopy for endotracheal intubation but also enable, in contrast to established tooth protectors, active levering with conventional laryngoscopes for easier visualization of the glottis. Future human cadaveric studies are needed to investigate whether these advantages hold there as well.
(This article belongs to the Special Issue Diagnosis and Management in Trauma Surgery)

19 pages, 2602 KiB  
Article
ShrewdAttack: Low Cost High Accuracy Model Extraction
by Yang Liu, Ji Luo, Yi Yang, Xuan Wang, Mehdi Gheisari and Feng Luo
Entropy 2023, 25(2), 282; https://doi.org/10.3390/e25020282 - 2 Feb 2023
Cited by 2 | Viewed by 2983
Abstract
Machine learning as a service (MLaaS) plays an essential role in the current ecosystem. Enterprises do not need to train models by themselves; instead, they can use well-trained models provided by MLaaS to support business activities. However, such an ecosystem can be threatened by model extraction attacks, in which an attacker steals the functionality of a trained model provided by MLaaS and builds a substitute model locally. In this paper, we propose a model extraction method with low query cost and high accuracy. In particular, we use pre-trained models and task-relevant data to decrease the size of the query data, and we use instance selection to reduce the number of query samples. In addition, we divide the query data into two categories, low-confidence data and high-confidence data, to reduce the budget and improve accuracy. We then conduct attacks on two models provided by Microsoft Azure in our experiments. The results show that our scheme achieves high accuracy at low cost, with the substitute models achieving 96.10% and 95.24% substitution while querying only 7.32% and 5.30% of their training data on the two models, respectively. This new attack approach creates additional security challenges for models deployed on cloud platforms and raises the need for novel mitigation strategies to secure them. In future work, generative adversarial networks and model inversion attacks could be used to generate more diverse data for the attacks.
(This article belongs to the Special Issue Trustworthy AI: Information Theoretic Perspectives)
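
A toy sketch of the low-/high-confidence split described in this abstract, using the victim model's top-class probability as the confidence signal; the threshold and the downstream use of each pool are assumptions made for illustration:

```python
import numpy as np

def split_by_confidence(queries, victim_probs, threshold=0.8):
    """Partition queried samples by the victim's top-class probability.
    High-confidence responses can be used as hard labels for the substitute
    model, while low-confidence ones can be kept as soft labels or re-queried
    (an illustrative policy, not the paper's exact procedure)."""
    high = victim_probs.max(axis=1) >= threshold
    return (queries[high], victim_probs[high]), (queries[~high], victim_probs[~high])

# victim_probs would come from the class probabilities returned by the MLaaS API.
queries = np.random.rand(100, 32)                    # stand-in query features
victim_probs = np.random.dirichlet(np.ones(10), size=100)
(high_q, high_p), (low_q, low_p) = split_by_confidence(queries, victim_probs)
```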

32 pages, 4753 KiB  
Article
A Novel Low-Query-Budget Active Learner with Pseudo-Labels for Imbalanced Data
by Alaa Tharwat and Wolfram Schenck
Mathematics 2022, 10(7), 1068; https://doi.org/10.3390/math10071068 - 26 Mar 2022
Cited by 8 | Viewed by 2500
Abstract
Despite the availability of a large amount of free unlabeled data, collecting sufficient training data for supervised learning models is challenging due to the time and cost involved in the labeling process. The active learning technique we present here provides a solution by querying a small but highly informative set of unlabeled data. It ensures high generalizability across the instance space, improving classification performance on previously unseen test data. Most active learners query either the most informative or the most representative data for annotation. The proposed algorithm combines these two criteria in two phases: an exploration phase and an exploitation phase. The former aims to explore the instance space by visiting new regions at each iteration; the latter attempts to select highly informative points in uncertain regions. Without any predefined knowledge, such as initial training data, these two phases improve the search strategy of the proposed algorithm so that it can explore the minority-class space of imbalanced data using a small query budget. Further, pseudo-labeled points geometrically located in trusted explored regions around the newly labeled points are added to the training data, but with lower weights than the original labeled points. These pseudo-labeled points play several roles in our model, such as (i) increasing the size of the training data and (ii) decreasing the size of the version space by reducing the number of hypotheses that are consistent with the training data. Experiments on synthetic and real datasets with different imbalance ratios and dimensions show that the proposed algorithm has significant advantages over various well-known active learners.
(This article belongs to the Section E: Applied Mathematics)
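
A compact sketch of two ingredients mentioned in this abstract: an exploration query that visits a region far from everything already labeled, and pseudo-labels assigned inside a trusted radius around a newly labeled point with a reduced sample weight. The distance-based rules, radius, and weight are illustrative assumptions rather than the authors' exact algorithm:

```python
import numpy as np

def exploration_query(X_pool, X_labeled):
    """Return the index of the pool point farthest from all labeled points
    (a simple exploration criterion for visiting new regions)."""
    if len(X_labeled) == 0:
        return 0
    dists = np.linalg.norm(X_pool[:, None, :] - X_labeled[None, :, :], axis=2)
    return int(dists.min(axis=1).argmax())

def pseudo_label_neighbors(X_pool, x_new, y_new, radius=0.5, weight=0.3):
    """Give unlabeled points within a trusted radius of a newly labeled point
    the same label, but with a lower sample weight than real labels."""
    dists = np.linalg.norm(X_pool - x_new, axis=1)
    idx = np.flatnonzero(dists <= radius)
    return idx, np.full(len(idx), y_new), np.full(len(idx), weight)

# Usage sketch: query the chosen point's label from an oracle, then pass the
# returned pseudo-label weights to a classifier's sample_weight argument.
```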

17 pages, 2701 KiB  
Article
Iterative Learning for K-Approval Votes in Crowdsourcing Systems
by Joonyoung Kim, Donghyeon Lee and Kyomin Jung
Appl. Sci. 2021, 11(2), 630; https://doi.org/10.3390/app11020630 - 11 Jan 2021
Viewed by 1719
Abstract
Crowdsourcing systems have emerged as cornerstones for collecting large amounts of qualified data in various human-powered problems on a relatively low budget. In eliciting the wisdom of crowds, many web-based crowdsourcing platforms encourage workers to select the top-K alternatives rather than just one choice, which is called “K-approval voting”. This setting has the advantage of inducing workers to make fewer mistakes when they respond to target tasks. However, there is not much work on inferring the correct answer from crowdsourced data collected via K-approval voting. In this paper, we propose a novel and efficient iterative algorithm to infer correct answers for K-approval voting, which can be directly applied to real-world crowdsourcing systems. We analyze the average performance of our algorithm and prove a theoretical error bound that decays exponentially with the quality of workers and the number of queries. Through extensive experiments, including a mixed case with various types of tasks, we show that our algorithm outperforms Expectation-Maximization (EM) and existing baseline algorithms.
(This article belongs to the Section Computing and Artificial Intelligence)
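
A generic iterative-reweighting sketch in the spirit of this abstract, alternating between reliability-weighted vote aggregation and reliability re-estimation; it is not the authors' algorithm, and the vote format and update rules are assumptions for illustration:

```python
import numpy as np

def iterative_k_approval(votes, n_items, n_workers, n_iters=10):
    """`votes` is a list of (worker, item, approved_set) triples, where
    approved_set holds the K alternatives the worker selected for that item."""
    reliability = np.ones(n_workers)
    answers = {}
    for _ in range(n_iters):
        # Score each alternative by reliability-weighted approvals.
        scores = [dict() for _ in range(n_items)]
        for w, i, approved in votes:
            for a in approved:
                scores[i][a] = scores[i].get(a, 0.0) + reliability[w]
        answers = {i: max(s, key=s.get) for i, s in enumerate(scores) if s}
        # Re-estimate reliability: how often a worker's approval set contains
        # the current answer estimate.
        hits = np.zeros(n_workers)
        counts = np.zeros(n_workers)
        for w, i, approved in votes:
            counts[w] += 1
            hits[w] += float(answers.get(i) in approved)
        reliability = np.where(counts > 0, hits / np.maximum(counts, 1), 1.0)
    return answers, reliability

# Toy usage: two items, three workers, K = 2 approvals per vote.
votes = [(0, 0, {"a", "b"}), (1, 0, {"a", "c"}), (2, 0, {"b", "c"}),
         (0, 1, {"x", "y"}), (1, 1, {"x", "z"}), (2, 1, {"y", "z"})]
print(iterative_k_approval(votes, n_items=2, n_workers=3))
```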

17 pages, 1328 KiB  
Article
Determinants of Pro-Environmental Consumption: Multicountry Comparison Based upon Big Data Search
by Donghyun Lee, Suna Kang and Jungwoo Shin
Sustainability 2017, 9(2), 183; https://doi.org/10.3390/su9020183 - 27 Jan 2017
Cited by 8 | Viewed by 6147
Abstract
The Korean government has promoted a variety of environmental policies to revitalize pro-environmental consumption, and the government’s budget for this purpose has increased. However, there is a lack of quantitative data and analysis on how education and changing public awareness of the environment affect pro-environmental consumption. In addition, to improve pro-environmental consumption, its determinant and hindrance factors should be analyzed in advance. Accordingly, we suggest a pro-environmental consumption index that represents the state of pro-environmental consumption based on big data search queries and use the index to analyze determinants of and hindrances to pro-environmental consumption. To verify the reliability of the proposed indicator, we examine its correlation with Greendex, an existing survey-based indicator. In addition, we analyze the determinants of pro-environmental consumption across 13 countries based on the proposed indicator. The index is highest for Argentina and average for Korea. The analysis of determinants shows that the levels of health expenditure, the ratio of the population aged over 65 years, and past orientation are significantly negatively related to the pro-environmental consumption index, whereas the level of pre-primary education is significantly positively related to it. We also find that high-GDP countries show a significantly positive relationship between economic growth and pro-environmental consumption, but low-GDP countries do not.
(This article belongs to the Special Issue Big Data and Predictive Analytics for Sustainability)
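
A minimal sketch of the reliability check mentioned in this abstract: correlating a search-query-based index with a survey-based indicator. The series below are made-up placeholder numbers, not values from the study:

```python
import numpy as np

# Made-up values of a search-query-based consumption index and a
# survey-based indicator (Greendex-style) for one country over six periods.
query_index = np.array([48.2, 51.0, 53.4, 50.1, 55.6, 57.3])
survey_index = np.array([50.5, 52.1, 54.0, 51.2, 56.8, 58.0])

# Pearson correlation as a simple check that the two indicators move together.
r = np.corrcoef(query_index, survey_index)[0, 1]
print(f"Pearson r = {r:.3f}")
```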