MDPI - Publisher of Open Access Journals

26 pages, 389 KB

Open Access Article

Integrating AI with Meta-Language: An Interdisciplinary Framework for Classifying Concepts in Mathematics and Computer Science

by Elena Kramer, Dan Lamberg, Mircea Georgescu and Miri Weiss Cohen

Information 2025, 16(9), 735; https://doi.org/10.3390/info16090735 - 26 Aug 2025

Viewed by 515

Abstract

Providing students with effective learning resources is essential for improving educational outcomes—especially in complex and conceptually diverse fields such as Mathematics and Computer Science. To better understand how these subjects are communicated, this study investigates the linguistic structures embedded in academic texts from selected subfields within both disciplines. In particular, we focus on meta-languages—the linguistic tools used to express definitions, axioms, intuitions, and heuristics within a discipline. The primary objective of this research is to identify which subfields of Mathematics and Computer Science share similar meta-languages. Identifying such correspondences may enable the rephrasing of content from less familiar subfields using styles that students already recognize from more familiar areas, thereby enhancing accessibility and comprehension. To pursue this aim, we compiled text corpora from multiple subfields across both disciplines. We compared their meta-languages using a combination of supervised (Neural Network) and unsupervised (clustering) learning methods. Specifically, we applied several clustering algorithms—K-means, Partitioning around Medoids (PAM), Density-Based Clustering, and Gaussian Mixture Models—to analyze inter-discipline similarities. To validate the resulting classifications, we used XLNet, a deep learning model known for its sensitivity to linguistic patterns. The model achieved an accuracy of 78% and an F1-score of 0.944. Our findings show that subfields can be meaningfully grouped based on meta-language similarity, offering valuable insights for tailoring educational content more effectively. To further verify these groupings and explore their pedagogical relevance, we conducted both quantitative and qualitative research involving student participation. 
This paper presents findings from the qualitative component, namely a content analysis of semi-structured interviews with software engineering students and lecturers.

(This article belongs to the Special Issue Advancing Educational Innovation with Artificial Intelligence)

18 pages, 4574 KB

Open Access Article

Spatio-Temporal Generalization of VIS-NIR-SWIR Spectral Models for Nitrogen Prediction in Sugarcane Leaves

by Carlos Augusto Alves Cardoso Silva, Rodnei Rizzo, Marcelo Andrade da Silva, Matheus Luís Caron and Peterson Ricardo Fiorio

Remote Sens. 2024, 16(22), 4250; https://doi.org/10.3390/rs16224250 - 14 Nov 2024

Viewed by 1330

Abstract

Nitrogen fertilization is a challenging task that usually requires intensive use of resources, such as fertilizers, management and water. This study explored the potential of VIS-NIR-SWIR remote sensing for quantifying leaf nitrogen content (LNC) in sugarcane from different regions and vegetative stages. Conducted in three regions of São Paulo, Brazil (Jaú, Piracicaba and Santa Maria), the research involved three experiments, one per location. The spectral data were obtained at 140, 170, 200, 230 and 260 days after cutting (DAC). From the hyperspectral data, clustering analysis was performed to identify the patterns between the spectral bands for each region where the spectral readings were made, using the Partitioning Around Medoids (PAM) algorithm. Then, the LNC values were used to generate spectral models using Partial Least Squares Regression (PLSR). Subsequently, the generalization of the models was tested with the leave-one-date-out cross-validation (LOOCV) technique. The results showed that although the variation in leaf N was small, the sensor demonstrated the ability to detect these variations. Furthermore, it was possible to determine the influence of N concentrations on the leaf spectra and how this impacted cluster formation. It was observed that the greater the average variation in N content in each cluster, the better defined and denser the groups formed were. The best time to quantify N concentrations was at 140 DAC (R² = 0.90 and RMSE = 0.74 g kg⁻¹). From LOOCV, the areas with sandier soil texture presented a lower model performance compared to areas with clayey soil, with R² < 0.54. The spatial generalization of the models recorded the best performance at 140 DAC (R² = 0.69, RMSE = 1.18 g kg⁻¹ and dr = 0.61), decreasing in accuracy at the crop-maturation stage (260 DAC), R² of 0.05, RMSE of 1.73 g kg⁻¹ and dr of 0.38. 
Although the technique needs further study, our results demonstrate its potential to support the quantification of nutrients in sugarcane in the long term.

(This article belongs to the Special Issue Monitoring and Managing Environmental Sustainability Using Remote Sensing)
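
The leave-one-date-out validation mentioned in the abstract above can be illustrated with a minimal sketch. It only shows how the folds might be formed: every reading from one sampling date is held out while the remaining dates are used for training. The function name `leave_one_date_out` and the sample layout (two readings per date) are assumptions made for illustration, not taken from the article, and no PLSR model is fitted here.

```python
from collections import defaultdict


def leave_one_date_out(dates):
    """Yield (held_out_date, train_indices, test_indices) for each distinct date."""
    by_date = defaultdict(list)
    for index, date in enumerate(dates):
        by_date[date].append(index)
    for held_out in sorted(by_date):
        test_idx = by_date[held_out]
        # Training set: every sample whose date differs from the held-out date.
        train_idx = sorted(
            i for date, idx in by_date.items() if date != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


# Hypothetical layout: two readings at each of the five DAC values in the abstract.
dates = [140, 140, 170, 170, 200, 200, 230, 230, 260, 260]
folds = list(leave_one_date_out(dates))

for held_out, train_idx, test_idx in folds:
    print(f"hold out {held_out} DAC: train on {len(train_idx)}, test on {len(test_idx)}")
```

In a full workflow, a regression model would be fitted on `train_idx` and scored on `test_idx` for each fold, which gives one R² and RMSE per held-out date.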

18 pages, 4152 KB

Open Access Article

Significant Improvement in Soil Organic Carbon Estimation Using Data-Driven Machine Learning Based on Habitat Patches

by Wenping Yu, Wei Zhou, Ting Wang, Jieyun Xiao, Yao Peng, Haoran Li and Yuechen Li

Remote Sens. 2024, 16(4), 688; https://doi.org/10.3390/rs16040688 - 15 Feb 2024

Cited by 5 | Viewed by 2935

Abstract

Soil organic carbon (SOC) is generally thought to act as a carbon sink; however, in areas with high spatial heterogeneity, using a single model to estimate the SOC of the whole study area greatly reduces the simulation accuracy. Dividing the earth's surface into units is therefore important when building separate models. Here, we divided the research area into different habitat patches using the partitioning around medoids (PAM) clustering algorithm; then, we built SOC simulation models using machine learning algorithms. The results showed that three habitat patches were created. The simulation accuracy for Habitat Patch 1 (R² = 0.55; RMSE = 2.89) and Habitat Patch 3 (R² = 0.47; RMSE = 3.94) using the XGBoost model was higher than that for the whole study area (R² = 0.44; RMSE = 4.35): the R² increased by 25% and 6.8% and the RMSE decreased by 33.6% and 9.4%, respectively, even though the number of field sample points declined by 70% and 74%. The R² of Habitat Patch 2 using the RF model increased by 17.1%, and the RMSE decreased by 10.5%, while the number of sample points declined by 58%. Therefore, using different models for corresponding patches will significantly increase the SOC simulation accuracy over using one model for the whole study area. This will provide scientific guidance for SOC or soil property monitoring with low field survey costs and high simulation accuracy.

(This article belongs to the Special Issue Remote Sensing for Advancing Nature-Based Climate Solutions)
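
The percentage changes quoted in the abstract above can be checked from the reported R² and RMSE values. The short sketch below assumes the changes are relative to the whole-study-area model; the helper name `relative_change_pct` is introduced here for illustration.

```python
def relative_change_pct(baseline, value):
    """Relative change of `value` with respect to `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# Values as reported in the abstract.
whole_r2, whole_rmse = 0.44, 4.35
patches = {
    "Habitat Patch 1": (0.55, 2.89),
    "Habitat Patch 3": (0.47, 3.94),
}

changes = {}
for name, (r2, rmse) in patches.items():
    changes[name] = (
        round(relative_change_pct(whole_r2, r2), 1),
        round(relative_change_pct(whole_rmse, rmse), 1),
    )
    print(f"{name}: R2 change {changes[name][0]}%, RMSE change {changes[name][1]}%")
```

Under that assumption the results reproduce the abstract's figures: R² rises by 25% and 6.8%, and RMSE falls by 33.6% and 9.4%.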

15 pages, 19203 KB

Open Access Article

Improved Faster Region-Based Convolutional Neural Networks (R-CNN) Model Based on Split Attention for the Detection of Safflower Filaments in Natural Environments

by Zhenguo Zhang, Ruimeng Shi, Zhenyu Xing, Quanfeng Guo and Chao Zeng

Agronomy 2023, 13(10), 2596; https://doi.org/10.3390/agronomy13102596 - 11 Oct 2023

Cited by 18 | Viewed by 2687

Abstract

The accurate acquisition of safflower filament information is the prerequisite for robotic picking operations. To detect safflower filaments accurately under different illumination, branch and leaf occlusion, and weather conditions, an improved Faster R-CNN model for filaments was proposed. Because safflower filaments appear dense and small in the images, the model used ResNeSt-101, which has a residual network structure, as the backbone feature extraction network to enhance the expressive power of the extracted features. Region of Interest (ROI) Align then replaced ROI Pooling to reduce the feature errors caused by double quantization. In addition, partitioning around medoids (PAM) clustering was used to optimize the scale and number of the network's initial anchors and so improve the detection accuracy for small-sized safflower filaments. The test results showed that the mean Average Precision (mAP) of the improved Faster R-CNN reached 91.49%. Compared with Faster R-CNN, YOLOv3, YOLOv4, YOLOv5, and YOLOv6, the improved Faster R-CNN increased the mAP by 9.52%, 2.49%, 5.95%, 3.56%, and 1.47%, respectively. The mAP of safflower filament detection was higher than 91% on sunny, cloudy, and overcast days, and in sunlight, backlight, branch and leaf occlusion, and dense occlusion. The improved Faster R-CNN can accurately detect safflower filaments in natural environments and can provide technical support for the recognition of small-sized crops.

(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture)

25 pages, 2382 KB

Open Access Article

The Community Structure of eDNA in the Los Angeles River Reveals an Altered Nitrogen Cycle at Impervious Sites

by Savanah Senn, Sharmodeep Bhattacharyya, Gerald Presley, Anne E. Taylor, Rayne Stanis, Kelly Pangell, Daila Melendez and Jillian Ford

Diversity 2023, 15(7), 823; https://doi.org/10.3390/d15070823 - 29 Jun 2023

Cited by 1 | Viewed by 2874 | Correction

Abstract

In this study, we sought to investigate the impact of urbanization, the presence of concrete river bottoms, and nutrient pollution on microbial communities along the L.A. River. Six molecular markers were evaluated for the identification of bacteria, plants, fungi, fish, and invertebrates in 90 samples. PCA (principal components analysis) was used along with PAM (partitioning around medoids) clustering to reveal community structure, and an NB (negative binomial) model in DESeq2 was used for differential abundance analysis. PCA and factor analysis exposed the main axes of variation but were sensitive to outliers. The differential abundance of Proteobacteria was associated with soft-bottom sites, and there was an apparent balance in the abundance of bacteria responsible for nitrogen cycling. Nitrogen cycling was explained via ammonia-oxidizing archaea; the complete ammonia oxidizers, Nitrospira sp.; nitrate-reducing bacteria, Marmoricola sp.; and nitrogen-fixing bacteria Devosia sp., which were differentially abundant at soft-bottom sites (p adj < 0.002). In contrast, the differential abundance of several cyanobacteria and other anoxygenic phototrophs was associated with the impervious sites, which suggested the accumulation of excess nitrogen. The soft-bottom sites tended to be represented by a differential abundance of aerobes, whereas the concrete-associated species tended to be alkaliphilic, saliniphilic, calciphilic, sulfate dependent, and anaerobic. In the Glendale Narrows, downstream from multiple water reclamation plants, there was a differential abundance of cyanobacteria and algae; however, indicator species for low nutrient environments and ammonia-abundance were also present. There was a differential abundance of ascomycetes associated with Arroyo Seco and a differential abundance of Scenedesmaceae green algae and cyanobacteria in Maywood, as seen in the analysis that compared suburban with urban river communities. 
The proportion of Ascomycota to Basidiomycota within the L.A. River differed from the expected proportion based on published worldwide freshwater and river 18S data; the shift in community structure was most likely associated with the extremes of urbanization. This study indicates that extreme urbanization can result in the overrepresentation of cyanobacterial species that could cause reductions in water quality and safety.

(This article belongs to the Special Issue Biodiversity Conservation in Metacommunities)

13 pages, 2502 KB

Open Access Editor’s Choice Article

Typing of the Gut Microbiota Community in Japanese Subjects

by Tomohisa Takagi, Ryo Inoue, Akira Oshima, Hiroshi Sakazume, Kenta Ogawa, Tomo Tominaga, Yoichi Mihara, Takeshi Sugaya, Katsura Mizushima, Kazuhiko Uchiyama, Yoshito Itoh and Yuji Naito

Microorganisms 2022, 10(3), 664; https://doi.org/10.3390/microorganisms10030664 - 20 Mar 2022

Cited by 35 | Viewed by 13656

Abstract

Gut microbiota are involved in both host health and disease and can be stratified based on bacteriological composition. However, gut microbiota clustering data are limited for Asians. In this study, fecal microbiota of 1803 Japanese subjects, including 283 healthy individuals, were analyzed by 16S rRNA sequencing and clustered using two models. The association of various diseases with each community type was also assessed. Five and fifteen communities were identified using partitioning around medoids (PAM) and the Dirichlet multinomial mixtures model, respectively. Bacteria exhibiting characteristically high abundance among the PAM-identified types were of the family Ruminococcaceae (Type A); the genera Bacteroides, Blautia, and Faecalibacterium (Type B); Bacteroides, Fusobacterium, and Proteus (Type C); Bifidobacterium (Type D); and Prevotella (Type E). The most noteworthy community found in the Japanese subjects was the Bifidobacterium-rich community. The odds ratio based on type E, which had the largest population of healthy subjects, revealed that other types (especially types A, C, and D) were highly associated with various diseases, including inflammatory bowel disease, functional gastrointestinal disorder, and lifestyle-related diseases. Gut microbiota community typing reproducibly identified organisms that may represent enterotypes peculiar to Japanese individuals and that are partly different from those of individuals from Western countries.

(This article belongs to the Special Issue State-of-the-Art Gut Microbiota Research in Asia)
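
The abstract above reports odds ratios computed relative to type E. As a reminder of the standard 2×2 formula, the sketch below computes the odds of disease in one community type divided by the odds of disease in a reference type. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study, and the function name `odds_ratio` is an assumption for illustration.

```python
def odds_ratio(cases_in_type, healthy_in_type, cases_in_ref, healthy_in_ref):
    """Odds of disease in a community type divided by the odds in the reference type."""
    odds_type = cases_in_type / healthy_in_type
    odds_ref = cases_in_ref / healthy_in_ref
    return odds_type / odds_ref


# Hypothetical counts (NOT from the article): 60 diseased and 20 healthy subjects
# in some type, versus 30 diseased and 40 healthy subjects in the reference type.
example_or = odds_ratio(60, 20, 30, 40)
print(f"odds ratio vs. reference type: {example_or:.2f}")
```

An odds ratio above 1 means that disease is more strongly associated with the given type than with the reference type.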

17 pages, 3077 KB

Open Access Article

A Parallel Architecture for the Partitioning around Medoids (PAM) Algorithm for Scalable Multi-Core Processor Implementation with Applications in Healthcare

by Hassan Mushtaq, Sajid Gul Khawaja, Muhammad Usman Akram, Amanullah Yasin, Muhammad Muzammal, Shehzad Khalid and Shoab Ahmad Khan

Sensors 2018, 18(12), 4129; https://doi.org/10.3390/s18124129 - 25 Nov 2018

Cited by 9 | Viewed by 5421

Abstract

Clustering is the most common method for organizing unlabeled data into its natural groups (called clusters), based on similarity (in some sense or another) among data objects. The Partitioning Around Medoids (PAM) algorithm belongs to the partitioning-based methods of clustering widely used for object categorization, image analysis, bioinformatics and data compression, but due to its high time complexity, the PAM algorithm cannot be used with large datasets or in any embedded or real-time application. In this work, we propose a simple and scalable parallel architecture for the PAM algorithm to reduce its running time. This architecture can easily be implemented either on a multi-core processor system to deal with big data or on a reconfigurable hardware platform, such as FPGAs and MPSoCs, which makes it suitable for real-time clustering applications. Our proposed model partitions data equally among multiple processing cores. Each core executes the same sequence of tasks simultaneously on its respective data subset and shares intermediate results with other cores to produce results. Experiments show that the computational complexity of the PAM algorithm is reduced exponentially as we increase the number of cores working in parallel. It is also observed that the speedup graph of our proposed model becomes more linear with the increase in number of data points and as the clusters become more uniform. The results also demonstrate that the proposed architecture produces the same results as the actual PAM algorithm, but with reduced computational complexity.

(This article belongs to the Special Issue Smart, Secure and Sustainable (3S) Technologies for IoT Based Healthcare Applications)
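
For readers unfamiliar with PAM, the following is a minimal serial sketch of the classic two-phase procedure (greedy BUILD, then SWAP). It is not the parallel architecture the article proposes, and the function name `pam`, the toy points, and the use of Euclidean distance are assumptions made for illustration. It recomputes the total cost for every candidate swap, which is the kind of expense that motivates parallel or accelerated variants.

```python
import math


def pam(points, k, dist=math.dist):
    """Serial PAM sketch: returns (medoid indices, labels, total cost)."""
    n = len(points)
    # Precompute all pairwise dissimilarities.
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

    def cost(medoids):
        # Sum of each point's distance to its nearest medoid.
        return sum(min(d[i][m] for m in medoids) for i in range(n))

    # BUILD: greedily add the point that lowers the total cost the most.
    medoids = []
    while len(medoids) < k:
        candidates = [c for c in range(n) if c not in medoids]
        medoids.append(min(candidates, key=lambda c: cost(medoids + [c])))

    # SWAP: apply the best medoid/non-medoid exchange until none improves the cost.
    current = cost(medoids)
    while True:
        best_cost, best_swap = current, None
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                trial_cost = cost(trial)
                if trial_cost < best_cost:
                    best_cost, best_swap = trial_cost, trial
        if best_swap is None:
            break
        medoids, current = best_swap, best_cost

    # Label each point with the position of its nearest medoid.
    labels = [min(range(k), key=lambda j: d[i][medoids[j]]) for i in range(n)]
    return medoids, labels, current


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels, total_cost = pam(points, k=2)
print(sorted(medoids), labels, total_cost)
```

Unlike k-means centroids, the medoids are always actual data points, so the method works with any dissimilarity function passed as `dist`.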

Search Results (7)
