40 Results Found

  • Article
  • Open Access
10 Citations
4,036 Views
21 Pages

A Novel Query Strategy-Based Rank Batch-Mode Active Learning Method for High-Resolution Remote Sensing Image Classification

  • Xin Luo,
  • Huaqiang Du,
  • Guomo Zhou,
  • Xuejian Li,
  • Fangjie Mao,
  • Di’en Zhu,
  • Yanxin Xu,
  • Meng Zhang,
  • Shaobai He and
  • Zihao Huang

7 June 2021

An informative training set is necessary for ensuring the robust performance of the classification of very-high-resolution remote sensing (VHRRS) images, but labeling work is often difficult, expensive, and time-consuming. This makes active learning...

  • Article
  • Open Access
2,715 Views
18 Pages

4 August 2021

Nearest neighbor (NN) and range (RN) queries are basic query types in spatial databases. In this study, we refer to collections of NN and RN queries as spatial proximity (SP) queries. At peak times, location-based services (LBS) need to quickly proce...

  • Article
  • Open Access
6,698 Views
17 Pages

Efficient and Effective Directed Minimum Spanning Tree Queries

  • Zhuoran Wang,
  • Dian Ouyang,
  • Yikun Wang,
  • Qi Liang and
  • Zhuo Huang

Computing directed Minimum Spanning Tree (DMST) is a fundamental problem in graph theory. It is applied in a wide spectrum of fields from computer network and communication protocol design to revenue maximization in social networks and syntactic pars...

  • Article
  • Open Access
1 Citation
2,676 Views
17 Pages

23 June 2022

Given a set of facilities F and a query point q, a k-farthest neighbor (kFN) query returns the k farthest facilities f1, f2, ..., fk from q. This study considers the moving k-farthest neighbor (MkFN) query that constantly retrieves the k facilities...

  • Article
  • Open Access
1 Citation
5,112 Views
28 Pages

Conjunctive queries play a key role in retrieving data from a database. In a database, a query containing many conditions in its predicate, connected by an “and/&/∧” operator, is called a conjunctive query. Retrieving the outcome of a conjunctive...

  • Review
  • Open Access
98 Citations
23,118 Views
13 Pages

17 October 2014

Today, big data are generated from many sources, and there is a huge demand for storing, managing, processing, and querying big data. The MapReduce model and its open-source implementation, Hadoop, have proven themselves as the de facto solu...

  • Article
  • Open Access
3 Citations
3,615 Views
19 Pages

Query Rewriting for Incremental Continuous Query Evaluation in HIFUN

  • Petros Zervoudakis,
  • Haridimos Kondylakis,
  • Nicolas Spyratos and
  • Dimitris Plexousakis

8 May 2021

HIFUN is a high-level query language for expressing analytic queries of big datasets, offering a clear separation between the conceptual layer, where analytic queries are defined independently of the nature and location of data, and the physical laye...

  • Article
  • Open Access
21 Citations
12,071 Views
21 Pages

Classification Active Learning Based on Mutual Information

  • Jamshid Sourati,
  • Murat Akcakaya,
  • Jennifer G. Dy,
  • Todd K. Leen and
  • Deniz Erdogmus

5 February 2016

Selecting a subset of samples to label from a large pool of unlabeled data points, such that a sufficiently accurate classifier is obtained using a reasonably small training set is a challenging, yet critical problem. Challenging, since solving this...

  • Article
  • Open Access
577 Views
19 Pages

19 December 2025

Transferring cell type annotations from a reference dataset to a query dataset is a fundamental problem in AI-based single-cell data analysis. However, single-cell measurement techniques lead to domain gaps between multiple batches or datasets. The exist...

  • Article
  • Open Access
5 Citations
4,303 Views
31 Pages

Efficient Group K Nearest-Neighbor Spatial Query Processing in Apache Spark

  • Panagiotis Moutafis,
  • George Mavrommatis,
  • Michael Vassilakopoulos and
  • Antonio Corral

Aiming at the problem of spatial query processing in distributed computing systems, the design and implementation of new distributed spatial query algorithms is a current challenge. Apache Spark is a memory-based framework suitable for real-time and...

  • Article
  • Open Access
2 Citations
3,055 Views
20 Pages

JQPro: Join Query Processing in a Distributed System for Big RDF Data Using the Hash-Merge Join Technique

  • Nahla Mohammed Elzein,
  • Mazlina Abdul Majid,
  • Ibrahim Abaker Targio Hashem,
  • Ashraf Osman Ibrahim,
  • Anas W. Abulfaraj and
  • Faisal Binzagr

6 March 2023

In the last decade, the volume of semantic data has increased exponentially, with the number of Resource Description Framework (RDF) datasets exceeding trillions of triples in RDF repositories. Hence, the size of RDF datasets continues to grow. Howev...

  • Article
  • Open Access
2 Citations
2,526 Views
15 Pages

4 June 2022

Active learning is a method that actively selects highly informative examples from a large number of unlabeled samples to be labeled by experts, so as to obtain a high-precision classifier with a small number of samples. Most of the current...

  • Article
  • Open Access
4 Citations
4,164 Views
27 Pages

14 June 2021

It is extremely important to extract valuable information and achieve efficient integration of remote sensing data. The multi-source and heterogeneous nature of remote sensing data leads to the increasing complexity of these relationships, and means...

  • Article
  • Open Access
2 Citations
1,863 Views
18 Pages

28 February 2025

Multitask learning models provide benefits by reducing model complexity and improving accuracy by concurrently learning multiple tasks with shared representations. Leveraging inductive knowledge transfer, these models mitigate the risk of overfitting...

  • Article
  • Open Access
8 Citations
2,622 Views
22 Pages

Continuous k nearest neighbor queries over spatial–textual data streams (abbreviated as CkQST) are the core operations of numerous location-based publish/subscribe systems. Such a system is usually subscribed with millions of CkQST and evaluate...

  • Article
  • Open Access
1 Citation
1,057 Views
21 Pages

PhishGraph: A Disk-Aware Approximate Nearest Neighbor Index for Billion-Scale Semantic URL Search

  • Dimitrios Karapiperis,
  • Georgios Feretzakis and
  • Sarandis Mitropoulos

5 November 2025

The proliferation of algorithmically generated malicious URLs necessitates a shift from syntactic detection to semantic analysis. This paper introduces PhishGraph, a disk-aware Approximate Nearest Neighbor (ANN) search system designed to perform bill...

  • Article
  • Open Access
1,712 Views
20 Pages

HitSim: An Efficient Algorithm for Single-Source and Top-k SimRank Computation

  • Jing Bai,
  • Junfeng Zhou,
  • Shuotong Chen,
  • Ming Du,
  • Ziyang Chen and
  • Mengtao Min

12 June 2024

SimRank is a widely used metric for evaluating vertex similarity based on graph topology, with diverse applications such as large-scale graph mining and natural language processing. The objective of the single-source and top-k SimRank query problem i...

  • Article
  • Open Access
40 Citations
11,984 Views
15 Pages

RaMP: A Comprehensive Relational Database of Metabolomics Pathways for Pathway Enrichment Analysis of Genes and Metabolites

  • Bofei Zhang,
  • Senyang Hu,
  • Elizabeth Baskin,
  • Andrew Patt,
  • Jalal K. Siddiqui and
  • Ewy A. Mathé

22 February 2018

The value of metabolomics in translational research is undeniable, and metabolomics data are increasingly generated in large cohorts. The functional interpretation of disease-associated metabolites though is difficult, and the biological mechanisms t...

  • Article
  • Open Access
687 Views
19 Pages

DOCB: A Dynamic Online Cross-Batch Hard Exemplar Recall for Cross-View Geo-Localization

  • Wenchao Fan,
  • Xuetao Tian,
  • Long Huang,
  • Xiuwei Zhang and
  • Fang Wang

Image-based geo-localization is a challenging task that aims to determine the geographic location of a ground-level query image captured by an Unmanned Ground Vehicle (UGV) by matching it to geo-tagged nadir-view (top-down) images from an Unmanned Ae...

  • Article
  • Open Access
393 Views
16 Pages

27 December 2025

Accurate distance perception and collision reasoning are crucial for robotic manipulation in the confined interior of tokamak vacuum vessels. Traditional mesh- or voxel-based methods suffer from discretization artifacts, discontinuities, and heavy me...

  • Article
  • Open Access
7 Citations
5,091 Views
20 Pages

This paper is focused on comparing database replication over spatial data in PostgreSQL and MySQL. Database replication means solving various problems with overloading a single database server with writing and reading queries. There are many replicat...

  • Article
  • Open Access
2 Citations
1,191 Views
16 Pages

25 February 2025

Currently, there is no publicly available dataset for the classification of potato pest and disease-related queries. Moreover, traditional query classification models generally adopt a single maximum-pooling strategy when performing down-sampling ope...

  • Article
  • Open Access
15 Citations
3,099 Views
28 Pages

Polygon Simplification for the Efficient Approximate Analytics of Georeferenced Big Data

  • Isam Mashhour Al Jawarneh,
  • Luca Foschini and
  • Paolo Bellavista

29 September 2023

The unprecedented availability of sensor networks and GPS-enabled devices has caused the accumulation of voluminous georeferenced data streams. These data streams offer an opportunity to derive valuable insights and facilitate decision making for urb...

  • Review
  • Open Access
6 Citations
6,542 Views
29 Pages

Integrating OLAP with NoSQL Databases in Big Data Environments: Systematic Mapping

  • Diana Martinez-Mosquera,
  • Rosa Navarrete,
  • Sergio Luján-Mora,
  • Lorena Recalde and
  • Andres Andrade-Cabrera

The growing importance of data analytics is leading to a shift in data management strategy at many companies, moving away from simple data storage towards adopting Online Analytical Processing (OLAP) query analysis. Concurrently, NoSQL databases are...

  • Feature Paper
  • Article
  • Open Access
7 Citations
3,109 Views
29 Pages

22 October 2020

Recognizing the identity of a query individual in a surveillance sequence is the core of Multi-Object Tracking (MOT) and Re-Identification (Re-Id) algorithms. Both tasks can be addressed by measuring the appearance affinity between people observation...

  • Article
  • Open Access
1,717 Views
24 Pages

20 October 2025

Packing approaches enhance training efficiency by filling the padding space in each batch with shorter sequences, thereby reducing the total number of batches per epoch. This approach has proven effective in both pre-training and supervised fine-tuni...

  • Article
  • Open Access
15 Citations
10,791 Views
29 Pages

Challenges in NoSQL-Based Distributed Data Storage: A Systematic Literature Review

  • Shabana Ramzan,
  • Imran Sarwar Bajwa,
  • Rafaqut Kazmi and
  • Amna

Key-Value stores (KVSs) are the most flexible and simplest model of NoSQL databases, which have become highly popular over the last few years due to their salient features such as availability, portability, reliability, and low operational cost. From...

  • Article
  • Open Access
7 Citations
3,138 Views
17 Pages

On the Design and Implementation of a Blockchain-Based Data Management System for ETO Manufacturing

  • Zhengjun Jing,
  • Niuping Hu,
  • Yurong Song,
  • Bo Song,
  • Chunsheng Gu and
  • Lei Pan

13 September 2022

Engineer-to-order (ETO) is a currently popular production model that can meet customers’ individual needs, for which the orders are primarily non-standard parts or small batches. This production model has caused many management challenges, incl...

  • Article
  • Open Access
6 Citations
1,711 Views
17 Pages

18 November 2024

Remote sensing image retrieval (RSIR) plays a crucial role in remote sensing applications, focusing on retrieving a collection of items that closely match a specified query image. Due to the advantages of low storage cost and fast search speed, deep...

  • Article
  • Open Access
4 Citations
4,171 Views
14 Pages

21 December 2019

Currently, the dual use of IPv4 and IPv6 is becoming a problem. In particular, Network Address Translation (NAT) is an important issue to be solved because of traversal problems in end-to-end applications for lots of mobile IoT devices connected to d...

  • Article
  • Open Access
1,122 Views
22 Pages

Detecting and Exploring Homogeneous Dense Groups via k-Core Decomposition and Core Member Filtering in Social Networks

  • Zeyu Zhang,
  • Yuan Gao,
  • Zhihao Li,
  • Haotian Huang,
  • Yijun Gu,
  • Xi Li,
  • Dechun Yin and
  • Shunshun Fu

6 October 2025

Exploring homogeneous dense groups is one of the important issues in social network structure measurement. k-core decomposition and core member filtering are common methods to uncover homogeneous dense groups in a network. However, existing methods o...

  • Article
  • Open Access
170 Views
17 Pages

29 January 2026

Trajectory similarity calculation, a cornerstone of trajectory data mining, is pivotal for diverse applications such as clustering, classification, and retrieval. While existing representation learning-based methods offer notable advantages in effici...

  • Article
  • Open Access
32 Citations
11,111 Views
25 Pages

3 June 2020

In recent years, the application and wide adoption of Internet of Things (IoT)-based technologies have increased the proliferation of monitoring systems, which has consequently exponentially increased the amounts of heterogeneous data generated. Proc...

  • Article
  • Open Access
12 Citations
3,078 Views
14 Pages

This paper proposes an external breaking vibration identification method of transmission line tower based on a radio frequency identification (RFID) sensor and deep learning. The RFID sensor is designed to obtain the vibration signal of the transmiss...

  • Article
  • Open Access
1,068 Views
34 Pages

A Deployment-Aware Framework for Carbon- and Water-Efficient LLM Serving

  • Julian Hoxha,
  • Marsela Thanasi-Boçe and
  • Tarek Khalifa

22 November 2025

Inference now dominates the lifecycle footprint of large language models, yet published estimates often use inconsistent boundaries and optimize carbon while ignoring water. We present a provider-agnostic framework that unifies scope-transparent meas...

  • Article
  • Open Access
28 Citations
6,854 Views
22 Pages

Multistorey buildings typically include stratified legal interests which provide entitlements to a community of owners to lawfully possess private properties and use communal and public properties. The spatial arrangements of these legal interests ar...

  • Article
  • Open Access
27 Citations
4,020 Views
24 Pages

6 August 2022

Vibration signals collected in real industrial environments are usually limited and unlabeled. In this case, fault diagnosis methods based on deep learning tend to perform poorly. Previous work mainly used the unlabeled data of the same diagnostic ob...

  • Article
  • Open Access
5 Citations
2,390 Views
22 Pages

13 March 2024

In this paper, we propose a method for detecting potentially anomalous cosmic ray particle tracks in a big image dataset acquired by complementary metal-oxide-semiconductor (CMOS) sensors. Those sensors are part of the scientific infrastructure of Cosmic Ray...

  • Article
  • Open Access
705 Views
55 Pages

Hybrid AI and LLM-Enabled Agent-Based Real-Time Decision Support Architecture for Industrial Batch Processes: A Clean-in-Place Case Study

  • Apolinar González-Potes,
  • Diego Martínez-Castro,
  • Carlos M. Paredes,
  • Alberto Ochoa-Brust,
  • Luis J. Mena,
  • Rafael Martínez-Peláez,
  • Vanessa G. Félix and
  • Ramón A. Félix-Cuadras

1 February 2026

A hybrid AI and LLM-enabled architecture is presented for real-time decision support in industrial batch processes, where supervision still relies heavily on human operators and ad hoc SCADA logic. Unlike algorithmic contributions proposing novel AI...

  • Article
  • Open Access
1,071 Views
37 Pages

23 October 2025

Today’s rapidly increasing number and performance of Remotely Piloted Aircraft Systems (RPASs) and sensors allow for an innovative approach to monitoring, mitigating, and responding to natural disasters and risks. At present, there are 100s of...