
298 Results Found

  • Article
  • Open Access
35 Citations
4,667 Views
22 Pages

GPU Parallelization of a Hybrid Pseudospectral Geophysical Turbulence Framework Using CUDA

  • Duane Rosenberg,
  • Pablo D. Mininni,
  • Raghu Reddy and
  • Annick Pouquet

8 February 2020

An existing hybrid MPI-OpenMP scheme is augmented with a CUDA-based fine grain parallelization approach for multidimensional distributed Fourier transforms, in a well-characterized pseudospectral fluid turbulence code. Basics of the hybrid scheme are...

  • Article
  • Open Access
1 Citation
6,185 Views
26 Pages

Raymarching Distance Fields with CUDA

  • Avelina Hadji-Kyriacou and
  • Ognjen Arandjelović

9 November 2021

Raymarching is a technique for rendering implicit surfaces using signed distance fields. It has been known and used since the 1980s for rendering fractals and CSG (constructive solid geometry) surfaces, but has rarely been used for commercial renderi...

  • Article
  • Open Access
4 Citations
2,576 Views
18 Pages

An Implementation of LASER Beam Welding Simulation on Graphics Processing Unit Using CUDA

  • Ernandes Nascimento,
  • Elisan Magalhães,
  • Arthur Azevedo,
  • Luiz E. S. Paes and
  • Ariel Oliveira

The maximum number of parallel threads in traditional CFD solutions is limited by the Central Processing Unit (CPU) capacity, which is lower than the capabilities of a modern Graphics Processing Unit (GPU). In this context, the GPU allows for simulta...

  • Article
  • Open Access
5 Citations
4,982 Views
17 Pages

Watershed analysis, as a fundamental component of digital terrain analysis, is based on the Digital Elevation Model (DEM), which is a grid (raster) model of the Earth surface and topography. Watershed analysis consists of computationally and data int...

  • Article
  • Open Access
2 Citations
2,863 Views
20 Pages

10 December 2021

In the field of computational biology, sequence alignment is a very important methodology. BLAST is a very common tool for performing sequence alignment in bioinformatics provided by National Center for Biotechnology Information (NCBI) in the USA. Th...

  • Article
  • Open Access
3 Citations
5,647 Views
29 Pages

Accelerating a Geometrical Approximated PCA Algorithm Using AVX2 and CUDA

  • Alina L. Machidon,
  • Octavian M. Machidon,
  • Cătălin B. Ciobanu and
  • Petre L. Ogrutan

13 June 2020

Remote sensing data has known an explosive growth in the past decade. This has led to the need for efficient dimensionality reduction techniques, mathematical procedures that transform the high-dimensional data into a meaningful, reduced representati...

  • Article
  • Open Access
4 Citations
3,183 Views
15 Pages

4 December 2021

In this study, a CUDA Fortran-based GPU-accelerated Laplace equation model was developed and applied to several cases. The Laplace equation is one of the equations that can physically analyze the groundwater flows, and is an equation that can provide...

  • Article
  • Open Access
12 Citations
9,393 Views
19 Pages

OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA

  • Roberto L. Castro,
  • Diego Andrade and
  • Basilio B. Fraguela

24 August 2021

Improving the performance of the convolution operation has become a key target for High Performance Computing (HPC) developers due to its prevalence in deep learning applied mainly to video processing. The improvement is being pushed by algorithmic a...

  • Communication
  • Open Access
7 Citations
4,702 Views
11 Pages

Parallel Algorithm for Connected-Component Analysis Using CUDA

  • Dominic Windisch,
  • Christian Kaever,
  • Guido Juckeland and
  • André Bieberle

1 February 2023

In this article, we introduce a parallel algorithm for connected-component analysis (CCA) on GPUs which drastically reduces the volume of data to transfer from GPU to the host. CCA algorithms targeting GPUs typically store the extracted features in a...

  • Article
  • Open Access
13 Citations
4,357 Views
20 Pages

A CUDA-Based Parallel Geographically Weighted Regression for Large-Scale Geographic Data

  • Dongchao Wang,
  • Yi Yang,
  • Agen Qiu,
  • Xiaochen Kang,
  • Jiakuan Han and
  • Zhengyuan Chai

Geographically weighted regression (GWR) introduces the distance weighted kernel function to examine the non-stationarity of geographical phenomena and improve the performance of global regression. However, GWR calibration becomes critical when using...

  • Article
  • Open Access
1,713 Views
14 Pages

5 March 2025

The Greiner–Hormann algorithm is a commonly used polygon overlay analysis algorithm. It uses a double-linked list structure to store vertex data, and its intersection calculation step has a significant effect on the overall operating efficiency...

  • Article
  • Open Access
3 Citations
6,231 Views
18 Pages

Geospatial transformations in the form of reprojection calculations for large datasets can be computationally intensive; as such, finding better, less expensive ways of achieving these computations is desired. In this paper, we report our efforts in...

  • Article
  • Open Access
1,584 Views
35 Pages

Error Classification and Static Detection Methods in Tri-Programming Models: MPI, OpenMP, and CUDA

  • Saeed Musaad Altalhi,
  • Fathy Elbouraey Eassa,
  • Sanaa Abdullah Sharaf,
  • Ahmed Mohammed Alghamdi,
  • Khalid Ali Almarhabi and
  • Rana Ahmad Bilal Khalid

The growing adoption of supercomputers across various scientific disciplines, particularly by researchers without a background in computer science, has intensified the demand for parallel applications. These applications are typically developed using...

  • Article
  • Open Access
1,111 Views
31 Pages

Efficient scheduling of virtual power plants (VPPs) is essential for the integration of distributed energy resources into modern power systems. This study presents a CUDA-accelerated Multiple-Chain Simulated Annealing (MC-SA) algorithm tailored for o...

  • Article
  • Open Access
13 Citations
4,014 Views
14 Pages

28 October 2018

An efficient parallel computation using graphics processing units (GPUs) is developed for studying the electromagnetic (EM) backscattering characteristics from a large three-dimensional sea surface. A slope-deterministic composite scattering model (S...

  • Article
  • Open Access
1 Citation
1,077 Views
23 Pages

11 September 2025

This paper introduces a parallelized approach to reconstruct Koopman computational graphs from the perspective of parallel computing to address the computational efficiency bottleneck in approximating Koopman operators within high-dimensional spaces....

  • Article
  • Open Access
2 Citations
3,621 Views
15 Pages

CUDA-Optimized GPU Acceleration of 3GPP 3D Channel Model Simulations for 5G Network Planning

  • Nasir Ali Shah,
  • Mihai T. Lazarescu,
  • Roberto Quasso and
  • Luciano Lavagno

The simulation of massive multiple-input multiple-output (MIMO) channel models is becoming increasingly important for testing and validation of fifth-generation new radio (5G NR) wireless networks and beyond. However, simulation performance tends to...

  • Article
  • Open Access
1 Citation
2,860 Views
14 Pages

The growing number of space objects leads to increases in the potential risks of damage to satellites and generates space debris after colliding. Conjunction assessment analysis is one of the keys to evaluating the collision risk of satellites and sa...

  • Article
  • Open Access
6 Citations
3,165 Views
14 Pages

Designing automatic optimizing compilers is an advanced engineering process requiring a great deal of expertise, programming, testing, and experimentation. Maintaining the approach and adapting it to evolving libraries and environments is a time-cons...

  • Article
  • Open Access
3 Citations
4,762 Views
8 Pages

CuDDI: A CUDA-Based Application for Extracting Drug-Drug Interaction Related Substance Terms from PubMed Literature

  • Yin Lu,
  • Aditya Chandra Vothgod Ramachandra,
  • Minh Pham,
  • Yi-Cheng Tu and
  • Feng Cheng

19 March 2019

Drug-drug interaction (DDI) is becoming a serious issue in clinical pharmacy as the use of multiple medications is more common. The PubMed database is one of the biggest literature resources for DDI studies. It contains over 150,000 journal articles...

  • Article
  • Open Access
2,233 Views
12 Pages

Cropped Quad-Tree Based Solid Object Colouring with Cuda

  • Abdullah Çavuşoğlu,
  • Baha Şen,
  • Caner Özcan and
  • Salih Görgünoğlu

1 December 2013

In this study, surfaces of solid objects are coloured with Cropped Quad-Tree method utilizing GPU computing optimization. There are numerous methods used in solid object colouring. When the studies carried out in different fields are taken into consi...

  • Proceeding Paper
  • Open Access
1 Citation
602 Views
9 Pages

Performance Analysis of CUDA-Based Galileo Signal Quality Monitoring

  • Florian Binder,
  • Daniel J. Bauer,
  • Thomas Pany and
  • Torben Schüler

The aim of this study was to develop basic findings for a continuous Signal Quality Monitoring system based on a measurement campaign. Four Galileo satellites were repeatedly recorded, using a dish antenna, and their metrics were analyzed. Due to the...

  • Article
  • Open Access
5 Citations
4,352 Views
15 Pages

Accelerating the Finite-Element Method for Reaction-Diffusion Simulations on GPUs with CUDA

  • Hedi Sellami,
  • Leo Cazenille,
  • Teruo Fujii,
  • Masami Hagiya,
  • Nathanael Aubert-Kato and
  • Anthony J. Genot

22 September 2020

DNA nanotechnology offers a fine control over biochemistry by programming chemical reactions in DNA templates. Coupled to microfluidics, it has enabled DNA-based reaction-diffusion microsystems with advanced spatio-temporal dynamics such as traveling...

  • Article
  • Open Access
2 Citations
1,083 Views
13 Pages

26 May 2025

Nowadays, underwater activities are becoming more and more important. As the number of underwater sensing devices grows rapidly, the amount of bandwidth needed also increases very quickly. Apart from underwater communication, direct communication acr...

  • Article
  • Open Access
1 Citation
2,643 Views
15 Pages

Lightweight GPU-Accelerated Parallel Processing of the SCHISM Model Using CUDA Fortran

  • Hongchun Zhang,
  • Qian Cao,
  • Changmao Wu,
  • Guangjun Xu,
  • Yuli Liu,
  • Xingru Feng,
  • Meibing Jin and
  • Changming Dong

The SCHISM model is widely used for ocean numerical simulations, but its computational efficiency is constrained by the substantial resources it requires. To enhance its performance, this study develops GPU–SCHISM, a GPU-accelerated parallel ve...

  • Article
  • Open Access
19 Citations
3,484 Views
17 Pages

Feature-Based Sentimental Analysis on Public Attention towards COVID-19 Using CUDA-SADBM Classification Model

  • Siva Kumar Pathuri,
  • N. Anbazhagan,
  • Gyanendra Prasad Joshi and
  • Jinsang You

23 December 2021

The COVID-19 pandemic has spread to almost all countries of the World and affected people both mentally and economically. The primary motivation of this research is to construct a model that takes reviews or evaluations from several people who are af...

  • Article
  • Open Access
1 Citation
3,086 Views
17 Pages

The three-dimensional (3D) geological voxel model is essential for numerical simulation and resource calculation. However, it can be challenging due to the point in polygon test in 3D voxel modeling. The commonly used Winding number algorithm require...

  • Article
  • Open Access
537 Views
24 Pages

A GPU-CUDA Numerical Algorithm for Solving a Biological Model

  • Pasquale De Luca,
  • Giuseppe Fiorillo and
  • Livia Marcellino

Tumor angiogenesis models based on coupled nonlinear parabolic partial differential equations require solving stiff systems where explicit time-stepping methods impose severe stability constraints on the time step size. Implicit–Explicit (IMEX)...

  • Feature Paper
  • Article
  • Open Access
2 Citations
1,036 Views
21 Pages

Accelerated Numerical Simulations of a Reaction-Diffusion-Advection Model Using Julia-CUDA

  • Angelo Ciaramella,
  • Davide De Angelis,
  • Pasquale De Luca and
  • Livia Marcellino

30 April 2025

The emergence of exascale computing systems presents both opportunities and challenges in scientific computing, particularly for complex mathematical models requiring high-performance implementations. This paper addresses these challenges in the cont...

  • Article
  • Open Access
2 Citations
2,750 Views
16 Pages

CUDA and OpenMp Implementation of Boolean Matrix Product with Applications in Visual SLAM

  • Amir Zarringhalam,
  • Saeed Shiry Ghidary,
  • Ali Mohades and
  • Seyed-Ali Sadegh-Zadeh

29 January 2023

In this paper, the concept of ultrametric structure is intertwined with the SLAM procedure. A set of pre-existing transformations has been used to create a new simultaneous localization and mapping (SLAM) algorithm. We have developed two new parallel...

  • Article
  • Open Access
3 Citations
3,583 Views
18 Pages

Towards Enhancing Coding Productivity for GPU Programming Using Static Graphs

  • Leonel Toledo,
  • Pedro Valero-Lara,
  • Jeffrey S. Vetter and
  • Antonio J. Peña

The main contribution of this work is to increase the coding productivity of GPU programming by using the concept of Static Graphs. GPU capabilities have been increasing significantly in terms of performance and memory capacity. However, there are st...

  • Article
  • Open Access
14 Citations
6,422 Views
37 Pages

ReS2tAC—UAV-Borne Real-Time SGM Stereo Optimized for Embedded ARM and CUDA Devices

  • Boitumelo Ruf,
  • Jonas Mohrs,
  • Martin Weinmann,
  • Stefan Hinz and
  • Jürgen Beyerer

7 June 2021

With the emergence of low-cost robotic systems, such as unmanned aerial vehicle, the importance of embedded high-performance image processing has increased. For a long time, FPGAs were the only processing hardware that were capable of high-performanc...

  • Article
  • Open Access
2 Citations
1,897 Views
16 Pages

31 May 2023

To improve the seismic connectivity reliability (SCR) analysis efficiency of water distribution systems (WDS) based on Monte Carlo (MC) simulation, the quasi-Monte Carlo (QMC) method sampled by a low-discrepancy sequence is applied. Furthermore, a pa...

  • Article
  • Open Access
22 Citations
13,469 Views
17 Pages

4 November 2021

To achieve high accuracy when performing deep learning, it is necessary to use a large-scale training model. However, due to the limitations of GPU memory, it is difficult to train large-scale training models within a single GPU. NVIDIA introduced a...

  • Article
  • Open Access
1 Citation
2,400 Views
22 Pages

3 January 2025

Estimating ego-motion in autonomous vehicles is critical for tasks such as localization, navigation, obstacle avoidance, and so on. While traditional methods often rely on direct pose estimation or AI-based approaches, these can be computationally in...

  • Article
  • Open Access
2 Citations
2,673 Views
16 Pages

BooLSPLG: A Library with Parallel Algorithms for Boolean Functions and S-Boxes for GPU

  • Dushan Bikov,
  • Iliya Bouyukliev and
  • Mariya Dzhumalieva-Stoeva

14 April 2023

In this paper, we present a library with sequential and parallel functions for computing some of the most important cryptographic characteristics of Boolean and vectorial Boolean functions. The library implements algorithms to calculate the nonlinear...

  • Article
  • Open Access
10 Citations
3,357 Views
22 Pages

Using a GPU to Accelerate a Longwave Radiative Transfer Model with Efficient CUDA-Based Methods

  • Yuzhu Wang,
  • Yuan Zhao,
  • Wei Li,
  • Jinrong Jiang,
  • Xiaohui Ji and
  • Albert Y. Zomaya

27 September 2019

Climatic simulations rely heavily on high-performance computing. As one of the atmospheric radiative transfer models, the rapid radiative transfer model for general circulation models (RRTMG) is used to calculate the radiative transfer of electromagn...

  • Article
  • Open Access
22 Citations
11,051 Views
24 Pages

GPU Computing with Python: Performance, Energy Efficiency and Usability

  • Håvard H. Holm,
  • André R. Brodtkorb and
  • Martin L. Sætra

In this work, we examine the performance, energy efficiency, and usability when using Python for developing high-performance computing codes running on the graphics processing unit (GPU). We investigate the portability of performance and energy effic...

  • Article
  • Open Access
3 Citations
4,875 Views
21 Pages

26 February 2024

Currently, cryptographic hash functions are widely used in various applications, including message authentication codes, cryptographic random generators, digital signatures, key derivation functions, and post-quantum algorithms. Notably, they play a...

  • Article
  • Open Access
40 Citations
7,209 Views
13 Pages

13 July 2017

The use of unmanned aerial vehicles (UAV) can allow individual tree detection for forest inventories in a cost-effective way. The scale-space filtering (SSF) algorithm is commonly used and has the capability of detecting trees of different crown size...

  • Article
  • Open Access
6 Citations
3,396 Views
16 Pages

15 September 2022

Real-time, simultaneous, and adaptive beam steering into multiple regions of interest replaces conventional raster scanning with a less time-consuming and flexible beam steering framework, where only regions of interest are scanned by a laser beam. C...

  • Article
  • Open Access
3 Citations
3,820 Views
26 Pages

10 July 2019

It is well known that aurorae have very high research value, but the data volume of aurora spectral data is very large, which brings great challenges to storage and transmission. To alleviate this problem, compression of aurora spectral data is indis...

  • Article
  • Open Access
4 Citations
7,836 Views
13 Pages

Exploring Numba and CuPy for GPU-Accelerated Monte Carlo Radiation Transport

  • Tair Askar,
  • Argyn Yergaliyev,
  • Bekdaulet Shukirgaliyev and
  • Ernazar Abdikamalov

This paper examines the performance of two popular GPU programming platforms, Numba and CuPy, for Monte Carlo radiation transport calculations. We conducted tests involving random number generation and one-dimensional Monte Carlo radiation transport...

  • Article
  • Open Access
1 Citation
3,146 Views
21 Pages

11 October 2021

Graphics processing units (GPUs) have been in the spotlight in various fields because they can process a massive amount of computation at a relatively low price. This research proposes a performance acceleration framework applied to Monte Carlo metho...

  • Article
  • Open Access
1,309 Views
18 Pages

This paper proposes an optimized shared memory access technique to enhance parallel processing performance and reduce memory accesses for the ARIA block cipher in GPU environments. To overcome the limited size of GPU shared memory, we merged ARIA’...

  • Article
  • Open Access
923 Views
12 Pages

Feasibility of Implementing Motion-Compensated Magnetic Resonance Imaging Reconstruction on Graphics Processing Units Using Compute Unified Device Architecture

  • Mohamed Aziz Zeroual,
  • Natalia Dudysheva,
  • Vincent Gras,
  • Franck Mauconduit,
  • Karyna Isaieva,
  • Pierre-André Vuissoz and
  • Freddy Odille

22 May 2025

Motion correction in magnetic resonance imaging (MRI) has become increasingly complex due to the high computational demands of iterative reconstruction algorithms and the heterogeneity of emerging computing platforms. However, the clinical applicabil...

  • Article
  • Open Access
3 Citations
2,962 Views
22 Pages

21 October 2023

With the development of engineering technology, engineering has higher requirements for the accuracy and the scale of simulation calculation. The computational efficiency of traditional serial programs cannot meet the requirements of engineering. The...

  • Article
  • Open Access
6 Citations
3,984 Views
21 Pages

Enhancement of In-Plane Seismic Full Waveform Inversion with CPU and GPU Parallelization

  • Min Bahadur Basnet,
  • Mohammad Anas,
  • Zarghaam Haider Rizvi,
  • Asmer Hamid Ali,
  • Mohammad Zain,
  • Giovanni Cascante and
  • Frank Wuttke

2 September 2022

Full waveform inversion is a widely used technique to estimate the subsurface parameters with the help of seismic measurements on the surface. Due to the amount of data, model size and non-linear iterative procedures, the numerical computation of Ful...

  • Article
  • Open Access
6 Citations
2,177 Views
19 Pages

6 July 2023

The structure of metallic materials has a significant impact on their properties. One of the most popular methods to form the properties of metal alloys is heat treatment, which uses thermally activated transformations that take place in metals to ac...

  • Article
  • Open Access
7 Citations
2,242 Views
21 Pages

31 July 2023

The numerical solution for fractional dynamics problems can create a high computational load, which makes it necessary to implement efficient algorithms for their solution. The main contribution to the computational load of such computations is creat...
