Search Results (14)

Search Parameters:
Keywords = big data in astronomy

41 pages, 9508 KB  
Article
CTAARCHS: Cloud-Based Technologies for Archival Astronomical Research Contents and Handling Systems
by Stefano Gallozzi, Georgios Zacharis, Federico Fiordoliva and Fabrizio Lucarelli
Metrics 2025, 2(3), 18; https://doi.org/10.3390/metrics2030018 - 8 Sep 2025
Viewed by 1029
Abstract
This paper presents a flexible approach to a multipurpose, heterogeneous archive and data management system model that merges the robustness of legacy grid-based technologies with modern cloud and edge computing paradigms. It leverages innovations driven by big data, IoT, AI, and machine learning to create an adaptive data storage and processing framework. In today’s digital age, where data are the new intangible gold, the “gold rush” lies in managing and storing massive datasets effectively—especially when these data serve governmental or commercial purposes, raising concerns about privacy and data misuse by third-party aggregators. Astronomical data, in particular, require this same thoughtful approach. Scientific discovery increasingly depends on efficient extraction and processing of large datasets. Distributed archival models, unlike centralized warehouses, offer scalability by allowing data to be accessed and processed across locations via cloud services. Incorporating edge computing further enables real-time access with reduced latency. Major astronomical projects must also avoid common single points of failure (SPOFs), often resulting from suboptimal technological choices driven by collaboration politics or In-Kind Contributions (IKCs). These missteps can hinder innovation and long-term project success. The principal goal of this work is to outline best practices in archival and data management projects—from policy development and task planning to use-case definition and implementation. Only after these steps can a coherent selection of hardware, software, or virtual environments be made. The proposed model—CTAARCHS (Cloud-based Technologies for Astronomical Archiving Research Contents and Handling Systems)—is an open-source, multidisciplinary platform supporting big data needs in astronomy. It promotes broad institutional collaboration, offering code repositories and sample data for immediate use. Full article

34 pages, 11750 KB  
Article
Accelerated and Energy-Efficient Galaxy Detection: Integrating Deep Learning with Tensor Methods for Astronomical Imaging
by Humberto Farias, Guillermo Damke, Mauricio Solar and Marcelo Jaque Arancibia
Universe 2025, 11(2), 73; https://doi.org/10.3390/universe11020073 - 18 Feb 2025
Cited by 3 | Viewed by 1172
Abstract
Addressing the astronomical challenges posed by the interplay of data volume, AI sophistication, and energy consumption is crucial for the future of astronomy. As astronomical surveys continue to produce vast amounts of data, the computational and energy demands for galaxy classification have escalated, necessitating more efficient and sustainable approaches. This study presents a novel application of tensor factorization within the Faster R-CNN framework, resulting in the development of our model, T-Faster R-CNN, designed to enhance both the energy efficiency and computational performance of deep learning models used in galaxy classification. By integrating tensor factorization, our T-Faster R-CNN significantly reduces the model’s complexity, memory footprint, and CO2 emissions, while maintaining, and in some cases even improving, the accuracy of morphological classification. The effectiveness of this optimized model is validated using data from the Galaxy Zoo DECaLS, where it demonstrates substantial improvements in computational efficiency without compromising classification precision. Furthermore, this research incorporates green code principles, emphasizing reductions in energy consumption and environmental impact in computational astronomy. The T-Faster R-CNN model offers a resource-efficient, sustainable methodology for analyzing large-scale astronomical data, addressing the critical need for greener computational practices in the era of big data. Full article
(This article belongs to the Section Astroinformatics and Astrostatistics)
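The parameter savings from factorizing a convolution can be sketched with a back-of-the-envelope count: a Tucker-2 decomposition replaces one large kernel with a 1×1 input projection, a small core convolution, and a 1×1 output projection. A minimal illustration (the channel counts and ranks below are hypothetical, not taken from the paper):

```python
def conv_params(c_in, c_out, k):
    # weights in a standard 2D convolution: c_out * c_in * k * k
    return c_out * c_in * k * k

def tucker_conv_params(c_in, c_out, k, r_in, r_out):
    # Tucker-2 factorization splits one conv into three smaller ones:
    # 1x1 projection (c_in -> r_in), k x k core (r_in -> r_out),
    # and 1x1 restoration (r_out -> c_out)
    return c_in * r_in + r_in * r_out * k * k + r_out * c_out

full = conv_params(256, 256, 3)                      # 589,824 weights
factored = tucker_conv_params(256, 256, 3, 64, 64)   # 69,632 weights
print(f"compression ratio: {full / factored:.1f}x")  # roughly 8.5x fewer weights
```

The same ranks also shrink the multiply–accumulate count, which is where the energy and CO2 savings claimed above come from; the accuracy trade-off depends on how the ranks are chosen.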

23 pages, 2073 KB  
Article
Leveraging Deep Learning for Time-Series Extrinsic Regression in Predicting the Photometric Metallicity of Fundamental-Mode RR Lyrae Stars
by Lorenzo Monti, Tatiana Muraveva, Gisella Clementini and Alessia Garofalo
Sensors 2024, 24(16), 5203; https://doi.org/10.3390/s24165203 - 11 Aug 2024
Cited by 3 | Viewed by 3606
Abstract
Astronomy is entering an unprecedented era of big-data science, driven by missions like the ESA’s Gaia telescope, which aims to map the Milky Way in three dimensions. Gaia’s vast dataset presents a monumental challenge for traditional analysis methods. The sheer scale of these data exceeds the capabilities of manual exploration, necessitating advanced computational techniques. In response to this challenge, we developed a novel approach leveraging deep learning to estimate the metallicity of fundamental mode (ab-type) RR Lyrae stars from their light curves in the Gaia optical G-band. Our study explores the application of deep-learning techniques, particularly advanced neural-network architectures, to predicting photometric metallicity from time-series data. Our deep-learning models demonstrated notable predictive performance, with a low mean absolute error (MAE) of 0.0565, a root mean square error (RMSE) of 0.0765, and a high R2 regression performance of 0.9401, measured by cross-validation. The weighted mean absolute error (wMAE) is 0.0563, while the weighted root mean square error (wRMSE) is 0.0763. These results showcase the effectiveness of our approach in accurately estimating metallicity values. Our work underscores the importance of deep learning in astronomical research, particularly with large datasets from missions like Gaia. By harnessing deep-learning methods, we can analyze vast datasets with precision, contributing to more accurate and comprehensive insights into complex astronomical phenomena. Full article
(This article belongs to the Collection Machine Learning and AI for Sensors)
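The metrics quoted above (MAE, RMSE, R^2, and their weighted variants) can all be computed from the same residuals. A minimal sketch, with hypothetical function and variable names (not the authors' evaluation code):

```python
import math

def regression_metrics(y_true, y_pred, weights=None):
    """(Weighted) MAE, RMSE and R^2 for a set of regression predictions.
    With weights=None all points count equally, giving the unweighted metrics."""
    n = len(y_true)
    w = weights if weights is not None else [1.0] * n
    wsum = sum(w)
    mae = sum(wi * abs(t - p) for wi, t, p in zip(w, y_true, y_pred)) / wsum
    rmse = math.sqrt(sum(wi * (t - p) ** 2 for wi, t, p in zip(w, y_true, y_pred)) / wsum)
    mean_t = sum(wi * t for wi, t in zip(w, y_true)) / wsum
    ss_res = sum(wi * (t - p) ** 2 for wi, t, p in zip(w, y_true, y_pred))
    ss_tot = sum(wi * (t - mean_t) ** 2 for wi, t in zip(w, y_true))
    return mae, rmse, 1.0 - ss_res / ss_tot
```

Passing per-star weights (for example, inverse variances of the spectroscopic metallicities) yields the wMAE and wRMSE variants reported in the abstract.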

16 pages, 5483 KB  
Review
A Needle in a Cosmic Haystack: A Review of FRB Search Techniques
by Kaustubh M. Rajwade and Joeri van Leeuwen
Universe 2024, 10(4), 158; https://doi.org/10.3390/universe10040158 - 28 Mar 2024
Cited by 7 | Viewed by 2431
Abstract
Ephemeral Fast Radio Bursts (FRBs) must be powered by some of the most energetic processes in the Universe. That makes them highly interesting in their own right and as precise probes for estimating cosmological parameters. This field thus poses a unique challenge: FRBs must be detected promptly and immediately localised and studied based only on that single millisecond-duration flash. The problem is that burst occurrence is highly unpredictable and that their great distances strongly suppress their brightness. Since the discovery of FRBs in single-dish archival data in 2007, detection software has evolved tremendously. Pipelines now detect bursts in real time within a matter of seconds, operate on interferometers, buffer high time- and frequency-resolution data, and issue real-time alerts to other observatories for rapid multi-wavelength follow-up. In this paper, we review the components that comprise an FRB search software pipeline, discuss the proven techniques adopted from pulsar searches, highlight newer, more efficient techniques for detecting FRBs, and conclude by discussing proposed novel future methodologies that may power the search for FRBs in the era of big data astronomy. Full article
(This article belongs to the Special Issue New Insights in Fast Radio Bursts)
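At the heart of such pipelines is incoherent dedispersion: undoing the frequency-dependent cold-plasma delay before summing the frequency channels, so a dispersed pulse adds up coherently in time. A minimal brute-force sketch (function name and data layout are illustrative, not the review's code):

```python
def dedisperse(data, freqs_mhz, dm, dt_s):
    """Incoherent dedispersion: shift each frequency channel by its dispersion
    delay relative to the highest frequency, then sum over channels.
    data[chan][sample] holds intensities; freqs_mhz gives each channel's
    frequency in MHz; dm is the trial dispersion measure in pc cm^-3."""
    f_ref = max(freqs_mhz)
    n_samp = len(data[0])
    series = [0.0] * n_samp
    for chan, f in zip(data, freqs_mhz):
        # cold-plasma delay in seconds: ~4.149e3 s * DM * (f^-2 - f_ref^-2), f in MHz
        delay_s = 4.149e3 * dm * (f ** -2 - f_ref ** -2)
        shift = round(delay_s / dt_s)
        for i in range(n_samp):
            j = i + shift
            if 0 <= j < n_samp:
                series[i] += chan[j]
    return series
```

Real pipelines avoid this O(channels × samples × trial DMs) loop with tree or subband algorithms, which is precisely the efficiency concern the review discusses.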

4 pages, 15553 KB  
Proceeding Paper
Three-Dimensional Visualization of Astronomy Data Using Virtual Reality
by Gilles Ferrand
Phys. Sci. Forum 2023, 8(1), 71; https://doi.org/10.3390/psf2023008071 - 5 Dec 2023
Viewed by 1988
Abstract
Visualization is an essential part of research, both to explore one’s data and to communicate one’s findings with others. Many data products in astronomy come in the form of multi-dimensional cubes, and since our brains are tuned for recognition in a 3D world, we ought to display and manipulate these in 3D space. This is possible with virtual reality (VR) devices. Drawing from our experiments developing immersive and interactive 3D experiences from actual science data at the Astrophysical Big Bang Laboratory (ABBL), this paper gives an overview of the opportunities and challenges that are awaiting astrophysicists in the burgeoning VR space. It covers both software and hardware matters, as well as practical aspects for successful delivery to the public. Full article
(This article belongs to the Proceedings of The 23rd International Workshop on Neutrinos from Accelerators)

9 pages, 8999 KB  
Proceeding Paper
Bayesian and Machine Learning Methods in the Big Data Era for Astronomical Imaging
by Fabrizia Guglielmetti, Philipp Arras, Michele Delli Veneri, Torsten Enßlin, Giuseppe Longo, Lukasz Tychoniec and Eric Villard
Phys. Sci. Forum 2022, 5(1), 50; https://doi.org/10.3390/psf2022005050 - 15 Feb 2023
Viewed by 2105
Abstract
The Atacama Large Millimeter/submillimeter Array, with the planned electronic upgrades, will deliver an unprecedented number of deep and high-resolution observations. Wider fields of view become possible, with the consequential cost of image reconstruction. Alternatives to commonly used applications in image processing have to be sought and tested. Advanced image-reconstruction methods are critical to meet the data requirements needed for operational purposes. Astrostatistics and astroinformatics techniques are employed. Evidence is given that these interdisciplinary fields of study, applied to synthesis imaging, meet the Big Data challenges and have the potential to enable new scientific discoveries in radio astronomy and astrophysics. Full article

15 pages, 10788 KB  
Article
Launching the VASCO Citizen Science Project
by Beatriz Villarroel, Kristiaan Pelckmans, Enrique Solano, Mikael Laaksoharju, Abel Souza, Onyeuwaoma Nnaemeka Dom, Khaoula Laggoune, Jamal Mimouni, Hichem Guergouri, Lars Mattsson, Aurora Lago García, Johan Soodla, Diego Castillo, Matthew E. Shultz, Rubby Aworka, Sébastien Comerón, Stefan Geier, Geoffrey W. Marcy, Alok C. Gupta, Josefine Bergstedt, Rudolf E. Bär, Bart Buelens, Emilio Enriquez, Christopher K. Mellon, Almudena Prieto, Dismas Simiyu Wamalwa, Rafael S. de Souza and Martin J. Ward
Universe 2022, 8(11), 561; https://doi.org/10.3390/universe8110561 - 27 Oct 2022
Cited by 8 | Viewed by 5624
Abstract
The Vanishing & Appearing Sources during a Century of Observations (VASCO) project investigates astronomical surveys spanning a time interval of 70 years, searching for unusual and exotic transients. We present herein the VASCO Citizen Science Project, which identifies unusual candidates through three different approaches: hypothesis-driven, exploratory, and machine learning, the last of which is particularly useful for SETI searches. To address the big data challenge, VASCO combines three methods: the Virtual Observatory, user-aided machine learning, and visual inspection through citizen science. Here we demonstrate the citizen science project and its improved candidate selection process, and we give a progress report. We also present the VASCO citizen science network, led by amateur astronomy associations mainly located in Algeria, Cameroon, and Nigeria. At the time of writing, the citizen science project had carefully examined 15,593 candidate image pairs in the data (ca. 10% of the candidates) and had so far identified 798 objects classified as “vanished”. The most interesting candidates will be followed up with optical and infrared imaging, together with observations by the most powerful radio telescopes. Full article
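At its simplest, a vanished-source search reduces to cross-matching two epoch catalogues and keeping first-epoch sources with no second-epoch counterpart. A toy sketch using a flat-sky small-angle approximation (function name and match radius are illustrative, not VASCO's actual pipeline, which also involves machine learning and visual vetting):

```python
def vanished_candidates(epoch1, epoch2, radius_deg=0.001):
    """Return (ra, dec) sources from the first-epoch catalogue that have no
    counterpart within radius_deg in the second epoch.
    Uses a flat-sky distance, valid only for small separations away from poles."""
    out = []
    for ra1, dec1 in epoch1:
        matched = any((ra1 - ra2) ** 2 + (dec1 - dec2) ** 2 <= radius_deg ** 2
                      for ra2, dec2 in epoch2)
        if not matched:
            out.append((ra1, dec1))
    return out
```

A production version would use spherical geometry and a spatial index (e.g. HEALPix or a k-d tree) rather than this O(N*M) scan.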

17 pages, 2192 KB  
Article
SAX and Random Projection Algorithms for the Motif Discovery of Orbital Asteroid Resonance Using Big Data Platforms
by Lala Septem Riza, Muhammad Naufal Fazanadi, Judhistira Aria Utama, Khyrina Airin Fariza Abu Samah, Taufiq Hidayat and Shah Nazir
Sensors 2022, 22(14), 5071; https://doi.org/10.3390/s22145071 - 6 Jul 2022
Cited by 1 | Viewed by 2411
Abstract
The phenomenon of big data has occurred in many fields of knowledge, one of which is astronomy. One example of a large dataset in astronomy is that of numerically integrated time-series asteroid orbital elements spanning millions to billions of years. For example, the mean motion resonance (MMR) data of an asteroid are used to find out how long the asteroid was in a resonance state with a particular planet. For this reason, this research designs a computational model to obtain the mean motion resonance quickly and effectively by modifying and implementing the Symbolic Aggregate Approximation (SAX) algorithm and the motif discovery random projection algorithm on big data platforms (i.e., Apache Hadoop and Apache Spark). The model comprises the following five steps: (i) saving data into the Hadoop Distributed File System (HDFS); (ii) importing files to the Resilient Distributed Datasets (RDD); (iii) preprocessing the data; (iv) calculating the motif discovery by executing the User-Defined Function (UDF) program; and (v) gathering the results from the UDF to the HDFS and a .csv file. The results indicated a very significant reduction in computational time when using the big data platform instead of the standalone method. The proposed computational model obtained an average accuracy of 83% compared with the SwiftVis software. Full article
(This article belongs to the Special Issue Big Data Analytics in Internet of Things Environment)
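The SAX step of such a model can be illustrated in a few lines: z-normalise the series, compress it with Piecewise Aggregate Approximation (PAA), then map each segment mean to a letter via Gaussian breakpoints. A minimal single-machine sketch (the paper's Spark/Hadoop distribution layer and random projection step are omitted):

```python
import statistics

# Gaussian breakpoints for equiprobable symbols (standard SAX lookup table)
BREAKPOINTS = {3: [-0.43, 0.43], 4: [-0.67, 0.0, 0.67]}

def sax(series, n_segments, alphabet_size=4):
    """Symbolic Aggregate Approximation of a numeric time series:
    z-normalise, reduce to n_segments PAA means, emit one letter per segment."""
    mu = statistics.fmean(series)
    sigma = statistics.pstdev(series) or 1.0  # guard against constant series
    z = [(x - mu) / sigma for x in series]
    seg_len = len(z) / n_segments
    word = []
    for s in range(n_segments):
        lo, hi = round(s * seg_len), round((s + 1) * seg_len)
        paa = sum(z[lo:hi]) / (hi - lo)
        # letter index = number of breakpoints below the segment mean
        letter = sum(paa > b for b in BREAKPOINTS[alphabet_size])
        word.append("abcd"[letter])
    return "".join(word)
```

Motif discovery then slides a window over such words and hashes randomly chosen positions (the random projection step) to find repeated resonance patterns.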

14 pages, 396 KB  
Article
A Survey of Big Data Archives in Time-Domain Astronomy
by Manoj Poudel, Rashmi P. Sarode, Yutaka Watanobe, Maxim Mozgovoy and Subhash Bhalla
Appl. Sci. 2022, 12(12), 6202; https://doi.org/10.3390/app12126202 - 18 Jun 2022
Cited by 8 | Viewed by 4399
Abstract
The rise of big data has resulted in the proliferation of numerous heterogeneous data stores. Even though multiple models are used for integrating these data, combining such huge amounts of data into a single model remains challenging. Database management archives need to handle huge volumes of data that have no particular structure and come from unconnected and unrelated sources. These data are growing in size and thus demand special attention. The speed at which these data grow, as well as the varied data types stored in scientific archives, poses further challenges. Astronomy, too, is increasingly becoming a data-intensive science that relies on extensive processing of assorted data. These data are now stored in domain-specific archives. Many astronomical studies produce large-scale archives of data, which are then published in the form of data repositories. These consist mainly of unstructured images and text, in addition to data with some structure, such as relations with key values. When the archives are published as remote data repositories, it is challenging to organize the data against their increasing diversity and to meet the information demands of users. To address this problem, polystore systems present a new model of data integration and have been proposed to access unrelated data repositories using a single, uniform query language. This article highlights the polystore system for integrating large-scale heterogeneous data in the astronomy domain. Full article
(This article belongs to the Topic Data Science and Knowledge Discovery)

10 pages, 2011 KB  
Communication
Evaluation of HPC Acceleration and Interconnect Technologies for High-Throughput Data Acquisition
by Alessandro Cilardo
Sensors 2021, 21(22), 7759; https://doi.org/10.3390/s21227759 - 22 Nov 2021
Cited by 2 | Viewed by 3316
Abstract
Efficient data movement in multi-node systems is a crucial issue at the crossroads of scientific computing, big data, and high-performance computing. It impacts demanding data acquisition applications from high-energy physics to astronomy, where dedicated accelerators such as FPGA devices play a key role, coupled with high-performance interconnect technologies. Building on the outcome of the RECIPE Horizon 2020 research project, this work evaluates the use of high-bandwidth interconnect standards, namely InfiniBand EDR and HDR, along with remote direct memory access functions for direct exposure of FPGA accelerator memory across a multi-node system. The prototype we present aims at avoiding dedicated network interfaces built into the FPGA accelerator itself, leaving most of the resources for user acceleration and supporting state-of-the-art interconnect technologies. We present the details of the proposed system and a quantitative evaluation in terms of end-to-end bandwidth as concretely measured with a real-world FPGA-based multi-node HPC workload. Full article
(This article belongs to the Special Issue Intelligent IoT Circuits and Systems)

12 pages, 573 KB  
Article
Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7
by Zhiyuan Hu, Jianyu Chen, Bin Jiang and Wenyu Wang
Universe 2021, 7(11), 438; https://doi.org/10.3390/universe7110438 - 15 Nov 2021
Cited by 10 | Viewed by 2124
Abstract
The search for special and rare celestial objects has always played an important role in astronomy. Cataclysmic Variables (CVs) are special and rare binary systems with accretion disks. Most CVs are in the quiescent period, and their spectra have the emission lines of the Balmer series, HeI, and HeII. A few CVs in the outburst period have the absorption lines of the Balmer series. Owing to the scarcity of their numbers, expanding the spectral data of CVs is of positive significance for studying the formation of accretion disks and the evolution of binary star system models. At present, research on astronomical spectra has entered the era of Big Data. The Large Sky Area Multi-Object Fiber Spectroscopy Telescope (LAMOST) has produced tens of millions of spectra. The latest release, LAMOST-DR7, includes 10.6 million low-resolution spectra in 4926 sky regions, providing ideal data support for searching for CV candidates. To process and analyze these massive amounts of spectral data, this study employed the Light Gradient Boosting Machine (LightGBM) algorithm, an ensemble tree model, to conduct the search in LAMOST-DR7 automatically. Finally, 225 CV candidates were found, and four new CV candidates were verified by SIMBAD and published catalogs. This study also built Gradient Boosting Decision Tree (GBDT), Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost) models and used Accuracy, Precision, Recall, the F1-score, and the ROC curve to compare the four models comprehensively. Experimental results showed that LightGBM is more efficient. The search for CVs based on LightGBM not only enriches the existing CV spectral library but also provides a reference for the data mining of other rare celestial objects in massive spectral data. Full article
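The four classifiers are compared with Accuracy, Precision, Recall and F1, which all derive from the confusion matrix of a binary CV / non-CV labelling. A minimal sketch of those metrics (function name illustrative, not the study's code):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, Precision, Recall and F1 from binary labels (1 = CV candidate)."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0   # purity of flagged candidates
    rec = tp / (tp + fn) if tp + fn else 0.0    # completeness of real CVs
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1
```

With rare objects like CVs, Accuracy alone is misleading (predicting "non-CV" everywhere scores highly), which is why Precision, Recall and F1 are reported alongside it.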

15 pages, 2771 KB  
Article
Supernovae Detection with Fully Convolutional One-Stage Framework
by Kai Yin, Juncheng Jia, Xing Gao, Tianrui Sun and Zhengyin Zhou
Sensors 2021, 21(5), 1926; https://doi.org/10.3390/s21051926 - 9 Mar 2021
Cited by 4 | Viewed by 4508
Abstract
A series of sky surveys were launched in search of supernovae, generating a tremendous amount of data and pushing astronomy into a new era of big data. However, manually identifying and reporting supernovae can be a disastrous burden, because such data are enormous in quantity with sparse positives. While traditional machine learning methods can be used to deal with such data, deep learning methods such as Convolutional Neural Networks demonstrate more powerful adaptability in this area. However, most data in the existing works are either simulated or lack generality. How state-of-the-art object detection algorithms perform on real supernova data is largely unknown, which greatly hinders the development of this field. Furthermore, existing works on supernova classification usually assume the input images are properly cropped with a single candidate located in the center, which is not true for our dataset. Moreover, the performance of existing detection algorithms can still be improved for the supernova detection task. To address these problems, we collected and organized all the known objects of the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) and the Popular Supernova Project (PSP), resulting in two datasets, and then compared several detection algorithms on them. After that, the selected Fully Convolutional One-Stage (FCOS) method is used as the baseline and further improved with data augmentation, an attention mechanism, and a small object detection technique. Extensive experiments demonstrate the great performance enhancement of our detection algorithm with the new datasets. Full article
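Evaluating a detector like FCOS on such datasets typically rests on intersection-over-union (IoU) between predicted and ground-truth boxes: a candidate counts as a true positive only above some IoU threshold. A minimal sketch (not the authors' evaluation code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # zero if boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

For the tiny objects typical of supernova cutouts, small pixel shifts change IoU sharply, which is one reason small-object detection techniques matter here.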

4 pages, 265 KB  
Proceeding Paper
Signal Processing Techniques Intended for Peculiar Star Detection in APOGEE Survey
by Raul Santovena, Arturo Manchado and Carlos Dafonte
Proceedings 2019, 21(1), 32; https://doi.org/10.3390/proceedings2019021032 - 1 Aug 2019
Viewed by 1739
Abstract
Like other disciplines, astronomy faces the era of Big Data, where the analysis and discovery of specific objects is a significant and non-trivial matter. The APOGEE survey and the Gaia mission are good examples of how these kinds of projects have increased the amount of data to be managed. In this context, we have developed an algorithm to search for specific features in the APOGEE database. The main purpose is to seek spectral lines, in either absorption or emission, across the whole APOGEE database, in order to find chemically peculiar stars. We propose an algorithm that has been validated using cerium lines, and we have applied it to the search for other chemical compounds. Full article
(This article belongs to the Proceedings of The 2nd XoveTIC Conference (XoveTIC 2019))
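A line search of this kind can be caricatured as flagging pixels that deviate from a running-median pseudo-continuum. A toy sketch with illustrative window and threshold values (not the authors' algorithm, which works on APOGEE's normalised spectra):

```python
import statistics

def find_lines(wavelength, flux, window=11, threshold=0.05):
    """Flag emission/absorption features as pixels deviating from a
    running-median pseudo-continuum by more than `threshold`
    (in normalised flux units). Returns (wavelength, kind) pairs."""
    half = window // 2
    lines = []
    for i in range(len(flux)):
        lo, hi = max(0, i - half), min(len(flux), i + half + 1)
        continuum = statistics.median(flux[lo:hi])
        if abs(flux[i] - continuum) > threshold:
            kind = "emission" if flux[i] > continuum else "absorption"
            lines.append((wavelength[i], kind))
    return lines
```

A real search would additionally require the flagged pixels to sit at the rest wavelengths of the target species (e.g. cerium) after correcting for radial velocity.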

9 pages, 5402 KB  
Article
Science Pipelines for the Square Kilometre Array
by Jamie Farnes, Ben Mort, Fred Dulwich, Stef Salvini and Wes Armour
Galaxies 2018, 6(4), 120; https://doi.org/10.3390/galaxies6040120 - 20 Nov 2018
Cited by 17 | Viewed by 4779
Abstract
The Square Kilometre Array (SKA) will be both the largest radio telescope ever constructed and the largest Big Data project in the known Universe. The first phase of the project will generate on the order of five zettabytes of data per year. A critical task for the SKA will be its ability to process data for science, which will need to be conducted by science pipelines. Together with polarization data from the LOFAR Multifrequency Snapshot Sky Survey (MSSS), we have been developing a realistic SKA-like science pipeline that can handle the large data volumes generated by LOFAR at 150 MHz. The pipeline uses task-based parallelism to image, detect sources and perform Faraday tomography across the entire LOFAR sky. The project thereby provides a unique opportunity to contribute to the technological development of the SKA telescope, while simultaneously enabling cutting-edge scientific results. In this paper, we provide an update on current efforts to develop a science pipeline that can enable tight constraints on the magnetised large-scale structure of the Universe. Full article
(This article belongs to the Special Issue The Power of Faraday Tomography)
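The Faraday tomography step mentioned above rests on rotation measure synthesis: a Fourier-like transform of the complex polarisation sampled in wavelength squared, peaking at each sightline's Faraday depth. A minimal sketch of the discrete approximation with uniform weights (not the pipeline's task-parallel implementation):

```python
import cmath

def rm_synthesis(lam2, P, phi_grid):
    """Approximate the Faraday dispersion function amplitude
    |F(phi)| = |(1/N) * sum_k P(lam2_k) * exp(-2i * phi * lam2_k)|
    for complex polarisation samples P at wavelength-squared values lam2 (m^2),
    evaluated on a grid of Faraday depths phi (rad m^-2)."""
    N = len(lam2)
    return [abs(sum(p * cmath.exp(-2j * phi * l2) for p, l2 in zip(P, lam2)) / N)
            for phi in phi_grid]
```

A Faraday-thin source with rotation measure phi0 has P(lam2) proportional to exp(2i * phi0 * lam2), so |F| peaks at phi0; per-sightline transforms like this parallelise naturally across the sky, matching the task-based approach described above.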
