1. Introduction
Unmanned Aerial Vehicles (UAVs), initially designed for military and defense missions, have undergone rapid technological development in recent decades, and their applications have become increasingly widespread. Due to advances in sensor technology, artificial intelligence, and battery efficiency, UAVs are more accessible and versatile nowadays. Thus, naturally, UAVs (commonly known as drones) are now used in a number of civilian sectors, including agriculture [1,2,3], environmental monitoring [4,5,6,7], disaster management [8,9], logistics [10,11], and even entertainment [12].
With the spread of UAV use, the number of studies conducted in various fields has also increased significantly [13,14,15]. Various studies address regulatory challenges and innovative applications in topics such as the design, development, and optimization of UAV navigation systems [16,17,18]. Interdisciplinary studies also examine topics such as the social and economic impacts of UAVs [19,20,21].
Given the wide range of UAV applications, it is crucial to stay up-to-date on the latest trends and developments in the field. Therefore, an analysis of the existing literature is necessary to provide a comprehensive overview and identify key areas of focus. In this study, we perform a bibliometric analysis to map the research on UAVs. Our goal is to identify both past and emerging trends in the field and to highlight key contributors (countries, institutions, and authors). For this, the following two databases were used to extract relevant research articles: Web of Science and Scopus. Data are analyzed using the bibliometrix package in the statistical program R (version 4.3.2) [22], with the objective of understanding publication patterns, citation dynamics, and research impact. The results are presented through visualizations, such as keyword co-occurrence maps and citation trends, to offer an understanding of the UAV studies.
Understanding the development and trajectory of UAV research is crucial for several reasons: for example, to support researchers in identifying gaps in the current knowledge base or for funding agencies to prioritize investment areas. The results can also help policy-makers design regulations that promote innovation while ensuring security and privacy. In addition, this bibliometric analysis can provide directions for fundamental works and key contributors.
The paper is structured as follows: Section 2 presents a brief history of UAVs and their applications. The research questions are shown in Section 3. In Section 4, the materials and methodology are detailed, followed by the presentation of the results in Section 5, which includes publication trends, citation analysis, and collaboration patterns. Section 6 interprets the findings in the context of broader research trends and implications for future studies. Finally, Section 7 concludes with a summary of key insights and recommendations for the field of UAVs.
2. The History of UAVs and Their Applications
The history of UAVs provides fundamental knowledge about the development of UAV technology, highlighting the most important innovations and milestones that have shaped this field. This historical context helps identify technological and research trends, illustrating shifts in focus and advancements over time. It also highlights past challenges and the innovative solutions that addressed them and provides information on current methods and technologies.
UAVs have a long history of development, reflecting important advances in both military and civilian applications. The origins of drone technology date back to World War I, when the concept of unmanned flight was first explored. The first drone was the Kettering Bug, an experimental aerial torpedo developed by the United States in 1918. Although some tests were successful, it was never used for combat purposes. Nevertheless, the Kettering Bug marked the beginning of UAVs [23,24].
During World War II, the Radioplane OQ-2 was created, which was the first mass-produced UAV [25]. It was used for target practice by the US military. This UAV can be considered an important milestone, as it demonstrated the practical application of UAVs in military training. Further advancements in drone technology were made during the Cold War era, mainly in reconnaissance missions. The Lockheed D-21 was developed in the 1960s and was one of the first high-speed, high-altitude reconnaissance drones used by the United States. The Ryan Firebee also appeared during this period and was a versatile drone used for both reconnaissance and target practice [26,27].
The modern era of UAVs began in the late 20th century, when more sophisticated and versatile models appeared. In the 1990s and early 2000s, the US expanded its drone capabilities with the development of the MQ-1 Predator. It was originally designed for reconnaissance, but later became armed with missiles [28]. The deployment of the MQ-9 Reaper marked a new level of drone warfare: it is larger and more capable than the Predator, designed for long-endurance surveillance and armed strike missions [29].
However, today UAVs can also be used for civilian purposes. Civilian applications of drone technology have grown significantly in the past two decades. UAVs have quickly become important tools in a number of civilian sectors, including agriculture, environmental monitoring, emergency response, and delivery services. In agriculture, they can be utilized for precision farming. Equipped with multispectral or hyperspectral cameras, drones enable farmers to monitor crop health, assess soil conditions, and manage irrigation with unprecedented efficiency. These UAVs can cover large areas quickly and provide high-resolution data that help optimize crop yields and reduce costs. Empirical studies demonstrate that drones can greatly improve the accuracy of crop monitoring and aid in agricultural decision-making [30].
In environmental monitoring and conservation efforts, UAVs are used to survey wildlife populations, track animal migrations, and monitor deforestation and habitat destruction. The ability to capture real-time data from hard-to-reach areas makes UAVs important for researchers and conservationists. For example, drones have been used to monitor orangutan populations in Borneo, providing critical data to support conservation strategies [31].
In disaster response and management, UAVs can offer rapid assessment capabilities by delivering real-time images and data to rescue teams. During natural disasters, they can be quickly deployed to assess damage, locate survivors, and deliver necessary supplies, even in areas that are too dangerous or inaccessible for human rescue teams [32].
UAVs can also be used for infrastructure inspection and maintenance [33]. They can be advantageous for commercial package delivery in remote or congested urban areas where traditional delivery methods are inefficient. Recent research highlights the potential of UAVs to transform logistics by enabling faster and more flexible delivery services [34].
The film and photography industries also use UAVs. Drones equipped with high-resolution cameras can take aerial photographs that previously required expensive helicopter rentals, making aerial photography accessible to amateur filmmakers and photographers [35,36].
Overall, civilian use of UAVs continues to expand, driven by technological advances and growing recognition of their practical benefits. As the technology becomes more sophisticated, the potential applications of drones in civilian life are also likely to increase, offering innovative solutions to a number of challenges. Therefore, as UAVs are used in more and more areas, it is critical to assess the direction in which such research is heading.
4. Materials and Methods
The following two subsections describe the materials and methods used in this study. Section 4.1 details the collection and preprocessing of the data from the two scientific databases mentioned earlier. Section 4.2 then presents the analysis of the collected records.
4.1. Data Collection and Preprocessing
As mentioned in the introduction, we used two scientific databases to search for bibliographic records: Web of Science (WoS) and Scopus. Data collection took place on 28 June 2024. We used the following search terms:
“Unmanned Aerial Vehicle” OR “UAV” OR “Drone” OR
“Unmanned Aircraft System” OR “UAS” OR “Remotely Piloted Aircraft” OR
“Quadcopter” OR “Multirotor” OR “Flying Robot”
We used this query to search titles, abstracts, author keywords, and additional keywords in both scientific databases. We did not specify a time interval. The query initially returned 51,997 WoS records and 146,155 Scopus records, for a total of 198,152 raw items. These raw exports were imported into R using the bibliometrix package [22]. This package was chosen because it is widely used in bibliometric research [37,38,39,40,41,42,43,44,45]. With it, the records were converted to the bibliometrix schema (i.e., fields such as TI, AB, DE/ID, SO, PY, DI, C1, TC, AU). Afterward, they were harmonized into a single data frame with snake_case names (title, abstract, keywords, source, year, doi, affiliations, citations). Titles, abstracts, and author keywords were concatenated for text mining while preserving the original fields.
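As an illustration of the harmonization step, the following minimal Python sketch renames tagged fields to the snake_case names listed above and derives a concatenated text field. The study itself used R; the mapping table, the function name `harmonize`, the derived `text` column, and the sample record are assumptions made for illustration (only the DE tag is mapped to keywords here, although the text mentions DE/ID).

```python
# Hypothetical mapping from bibliometrix field tags to snake_case names.
FIELD_MAP = {
    "TI": "title",
    "AB": "abstract",
    "DE": "keywords",
    "SO": "source",
    "PY": "year",
    "DI": "doi",
    "C1": "affiliations",
    "TC": "citations",
}


def harmonize(record):
    """Rename tagged fields and add a concatenated text field for mining."""
    row = {new: record.get(tag, "") for tag, new in FIELD_MAP.items()}
    # The original fields are kept; 'text' is an extra, derived column.
    parts = (row["title"], row["abstract"], row["keywords"])
    row["text"] = " ".join(part for part in parts if part)
    return row


# Made-up record for illustration only.
sample = {
    "TI": "UAV path planning",
    "AB": "We study drones.",
    "DE": "uav; planning",
    "PY": 2020,
}
row = harmonize(sample)
print(row["text"])
```

Keeping the derived field separate from the source fields means later steps (de-duplication on titles, text mining on the concatenation) never overwrite the exported metadata.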
As the next step, we removed duplicates in three stages. First, we merged records with the same DOI into a single representative record. Second, we performed exact matching on the normalized titles, keeping one record for each identical normalized title $\tilde{t}$. Third, we applied blockwise fuzzy matching based on the Jaro–Winkler distance ($d_{\mathrm{JW}}$) within blocks formed by the publication year and the first $k$ characters of $\tilde{t}$ (for a fixed prefix length $k$) [46]. We marked pairs $(i, j)$ as duplicates if $d_{\mathrm{JW}}(\tilde{t}_i, \tilde{t}_j) \le \delta$, with $\delta$ a fixed distance threshold. Blocking avoids $O(n^2)$ comparisons and reduces false merges. After removing duplicates, 129,124 unique records remained.
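The three-stage procedure can be sketched in Python as follows. This is an illustrative single-pass variant, not the pipeline used in the study: the prefix length `k=8`, the distance cutoff `max_dist=0.05`, the title normalization, and the sample records are all assumptions, and the Jaro–Winkler distance is implemented directly with the usual prefix scaling of 0.1.

```python
import re
from collections import defaultdict


def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    n1, n2 = len(s1), len(s2)
    if n1 == 0 or n2 == 0:
        return 0.0
    window = max(max(n1, n2) // 2 - 1, 0)
    hit1, hit2 = [False] * n1, [False] * n2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(n2, i + window + 1)):
            if not hit2[j] and s2[j] == ch:
                hit1[i] = hit2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that line up in a different order.
    out_of_order, j = 0, 0
    for i in range(n1):
        if hit1[i]:
            while not hit2[j]:
                j += 1
            out_of_order += s1[i] != s2[j]
            j += 1
    t = out_of_order / 2
    return (matches / n1 + matches / n2 + (matches - t) / matches) / 3


def jaro_winkler_distance(s1, s2, p=0.1):
    """1 - Jaro-Winkler similarity; the shared prefix is capped at 4 chars."""
    sim = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return 1.0 - (sim + prefix * p * (1.0 - sim))


def normalize_title(title):
    """Lowercase and collapse punctuation/whitespace runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records, k=8, max_dist=0.05):
    """Apply the DOI, exact-title, and blockwise fuzzy checks per record."""
    seen_doi, seen_title = set(), set()
    blocks = defaultdict(list)  # (year, title prefix) -> kept titles
    kept = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        title = normalize_title(rec["title"])
        if doi:
            if doi in seen_doi:
                continue  # stage 1: same DOI
            seen_doi.add(doi)
        if title in seen_title:
            continue  # stage 2: identical normalized title
        block = blocks[(rec.get("year"), title[:k])]
        if any(jaro_winkler_distance(title, t) <= max_dist for t in block):
            continue  # stage 3: near-identical title within the block
        seen_title.add(title)
        block.append(title)
        kept.append(rec)
    return kept


# Made-up records for illustration only.
records = [
    {"doi": "10.1/abc", "title": "UAV Path Planning with Reinforcement Learning", "year": 2021},
    {"doi": "10.1/ABC", "title": "UAV path planning with reinforcement learning.", "year": 2021},
    {"doi": "", "title": "UAV Path-Planning with Reinforcement Learning", "year": 2021},
    {"doi": "", "title": "UAV path planning with reinforcment learning", "year": 2021},
    {"doi": "", "title": "Honeybee drone congregation areas", "year": 2021},
]
kept = deduplicate(records)
print([r["title"] for r in kept])
```

Because fuzzy comparisons only happen inside a (year, prefix) block, each record is compared with a handful of candidates rather than with the whole corpus; the price is that near-duplicates whose titles differ within the first `k` characters are not caught at this stage.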
Although our explicit search terms do not include the word “swarm”, records related to insects may still be included due to the ambiguity of the word drone (male bee) in titles/abstracts and the index/author keywords provided by the database, which link the word swarm to entomological items. To reliably separate UAV-related records from insect-related ones, we applied a two-stage procedure. We first constructed regular expression families that capture unambiguous contexts:
$\mathcal{R}_{\mathrm{UAV}}$: UAV-specific terms (e.g., drone(s), uav(s), quadrotor/quadcopter, aerial robot, formation control);
$\mathcal{R}_{\mathrm{INS}}$: entomological terms (e.g., bee, honeybee, wasp, pollination).
Then, we combined the explicit and regular expression families with a UAV context rule for the word swarm ($\mathcal{R}_{\mathrm{CTX}}$) and a finely tuned classifier, removing insect-related noise while retaining articles related to UAV swarm/coordination.
$\mathcal{R}_{\mathrm{CTX}}$: flight- and control-related terms (e.g., flight, aerial, airborne, multi-agent, planner/controller, cooperative, communication link, antenna).
For labeling, let $x_i$ denote the concatenation of the title, abstract, and keywords of record $i$. Weak labels were assigned as follows (Equation (1)):
This strict rule-based pre-labeling step yielded 128,148 positive (UAV) and 25 negative (insect) records, while 951 items remained unlabeled.
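A rule-based pre-labeling step of this kind could look like the Python sketch below. The exact expressions and decision rule of the study are not reproduced here: the patterns are built only from the example terms listed above, and the rule (UAV evidence without insect terms gives a positive label, insect terms without UAV evidence give a negative label, everything else stays unlabeled) is an assumption.

```python
import re

# Illustrative patterns only, assembled from the example terms in the text.
UAV_RE = re.compile(
    r"\b(drones?|uavs?|quadrotors?|quadcopters?|aerial robots?|formation control)\b"
)
INSECT_RE = re.compile(r"\b(bees?|honeybees?|wasps?|pollination)\b")
CONTEXT_RE = re.compile(
    r"\b(flight|aerial|airborne|multi-agent|planner|controller|cooperative|"
    r"communication link|antennas?)\b"
)
SWARM_RE = re.compile(r"\bswarms?\b")


def weak_label(text):
    """Return 1 (UAV), 0 (insect), or None (left for the classifier)."""
    t = text.lower()
    insect = bool(INSECT_RE.search(t))
    # "swarm" counts as UAV evidence only next to flight/control context.
    swarm_ctx = bool(SWARM_RE.search(t) and CONTEXT_RE.search(t))
    uav = bool(UAV_RE.search(t)) or swarm_ctx
    if uav and not insect:
        return 1
    if insect and not uav:
        return 0
    return None  # conflicting or missing evidence


print(weak_label("Formation control for a quadrotor swarm"))
print(weak_label("Honeybee drone congregation and swarm behaviour"))
```

Leaving conflicting records unlabeled, rather than forcing a decision, keeps the weak labels high-confidence, which matters because they later serve as training data.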
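The unlabeled set was then resolved using a logistic regression classifier with elastic-net regularization (implemented via the glmnet package [47]). The model was trained on high-confidence weak labels with unigram TF-IDF features (Equation (2)):
$$\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log \frac{N}{\mathrm{df}(t)}, \tag{2}$$
where $N$ is the number of documents, $\mathrm{df}(t)$ is the document frequency of token $t$, and $\mathrm{tf}(t, d)$ is the frequency of token $t$ in document $d$. With sparse vectors $x_i \in \mathbb{R}^p$, the optimization problem is (Equation (3))
$$\min_{\beta_0, \beta} \; -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \left(\beta_0 + x_i^{\top} \beta\right) - \log\left(1 + e^{\beta_0 + x_i^{\top} \beta}\right) \right] + \lambda \left[ \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 + \alpha \lVert \beta \rVert_1 \right], \tag{3}$$
where $y_i \in \{0, 1\}$ is the weak label, and the regularization strength $\lambda \ge 0$ and the mixing parameter $\alpha \in [0, 1]$ are cross-validated. Training was performed on a balanced sample for scalability, and the full corpus was scored in batches.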
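To make the feature construction concrete, the following Python sketch computes sparse unigram TF-IDF weights and the elastic-net penalty in the form documented for glmnet. It is illustrative only: whitespace tokenization, the unsmoothed weighting tf · log(N/df), the function names, and the toy documents are assumptions, and no model fitting is shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Sparse unigram TF-IDF: one {token: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def elastic_net_penalty(beta, lam, alpha):
    """lam * ((1 - alpha) / 2 * ||beta||_2^2 + alpha * ||beta||_1)."""
    l2 = sum(b * b for b in beta)
    l1 = sum(abs(b) for b in beta)
    return lam * ((1 - alpha) / 2 * l2 + alpha * l1)


# Toy documents for illustration only.
docs = ["uav swarm control", "bee swarm behaviour", "uav path planning"]
vecs = tfidf_vectors(docs)
print(sorted(vecs[0].items()))
print(elastic_net_penalty([1.0, -2.0], lam=0.5, alpha=0.5))
```

Tokens that occur in every document receive weight zero under this weighting, so corpus-wide terms contribute nothing to the classifier, while the L1 part of the penalty drives many remaining coefficients to exactly zero.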
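The decision threshold $\tau^{\ast}$ was selected to meet a predefined precision target $\pi_0$ on a held-out validation split (Equation (4)):
$$\tau^{\ast} = \min \left\{ \tau \in [0, 1] : \mathrm{Precision}_{\mathrm{val}}(\tau) \ge \pi_0 \right\}. \tag{4}$$
The final labels were assigned as $\hat{y}_i = \mathbb{1}\left[\hat{p}_i \ge \tau^{\ast}\right]$, where $\hat{p}_i$ is the predicted probability that record $i$ is UAV-related. Borderline cases ($\hat{p}_i$ close to $\tau^{\ast}$) were manually inspected to prevent leakage of insect-related articles and to retain relevant multi-UAV research.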
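A precision-targeted threshold can be chosen along the following lines. The Python sketch assumes that the smallest threshold meeting the target is wanted (which maximizes recall subject to the precision constraint); the function name, the validation scores, the labels, and the target value are made up for illustration.

```python
def pick_threshold(scores, labels, target_precision):
    """Smallest score threshold whose validation precision meets the target.

    Returns None when no candidate threshold reaches the target.
    """
    best = None
    for tau in sorted(set(scores), reverse=True):
        predicted = [y for s, y in zip(scores, labels) if s >= tau]
        precision = sum(predicted) / len(predicted)
        if precision >= target_precision:
            best = tau  # smallest qualifying threshold seen so far
    return best


# Hypothetical validation scores and true labels (1 = UAV, 0 = insect).
val_scores = [0.95, 0.9, 0.8, 0.6, 0.4, 0.2]
val_labels = [1, 1, 1, 0, 1, 0]

tau = pick_threshold(val_scores, val_labels, target_precision=0.9)
final = [int(s >= tau) for s in val_scores]
print(tau, final)
```

Only observed scores are tried as candidate thresholds, since precision can change only at those values.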
Among the previously unlabeled records, the classifier identified 951 additional UAV publications. The final dataset therefore comprises 129,099 UAV-related (positive) and 25 non-UAV (negative) records, forming a reliable corpus for subsequent trend, topic, and network analyses.
In summary, the PRISMA 2020-like flow was as follows:
198,152 for the merged raw corpus;
129,124 after the de-duplication;
128,148 (positive), 25 (negative), and 951 (unlabeled) after the rule-based method;
951 were model-positive among the unlabeled records;
129,099 were in the final UAV dataset.
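The counts in this flow can be checked for internal consistency with a few lines of arithmetic. The Python sketch below uses only the numbers reported in this section; the variable names are illustrative.

```python
# Counts reported in the text.
wos_raw, scopus_raw = 51_997, 146_155
after_dedup = 129_124
rule_positive, rule_negative, unlabeled = 128_148, 25, 951
model_positive = 951

raw_total = wos_raw + scopus_raw
duplicates_removed = raw_total - after_dedup
final_uav = rule_positive + model_positive

# Each stage must account for every record of the previous stage.
assert raw_total == 198_152
assert rule_positive + rule_negative + unlabeled == after_dedup
assert final_uav + rule_negative == after_dedup

print(raw_total, duplicates_removed, final_uav)
```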
As database indexing for the most recent calendar year is incomplete, the time series data for that year are marked as partial and trends are interpreted accordingly.
4.2. Data Analysis
To understand the bibliometric trends and their evolution, we analyze the co-occurrence of keywords and co-authorship networks. Each record adds a unique set of keywords; the co-occurring pairs accumulate an edge weight, resulting in a global network. We calculated a degree-weighted score for each year.
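The construction of the keyword network can be illustrated with a short Python sketch. It reads the "degree-weighted score" as the weighted degree (the sum of incident edge weights) of each keyword within a year, which is one possible interpretation rather than the study's definition; the record layout and the sample data are likewise assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_by_year(records):
    """Return {year: Counter({(kw_a, kw_b): weight})} with kw_a < kw_b."""
    edges = defaultdict(Counter)
    for rec in records:
        # Each record contributes a unique (de-duplicated) keyword set.
        keywords = sorted({k.strip().lower() for k in rec["keywords"] if k.strip()})
        for pair in combinations(keywords, 2):
            edges[rec["year"]][pair] += 1
    return edges


def weighted_degree(edge_counter):
    """Sum of incident edge weights for every keyword."""
    strength = Counter()
    for (a, b), w in edge_counter.items():
        strength[a] += w
        strength[b] += w
    return strength


# Made-up records for illustration only.
records = [
    {"year": 2020, "keywords": ["UAV", "path planning", "reinforcement learning"]},
    {"year": 2020, "keywords": ["path planning", "reinforcement learning"]},
    {"year": 2021, "keywords": ["uav", "antenna"]},
]
edges = cooccurrence_by_year(records)
scores_2020 = weighted_degree(edges[2020])
print(scores_2020.most_common(2))
```

Summing the yearly counters gives the global network, while ranking keywords by their yearly score yields the per-year topic lists.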
We then extract the most important topics of each year, which reveal how methods (e.g., reinforcement learning) fit into the fundamental problems of UAVs (e.g., path planning between dynamic obstacles, collision avoidance, energy and communication constraints) and how these relationships evolve over time. To characterize collaboration, we build an undirected co-author network from author pairs within records and identify bridging actors based on their betweenness centrality, thereby operationalizing “close alliances” and hub structures.
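The identification of bridging actors can be sketched as follows. The Python example builds an unweighted, undirected co-author graph and computes unnormalized betweenness centrality with Brandes' algorithm; the author labels and records are made up, and the study may have used weighted or normalized variants.

```python
from collections import deque
from itertools import combinations


def coauthor_graph(author_lists):
    """Undirected co-author graph as {author: set(neighbours)}."""
    adj = {}
    for authors in author_lists:
        unique = sorted(set(authors))
        for a in unique:
            adj.setdefault(a, set())
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes, unweighted graph)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


# Two made-up author groups joined by a single C-D collaboration.
papers = [["A", "B", "C"], ["C", "D"], ["D", "E", "F"]]
bc = betweenness(coauthor_graph(papers))
print(sorted(bc.items(), key=lambda kv: -kv[1])[:2])
```

In this toy network, authors C and D carry every shortest path between the two groups, which is the structural signature of a bridge.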
The performance of countries may reflect the size of their population; therefore, we report per capita indicators (using publicly available population data). The importance of a venue is measured by a focus index, which represents the proportion of UAV-related articles out of all publications appearing in that venue during the study period.
For the linkage analysis, only abstracts (excluding title text) were considered, and strict dictionaries were used to minimize false positives. Reinforcement learning (RL) studies were identified based on the terms “reinforcement learning” and “deep reinforcement learning”; terms consisting solely of abbreviations (e.g., PPO, DDPG, SAC) were not considered. With broader specifications (title + abstract, abbreviations allowed), the number of selected RL studies increases. This expansion slightly increases the proportion of articles dealing with communication and collaboration, but our main conclusions remain unchanged.
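The difference between the strict and the broader specification can be illustrated with a small Python sketch. The regular expressions, function names, and sample records are assumptions for illustration; only the two full phrases and the three example abbreviations come from the text.

```python
import re

# Strict dictionary: full phrases only, matched in the abstract.
STRICT_RL = re.compile(r"\b(deep\s+)?reinforcement\s+learning\b", re.IGNORECASE)
# Broader variant: also accepts the abbreviations named in the text.
ACRONYMS = re.compile(r"\b(PPO|DDPG|SAC)\b")


def is_rl_strict(record):
    """Abstract only, abbreviations ignored."""
    return bool(STRICT_RL.search(record.get("abstract", "")))


def is_rl_broad(record):
    """Title + abstract, abbreviations allowed."""
    text = f'{record.get("title", "")} {record.get("abstract", "")}'
    return bool(STRICT_RL.search(text) or ACRONYMS.search(text))


# Made-up records for illustration only.
records = [
    {"title": "Deep Reinforcement Learning for UAV routing", "abstract": "We propose a planner."},
    {"title": "UAV control", "abstract": "A reinforcement learning agent avoids obstacles."},
    {"title": "UAV relay", "abstract": "We train a DDPG policy."},
    {"title": "Crop mapping", "abstract": "Multispectral imaging."},
]
strict_hits = [is_rl_strict(r) for r in records]
broad_hits = [is_rl_broad(r) for r in records]
print(sum(strict_hits), sum(broad_hits))
```

The strict variant trades recall for precision: it misses studies that name the method only in the title or only by abbreviation, but it avoids ambiguous abbreviations that can carry other meanings.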
To avoid trivial saturation (i.e., the artificial result that the ratio equals one when both the numerator and the denominator come exclusively from the UAV-specific corpus), we determine the denominator using an external reference dataset covering all topics. Specifically, we obtained venue-level publication counts across all research areas from OpenAlex [48,49]. For each venue $v$, and limited to the time period corresponding to our study, we calculate the estimated venue focus as
$$\widehat{F}_v = \frac{n_v^{\mathrm{UAV}}}{n_v^{\mathrm{all}}},$$
where $n_v^{\mathrm{UAV}}$ is the number of UAV-positive records of venue $v$ in our corpus and $n_v^{\mathrm{all}}$ is the total number of works of venue $v$ reported by OpenAlex.
This mixed-source ratio provides useful information for ranking based on UAV concentration but cannot be considered a strictly “true” ratio. The numerator comes from our cleaned WoS and Scopus UAV corpus, while the denominator uses aggregated data from OpenAlex. The figures therefore assume comparable coverage and the same publication period.
To implement this measure, we normalized the names of venues appearing in bibliographic fields (e.g., SO/SRCTITLE) and assigned them to OpenAlex source identifiers. For each mapped venue, we retrieved the total number of works published on all research topics in the same year range, grouped by year of publication and summed. We linked these totals to the UAV-positive counts to calculate $\widehat{F}_v$. To reduce noise due to small sample sizes, we limited the analysis to venues with at least 100 UAV-positive records in our corpus. When matching venues, ISSN-/eISSN-based resolution was given priority; name-based alignment with serial-level normalization (removing year or serial number markers) was only applied when ISSNs were not available, using a strict similarity threshold (Jaro–Winkler: ≥0.92). Venues where the external denominator covering all topics was smaller than our UAV count were excluded, as such inconsistencies indicate mapping errors. Thus, $\widehat{F}_v$ reflects not only UAV-related hits, but also the relative concentration of UAV-related works in the overall publication output of the venue during the same years and thus serves as a normalized indicator of venue-level thematic specialization.
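The computation and the two exclusion rules can be sketched in Python as follows. The function name, the venue names, and all counts are invented for illustration; retrieving the totals from OpenAlex and resolving identifiers are not shown.

```python
def estimated_focus(uav_counts, total_counts, min_uav=100):
    """UAV-positive records divided by all works of the venue.

    Venues are skipped when they have fewer than `min_uav` UAV-positive
    records, no external total, or an external total below the UAV count
    (which points to an identifier-mapping error).
    """
    focus = {}
    for venue, n_uav in uav_counts.items():
        n_all = total_counts.get(venue)
        if n_uav < min_uav or n_all is None:
            continue
        if n_all < n_uav:
            continue
        focus[venue] = n_uav / n_all
    return focus


# Hypothetical venues and counts for illustration only.
uav_counts = {"Venue A": 250, "Venue B": 120, "Venue C": 40, "Venue D": 300}
total_counts = {"Venue A": 1000, "Venue B": 60, "Venue C": 400, "Venue D": 12000}

focus = estimated_focus(uav_counts, total_counts)
ranking = sorted(focus, key=focus.get, reverse=True)
print(ranking)
```

In this toy example, the venue with the most UAV records ranks below a smaller but more specialized one, which is the distinction the focus index is meant to expose.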
6. Discussion
The aim of this study was to provide a reproducible, context-sensitive summary of the literature on UAVs and to answer the five guiding questions (RQ1–RQ5) with evidence that goes beyond a bibliographical listing. The dataset was created by combining exports from Web of Science and Scopus, removing duplicates in three matching stages, and applying a two-step content filter that combines rules and a fine-tuned classifier.
Regarding RQ1, the annual series (1955–2024) shows long-term growth, which accelerated after 2010, in line with the phase when low-cost sensing, embedded computing, and open-source tools reached maturity. The last calendar year of the series is marked as partial, so the decline visible at the right-hand edge should be interpreted as an indexing delay rather than a structural decline. More informative than this single dip is the thematic connection that emerges over time: reinforcement learning appears more and more frequently on the topic timeline alongside route/path planning and collision avoidance, indicating a shift from manual control to data-driven decision-making in the face of uncertainty and dynamics. At the same time, two technical constraints (energy and communication) remain. This is consistent with the endurance- and connectivity-related performance and reliability constraints faced by real-world platforms and explains why seemingly “algorithmic” advances are often developed in parallel with performance, aerial, and network architecture considerations.
In the case of RQ2, country-level aggregates are misleading on their own, as they mix scientific output with population size and index coverage. To separate these effects, we examined three complementary indicators together: the volume in the corpus (who publishes the most), the MCP (multiple-country publication) proportion (who collaborates most across borders), and the intensity per capita (who is most specialized in relation to the population). In practice, this gives a more nuanced picture: large systems dominate in absolute numbers; externally oriented systems rank high in MCP even with moderate volume; and small but specialized systems are visible in per capita values. Institutional rankings are useful for discovery, but they should be read with caution regarding size; to reduce the number of entries originating from department-level strings, we normalized the affiliations parsed from the single C1 field to recognizable parent universities. In the case of authors, raw name counts are unreliable because of homonymy, so we focus on network position: authors with high betweenness act as bridges that connect the perception, control, and communication communities and therefore concentrate integrative contributions. This network perspective indicates where cross-domain synthesis actually occurs better than volume alone.
The answer to RQ3 shows the following results: A coherent picture emerges at different levels. Countries with high MCP are centers of cross-border work; leading institutions form dense inter-institutional clusters; and the aforementioned integrating authors ensure shortcuts between clusters. Conceptually, these three layers form the spine from which practical advances originate: externally oriented systems challenge researchers with heterogeneous limitations and datasets; strong institutional anchors amortize infrastructure; and bridging authors transmit ideas between subfields, enabling communication-aware design, perception-aware control, and safety in learning.
Regarding RQ4, the analysis deliberately goes beyond a list of frequent words and asks why the expressions appear together. On the topic timeline, RL became attached to planning and collision avoidance precisely when platforms, simulators, and onboard computers enabled training and deployment on a large scale. Hardware and link layer terms (e.g., antennas) continue to play a central role in the co-occurrence networks, not as trivial terms, but as indicators that real-time operations with multiple UAVs are constrained by bandwidth and delay limitations. Specifically, in the RL articles, we quantify how the methods address the problems: multi-agent/collaboration (about a quarter of the RL articles), communication constraints (about two-thirds), and energy constraints (about one-seventh) are mentioned together in non-trivial proportions, with additional but smaller signals for uncertainty and dynamic obstacles. These proportions support the claim that the methodological frontier of the field is tied closely to endurance and connectivity constraints rather than advancing independently of them.
When addressing RQ5, in simple frequency tables, synonyms for “UAV” inevitably appear, which are tautological with the search query and therefore analytically meaningless. This study reduces the weight of such tautologies and interprets common expressions through their intersections: control trigrams (sliding mode control, model predictive control) refer to robust and predictive strategies for underactuated platforms; perception/vision terms (convolutional neural network, synthetic aperture radar) denote the spine of perception; and optimization terms (particle swarm optimization, coverage/route planning) reveal how search and allocation problems are formulated. In other words, we use the conceptual surface as a structure for the scientific problem, not as an end in itself.
Finally, discovery requires not only knowing where articles related to UAVs appear, but also where the topic is concentrated. To this end, venue analysis supplements the counts in the corpus with an “estimated focus” index, which divides UAV-related articles by an external denominator based on the total output of the venue during the same period. The high-focus publications identified in this way provide useful guidance for readers who are looking for ongoing UAV-related coverage rather than occasional special issues.
Although the process was designed to be reproducible and scalable, several limitations affect the results and open up specific opportunities for future work. First, coverage is limited by the underlying bibliographic sources: Web of Science and Scopus differ in terms of journal and conference proceeding coverage, regional distribution, and update frequency. Even after harmonizing fields and removing duplicates, gaps remain (e.g., unindexed regional conferences, workshop proceedings, or very recent items). The right margin of the time series is particularly sensitive to indexing delays; we explicitly mark the last calendar year as partial, but any short-term changes should be interpreted with caution. Second, affiliation information is parsed from a single free-text field, which improves recall across export formats but inherits noise from heterogeneous author-provided strings. Normalization to parent institutions reduces this variance but cannot fully resolve ambiguous department-level labels or institutional name changes. Similarly, country detection from affiliations can misclassify edge cases (multi-campus institutions, special administrative regions, etc.). Third, author-level analyses are susceptible to homonymy and name variants. We mitigate this by emphasizing network position (betweenness centrality) over raw name counts, yet betweenness centrality is still a proxy for brokerage rather than a direct measure of intellectual contribution, and it can be affected by incomplete metadata (missing co-authors, split identities). Fourth, the content filter is tuned for high precision to avoid false positives. However, this design decision may under-recall borderline UAV items that use atypical terminology or metaphorical language. Similarly, co-mention indicators (e.g., reinforcement learning with energy or communication constraints) are evidence of thematic proximity, not causal deployment or performance gains. Full-text, method-level extraction would be necessary to determine implementation details and outcomes. Fifth, venue “estimated focus” relies on external denominators queried programmatically; while this reduces bias within the corpus, it results in dependence on third-party coverage, rate limits, and the accuracy of identifier mapping. Sixth, linguistic and domain biases remain: English-language articles, computer science and engineering venues, and open-access publications predominate over UAV application areas (e.g., law, ethics, politics), which limits the scope of ecosystem-wide statements.
These limitations suggest specific next steps:
With regard to data sources, supplementing WoS/Scopus with preprint servers (e.g., arXiv), patent databases, and regional indexes to improve early detection and geographical balance;
Tracking versions to ensure the verifiability of dynamic updates;
When resolving entities, using persistent identifiers (ORCID for authors, ROR/GRID for institutions) and probabilistic disambiguation to reduce homonymy and affiliation noise;
In terms of understanding content, if licenses allow, considering not only abstracts but also full texts, and using modern NLP (entity linking, section-sensitive relation extraction) to distinguish between mentions and methods used, as well as to quantify results (e.g., increase in endurance, delay budget), rather than just considering common mentions;
With regard to indicators, supplementing the national indicators per capita with R&D workforce denominators (number of researchers per million people, R&D expenditure) to separate specialization from size and pairing collaboration indicators (MCP, betweenness position) with outcome-oriented proxy indicators (e.g., cross-cutting citations, articles setting benchmarks);
For modeling, moving from static co-occurrences to dynamic theme embeddings and time-varying community detection to capture changes in problem formulation with uncertainty boundaries (e.g., transition from single-UAV control to multi-agent autonomy);
For quality assurance, hardening the pipeline with pre-registered sensitivity analyses (e.g., varying classifier thresholds, alternative swarm/insect rule sets), human verification of borderline items, and bootstrapped uncertainty for all key ratios;
For knowledge transfer, presenting the analysis as an open, re-runnable workflow with an easy-to-use dashboard that updates the counts, co-occurrence rates, and venue focus as new records arrive.
Collectively, these enhancements can refine attribution, improve coverage, and shift the synthesis away from related factors toward what works, when, and under what constraints—which is ultimately the most important question for the UAV community.
In summary, the field has expanded in scale and reorganized around data-driven autonomy under real-world constraints. Collaboration patterns highlight the intermediaries through whom this reorganization takes place, while thematic evidence links advances in learning and planning to persistent energy and communication bottlenecks. Methodologically, the contribution lies in a transparent pipeline that retains the swarm and coordination literature while filtering out insect-related noise, combined with scale-aware indicators that avoid overinterpretation of raw counts. Substantively, the emerging picture is one of convergence: perception, control, and communication are increasingly co-designed and progress at scale depends as much on endurance and link reliability as on algorithmic innovation.
Overall, throughout the paper we identified the following research gaps:
Trade-offs between endurance and payload: energy constraints systematically limit real-world applications; comparative studies based on standardized mission profiles are rare.
Communication reliability in urban/suburban environments: coordination with multiple UAVs often assumes stable connections; the number of design benchmarks that take delay/jitter into account is limited.
Method–problem integration: RL applications focus on path/collision planning, yet standardized datasets, simulation–reality protocols, and ablation reports remain uneven.
Evaluation standards: few cross-site, reproducible benchmarks link guidance, observation, and network constraints.
Identity resolution in science measurement: limited use of ORCID/ROR hinders the clarification of authors/affiliations; the use of persistent identifiers would improve comparability between studies.