1. Introduction
In the digital era, computer vision technology, a cornerstone of the artificial intelligence domain, is being rapidly integrated into a wide range of industries. This technology provides innovative and efficient solutions to complex problems, as clearly demonstrated by its applications across multiple sectors [1]. In the medical field, it facilitates accurate disease diagnosis, such as the precise detection of lung nodules [2,3,4,5], thereby increasing early detection rates and enhancing treatment outcomes. In industrial production, it supports rigorous quality control by identifying product defects in real time [6,7,8], reducing waste and improving product standards. Within the security sector, it enables advanced monitoring systems to swiftly recognize potential threats, thereby strengthening public safety. In the transportation industry, computer vision serves as a foundational component for autonomous driving technologies [9,10,11,12,13], which aim to deliver safer and more efficient mobility.
Central to these applications is the development of efficient and precise vertical domain models. The effectiveness of these models depends fundamentally on the availability of high-quality datasets [14,15,16]. Such datasets allow models to acquire a comprehensive understanding of diverse features and patterns, equipping them to produce reliable predictions and decisions.
Nevertheless, as computer vision technology becomes more deeply embedded in specialized fields, the demands on datasets are intensifying. Owing to the dynamic and evolving nature of real-world environments, datasets must exhibit both timeliness and diversity. In the medical domain, for instance, the continuous mutation of diseases and the emergence of new variants necessitate regular updates to datasets to include novel cases and symptoms [17,18]. This ongoing refinement ensures that models retain their accuracy in diagnosis and prediction. In industrial production, the introduction of new products and the optimization of manufacturing processes require datasets to adapt promptly to changing production conditions [19,20,21]. Failure to update datasets may lead to diminished performance in defect detection and process optimization. In the engineering sector, such as in water conservancy projects [1], data reflect rapid and continuous variations: river water levels and quality fluctuate in response to seasonal patterns, climatic conditions, and anthropogenic influences, and the surface water extent of the Haihe River Basin changes with precipitation levels and water usage. Constructing a dynamic dataset capable of capturing these shifts in a timely manner is essential for maintaining the accuracy and relevance of data used in decision-making.
From a research standpoint, as investigations advance, datasets require ongoing refinement. This process encompasses adding new data, updating existing records, and removing outdated entries. Moreover, the tasks associated with datasets and their categorical structures often demand repeated adjustments. Consequently, it is essential to improve both the efficiency and quality of dataset annotation and construction from a dynamic and adaptable perspective.
In practical applications, as data continue to accumulate, integrating the most recent information into existing datasets becomes vital for enhancing model recognition capabilities. A dynamic dataset capable of accommodating temporal and spatial variations not only strengthens model performance but also serves as a critical driver for the expanded implementation of artificial intelligence in specialized domains. Therefore, the effective development of such dynamic datasets has emerged as a prominent research focus.
Nevertheless, most current research efforts predominantly emphasize optimizing model accuracy and recall rates. Although these metrics are important, this line of research tends to offer limited practical value and faces challenges in achieving widespread adoption in the near term [1]. In reality, the availability of large volumes of high-quality data and robust data-processing techniques is often the primary contributor to improved model outcomes. For computer vision as a technical discipline, it is increasingly important to embrace an engineering-oriented and practical approach. Addressing problems through strategies aimed at lowering usage costs, enhancing training and inference efficiency, strengthening generalization capabilities, and reducing the thresholds and complexities of deployment is more meaningful and impactful [1].
A closer examination of existing challenges reveals that the application of computer vision technology in current vertical fields remains insufficiently developed. From an engineering perspective, there is a notable lack of systematic management methodologies and standardized processes for constructing large-scale dynamic datasets tailored to specific domains. Furthermore, there is a significant shortage of efficient automated annotation tools capable of accommodating dynamic changes within datasets.
From a management perspective, the construction of datasets involves a complex project management process comprising numerous interconnected stages. Some teams lack the necessary experience and suitable platforms, which can result in project delays and the inability to meet established quality standards. Personnel management presents additional difficulties. Annotators are required to demonstrate advanced professional competencies, and the expectations placed upon them are stringent.
Regarding quality, several key issues persist. First, annotation accuracy remains low due to disparities in annotator abilities, with some lacking adequate professional training. When confronted with complex data, such annotators are more susceptible to errors. Additionally, the limitations of annotation tools contribute to reduced accuracy, particularly when handling intricate datasets. Second, annotation consistency is often poor. Variations in annotators’ interpretations and applications of standards can lead to inconsistent results for the same data batch, undermining overall data quality and complicating downstream processing. For instance, object boundary annotations frequently exhibit discrepancies among annotators. Furthermore, the absence of effective collaborative management mechanisms hampers the coordinated use of tools and reduces annotation efficiency.
In terms of efficiency, traditional manual annotation methods [22,23,24,25,26,27,28,29] continue to face significant challenges. Classic annotation tools currently in use include the following: (1) LabelImg [22], an open-source image annotation tool supporting basic shapes such as rectangles, commonly employed for object detection tasks. It allows annotation results to be saved in formats like PASCAL VOC and YOLO. Owing to its user-friendly interface, it is widely adopted among developers. (2) LabelMe [28,29], initially developed by the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology, offers a broad range of annotation functionalities including instance segmentation, semantic segmentation, bounding box annotation, and image classification. It supports various annotation shapes such as rectangles, circles, line segments, and points, and can also annotate video data. (3) RectLabel [24], primarily designed for object detection and image instance segmentation annotation, supports output formats including YOLO, KITTI, COCO JSON, and CSV, and can read and write XML files in the PASCAL VOC format. This tool is widely utilized on Mac platforms. (4) Make Sense [24], recommended by the developers of YOLOv5 and Roboflow, is an open-source tool built using TypeScript. However, it has not received code updates over the past two years, and responsiveness to community inquiries remains relatively low.
These annotation tools demonstrate clear limitations when confronted with dynamically evolving data. Manual annotation not only demands considerable time and financial resources but also fails to accommodate rapidly changing classification requirements. Additionally, manual processes are susceptible to subjective biases, which compromise the consistency and accuracy of annotation outcomes. When dataset tasks or data are updated, substantial repetitive work is often required to maintain dataset integrity.
In response to these challenges, intelligent annotation technologies have been developed. For instance, automatic annotation variants of LabelImg are typically enhancements or extensions of the original LabelImg tool. Some notable examples include the following: (1) AutoLabelImg, an improved version of LabelImg, which operates by training a model on a set of pre-annotated images and then integrating the trained model into the annotation tool to automatically process the remaining unlabeled data. The model's detection outputs can be converted into annotation files such as .xml or .voc formats. Although this tool significantly increases annotation efficiency, manual verification and correction of automatically generated annotations remain necessary. (2) LabelGo-YOLOv5AutoLabelImg, a semi-automatic annotation tool built upon LabelImg and YOLOv5, utilizes a pre-trained YOLOv5 model running on the PyTorch 1.7.0 framework to semi-automatically annotate datasets. After the user reviews and confirms the relevant information and selects the desired model, automatic annotation is executed, followed by the option to adjust and save the annotations according to specific project requirements.
At present, several modern AI-powered solutions have substantially advanced automation capabilities, leading to significant improvements in work efficiency. Examples include Supervisely, Roboflow, and Scale AI. Among these, Supervisely has demonstrated notable progress in managing dynamic classification changes and implementing automatic data augmentation techniques.
These intelligent annotation tools enable automatic data labeling, markedly increasing both annotation efficiency and accuracy. Despite these advantages, current intelligent tools exhibit certain limitations. First, they struggle to adequately address research tasks requiring dynamic classification adjustments and practical applications that vary with temporal and spatial factors. Second, existing intelligent annotation systems encounter challenges when dealing with evolving classification demands and complex geometric transformations. When confronted with new classification requirements or shifts in data distribution, it is often necessary to retrain the model or manually revise annotation strategies. Moreover, during data augmentation processes, re-annotation is typically required. For example, when objects in an image undergo rotation, the annotation boxes may become misaligned, resulting in inaccurate labels. These challenges collectively reduce annotation efficiency and consistency, significantly limiting the flexibility and practical utility of intelligent annotation tools.
In light of the aforementioned challenges, this paper proposes several innovative solutions. First, it introduces a novel multi-person collaboration model. In addition to the traditional roles of annotators, reviewers, and administrators, the model incorporates personnel dedicated to data augmentation and automatic data annotation. This addition reduces the operational burden on general annotators while enhancing both the quality and efficiency of the annotation process. Second, it optimizes the dataset construction workflow by presenting a comprehensive process encompassing problem identification, target classification, data collection and cleaning, annotation, review, data augmentation, automatic annotation, re-review, data distribution, and final output. Each component of this process is tightly integrated, ensuring effective collaboration and maintaining high data quality, thereby facilitating the seamless progression of annotation work and providing robust datasets for model training. Third, the paper introduces the design of automatic annotation tools: an image data classification adjustment tool and an automatic annotation tool for post-augmentation processing. The former includes automated utilities for modifying or deleting classification labels, which algorithmically adjust label files to accommodate changes in classification requirements. The latter automatically aligns the position and orientation of annotation boxes according to the rotation angle of images and outputs labels in YOLO format, significantly enhancing annotation efficiency and precision.
The integration of a collaborative multi-role framework, a dataset construction process adaptable to dynamic changes, and specialized automatic annotation tools collectively offers both a theoretical foundation and methodological guidance for the development of large-scale, vertical domain model datasets.
2. Method
2.1. Design of the Multi-Role Collaboration Framework
In dataset annotation management, the rational assignment of personnel roles is crucial for improving efficiency, fully utilizing specialized expertise, promoting effective collaboration, and ensuring high-quality outcomes. In traditional manual data annotation workflows, user roles can typically be categorized into three main groups [30]:
Annotators, who are responsible for performing data annotation tasks and are generally individuals with relevant professional training. In specific contexts or industries with exceptionally stringent quality requirements, such as medical imaging or water conservancy, they may include model training personnel (e.g., programmers) or domain experts directly engaged in annotation activities.
Reviewers, who oversee the verification of annotated data, conduct proofreading and statistical checks, promptly correct errors, and address missing annotations. This role is often assumed by experienced annotators or authoritative experts within the field.
Administrators, who manage annotation personnel, coordinate the assignment and collection of annotation tasks, and ensure smooth workflow execution. These roles are interdependent, complementing one another while fulfilling distinct responsibilities. Each is an essential component of the overall data annotation process. Moreover, since annotated datasets are frequently used to train machine learning and artificial intelligence algorithms, model training personnel are required to build models using manually annotated data, while product evaluation personnel must repeatedly validate model annotation performance to assess whether it meets deployment requirements.
To enable the efficient construction of dynamic datasets for vertical domain models, this study proposes a multi-role collaboration framework encompassing key stages such as data augmentation, automatic annotation, manual review, and dynamic updates. This integrated, closed-loop collaboration process is designed to ensure both data quality and timeliness.
This paper adds two roles: data augmentation personnel and automatic data annotation personnel, who are responsible for data augmentation and automatic annotation, respectively. While enhancing data uniformity, this addition reduces the computer-operation demands placed on ordinary annotators, enabling them to focus on the annotation quality and efficiency of the original data. The detailed responsibilities of each added role are summarized in Table 1.
2.2. Dataset Construction and Process with Multi-Role Collaboration
The dataset construction process plays a pivotal role in ensuring data quality and integrity, enhancing processing efficiency, and enabling scalable data expansion [31]. As illustrated in Figure 1, traditional dataset construction processes are typically designed for static datasets. When datasets undergo operations such as augmentation, addition, or deletion of data, annotation work must often be repeated from the beginning, leading to substantial redundant labor.
A key limitation of the traditional process depicted in Figure 1 is that any expansion of the dataset or modifications such as the addition, removal, or alteration of data categories render the original annotation files invalid. This necessitates re-annotation using manual annotation tools. When a considerable volume of annotated data has already been accumulated, this repetitive work becomes particularly onerous and inefficient.
This study adopts data annotation within a multi-person collaboration framework as an illustrative case and proposes a comprehensive data annotation process, as depicted in Figure 2.
The dataset construction and annotation workflow follows these sequential steps: problem orientation, identification target classification (including classification and naming), original image collection and task assignment, image set naming and rule design, image review, data annotation, annotated data review, data augmentation, automatic annotation, augmented dataset naming and label generation, data review, automated data distribution, and final data output, as shown in Figure 2.
Problem Orientation: The data annotation process begins with problem orientation. Establishing a dataset requires first identifying the approximate application scenario and the specific problem the dataset aims to address. Large-scale datasets can support multiple problem orientations, each corresponding to distinct application scenarios and target identifications.
Identification Target Determination and Classification: Based on the established problem orientation, the next step is to specify the targets to be identified and perform the relevant classification work. This process generates a correspondence table that links problem orientations with the classification of identification targets.
Data Collection: The data collection phase encompasses acquiring diverse types of data, including videos and images. Since collected data may contain issues such as missing values, noise, or duplicate entries, it is essential to perform data cleaning tasks to ensure the dataset comprises high-quality, reliable data.
File Naming by Classification: After filtering, the cleaned data are named according to the class name defined in the target classification correspondence table, followed by an image sequence number.
Key Step—Classification Modification Management: During the project, if deletions or modifications occur within the target classification scheme, the administrator pauses annotation tasks, issues an updated classification correspondence table, and employs a custom-developed tool to revise existing annotations accordingly. Once the updates are complete, annotators receive notifications to resume work following the new annotation guidelines, thereby preventing large-scale annotation errors stemming from data inconsistency.
Data Annotation: In this step, the administrator divides the data requiring annotation into distinct annotation tasks based on specific annotation requirements. Each task may have unique specifications and annotation point criteria. Multiple annotators are assigned to complete each annotation task to ensure thorough coverage and accuracy.
Data Review: Following completion of annotation, annotators submit their work to the administrator, who conducts a comprehensive review of the annotated image data.
Data Augmentation: After the initial review, data augmentation is applied, particularly for datasets with limited sample sizes or, when necessary, across the entire dataset. This process includes an additional review phase to monitor the quality of the augmented data at the source, minimize annotation errors, and avoid the compounding of inaccuracies through data augmentation.
Annotation Based on Augmentation Rules: Data augmentation is executed according to well-defined rules. For instance, when performing geometric transformations, such as rotating an image by 90 degrees or flipping it horizontally, annotations are adjusted based on these operations. An automatic annotation tool is then utilized to generate annotations for the augmented data, with the resulting annotation information comprehensively recorded in the annotation dataset. During this process, clear links between each piece of augmented data and its corresponding original data, along with the applied augmentation operations, are documented to facilitate future data usage and management. Crucially, metadata from before and after augmentation are distinguished through a systematic file-naming convention.
Data Review: Automatically annotated data are opened using annotation software (e.g., LabelImg 1.5.1) for a thorough review to verify annotation accuracy.
Data Automatic Distribution: Upon passing the review, the system automatically partitions the data into training and testing sets, providing notifications if certain classes contain insufficient sample sizes (a minimal sketch of this step follows this list).
Data Output: The finalized dataset is output for use by model trainers.
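To make the automatic distribution step concrete, the following is a minimal Python sketch, assuming the input is a list of (image, class) pairs; the `distribute` helper name, the 80/20 split ratio, and the 20-sample warning threshold are illustrative assumptions, not values prescribed by this workflow.

```python
import random
from collections import defaultdict

def distribute(samples, train_ratio=0.8, min_per_class=20, seed=42):
    """Per-class split of (image, class) pairs into training and testing sets,
    with a warning for under-represented classes. All parameter values here
    are illustrative assumptions."""
    by_class = defaultdict(list)
    for image, cls in samples:
        by_class[cls].append(image)
    rng = random.Random(seed)
    train, test = [], []
    for cls, images in by_class.items():
        if len(images) < min_per_class:
            print(f"warning: class '{cls}' has only {len(images)} samples")
        rng.shuffle(images)
        cut = int(len(images) * train_ratio)
        train.extend(images[:cut])
        test.extend(images[cut:])
    return train, test
```

Splitting within each class rather than over the whole pool keeps every category represented in both sets, which matches the per-class sufficiency notifications described above.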
Finally, the annotated data are employed to train the target algorithm model. The quality of annotated data is primarily evaluated by reviewers, who conduct model tests and relay the test results to model training personnel. Based on the feedback, model training personnel iteratively adjust the model parameters to achieve optimal performance. If the desired model performance cannot be attained despite parameter tuning, this indicates deficiencies in the annotated data. In such cases, reviewers communicate the identified data issues to annotators, who then reassign annotation tasks as necessary, thereby maintaining a closed-loop workflow. Ultimately, reviewers submit the final model performance indicators to product evaluation personnel, who conduct the concluding assessment before deployment.
2.3. Design of Automatic Annotation Tools
Building upon the dataset construction process described above, highly automated and adaptable tools are essential to support efficient data annotation. This paper introduces two automatic annotation tools: (1) an image data classification modification tool—Biaogai Jingling—and (2) an automatic annotation tool for post-augmentation processing—Zhizeng Huigai.
Design of the Image Data Classification Tool—Biaogai Jingling
The design of the image data classification modification tool focuses on two main functionalities: (1) classification modification, implemented through an automatic image label adjustment tool, and (2) classification deletion, also handled through an automatic label adjustment mechanism.
In practical annotation workflows, it is inevitable that additions, deletions, or modifications of classifications will be encountered. Such changes necessitate corresponding adjustments to existing annotated data. Traditional annotation tools rely on manual updates to reflect classification modifications, resulting in substantial inefficiencies, repetitive work, and increased labor costs. This study develops a universal algorithm for automatically modifying and deleting image data label classifications, enabling rapid and efficient updates.
The basic principle of the image data classification modification algorithm involves several algorithmic concepts. The alteration of the classification list is a critical aspect. This change is not merely an information update but can restructure the classification order. For example, as shown in Table 2, image data initially classified in the order of algae, foam pollution, and garbage pollution may, due to new classification requirements, be reordered as floating objects, river color, and riverside buildings.
When the classification order changes, a significant challenge arises: annotations of previously labeled images become misaligned because annotation information is tightly coupled with the original classification order. Once the order is altered, existing annotations can no longer accurately correspond to the revised classification structure. For example, an image initially labeled as “River Color—Red” may be incorrectly reassigned to another category under the new classification scheme, leading to data inconsistency and inaccuracies.
To resolve this issue, this study proposes an effective solution that employs a correspondence table detailing the relationship between classifications before and after modification. Using this table, the algorithm automatically updates the label files, meticulously adjusting each annotation to align with the new classification system. This process ensures that image annotations remain consistent with the revised categories, preserving both data accuracy and usability.
1. Handling Changes in Classification Order: The correspondence table, provided before and after classification updates, serves as the foundation for the algorithm to automatically modify the label files, synchronizing annotations with the new classification order.
2. Managing Classification Deletions: The image data classification modification algorithm follows defined principles when addressing deleted classifications. Specifically, if a classification is removed, the associated label tag files of images labeled under that category are deleted, as the corresponding classification information no longer exists. Retaining these labels would cause data redundancy and confusion. Additionally, labels for categories positioned after the deleted classification are reassigned lower order numbers by decrementing their category indices by one. For instance, if the original category numbers are 2, 3, and 4, and category 2 is deleted, the remaining categories 3 and 4 are renumbered to 2 and 3, respectively. This approach preserves logical consistency in the classification order.
3. Handling Classification Name Modifications: When only the classification name is modified without altering the classification order, no label adjustment is required. In this scenario, existing annotation information remains compatible with the updated classification name, eliminating unnecessary computations and minimizing the risk of introducing errors.
This classification modification algorithm efficiently accommodates various scenarios during classification updates, ensuring both high efficiency and accuracy in image data management.
The basic operations of the algorithm are as follows:
Delete the label tag files associated with any removed classification.
Decrease the order numbers of labels following the deleted classification by one.
Refrain from modifying annotations when only the classification name changes but the order remains constant.
Execute updates systematically according to the established correspondence table when classification orders are changed.
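As a concrete illustration of these operations, below is a minimal Python sketch that rewrites YOLO-format .txt label files according to a classification correspondence table; the `remap_yolo_labels` helper and the directory layout are assumptions for illustration, not the released Biaogai Jingling tool.

```python
from pathlib import Path

def remap_yolo_labels(label_dir, mapping):
    """Rewrite YOLO .txt label files from a correspondence table.

    mapping: old class index -> new class index, or None if the class
    was deleted (its boxes are dropped; files left empty are removed)."""
    for path in Path(label_dir).glob("*.txt"):
        kept = []
        for line in path.read_text().splitlines():
            if not line.strip():
                continue
            cls, *coords = line.split()
            new_cls = mapping.get(int(cls))
            if new_cls is None:          # deleted classification: drop the box
                continue
            kept.append(" ".join([str(new_cls), *coords]))
        if kept:
            path.write_text("\n".join(kept) + "\n")
        else:                            # no annotations left for this image
            path.unlink()

# Example mirroring the deletion rule above: class 0 is removed and the
# indices of the remaining classes are decremented by one.
remap_yolo_labels("labels/", {0: None, 1: 0, 2: 1, 3: 2})
```

Because the mapping is applied in a single pass per file, renumbering and reordering cannot collide, which is the same reason the worked example below routes "Garbage Pollution" through a temporary out-of-range index.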
For example, the following sequence illustrates the algorithm's operations:
Delete images and labels categorized under "Algae Pollution" and then decrease the order numbers of the remaining labels by one, as shown in Table 3.
Change the label of "Garbage Pollution" from 1 to 5 (a value exceeding the total number of categories).
Reassign the label of "River Color—Red" from 2 to 1.
Update the label of "Garbage Pollution" from 5 to 2.
For newly added images of "Oil Film Pollution", perform manual annotation, as shown in Table 4.
2.4. Image Data Augmentation and Automatic Annotation Tool: Zhizeng Huigai
When constructing large-scale datasets, the limited quantity of original image data often necessitates the use of data augmentation techniques to expand the number of dataset images and balance different data categories. This approach improves the model’s robustness and generalization capabilities. Typically, data augmentation can increase the data volume severalfold, by tens or even hundreds of times, for instance, through image rotations at various angles. However, although these techniques resolve the issue of insufficient sample size, they significantly intensify the workload associated with data annotation, which becomes a major bottleneck.
To address this challenge, this section introduces a tool designed to simultaneously produce data augmentation and corresponding data annotation results, thereby overcoming current limitations in automation and low annotation efficiency. This tool is primarily intended for datasets using rectangular bounding box annotations and outputs image labels in YOLO format.
The basic principle of the automatic annotation algorithm is as follows:
For each original image, which corresponds to one or more bounding boxes, when the image is rotated, the associated annotation boxes rotate by the same angle.
The rotation angle of each image is recorded, and the algorithm automatically adjusts the positions and rotation angles of the corresponding annotation boxes to match the image’s rotation.
For a rotated image, every coordinate point of its bounding box is multiplied by the rotation matrix, ensuring the bounding box rotates around the image’s rotation center by the same angle. This guarantees that the annotation boxes remain accurately aligned with the rotated image.
In a two-dimensional plane, rotating a point (x, y) by an angle θ around the origin to obtain the coordinates (x′, y′) can be expressed with the rotation matrix

$$R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

The coordinates (x′, y′) of the point (x, y) after rotation are then obtained through the matrix multiplication

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = R \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}$$

That is, x′ = x cos θ − y sin θ and y′ = x sin θ + y cos θ. For example, when rotating the point (1, 0) counterclockwise by 90° around the origin, θ = 90°, cos θ = 0, and sin θ = 1. Substituting into the formula gives x′ = 0 and y′ = 1, so the coordinates of the rotated point are (0, 1).
This method enables accurate automatic annotation of augmented images. Furthermore, for each image in the augmented dataset, the tool generates file names by appending the rotation angle to the original image name and outputs label files whose names correspond to the augmented images. Each label file contains the results of the automatic annotation algorithm, ensuring a precise and efficient annotation process for the augmented dataset.
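The following is a minimal Python sketch of this principle for YOLO-normalized boxes, restricted to rotations that are multiples of 90° (where the rotated box stays axis-aligned and the transformation is exact; for arbitrary angles, the four rotated corners would instead be re-enclosed in an axis-aligned box). The helper names and the `_rot{angle}` naming pattern are illustrative assumptions, not the released Zhizeng Huigai tool.

```python
from pathlib import Path

def rotate_yolo_box(cx, cy, bw, bh, angle):
    """Rotate one normalized YOLO box (cx, cy, w, h) counterclockwise about
    the image centre by a multiple of 90 degrees. For 90 and 270 the image
    sides swap, so normalized coordinates transform exactly as below."""
    angle = angle % 360
    if angle == 0:
        return cx, cy, bw, bh
    if angle == 90:
        return cy, 1.0 - cx, bh, bw
    if angle == 180:
        return 1.0 - cx, 1.0 - cy, bw, bh
    if angle == 270:
        return 1.0 - cy, cx, bh, bw
    raise ValueError("this sketch only covers multiples of 90 degrees")

def annotate_rotated(label_path, angle):
    """Write the label file for the rotated copy of an image, appending the
    rotation angle to the original name (e.g. river_001.txt -> river_001_rot90.txt)."""
    src = Path(label_path)
    out = []
    for line in src.read_text().splitlines():
        cls, cx, cy, bw, bh = line.split()
        cx, cy, bw, bh = rotate_yolo_box(float(cx), float(cy),
                                         float(bw), float(bh), angle)
        out.append(f"{cls} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")
    src.with_name(f"{src.stem}_rot{angle}{src.suffix}").write_text("\n".join(out) + "\n")
```

Appending the angle to the file name also documents the link between each augmented label file and its original, in line with the traceability requirement described above.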
3. Experimental Setup
3.1. Experimental Design
To comprehensively assess the performance of the vertical domain model dynamic dataset construction method based on multi-role collaboration and intelligent annotation, this study designed a series of rigorous experiments. Using the water conservancy engineering field as the application scenario, a specialized dataset was constructed, as illustrated in Figure 3. During the experiments, classification adjustments and data augmentation operations were performed, and comparisons were made with conventional data annotation methods.
For dataset construction, the WATER-DET dataset was meticulously assembled to represent key scenarios in water conservancy engineering. The dataset comprises 1500 images encompassing 12 common water environment issues, including water pollution, abnormal water levels, and dam cracks. By annotating and analyzing these images, the dataset provides a robust data foundation to support vertical domain models in the water conservancy field. The initial classification included 15 categories; however, as the experiments progressed and the need for model optimization emerged, certain categories were removed or added, and the number of images per category underwent multiple adjustments. The initial and final classifications are presented in Table 5 and Table 6, respectively.
The methods developed in this study were applied and validated through experiments. For comparison, three popular annotation tools, LabelImg (Taipei, Taiwan, China), LabelMe (MIT CSAIL, Cambridge, MA, USA), and Make Sense (Melbourne, VIC, Australia), were selected. Evaluation was conducted using three key performance indicators: annotation efficiency, accuracy rate, and consistency. Annotation efficiency was measured by the number of images annotated per hour (images/hour), reflecting differences in annotation speed among the methods.
To ensure experimental reliability and validity, all tests were performed under identical hardware and software environments. The hardware configuration included a high-performance server equipped with a multi-core processor, large-capacity memory, and high-speed storage devices to ensure computational performance and rapid data access. For software, a standardized operating system, a deep learning framework, and relevant dependent libraries were used to eliminate variations due to environmental discrepancies. Each method was tested multiple times, and average values were calculated to minimize experimental errors.
3.2. Collaborative Workflow
The collaborative workflow proposed in this study is illustrated in Figure 4, specifically under scenarios involving changes in classification order.
This workflow can be activated under several scenarios, including but not limited to the following:
When the problem domain evolves, necessitating the addition or removal of classifications;
When doubts arise regarding the rationality of the current classification system or a reset is required during model training;
When model evaluation indicates that uneven image distribution is adversely affecting performance;
When certain images yield suboptimal training results, requiring targeted data augmentation;
When new images from practical applications need to be integrated into the dataset.
In the role-collaborative process, upon detection of classification changes, annotation tasks are immediately suspended, and a classification change correspondence table is generated and distributed.
Data Augmentation Phase: Data augmentation personnel first perform comprehensive processing of the original data using techniques such as image rotation, scaling, cropping, and noise addition. These operations generate a diverse set of new images with characteristics similar to the original data but containing varied details. For image recognition tasks, rotating and scaling original images at multiple angles improves model exposure to diverse features, enhancing generalization capabilities. Data augmentation personnel also conduct preliminary screening and organization of augmented images to ensure data quality and usability.
Review Phase: Reviewers play a vital role in maintaining annotation quality. They meticulously examine annotations produced by automatic annotation tools, checking for accuracy, consistency, and completeness. By comparing annotations across different annotators, reviewers identify and resolve discrepancies, ensuring data quality aligns with model training requirements. Additionally, reviewers refine and update annotation specifications based on observed challenges, promoting standardization and consistency in annotation practices.
When issues are detected in annotation results, reviewers return the data to automatic annotators for correction or directly coordinate with data augmentation personnel to regenerate affected data. Once the review is successful, the dataset is dynamically updated with the revised data, ensuring the model benefits from the most current and accurate information. Moreover, based on feedback from model training outcomes and evolving project requirements, data augmentation personnel can adjust augmentation strategies, automatic annotators can optimize annotation algorithms, and reviewers can update review protocols, creating a closed-loop process for continuous improvement and dynamic dataset updates.
Task Allocation Strategy: To maximize efficiency and annotation accuracy, an optimized task allocation method is employed, taking into account image complexity and annotator expertise. Image complexity is quantitatively assessed using indicators such as the number of image features, texture complexity, and the quantity of target objects. Images containing multiple targets, complex backgrounds, or blurred details are classified as high-complexity, whereas images with simple backgrounds and singular targets are deemed low-complexity.
Determining annotator expertise is equally critical. By analyzing annotators’ historical performance data, including accuracy rates across various domains, their areas of specialization can be identified. For example, Annotator A may demonstrate superior performance in medical image annotation, while Annotator B excels in traffic scene annotation.
Based on these insights, a task allocation model is constructed to align image complexity with annotator expertise. High-complexity images are assigned to experienced annotators in relevant fields to ensure precise and efficient annotations. Conversely, low-complexity images are allocated to general annotators, optimizing workforce utilization. This targeted assignment enhances both efficiency and annotation quality by achieving an optimal match between tasks and annotator capabilities.
To maintain fairness and rationality in workload distribution, the workflow incorporates a task priority queue and a dynamic adjustment mechanism. Tasks are prioritized based on factors such as project urgency and data requirements, ensuring high-priority tasks are assigned promptly to qualified annotators. Furthermore, by monitoring annotator progress and task completion status, the allocation process is dynamically adjusted to balance workloads and improve overall collaborative efficiency.
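One possible realization of this allocation strategy is sketched below in Python, using a priority heap for task urgency and a simple skill/load score for matching; the `Task` fields, the scoring formula, and the load increment are hypothetical modeling choices, not the exact allocation model used in this study.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    priority: int                              # lower value = more urgent
    complexity: float = field(compare=False)   # 0 (simple) .. 1 (complex)
    image: str = field(compare=False)

def assign(tasks, annotators):
    """Greedy sketch: pop the most urgent task and give it to the annotator
    whose expertise best matches its complexity, while penalizing current load."""
    heap = list(tasks)
    heapq.heapify(heap)
    assignments = []
    while heap:
        task = heapq.heappop(heap)
        best = max(annotators,
                   key=lambda a: a["skill"] * task.complexity + (1.0 - a["load"]))
        best["load"] += 0.3                    # dynamic adjustment of workloads
        assignments.append((task.image, best["name"]))
    return assignments

# Example: the complex image goes to the specialist, the simple one to the generalist.
annotators = [{"name": "A", "skill": 0.9, "load": 0.0},
              {"name": "B", "skill": 0.4, "load": 0.0}]
tasks = [Task(priority=1, complexity=0.9, image="dam_crack_031.jpg"),
         Task(priority=2, complexity=0.2, image="river_color_007.jpg")]
print(assign(tasks, annotators))
```

The load term in the score is what makes the allocation dynamic: as an annotator accumulates work, subsequent tasks drift toward less-loaded colleagues.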
3.3. Application of the Classification Tool
The automatic annotation tool developed in this study offers a highly user-friendly interface. Using the classification modification correspondence table, the system can automatically generate updated annotations from the original data. When data classifications are altered, the administrator first suspends the ongoing annotation tasks. Then, the automatic annotation personnel handle the automated updating of annotation files. As illustrated in Figure 5, the system automatically updates annotated images according to the new classification scheme.
3.4. Application of the Data Augmentation Tool
The data augmentation tool developed in this research is also designed for ease of use and high practicality. It is employed during the data augmentation phase by data augmentation personnel. The original images and their corresponding annotation files are placed in separate folders, after which the system automatically generates augmented images and updated annotation files. Additionally, a prompt box assists reviewers by displaying annotation positions, facilitating the verification of annotation accuracy.
Figure 6 presents examples of augmented data, where the red-framed sections indicate the automatically generated annotation boxes, enabling reviewers to inspect the correctness of annotations.
As demonstrated in Figure 6, across a range of image types, including sewage outlets, sand yards, buildings, river and lake water pollution, and oil-based pollution, the data augmentation tool delivers favorable outcomes. It effectively enhances annotation efficiency and accuracy, underscoring the tool's practical value in processing real-world datasets.
4. Result Analysis
4.1. Comparison of Annotation Efficiency
As demonstrated in Table 7, a detailed analysis of the experimental data confirms the substantial advantages of the vertical domain model dynamic dataset construction method proposed in this study. By employing multi-role collaboration and intelligent annotation, the method achieves superior performance in annotation efficiency, accuracy, and consistency.
Regarding dynamic classification adjustment response time, the proposed method demonstrates exceptional performance, reducing the adjustment process to the minute level. When new classification requirements or changes in data distribution occur, traditional methods necessitate redeveloping annotation strategies and retraining annotators, which is highly time-consuming. In contrast, the method presented in this study rapidly adapts to dynamic classification adjustments through close multi-role collaboration and prompt updates to intelligent annotation algorithms. Data augmentation personnel swiftly modify augmentation strategies to accommodate new classifications, automatic annotation personnel apply the updated model to generate accurate annotations, and reviewers perform timely evaluations to ensure data quality.
In the area of data augmentation, this method also shows outstanding performance. Traditional data augmentation typically requires manual annotation before augmenting data in the program, which increases the annotators’ workload and hampers reviewers’ ability to visually inspect results. In contrast, this method leverages image transformation and data synthesis technologies to quickly generate large volumes of augmented data. Automatic annotation personnel then use the developed tools to annotate the augmented data efficiently. Timely review and feedback from reviewers further guarantee annotation accuracy and consistency, minimizing repetitive labor stemming from mis-annotations.
In terms of consistency, the proposed method significantly enhances annotation consistency through an integrated consistency-checking mechanism. By comparing annotations from different annotators, this mechanism can promptly identify and rectify inconsistencies, ensuring high levels of accuracy and coherence. In comparison, traditional annotation approaches lack effective consistency checks, making it difficult to achieve reliable and uniform results.
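One common way to implement such a consistency check, sketched below under the assumption of rectangular bounding boxes, is to compare same-class annotations from two annotators by intersection-over-union (IoU) and flag pairs whose overlap falls below a threshold; the 0.7 value and the function names are illustrative assumptions rather than the exact mechanism used here.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def flag_inconsistent(boxes_a, boxes_b, threshold=0.7):
    """Return same-class box pairs from two annotators whose overlap is
    positive but below the threshold, i.e. likely boundary disagreements."""
    flagged = []
    for cls_a, box_a in boxes_a:
        for cls_b, box_b in boxes_b:
            if cls_a == cls_b and 0.0 < iou(box_a, box_b) < threshold:
                flagged.append((cls_a, box_a, box_b))
    return flagged
```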
Comparative experiments with LabelImg, LabelMe, and Make Sense highlight the clear advantages of this method across annotation efficiency, accuracy, and consistency metrics. Notably, when classification modifications occur, the number of images processed accurately by this method greatly surpasses that of other tools, with the advantage becoming increasingly pronounced for larger-scale datasets. These experimental findings comprehensively validate the effectiveness and superiority of the proposed method in constructing dynamic datasets for vertical domain models.
In comparison with contemporary automated annotation platforms such as Supervisely (Supervisely, Mountain View, CA, USA), Roboflow (Roboflow, San Francisco, CA, USA), and Scale AI (Scale AI, San Francisco, CA, USA), this study presents the analysis summarized in Table 8. The proposed method demonstrates superior performance over Roboflow and Scale AI, particularly in scenarios involving category deletion and modification. Compared to Supervisely, it offers comparable functionality while providing greater advantages in management workflows and support for localized operations.
4.2. Model Verification
For the identification of river and lake water environment problems, a model named Water-YOLOv11n was trained using transfer learning based on the YOLOv11n architecture. As illustrated in Figure 7, the precision confidence of the water environment problem identification model reaches 1, while the average recall rate across all categories achieves 95.8% at a 0.5 threshold, indicating a high precision level. The model's F1-score is 0.9, reflecting a favorable balance between precision and recall and demonstrating excellent overall performance. Through verification, the model shows strong capability in accurately identifying diverse water environment issues.
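For reference, the F1-score combines precision $P$ and recall $R$ as their harmonic mean:

$$F_1 = \frac{2PR}{P + R}$$

so a value of 0.9 can only be reached when neither precision nor recall is sacrificed for the other.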
To further assess classification performance, the normalized confusion matrix is displayed in Figure 8. The matrix reveals that most categories are correctly identified with high confidence. However, some misclassification occurs between visually similar categories, such as dead animals and sewage, which reflects the inherent complexity of distinguishing closely related environmental features.
Given the inherent complexity of river and lake water environment problems, the experiment included multiple cycles involving data addition, classification modification, and data augmentation. These tasks were efficiently executed through the collaborative workflow and specialized tools developed in this study, allowing the experiments to proceed swiftly and effectively. Looking ahead, leveraging the collaborative process and tools proposed herein, additional real-world application datasets will be incorporated into the river and lake water environment problem dataset to further enhance the model’s performance and adaptability.
4.3. Analysis of Annotation Efficiency for Different Image Complexities
The automatic annotation method developed in this study effectively addresses the challenge of extensive manual labeling typically required after classification adjustments or image modifications during or following model training. This approach enhances efficiency through two primary strategies:
(1) Leveraging Existing Annotation Results: By fully utilizing previously verified annotations as a foundation.
(2) Intelligent Adaptation: By generating variations of annotated data through data augmentation based on existing labels.
However, it is important to note that the current data augmentation process in this study is limited to rotation and translation of quadrilateral bounding boxes compatible with the YOLO format. This constraint limits the applicability of the method to other annotation types, such as polygon annotations or semantic segmentation masks. In scenarios involving diverse annotation formats or more complex image characteristics, such as varying resolutions, irregular object shapes, dense object distributions, or high-noise backgrounds, the current approach may encounter limitations. Consequently, future work will aim to extend augmentation strategies to support additional annotation formats and enhance robustness when processing more challenging image conditions.
4.4. Analysis of the Applicability of Automatic Annotation Methods in Different Fields
The proposed methodology demonstrated both high efficiency and accuracy when applied to constructing image datasets within the water conservancy engineering domain. Moreover, the workflow maintains low computational costs by combining automatic annotation tools with human-in-the-loop validation, thereby avoiding repeated full retraining cycles and supporting incremental dataset updates. In the presented case study, an initial model was trained on cat-and-dog image classification tasks and subsequently adapted to detect categories related to water pollution. During deployment, new domain-specific images were incorporated into the dataset to further refine model performance. This process illustrates the method’s flexibility in managing category transitions and evolving datasets—situations that frequently arise in real-world applications.
Importantly, similar dynamic requirements are common in diverse fields, including transportation, agriculture, and medical imaging, where category definitions can change over time and newly collected data must be integrated seamlessly for continuous model improvement. Therefore, the proposed approach demonstrates strong cross-domain adaptability and scalability, making it well suited for iterative and dynamic dataset construction in various industries.
4.5. Analysis of Quality Assurance After Automatic Annotation Updates
To ensure data quality following automatic annotation updates, the process in this study is directly built upon thoroughly reviewed manual annotations. All original annotations were initially labeled by trained annotators and subsequently verified by an auditing team to guarantee their accuracy. Automatic annotation procedures were then applied to this high-quality baseline. Additionally, the model utilized for annotation was experimentally validated, achieving an F1-score of 0.9 and exhibiting consistent performance across categories. These results confirm that the updated annotations are both accurate and reliable, while significantly enhancing annotation efficiency and maintaining overall data quality throughout the process.
5. Conclusions
This study focuses on constructing dynamic datasets for vertical domain models, proposing an innovative engineering-oriented approach based on multi-role collaborative workflows and intelligent annotation techniques. The proposed methodology effectively addresses the challenges encountered by traditional dataset construction methods, particularly regarding efficiency, consistency, and dynamic adaptability.
Through a carefully designed multi-role collaboration framework, the responsibilities and collaborative processes of data augmentation personnel, automatic annotation personnel, and reviewers are explicitly defined. This structured workflow enables seamless progression from data processing to annotation review, significantly enhancing annotation efficiency and data quality. Additionally, the developed intelligent annotation algorithms—specifically the annotation box correction algorithm based on rotation matrices—successfully resolve annotation deformation errors arising from rotation scenarios, greatly improving annotation accuracy and the level of automation.
The construction of the WATER-DET dataset in the water conservancy engineering domain, along with comparative experiments involving LabelImg, LabelMe, and Make Sense, demonstrates the effectiveness of the proposed method. Results show that the response time for dynamic classification adjustments is reduced to the minute level. Moreover, the proposed approach achieves a 100% improvement in annotation efficiency, maintains high annotation consistency following data augmentation, and provides a user-friendly environment conducive to efficient review processes. These advantages become even more pronounced with larger-scale datasets, highlighting the method’s superiority and practical utility in dynamic dataset construction for vertical domain models.
While this research has made substantial progress, integration with traditional annotation tools remains necessary for foundational tasks. Future work will focus on embedding the proposed methodology into manual annotation tools such as LabelImg and LabelMe, thereby enhancing annotation intelligence levels and enabling distributed data storage and collaborative annotation workflows. Furthermore, to address current limitations in processing diverse annotation formats and more complex image characteristics—such as varying resolutions, irregular object shapes, dense object distributions, or high-noise environments—future enhancements will expand the augmentation framework to support broader annotation types and improve adaptability under challenging conditions.
The findings of this study, which integrate multi-role collaboration, process reengineering, and advanced automatic annotation tools, provide an effective solution for improving dataset construction efficiency. Moreover, the proposed approach shows promising potential for application in other fields. For instance, future exploration could investigate constructing dynamic datasets for agricultural pest monitoring or geological disaster early warning, further advancing the innovative application of artificial intelligence technologies across diverse industries.