Automated Landform Classification from InSAR-Derived DEMs Using an Enhanced Random Forest Model for Urban Transportation Corridor Hazard Assessment

Zhu, Song; Hua, Yuansheng; Zhu, Jiasong; Meng, Fanyi

doi:10.3390/rs17162819

¹ College of Civil and Transportation Engineering, Shenzhen University, Shenzhen 518000, China

² National Key Laboratory of Green and Long-Life Road Engineering in Extreme Environment, Shenzhen University, Shenzhen 518000, China

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(16), 2819; https://doi.org/10.3390/rs17162819

Submission received: 19 June 2025 / Revised: 28 July 2025 / Accepted: 12 August 2025 / Published: 14 August 2025

(This article belongs to the Special Issue Monitoring and Modelling of Geological Disasters Based on InSAR Observations: 3rd Edition)

Abstract

Interferometric Synthetic Aperture Radar (InSAR)-derived Digital Elevation Models (DEMs) provide critical landform data for monitoring the stability of urban infrastructure, especially for linear infrastructure such as roads and transportation corridors. Traditional landform classification methods are often hindered by incomplete results and require significant manual intervention. To address these challenges, we propose an automated landform classification method based on an enhanced Random Forest (RF) model that integrates Optimization of Decreasing Reduction (ODR) for majority class undersampling and Support Vector Machine Synthetic Minority Oversampling Technique (SVM-SMOTE) for minority class oversampling, specifically to address class imbalance. The method was validated using a dataset of 82,450 expert-labeled samples from approximately 100 km of highway corridors, with independent test sets and ten-fold cross-validation. The enhanced RF model achieved a classification completeness rate of 100% and a macro F-score of 97.0%, significantly outperforming traditional rule-based and standard RF methods. This approach provides robust post-processing support for InSAR-based urban infrastructure monitoring and environmental modeling.

Keywords: InSAR-derived DEMs; landform classification; random forest; class imbalance; urban infrastructure monitoring

1. Introduction

Urban infrastructure, including road networks and transportation corridors, plays a crucial role in supporting the daily operations of cities. However, as these infrastructures age, they become increasingly prone to ground deformation and structural instabilities [1,2]. This issue is especially pronounced in rapidly urbanizing areas, where continual settlement, subsidence, or slope movement can undermine roads and related facilities [3,4]. Detecting and monitoring such deformation is critical for sustainable urban development and public safety. In this context, fine-scale geomorphological (landform) classification has emerged as an important factor in urban transportation design and stability monitoring. Roads constructed across certain geomorphic features often face elevated hazard risks [5,6]; for example, features such as debris slide scars, flow channels, or the toe zones of deep-seated failures can negatively affect road stability [7,8]. Accurately identifying these subtle landforms is therefore essential for engineers to design safer transportation corridors and to proactively address potential failure zones before problems arise [9,10]. Furthermore, for existing highway infrastructure, ongoing landform classification and monitoring are critical. Such classifications are necessary for assessing evolving geohazard risks due to environmental changes or aging infrastructure, planning targeted maintenance interventions, and ensuring the long-term stability and safety of these vital transportation links [11]. Understanding the geomorphological context helps in identifying areas susceptible to future instability even after construction is complete.

The initial construction activities, such as excavation and embankment placement, fundamentally alter the local landform stability. Both artificial and adjacent natural slopes are susceptible to continuous geohazards like landslides, rockfalls, and debris flows, which pose persistent threats to traffic safety throughout the highway’s operational life [12]. Therefore, accurately classifying and periodically re-evaluating these landform units is fundamental for identifying high-risk areas, planning cost-effective maintenance, and enabling rapid damage assessment and emergency response after events such as heavy rainfall or earthquakes [13]. In essence, understanding the landform context is indispensable for assessing evolving risks and ensuring the long-term resilience of these vital transportation links.

The rapid development of remote sensing technologies has revolutionized the acquisition of high-resolution topographic data. Among various sources, interferometric synthetic aperture radar (InSAR)-derived DEMs have emerged as a powerful tool for landform analysis, offering several advantages over traditional optical or LiDAR-based DEMs. InSAR can provide consistent, high-resolution elevation data over large areas, regardless of weather conditions or daylight, making it particularly suitable for regions with frequent cloud cover or difficult access [14,15]. The improved vertical accuracy and spatial resolution of modern InSAR DEMs enable the detection of subtle landform features that are critical for infrastructure hazard assessment. However, while the availability of high-quality DEMs has greatly enhanced our ability to map and analyze landforms, it does not by itself solve the challenges of automated, accurate, and comprehensive landform classification—especially in the context of complex and heterogeneous landscapes.

Landform classification in highway corridor studies has relied heavily on rule-based approaches, where experts define deterministic thresholds for geomorphometric parameters such as slope, elevation, and relief, often derived from digital elevation models (DEMs) [16,17,18]. These rules are then applied to extract different landform types through a series of mathematical operations and reclassification steps. While effective in some cases, these methods have several inherent limitations. First, they require significant manual intervention and expert knowledge, making them labor-intensive and difficult to scale. Second, the rigid thresholds may not capture the full variability of natural landforms, especially in areas with complex or transitional features, leading to incomplete or inaccurate classification results. Third, as highway networks extend into more diverse and challenging terrains, the limitations of rule-based methods become increasingly pronounced, often resulting in large unclassified areas or misclassification of critical landform types. This lack of automation and adaptability restricts their practical application in large-scale or data-rich environments.

To overcome the limitations of rule-based methods, researchers have increasingly turned to machine learning algorithms for landform classification. Random Forest (RF), in particular, has gained popularity due to its high accuracy, robustness to noise, and ability to handle complex, nonlinear relationships between features [19,20,21,22,23,24,25,26]. By training on expert-labeled samples, RF models can automatically learn classification rules and generalize to new areas, significantly improving automation and reducing reliance on manual thresholding. However, a persistent challenge in applying machine learning to landform classification is the issue of class imbalance. In real-world highway corridor datasets, certain landform types—such as valleys or plains—may be underrepresented compared to more common types like rolling or undulating areas. Standard RF models, when trained on imbalanced data, tend to be biased toward the majority classes, resulting in poor performance for minority landform types. This is particularly problematic because these minority classes often correspond to the most geohazard-prone or engineering-critical areas. Existing solutions, such as cost-sensitive learning or simple resampling, have limitations as they may require subjective parameter tuning, risk introducing noise, or fail to adequately address the underlying data distribution.

To address these challenges, this study proposes an enhanced Random Forest-based landform classification framework that integrates two advanced sampling strategies: optimization of decreasing reduction (ODR) for majority class undersampling and support vector machine synthetic minority oversampling technique (SVM-SMOTE) for minority class oversampling [27,28,29]. By embedding this hybrid sampling approach into the RF training process, the method generates balanced and diverse training subsets for each decision tree, thereby improving the model’s ability to accurately classify all landform types—including those that are underrepresented. The proposed method is validated using a large, expert-labeled dataset derived from InSAR DEMs covering approximately 100 km of highway corridors in a mountainous region. Its performance is systematically compared with both traditional rule-based and standard RF methods in terms of classification completeness, accuracy, and spatial coherence.

The main contributions of this work are as follows:

(1): We develop a fully automated landform classification framework tailored for highway corridor hazard assessment, leveraging high-resolution InSAR-derived DEMs.
(2): We introduce a novel hybrid sampling strategy that effectively addresses class imbalance in landform datasets, significantly improving the recognition of minority landform types.
(3): We provide a comprehensive evaluation of the proposed method against existing approaches, demonstrating its superiority in terms of classification completeness and accuracy.

This research aims to advance the state of the art in landform classification for transportation infrastructure, providing a robust and scalable solution to support safer and more resilient urban development.

2. Study Area

This study encompasses several highway corridors with a cumulative length of approximately 100 km. Due to the sensitive nature of the infrastructure data, the specific geographical locations are anonymized, but they are situated in a mountainous region known for its complex geological and topographical characteristics. This diverse setting presents a variety of potential geohazard risks relevant to highway engineering. The study area was selected for its topographic diversity (e.g., the presence of valleys, plains, mountains, and undulating landforms), its proneness to geohazards such as landslides and subsidence [3,4,5,6], and its representation of the typical challenges faced by Chinese highway networks extending into remote areas. The testbed is therefore not an easy special case: it contains both simple and complex landforms and spans multiple corridors, which supports the robustness of the results. To define the study domain, a 1.5 km buffer zone was established on both sides of each highway centerline, creating an extensive “highway domain” for analysis. This large and varied area provides a challenging and representative dataset for rigorously evaluating the robustness and generalizability of the proposed classification method under real-world conditions.

3. DEM Acquisition and Preprocessing

3.1. DEM Acquisition

The primary data for this research are high-resolution digital elevation models (DEMs) derived from interferometric synthetic aperture radar (InSAR) data. The fundamental principle of InSAR involves using the phase difference information from two SAR images of the same area, acquired from slightly different orbital positions, to create an interferogram. This interferogram, after being corrected for various errors and unwrapped, can be converted into a precise model of surface elevation through geocoding. The detailed processing chain, including co-registration, interferogram generation, phase filtering, and atmospheric correction, was conducted following established methodologies [15].

The resulting DEMs feature a spatial resolution of 5 m. The vertical accuracy of the DEMs was assessed to be approximately ±2.3 m (root mean square error) when compared against independent ground control points. The combined study areas consist of a total of 12,510,081 grid cells, providing a dense and detailed topographic foundation for the landform classification.
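As a quick consistency check on these figures, the sketch below converts the cell count into a ground area and compares it with an idealised corridor footprint. The straight-corridor formula (length times twice the buffer width, using the 1.5 km buffer from Section 2) is a simplifying assumption of this sketch; real corridors bend and their buffers overlap or extend past the end points, so only rough agreement should be expected.

```python
# Figures taken from the text; the function names are illustrative only.
CELL_SIZE_M = 5.0            # DEM spatial resolution
N_CELLS = 12_510_081         # total grid cells across the study areas
CORRIDOR_LENGTH_KM = 100.0   # approximate cumulative corridor length
BUFFER_KM = 1.5              # buffer on each side of the centerline


def dem_area_km2(n_cells: int, cell_size_m: float) -> float:
    """Area covered by n_cells square cells, in square kilometres."""
    return n_cells * cell_size_m ** 2 / 1e6


def corridor_area_km2(length_km: float, buffer_km: float) -> float:
    """Idealised straight-corridor area: length times full buffer width."""
    return length_km * 2.0 * buffer_km


dem_area = dem_area_km2(N_CELLS, CELL_SIZE_M)
buffer_area = corridor_area_km2(CORRIDOR_LENGTH_KM, BUFFER_KM)
print(f"DEM footprint:    {dem_area:.1f} km^2")
print(f"Idealised buffer: {buffer_area:.1f} km^2")
```

The two areas come out within a few percent of each other (roughly 313 km² against 300 km²), which is consistent with a 5 m grid covering a 3 km wide strip along about 100 km of highway.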

The accuracy of InSAR-derived DEMs can be affected by dense vegetation, especially in areas with high slopes and thick forest cover, due to signal decorrelation and penetration limitations. To minimize these effects, we selected InSAR data acquired during periods of lower vegetation cover where possible. Additionally, we performed a quality assessment of the DEMs using ground control points, and regions with significant DEM errors—often corresponding to dense vegetation—were excluded from the training and validation sample sets. This ensures that the labeled samples used for model training and evaluation are representative and reliable. Furthermore, the hybrid sampling and machine learning approach employed in this study is robust to a certain degree of input data noise as it relies on expert-validated samples.

3.2. DEM Sample Preprocessing

A high-quality ground truth dataset is the cornerstone of a reliable machine learning model. To this end, we generated our labeled samples through a rigorous, two-stage hybrid process that leverages both objective classification rules and nuanced expert knowledge. This approach ensures the creation of a high-confidence ground truth dataset.

Stage 1: Automated Candidate Screening. First, we applied an established highway landform rule-based method, with classification criteria detailed in Table 1, as an automated screening tool across the entire study area. This method, based on geomorphometric parameters, efficiently generated a large, preliminary pool of candidate samples with strong geomorphological tendencies.

Stage 2: Expert-driven Validation and Refinement. In the second stage, this candidate pool was subjected to a meticulous manual review by geomorphological experts. This critical step goes beyond simple rule application; the experts validated correctly classified, representative samples while discarding ambiguous or erroneous ones, particularly those located at the transitional boundaries between different landform types. This use of expert-validated “semantic tie points” to guide classification is a proven strategy for improving model accuracy. This expert-driven validation process resulted in a final ground truth dataset of 82,450 high-confidence samples. Each sample corresponds to a single DEM pixel, and its features are the landform attributes derived for that cell. The dataset was then partitioned into a training set (80%) and a test set (20%) for independent validation. The detailed distribution of these samples across the different landform types, which highlights the significant class imbalance that this study aims to address, is presented in Table 2. To visualize the data and the selection methodology, Figure 1 serves as an illustrative example, showing the DEM, hillshade representation, and resulting spatial distribution of labeled samples for a representative sub-region of the study areas.

4. Methodology

4.1. Overview

This study introduces an advanced, automated framework for classifying highway corridor landforms using interferometric synthetic aperture radar (InSAR)-derived digital elevation models (DEMs). The core of our methodology is an improved Random Forest (RF) algorithm designed to overcome the limitations of traditional rule-based methods and standard machine learning models, particularly when dealing with imbalanced datasets. The pipeline begins with the acquisition of high-resolution DEMs for the highway strip area and the selection of typical landform samples based on established classification rules to create a training dataset. To address the inherent imbalance in landform sample distribution, we integrate a hybrid sampling strategy directly into the RF structure. For each decision tree within the forest, a unique and balanced training subset is generated by combining the support vector machine synthetic minority oversampling technique (SVM-SMOTE) for intelligent oversampling of minority classes and the optimization of decreasing reduction (ODR) algorithm for targeted undersampling of majority classes. This ensures that each tree in the ensemble is trained on a balanced, yet diverse, dataset, thereby enhancing the model’s overall predictive power and generalization capability. The final classification is achieved through a majority vote of all decision trees, resulting in a complete and highly accurate landform map that effectively mitigates the issues of classification incompleteness and bias towards majority classes found in conventional approaches. The framework process is shown in Figure 2.

4.2. RF Classification

RF is an ensemble learning method proposed by Breiman [30]; the idea is to construct an ensemble classifier from multiple decision trees. It increases the dissimilarity between the individual classification models by drawing several different subsets of training samples, which improves the generalization and prediction ability of the model. Compared with many other machine learning algorithms, the RF classification model offers high accuracy, few parameters, and stable performance. The RF classification method consists of two main processes: training and classification. The training process draws K sample subsets from the training samples by bootstrapping. The unselected training samples are used as out-of-bag (OOB) data for error estimation. A decision tree is then constructed for each of the K sample subsets using CART (Classification and Regression Trees). At each node of a decision tree, a subset of p features (where p is a parameter, typically the square root of the total number of features) is randomly selected from the available features, and the optimal split is determined by calculating the amount of information contained in each of these p features. The final classification result of the RF ensemble is obtained by majority voting among all decision trees, as shown in Equation (1):

C(x) = \arg\max_{Y} \sum_{i=1}^{K} I(h_{i}(x) = Y)    (1)

where C(x) is the RF ensemble classifier, h_{i}(x) is the i-th decision tree’s classification result for input x, Y represents a class label, K is the total number of trees, and I( ) is the indicator function. RF is an ensemble classifier with excellent performance, but it is not perfect. When the sample set is imbalanced the RF algorithm tends to ignore the characteristics of the minority class, which leads to a degradation of the classifier’s performance and prevents it from taking full advantage of its original strengths.
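The voting rule in Equation (1) can be illustrated with a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `majority_vote`, the example labels, and the tie-breaking rule (first label seen wins) are assumptions made here for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(tree_predictions: Sequence[Hashable]) -> Hashable:
    """Return the label Y that maximises sum_i I(h_i(x) = Y).

    Ties are broken in favour of the label that was seen first, which is
    an arbitrary choice made for this sketch.
    """
    if not tree_predictions:
        raise ValueError("at least one tree prediction is required")
    counts = Counter(tree_predictions)
    return counts.most_common(1)[0][0]


# Hypothetical votes h_1(x), ..., h_5(x) from K = 5 trees for one DEM cell.
votes = ["Valley", "Plain", "Valley", "Mountain", "Valley"]
label = majority_vote(votes)
print(label)
```

Here three of the five trees vote for the same class, so that class is returned as C(x).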

4.3. Hybrid Sampling Algorithm

4.3.1. ODR Algorithm

The ODR algorithm [31] was developed on the basis of random undersampling. It makes the sample dataset balanced by removing redundant information in the majority class according to the influence degree of the surrounding majority class samples, thereby avoiding the elimination of valid information to a certain extent. The steps of undersampling based on the ODR algorithm are as follows:

Two definitions are needed first. The associated set of a sample q is the set of samples in M (the majority class sample set) that count q among their k nearest neighbors, written A_{q} = \{n_{q_{1}}, n_{q_{2}}, n_{q_{3}}\}, so that q is a nearest neighbor of every n_{q_{i}} \in A_{q}. Opposite samples are samples whose class differs from that of q.

1. For every sample in the training sample set T, find its k nearest neighbors. From these, derive the minimal neighborhood chain of each sample and use the chains to build a linked list of associated sets for the sample set M.
2. For each sample q in the majority class sample set M, use the KNN algorithm to classify the samples in its associated set and record the number of correct judgments as A. Then remove q from the nearest neighbors of those samples, let each take its (k + 1)-th nearest neighbor in place of q, classify them again with the KNN algorithm, and record the number of correct judgments as B.
3. Compare the values of A and B. If A > B, sample q is marked for removal because it is considered to have little effect on the classification of the training sample set T. Otherwise, sample q is considered to have a large effect on the classifier and is kept.
4. For each sample in M, find its nearest opposite sample in the training sample set T and compute the Euclidean distance Z_P between them.
5. Sort the removal candidates (only those with A − B > 0) by A − B from large to small; where A − B is equal, sort by Z_P from small to large. Delete majority class samples in this order until the number of majority class samples decreases to the specified number, at which point the algorithm ends.
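The final ordering-and-deletion step can be sketched as follows. This is an illustrative fragment only: it assumes the counts A and B and the distance Z_P have already been computed for each majority sample by the earlier steps, and the names `Candidate` and `select_for_deletion` are hypothetical. The removal criterion (A − B > 0) is taken as stated in the text.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    index: int    # position of the majority sample q in M
    a: int        # correct KNN judgments on q's associated set with q present
    b: int        # correct KNN judgments after q has been removed
    z_p: float    # Euclidean distance from q to its nearest opposite sample


def select_for_deletion(candidates: List[Candidate], n_majority: int,
                        target_majority: int) -> List[int]:
    """Indices of majority samples to delete, in deletion order.

    Only candidates with A - B > 0 are eligible. They are sorted by A - B
    descending, with ties broken by Z_P ascending, and deletion stops once
    the majority class has shrunk to target_majority (or candidates run out).
    """
    n_to_delete = max(0, n_majority - target_majority)
    eligible = [c for c in candidates if c.a - c.b > 0]
    eligible.sort(key=lambda c: (-(c.a - c.b), c.z_p))
    return [c.index for c in eligible[:n_to_delete]]


# Hypothetical values for four majority samples.
cands = [
    Candidate(index=0, a=5, b=3, z_p=0.9),
    Candidate(index=1, a=4, b=4, z_p=0.2),   # A - B = 0, never deleted
    Candidate(index=2, a=6, b=4, z_p=0.4),
    Candidate(index=3, a=7, b=2, z_p=1.5),
]
to_delete = select_for_deletion(cands, n_majority=4, target_majority=2)
print(to_delete)
```

Note that if fewer eligible candidates exist than the requested reduction, this sketch simply deletes all of them and stops short of the target; how the original algorithm handles that case is not stated in the text.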

4.3.2. SVM-SMOTE Algorithm

The classical SMOTE algorithm does not take the sample distribution into account, which can change the overall distribution of the dataset and blur the boundaries between classes [31,32,33,34,35,36,37]. The core idea of SVM-SMOTE [33], a modification of the SMOTE algorithm, is to use an SVM classifier to identify the minority class samples that are most critical for defining the class boundary (i.e., support vectors). New synthetic samples are then generated strategically along the lines connecting these support vectors, focusing on ‘safe’ regions where the density of majority class samples is low. The steps of oversampling based on the SVM-SMOTE algorithm are as follows:

1. The SVM algorithm is used to find the optimal Lagrange multiplier α_{i} for each x_{i} in the original set of samples X = \{(x_{i}, y_{i}) | i = 1, 2, \cdots, n\} according to the following optimization problem (Equations (2) and (3)):

L(α) = \sum_{i=1}^{n} α_{i} - \frac{1}{2} \sum_{i,j} α_{i} α_{j} y_{i} y_{j} T(x_{i}, x_{j})    (2)

s.t. 0 \leq α_{i} \leq C, \forall i; \sum_{i} α_{i} y_{i} = 0    (3)

where x_{i} is the sample feature vector; y_{i} is the sample label corresponding to x_{i}; T(x_{i}, x_{j}) is the kernel function; and, in each set, the samples for which α_{i} > 0 are the support vectors s_{i}^{+}.

2. For each s_{i}^{+} \in S^{+} in the sample set (S^{+} is the set of support vectors for the minority class samples), calculate its n adjacent points. The number of sample points belonging to the minority class is n′. If n′ ≥ n/2, the point is considered a valid sample; if n′ < n/2, it is a dangerous sample point.

3. For each valid sample, the SVM-SMOTE algorithm generates new minority class samples by interpolation, as shown in Equation (4):

X_{in}^{+} = s_{i}^{+} + ε (d_{n}^{i,j} - s_{i}^{+})    (4)

where X_{in}^{+} is the minority class sample generated by interpolation, d_{n}^{i,j} is the j-th adjacent point of s_{i}^{+}, and ε is a random number in the interval [0, 1]. For each dangerous sample point, a minority class sample is generated by extrapolation as follows (Equation (5)):

X_{ex}^{+} = s_{i}^{+} + ε (s_{i}^{+} - d_{n}^{i,j})    (5)

where X_{ex}^{+} represents the set of minority class samples generated in this way. The final result is obtained by combining the original set of samples with all of the generated samples.
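Steps 2 and 3 can be sketched in plain Python as below. This is a hedged illustration of the valid/dangerous test and of Equations (4) and (5) only; it assumes the support vectors and their adjacent points have already been found by an SVM and a neighbor search, and the function names `is_valid` and `synthesize` are hypothetical. In practice ε would be drawn uniformly from [0, 1]; a fixed value is used here so the output is reproducible.

```python
from typing import List, Sequence

Vector = List[float]


def is_valid(neighbor_is_minority: Sequence[bool]) -> bool:
    """Step 2: a support vector is 'valid' when n' >= n / 2."""
    n = len(neighbor_is_minority)
    n_prime = sum(neighbor_is_minority)
    return n_prime >= n / 2


def synthesize(s: Vector, d: Vector, valid: bool, eps: float) -> Vector:
    """Step 3: Equation (4) for valid points, Equation (5) for dangerous ones.

    s is the minority support vector s_i^+, d is one of its adjacent points,
    and eps is the random factor from the interval [0, 1].
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    if valid:
        # Equation (4): move from s towards d.
        return [si + eps * (di - si) for si, di in zip(s, d)]
    # Equation (5): move from s away from d.
    return [si + eps * (si - di) for si, di in zip(s, d)]


s_plus = [1.0, 2.0]        # hypothetical support vector
neighbor = [3.0, 6.0]      # hypothetical adjacent point
x_in = synthesize(s_plus, neighbor, valid=True, eps=0.5)
x_ex = synthesize(s_plus, neighbor, valid=False, eps=0.5)
print(x_in, x_ex)
```

With ε = 0.5, the Equation (4) sample is the midpoint between the support vector and its neighbor, while the Equation (5) sample lies the same distance from the support vector on the opposite side.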

4.4. Proposed Classifier for Highway Landform

Based on the hybrid sampling strategy, this study proposes an improved Random Forest (RF) classification method. The core idea is to replace the traditional bootstrapping process in the RF algorithm with a dynamic, per-tree hybrid sampling mechanism. As illustrated in Figure 3, this approach is designed to tackle the class imbalance problem at its root—within the training process of each individual classifier in the ensemble—thereby enhancing the model’s performance on imbalanced highway landform datasets. The traditional RF algorithm’s bootstrapping creates training subsets that often mirror the original dataset’s imbalance, leading to models biased towards the majority class. Our improved method fundamentally alters this process. Instead of simply resampling from the original dataset, we generate a unique, balanced, and diverse training subset for each decision tree before it is trained. The workflow, as depicted in Figure 3, proceeds as follows:

Dynamic Subset Generation: The process begins with the full “Highway Landform Training Set”. For each of the n decision trees to be built in the forest, a random sampling parameter α_i (where i ranges from 1 to n) is generated within a predefined range. This parameter dictates the extent of undersampling for the majority class, ensuring that each tree’s subsequent training data will differ in size and composition. This step is critical for maintaining the diversity of the ensemble, a key factor for the model’s generalization ability.
Hybrid Balancing: A hybrid sampling strategy, combining the ODR algorithm and the SVM-SMOTE algorithm, is then applied to the original training data based on the parameter α_i. The ODR algorithm first removes a number of redundant majority class samples, followed by the SVM-SMOTE algorithm, which synthesizes new, high-quality minority class samples in data-sparse regions. This dual approach creates a “Training Subset i” that is not only numerically balanced but also features well-defined class boundaries.
Ensemble Training and Voting: Each of the n “Training Subsets” is then used to train its corresponding “Decision Tree i”, yielding an individual “Result i”. Because each tree is trained on a unique and balanced dataset, it can adequately learn the characteristics of the minority classes without being overwhelmed by the majority classes. Finally, the individual results from all n decision trees are aggregated, and the final “Highway Landform Classification Result” is determined through a majority vote.

This per-tree sampling strategy ensures that the RF model overcomes the performance degradation typically caused by imbalanced datasets, leading to a more robust and accurate classification for all landform types, especially those in the minority.
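The per-tree subset generation can be sketched as a sampling plan. The text does not give the exact formula linking α_i to the subset sizes, so the linear mapping below is an assumption of this sketch: α = 0 keeps every majority sample (oversampling only) and α = 1 shrinks the majority class to the minority size (undersampling only), which matches the qualitative behaviour reported for α in Section 5.1. The sketch is also reduced to two classes for brevity, the default range of 0.4 to 0.6 is borrowed from the range reported as best in that section, and all function names are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def balanced_target(n_majority: int, n_minority: int, alpha: float) -> int:
    """Assumed mapping from alpha to the common per-class target size."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return round(n_majority - alpha * (n_majority - n_minority))


def plan_trees(n_majority: int, n_minority: int, n_trees: int,
               alpha_range: Tuple[float, float] = (0.4, 0.6),
               seed: int = 0) -> List[Dict[str, float]]:
    """Draw one alpha_i per tree and derive that tree's sampling targets."""
    rng = random.Random(seed)
    plans = []
    for i in range(n_trees):
        alpha_i = rng.uniform(*alpha_range)
        target = balanced_target(n_majority, n_minority, alpha_i)
        plans.append({
            "tree": i,
            "alpha": alpha_i,
            # Majority samples the ODR step would remove for this tree.
            "majority_removed_by_odr": n_majority - target,
            # Synthetic minority samples SVM-SMOTE would add for this tree.
            "minority_added_by_svm_smote": target - n_minority,
        })
    return plans


# Hypothetical class sizes: 1000 majority and 100 minority samples.
plans = plan_trees(n_majority=1000, n_minority=100, n_trees=5)
for p in plans:
    print(p)
```

Because each tree draws its own α_i, the subsets differ in size and composition while remaining balanced, which is the source of ensemble diversity described in the workflow.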

4.5. Validation

To rigorously validate the performance of our enhanced RF method with hybrid sampling for highway corridor terrain classification, we conducted a comprehensive evaluation using both an independent test set and ten-fold cross-validation. The independent test set comprised 16,490 expert-validated samples, geographically separated from the training data to ensure unbiased assessment. For comparison with the rule-based method, only the overlapping classified areas were considered, thereby eliminating the incomplete parts of the rule-based results to ensure a fair and statistically meaningful comparison. In addition to overall accuracy, we report per-class precision, recall, F-score, and G-mean, which are particularly important for evaluating performance on minority terrain types in imbalanced datasets.
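For reference, the metrics named above can be computed from a confusion matrix as in the sketch below. The text does not spell out its multi-class definitions, so two common conventions are assumed here: the macro F-score is the unweighted mean of the per-class F-scores, and the G-mean is the geometric mean of the per-class recalls. The function names and the small two-class matrix are illustrative only.

```python
import math
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """Precision, recall and F-score for each class.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    k = len(cm)
    out = []
    for c in range(k):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(k))   # column total
        actual = sum(cm[c])                           # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        out.append({"precision": precision, "recall": recall,
                    "f_score": f_score})
    return out


def macro_f_score(metrics: List[Dict[str, float]]) -> float:
    """Unweighted mean of the per-class F-scores (assumed definition)."""
    return sum(m["f_score"] for m in metrics) / len(metrics)


def g_mean(metrics: List[Dict[str, float]]) -> float:
    """Geometric mean of the per-class recalls (assumed definition)."""
    return math.prod(m["recall"] for m in metrics) ** (1.0 / len(metrics))


# Hypothetical 2-class confusion matrix (rows: true, columns: predicted).
cm = [[8, 2],
      [1, 9]]
metrics = per_class_metrics(cm)
print(metrics)
print(macro_f_score(metrics), g_mean(metrics))
```

Because the G-mean multiplies recalls, a single poorly recognised class pulls it down sharply, which is why it is informative for imbalanced landform data.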

5. Results

5.1. Parameter Optimization

To optimize the hybrid sampling strategy, the effect of the sampling parameter α was investigated. We used ten-fold cross-validation on the training set to calculate the change in F-score and G-mean as α was uniformly increased in steps of 0.01 between 0 and 1. The performance of the classifier at different α values is shown in Figure 4.

When α was 0, the model’s G-mean was at its lowest point as the sampling process only involved SVM-SMOTE without the ODR algorithm to remove potentially noisy information from the majority class. As α increased, the ODR algorithm’s influence grew and the classification performance of the improved RF algorithm gradually improved. The F-score and G-mean reached their maximum values when α was between 0.4 and 0.6, indicating the best classifier performance. In the interval from 0.6 to 1, performance, particularly the F-score, began to decrease as the ODR algorithm removed too many majority class samples, leading to some loss of effective information. These results verify the effectiveness of the hybrid approach and show that the model achieves optimal classification when α is set within the 0.4–0.6 range.

The performance of the Random Forest model is sensitive to several hyperparameters. We adopted a two-step approach to configure our model. First, following the foundational study by Breiman [30] and common practice, we set key parameters to established values: the number of features to consider at each split (nFeatures) was set to the square root of the total number of features (with 3 features, √3 ≈ 1.73, rounded to nFeatures = 2), max_depth was left unlimited, and nSamples (minimum samples per leaf) was set to 2. The nearest neighbor parameter k for the ODR algorithm was set to 5 [36].

Second, with these parameters fixed, we focused on tuning the number of decision trees (nTrees). An experiment varying nTrees from 100 to 1000 showed that performance stabilized around 500 trees. Therefore, we selected 500 as the optimal number to balance classification performance and efficiency. The final model parameters are summarized in Table 3.

5.2. Accuracy Assessment

Based on the optimized parameters determined in Section 5.1, the final classification model was trained. Subsequently, its performance was rigorously evaluated using the independent test set, which comprised 16,490 expert-validated samples. The detailed results are presented in the confusion matrix in Table 4.

The model achieved an overall accuracy of 98.4% and a G-mean of 0.971 on this test set, which supports the applicability and robustness of the improved RF algorithm on the new, extensive dataset. The confusion matrix shows high values along the diagonal for all five landform types, indicating a very high true positive rate and minimal confusion between classes. This confirms the model’s effectiveness in accurately classifying both majority and minority classes.

5.3. Comparative Classification Mapping

To visually evaluate and compare the final outputs, all three methods were applied to generate classification maps for the four distinct highway corridors that constitute the entire study area. Figure 5 presents the resulting maps, with the classification results displayed as a semi-transparent layer over a hillshade map of the landform. This visualization allows for a direct comparison of the methods’ performance across these diverse settings.

As is evident across all four study sites, the maps produced by the rule-based method (top image in each subplot a–d) suffer from significant incompleteness, with numerous unclassified patches scattered throughout where landform attributes did not meet the strict rule thresholds. In contrast, both the traditional RF (middle image in each subplot) and our improved RF method (bottom image in each subplot) achieve 100% classification completeness, successfully assigning a landform type to every grid cell in all four corridors.

However, a closer inspection reveals critical differences in the quality of the machine learning outputs. The maps from the traditional RF method exhibit a degree of fragmentation and produce discontinuous patches, particularly for minority classes like ‘Valley’. This creates artificial ‘faults’ or breaks in the landscape that are geomorphologically unrealistic. Conversely, the classification maps from our improved RF method display significantly better spatial coherence. The boundaries between landforms are smoother and more continuous, and the shapes of the classified areas more closely align with the natural, flowing patterns of real-world topography. This visual evidence highlights that our improved method not only resolves the issue of incompleteness but also produces a more realistic and spatially consistent landform map.

6. Discussion

6.1. Analysis of Classification Completeness

The analysis of classification completeness was performed by comparing the results from the three methods across the four distinct study areas, with proportional coverage detailed in Table 5. The most significant finding is the complete (100%) classification achieved by both the traditional and improved RF methods. This stands in stark contrast to the rule-based method, which consistently failed to classify a significant portion of the landscape. As the data in Table 5 show, the incompleteness rate of the rule-based method ranged from 11.9% in Study Area A to a high of 17.6% in Study Area D.

A deeper analysis reveals how the machine learning models handled these unclassified areas. The grids left unclassified by the rule-based method were redistributed among the five landform types by the RF models, often significantly altering the final landform proportions, particularly for minority classes. For instance, in Study Area A, the improved RF method assigned many of the previously unclassified grids to the ‘Plain’ class, increasing its proportional coverage from a mere 1.0% to 5.2%. Similarly, in Study Area B, the proportion of ‘Valley’ landforms increased from 20.2% to 34.5%.

This data-driven observation indicates that the rigid thresholds of the rule-based method are particularly ineffective at identifying minority or transitional landforms. The machine learning models, by generalizing from expert-labeled samples, overcome this critical limitation. Thus, they not only resolve the issue of incompleteness but also provide a more nuanced and complete representation of the landscape, especially for the less common but often critical landform types.

6.2. Comparison Between RF and Improved RF Method

To provide a comprehensive performance evaluation, a detailed comparison was made between our improved RF method and the traditional RF algorithm using the independent test set. The overall macro-average metrics are summarized in Table 6, while the detailed per-class metrics for precision, recall, and F-score are presented in Table 7. In this study, the “improved RF” refers specifically to the Random Forest algorithm in which a hybrid sampling strategy (combining SVM-SMOTE oversampling and ODR undersampling) is applied to generate balanced and diverse training subsets for each decision tree.
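The idea of drawing a balanced training subset for each tree can be sketched as follows. This is a deliberately simplified stand-in and not the authors' implementation: random undersampling replaces ODR, and plain SMOTE-style interpolation between same-class neighbours replaces SVM-SMOTE (no SVM is involved). The `target` size and all function names are hypothetical, and the default `k = 5` is borrowed from Table 3 on the assumption that k denotes the number of nearest neighbours.

```python
import random


def undersample(samples, n, rng):
    """Random undersampling: a simple stand-in for ODR, not ODR itself."""
    return rng.sample(samples, n)


def oversample(samples, n, k, rng):
    """SMOTE-style interpolation: a stand-in for SVM-SMOTE."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    out = list(samples)
    while len(out) < n:
        base = rng.choice(samples)
        # k nearest same-class neighbours of `base` (squared Euclidean).
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(s, base)),
        )[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> mate
        out.append(tuple(a + t * (b - a) for a, b in zip(base, mate)))
    return out


def balanced_subset(by_class, target, k=5, seed=0):
    """Return `target` samples per class for one tree's training subset."""
    rng = random.Random(seed)
    subset = {}
    for label, samples in by_class.items():
        if len(samples) > target:
            subset[label] = undersample(samples, target, rng)
        else:
            subset[label] = oversample(samples, target, k, rng)
    return subset


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Made-up two-feature samples: a small and a large class.
    by_class = {
        "valley": [(data_rng.uniform(0, 1), data_rng.uniform(0, 1)) for _ in range(6)],
        "rolling area": [(data_rng.uniform(5, 9), data_rng.uniform(5, 9)) for _ in range(40)],
    }
    # A different seed per tree yields a different balanced subset per tree.
    for tree_index in range(3):
        subset = balanced_subset(by_class, target=20, seed=tree_index)
        print(tree_index, {label: len(rows) for label, rows in subset.items()})
```

Varying the seed from tree to tree is what gives each decision tree a subset that is both balanced and different from the others, which is the property the hybrid strategy is described as providing.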

The data in Table 6 reveal a significant leap in overall performance. The macro F-score for the improved RF model reached 97.0%, a substantial increase of 10.5 percentage points over the traditional RF’s 86.5%. The primary advantage of our method is most evident in its handling of minority classes, as detailed in Table 7. For the traditional RF, these classes (‘Valley’, ‘Plain’, and ‘Mountain’) suffered from lower precision and recall. In contrast, our improved method elevated all metrics for these classes to above 95%, with F-scores for ‘Valley’ and ‘Plain’ jumping by 14.4 and 13.3 percentage points, respectively. This demonstrates that the hybrid sampling strategy effectively mitigates the inherent bias of standard RF toward majority classes.
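The percentage-point gains quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below copies the F-scores from Tables 6 and 7 and subtracts the traditional RF value from the improved RF value for each class; the variable and function names are illustrative only.

```python
# F-scores (%) transcribed from Table 7 as (traditional RF, improved RF).
F_SCORES = {
    "Valley": (81.5, 95.9),
    "Plain": (83.2, 96.5),
    "Mountain": (84.5, 96.0),
    "Lightly undulating": (88.6, 98.1),
    "Rolling area": (88.2, 98.6),
}
# Macro F-scores (%) transcribed from Table 6.
MACRO_F = (86.5, 97.0)


def gain(pair):
    """Improvement in percentage points, rounded to one decimal place."""
    old, new = pair
    return round(new - old, 1)


def f_score_gains(table):
    """Per-class gain of the improved RF over the traditional RF."""
    return {name: gain(pair) for name, pair in table.items()}


if __name__ == "__main__":
    for name, points in f_score_gains(F_SCORES).items():
        print(f"{name}: +{points} points")
    print(f"macro F-score: +{gain(MACRO_F)} points")
```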

The superior discriminative power of our method is also visible in the Receiver Operating Characteristic (ROC) curve analysis shown in Figure 6. The ROC curves for the improved RF method (Figure 6b) lie consistently closer to the ideal top-left corner for all landform types than those of the traditional RF (Figure 6a). This visual difference is quantified by the Area Under the Curve (AUC) values. Our improved model achieved a macro-average AUC of 0.96, which is 8% higher than that of the standard RF model. Critically, the AUC values for all minority classes (‘Valley’, ‘Plain’, and ‘Mountain’) increased by 10% or more, confirming the model’s enhanced sensitivity to these challenging landforms. Taken together, this evidence supports our method’s ability to produce a high-fidelity classification for the entire landscape.

It is also noteworthy that the ROC curves of the improved RF method exhibit a distinct change in slope (or “knee”) around a false-positive rate (FPR) of 0.1. This phenomenon is primarily due to the hybrid sampling strategy, which enables the classifier to correctly identify most minority class samples at a relatively low FPR. Beyond this point, further increases in the true-positive rate require accepting more false positives, so the ROC curve rises more slowly. This behavior is typical in imbalanced classification scenarios and highlights the improved sensitivity of our method to minority classes.
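For readers unfamiliar with how multi-class ROC curves and a macro-average AUC are obtained, the following sketch shows one common construction: each class is scored one-vs-rest, the ROC points are swept over descending score thresholds, the area is integrated with the trapezoidal rule, and the per-class areas are averaged without weighting. Whether the study averaged AUC values or averaged the curves themselves is not stated, so this is an assumption, and the function names and toy scores are hypothetical.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs for a binary problem; labels are 1 (positive) or 0."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def macro_auc(score_table, truth, classes):
    """One-vs-rest AUC per class and their unweighted mean."""
    per_class = {}
    for j, name in enumerate(classes):
        scores = [row[j] for row in score_table]
        labels = [1 if t == name else 0 for t in truth]
        per_class[name] = auc(roc_points(scores, labels))
    return per_class, sum(per_class.values()) / len(classes)


if __name__ == "__main__":
    # Made-up class scores for four samples and two classes.
    classes = ["valley", "plain"]
    truth = ["valley", "valley", "plain", "plain"]
    score_table = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]
    per_class, macro = macro_auc(score_table, truth, classes)
    print(per_class, macro)
```

On such a curve, a knee appears where most positives have already been ranked above most negatives: the points climb steeply at low FPR and then flatten, which is the shape described above for the improved RF method.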

6.3. Critical Evaluation and Practical Implications

While the improved RF model demonstrates clear advantages in classification completeness and accuracy, several critical issues remain. The model’s performance is highly dependent on the quality and representativeness of labeled samples, and the hybrid sampling strategy increases computational complexity and requires careful parameter tuning. Moreover, the exclusive reliance on topographic features may limit the model’s applicability in areas where other environmental factors play a significant role. Misclassification of certain terrain types, such as valleys, could lead to underestimation of geohazard risk in highway planning. Therefore, further validation in diverse regions and integration of additional data sources are necessary. Compared to previous studies, our method shows improved performance for minority classes, but its scalability and robustness in larger, more heterogeneous areas require further investigation.

7. Conclusions

To overcome the shortcomings of the traditional rule-based method for classifying highway strip landforms, the RF algorithm was used to mine the implicit geological knowledge contained in the sample sets automatically, and an improved RF classification method was established by incorporating a hybrid sampling strategy. The hybrid sampling strategy balanced the training subset of each decision tree and thereby effectively addressed the bias toward majority-class samples. Compared with the rule-based method and the traditional RF method, the improved RF classification method ensured complete classification results and achieved higher accuracy for all common highway strip landform types within the study area. Combining digital landform analysis with machine learning can therefore improve both the completeness and the accuracy of highway strip landform classification.

However, the method depends strongly on the quality and representativeness of the landform knowledge embedded in the sample sets. When that knowledge is incomplete or the samples are not fully representative, the algorithm’s ability to mine implicit knowledge is directly impaired, which leads to discrepancies in the classification results. The underlying mechanism of these discrepancies remains to be investigated.

It is important to position this study within the broader context of geohazard assessment. The primary contribution of this work is a robust and automated methodology for landform classification based on topographic features. This classification serves as a critical foundational layer for subsequent risk analysis. A comprehensive geohazard assessment, however, is a multi-faceted task that requires the integration of numerous variables beyond topography. Factors such as lithology, soil properties, hydrology, and land cover—including vegetation type and density—are crucial for developing holistic and dynamic risk models. Therefore, the accurate landform map produced by our method is designed to be a key input for these more complex models, and future research should focus on integrating these diverse datasets to advance from landform classification to comprehensive geohazard risk assessment along highway corridors.

Author Contributions

Conceptualization, F.M. and S.Z.; Methodology, S.Z.; Investigation, S.Z. and Y.H.; Validation, J.Z.; Formal Analysis, Y.H.; Visualization, J.Z.; Writing – Original Draft Preparation, S.Z.; Writing – Review and Editing, S.Z., Y.H. and F.M.; Supervision, F.M.; Funding Acquisition, F.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was jointly supported by the Young Scientists Fund of the National Natural Science Foundation of China (No. 42401402), the Shenzhen Science and Technology Innovation Commission (grant No. 20231120191328001), and the Young Scientists Fund of the Guangdong Regional Joint Fund (No. 2023A1515110722).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. JTG B01-2014; Technical Standard of Highway Engineering. People’s Communications Press: Beijing, China, 2014.
2. JTG D20-2017; Design Specification for Highway Alignment. People’s Communications Press: Beijing, China, 2017.
3. Al-Homoud, A.; Masanat, Y. A classification system for the assessment of slope stability of landforms along highway routes in Jordan. Environ. Geol. 1998, 34, 59–69.
4. Mihai, B.A.; Robert, D.; Savulescu, I. Geomorphotechnical Map for Railway Mainline Infrastructure Improvement. A case study from Romania. Géomorphologie Relief Process. Environ. 2014, 20, 79–90.
5. Hearn, G.; Pettifer, G. The role of engineering geology in the route selection, design and construction of a road across the Blue Nile gorge, Ethiopia. Bull. Eng. Geol. Environ. 2015, 75, 163–191.
6. Kadi, F.; Yildirim, F.; Saralioglu, E. Risk analysis of forest roads using landslide susceptibility maps and generation of the optimum forest road route: A case study in Macka, Turkey. Geocarto Int. 2019, 36, 1612–1629.
7. Sharma, S.; Bansal, V.K. Location-based planning and scheduling of highway construction projects in hilly landform using GIS. Can. J. Civ. Eng. 2018, 45, 570–582.
8. Choi, J.; Kim, S.; Heo, T.-Y.; Lee, J. Safety effects of highway landform types in vehicle crash model of major rural roads. KSCE J. Civ. Eng. 2011, 15, 405–412.
9. Oettl, D.; Sturm, P.; Pretterhofer, G.; Bacher, M.; Rodler, J.; Almbauer, R. Lagrangian Dispersion Modeling of Vehicular Emissions from a Highway in Complex landform. J. Air Waste Manag. Assoc. 2003, 53, 1233–1240.
10. Jia, X.; Dang, W.; Liu, F.; Qingmiao, D. Evaluation of Highway Construction Impact on Ecological Environment of QingHai-Tibet Plateau. Environ. Eng. Manag. J. 2020, 19, 1157–1166.
11. Li, C.F.; Li, T.B.; Lan, F.; Wang, J.F.; Ren, Y.; Kou, X.M. Research on the distribution law of geohazards along the highways in the Western Sichuan plateau gradient zone. Front. Earth Sci. 2025, 13, 1536412.
12. Yang, H.Z.; Dong, J.Y.; Guo, X.L. Geohazards and risk assessment along highway in Sichuan Province, China. J. Mt. Sci. 2023, 20, 1695–1711.
13. Tempa, K.; Chettri, N.; Aryal, K.R.; Gautam, D. Geohazard vulnerability and condition assessment of the Asian highway AH-48 in Bhutan. Geomat. Nat. Hazards Risk 2021, 12, 2904–2930.
14. Dong, Y.T.; Liu, B.B.; Zhang, L.; Liao, M.S.; Zhao, J. Fusion of Multi-Baseline and Multi-Orbit InSAR DEMs with landform Feature-Guided Filter. Remote Sens. 2018, 10, 1511.
15. Liu, S.J.; Tang, H.C.; Feng, Y.J.; Chen, Y.L.; Lei, Z.K.; Wang, J.F.; Tong, X.H. A Comparative Study of DEM Reconstruction Using the Single-Baseline and Multibaseline InSAR Techniques. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 8512–8521.
16. Zhou, F.-B.; Meng, F.-Y.; Zou, L.-H.; Zhang, S.-S. A method for automatic classifying strip landforms in road area. J. Highw. Transp. Res. Dev. 2020, 37, 50–56.
17. Lei, J.; Li, S.; Fan, D.; Zhou, H.; Gu, F.; Qiu, Y.; Xu, B.; Liu, S.; Du, W.; Yan, Z.; et al. Classification and regionalization of the forming environment of windblown sand disasters along the Tarim Desert Highway. Sci. Bull. 2009, 53, 1–7.
18. Qi, H.-L.; Tian, W.-P.; Zhang, X.-R. Index system of landform regionalization for highway in China. Chang’an Daxue Xuebao (Ziran Kexue Ban)/J. Chang. Univ. (Nat. Sci. Ed.) 2011, 31, 33–38. (In Chinese)
19. Wang, J.; Li, K.; Shao, Y.; Zhang, F.; Wang, Z.; Guo, X.; Qin, Y.; Liu, X. Analysis of Combining SAR and Optical Optimal Parameters to Classify Typhoon-Invasion Lodged Rice: A Case Study Using the Random Forest Method. Sensors 2020, 20, 7346.
20. Singh, J.; Thakur, D.; Ali, F.; Gera, T.; Kwak, K.S. Deep Feature Extraction and Classification of Android Malware Images. Sensors 2020, 20, 7013.
21. Sayin, M.O.; Lin, C.-W.; Shiraishi, S.; Shen, J.; Basar, T. Information-Driven Autonomous Intersection Control via Incentive Compatible Mechanisms. IEEE Trans. Intell. Transp. Syst. 2019, 20, 912–924.
22. Shieh, J.L.; Haq, Q.M.U.; Haq, M.A.; Karam, S.; Chondro, P.; Gao, D.Q.; Ruan, S.J. Continual Learning Strategy in One-Stage Object Detection Framework Based on Experience Replay for Autonomous Driving Vehicle. Sensors 2020, 20, 6777.
23. Veronesi, F.; Hurni, L. Random Forest with semantic tie points for classifying landforms and creating rigorous shaded relief representations. Geomorphology 2014, 224, 152–160.
24. Qin, C.Z.; Zhu, A.X.; Qiu, W.L.; Lu, Y.J.; Li, B.-L.; Pei, T. Mapping soil organic matter in small low-relief catchments using fuzzy slope position information. Geoderma 2012, 171–172, 64–74.
25. Zhao, W.F.; Xiong, L.Y.; Ding, H.; Tang, G.-A. Automatic recognition of loess landforms using Random Forest method. J. Mt. Sci. 2017, 14, 885–897.
26. Csatáriné Szabó, Z.; Mikita, T.; Négyesi, G.; Varga, O.G.; Burai, P.; Takács-Szilágyi, L.; Szabó, S. Uncertainty and Overfitting in Fluvial Landform Classification Using Laser Scanned Data and Machine Learning: A Comparison of Pixel and Object-Based Approaches. Remote Sens. 2020, 12, 3652.
27. Huda, S.; Yearwood, J.; Jelinek, H.F.; Hassan, M.M.; Fortino, G.; Buckland, M. A Hybrid Feature Selection with Ensemble Classification for Imbalanced Healthcare Data: A Case Study for Brain Tumor Diagnosis. IEEE Access 2016, 4, 9145–9154.
28. Zhang, Y.; Zhang, H.; Zhang, X.; Qi, D. Deep Learning Intrusion Detection Model Based on Optimized Imbalanced Network Data. In Proceedings of the 2018 IEEE 18th International Conference on Communication Technology (ICCT), Chongqing, China, 8–11 October 2018; pp. 1128–1132.
29. Gao, W.; Huang, L.; Liu, S.; Dai, C. Artificial Bee Colony Algorithm Based on Information Learning. IEEE Trans. Cybern. 2015, 45, 2827–2839.
30. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
31. Batista, G.E.A.P.A.; Prati, R.C.; Monard, M.C. A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explor. Newsl. 2004, 6, 20–29.
32. Lin, W.C.; Tsai, C.F.; Hu, Y.H.; Jhang, J.S. Clustering-based undersampling in class-imbalanced data. Inf. Sci. 2017, 409–410, 17–26.
33. Chawla, N.; Bowyer, K.; Hall, L.; Kegelmeyer, W. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. (JAIR) 2002, 16, 321–357.
34. Tao, X.M.; Zhang, D.X.; Hao, S.Y.; Xu, P. Fault detection based on spectral clustering combined with under-sampling SVM under unbalanced datasets. Zhendong Yu Chongji/J. Vib. Shock 2013, 32, 30–36.
35. Wang, S.; Minku, L.L.; Yao, X. Resampling-Based Ensemble Methods for Online Class Imbalance Learning. IEEE Trans. Knowl. Data Eng. 2015, 27, 1356–1368.
36. Han, H.; Wang, W.-Y.; Mao, B.-H. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning. In Proceedings of the International Conference on Intelligent Computing, Hefei, China, 23–26 August 2005; Volume 3644, pp. 878–887.
37. Tao, X.; Li, Q.; Ren, C.; Guo, W.; He, Q.; Liu, R.; Zou, J. Affinity and class probability-based fuzzy support vector machine for imbalanced data sets. Neural Netw. 2020, 122, 289–307.

Figure 1. Illustration of the data and sample preprocessing for a representative sub-region of the study areas. (A) The 5 m resolution digital elevation model (DEM) for a segment of a highway corridor. (B) A detailed hillshade rendering of the sub-region highlighted in (A), which accentuates topographic relief. (C) The resulting spatial distribution of labeled landform samples for the same area, which serve as the ground truth for model training and testing. The legend indicates the five classified landform types.

Figure 2. Methodological framework of the proposed improved Random Forest (RF) classification method. The workflow consists of three main stages: (1) data preprocessing, where the initial imbalanced sample dataset is prepared from the DEM; (2) training the classification model, where a hybrid sampling strategy, combining the ODR algorithm for undersampling and the SVM-SMOTE algorithm for oversampling, is applied to generate balanced training subsets for each decision tree; (3) final classification, where the ensemble of trained trees votes to produce the final landform map.

Figure 3. Improved RF algorithm.

Figure 4. Sampling parameter α and classification performance.

Figure 5. Visual comparison of the landform classification maps generated by the three methods, overlaid on a DEM-derived hillshade to provide topographic context. The figure shows the results for the four distinct highway corridor study areas: (a) Study Area A, (b) Study Area B, (c) Study Area C, and (d) Study Area D. Note the significant incompleteness (unclassified areas) in the rule-based results and the enhanced spatial continuity and coherence of the improved RF method compared to the traditional RF.

Figure 6. The ROC curves of the different landform types. The landform types are valley (A), plain (B), mountain (C), lightly undulating (D), and rolling area (E), and F denotes the macro-average of all landform types. The improved RF method shows a sharp increase in TPR at low FPR, resulting in a visible knee in the ROC curve, which reflects the effect of the hybrid sampling strategy on minority class detection.

Table 1. Rules for highway landform classification (according to Zhou et al. [16]).

Landform Type	Slope (°)	Relative Elevation (m)	Relief (m)
valley	3–7	/	<50
plain	<3	/	<30
mountain	>20	>200	>120
lightly undulating	3–20	<100	50–120
rolling area	>20	100–200	50–120

Table 2. Sample size and distribution for each landform type in the combined dataset.

Landform Types	Training Dataset	Test Dataset	Total	Percentage
Valley	1880	470	2350	2.85%
Plain	1808	452	2260	2.74%
Mountain	1984	496	2480	3.01%
Lightly undulating	29,480	7370	36,850	44.70%
Rolling area	30,808	7702	38,510	46.70%
Total	65,960	16,490	82,450	100%

Table 3. Optimized parameters for the classification model.

Model	Parameter	Optimal Value
RF algorithm	nFeatures	2
RF algorithm	nSamples	2
RF algorithm	nTrees	500
Hybrid sampling algorithm	k	5
Hybrid sampling algorithm	α	[0.4, 0.6]

Table 4. Confusion matrix of test set classification results.

	Valley	Plain	Mountain	Lightly Undulating	Rolling Area
Valley	58	1	0	0	0
Plain	0	56	0	0	0
Mountain	0	0	61	1	0
Lightly undulating	1	2	1	915	32
Rolling area	2	0	2	28	954
Precision (%)	96.5	95.9	96.1	98.0	99.1
Recall (%)	95.3	96.1	95.9	98.3	98.2
F-score (%)	95.9	96.5	96.0	98.1	98.6
Overall accuracy = 0.97; G-mean = 0.95

Table 5. Proportional landform coverage and incompleteness rate for each method across the four study areas.

Study Area	Method	Valley (%)	Plain (%)	Mountain (%)	Lightly Undulating (%)	Rolling Area (%)	Total (%)	Incomplete (%)
Study Area A	Rule-based method	23.6	1.0	5.6	13.8	44.1	88.1	11.9
Study Area A	Traditional RF	35.0	1.0	5.6	14.3	44.1	100.0	0.0
Study Area A	Improved RF	33.5	5.2	5.8	13.1	42.3	100.0	0.0
Study Area B	Rule-based method	20.2	11.6	1.3	14.2	39.2	86.5	13.5
Study Area B	Traditional RF	33.1	11.6	1.3	14.9	39.2	100.0	0.0
Study Area B	Improved RF	34.5	13.8	1.5	12.9	37.3	100.0	0.0
Study Area C	Rule-based method	15.2	8.1	3.1	14.4	43.5	84.3	15.7
Study Area C	Traditional RF	30.7	8.1	3.1	14.6	43.5	100.0	0.0
Study Area C	Improved RF	29.6	9.7	4.6	15.5	40.6	100.0	0.0
Study Area D	Rule-based method	13.8	6.8	2.3	10.9	48.7	82.4	17.6
Study Area D	Traditional RF	31.3	6.8	2.3	11.0	48.7	100.0	0.0
Study Area D	Improved RF	31.1	6.7	2.2	9.8	50.1	100.0	0.0

Table 6. Overall accuracy comparison.

	Precision	Recall	F-Score
Traditional RF	83.8%	87.1%	86.5%
Improved RF	97.1%	96.8%	97.0%

Table 7. Per-class accuracy comparison.

Type	Method	Precision (%)	Recall (%)	F-Score (%)
Valley	Traditional RF	78.0	85.0	81.5
Valley	Improved RF	96.5	95.3	95.9
Plain	Traditional RF	80.0	86.7	83.2
Plain	Improved RF	95.9	96.1	96.5
Mountain	Traditional RF	82.0	87.0	84.5
Mountain	Improved RF	96.1	95.9	96.0
Lightly undulating	Traditional RF	86.0	91.3	88.6
Lightly undulating	Improved RF	98.0	98.3	98.1
Rolling area	Traditional RF	87.0	89.5	88.2
Rolling area	Improved RF	99.1	98.2	98.6

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

