A Short-Term Forecasting Method for High-Frequency Broadcast MUF Based on LSTM

Abstract: This paper proposes a short-term forecasting method for high-frequency broadcast Maximum Usable Frequency (MUF) based on Long Short-Term Memory (LSTM) to meet the demand for refined and precise high-frequency broadcast coverage. Based on the existing infrastructure of broadcast and television stations, we established an experimental verification system and collected data for approximately three years. Two links were selected based on data quality: Urumqi to Lhasa and Lanzhou to Lhasa. A short-term forecast of MUF was conducted using the data from these two links, and the forecasting results of our model were compared with those of the REC533 model. The proposed method outperforms the REC533 forecasting results overall, with a reduction in root mean square error (RMSE) of 0.66 MHz and an improvement in forecast accuracy of 14.77%. The comparison demonstrates the feasibility and accuracy of our model.


Introduction
High-frequency (HF) radio can achieve long-distance communication without relays and is widely used in amateur radio, military and government communications, global maritime distress and safety system communications, and radar detection [1]. HF skywave communication, one of the oldest communication methods, uses one or more reflections from the ionosphere to achieve ultra-long-distance or even global communication [2], and offers flexible networking, high mobility, and strong invulnerability [3]. These properties have kept HF communication an active research topic [4].
As communication services demand greater immediacy and reliability, efficiently identifying an available propagation frequency has become a significant challenge [5]. In particular, the selection of an HF working frequency hinges on the Maximum Usable Frequency (MUF), which the International Telecommunication Union (ITU) defines as the highest frequency that allows acceptable radio circuit performance by ionospheric propagation between given terminals at a given time and under specified conditions [6][7][8].
In recent years, coordination efforts in HF broadcasting have placed higher demands on the refinement and quality of HF broadcast services [9]. HF broadcast coverage analysis often relies on tools such as REC533 [10] and VOACAP [11], which are based on the International Reference Ionosphere (IRI) model. Because the IRI model lacks ionospheric environmental data and real-time sounding information for China and its surrounding regions, the accuracy of HF broadcast coverage analysis in these areas is limited. There is therefore an urgent need to integrate multi-modal HF broadcast data with the channel characteristics of the areas of concern, in order to refine HF broadcast quality assessment and improve the precision of frequency scheduling.
ITU-R P.533 [10], the internationally recommended prediction method for HF circuit performance, can be used for short-term prediction of MUF. The method first determines the location of the link's reflection point from the transmitting and receiving locations of the shortwave link. It then performs a short-term prediction of the ionospheric parameters foF2 and M3000F2 above the reflection point. Finally, the predicted ionospheric parameters are used to compute the MUF. In other words, the MUF is predicted indirectly, through the prediction of ionospheric parameters such as foF2 and M3000F2. Methods for short-term prediction of ionospheric parameters include neural network methods [12,13], machine learning methods [14,15], and empirical orthogonal function analysis [16]. Wang et al. [17] used the Volterra filter method to forecast M3000F2 and performed regional reconstruction to achieve accurate MUF forecasting on the basis of the ITU model.
Several MUF forecasting methods are currently in use domestically and internationally. The autocorrelation function method [18], based on 20-30 days of data, can achieve high forecasting accuracy for periods within 3 h. The artificial neural network method [19,20] requires 5-7 days of ionospheric observation data, as well as geomagnetic K-index and Ap-index information, to forecast usable frequencies. Yu et al. [21] proposed an MUF prediction model based on federated learning, which has advantages in protecting user privacy, reducing communication loss, and shortening training time. Wang et al. [3] achieved accurate long-term forecasting of MUF by reconstructing the MUF propagation factor using statistical machine learning methods. The Kalman filtering method [22,23] combines background field data with actual sounding data to forecast usable frequencies. Its advantage is that it requires less historical data; its drawback is that prediction accuracy and stability decrease when no new sounding data are introduced. The ionospheric storm prediction method [24] establishes a forecasting model that primarily targets disturbances in the ionosphere. Because it requires an initial assessment of the ionospheric disturbance before storm-time predictions can be made, it lacks universality.
This paper constructs a short-term prediction model for the MUF using the LSTM method, based on actual ionospheric sounding data, to meet the operational demand for fine-scale forecasting of HF broadcasts. The proposed model predicts MUF directly from the measured data of the link. It omits the intermediate process of first predicting ionospheric parameters such as foF2 in the short term and then calculating MUF from them, which reduces the error introduced by the intermediate calculation. The forecast accuracy of this model surpasses that of the ITU model, providing strong support for precise coverage of HF broadcasts. The structure of this paper is as follows: Section 1 analyzes the necessity and current status of MUF forecasting. Section 2 outlines the implementation steps of MUF forecasting for operational systems and the methods for collecting ionospheric parameters. Section 3 describes the modeling process of MUF forecasting based on LSTM. Section 4 analyzes the forecast accuracy of the two methods. Section 5 summarizes the research findings of this paper.

Overall Approach
To validate the accuracy of the proposed LSTM-based short-term MUF forecasting method for HF broadcasting, we set up an experimental verification system using the existing infrastructure of radio and television stations. The specific process is as follows: (1) Set up antennas, calibrate antennas, test connectivity, and carry out preparations for the test.

Testing Time and Testing Location
Receiving points established at broadcast stations (Nanchang, Shijiazhuang, Golmud, Lhasa) receive the ionospheric oblique sounding signals [25] transmitted by stations in the key areas of concern in western China (Urumqi, Kashgar, Yili, Lanzhou). Together with the accumulated ionospheric data, these measurements are used to validate the LSTM-based short-term forecasting method for HF broadcast MUF. Figure 2 shows a schematic diagram of the locations of the constructed test network used for model input.
The experiment lasted a total of 3 years, from 2017 to 2019. Testing was continuous except under particular circumstances, such as power outages. The locations of the transmitting and receiving stations for the test are shown in Table 1.

Table 1. The locations of the transmitting and receiving stations.

Serial Number Receiving Station Locations Transmitting Station Locations
Lhasa Lanzhou

Data Collection
The ionospheric oblique sounding data mainly include:
(1) fminE: the minimum frequency of the E layer; PminE: the group distance corresponding to the minimum frequency of the E layer;
(2) mufE: the maximum usable frequency of the E layer; PmufE: the group distance corresponding to the maximum usable frequency of the E layer;
(3) fminEs: the minimum frequency of the Es layer; PminEs: the group distance corresponding to the minimum frequency of the Es layer;
(4) mufEs: the maximum usable frequency of the Es layer; PmufEs: the group distance corresponding to the maximum usable frequency of the Es layer;
(5) fminF1: the minimum frequency of the ordinary wave in the F1 layer; PminF1: the group distance corresponding to the minimum frequency of the ordinary wave in the F1 layer;
(6) mufF1: the maximum usable frequency of the ordinary wave in the F1 layer; PmufF1: the group distance corresponding to the maximum usable frequency of the ordinary wave in the F1 layer;
(7) fminF2: the minimum frequency of the ordinary wave in the F2 layer; PminF2: the group distance corresponding to the minimum frequency of the ordinary wave in the F2 layer;
(8) mufF2: the maximum usable frequency of the ordinary wave in the F2 layer; PmufF2: the group distance corresponding to the maximum usable frequency of the ordinary wave in the F2 layer;
(9) fhminF2: the minimum frequency of the high-angle mode of the ordinary wave in the F2 layer; PhminF2: the group distance corresponding to the minimum frequency of the high-angle mode of the ordinary wave in the F2 layer.
Note: the frequency parameters above are given in MHz, with a measurement accuracy of 0.1 MHz. The data collection interface of a typical ionospheric oblique sounding is shown in Figure 3.
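One way to hold a single sounding in software is a flat record keyed by the parameter names listed above. The sketch below is only an illustration: the dict-based layout and the `parse_record` helper are hypothetical, and the only facts taken from the text are the parameter names and the 0.1 MHz frequency resolution.

```python
# Parameter names follow the list above; the record layout itself is hypothetical.
FREQ_FIELDS = ["fminE", "mufE", "fminEs", "mufEs", "fminF1", "mufF1",
               "fminF2", "mufF2", "fhminF2"]
DIST_FIELDS = ["PminE", "PmufE", "PminEs", "PmufEs", "PminF1", "PmufF1",
               "PminF2", "PmufF2", "PhminF2"]


def parse_record(raw):
    """Return a dict holding all 18 parameters; absent ones become None."""
    record = {}
    for name in FREQ_FIELDS:
        value = raw.get(name)
        # Frequencies are reported in MHz with 0.1 MHz resolution.
        record[name] = None if value is None else round(float(value), 1)
    for name in DIST_FIELDS:
        value = raw.get(name)
        # Group distances are kept as plain floats (unit not assumed here).
        record[name] = None if value is None else float(value)
    return record


sample = parse_record({"mufF2": "18.34", "PmufF2": "2450", "fminF2": 9.0})
print(sample["mufF2"], sample["PmufF2"], sample["mufE"])  # 18.3 2450.0 None
```

Keeping missing layers as `None` makes it explicit which traces were absent in a given sounding, which matters later when valid entries are counted.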

Testing Data Information
Links suitable for short-term prediction analysis were selected according to the completeness of the data collected on the test links. Two links, Lanzhou to Lhasa and Urumqi to Lhasa, have enough data for this purpose. The 3-year dataset is divided as follows: the data from January 2017 to November 2018 form the training set, the data from December 2018 serve as the initial input for model prediction, and the data from 2019 form the test set. Table 2 shows the test set data information of the two links.
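The chronological split described above can be sketched as follows. The date boundaries come from the text; the `(timestamp, muf)` record layout, the function name, and the sample values are assumptions made only for illustration.

```python
from datetime import datetime

# Boundaries taken from the text; everything else here is hypothetical.
TRAIN_END = datetime(2018, 12, 1)   # training set: January 2017 to November 2018
TEST_START = datetime(2019, 1, 1)   # test set: all of 2019


def split_by_time(records):
    """Split (timestamp, muf_mhz) records into train / warm-up / test lists."""
    train, warmup, test = [], [], []
    for ts, muf in sorted(records):
        if ts < TRAIN_END:
            train.append((ts, muf))
        elif ts < TEST_START:
            warmup.append((ts, muf))   # December 2018: initial model input
        else:
            test.append((ts, muf))
    return train, warmup, test


records = [
    (datetime(2017, 3, 1, 6), 14.2),
    (datetime(2018, 11, 30, 23), 11.8),
    (datetime(2018, 12, 15, 12), 16.5),
    (datetime(2019, 6, 1, 12), 19.1),
]
train, warmup, test = split_by_time(records)
print(len(train), len(warmup), len(test))  # 2 1 1
```

Splitting by time, and not at random, keeps every test sample strictly later than the training data, which is what a forecasting evaluation requires.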

MUF Forecasting Method of REC533
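The basic MUF of the E layer can be represented as

$$E(d)\mathrm{MUF} = \mathrm{foE} \cdot M_E, \qquad M_E = 3.94 + 2.80x - 1.70x^2 - 0.60x^3 + 0.96x^4,$$

where $x = \min\left(\frac{d - 1150}{1150},\ 0.74\right)$, $d$ represents the great-circle distance of the path (km), and foE is the E-layer critical frequency at the midpoint of the path (MHz).
When the propagation path distance $d \le d_{\max}$, radio wave propagation occurs in a single-hop mode, with the midpoint of the path as the control point. In this case, the basic MUF of the F2 layer can be expressed as

$$n_0\mathrm{F2}(d)\mathrm{MUF} = \left[1 + \frac{C_d}{C_{3000}}\,(B - 1)\right]\mathrm{foF2} + \frac{f_H}{2}\left(1 - \frac{d}{d_{\max}}\right),$$

where $n_0$ represents the minimum initial hop count; $C_{3000}$ represents the value of $C_d$ when $d$ is 3000 km; foF2 denotes the F2-layer critical frequency at the midpoint of the path (MHz); and $f_H$ represents the gyrofrequency at the midpoint of the path (MHz).
When the propagation path distance $d > d_{\max}$, radio wave propagation occurs in a multi-hop mode, with the control points located at a distance of $d_0/2$ from the transmitter and from the receiver. In this case, the basic MUF of the F2 layer is given by

$$n_0\mathrm{F2}(d)\mathrm{MUF} = \min\left(\mathrm{F2}(d_{\max})\mathrm{MUF}_1,\ \mathrm{F2}(d_{\max})\mathrm{MUF}_2\right),$$

where $\mathrm{F2}(d_{\max})\mathrm{MUF}_1$ and $\mathrm{F2}(d_{\max})\mathrm{MUF}_2$ represent the minimum-hop-mode $\mathrm{F2}(d_{\max})\mathrm{MUF}$ at the two control points, respectively.
The auxiliary quantities $C_d$, $B$, and $d_{\max}$, together with the full calculation process, are given in the ITU-R P.533 recommendation [10].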
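The REC533 relations in this section can be sketched in a few lines. This is a minimal illustration and not the ITU reference implementation. The function names are hypothetical, the E-layer polynomial is the commonly published ITU-R form and should be checked against the Recommendation, and the F2 auxiliary quantities (the ratio C_d/C_3000, the factor B, and d_max) are passed in as inputs instead of being computed. All numeric inputs are made-up values.

```python
def e_layer_basic_muf(foe_mhz, d_km):
    """E-layer basic MUF: foE times a distance-dependent factor M_E."""
    x = min((d_km - 1150.0) / 1150.0, 0.74)
    # Polynomial coefficients: assumed from the published ITU-R expression.
    m_e = 3.94 + 2.80 * x - 1.70 * x**2 - 0.60 * x**3 + 0.96 * x**4
    return foe_mhz * m_e


def f2_single_hop_muf(fof2_mhz, f_h_mhz, d_km, d_max_km, cd_ratio, b):
    """F2 basic MUF for d <= d_max; cd_ratio stands for C_d / C_3000."""
    if d_km > d_max_km:
        raise ValueError("single-hop formula applies only for d <= d_max")
    return ((1.0 + cd_ratio * (b - 1.0)) * fof2_mhz
            + 0.5 * f_h_mhz * (1.0 - d_km / d_max_km))


def f2_multi_hop_muf(muf_cp1_mhz, muf_cp2_mhz):
    """For d > d_max, take the lower F2(d_max)MUF of the two control points."""
    return min(muf_cp1_mhz, muf_cp2_mhz)


print(round(e_layer_basic_muf(3.0, 1150.0), 2))                         # 11.82
print(round(f2_single_hop_muf(8.0, 1.2, 2000.0, 4000.0, 0.9, 3.0), 2))  # 22.7
print(f2_multi_hop_muf(21.4, 19.8))                                     # 19.8
```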

Modeling Process
The LSTM algorithm [26], originally proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997, is a specific form of Recurrent Neural Network (RNN). RNN is a general term for a family of neural networks capable of processing sequential data. Like a regular neural network, an RNN comprises input, hidden, and output layers, but the nodes of its hidden layer are also connected across time steps [27]. The LSTM algorithm can learn long-term dependencies and was designed primarily to address the vanishing and exploding gradient problems that arise when training on long sequences. The LSTM deep neural network contains three types of gates [28]: the forget gate, the input gate, and the output gate. These gates control, respectively, how much of the previous memory state is kept, how much new input is written, and how much information is output, so that the network can better learn long-distance dependencies. As a result, LSTM outperforms traditional RNNs on longer sequence problems.
Step one: The forget gate determines what information to discard from the cell state. It reads the output $h_{t-1}$ of the previous time step and the current input $x_t$, maps them through a sigmoid function into the range [0, 1], and thereby decides how much of the information left from the previous time step to retain:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right),$$

where $f_t$ is the forget gate output, $W_f$ is the weight, and $b_f$ is the bias.
Step two: The input gate adds new information to the cell state. The input gate determines the update value $i_t$ by passing the input and the previous hidden state through a sigmoid function, and a new candidate value $\tilde{C}_t$ is generated using the tanh function:

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right),$$

$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right),$$

where $W_i$ and $W_C$ are weights, and $b_i$ and $b_C$ are biases.
Step three: The results of the previous two steps are combined. The old cell state $C_{t-1}$ is multiplied by the output of the forget gate, and the product of the input gate output and the candidate value is added to it, giving the updated cell state $C_t$:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t.$$

Step four: The output gate determines how much of the cell state to output. First, the output gate passes the input and the previous hidden state through a sigmoid function to obtain the output value $o_t$. Then the updated cell state $C_t$ is normalized by the tanh function and multiplied by $o_t$ to obtain the result $h_t$ at time $t$:

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right),$$

$$h_t = o_t \odot \tanh(C_t),$$

where $W_o$ and $b_o$ represent the weight and bias, respectively.
The flowchart is shown in Figure 4.
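The four steps can be written out as a single LSTM time step. The sketch below uses only the Python standard library and plain lists so that each gate is visible; the function names, the parameter dictionary, and the zero-valued weights in the example are illustrative assumptions, not the configuration used in this paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vec):
    """Compute W . vec + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p holds W_f, W_i, W_C, W_o and b_f, b_i, b_C, b_o."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["W_f"], p["b_f"], z)]      # step one
    i_t = [sigmoid(a) for a in affine(p["W_i"], p["b_i"], z)]      # step two
    c_hat = [math.tanh(a) for a in affine(p["W_C"], p["b_C"], z)]
    c_t = [f * c + i * g                                           # step three
           for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    o_t = [sigmoid(a) for a in affine(p["W_o"], p["b_o"], z)]      # step four
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


hidden, inputs = 1, 2
params = {name: [[0.0] * (hidden + inputs) for _ in range(hidden)]
          for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_C", "b_o")})
h, c = lstm_step([0.3, -0.1], [0.0], [2.0], params)
print(c)  # [1.0]
```

With all weights and biases set to zero every gate outputs 0.5 and the candidate value is 0, so the cell state is simply halved. This makes the example easy to check by hand.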

Validate Analysis Method
The links are forecast under the same conditions, and the forecasts are compared with the measured results. The root mean square error (RMSE, in MHz) between the forecast values and the measured values for each hour across the network is

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_{o,i} - Y_{p,i}\right)^2},$$

where $N$ represents the number of measurement points, $Y_o$ represents the actual measured data, and $Y_p$ represents the forecasted data. Taking REC533 as the baseline, the percentage improvement of the improved core algorithm model is defined as

$$R_{\mathrm{RMSE}} = \frac{\sigma_{\mathrm{REC533}} - \sigma_P}{\sigma_{\mathrm{REC533}}} \times 100\%,$$

where $\sigma_{\mathrm{REC533}}$ and $\sigma_P$ correspond to the RMSE of REC533 and of the improved core algorithm model, respectively.
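The two metrics can be computed as in the sketch below. It assumes that the improvement percentage is the RMSE reduction relative to the REC533 RMSE, which is consistent with the figures reported later in the paper; the function names and sample values are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between measured and forecast MUF (MHz)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_percent(sigma_rec533, sigma_model):
    """Relative RMSE reduction of the model with respect to REC533, in percent."""
    return (sigma_rec533 - sigma_model) / sigma_rec533 * 100.0


measured = [10.0, 12.0, 14.0]   # made-up values in MHz
forecast = [11.0, 12.0, 13.0]
print(round(rmse(measured, forecast), 4))   # 0.8165
print(improvement_percent(4.0, 3.0))        # 25.0
```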

Analysis of Forecast Results Comparison
The short-term forecast comparison is carried out by collecting measured MUF data and using the measured data from the previous day to forecast the MUF of adjacent days. The input data of the LSTM forecast model include the sunspot number, the longitude and latitude of the transmitting and receiving locations, the time (year, month, day, and hour), and the MUF; the output is the monthly median value of MUF at the current moment. Finally, the prediction results of the proposed model for 2019 are compared with the MUF values forecast by REC533 and with the measured data to analyze the accuracy of the forecasts.
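A possible way to arrange these inputs for a sequence model is sketched below. The feature order, the 24-hour window, the use of the next hourly MUF value as the target, the approximate station coordinates, and the constant sunspot number are all assumptions for illustration; the paper does not specify them.

```python
from datetime import datetime, timedelta


def feature_row(ssn, tx, rx, ts, muf_mhz):
    """One input vector: sunspot number, tx/rx lon-lat, time fields, MUF."""
    tx_lon, tx_lat = tx
    rx_lon, rx_lat = rx
    return [ssn, tx_lon, tx_lat, rx_lon, rx_lat,
            ts.year, ts.month, ts.day, ts.hour, muf_mhz]


def make_samples(rows, window=24):
    """Pair each window of past rows with the MUF of the row that follows it."""
    return [(rows[k - window:k], rows[k][-1])
            for k in range(window, len(rows))]


LANZHOU = (103.8, 36.1)   # approximate lon, lat in degrees (illustrative)
LHASA = (91.1, 29.7)
start = datetime(2018, 12, 1)
rows = [feature_row(10.0, LANZHOU, LHASA, start + timedelta(hours=h),
                    12.0 + 0.1 * (h % 24))
        for h in range(30)]
samples = make_samples(rows)
print(len(samples), len(samples[0][0]), len(samples[0][0][0]))  # 6 24 10
```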
(1) Urumqi to Lhasa
Test data from the Urumqi to Lhasa link were collected. The monthly median MUF calculated by the improved core algorithm model under this test condition, the MUF values calculated by REC533, and the statistical monthly median of the measured data were compared and analyzed. Missing test data during the forecasting process were supplemented with long-term forecast data. The comparison curves are shown in Figure 5, and the statistical analysis results are presented in Table 3. The comparison of the improved core algorithm model results, the REC533-calculated MUF values, and the actual measurement data for the Urumqi to Lhasa link reveals the following:
1. The testing process adhered to the relevant standards and methods, such as GJB. The test curves are smooth overall except for June, indicating the validity of the measurement results.
2. The REC533-calculated results generally follow the overall trend of the actual measurement data. The prediction is better in the morning and evening, while the forecast deviates more from the measured data around noon. Excluding June, the maximum monthly root mean square error (RMSE) is 3.29 MHz.
3. The improved core algorithm results closely match the overall curve of the actual measurement data, with a maximum monthly RMSE of 3.01 MHz. Overall, the RMSE is 0.39 MHz lower than that of the REC533 forecast, corresponding to an accuracy improvement of 8.06%.
(2) Lanzhou to Lhasa
Test data from the Lanzhou to Lhasa link were collected. The monthly median MUF calculated by the improved core algorithm model under this test condition, the MUF values calculated by REC533, and the statistical monthly median of the measured data were compared and analyzed. Missing test data during the forecasting process were supplemented with long-term forecast data. The comparison curves are shown in Figure 6, and the statistical analysis results are presented in Table 4. The comparison of the improved core algorithm model results, the REC533-calculated MUF values, and the actual measurement data for the Lanzhou to Lhasa link reveals the following:
1. The testing process adhered to the relevant standards and methods, such as GJB. The test curves are smooth overall, indicating the validity of the measurement results.
2. The REC533 calculation results generally follow the trends observed in the actual measurement data. The prediction is better in the morning and evening hours, while at noon the REC533 statistical values are larger than the measured data. From May to August in particular, the measured data fluctuate greatly at noon due to solar activity, and the REC533 method shows a large deviation, with a maximum monthly RMSE of 6.34 MHz.
3. The results obtained from the improved core algorithm model closely match the curves of the actual measurement data, with a maximum RMSE of 5.26 MHz. Overall, the RMSE is 0.93 MHz lower than that of the REC533 forecast, corresponding to an accuracy improvement of 21.48%.
Compared with the Urumqi-Lhasa link, the short-term forecast method shows a more significant advantage on the Lanzhou-Lhasa link. The REC533 method uses long-term statistical data, so its prediction curve is relatively smooth over time, which reflects the method's reliance on statistical data with local correction. The short-term forecast method is derived from the training datasets, so the LSTM model depicts the forecast curve in more detail than the REC533 method, which reflects the advantage of the method constructed in this article. Averaged over the two links, the LSTM model reduces the RMSE by 0.66 MHz relative to REC533, and the RRMSE is 14.77%.
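The overall figures follow from a simple average of the per-link results reported above, as the short check below shows. The dictionary layout is only illustrative; the four numbers are those quoted in the text.

```python
# (RMSE reduction in MHz, accuracy improvement in %) per link, from the text.
links = {
    "Urumqi-Lhasa": (0.39, 8.06),
    "Lanzhou-Lhasa": (0.93, 21.48),
}

mean_reduction = sum(r for r, _ in links.values()) / len(links)
mean_improvement = sum(p for _, p in links.values()) / len(links)
print(round(mean_reduction, 2), round(mean_improvement, 2))  # 0.66 14.77
```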

Conclusions
The proposed LSTM-based short-term MUF forecasting method for HF broadcasting establishes a model suitable for the Chinese region. Actual test links were set up and data were collected for about three years, and two links with high-quality data, Urumqi to Lhasa and Lanzhou to Lhasa, were selected for short-term MUF forecasting. The forecast results were compared with those of the REC533 model. The proposed method outperforms the REC533 forecast results overall, with a reduction in root mean square error of 0.66 MHz and an improvement in forecast accuracy of 14.77%. These results verify the feasibility and high accuracy of the LSTM model, especially for short-term predictions during periods of intense solar activity. With the integration of more data resources and data from other monitoring stations, there is great potential to further improve the accuracy of regional MUF forecasting using this method. Future research will introduce multi-source data fusion methods into regional MUF forecasting, extend this algorithm, and compare it with other machine learning and deep learning algorithms.

(2) Collect shortwave channel parameter data with shortwave channel sounding equipment, mainly including MUF, time, transmit link position, and receive link position. (3) Preprocess the channel parameter data to ensure the validity of the data, divide it into a training set and a test set, build the model on the training set using the LSTM method, and then make predictions on the test set. (4) The forecast results are checked by cross-validation and compared with the forecast results of REC533 to verify the accuracy of this method. The process is shown in Figure 1.
Atmosphere 2024, 15, x FOR PEER REVIEW 3 of 12

Figure 2. Diagram illustrating the relationship between transmitting and receiving stations' locations and model input data.

Figure 3. A figure of typical ionospheric oblique sounding.

Figure 5. Comparison chart of the Urumqi to Lhasa oblique sounding link data.

Figure 6. Comparison chart of the Lanzhou to Lhasa oblique sounding link data.

Table 2. Test data information.
January 2019-12 December 2019: 16,560 test data entries in total, of which 12,132 are valid.

Table 3. Error statistics of the Urumqi to Lhasa oblique sounding link.

Table 4. Error statistics of the Lanzhou to Lhasa oblique sounding link.
