Modeling of Lake Malombe Annual Fish Landings and Catch per Unit Effort (CPUE)

Abstract: Forecasting from time series data has become one of the most relevant and effective tools for fisheries stock assessment. Autoregressive integrated moving average (ARIMA) modeling has been commonly used to predict the general trend of fish landings with increased reliability and precision. In this paper, ARIMA models were applied to predict Lake Malombe annual fish landings and catch per unit effort (CPUE). The annual fish landings and CPUE trends were first inspected, and both were non-stationary. First-order differencing was applied to transform the non-stationary data into stationary data. The autocorrelation function (ACF), partial autocorrelation function (PACF), Akaike information criterion (AIC), Bayesian information criterion (BIC), square root of the mean square error (RMSE), mean absolute error (MAE), percentage standard error of prediction (SEP), average relative variance (ARV), Gaussian maximum likelihood estimation (GMLE) algorithm, efficiency coefficient ($E_2$), coefficient of determination ($R^2$), and persistence index (PI) were estimated, which led to the identification and construction of ARIMA models suitable for explaining the time series and forecasting. According to the measures of forecasting accuracy, the best forecasting models for fish landings and CPUE were ARIMA (0,1,1) and ARIMA (0,1,0), respectively. These models had the lowest values of AIC, BIC, RMSE, MAE, SEP, and ARV, and the highest values of GMLE, PI, $R^2$, and $E_2$. The auto.arima() command in R version 3.6.3 also selected ARIMA (0,1,1) and ARIMA (0,1,0) as the best models. The selected models forecasted fish landings of 2725.243 metric tons and a CPUE of 0.097 kg/h by 2024.


Introduction
Lake Malombe is located within the Upper and Middle Shire River basin in the southern part of Malawi. The lake is one of the most important freshwater ecosystems in Malawi in terms of fishing resources [1]. It is mostly dominated by the Cichlidae, Clariidae, and Cyprinidae families, though genera such as Bathyclarias, Fossorochromis, Pseudotropheus, Caprichromis, and Brycinus are equally important. Lake Malombe has been well known in Africa for massive depletion of its fisheries resources since the 1960s [2]. The relative biomass index of the lake fishery decreased from 1.01 units in the 1970s to 0.3 units in 2016. Fish biomass landings also decreased from 44,000 metric tons in 1981 to 6000 metric tons in 2018. This evidence impelled the Malawi government to adopt a command-and-control approach to reverse the situation. The approach was stipulated in the guidelines of the Fisheries Act [3], formulated on the basis of the Laws of Malawi, Chapter 66:05 of 1974 and amended in 1976, 1977, 1979, 1984, 1996, and 1997, and included gear licensing, gear and mesh size regulations, a closed season, a ban on fine-meshed beach seines, and regulation of fishing effort. However, this approach faced strong resistance during implementation and failed to achieve its objective. In the 1980s, the fisheries co-management approach was introduced [4]. This approach was considered more holistic and assumed that the Lake Malombe fishing communities affected by the collapse of the fishery could find a sustainable solution through collaborative planning. It was also a response to the perceived failure of the command-and-control approach to prevent the decline of fish stocks in the lake and to the lack of government resources to manage the fisheries effectively. However, several researchers, fisheries experts, and experienced fishers have argued that the co-management approach in Lake Malombe was also a failure.
Hara [5] in his study noted that lack of information on social, cultural, and institutional factors underlying the exploitation patterns of lake fishery was the main reason for the failure of the co-management approach in Lake Malombe. It was further argued that the implementation arrangements of this approach also followed the top-down method, was donor-driven, not formed within a new institutional vacuum, and did not take into account the institutional landscape and diversity that impacted the functioning and performance of the structures [6].
The failure of fisheries co-management prompted the Malawi government to develop an ecosystem-based fisheries management approach. Forecasting total annual biomass landings and catch per unit effort (CPUE) is a basic component of this approach to managing the stocks effectively. For example, one of its main objectives is to set a practical fishing effort in a defined area during the known breeding period to allow stock replacement. Achieving this objective requires predicting uncontrollable biomass trends and the possible abundance of the stock biomass in the lake, which in turn requires the development and application of conceptual stock production models. These models are relevant because they are directly linked to the precautionary approach, a basic concept in fish stock management [7]. The Schaefer and Fox production models, also known as surplus production models and associated with maximum sustainable yield (MSY), have been commonly applied to depict the status of the fishery in Lake Malawi and other inland lakes. Although these models are accepted in fishery management, their application is still questionable. For example, they only depict the current picture of the fishery using time-series trends and lack predictive power. Other models, such as state-space and Bayesian models [8], have also been applied with little success. The time-series approach, however, has been indispensable for understanding natural resource systems and developing better management policies. It has been described as the best approach to fisheries modeling and stock assessment. It provides a feasible way to examine time series data and predict catches. The approach demands fewer biological assumptions than traditional fisheries models and, with simple mathematical techniques, can significantly reduce modeling and research costs.
Several researchers have recommended the time series approach in fisheries stock assessment. For example, Stergiou [9] highly recommended a time series approach and efficiently predicted the Mullidae fishery in the Eastern Mediterranean Sea. Selvaraj et al. [10] successfully developed autoregressive integrated moving average (ARIMA) models to predict fishery landings in the Colombian Pacific Ocean. Koutroumanidis et al. [11], however, combined ARIMA models and fuzzy expected intervals software to forecast fishery landings in Thessaloniki, Greece. Researchers such as Park [12], Georgakarakos et al. [13], Lioret et al. [14], Prista et al. [15], Raman et al. [16], and Stergiou [17] also preferred time series modeling in their predictions of the marine fishery, understanding the dynamics of the fishery, monitoring and forecasting fish landings in data-scarce regions. This study used ARIMA (p,d,q) models to forecast Lake Malombe fish landings and CPUE.

The Study Area
Lake Malombe (Figure 1) is located in the Shire River Basin, the largest and longest river basin in Malawi and the fourth-largest in Africa. The Shire River basin lies in the southern part of the East African Great Rift Valley system. The river originates from Lake Malawi at Samama and flows 400 km south and southeast to its confluence with the Zambezi River at Ziu Ziu in Mozambique, with an estimated catchment area of 18,945 km² divided into upper, middle, and lower sections. The upper section of the Shire River Basin (with a total channel bed drop of about 15 m over a distance of 130 km) lies between Mangochi and Machinga districts. From Mangochi, the Upper Shire River drains into Lake Malombe (coordinates 14°40′0″ S, 35°15′0″ E) 8 km south of Mangochi and continues to flow through swampy banks flanked by hills and escarpment. Lake Malombe is documented as the third-largest lake in Malawi, with an estimated total area of 162 square miles (420 km²), a length of 30 km, a width of 17 km, and a water depth not exceeding 6 m [18]. The communities around the lake catchment are predominantly fishers [19]. The lake has approximately 65 fishing beaches scattered over three major administrative strata: Lake Malombe East (coded 1.1), Lake Malombe West (coded 1.2), and Upper Shire (coded 1.3) [19]. The area surrounding Lake Malombe is densely populated by the Yao ethnic tribe, which makes up over 85% [19] of the fishing population. A few other tribes, such as the Chewa, Lhomwe, and Nyanja, are also found around the lake.

Fish Biomass
The Department of Fisheries, through its statistical office, is mandated under the Malawi Fisheries Management Act (1997) to collect fish landing data from artisanal fishermen on a weekly basis. In Lake Malombe, the fish landing data are collected using the Malawi Traditional Fishery (MTF) system, a computerized gear-based sampling technique. The swept-area model below is used to determine the fish landings:

$$B = \frac{c_t \,(A/a)}{x_i} \tag{1}$$

where $B$ is the estimated fish landings of the stratum, $c_t$ is the mean catch per unit effort obtained per administrative stratum during the survey, $A$ is the total area of the stratum, $a$ is the area swept by the net during the unit effort, and $x_i$ is the proportion of the fish in the path of the gear that is retained by the net. The data are entered into the catch and effort statistics database hosted by the Monkey Bay Fisheries Research Division of the Department of Fisheries. The 43 years of annual fish landings data used in this study were therefore the sum of the annual fish landings of all species in all Lake Malombe strata, expressed in metric tons.

The Estimation of Catch per Unit Effort (CPUE)
The catch per unit effort (CPUE) data are collected alongside the fish biomass landings. CPUE is treated as an estimate of stock size [20]. Its use assumes that the relationship between catch and effort is linear through the origin. Traditionally, there are two ways to calculate CPUE:

$$\mathrm{CPUE}_1 = \frac{\sum_{i=1}^{n} C_i}{\sum_{i=1}^{n} f_i} \tag{2}$$

$$\mathrm{CPUE}_2 = \frac{1}{n} \sum_{i=1}^{n} \frac{C_i}{f_i} \tag{3}$$

where $C_i$ is the $i$th catch (usually expressed in weight), $f_i$ is its respective fishing effort, and $n$ is the number of observations [21]. Griffiths [22] named $\mathrm{CPUE}_1$ the weighted index of density and $\mathrm{CPUE}_2$ the unweighted index of density, reasoning that the ratio $\mathrm{CPUE}_1 / \mathrm{CPUE}_2$, which he called the index of concentration, would exceed unity when fishing effort is concentrated in areas of higher-than-average density. If fishing is at random, the index of concentration is 1, and if fishing effort is applied in areas of less-than-average density, the ratio is less than 1. Another ratio estimator, which is less intuitive, is presented by Snedecor and Cochran [23] (Equation (4)):

$$\mathrm{CPUE}_3 = \frac{\sum_{i=1}^{n} C_i f_i}{\sum_{i=1}^{n} f_i^2} \tag{4}$$

When $C$ is proportional to $f$, the regression line between them passes through the origin and can be fitted by the simple model $C_i = \beta f_i + \varepsilon_i$. The three estimators above are unbiased estimates of the population ratio $\beta$ in normally distributed populations. The choice among the three is a matter of precision.
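As an illustration only, the three CPUE estimators can be sketched in Python, assuming their usual forms (ratio of sums, mean of ratios, and least-squares slope through the origin). The function name `cpue_estimators` and the catch and effort values are hypothetical and are not the Lake Malombe data.

```python
def cpue_estimators(catches, efforts):
    """Return (CPUE1, CPUE2, CPUE3) for paired catch and effort records."""
    if len(catches) != len(efforts) or not catches:
        raise ValueError("catches and efforts must be non-empty and of equal length")
    n = len(catches)
    # CPUE1: ratio of sums (weighted index of density)
    cpue1 = sum(catches) / sum(efforts)
    # CPUE2: mean of ratios (unweighted index of density)
    cpue2 = sum(c / f for c, f in zip(catches, efforts)) / n
    # CPUE3: least-squares slope of catch on effort through the origin
    cpue3 = (sum(c * f for c, f in zip(catches, efforts))
             / sum(f * f for f in efforts))
    return cpue1, cpue2, cpue3


# Hypothetical records: effort is highest where catch per hour is highest.
catches = [10.0, 30.0, 80.0]   # kg
efforts = [10.0, 15.0, 20.0]   # hours
cpue1, cpue2, cpue3 = cpue_estimators(catches, efforts)

# Index of concentration, assumed here to be CPUE1 / CPUE2.
index_of_concentration = cpue1 / cpue2
print(cpue1, cpue2, cpue3, index_of_concentration)
```

Because effort in this made-up example is concentrated where catch rates are highest, the index of concentration exceeds 1. When catch is exactly proportional to effort, the three estimators coincide.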

Conceptual Framework of the ARIMA Model
This paper follows the Box and Jenkins approach to model and forecast time series data using autoregressive integrated moving average (ARIMA) models. To model the fish landings and CPUE data, traditional statistical models such as autoregressive (AR), smoothing, moving average (MA), and autoregressive integrated moving average (ARIMA) models were applied. The autoregressive (AR(p)) model is expressed as

$$Y_t = \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + \dots + \alpha_p Y_{t-p} + \varepsilon_t$$

where $Y_t$ is the dependent variable at time $t$; $Y_{t-1}, Y_{t-2}, \dots, Y_{t-p}$ are the lagged dependent variables; $\alpha_1, \alpha_2, \dots, \alpha_p$ are the unknown parameters of the model, with $\alpha_p \neq 0$; and $\varepsilon_t$ is the value of the disturbance term at time $t$, independent and identically distributed with mean zero and variance $\sigma^2$. Here $p$ is the number of lagged values of $Y$ and represents the order of the process. The moving average (MA(q)) model is a function of the present and $q$ past disturbances (lagged errors) and is expressed as

$$Y_t = \varepsilon_t + \beta_1 \varepsilon_{t-1} + \beta_2 \varepsilon_{t-2} + \dots + \beta_q \varepsilon_{t-q}$$

where $Y_t$ is the dependent variable at time $t$; $\varepsilon_{t-1}, \varepsilon_{t-2}, \dots, \varepsilon_{t-q}$ are the lagged disturbances; $\beta_1, \beta_2, \dots, \beta_q$ are the unknown parameters of the model, with $\beta_q \neq 0$; and $\varepsilon_t \sim N(0, \sigma^2)$. Here $q$ is the number of lagged disturbances and represents the order of the process. The two processes (AR and MA) are combined to form the autoregressive moving average model (ARMA(p,q)), expressed as:

$$Y_t = \alpha_1 Y_{t-1} + \dots + \alpha_p Y_{t-p} + \varepsilon_t + \beta_1 \varepsilon_{t-1} + \dots + \beta_q \varepsilon_{t-q}$$

However, the ARMA model works with stationary data, which is not the case for fish landings and CPUE data [24]. Differencing to remove a mean trend is the common procedure for transforming non-stationary data into stationary data [25], and it has been advocated in the Box-Jenkins approach [26]. In this paper, the autoregressive integrated moving average model (ARIMA(p,d,q)) is therefore introduced to deal with non-stationary data. The general form of the ARIMA model is

$$W_t = \alpha_1 W_{t-1} + \dots + \alpha_p W_{t-p} + \varepsilon_t + \beta_1 \varepsilon_{t-1} + \dots + \beta_q \varepsilon_{t-q}, \qquad W_t = (1 - B)^d \, Y_t$$

where $B$ is the backshift operator ($B Y_t = Y_{t-1}$) and $W_t$ is the new time series obtained by differencing the initial series $Y_t$ a total of $d$ times.
In the fisheries context, p describes the movements of the data series, d represents the degree of differencing used to achieve stationarity, and q represents the smoothing technique used to estimate secular trends. The autocorrelation (AC) and partial autocorrelation (PAC) functions are used to extract the periodic component of the time series and to construct ARIMA models suitable for explaining time series data. The ACF is defined as the sequence of values:

$$\rho_\tau = \frac{C_\tau}{C_0}, \qquad C_\tau = \frac{1}{N} \sum_{t=1}^{N-\tau} \left(Y_t - \bar{Y}\right)\left(Y_{t+\tau} - \bar{Y}\right)$$

where $C_\tau$ is the empirical autocovariance at lag $\tau$, $C_0$ is the sample variance, $\bar{Y}$ is the sample mean, and $N$ is the number of observations. The ACF reveals regular moving average spikes. For example, if the model has an MA(1) component, there will be only one regular significant spike. If the model has an MA(2) component, there will be two regular spikes. The sample PACF at lag $\tau$ is simply the correlation between the two sets of residuals obtained from regressing the elements $Y_t$ and $Y_{t-\tau}$ on the set of intervening values $Y_{t-1}, \dots, Y_{t-\tau+1}$. The PAC reveals a spike at the lag of the interaction term. Different combinations of multiplicative parameters were estimated to determine whether the identified parameters were statistically significant.
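The sample ACF described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual estimator (autocovariance at each lag divided by the sample variance). The name `sample_acf` and the input series are hypothetical, and the ±1.96/√N band is the common large-sample approximation rather than anything taken from this study.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations rho_0..rho_max_lag using C_tau / C_0."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    c0 = sum(d * d for d in dev) / n          # sample variance (lag 0)
    acf = []
    for lag in range(max_lag + 1):
        # empirical autocovariance at this lag
        c_lag = sum(dev[t] * dev[t + lag] for t in range(n - lag)) / n
        acf.append(c_lag / c0)
    return acf


# Hypothetical trending series; a slowly decaying ACF hints at non-stationarity.
series = [float(v) for v in range(1, 11)]
acf = sample_acf(series, max_lag=3)
bound = 1.96 / math.sqrt(len(series))        # approximate 95% significance band
significant_lags = [k for k, r in enumerate(acf) if k > 0 and abs(r) > bound]
print(acf, bound, significant_lags)
```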
To evaluate the adequacy and performance of the models during validation, a series of accuracy measures was applied. The correlation coefficient (R) was used to depict the relationship between observed and predicted fish landings and CPUE. The coefficient of determination ($R^2$) described the proportion of the total variance in the observed data that was explained by the model. Other measures of variance, namely the standard error of prediction (SEP), the coefficient of efficiency ($E_2$), and the average relative variance (ARV), were also applied to see how far the model was able to explain the total variance of the data. The errors also had to be quantified in the same units as the variables, using the square root of the mean square error (RMSE) and the mean absolute error (MAE), expressed as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left(Q_t - \hat{Q}_t\right)^2}, \qquad \mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} \left|Q_t - \hat{Q}_t\right|$$

where $Q_t$ is the observed total annual fish landings or CPUE at time $t$, $\hat{Q}_t$ is the estimated total annual fish landings or CPUE at time $t$, and $N$ is the total number of observations in the validation set. The percent standard error of prediction (SEP) was expressed as

$$\mathrm{SEP} = \frac{100}{\bar{Q}} \, \mathrm{RMSE}$$

where $\bar{Q}$ is the mean observed total annual biomass or CPUE of the validation set. The coefficient of efficiency $E_2$ and the average relative variance (ARV) were expressed as:

$$E_2 = 1 - \frac{\sum_{t=1}^{N} \left(Q_t - \hat{Q}_t\right)^2}{\sum_{t=1}^{N} \left(Q_t - \bar{Q}\right)^2}, \qquad \mathrm{ARV} = 1 - E_2 = \frac{\sum_{t=1}^{N} \left(Q_t - \hat{Q}_t\right)^2}{\sum_{t=1}^{N} \left(Q_t - \bar{Q}\right)^2}$$

Because the difference terms are squared, $E_2$ (and, equivalently, ARV) is sensitive to outliers. A value of zero for $E_2$ indicates that the observed average $\bar{Q}$ is as good a predictor as the model, while negative values indicate that the observed average is a better predictor than the model. For a perfect match, the values of $R^2$ and $E_2$ should be close to one and those of SEP and ARV close to zero. The persistence index (PI) was also used to evaluate model performance:

$$\mathrm{PI} = 1 - \frac{\sum_{t} \left(Q_t - \hat{Q}_t\right)^2}{\sum_{t} \left(Q_t - Q_{t-L}\right)^2}$$

where $Q_{t-L}$ is the observed fish landings or CPUE at time step $t - L$ and $L$ is the lead time.
A PI value of one reflects a perfect fit between the predicted and observed values, and a value of zero indicates that the model is no better than a naive forecast that repeats the observation from $L$ steps earlier, so the model is not suitable. A negative PI means that the model degrades the information in the original series and performs worse than the naive forecast. The t-ratios of the parameter estimates to their standard errors were also computed. The Ljung-Box Chi-squared test was used to check whether the overall correlogram of the residuals displayed any systematic error. The Ljung-Box statistic is expressed as

$$Q = n(n+2) \sum_{k=1}^{m} \frac{r_k^2}{n-k}$$

where $r_k$ ($k = 1, \dots, m$) are the residual autocorrelations and $n$ is the number of observations used to fit the model. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) were calculated so that the values could be compared across models. The AIC is expressed as

$$\mathrm{AIC} = -2 \ln f(y \mid \hat{\phi}) + 2k$$

where $-2 \ln f(y \mid \hat{\phi})$ measures the goodness of fit at the estimated parameter vector $\hat{\phi}$, $2k$ penalizes model complexity, and $k$ is the number of model parameters. The model with the smallest AIC is chosen as the best. The BIC was calculated as:

$$\mathrm{BIC} = -2 \ln f(y \mid \hat{\phi}) + k \ln n$$

The Gaussian maximum likelihood estimation (GMLE) algorithm was further used to check model adequacy. The GMLE was expressed by first denoting the elements of $\tau_1$ and $\tau_2$ in ascending order as $j_1 < j_2 < \dots < j_p$ and $i_1 < i_2 < \dots < i_q$, where $N = N_1 N_2$, $X$ is an $N \times 1$ vector consisting of the $N$ observations in ascending order and is independent of $\sigma^2$. The GMLE estimators are the parameter values that maximize the resulting Gaussian likelihood.

Once the unknown parameters of a suitable time series model had been estimated and the model had been shown to fit well, the next step was to forecast fish landings and CPUE values. In this context, the model is written in autoregressive form as follows:

$$Y_t = \sum_{j=1}^{\infty} \pi_j Y_{t-j} + \varepsilon_t$$

The next observation beyond $Y_1, \dots, Y_T$ is predicted using the model expressed below:

$$\hat{Y}_{T+1} = \sum_{j=1}^{\infty} \hat{\pi}_j Y_{T+1-j}$$

where the $\hat{\pi}_j$ are obtained by substituting the estimated parameters for the theoretical ones. The forecast $\hat{Y}_{T+1}$ is obtained and used to forecast $\hat{Y}_{T+2}$, which is in turn used to generate $\hat{Y}_{T+3}$. The process can be repeated to obtain a forecast for any point in the future.
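The recursive forecasting scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study. The autoregressive weights are those of an ARIMA (0,1,1) model written as $Y_t = Y_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}$, for which $\pi_j = (1 + \theta)(-\theta)^{j-1}$. The infinite sum is truncated at the length of the available history, and the function names, the value of `theta`, and the landings are hypothetical.

```python
def pi_weights_arima_011(theta, n_weights):
    """Autoregressive (pi) weights of ARIMA(0,1,1): (1 + theta) * (-theta)**(j - 1)."""
    return [(1.0 + theta) * (-theta) ** (j - 1) for j in range(1, n_weights + 1)]


def recursive_forecast(series, weights, horizon):
    """Forecast `horizon` steps ahead, feeding each forecast back into the history."""
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        recent_first = history[::-1][:len(weights)]   # most recent value first
        next_value = sum(w * y for w, y in zip(weights, recent_first))
        forecasts.append(next_value)
        history.append(next_value)                    # used for the next step
    return forecasts


# Hypothetical annual landings (metric tons) and a hypothetical MA coefficient.
landings = [9000.0, 8200.0, 8600.0, 7400.0, 7000.0, 6100.0]
theta = -0.4
weights = pi_weights_arima_011(theta, n_weights=len(landings))
print(recursive_forecast(landings, weights, horizon=5))
```

With `theta = 0` the weights reduce to a single weight of one on the last observation, which is the random walk forecast of an ARIMA (0,1,0) model. Truncating the weights makes their sum slightly less than one, so forecasts from a short history are biased toward zero by a small amount.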

Results
Modeling and forecasting of fish landings and CPUE were conducted on the raw data. The precision and characteristics of the ARIMA models were studied in detail following the Box-Jenkins approach: the model parameters were identified, estimated, and verified. The plots shown in Figure 2 are based on the 43 years of time-series data from 1976 to 2019. The first step in ARIMA modeling was data inspection. Figure 2 shows that the fish landings trend was very unstable, with the highest landings observed from 1980 to 1990. The same instability is seen in the catch per unit effort (CPUE) data plot, which shows the highest values in the period from 1976 to 1980. Figure 3 shows the autocorrelation (AC) and partial autocorrelation (PAC) functions of the original fish landings time series. The AC is significantly greater than zero and decreases gradually toward zero as the number of lags increases, demonstrating that the time series was not stationary. Similar results were observed for the CPUE series.

Model Identification
From the time series plots above, it was apparent that the fish landings and CPUE series showed random walk behavior and therefore had to be differenced. Differencing is an explicit option in ARIMA modeling [27] and is implicitly part of random walk and exponential smoothing models. In this paper, first-order differencing was applied and the results were inspected in a data plot. Figure 4 shows that the first-order differenced series was stationary in mean and variance, which suggested that an ARIMA (p,1,q) model was probably the best choice. Differencing removed the trend component of the time series and left the irregular component. A correlogram was also used to check whether the series was stationary or non-stationary. Stationary time series give AC and PAC functions that decay rapidly from their initial value of unity at zero lag, while in non-stationary time series the ACF dies out gradually. Stationarity was tested by plot analysis of the autocorrelation function (ACF) and partial autocorrelation function (PACF). The ACF measures the correlation between observations at different times. The PACF coefficients, also called reflection coefficients in signal processing, were used as the basis of autoregressive estimation. The ACF and PACF of the transformed fish landings time series (original and smoothed series) are presented in Figure 5. For both the fish landings and CPUE series, first-order differencing (d = 1) was adequate to remove the trend. The ACF and PACF in Figure 5 showed a significant spike only at lag 1, meaning that higher-order autocorrelation was explained by the lag 1 AR term [26].
In verifying the data stationarity, however, it was noted that AR or MA models were not pure as seen from ACF and PACF correlograms in Figure 5. Therefore, several models had to be tested to identify the most suitable one for fishery landings and CPUE forecasting.
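First-order differencing, the transformation used in this section, is simple enough to sketch directly. The example below is illustrative only: the function names and the landings values are hypothetical, and the inverse transformation is included to show how forecasts of the differenced series are mapped back to the original scale.

```python
def difference(series, order=1):
    """Apply first differencing `order` times (the d in ARIMA(p,d,q))."""
    result = list(series)
    for _ in range(order):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


def undifference(first_value, differences):
    """Invert first-order differencing by cumulative summation."""
    levels = [first_value]
    for d in differences:
        levels.append(levels[-1] + d)
    return levels


# Hypothetical annual landings (metric tons) with a downward trend.
landings = [9000, 8200, 8600, 7400, 7000, 6100]
first_diff = difference(landings, order=1)
restored = undifference(landings[0], first_diff)
print(first_diff)
print(restored == landings)
```

The differenced series has one observation fewer than the original for each order of differencing, which matters for short annual series such as the one analyzed here.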

Model Selection
Model selection was based on minimizing the Akaike information criterion (AIC), Bayesian information criterion (BIC), square root of the mean square error (RMSE), mean absolute error (MAE), standard error of prediction (SEP), and average relative variance (ARV), and on maximizing the Gaussian maximum likelihood estimation (GMLE) value, efficiency coefficient ($E_2$), and coefficient of determination ($R^2$). Table 1 summarizes the values for the competing models and supports the choice of the ARIMA (p,d,q) model on which the fish landings prediction was based. The coefficient of determination, also known as R squared, was used as an indicator of goodness of fit. The ARIMA (0,1,1) model had the highest $R^2$ coefficient ($R^2$ = 0.73), suggesting that about 73% of the variance was explained by the model. The persistence index (PI) was also closer to 1 and higher (PI = 0.59) than for the rest of the competing models, suggesting that the ARIMA (0,1,1) model was not naïve. The p-value for the Ljung-Box test was greater than 0.05, suggesting that there was very little evidence of non-zero autocorrelations. The p-values for both the AR and MA models were also greater than 0.05, suggesting that the residuals of the model were independent at the 95% level of confidence and that the ARMA model was the best fit. The model displayed a percentage standard error of prediction (SEP) of 22%, a root mean square error (RMSE) of 2001.29 tons, and a mean absolute error (MAE) of 1044.04 tons. A detailed analysis of the results showed that the model displayed the best performance, as indicated by the persistence index, which was closer to 1 (PI = 0.56). The R statistical package (version 3.6.3) provides the auto.arima() command for finding an appropriate forecasting model. As seen in Figure 6, ARIMA (0,1,1) was returned as the most appropriate forecasting model for fish landings.
Table 2 summarizes the measures of forecasting accuracy for CPUE among the competing ARIMA (p,d,q) models. According to these measures, the best forecasting performance was displayed by the ARIMA (0,1,0) model. The model had the lowest AIC (−44.24), BIC (−55.43), RMSE (0.08 kg/h), MAE (0.05 kg/h), SEP (26%), and ARV (0.25), and the highest GMLE (39.6), $E_2$ (0.6465), and PI (0.66). R squared was used as an indicator of goodness of fit. The ARIMA (0,1,0) model had the highest $R^2$ coefficient ($R^2$ = 0.78), suggesting that about 78% of the variance was explained by the model. The persistence index (PI) was also closer to 1 and higher (PI = 0.66) than for the rest of the competing models, suggesting that the ARIMA (0,1,0) model was not naïve. The p-value for the Ljung-Box test was also greater than 0.05, suggesting that there was very little evidence of non-zero autocorrelations. The p-values for both the AR and MA models were greater than 0.05, suggesting that the residuals of the model were independent at the 95% level of confidence and that the ARMA model was the best fit. The auto.arima() command in the R statistical package (version 3.6.3) also automatically selected ARIMA (0,1,0) as the best ARIMA model (Figure 6). Based on the selected ARIMA (0,1,1) and ARIMA (0,1,0) models presented in Figure 6, the following model was developed:

$$y_t = \delta_1 y_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1}$$

with $y_t$, $y_{t-1}$: fish landings or CPUE in periods $t$ and $t - 1$, respectively; $\varepsilon_t$, $\varepsilon_{t-1}$: residuals of periods $t$ and $t - 1$, which constitute white noise; and $\delta_1$ and $\theta_1$: coefficients of the AR and MA processes, respectively.
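The accuracy measures reported in Tables 1 and 2 can be computed from paired observed and predicted values. The sketch below assumes the standard forms of RMSE, MAE, SEP, $E_2$, ARV, and PI. Two choices are assumptions of this example rather than details from the study: the PI sums run only over time steps that have an observation $L$ steps earlier, and the input values are made up.

```python
import math


def accuracy_measures(observed, predicted, lead=1):
    """Return RMSE, MAE, SEP (%), E2, ARV and PI for a validation set."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(sse / n)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    # PI compares the model with a naive forecast that repeats Q_{t-L}
    # (assumption: only steps t >= lead enter both sums).
    sse_model = sum((observed[t] - predicted[t]) ** 2 for t in range(lead, n))
    sse_naive = sum((observed[t] - observed[t - lead]) ** 2 for t in range(lead, n))
    return {
        "RMSE": rmse,
        "MAE": mae,
        "SEP": 100.0 * rmse / mean_obs,
        "E2": 1.0 - sse / sst,
        "ARV": sse / sst,
        "PI": 1.0 - sse_model / sse_naive,
    }


# Hypothetical validation values, not taken from Tables 1-4.
observed = [10.0, 12.0, 9.0, 11.0]
predicted = [11.0, 11.0, 10.0, 10.0]
measures = accuracy_measures(observed, predicted, lead=1)
print(measures)
```

A PI above zero, as in this example, indicates that the model improves on simply carrying the previous observation forward.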

Accuracy of ARIMA (0,1,1) and ARIMA (0,1,0) Models
Before forecasting, the residuals were checked through the ACF and PACF to see whether any systematic patterns remained that needed to be eliminated to improve the accuracy and performance of the selected models. The ACF and PACF residual plots (Figure 7) showed that none of the autocorrelations were significantly different from zero at the 95% confidence level. Therefore, the ARIMA (0,1,1) and ARIMA (0,1,0) models provided adequate predictive models that probably could not be improved.
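A residual check of this kind is often summarized with the Ljung-Box statistic. The sketch below is a minimal illustration that assumes the standard form $Q = n(n+2) \sum_{k=1}^{m} r_k^2 / (n-k)$. The function name and the residuals are hypothetical, and the p-value step is left to a chi-squared table to keep the example dependency-free.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over residual autocorrelations at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    sum_sq = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # residual autocorrelation at lag k
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / sum_sq
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals with an obvious alternating pattern.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
q_stat = ljung_box_q(residuals, max_lag=1)

# 5% chi-squared critical value for 1 degree of freedom (from standard tables).
# For a fitted ARIMA model the degrees of freedom are usually max_lag minus the
# number of estimated AR and MA parameters.
CRITICAL_1_DF = 3.841
print(q_stat, q_stat > CRITICAL_1_DF)
```

For these deliberately patterned residuals the statistic exceeds the critical value, so the hypothesis of uncorrelated residuals would be rejected. Residuals from an adequate model should give a statistic below it.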

Forecast
The modeled and predicted fish landings and CPUE are displayed in Figure 8. The predicted and observed fish landings and CPUE are summarized in Tables 3 and 4. Tables 3 and 4 show that the residuals were a mix of positive and negative errors that fell within the 95% confidence intervals, indicating that the models forecasted well. Figure 8 presents the fish landings and CPUE forecasts obtained by applying the ARIMA (0,1,1) and ARIMA (0,1,0) models over a period of 5 years (2019-2024). As seen in Figure 8, the models predicted that by 2024 fish landings and CPUE will be 2725.243 metric tons and 0.097 kg/h, respectively.

Discussion
The abundance of fish species in Lake Malombe is directly linked to CPUE; though common criticism of CPUE is that the relationship between abundance and CPUE is more complex [28,29]. Figure 9 depicts the variation of the annual fish landings during each defined fish landing period. The data plot shows that Lake Malombe fish landings experience strong periodical fluctuations, with higher catches registered in 1976-1996 and maximum fish landings in 1984, and limited fish landings during the rest of the period. A significant decline in fishery landings is noted, during 2000-2005, mostly attributed to stock depletion as the result of an increase in fishing effort. The fishery recovery appeared from 2006 to 2016, with the maximum registered in 2015. The CPUE in Figure 9 was at the peak in 1976 and then began fluctuating negatively and thereafter remained constant with a slight fluctuation. The decrease in CPUE trends from the 1980s to 2019 was an indicator of increasing fishing effort and a decrease in fish landings. Maynou et al. [30] explained that CPUE series reflect the general abundance of species and catch fluctuation. According to Weyl et al. [31], CPUE variation is strongly linked to the difference in the number of fishers, man-hours, and the categories of fishing gears. Low CPUE indicates a relatively low abundance of fish which results in prolonged man-hours and an increase in the number of fishers and low catches [32,33]. Alexander et al. [34] had similar observations and reported a strong relationship between whole lake CPUE to relative fish biomass and abundance. Weyl et al. [31] evidenced an increase in biomass and CPUE in the southern part of Lake Malawi after the subsequent closure of the fishery within 1992/1993. In Lake Nasser, Egypt, the time series of size and CPUE also showed a negative trend indicating the high exploitation rate of the most important commercial fish species in the lake by the fishing gears [32]. 
A similar situation has been reported in Lake Chilwa, Lake Malawi, and other inland lakes [35-37].
The decision to model both fish landings and CPUE was based on the fact that regulators can use these two indicators to monitor potential changes in the fish population related to human exploitation and other anthropogenic factors [38]. It was also based on the assumption that the Lake Malombe fishery can provide greater benefit to the riparian communities if proper management conditions and plans are sustainably exercised. Stergiou [38] claimed that fish landing fluctuation is linked to ecosystem type and season, while Pinnegar et al. [39] believed that fish landings can reflect changes in gear technologies, management measures, and species abundance in the lake ecosystem. These two arguments indicate that an inaccurate model can give inaccurate management information and that appropriate selection of the time series model is a basic requirement for predicting fish landings and CPUE. In this case, the ARIMA (0,1,1) and ARIMA (0,1,0) models were selected as the most accurate for predicting Lake Malombe fish landings and CPUE over five years. Note that p = 0 in ARIMA (0,1,1) for fish landings means that there were no autoregressive terms and that the model was equivalent to simple exponential smoothing. Similarly, p = 0 and q = 0 in ARIMA (0,1,0) mean that the selected model was a random walk with a constant trend and had no autoregressive or moving average terms. The five-year forecast horizon was chosen to increase the accuracy of the forecasts. It was also assumed that forecasting five years in advance could be of great importance for planning and decision-making among fisheries managers, fishermen, and the fishing industry in general. The ARIMA (0,1,1) and ARIMA (0,1,0) models predicted the lowest lake fish landings (2725.243 tons) and CPUE (0.097 kg/h), respectively.
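The link between ARIMA (0,1,1) and exponential smoothing noted above can be checked numerically. The sketch below assumes the model is written as $y_t = y_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}$ with no constant, the pre-sample disturbance is zero, and the smoothing level starts at the first observation. Under those assumptions the one-step forecast matches simple exponential smoothing with $\alpha = 1 + \theta$. All names and values are hypothetical.

```python
def ses_forecast(series, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]                      # assumption: level starts at y_1
    for y in series[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return level


def arima_011_forecast(series, theta):
    """One-step-ahead forecast of y_t = y_{t-1} + e_t + theta * e_{t-1}."""
    eps = 0.0                              # assumption: pre-sample disturbance is zero
    for prev, curr in zip(series, series[1:]):
        eps = (curr - prev) - theta * eps  # residual implied by the model
    return series[-1] + theta * eps


def random_walk_forecast(series, horizon):
    """ARIMA(0,1,0) without drift: every future value equals the last observation."""
    return [series[-1]] * horizon


# Hypothetical series and MA coefficient.
series = [10.0, 12.0, 11.0]
theta = -0.4
arima_value = arima_011_forecast(series, theta)
ses_value = ses_forecast(series, alpha=1.0 + theta)
print(arima_value, ses_value, random_walk_forecast(series, horizon=3))
```

The two forecasts agree, and setting `theta = 0` (so `alpha = 1`) collapses both to the random walk forecast.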
The prediction obtained by the models provides a special interpretation to fisheries managers, scientists, ecologists, and experienced fishers. It implies that the optimum fishing effort should be reduced below the current fishing effort for the fishery to recover a certain level of CPUE. In the absence of limitations, the open-access approach to the fishery in Lake Malombe will eventually lead to a further cycle of depression with declining economic returns for fishers and negative social and economic impacts [40].

Conclusions
In Malawi, fishing is described as the most productive economic activity and sustains millions of people in the riparian communities. This study was developed to predict uncontrollable biomass trends and the possible abundance of the stock. The main objective was to provide baseline data that fisheries managers can use to set a practical fishing effort in the lake. The annual fish landings and CPUE series were first transformed from non-stationary to stationary using first-order differencing. The ACF, PACF, AIC, BIC, RMSE, MAE, SEP, ARV, GMLE, $E_2$, $R^2$, and PI were estimated, which led to the identification and construction of ARIMA models suitable for explaining the time series and forecasting. According to the measures of forecasting accuracy, the best forecasting models for fish landings and CPUE were ARIMA (0,1,1) and ARIMA (0,1,0), respectively. These models had the lowest AIC, BIC, RMSE, MAE, SEP, and ARV values and the highest values of GMLE, PI, $R^2$, and $E_2$. The auto.arima() command in R version 3.6.3 also selected ARIMA (0,1,1) and ARIMA (0,1,0) as the best models. The ARIMA (0,1,1) and ARIMA (0,1,0) models predicted the lowest lake fish landings (2725.243 tons) and CPUE (0.097 kg/h), respectively, with a good level of confidence. The fish landings and CPUE showed downward trends, suggesting that the current stock exploitation in Lake Malombe is unsustainable. Therefore, the study recommends that the current fishing effort be reduced to enable the fishery to recover to a certain optimum level of CPUE.

Limitation
This study did not consider the use of hybrid models to explore the potential effects of covariates (such as fishing mortalities, environmental and social-economic drivers) due to limited data availability. Therefore, it is suggested that future studies should consider that.
Author Contributions: R.M. conceptualized the study, developed the methodology, sourced the data, analyzed the data, and developed the original manuscript. Authors S.M., E.K., T.A., T.B.P., I.B.M.K., C.C.K. supervised the study, reviewed and edited the manuscript, visualized, and validated the study. All authors have read and agreed to the published version of the manuscript.