3.2. Procedure
Herfindahl–Hirschman Index (IHH)
The Herfindahl–Hirschman Index (IHH) is a widely used measure in economics of how concentrated a market is in a given area or industry or, conversely, how diversified exports are. Orris C. Herfindahl (1950) [41] and Albert O. Hirschman (1945) [42] first proposed this indicator. It was later recognised as a key tool for studying the competitive structure of markets [43]. The IHH is obtained by summing the squares of each agent’s or product’s relative share of the total. This makes it possible to determine whether exports are spread evenly among many players or concentrated in a small number of them [44].
A high Herfindahl–Hirschman Index (IHH) score in international trade means that exports are highly concentrated, that is, the country depends on a small number of goods or trading destinations. Conversely, a low index score signifies greater diversification, which is associated with an export structure that is more competitive, resilient, and sustainable amid external volatility. The IHH is a useful tool for examining the sustainability and composition of export performance in regions or countries because it summarises the balance and spread of trade flows in a single number [45].
In this study, the Herfindahl–Hirschman Index (IHH) was applied to assess the level of concentration of agricultural exports from the Lambayeque region during the period 2015–2024. For the first calculation, data from the AZATRADE platform was used, specifically information on exporting companies and the total value exported (FOB) recorded in each year of the period analysed.
The formula used was the same as that traditionally used in concentration analyses:
$$IHH = \sum_{i=1}^{N} s_i^{2}$$
where
$IHH$: represents the Herfindahl–Hirschman Index, which measures the degree of concentration of exports.
$s_i$: indicates the relative share of each exporting company $i$ in the total agricultural exports of Lambayeque in a given year. This share is obtained as:
$$s_i = \frac{X_i}{\sum_{k=1}^{N} X_k}$$
$X_i$: is the value of exports (FOB) of company $i$.
$N$: corresponds to the total number of exporting companies considered in the analysis.
In operational terms, for each year of the period 2015–2024, the total set of exporting companies in the agricultural sector was identified and the individual share of each firm in the total annual export value was calculated. These shares were then squared and added together to obtain the Herfindahl–Hirschman Index (IHH) for each year. This procedure made it possible to assess the degree of business concentration within the sector, where higher index values reflect greater market concentration and, consequently, a lower level of competition or export diversification [46].
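The yearly calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `hhi` and `hhi_by_year`, the record layout `(year, unit, fob_value)` and the sample figures are hypothetical and are not taken from AZATRADE or from the study's own workflow.

```python
from collections import defaultdict


def hhi(values):
    """Return the Herfindahl-Hirschman Index for a list of non-negative values.

    Each value is converted to a share of the total; the index is the sum of
    squared shares, so it ranges from 1/N (equal shares) to 1 (a single unit).
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("total export value must be positive")
    return sum((v / total) ** 2 for v in values)


def hhi_by_year(records):
    """Compute the index per year from (year, unit, fob_value) records.

    `unit` can be an exporting company or a destination country; values for
    the same unit within a year are added before shares are computed.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for year, unit, fob in records:
        totals[year][unit] += fob
    return {year: hhi(list(units.values())) for year, units in sorted(totals.items())}


# Made-up FOB values (not AZATRADE data), for illustration only.
sample = [
    (2015, "Firm A", 50.0), (2015, "Firm B", 30.0), (2015, "Firm C", 20.0),
    (2016, "Firm A", 25.0), (2016, "Firm B", 25.0),
    (2016, "Firm C", 25.0), (2016, "Firm D", 25.0),
]
index_by_year = hhi_by_year(sample)
```

In the toy data, shares of 0.5, 0.3 and 0.2 give 0.25 + 0.09 + 0.04 = 0.38 for 2015, while four equal shares give 4 × 0.0625 = 0.25 for 2016, the minimum possible with four units. Passing destination countries instead of firms as `unit` gives the geographical version of the index.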
As with the IHH calculated for exporting companies, a second analysis was carried out to assess the level of concentration of the destination markets for agricultural exports from the Lambayeque region during the period 2015–2024. In this case, the same data from the AZATRADE platform were used, specifically the variables corresponding to the destination countries and the FOB value exported to each of them in each year of the period analysed. The same Herfindahl–Hirschman Index (IHH) formula was used for this calculation, with each share now representing a destination country’s share of the total annual export value.
This procedure made it possible to estimate the degree of geographical concentration of agricultural exports from the Lambayeque region, enabling us to identify whether export activity is diversified across multiple destinations or, on the contrary, concentrated in a limited number of markets.
Gravity model procedure
The gravity model has become established in economic literature as a fundamental analytical tool for examining and quantifying international trade flows between two countries. Its name derives from the law of universal gravitation formulated by Isaac Newton in 1687, which states that the force of attraction between two bodies is directly proportional to the product of their masses and inversely proportional to the square of the distance between them. Similarly, the gravity model in economics posits that the volume of trade between two nations increases with the economic size of both and decreases as the costs associated with geographical distance increase [47].
The gravity model was first applied by Tinbergen (1962) to analyse bilateral trade between countries, demonstrating that trade flows are determined mainly by the economic size of the nations involved and their geographical distance [48]. However, it was Anderson’s work (1979) [49] that gave this model a more robust theoretical foundation by developing a formulation based on a system of structural equations. This advance gave it greater analytical consistency and predictive power, consolidating it as an essential tool for studying international trade patterns.
Tinbergen’s Gravity Model Formula (1962)
The basic model proposed by Tinbergen (1962) [50] to explain bilateral trade between two countries is expressed as:
$$X_{ij} = A \, \frac{Y_i^{\alpha_1} \, Y_j^{\alpha_2}}{D_{ij}^{\alpha_3}}$$
where:
$X_{ij}$: represents the value of exports or trade flow from country $i$ to country $j$;
$Y_i$ and $Y_j$: are the income levels or Gross Domestic Product (GDP) of countries $i$ and $j$, respectively;
$D_{ij}$: corresponds to the geographical distance between the two countries, which acts as a barrier to trade;
$A$: is a constant that groups together factors common to all trade relations (e.g., technology or global trade conditions);
$\alpha_1, \alpha_2, \alpha_3$: these are parameters that are estimated empirically and reflect the sensitivity of trade to changes in the variables of economic size and distance.
To facilitate econometric estimation, Tinbergen transformed the model into its linearised logarithmic form:
$$\ln X_{ij} = \ln A + \alpha_1 \ln Y_i + \alpha_2 \ln Y_j - \alpha_3 \ln D_{ij} + \varepsilon_{ij}$$
where $\varepsilon_{ij}$ represents the random error term.
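To illustrate why the logarithmic form is convenient, the following sketch simulates trade flows from a multiplicative gravity equation (trade proportional to the two GDPs raised to exponents and divided by distance raised to an exponent) and recovers the exponents by ordinary least squares on the logged variables. All parameter values, variable names and data are hypothetical and randomly generated; NumPy is assumed to be available, and nothing here reproduces the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Hypothetical parameter values chosen only for this illustration.
A, a1, a2, a3 = 2.0, 0.8, 1.1, 0.9

gdp_i = rng.uniform(50, 500, n)
gdp_j = rng.uniform(50, 500, n)
dist = rng.uniform(500, 15000, n)
noise = rng.normal(0.0, 0.05, n)

# Multiplicative gravity equation with a log-normal disturbance.
trade = A * gdp_i**a1 * gdp_j**a2 / dist**a3 * np.exp(noise)

# Taking logs makes the model linear in the parameters.
X = np.column_stack([np.ones(n), np.log(gdp_i), np.log(gdp_j), np.log(dist)])
y = np.log(trade)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef holds estimates of [ln A, a1, a2, -a3]
```

Because the regression is in logs, each slope can be read as an elasticity, and the distance coefficient appears with a negative sign since distance enters the denominator of the original equation.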
In its practical application, the gravity model is formulated based on the most important explanatory variables of international trade. It includes indicators linked to national production, population, and various political, cultural, and geographical elements that affect the strength of trade between nations. These factors make it possible to capture the structural features of the economies involved, as well as the barriers or facilities arising from their institutional and geographical context, which gives the model a greater capacity to explain and predict in the study of the determinants of foreign trade [51].
When employing the gravity model to examine global trade flows, several methodological considerations must be taken into account to ensure the accuracy and empirical validity of the results. A central aspect is the econometric specification of the model, which determines the functional form used to represent the relationships among variables and conditions the accuracy and consistency of the resulting estimates in the presence of possible endogeneity, heteroscedasticity, or autocorrelation [52].
Although the gravity model estimated in this study follows a static specification, the use of panel data over the 2015–2024 period allows for capturing the temporal evolution of export flows and market concentration patterns. The longitudinal structure of the dataset reflects gradual structural adjustments in trade relationships, including export expansion, diversification processes, and changes in market dependence over time. Therefore, while explicit dynamic terms are not incorporated, the model provides meaningful insights into the underlying driving mechanisms of export growth [53].
The data used to estimate the gravity model come from official sources and databases widely recognised for their statistical reliability. The information collected is summarised in Table 1, which systematically presents the variables used and their respective sources, ensuring the transparency and reproducibility of the analysis.
In this study, which focuses on analysing the diversification of agricultural exports from the Lambayeque region using the gravity model, a data set corresponding to the period 2015–2024 was used. The information collected covers exports to 100 destination countries during that time interval, considering the value exported under the FOB modality as the main variable. Macroeconomic and structural variables relevant to the model estimation were also incorporated, including Peru’s gross domestic product (GDP), the GDP of the destination countries, the Peruvian population, the population of the importing countries, and the geographical distance between Peru and each destination country. In addition, a dichotomous variable was included to represent the existence of Free Trade Agreements (FTAs) between Peru and its trading partners. Together, these metrics constitute the set of variables used in the estimation of the gravity model applied to the analysis of agricultural export flows.
$$\ln(EXP_{ijt}) = \beta_0 + \beta_1 \ln(GDP_{it}) + \beta_2 \ln(GDP_{jt}) + \beta_3 \ln(POP_{jt}) + \beta_4 \ln(DIST_{ij}) + \beta_5 FTA_{ij} + u_{ij} + \varepsilon_{ijt}$$
where:
$\ln(EXP_{ijt})$: natural logarithm of exports from Lambayeque to country $j$ in year $t$.
$\ln(GDP_{it})$: logarithm of the GDP of the exporting country (Peru/Lambayeque).
$\ln(GDP_{jt})$: logarithm of the GDP of the importing or destination country.
$\ln(POP_{jt})$: logarithm of the population of the destination country.
$\ln(DIST_{ij})$: logarithm of the geographical distance (in kilometres) between Lima and the capital of the destination country.
$FTA_{ij}$: dichotomous variable that takes the value of 1 if there is a free trade agreement and 0 if there is not.
$u_{ij}$: specific unobserved effect of each country (fixed or random effect).
$\varepsilon_{ijt}$: idiosyncratic error term.
EViews 14 (Econometric Views) software is a statistical and econometric analysis tool widely used in applied research in the fields of economics, finance and social sciences. Its versatility and ability to handle time series, cross-sectional and panel data make it an ideal tool for estimating and validating complex econometric models, such as the gravity model used in this study [54].
Annual data for the period 2015–2024 were used, which were imported in Excel format and organised into a balanced panel workfile in EViews software. This panel consisted of 100 cross-sections (destination countries) and 10 periods (years), which allowed for the analysis of temporal dynamics and structural differences between trading partners. Subsequently, the variables were transformed using natural logarithms in order to linearise the functional relationships, reduce heteroscedasticity problems and facilitate the interpretation of the estimated coefficients in terms of elasticities. The transformed variables included the value of exports (export_fob), Peru’s gross domestic product (gdp_peru), the gross domestic product of the destination country (gdp_destination), the population of the destination country (population_destination), and the geographical distance between Peru and the partner country (distance_km).
Within the framework of the gravity model, econometric estimates were made using EViews software, through three panel data specifications: Pooled Ordinary Least Squares (Pooled OLS), Fixed Effects and Random Effects. Firstly, the Pooled Ordinary Least Squares (Pooled OLS) estimation is the simplest method, combining all observations without distinguishing between units or periods, under the assumption of parameter homogeneity.
Sensitivity analysis and robustness checks
Before estimating the gravity model, the dataset was carefully checked to make sure it was consistent and reliable for analysis. As noted in the data description, the sample size decreased because observations lacking information on the main explanatory variables were excluded. Because the gravity model is specified in log-linear form and the dataset contains export flows with zero values, these zero flows were replaced with a very small positive constant (X = ε). This approach retains information on trade relations that are either nonexistent or only incipient, while allowing the logarithmic transformation to be applied. Although the choice of this constant is inherently subjective, ε was set low enough to limit any possible effect on the estimated coefficients.
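The zero-flow treatment can be sketched as follows. The value of `EPSILON` and the function name `log_export` are assumptions made for illustration only; the text does not report the constant actually used.

```python
import math

EPSILON = 1e-6  # hypothetical value; the study does not report the constant used


def log_export(fob_value, epsilon=EPSILON):
    """Natural log of an export flow, substituting epsilon for zero flows."""
    if fob_value < 0:
        raise ValueError("export values cannot be negative")
    return math.log(fob_value if fob_value > 0 else epsilon)


# Made-up flows: the second destination received no exports that year.
flows = [125000.0, 0.0, 3400.5]
log_flows = [log_export(v) for v in flows]
```

Since ln(ε) is a large negative number, the substituted observations sit far from the rest of the sample on the log scale, so a natural check is to re-run the estimation with a few different values of ε and compare the coefficients.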
The data was not winsorised or trimmed in any way. Instead, logarithmic transformations of continuous variables were used to make the distribution less uneven, limit the effect of extreme values, and make it easier to understand the predicted coefficients in terms of elasticities. After cleaning, organising, and standardising the data in Microsoft Excel to make sure that the units of measurement and time were all the same, we used EViews 14 and IBM SPSS Statistics 27 to run gravity model estimates.
We carried out several robustness checks to assess how sensitive the gravity model estimates are to sample selection and model specification. First, to accommodate potential structural changes at the onset of the observation period, the model was re-estimated on an alternative sample period that excludes the initial years (2015–2016). The main coefficients, especially those related to geographic distance and free trade agreements, kept their expected signs and statistical significance.
Second, due to its limited statistical significance in the reference specification, an alternative gravity model specification was developed that excluded the population variable of the destination country. The results remained qualitatively consistent, indicating that the principal conclusions are unaffected by the inclusion or exclusion of specific explanatory variables.
Finally, the sensitivity of the results was further tested by comparing several panel data estimators commonly used in trade gravity models. Pooled ordinary least squares (Pooled OLS) served as the benchmark estimator. It pools all observations and assumes that the parameters are the same for all of them. This approach is simple, but it does not account for unobserved country-specific heterogeneity [55]. To address this limitation, a fixed effects (FE) specification was employed, allowing for country-specific intercepts that account for structural, institutional, and cultural factors that remain constant over time and influence trade flows [56]. In addition, a random effects (RE) model was estimated using the EGLS method with Swamy-Arora variance components, under the assumption that differences between countries are random and uncorrelated with the explanatory variables. When this assumption holds, the RE estimator is more efficient and allows the results to be generalised to a broader population of countries [57].
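One practical difference between the FE and RE estimators can be shown with the "within" transformation, which is numerically equivalent to including country-specific intercepts: each variable is expressed as a deviation from its country mean. The sketch below uses made-up numbers and a hypothetical helper `within_transform`. It is not the EViews procedure used in the study, only an illustration of why regressors that do not vary over time, such as distance, cannot be identified under FE.

```python
from collections import defaultdict


def within_transform(values, groups):
    """Subtract each group's mean from its observations (the FE 'within' step)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


# Toy panel: two destinations observed for three years (made-up numbers).
country = ["CL", "CL", "CL", "US", "US", "US"]
ln_gdp = [5.0, 5.2, 5.4, 9.0, 9.1, 9.5]
ln_dist = [7.8, 7.8, 7.8, 8.7, 8.7, 8.7]  # constant within each country

gdp_within = within_transform(ln_gdp, country)
dist_within = within_transform(ln_dist, country)
```

After demeaning, `ln_gdp` keeps its year-to-year variation, whereas `ln_dist` collapses to zeros (up to floating-point rounding), so its coefficient can only be estimated with pooled or random effects specifications.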
Hausman test
The Hausman test allowed us to determine the most appropriate specification between the fixed and random effects models. Given that the p-value (Prob. Chi-Sq) was greater than 0.05, the null hypothesis was not rejected, and it was concluded that the random effects model is the most appropriate for explaining agricultural export flows from the Lambayeque region in the period 2015–2024 [58] (See Table 2).
Consequently, the gravity model was estimated using the EGLS (Estimated Generalised Least Squares) method under a random effects scheme, which allowed us to capture the unobserved heterogeneity between countries and obtain more efficient estimates of the coefficients.
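For reference, the Hausman statistic compares the FE and RE coefficient vectors, weighting their difference by the inverse of the difference between the two covariance matrices. The sketch below computes it with NumPy for made-up inputs. The function name `hausman_statistic` and all numbers are hypothetical, and the values reported in Table 2 come from EViews, not from this code.

```python
import numpy as np


def hausman_statistic(b_fe, b_re, cov_fe, cov_re):
    """Return (H, degrees_of_freedom) for the FE-versus-RE comparison.

    H = d' (V_FE - V_RE)^(-1) d with d = b_FE - b_RE, using only the
    coefficients that both estimators identify (time-varying regressors).
    """
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    v_diff = np.asarray(cov_fe, dtype=float) - np.asarray(cov_re, dtype=float)
    h = float(d @ np.linalg.solve(v_diff, d))
    return h, d.size


# Made-up estimates for two time-varying coefficients (illustration only).
b_fe = [0.95, 0.40]
b_re = [0.90, 0.42]
cov_fe = [[0.010, 0.0], [0.0, 0.020]]
cov_re = [[0.006, 0.0], [0.0, 0.015]]

h, dof = hausman_statistic(b_fe, b_re, cov_fe, cov_re)
```

With these toy inputs the statistic is 0.0025/0.004 + 0.0004/0.005 = 0.705 on 2 degrees of freedom, well below the 5% chi-square critical value of 5.991, so the null hypothesis would not be rejected and RE would be retained, mirroring the decision rule described above.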
However, diagnostic tests showed contemporaneous correlation and heteroscedasticity across cross-sectional units, which could undermine the validity and efficiency of the standard errors. To address these problems, the model was re-estimated using the Panel-Corrected Standard Errors (PCSE) method under the Period SUR (Seemingly Unrelated Regressions) structure, which was proposed by Beck and Katz in 1995 and later revisited [59]. This method corrects the standard errors without altering the estimated coefficients, resulting in more robust and reliable conclusions.
In summary, the final model, EGLS with random effects and PCSE correction, provides consistent and statistically valid estimates, strengthening the reliability of econometric inferences about the determinants of international agricultural trade in the Lambayeque region during the period analysed.
Calculation of market potential
Post-estimation stage of the gravity model: calculation of market potential
Once the gravity model had been estimated in its log-linear form using the random effects approach, this specification was used to estimate the potential exports from the Lambayeque region to each destination market. In the first stage, logarithmic predictions were generated from the coefficients obtained in the model estimation. The equation estimated in EViews was stored under the identifier re, from which the variable ln_pred was constructed, representing the expected value of exports (in logarithms) for each country and year, based on their economic characteristics, gross domestic product, population, geographical distance, and the existence of a free trade agreement (FTA) with Peru.
Subsequently, the model residuals (u_re) were calculated in order to correct the retransformation bias generated when converting estimates expressed in logarithms to level values. To this end, the correction factor proposed by Duan (1983), known as the smearing estimator, was applied, which is obtained by calculating the average of the exponential of the estimated residuals. This procedure allows the predictions to be adjusted and unbiased values to be recovered at the original level of exports, ensuring a more accurate estimate of expected trade flows [60].
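With this adjustment, potential exports were obtained as:
$$\widehat{EXP}_{ijt}^{\,pot} = \exp\left(\widehat{\ln EXP}_{ijt}\right) \times \hat{S}$$
where $\widehat{\ln EXP}_{ijt}$ is the logarithmic prediction (ln_pred) and $\hat{S} = \frac{1}{n}\sum \exp(\hat{u})$ is the correction factor (smearing), computed as the average of the exponential of the estimated residuals $\hat{u}$ (u_re).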
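The retransformation step can be sketched as below, following the description of the smearing factor as the average of the exponentiated residuals. The function names, residuals and prediction are made up for illustration; `u_re` and `ln_pred` simply reuse the identifiers mentioned in the text.

```python
import math


def smearing_factor(residuals):
    """Duan's smearing factor: the mean of exp(residual) over the sample."""
    if not residuals:
        raise ValueError("at least one residual is required")
    return sum(math.exp(u) for u in residuals) / len(residuals)


def potential_exports(ln_pred, factor):
    """Convert a log-scale prediction to levels and apply the smearing factor."""
    return math.exp(ln_pred) * factor


# Made-up residuals and prediction (illustration only).
u_re = [-0.30, 0.10, 0.25, -0.05]
factor = smearing_factor(u_re)
level = potential_exports(ln_pred=10.0, factor=factor)
```

When the residuals average to zero, as in this toy example, the factor is at least 1, so simply exponentiating the log prediction without the correction would understate the expected level of exports.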
Finally, the market potential metric was constructed by comparing potential exports with actual observed exports:
Thus, positive values indicate that, according to the predictions of the gravity model, the Lambayeque region has untapped export potential to the corresponding market. Values close to zero reflect a level of trade consistent with the expected behaviour according to the economic fundamentals of the model. Negative values suggest the existence of an overexported or saturated market, in which the current volume of exports exceeds the estimated potential level (See Table 3).
Based on this indicator, a market classification was developed:
For the analysis of commercial opportunities, the last year of the sample (2024) was taken as a reference and the destination countries were ranked from highest to lowest market potential. This analysis made it possible to identify the destinations with the best economic and commercial conditions according to the model’s variables. Finally, based on these results, a representative table was constructed that visually shows the 10 countries with the greatest export potential, facilitating the interpretation and comparison of the findings, and a table with the countries that are already saturated.
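The ranking step can be sketched as follows. The exact form of the market potential indicator is not reproduced in this text, so the sketch assumes a simple difference between potential and observed exports; this is an assumption of the example, as are the function name `rank_by_gap`, the placeholder country labels and the figures.

```python
def rank_by_gap(rows, top_n=10):
    """Rank destinations by (potential - observed), largest gap first.

    `rows` is an iterable of (country, potential, observed) tuples.
    The simple difference is an assumption of this sketch.
    """
    gaps = [(country, potential - observed) for country, potential, observed in rows]
    gaps.sort(key=lambda item: item[1], reverse=True)
    untapped = [g for g in gaps if g[1] > 0][:top_n]
    saturated = [g for g in reversed(gaps) if g[1] < 0]
    return untapped, saturated


# Made-up 2024 values in thousands of USD (illustration only).
rows_2024 = [
    ("Country A", 900.0, 400.0),
    ("Country B", 300.0, 650.0),
    ("Country C", 500.0, 450.0),
    ("Country D", 200.0, 260.0),
]
untapped, saturated = rank_by_gap(rows_2024, top_n=2)
```

Here `untapped` lists the destinations with the largest positive gaps and `saturated` lists those where observed exports exceed the estimated potential, most over-exported first. A relative gap, such as the difference divided by potential exports, would preserve the signs but could change the ordering.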
Although the gravity model and concentration indices are robust tools that provide consistent, replicable and comparable results over time, they have limitations inherent to their quantitative nature. These approaches do not adequately capture relevant qualitative factors, such as product reputation and differentiation, the phytosanitary policies of destination countries or business strategies for positioning in international markets. Furthermore, the information sources used, such as the AZATRADE database, depend on official customs records, which may be updated with delay or contain minor omissions in statistical reporting. Nevertheless, the integration of statistical and econometric techniques strengthens methodological rigour and supports the internal validity and analytical consistency of the results, in line with the methodological standards of high-impact scientific literature, and helps to reduce possible estimation biases [61].
In accordance with the Research Ethics Code of the Universidad Cesar Vallejo (2024) [62], the principles of honesty, scientific integrity, and good practices in R&D&I were incorporated into the conduct of this study. Emphasis was placed on methodological rigour, objectivity, respect for authors’ intellectual property through citation in accordance with APA 7th edition standards, and adherence to the principles of integrity and originality.