Supply and Demand Model for a Chili Enterprise System Using a Simultaneous Equations System

The supply and demand of the various fresh chili types interact simultaneously because the products can be complements or substitutes. This study aimed to examine the supply and demand models of red chili and cayenne pepper simultaneously to support the development of an innovative chili enterprise system. The methodology is a simultaneous equation system on panel data, combining 2010–2020 time-series data with provincial cross-sectional data, together with data mining techniques. The two-stage least squares method was used for the estimation and simulation of the models. The resulting models comprise three identities and ten structural equations. All the models fit statistically, with some drawbacks owing to the limited availability of micro-level data. Simulated changes in the labor wages, prices, demand, and production for red chili and cayenne pepper affected the other variables within the model. In conclusion, the chili supply and demand models could support the development of the innovative chili enterprise system; however, this requires proactive collaboration and participation from all the relevant institutions and actors. Moreover, this research model is potentially applicable to other horticultural commodities.


Introduction
Chili supply and demand are monitored monthly because of the crop's seasonal production, perishability, and impact on inflation. In Indonesia, the actors involved in chili supply and demand are the chili producers, collectors, wholesalers, agents, distributors, and suppliers of the chili processing industry, as well as the retailers in both traditional and modern markets (i.e., supermarkets and hypermarkets) (Badan Pusat Statistik 2018). The chili producers (farmers) sell their products directly to collectors, retailers, wholesalers, agents, or distributors. These actors are essential in providing the time-series data for a supply and demand model. Therefore, a data chain from data collection to decision-making and marketing (Miller and Mork 2013; Chen et al. 2014) becomes very important. To maximize the operational process of the chili supply chain, chili farmers must work closely with farmer groups and farmer associations (Perdana et al. 2018). The price of chili at the consumer level is often higher because of the length of the chili supply chain, and most farmers need to understand the role of Village-Owned Enterprises (Bumdes) and Indonesian Farmers' Shops in shortening their marketing chain (Barusman et al. 2019). Hence, we need an innovative enterprise system to package the supply and demand monitoring data for chili, whose prices are highly volatile. The enterprise system has user representatives that play an essential role in dealing with physical, cognitive, and social boundaries (Pan and Mao 2016), involving multi-dimensional relationships in a dynamic environment (Weichhart et al. 2016). The enterprise system acts as a technology platform in the contemporary technology portfolio to facilitate organizational process innovation (Lokuge and Sedera 2018).
Most previous research on chili has focused separately on supply, demand, or price and has mainly covered one type of chili. For example, the supply chain design of chili (Ridwan et al. 2017) and lean production in the cayenne pepper supply chain (Perdana et al. 2018) each apply to one type of chili. The following price studies also apply to only one type of chili. Budiastuti et al. (2017) employed the consumer price index (CPI) in prediction models using the support vector regression (SVR) method based on cloud computing. They concluded that chili price volatility occurs because economic variables vary over time and is influenced by price variations in the market. Anggraeni et al. (2018) employed an artificial neural network (ANN) with consumer-level prices, chili production, and consumption as input variables. Their best forecasting model had four input nodes, four hidden nodes (in one layer), and one output node.
In this study, we first investigated three types of chilies for supply and demand models using simultaneous equation systems. To support this, we examined previous research on chili types that used panel data, simultaneous equation systems, and demand and supply models.
Here, we determined the state of the art from 66 relevant papers, published from 2010 to September 2021, obtained using the Google Scholar search engine. We then queried the 66 relevant papers using the keyword combination "Panel Data" AND "Simultaneous Equation System" AND "Red Chili" AND "Curly Chili" AND "Cayenne Pepper," which returned no results. Further queries that dropped curly chili also returned no results. This indicates that the topic of panel data with a simultaneous equation system for red chili, curly red chili, and cayenne pepper, or for red chili and cayenne pepper only, is wide open; the latter combination is covered in this study. A study by Sativa et al. (2017) on a red chili simultaneous equation system produced one identity equation and four structural equations; however, that study only considered one type of chili. Our study considered red chili, curly chili, and cayenne peppers. Here, we argue that each type of chili influences the others; hence, panel data modeling with a simultaneous equation system is considered the most suitable for the supply and demand model. The simultaneous equation model is a regression model with more than one equation and a reciprocal relationship between the equations (Gujarati and Porter 2009).
This study applied business intelligence and analytical approaches for descriptive, predictive, and prescriptive analyses. The descriptive analysis summarizes the data through summary measures and their distribution and then presents the data as plot series. Based on the trend patterns, we can observe the dynamics of each study variable and the alignment of the trend patterns between the variables. The predictive analysis examines the factors influencing the production, consumption, and price of fresh chili and how these three variables are related. The prescriptive analysis provides suggestions and recommendations as a follow-up to the results of the predictive analysis. Thus, our research question is: How do the chili supply and demand model outcomes support the development requirements for an innovative chili enterprise system? Our research objectives (RO) are as follows:
RO1. Implementing descriptive analysis to obtain an overview of the trend patterns of the production, consumption, and prices of three types of fresh chili in the panel data.
RO2. Applying the simultaneous equation model to panel data, accompanied by predictive and prescriptive analysis, to observe how changes in a factor affect the production, consumption, and price of three types of fresh chili simultaneously and to support the innovative enterprise system.
Our study addresses the problems associated with the efficiency of the fresh chili agro-system by using a simultaneous equation system model. The novelties contributed by this study are as follows:
(1) More than one fresh chili type was studied using a simultaneous equation system model.
(2) The panel data for the simultaneous equation system combine a time series (the years 2010-2020) and a cross-section of the chili production centers.
(3) The resulting model will be the foundation for developing an innovative fresh chili enterprise system.

Literature Review
In large cities with more than one million people, chili consumption is approximately 66,600 tons/month or 800,000 tons/year. This need increases by approximately 10-20% on religious holidays or national celebration days (Kementerian Perdagangan 2016). Chili production and cultivated land have spread to almost all 34 Indonesian provinces. According to the Ministry of Agriculture (2021), the largest red chili production centers in 2020 were West Java (266,067 tons), North Sumatra (193,862 tons), and Central Java (166,269 tons), and for cayenne peppers they were the East Java (684,943 tons), Central Java (159,099 tons), and West Java (130,838 tons) provinces. With Indonesia's population of 271,066,366 people in 2020, the general consumption of red chili and cayenne pepper was somewhat elevated. According to the Ministry of Agriculture (2021), in 2020 and 2021, the consumption of red chili was 2.020 kg/capita and 2.201 kg/capita, respectively, while that of cayenne pepper was 1.769 kg/capita and 1.854 kg/capita, respectively. This means the total consumption in 2020 and 2021 was 547,554 tons and 596,617 tons for red chili and 479,516 tons and 502,557 tons for cayenne pepper, respectively.
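The national totals are per-capita consumption multiplied by the population. The short Python sketch below reproduces that arithmetic from the figures quoted in the text; the function name is illustrative, and applying the 2020 population to both years is an assumption made here for the illustration.

```python
POPULATION_2020 = 271_066_366  # people, as quoted in the text

# Per-capita consumption in kg/capita, as quoted in the text.
PER_CAPITA_KG = {
    ("red chili", 2020): 2.020,
    ("red chili", 2021): 2.201,
    ("cayenne pepper", 2020): 1.769,
    ("cayenne pepper", 2021): 1.854,
}


def total_consumption_tons(per_capita_kg, population=POPULATION_2020):
    """Total consumption in metric tons (1 ton = 1000 kg)."""
    return per_capita_kg * population / 1000.0


# Assumption: the 2020 population is applied to both years.
totals = {key: total_consumption_tons(kg) for key, kg in PER_CAPITA_KG.items()}

for (chili, year), tons in sorted(totals.items()):
    print(f"{chili} {year}: {tons:,.0f} tons")
```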
Many chili-related studies from January 2021 to September 2021 used panel data to examine the impact of food inflation on the consumer price index (Caroline and Nairobi 2021), price determination (Karabiyik et al. 2021), and pricing with a fixed-effect model (He et al. 2021). Other related chili studies used a system of simultaneous non-linear equations for agribusiness companies in Japan (Chung et al. 2021) and red chili supply models employing the Bayesian method (Fajar and Winarti 2021).
The simultaneous equation system can be estimated using two-stage least squares (2SLS) or three-stage least squares (3SLS). The benefit of the 2SLS method compared with 3SLS is that it is more straightforward and yields consistent and asymptotically efficient parameter estimates (Sativa et al. 2017). The 3SLS method is more efficient than 2SLS because it captures the cross-equation correlations (Neog and Gaur 2020). However, although 3SLS produces more efficient parameter estimates, it is very sensitive to changes in the model specification and requires more samples than the 2SLS method (Neog and Gaur 2020).
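To make the two stages concrete, the following is a minimal Python sketch of 2SLS for the simplest case: one endogenous regressor (price) and one instrument (an exogenous supply shifter). The market, its coefficients, and all variable names are hypothetical and are not taken from the study; the sketch only illustrates why ordinary least squares is biased when price and quantity are determined jointly, and how the two stages remove that bias.

```python
import random
from statistics import mean


def ols(x, y):
    """One-regressor OLS; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(y, endog, instrument):
    """2SLS with a single endogenous regressor and a single instrument."""
    # Stage 1: project the endogenous regressor on the instrument.
    c0, c1 = ols(instrument, endog)
    fitted = [c0 + c1 * z for z in instrument]
    # Stage 2: regress the outcome on the stage-1 fitted values.
    return ols(fitted, y)


# Hypothetical market (illustrative numbers only):
#   demand: q = A0 + A1 * p + u
#   supply: q = B0 + B1 * p + B2 * z + v, with z an exogenous supply shifter
random.seed(0)
A0, A1 = 100.0, -2.0
B0, B1, B2 = 10.0, 1.0, 3.0

z = [random.uniform(0, 10) for _ in range(5000)]
p, q = [], []
for zi in z:
    u, v = random.gauss(0, 5), random.gauss(0, 5)
    # The equilibrium price equates demand and supply.
    pi = (B0 - A0 + B2 * zi + v - u) / (A1 - B1)
    p.append(pi)
    q.append(A0 + A1 * pi + u)

ols_slope = ols(p, q)[1]                           # biased: p is correlated with u
tsls_slope = two_stage_least_squares(q, p, z)[1]   # consistent for A1

print(f"true demand slope: {A1}")
print(f"OLS estimate:      {ols_slope:.3f}")
print(f"2SLS estimate:     {tsls_slope:.3f}")
```

The sketch returns point estimates only. Valid 2SLS standard errors need the residuals computed from the original regressor rather than from the fitted values, which is one reason to rely on an established estimation procedure in practice.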
Supply and demand are excellent foundations for assessing the development of enterprise systems. The first step toward enterprise system development is to determine a suitable chili enterprise architecture and the available enterprise architecture frameworks that characterize the chili market conditions (Rachmaniah et al. 2022). Strategies for enterprise development may consider a modeling framework consisting of three modeling views: business, analytical design, and data preparation (Nalchigar and Yu 2018). The three modeling views link an enterprise strategy with analytical algorithms and data preparation activities. Many researchers have studied enterprise systems design and critical success factors, for instance, regarding top management support (Shao et al. 2016), a steering committee (Murphy et al. 2016), dynamic capability (Niemi and Laine 2016), small and medium enterprises (Huang et al. 2018; Thiak 2018), and their analysis (Kurnia et al. 2019).
Data mining combines various domains, such as statistics, exploratory data analysis techniques, artificial intelligence systems, and databases. Data mining develops the 'intelligence' of an intelligent decision support system (IDSS) and utilizes different techniques, such as classification, pattern recognition, clustering analysis, association rule mining, and data visualization (Belciug and Gorunescu 2020). The IDSS uses data mining for two main tasks: description and prediction. The IDSS can also handle anomaly detection (i.e., missing data or outlier detection) and association rules for commerce. Decision support systems (DSS) have been extensively researched in diverse facets of agriculture and farming (Rupnik et al. 2019), showing an increasing trend toward (i) predictive modeling, (ii) data interpretation, and (iii) the study of statistical facts. Other agricultural DSS applications include agricultural appraisal in dryland areas (Suroso et al. 2014) and agribusiness investment (Suroso and Ramadhan 2012).
Data mining is closely related to business intelligence and analytics (BIA) frameworks. The most significant factors that determine the success of business intelligence initiatives relate not to technology but to the firm confidence of all users in business intelligence and to the soft competencies and talents that business intelligence requires (Olszak 2016). That study's findings show that most of the organizations surveyed must increase their "analytical knowledge" and think more creatively about probable data sources. The modeling framework comprises three modeling views: business, analytics design, and a data preparation view (Nalchigar and Yu 2018). All the modeling views link the enterprise strategy with analytic algorithms and data preparation activities.

Data Collection Method
Chili data on production, harvested area, and price are primarily collected monthly (Direktorat Jenderal Hortikultura Kementerian Pertanian and Badan Pusat Statistik 2020). The data come from sub-district production centers, as most farmers are in sub-district areas (i.e., the supply side). Excluding the non-production center (DKI Jakarta province), approximately 6521 sub-districts of 416 agricultural district offices (Table 1) manually collected the data and entered them into the online horticulture application (available at sipedas.pertanian.go.id). Note that Indonesia's government structure runs from sub-districts, districts, and provinces up to the center; therefore, the data source is the sub-district, aggregated into the upper levels. The BPS survey showed eight main patterns of the chili distribution chain; the combination of actors involved in the chain formed the main pattern of the chili agro-system (Figure 1). Consequently, the more actors involved, the higher the end-consumer prices owing to transportation and trade margins. In 2021, red chili production was dominated by patterns 3, 2, 1, and 4, whereas cayenne production was dominated by patterns 2, 1, 3, and 4 (Table 1). On the demand side, Indonesia's total population of 271 million in 2020 served as the chili consumers. Over time, the data collected by the BPS and the Ministry of Agriculture's DJH have grown; consequently, the 'V' characteristics of big data have become an issue: volume, validity, value, variability, velocity, venue, veracity, visibility, volatility, and visualization. Big data analytics might help farmers decide on an optimal chili production schedule, including their logistics and distribution chains (Siregar and Suroso 2021).
Following the state-of-the-art results described in the Introduction, the intended types of fresh chili covered in this study were red chili (CMB), curly red chili (CMK), and cayenne peppers (CRM). These three types were analyzed to determine their demand and supply relationship simultaneously. Monthly data were collected from structured data sources from various districts in Indonesia for the period 2010-2020. In this study, we assumed that the chili farmers were the primary producers of fresh chilies participating in the farming and selling of the CMB, CMK, and CRM. Every month, the Ministry of Agriculture's DJH and BPS manually collected data using a printed three-fold form of the seasonal fruit vegetable horticulture survey (SPH-SBS). The SPH-SBS data were collected monthly (time series), and the data source was the production center locations at the sub-district level in Indonesia (cross-section). Table 2 lists the data requirements. The various granularities of the available data affected the model, including its evaluation and validation. Granularity refers to the discrepancy in the level of detail of the available data, from the national data down to the province and district data and vice versa, by the type of chili, or by the frequency of the data (i.e., monthly or yearly). Table 3 shows our findings on the availability of the data collected from various sources. Table 3 further shows that data for the actual retail price and domestic chili prices were available for the CMB, CMK, and CRM; however, our data did not distinguish between the CMB and CMK types for the remaining variables, and instead referred to them as red chili (CM/CB). Hence, we hypothesized that red chili and cayenne pepper could simultaneously form the chili supply and demand model.

Analysis Methods and Techniques
The data collected in this study can be in the form of cross-sectional data, univariate/multivariate time-series data, or panel data (Figure 2). Cross-sectional data are measured at the same time but at different locations, whereas time-series data are measured at the same location but at different times. Panel data combine the cross-sectional and time-series data, and a panel data model comprises a single equation or a simultaneous equation model. This study used panel data with a simultaneous equation system. Further, our analytical methods combined econometric and data mining techniques in a research flow of four stages, as shown in Figure 3. We applied a data mining approach during the first stage, covering summarization, data visualization, and data preprocessing, to assess the quality of the collected data. The model specification can be adjusted based on the results of this data mining technique. Note that the estimation results serve various simulation scenarios, such as labor wage, price, demand, and production simulations.
Due to granularity constraints, as depicted in Table 3, the time-series units used yearly data (2010-2020), and the cross-sectional units used provincial data (33 provinces), excluding the non-production center (DKI Jakarta province). The collected data were packaged into one dataset of provinces with annual time units, and we obtained 363 observations (33 provinces over 11 years). We used the two-stage least squares (2SLS) method to estimate the six constructed models. Here, we used four criteria for the model fit: the analysis of variance Pr > F-stat (at the 1%, 5%, and 10% significance levels), the adjusted R-squared, the Durbin-Watson (DW) test, and the parameter estimate results. Satisfactory prediction results have high adjusted R-squared values, and a flawless model has an R-squared of 1, but this rarely occurs (Forsyth 2018). The Durbin-Watson test for autocorrelation also helps determine the appropriate functional form of a cross-sectional model. The limits of d are 0 and 4, and a d value near 2 indicates no autocorrelation, whether positive or negative (Gujarati and Porter 2009). The model specifications are as follows.
The chili supply (QSC), which comprises the supply of red chili (QSCM) and cayenne pepper (QSCR) in location k and time period t, is an identity equation. QSCM consists of the quantity of red chili production (QCM) plus the volume of red chili imports (MCM) minus the red chili exports (XCM). Likewise, QSCR consists of the amount of cayenne pepper production (QCR) plus the cayenne pepper imports (MCR) minus the cayenne pepper exports (XCR). We assume that imported and exported chilies are homogeneous products calculated from the exports and imports of fresh chilies. The formulated identity equations for the national chili supply are as follows:
$$QSC_{k,t} = QSCM_{k,t} + QSCR_{k,t} \tag{1}$$
$$QSCM_{k,t} = QCM_{k,t} + MCM_{k,t} - XCM_{k,t} \tag{2}$$
$$QSCR_{k,t} = QCR_{k,t} + MCR_{k,t} - XCR_{k,t} \tag{3}$$
where QSC is the total chili supply, QSCM is the supply of red chili, QSCR is the supply of cayenne peppers, QCM is the production of red chili, MCM is the import volume of red chili, XCM is the export volume of red chili, QCR is the production of cayenne peppers, MCR is the import volume of cayenne peppers, and XCR is the export volume of cayenne peppers.

Chili Production Model
The production of red chili (QCM) and cayenne pepper (QCR) at time t and location k is a function of the chili prices, alternative input prices, and the remaining variables that affect supply. The price of NPK fertilizer was used as the basic fertilizer price (PNR) (Farid and Subekti 2012; Sukmawati et al. 2014). In the structural equations, Equations (4) and (5), LQCM is the lag of red chili production, LPCM is the red chili harvested area, UTK is the actual labor wage, HRCM is the actual red chili producer price, PNR is the actual basic fertilizer price, CH is the dummy season, LQCR is the lag of cayenne pepper production, LPCR is the cayenne pepper harvested area, and HRCR is the actual cayenne pepper producer price.
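The variable list above implies a linear specification. As a hedged sketch only, assuming a linear form, regressors in the order of the variable list, and coefficient letters a and b chosen by analogy with the e, g, and j coefficients of Equations (8), (10), and (13), Equations (4) and (5) could take the following form. The inclusion of UTK, PNR, and CH in the cayenne pepper equation is an assumption made by symmetry with the red chili equation.

$$QCM_{k,t} = a_0 + a_1 LQCM_{k,t-1} + a_2 LPCM_{k,t} + a_3 UTK_{k,t} + a_4 HRCM_{k,t} + a_5 PNR_{k,t} + a_6 CH_{k,t} + u1_{k,t} \tag{4}$$
$$QCR_{k,t} = b_0 + b_1 LQCR_{k,t-1} + b_2 LPCR_{k,t} + b_3 UTK_{k,t} + b_4 HRCR_{k,t} + b_5 PNR_{k,t} + b_6 CH_{k,t} + u2_{k,t} \tag{5}$$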

Chili Import Models
Red chili imports (MCM) and cayenne pepper imports (MCR) at time t and location k aim to meet consumption needs and stabilize prices, which often increase sharply. In the structural equations for imports, Equations (6) and (7), LMCM is the lag of red chili imports, RPIC is the ratio of international to domestic chili prices, QCM is red chili production, QDCM is the demand for domestic red chili, DR is a dummy for the reference price, LMCR is the lag of cayenne pepper imports, QCR is cayenne pepper production, and QDCR is the demand for domestic cayenne pepper.
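As a hedged sketch only, assuming a linear form that mirrors the export equation (8), regressors in the order of the variable list, and coefficient letters c and d chosen to continue the alphabetical pattern of the other equations, Equations (6) and (7) could take the following form. The inclusion of RPIC and DR in the cayenne pepper equation is an assumption made by symmetry.

$$MCM_{k,t} = c_0 + c_1 LMCM_{k,t-1} + c_2 RPIC_{k,t} + c_3 QCM_{k,t} + c_4 QDCM_{k,t} + c_5 DR_{k,t} + u3_{k,t} \tag{6}$$
$$MCR_{k,t} = d_0 + d_1 LMCR_{k,t-1} + d_2 RPIC_{k,t} + d_3 QCR_{k,t} + d_4 QDCR_{k,t} + d_5 DR_{k,t} + u4_{k,t} \tag{7}$$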

Chili Export Models
Exports of red chili (XCM) and cayenne pepper (XCR) are small in volume owing to high domestic demand. The structural equation for chili exports is as follows:
$$XCM_{k,t} = e_0 + e_1 LXCM_{k,t-1} + e_2 RPIC_{k,t} + e_3 QCM_{k,t} + e_4 QDCM_{k,t} + e_5 NTX_{k,t} + u5_{k,t} \tag{8}$$
where LXCM is the lag of red chili exports, NTX is the real IDR exchange rate, and LXCR is the lag of cayenne pepper exports.
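By analogy with Equation (8), and assuming the same functional form with the red chili variables replaced by their cayenne pepper counterparts, the cayenne pepper export equation (9) can be sketched as follows. The coefficient letter f and the error label u6 are assumptions that continue the numbering pattern.

$$XCR_{k,t} = f_0 + f_1 LXCR_{k,t-1} + f_2 RPIC_{k,t} + f_3 QCR_{k,t} + f_4 QDCR_{k,t} + f_5 NTX_{k,t} + u6_{k,t} \tag{9}$$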

Chili Demand Models
Fresh chili is the primary ingredient in various cooking spices; therefore, the demand is relatively stable. Fresh chili in Indonesia is not only consumed fresh by households but is also used for seeds and in chili processing industries, such as sauces and chili powders, the instant noodle industry, and other processing industries, both food and non-food. The structural equations of the demand for domestic red chili (QDCM) and domestic cayenne pepper (QDCR) are as follows:
$$QDCM_{k,t} = g_0 + g_1 LQDCM_{k,t-1} + g_2 HCM_{k,t} + g_3 PDD_{k,t} + u7_{k,t} \tag{10}$$
where LQDCM is the lag of the red chili domestic demand, HCM is the actual retail price of red chili, PDD is the population at location k in period t, LQDCR is the lag of the domestic demand for cayenne pepper, and HCR is the actual retail price of cayenne pepper.
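By analogy with Equation (10), and assuming the same functional form with LQDCR and HCR in place of LQDCM and HCM, the cayenne pepper demand equation (11) can be sketched as follows. The coefficient letter h and the error label u8 are assumptions that continue the numbering pattern.

$$QDCR_{k,t} = h_0 + h_1 LQDCR_{k,t-1} + h_2 HCR_{k,t} + h_3 PDD_{k,t} + u8_{k,t} \tag{11}$$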

Chili Price Models
Chili availability and demand influence the price model, together with the volume of chili exports and imports. The structural equations for the red chili price (HCM) and the cayenne pepper price (HCR) are as follows:
$$HCR_{k,t} = j_0 + j_1 QDCR_{k,t} + j_2 QSCR_{k,t} + j_3 LXCR_{k,t-1} + j_4 LMCR_{k,t-1} + u10_{k,t} \tag{13}$$
where QDCM is the domestic demand for red chili, QSCM is the supply of red chili, LXCM is the lag of red chili exports, LMCM is the lag of red chili imports, QDCR is the cayenne pepper domestic demand, QSCR is the cayenne pepper supply, LXCR is the lag of cayenne pepper exports, and LMCR is the lag of cayenne pepper imports.
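By analogy with the cayenne pepper price equation, and assuming the same functional form with the red chili variables from the list above, the red chili price equation (12) can be sketched as follows. The coefficient letter i and the error label u9 are assumptions that continue the numbering pattern.

$$HCM_{k,t} = i_0 + i_1 QDCM_{k,t} + i_2 QSCM_{k,t} + i_3 LXCM_{k,t-1} + i_4 LMCM_{k,t-1} + u9_{k,t} \tag{12}$$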

Simulation of the Model
The simulation used two types of code: code for validation and code for simulation. The validation code was run before the simulation code, and its results then served as the baseline for assessing the impact of the simulation results. The code for the six simulation scenarios used the same segment labels, SIMNLIN procedure, and parameter descriptions. The scenarios differed in the structural equation and identity equation segments, which depended on the percentage change of each scenario: (i) UTK increases by 10%, (ii) HCM increases by 50%, (iii) HCR increases by 50%, (iv) QDCM increases by 15%, (v) QDCR increases by 15%, and (vi) QCM and QCR both decrease by 25%. The percentage increases or decreases were determined based on phenomena that often occur in the fresh chili agro-system, especially during national/religious holidays or the dry/rainy seasons.
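The six scenarios can be written down as percentage shocks applied to baseline values. The Python sketch below is an illustration only and not the study's simulation code: it does not re-solve the simultaneous system, the baseline numbers are hypothetical, and the function and dictionary names are assumptions.

```python
# Percentage shocks for scenarios (i)-(vi), as listed in the text.
SCENARIOS = {
    "i": {"UTK": 0.10},
    "ii": {"HCM": 0.50},
    "iii": {"HCR": 0.50},
    "iv": {"QDCM": 0.15},
    "v": {"QDCR": 0.15},
    "vi": {"QCM": -0.25, "QCR": -0.25},
}


def apply_scenario(baseline, shocks):
    """Return a copy of baseline with each shocked variable scaled by (1 + pct)."""
    shocked = dict(baseline)
    for name, pct in shocks.items():
        shocked[name] = baseline[name] * (1.0 + pct)
    return shocked


# Hypothetical baseline values (index = 100), not data from the study.
baseline = {name: 100.0 for name in
            ("UTK", "HCM", "HCR", "QDCM", "QDCR", "QCM", "QCR")}

shocked = {label: apply_scenario(baseline, shocks)
           for label, shocks in SCENARIOS.items()}

for label, values in shocked.items():
    changed = {k: round(v, 2) for k, v in values.items() if v != baseline[k]}
    print(f"scenario ({label}): {changed}")
```

In a full simulation, each shocked input would be fed back into the estimated equations so that the remaining endogenous variables adjust; the sketch stops at defining the shocks.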

Data Summarization and Visualization
The data mining technique includes summarization, visualization, and data preprocessing for all the variables listed in Table 2. Our investigation revealed that all the data variables we collected had different granularities by the type of chili, the level of data availability (i.e., district, province, and national), and the frequency of the data (i.e., monthly or yearly). The following paragraphs describe sample discrepancies in the data granularities for the production data, harvested area data, actual labor wages, and producer/farmer prices.
The production data we collected had different granularities in the type of chili (no separate data for red chili/CMB and curly chili/CMK) and in the level of data availability (annual data at the district/province level for 2010-2020 and monthly data at the province level for the years 2016-2021). Most production data were unavailable at the district level or had zero values (missing data). Data for the CMK type were not available at all. The same was true for the harvested area data: CMK harvested area data were unavailable. In the 2018-2020 period, the national production of cayenne pepper (CRM) was higher than that of large chili/red chili (CB/CM), while the harvested area for the CRM in the same period was also higher than the CB/CM area. In 2017, the CRM harvested area was higher, but its production was lower than that of the CB/CM. The available harvested area data were in province and district units, with a large amount of missing data at the district level.
Another example of granularity is the actual labor wage. Its granularity differed in the level of data availability: the data were available only at the national level. The BPS uses two wage terms for farm workers: nominal and real wages. The real and nominal wage data for farm workers that we collected were monthly data at the national level for 2010-2021. This study used the real wage of labor because it describes the purchasing power of farm workers' income/wages. According to the BPS, the real wage of farm laborers is the ratio between the farm workers' nominal wages and the rural households' consumption index. The nominal wage of farm workers is the average daily wage received by workers as compensation for their labor. Our data show that although the nominal wages tended to rise, the real wages tended to remain flat and sometimes even declined. We observed that the real wage rose stepwise about every five years to a value close to the nominal wage; between these steps, it stayed flat while the nominal wage increased.
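The relationship between nominal and real wages can be illustrated with a short Python sketch. It follows the ratio described above; the scaling by a base of 100 is an assumption about how the index is expressed, and the wage and index values are hypothetical, not BPS data.

```python
def real_wage(nominal_wage, consumption_index, base=100.0):
    """Real wage as the nominal wage deflated by a consumption index.

    Assumption: the index is expressed relative to a base of 100.
    """
    return nominal_wage / consumption_index * base


# Hypothetical values for illustration only (not BPS data).
wage_start = real_wage(50_000, 125.0)
wage_later = real_wage(55_000, 137.5)

# The nominal wage rises by 10%, but so does the index,
# so the real wage stays flat.
print(wage_start, wage_later)
```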
The producer-level actual price data also had different granularities. The data were available for two types of chilies, red chili (CMB) and cayenne pepper (CRM). The available data were monthly data broken down by province from 2010 to 2020, while detailed monthly or daily data by district/city were unavailable for that period. In addition, the DKI Jakarta province data were unavailable because the province is not a chili production center. The data for the North Kalimantan province were unavailable for 2010-2019 but available for 2020.

Data Preprocessing
Our cross-section and time-series data had various granularities and contained many missing values, so data preprocessing was needed to provide a better model. There are many approaches for handling missing values: for example, treating values of '0' as missing values (Boardman et al. 2019); using multiple imputation to fill in the missing values with multiple informed guesses (Grant 2019); manually filtering out data of poor quality (such as incorrect, inconsistent, or irrelevant data) (Xu et al. 2019); and using profiling algorithms (Ramirez Ramirez et al. 2019). Other approaches replace the missing values with random values sampled from the non-missing values or with the mean of the non-missing values (Feng et al. 2019).
Other studies have imputed missing values automatically using the k-nearest neighbor algorithm (with k = 10), which uses the most similar instances to extrapolate and fill in the missing values (Rupnik et al. 2019). Sharda et al. (2020) describe imputing the most probable value, ignoring the missing values, or using domain knowledge and expert opinions. Falah and Rachmaniah (2022) used min-max normalization to a 0-1 range, to prevent data with large values from dominating data with small ranges, and replaced the missing values with the average value; however, the mean can be affected by outliers or extreme values. One way to handle extreme values is to use the median (Forsyth 2018). Outliers can occur because the extreme values are so small or so large that they interfere with the model. Outliers may also occur because of incorrect recording, incorrect summarization (sub-district to national), or data that vary too widely, which makes the data challenging to interpret. Therefore, our study handled the missing data using the median value.
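A minimal Python sketch of median imputation follows. It assumes that both empty entries and zeros encode missing values, in line with the zero-valued production records described earlier; the function name and the sample series are hypothetical.

```python
from statistics import median


def impute_with_median(values, missing=(None, 0)):
    """Replace missing entries with the median of the observed entries.

    Assumption: None and 0 both encode missing values.
    """
    observed = [v for v in values if v not in missing]
    if not observed:
        return list(values)  # nothing to impute from
    fill = median(observed)
    return [fill if v in missing else v for v in values]


# Hypothetical series with two missing entries and one extreme value.
series = [120.0, 0, 95.0, None, 4000.0, 110.0]
imputed = impute_with_median(series)
print(imputed)
```

With this series the median of the observed values is 115.0, whereas the mean would be 1081.25 because of the single extreme value, which illustrates why the median is the more robust fill value.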

Determination of the Model Specifications of the Simultaneous Equation System
Our study ultimately examined two chili types, red chili and cayenne pepper, and the model specification comprised a supply model with three identity equations and two structural equations each for the production, import, export, demand, and price models. Throughout, we used the CM suffix to denote red chili and CR to denote cayenne peppers.

Chili Supply Model
The supply of chili (red chili and cayenne pepper) at location k in time period t is an identity equation: the amount of production plus the volume of imports minus the volume of exports. We assume that imported and exported chilies are homogeneous products, counted from the exports and imports of fresh chilies. The chili supply model is an identity system consisting of Equations (1)-(3), which link the structural equations that follow.
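The supply identities themselves did not survive in this text. A plausible reconstruction from the variable definitions (production plus imports minus exports for each chili type, and the total as the sum of the two types) is sketched below; the exact form is an assumption rather than a quotation of Equations (1)-(3).

```latex
\begin{align}
QSC_{k,t}  &= QSCM_{k,t} + QSCR_{k,t} \tag{1}\\
QSCM_{k,t} &= QCM_{k,t} + MCM_{k,t} - XCM_{k,t} \tag{2}\\
QSCR_{k,t} &= QCR_{k,t} + MCR_{k,t} - XCR_{k,t} \tag{3}
\end{align}
```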

Chili Production Model
The structural equations for the chili production models are Equation (4) for red chili production (QCM) and Equation (5) for cayenne pepper production (QCR). Table 4a,b shows the models' estimation results.

Chili Import Models
The structural equations for the chili import models are Equation (6) for red chili imports (MCM) and Equation (7) for cayenne pepper imports (MCR). Table 5a,b shows the estimation results of these models.

Chili Export Models
The structural equations for the chili export models are Equation (8) for red chili exports (XCM) and Equation (9) for cayenne pepper exports (XCR). Table 6a,b shows the models' estimation results.

Chili Demand Models
The structural equations for the chili demand models are Equation (10) for the domestic red chili demand (QDCM) and Equation (11) for the domestic cayenne pepper demand (QDCR). Table 7a,b shows the models' estimation results.

Chili Price Models
The structural equations for the chili price models are Equation (12) for the red chili price (HCM) and Equation (13) for the cayenne pepper price (HCR). Table 8a,b shows the estimation results of the models.
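Because price and quantity are determined jointly, structural equations of this kind are estimated with two-stage least squares. The following is a minimal sketch of the two stages for a single hypothetical demand equation with one endogenous regressor (price) and one instrument; the data, coefficients, and function names are illustrative assumptions and are not the paper's Equations (4)-(13).

```python
def ols(x, y):
    """Simple least squares of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """2SLS for y = b0 + b1*x + u with a single instrument z for x."""
    a0, a1 = ols(z, x)                    # stage 1: project x on the instrument
    x_hat = [a0 + a1 * zi for zi in z]    # fitted values are purged of u
    return ols(x_hat, y)                  # stage 2: regress y on the fitted x


# Hypothetical toy data. The instrument z is uncorrelated with the shock u,
# while price depends on u and is therefore endogenous in the demand equation.
z = [1, 2, 3, 4, 1, 2, 3, 4]
u = [1, 1, 1, 1, -1, -1, -1, -1]
price = [1 + zi + ui for zi, ui in zip(z, u)]
demand = [20 - 2 * p + ui for p, ui in zip(price, u)]   # true slope is -2

ols_fit = ols(price, demand)
tsls_fit = two_stage_least_squares(z, price, demand)
print("OLS  (intercept, slope):", ols_fit)
print("2SLS (intercept, slope):", tsls_fit)
```

In this toy example the ordinary least squares slope is biased toward zero because price is correlated with the shock, while the two-stage estimate recovers the true slope. Note that standard errors taken from a manually run second stage are not valid; dedicated econometric software corrects them.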

Simulation Result
The chili agro-system involves numerous arrangements, such as collection from heterogeneous production centers and many active supply chain practices. It includes considerable trade and freight margins owing to transportation costs, and chili datasets that are segregated among government institutions (Rachmaniah et al. 2022). Consequently, we ran simulations on chili production (QCM and QCR) and prices (HCM and HCR), because price instability increases the risk of reducing farmers' incentive to plant (Rachmaniah et al. 2021). Simulations were also applied to the real labor wage (UTK) and the domestic chili demand (QDCM and QDCR). Table 9 lists the simulation results. The chili supply model comprises three identity equations: Equation (1) for the total chili supply, Equation (2) for the red chili supply, and Equation (3) for the cayenne pepper supply. These are elaborated into ten structural equations, Equations (4) to (13) (Table 10). An identity equation cannot show the behavior of endogenous variables, as it is formed only by the multiplication, division, addition, or subtraction of several variables (Gujarati and Porter 2009). Table 10 summarizes the supply and demand estimation results. Although some adjusted R-square and Durbin-Watson (DW) values were low, all the models fit. Table 10 also shows the two-way relationship between the endogenous and exogenous variables. For instance, the chili price determines the quantities demanded and supplied; simultaneously, the demand and supply quantities determine the price. Price and quantity are endogenous variables as they are computed simultaneously from the system of equations. Our finding aligns with the theories of Gujarati and Porter (2009) and Sativa et al. (2017).
The analysis of variance in Table 10 shows that the global F test is significant for all the models (p-value < 0.1), with p-values of < 0.0001 and < 0.0003, meaning that the models fit. In the red chili production model (QCM), the adjusted (Adj.) R-square is 0.95806, which means the model explains 95.81% of the variation in red chili production. An Adj. R-square of 0.95806 is close to 1, indicating that the model is excellent (Forsyth 2018). The Adj. R-square values of the remaining models can be read in the same way. Our use of the Adj. R-square to declare model fit is in line with another study, in which the coefficients of determination (R-square) of the average production function were 0.715 for the dry season and 0.652 for the rainy season, explaining 71.5% and 65.2% of the variation in chili production, respectively (Saidah et al. 2020). This study used the Adj. R-square instead of the R-square. The R-square is the proportion of the variability of the dependent variable explained by the model (its predictive ability), whereas the Adj. R-square is a correction to R² that accounts for the number of variables used in the model (Ohtani and Tanizaki 2004). Meanwhile, a DW value less than 1.33585 indicates autocorrelation, meaning that there is a correlation between the i-th and (i − 1)-th observations for i = 1, ..., 363. In other words, there is a relationship between $e_{k,t}$ and $e_{k,t-1}$: the error in observation k,t is related to the error in observation k,t−1. A DW value of 1.5-2.5 indicates no autocorrelation (Jeong and Jung 2016).
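The two diagnostics discussed here follow standard textbook formulas, sketched below. The inputs are hypothetical, and the Durbin-Watson function treats the residuals as one ordered series; in a panel setting the statistic would normally be computed within each cross-sectional unit, a detail this sketch leaves out.

```python
def adjusted_r_square(r_square, n_obs, n_regressors):
    """Adjust R-square for the number of regressors (intercept excluded)."""
    return 1 - (1 - r_square) * (n_obs - 1) / (n_obs - n_regressors - 1)


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical inputs: an R-square of 0.96 from 363 observations and 6 regressors,
# and a short residual series with visibly persistent signs.
print("Adj. R-square:", adjusted_r_square(0.96, 363, 6))
print("DW statistic: ", durbin_watson([0.8, 0.6, 0.5, -0.4, -0.7, -0.5, 0.3, 0.6]))
```

Residuals that keep the same sign over several consecutive observations push the statistic toward 0, while residuals that alternate in sign push it toward 4.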
The chili agro-system in Indonesia involves various factors, such as the chili producers (farmers), production center locations, government intervention, supply and demand, price disparities, actors in the trading system, and the environment. With these constituent factors, the chili agro-system essentially forms an enterprise system, as portrayed in the two-way relationship between the endogenous and exogenous variables shown in Table 10. These variables interact with one another in the supply and demand models, as described in the following simulation results. Structural equations in a simultaneous equation system describe the structure of an economic model of an economy or the behavior of economic agents (Gujarati and Porter 2009). An enterprise is a sociotechnical system consisting of interdependent human resources, information, and complex technology that interact in an environment supporting a shared mission (Giachetti 2016; Bernard 2012). Interaction is thus an essential behavior of an enterprise: it is an open system that interacts with its environment to achieve specified goals (Giachetti 2016).
Sustainable economic development requires stable monthly inflation. For this reason, price stability is a determining factor in keeping inflation within a specified inflation target policy. However, in Indonesia, there are eight main patterns of chili distribution with varying numbers of supply chain actors, which ultimately affects the price paid by the final consumers (Rachmaniah et al. 2022). Econometrically, prices can be stable if supply and demand are balanced. The implication is that a price control mechanism is needed: the production, consumption, and price of fresh chili must be monitored within a fresh chili enterprise system that considers the factors making up the Indonesian fresh chili agro-system. Furthermore, to develop an enterprise, it is necessary to select an enterprise architecture framework that is relevant to the conditions of the existing agro-system.
No previous studies have considered the application of an architectural enterprise (AE) framework to chili agro-systems (Rachmaniah et al. 2022). Meanwhile, the research by Sessions and DeVadoss (2014), Dorohyl et al. (2017), Bondar et al. (2017), and Mokone et al. (2019) states that the Zachman Framework has various advantages. Therefore, based on the previous descriptions, the AE framework chosen for the chili supply and demand model is the Zachman Framework.

Simulation Results of the Constructed Simultaneous Equation System
The validation results are presented in the column 'Actual Mean Validation' in Table 9, followed by the simulation results in the subsequent columns. Each scenario required specific coding. The simulation results were then compared with the validation results to determine the impact of the simulated changes. The formula used to determine the impact of the simulation (% shift) is:

Simulation Impact = (Mean Prediction − Actual Mean)/Actual Mean × 100%

When the real wage of labor (UTK) increases by 10%, the production of red chili (QCM) and cayenne pepper (QCR) increases by 1.66% and 3.68%, respectively. UTK is an exogenous variable in the QCM and QCR models. The results of the UTK simulation align with Saidah et al. (2020), who found that the increased production of red chili farming was due to increasing labor use. The increase in UTK also caused increases in other variables, namely:
• 2.57% in the total supply of fresh chili (QSC),
• 1.66% in the supply of red chili (QSCM),
• 3.58% in the supply of cayenne pepper (QSCR),
• 2.17% in the imports of red chili (MCM),
• 2.15% in the imports of cayenne pepper (MCR).
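The percentage shift used to read the simulation table can be computed as in the sketch below. The function name and the two mean values are hypothetical; they are chosen only to reproduce a shift of the same magnitude as the reported 1.66% and are not taken from Table 9.

```python
def simulation_impact(mean_prediction, actual_mean):
    """Percentage shift of a simulated mean relative to the validated mean."""
    if actual_mean == 0:
        raise ValueError("actual mean must be non-zero")
    return (mean_prediction - actual_mean) / actual_mean * 100.0


# Hypothetical means (tons) for one endogenous variable under one scenario.
actual_mean = 1000.0
mean_prediction = 1016.6

shift = simulation_impact(mean_prediction, actual_mean)
print(f"Simulation impact: {shift:.2f}%")
```

A positive value means the scenario raises the variable relative to the validated baseline, and a negative value means it lowers it.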
As an implication, data on the production, harvested area, and price of fresh chili must be collected at each instance of production (supply) and procurement (demand, i.e., purchases by supply chain actors from farmers). Data collection conducted once a month per sub-district faces accuracy problems because farmers, lacking proper recording tools, struggle to recall their monthly volumes. The study results regarding red chili and curly red chili prices imply that the prediction model is more accurate when using daily time-series data rather than weekly or monthly time-series data (Falah and Rachmaniah 2022).
Data should be collected directly from each chili farmer in a sub-district per incident and accumulated per district. This would require automating the data collection forms at the farmer level and aggregating them further at the sub-district level. Cellular phone owners made up 54.31% of the rural population (Badan Pusat Statistik 2021). This means that about 54.31% of rural farmers could potentially automate their data collection via cell phones; the remainder would still require intervention from the District Data Collection Officer.
The theoretical implication concerns the size of the estimated parameters generated from the panel data model of the simultaneous equation system, which supports the previous literature on the chili supply and demand model (Asidawati et al. 2022). The involvement of big data could assist in shaping the chili enterprise system through decision-making in production scheduling, logistics planning, and supply chains across regions (Siregar and Suroso 2021).
Generalizing the time-series data to an annual basis and the cross-sections to the provincial level eliminates detail and produces residuals that are less normally distributed, because the model examined only 363 observations, compared to the 951,456 observations that monthly data from the 7208 sub-districts would generate for the 2010-2020 period. The various data variables that are often needed in modeling should be managed under a one-gate policy. The authors hope that every government, education, research, and development institution will be able to access these data variables. Since collecting data from the root source is not easy, it is necessary to develop an innovative enterprise system for fresh chili. With this enterprise system, the authors hope that the data collection mechanism can be carried out systematically as part of the government institutions' main tasks and be automated at the micro level.
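The two observation counts can be checked with simple arithmetic, as sketched below. The figure of 33 cross-sectional units is an inference from 363 annual observations over 11 years and is an assumption, since the text states only the totals.

```python
# Study period stated in the text: 2010 to 2020 inclusive.
n_years = len(range(2010, 2021))
months_per_year = 12
n_sub_districts = 7208          # stated in the text

# Monthly sub-district panel that the authors would have preferred.
monthly_observations = n_sub_districts * months_per_year * n_years

# Annual provincial panel that was actually used.
annual_observations = 363       # stated in the text
implied_provinces = annual_observations // n_years   # assumption: balanced panel

print("years in panel:              ", n_years)
print("monthly sub-district records:", monthly_observations)
print("implied cross-section units: ", implied_provinces)
```

The comparison shows how much detail is lost when the panel is aggregated from monthly sub-district records to annual provincial records.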

Conclusions
The granularity and completeness of the panel data affected the results of the simultaneous equation system. The authors encountered obstacles in collecting monthly data for almost every variable required by the constructed model specification. The absence of monthly time-series data for the 2010-2020 period forced the model to switch to annual data. Sub-district or district data were also unavailable for most variables; therefore, the cross-sectional data were changed from district data to provincial data. In addition, the data collected did not distinguish between red chili and curly red chili, but the data collection for the CMB, CMK, and CRM would have been functional by 2021. Consequently, the researchers had to change the model specification from three types of chilies to two. It is important to note that Indonesia's Big Data for horticultural commodities is still in its early stages.
Indonesia's aspiration for One-Data Indonesia is still in progress. Implementing horticultural Big Data requires proactive collaboration and participation from each relevant institution and all actors in the horticultural agro-system. The various data sources are spread across institutions, from the smallest unit in the village/sub-district to the central level. Moreover, collecting data from the data owners, such as farmers, is also complex because farmers live in rural locations that are sometimes difficult to reach, and inadequate ICT literacy must be considered. Data completeness is essential for creating an accurate model, and an enterprise system model that spans the institutions and actors who hold and manage the data is needed. The simultaneous equation system on panel data resulted in six models, elaborated into three identity equations and ten structural equations. All the models obtained were statistically fit, although some models still had autocorrelation problems (DW < 2.00) and some were less suitable (Adj. R-square < 0.55). The constraints on the granularity and completeness of the data are the root causes of this problem. Micro-level data can produce a more precise model, which emphasizes the need for an innovative enterprise system for chili or horticulture. This constraint leads to the necessity of developing a chili enterprise system using the Zachman Framework for the architectural enterprise of chili agro-systems.

Figure 1. Number of provinces and actors according to chili's main pattern of distribution.

Figure 2. Econometrics methodology based on data type (adopted from various sources).
… Durbin-Watson (DW) test, and parameter estimate results. Satisfactory prediction results have high adjusted R-squared values; a flawless model has an R-square = 1, but this rarely occurs (Forsyth 2018). The Durbin-Watson test for autocorrelation also helps determine the appropriate functional form of a cross-sectional model. The limits of d are 0 and 4; a d value near 2 indicates no autocorrelation, while values toward 0 and 4 indicate positive and negative autocorrelation, respectively (Gujarati and Porter 2009). The model specifications are as follows:

Set and index intercept:
$a_0, b_0, c_0$ = Intercept of chili production model
$d_0, e_0, f_0$ = Intercept of chili import model
$g_0, h_0, i_0$ = Intercept of chili export model
$j_0, k_0, l_0$ = Intercept of chili demand model
$m_0, n_0, o_0$ = Intercept of chili price model

Parameter estimate:
$a_i, b_i$ = Estimated chili production model parameters (i = 1, 2, 3, 4, 5, 6)
$c_i, d_i$ = Estimated chili import model parameters (i = 1, 2, 3, 4, 5, 6)
$e_i, f_i$ = Estimated chili export model parameters (i = 1, 2, 3, 4, 5)
$g_i, h_i$ = Estimated chili demand model parameters (i = 1, 2, 3)
$i_i, j_i$ = Estimated chili price model parameters (i = 1, 2, 3, 4)

Table 1. Chili distribution main patterns and their related provinces (adopted from various sources).
Columns: Pattern No.; Province ID; No. of Provinces; Production Center (District, Sub-District); Population 2020 (thousands); Total Production 2021 (tons: Red Chili, Cayenne Pepper).
Number of provinces and actors according to chili's main pattern of distribution.

Table 3. Identification of Data Availability and Granularity from Various Data Sources.
$u2_{k,t}$ = Error of the chili production model at location k, period t
$u3_{k,t}, u4_{k,t}$ = Error of chili import model at location k, period t
$u5_{k,t}, u6_{k,t}$ = Error of chili export model at location k, period t
$u7_{k,t}, u8_{k,t}$ = Error of chili demand model at location k, period t
$u9_{k,t}, u10_{k,t}$ = Error of chili price model at location k, period t

Parameter:
$QSC_{k,t}$ = Total supply of chili at location k, period t (tons)
$QSCM_{k,t}$ = Supply of red chili at location k, period t (tons)
$QSCR_{k,t}$ = Supply of cayenne pepper at location k, period t (tons)
$QDCM_{k,t}$ = Demand of domestic red chili at location k, period t (tons)
$QDCR_{k,t}$ = Demand of domestic cayenne pepper at location k, period t (tons)
$LQDCM_{k,t-1}$ = Lagged demand for red chili at location k, period t − 1 (tons)
$LQDCR_{k,t-1}$ = Lagged demand for cayenne pepper at location k, period t − 1 (tons)
$QCM_{k,t}$ = Production of red chili at location k, period t (tons)
$QCR_{k,t}$ = Production of cayenne pepper at location k, period t (tons)
$LQCM_{k,t-1}$ = Lagged production for red chili at location k, period t − 1 (tons)
$LQCR_{k,t-1}$ = Lagged production for cayenne pepper at location k, period t − 1 (tons)
$HCM_{k,t}$ = Real retail price of red chili at location k, period t (IDR/kg)
$HCR_{k,t}$ = Real retail price of cayenne pepper at location k, period t (IDR/kg)
$HRCM_{k,t}$ = Real producer-level price of red chili at location k, period t (IDR/kg)
$HRCR_{k,t}$ = Real producer-level price of cayenne pepper at location k, period t (IDR/kg)
$MCM_{k,t}$ = Import volume of red chili at location k, period t (tons)
$MCR_{k,t}$ = Import volume of cayenne pepper at location k, period t (tons)
$LMCM_{k,t-1}$ = Lagged import volume for red chili at location k, period t − 1 (tons)
$LMCR_{k,t-1}$ = Lagged import volume for cayenne pepper at location k, period t − 1 (tons)
$XCM_{k,t}$ = Export volume of red chili at location k, period t (tons)
$XCR_{k,t}$ = Export volume of cayenne pepper at location k, period t (tons)
$LXCM_{k,t-1}$ = Lagged export volume for red chili at location k, period t − 1 (tons)
$LXCR_{k,t-1}$ = Lagged export volume for cayenne pepper at location k, period t − 1 (tons)

Table 4. Estimation results for (a) red chili production and (b) cayenne pepper production models.

Table 5. Estimation results for (a) red chili import and (b) cayenne pepper import models.

Table 6. Estimation results for (a) red chili export and (b) cayenne pepper export models.

Table 7. Estimation results for (a) red chili demand model and (b) cayenne pepper demand model.

Table 8. Estimation results for (a) red chili price model and (b) cayenne pepper price model.

Table 9. Simulation results of scenarios of increasing/decreasing the estimated parameters of the UTK, HCM and HCR, QDCM and QDCR, as well as the QCM and QCR.

Table 10. Summary of the Supply and Demand Model Estimation Results.