Incremental Design of Perishable Goods Markets through Multi-Agent Simulations

In current markets of perishable goods such as fish and vegetables, sellers are typically in a weak bargaining position, since perishable products cannot be stored for long without losing their value. To avoid the risk of spoiling products, sellers have few alternatives other than selling their goods at the prices offered by buyers in the markets. The market mechanism needs to be reformed in order to resolve unfairness between sellers and buyers. Double auction markets, which collect bids from both sides of the trades and match them, allow sellers to participate proactively in the price-making process. However, in perishable goods markets, sellers have an incentive to discount their bid gradually for fear of spoiling unsold goods. Buyers can take advantage of sellers’ discounted bids to increase their profit by strategic bidding. To solve the problem, we incrementally improve an online double auction mechanism for perishable goods markets, which promotes buyers’ truthful bidding by penalizing their failed bids without harming their individual rationality. We evaluate traders’ behavior under several market conditions using multi-agent simulations and show that the developed mechanism achieves fair resource allocation among traders.


Introduction
In the research of multi-agent simulations, several types of auction mechanisms have been investigated extensively to solve large-scale distributed resource allocation problems [1] and several applications have been proposed in different kinds of markets [2][3][4].
In the auctions, resources are generally supposed to have clear capacity limitations, and in some cases they also have explicit temporal limitations on their value [5]. In other words, the resources are modeled to be perishable in some problem settings. This study discusses the problem of trading perishable goods, such as fish and vegetables, in a market.
Several enterprises produce perishable goods or services and make profits by allocating them to dynamic demand before they deteriorate. In the service industries, revenue management techniques [6] have been investigated. Their objective is to maximize the revenues of a single seller because revenue management is typically practiced by the seller for its own profit. Therefore, it is difficult to apply those techniques to markets in which perishable products of multiple sellers must be traded in a coordinated manner to maximize social utility.
Agricultural and marine products are usually traded at spot markets that deal in already-produced goods, because their quantity and quality are highly uncertain in advance of production. (The opposite of a spot market is a forward market, which trades goods before production.) Hence, their production costs are fixed in the markets. Their salvage value is zero because they perish when they remain unsold in the markets. Therefore, their production costs are sunk and sellers' marginal costs are zero in the traditional one-shot markets for perishable goods. This justifies extensive use of one-sided auctions, in which only buyers submit bids, for deciding allocations and prices of perishable goods in fresh markets [7]. However, in such markets, sellers cannot straightforwardly influence price making to obtain fair profits [8].
In double auction (DA) markets [9], both buyers and sellers submit their bids and an auctioneer determines resource allocation and prices on the basis of their bids. Recent studies [10][11][12] have applied multi-attribute double auction mechanisms to perishable goods supply chain problems using mixed integer linear programming. Although their approach is powerful in dealing with idiosyncratic properties of perishable goods, such as lead time and shelf life, they have not considered the fluctuating nature of perishable goods markets, where supply and demand change dynamically and unpredictably.
To solve the problem, we develop an online DA market, in which multiple buyers and sellers dynamically tender their bids for trading commodities before their due dates. Our online DA market is developed as an instance of a call market, which collects bids continuously and clears the market periodically using predetermined rules. Since bids in the call market have multiple matching chances, the perishable goods in the call market can hold certain salvage values until their time limits. Therefore, sellers can participate in the price-making process of the online DA market, in which their reservation prices are equal to the remaining values of the goods. However, since the values of perishable goods decrease progressively, sellers have an incentive to discount their prices for fear of spoiling unsold goods. Taking advantage of this incentive, buyers can increase their surplus by bidding a lower price. Nevertheless, buyers need to bid a moderate price to secure the goods before their procurement time limits.

Related Research
Until recently, little research had been undertaken on the online DA. As a preliminary study for an online DA, an efficient and truthful mechanism for a static DA with temporally constrained bids was developed using weighted bipartite matching in graph theory [13]. In addition, for online DAs, some studies have addressed several important aspects of the mechanism, such as design of matching algorithms with good worst-case performance within the competitive analysis framework [14], construction of a general framework that facilitates a truthful dynamic DA by extending static DA rules [15], and an application to electric vehicle charging problems [16]. Although these research results are innovative and significant, we cannot directly apply their mechanisms to our online DA problem because their models incorporate the assumption that trade failures never cause a loss to traders, which is not true in perishable goods markets.
It has been demonstrated that no efficient and Bayesian-Nash incentive compatible exchange mechanism can be simultaneously budget balanced and individually rational [17]. For a static DA mechanism with temporal constraints, it is shown that the mechanism is dominant-strategy incentive compatible (or strategy-proof) if and only if the following three conditions are met: (1) its allocation policy is monotonic, where a buyer (seller) agent that loses with a bid (ask) $v$, arrival $a$ and departure $d$ also loses all bids (asks) with $v' \le v$ ($v' \ge v$), $a' \ge a$ and $d' \le d$; (2) every winning trader pays (or is paid) her critical value at which the trader first wins in some period; and (3) the payment is zero for losing traders [13]. (When we must distinguish between claims made by buyers and sellers, we refer to the bid from a buyer and the ask from a seller.) In the perishable goods market, the third condition cannot be satisfied because losing sellers have to give up the value of their unsold goods. Even when buyers bid truthfully, the sellers have a strong incentive to discount their valuation when their departure time is approaching and their goods remain unsold. Knowing that sellers have this incentive to discount their goods, buyers have an incentive to take advantage of the discounts by underbidding. A mechanism that fails to induce truthful behavior in its participants cannot be efficient, because it does not have the information necessary to make welfare-maximizing decisions. In addition, for online DA markets, it is impossible to achieve efficiency because of the imperfect foresight about upcoming bids [18]. Therefore, neither perfect efficiency nor incentive compatibility can be achieved in the online DA market for perishable goods. In order to develop stable and socially profitable markets for perishable goods, we need to investigate a mechanism that imposes (weak) budget balance and individual rationality while promoting reasonable efficiency and moderate incentive compatibility.
In perishable goods markets, sellers have an incentive to discount their price for fear of spoiling unsold goods. Taking advantage of such an incentive of the sellers, buyers can increase their surplus by bidding a lower price. In our previous research [19], we designed a new online DA mechanism called the criticality-based allocation policy. In order to reduce trade failures, the mechanism prioritizes bids with closer time limits over bids with farther time limits that might produce more social surplus. The proposed mechanism achieves fair resource allocation in the computer simulations, assuming that agents report their temporal constraints truthfully. However, in the proposed mechanism, traders have an incentive to bid late in order to improve allocation opportunities by increasing the criticality of their bids. This causes unpredictable behavior of traders and leads to deterioration of market efficiency.
In this study, extending our previous research, we incrementally develop a heuristic online DA mechanism for perishable goods with a standard greedy price-based allocation policy and achieve fair resource allocation by penalizing buyers' untruthful bids while maintaining their individual rationality. In the penalized online DA mechanism, buyers have an incentive to report a truthful value because they must pay penalties for their failed bids. At the same time, buyers' individual rationality is not harmed because buyers are asked to pay their penalty only on the occasion of a successful matching, in which the sum of a market-clearing price and the penalty does not exceed the valuation of the buyers' bids.
The rest of this paper is organized as follows: Section 2 introduces our market model. Section 3 explains the settings of multi-agent simulations to be used in the following sections. Section 4 presents a naive mechanism design in our online DA market and experimentally analyzes buyers' behavior trading perishable goods. Section 5 improves the market mechanisms and evaluates their performance based on market equilibria in an incremental fashion. Finally, Section 6 concludes.

Market Model
In this section, we build a model of online DA markets for trading perishable goods, discuss strategic bidding by traders, and define their utility in the markets.

Notations
In our online DA market, we consider discrete time rounds, $T = \{1, 2, \ldots\}$, indexed by $t$. For simplicity, we assume the market is for a single commodity. The market has two types of agents, either sellers ($S$) or buyers ($B$), who arrive and depart dynamically over time. In each round, the agents trade multiple units of the commodity. The market is cleared at the end of every round to find new allocations.
Each agent $i$ has private information, called type, $\theta_i = (v_i, q_i, a_i, d_i)$, where $v_i$ is agent $i$'s valuation of a single unit of the good, $q_i$ is the quantity of the goods that agent $i$ wants to trade, $a_i$ is the arrival time, and $d_i$ denotes the departure time. The gap between the arrival time and departure time defines the agent's trading period $[a_i, d_i]$, indexed by $p$, during which the agent can modify the valuation of its unmatched bid. Moreover, agents can repeatedly participate in the market over several trading periods.
We model our market as a wholesale spot market. In the market, seller $i$ submits a bid for her goods at arrival time $a_i$. At departure time $d_i$, the salvage value of the goods evaporates because of their perishability unless they are traded successfully. Seller $i$ must bring her goods to the market before her arrival. Therefore, seller $i$ incurs a production cost in her trading period and considers it as valuation $v_i$ of the goods at arrival time $a_i$. Because of advance production and perishability, sellers face the distinct risk of failing to recoup the production cost in the trade. Buyers procure the goods to resell them in their retail markets. Arrival time $a_j$ is the first time when buyer $j$ values the item. For buyer $j$, valuation $v_j$ represents her assumed budget to procure the goods. In addition to trade surplus, buyer $j$ gains profits by retailing the goods if she succeeds in procuring them before her departure time $d_j$, which is the due time for a retail opportunity.
Let $\hat\theta^t$ denote the set of all the agents' types reported in round $t$; a complete reported type profile is denoted as $\hat\theta = (\hat\theta^1, \hat\theta^2, \ldots, \hat\theta^t, \ldots)$; and $\hat\theta^{\le t}$ denotes the reported type profile restricted to the agents with a reported arrival of no later than round $t$. In each trading period $p$, agent $i$ deals with a new trade and has a specific type $\theta_i^p = (v_i^p, q_i^p, a_i^p, d_i^p)$. $\hat\theta_i^t = (\hat v_i^t, q_i^t, a_i^p, d_i^p)$ is a bid made by agent $i$ in round $t$ within trading period $p$ (i.e., $t \in [a_i^p, d_i^p]$). Note that successful trades in previous rounds of period $p$ reduce the current quantity of goods to $q_i^t \le q_i^p$. In the market, a seller's ask and a buyer's bid can be matched when they satisfy the matching condition (1).
The third term in the matching condition (1) (i.e., $d_i^p > d_j^p$) requires that the seller's goods must not perish before the buyer's due date.
Among matchable bids and asks, the market decides the final allocations based on its mechanism $M$, which is composed of an allocation policy $\pi$ and a pricing policy $x$. The allocation policy $\pi$ is defined as $\{\pi^t\}_{t \in T}$, where $\pi_{i,j}^t(\hat\theta^{\le t}) \in \mathbb{I}_{\ge 0}$ represents the quantity traded by agents $i$ and $j$ in round $t$, given reports $\hat\theta^{\le t}$. The pricing policy $x$ is defined as $\{x^t\}_{t \in T}$, $x^t = (s^t, b^t)$, where $s_{i,j}^t(\hat\theta^{\le t}) \in \mathbb{R}_{\ge 0}$ represents the price seller $i$ receives as a result of the trade in round $t$, given reports $\hat\theta^{\le t}$, with buyer $j$, who pays a price $b_{i,j}^t(\hat\theta^{\le t}) \in \mathbb{R}_{> 0}$.
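The bid representation and matchability test described in this subsection can be sketched in a few lines of Python. This is a minimal illustration, not the paper's definition: the class name `Bid`, its field names, and the function `matchable` are hypothetical, the price term (ask not exceeding bid) is an assumption based on standard DA practice, and only the deadline term quoted in the text is taken from the source. Any further term of condition (1) is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    """A reported bid (buyer) or ask (seller) in some round of a trading period."""
    agent: str       # hypothetical agent identifier
    value: float     # reported per-unit valuation
    quantity: int    # remaining quantity in the current round
    arrival: int     # arrival round of the trading period
    departure: int   # departure round of the trading period


def matchable(ask: Bid, bid: Bid) -> bool:
    """Check two terms of a matching condition between a seller's ask and a buyer's bid."""
    # Assumed price term: the seller's ask must not exceed the buyer's bid.
    price_ok = ask.value <= bid.value
    # Term quoted in the text: the seller's departure is later than the buyer's,
    # so the goods do not perish before the buyer's due date.
    deadline_ok = ask.departure > bid.departure
    return price_ok and deadline_ok
```

For instance, an ask at 50 that departs in round 48 is matchable with a bid at 60 that departs in round 40, but not with a bid at 40, nor with a bid that departs in round 58.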

Agent's Utility
Standard DA markets assume that agents have simple quasi-linear utility: $\sum_{j \in B} (s_{i,j} - v_i)\pi_{i,j}$ for seller $i$ and $\sum_{i \in S} (v_j - b_{i,j})\pi_{i,j}$ for buyer $j$. However, to represent the characteristics of a wholesale market for perishable goods, we define the idiosyncratic utility for sellers and buyers.
For seller $i$, when $\pi_{i,j}^t$ units of goods are sold to buyer $j$ at price $s_{i,j}^t$ in round $t$ within period $p$, seller $i$ obtains income $s_{i,j}^t$ per unit. Because the production cost of the seller is $v_i^p$, the seller's surplus is $(s_{i,j}^t - v_i^p)\pi_{i,j}^t$. If a unit of goods perishes in round $t$ without being traded, seller $i$ loses valuation $v_i^p$. Therefore, seller $i$'s utility is defined as follows.
Definition 2 (Seller i's utility at time round t).
The second term in Equation (2) represents that the sellers' production cost is sunk. The sunk cost term can be removed from the utility function without influencing the seller's behavior, but we intentionally include it to represent explicitly the seller's difficulty in making profits in the market: the value of her goods is completely lost because of perishability whether they are sold or not.
When buyer $j$ has a budget $v_j^p$ for purchasing one unit of goods and succeeds at procuring $\pi_{i,j}^t$ units of the goods at price $b_{i,j}^t$, she obtains surplus $(v_j^p - b_{i,j}^t)\pi_{i,j}^t$. In addition, because $v_j^p$ represents a retail price of the goods, buyer $j$ earns additional profit $\gamma_j^p v_j^p \pi_{i,j}^t$ by retailing the procured goods. We set 1.0 as the value of $\gamma_j^p$ for the empirical evaluation in Section 3. Thus, buyer $j$'s utility is defined as follows.
Definition 3 (Buyer j's utility at time round t).
Agents are modeled as risk neutral and utility maximizing. Equation (2) shows that the seller's bidding strategy is intricate because she can enhance her utility by either raising the market price with higher valuation bidding or preventing the goods from perishing with lower valuation bidding. Equation (3) reveals that the buyer also has difficulty finding an optimal bidding strategy because she can improve her utility by either bringing down the market price with lower valuation bidding or increasing successful trades and retail opportunities with higher valuation bidding.
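The two utilities can be illustrated with a short sketch that follows the prose description in this subsection. It is not a transcription of Equations (2) and (3), whose exact arrangement of terms may differ; the function names, the `trades` list format, and the `perished_units` parameter are hypothetical, introduced only for this example.

```python
def seller_utility(trades, true_value, perished_units=0):
    """Round utility of a seller.

    trades: list of (price_received_per_unit, units_sold) in this round.
    true_value: production cost per unit (the seller's true valuation).
    perished_units: units that expired unsold in this round (assumed input).
    """
    # Trade surplus: price received minus production cost, per unit sold.
    surplus = sum((price - true_value) * units for price, units in trades)
    # Each unit that perishes unsold loses its full valuation.
    return surplus - true_value * perished_units


def buyer_utility(trades, true_value, gamma=1.0):
    """Round utility of a buyer.

    trades: list of (price_paid_per_unit, units_bought) in this round.
    true_value: the buyer's budget per unit, also treated as the retail price.
    gamma: resale profit rate (1.0 in the paper's experiments).
    """
    # Trade surplus: budget minus price paid, per unit bought.
    surplus = sum((true_value - price) * units for price, units in trades)
    # Resale profit earned on every unit procured before the due time.
    resale = sum(gamma * true_value * units for _, units in trades)
    return surplus + resale
```

With a true valuation of 50, a seller who sells one unit at 55 gains 5, while a seller whose unit perishes loses 50. A buyer with a true valuation of 60 who pays 55 for one unit obtains 5 in surplus plus 60 in resale profit.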

Strategic Bidding by Agents
Agents are self-interested and their types are private information. At the beginning of a trading period, agent $i$ submits a bid to the auctioneer by making a claim about its type, $\hat\theta_i = (\hat v_i, \hat q_i, \hat a_i, \hat d_i)$, which may differ from its true type $\theta_i$. An agent's self-interest is exhibited in its willingness to misrepresent its type when this will improve the outcome of the auction in its favor. However, misrepresenting its type is not always beneficial or feasible for the agent.
Reporting an earlier arrival time is infeasible for both sellers and buyers because the arrival time is the earliest time at which they decide to sell or buy the goods in the market. Reporting a later arrival time or an earlier departure time can only reduce the chance of successful trades for the agents. For example, even though buyers like to take advantage of sellers' time discounting behavior in perishable goods markets, a belated arrival is not beneficial for buyers because they might miss sellers' low-priced asks that could have been matched had they bid earlier. Buyers can get better quotes by submitting a low-valued bid as early as possible. In other words, buyers do not lose their chance of buying at a discounted price by bidding low prices without misreporting their arrival time.
For a seller, it is impossible to report a later departure time $\hat d_i > d_i$ since the goods to be sold in the market perish by time $d_i$. For a buyer, misreporting a later departure time $\hat d_j > d_j$ may delay retailing the procured goods.
As for quantity, it is impossible for a seller to report a larger quantity $\hat q_i > q_i$ because the sold goods must be delivered immediately after trade in a spot market. Moreover, it is unreasonable for a buyer to report a larger quantity $\hat q_j > q_j$ because excess orders may produce dead stock for her.
In addition, although a seller can misreport a smaller quantity $\hat q_i < q_i$ with the intention of raising the market price, in that case, she needs to throw out some of her goods already produced for sale. If a buyer misrepresents a smaller quantity $\hat q_j < q_j$ to lower the market price, she loses a chance of retailing more goods. Although these misreports might create larger profits for sellers and buyers, finding the optimal quantity for increasing their profits is not a straightforward task. Therefore, in this study, we assume that the agents do not misrepresent a quantity value in their type.
On the other hand, we assume that an agent has incentives to misreport its valuation for increasing its profit because this is the most instinctive way for the agent to influence trades in a market. In a spot market for perishable goods, a seller may report a lower valuation $\hat v_i < v_i$ when she desperately wants to sell the goods before they perish. In addition, a buyer may report a lower valuation $\hat v_j < v_j$ to increase her profit when she can take advantage of the sellers' discounted bids.
Consequently, in this study, we assume that agent $i$ may report its limit price strategically (i.e., $\hat v_i^t$ may not be equal to $v_i$) but reports truthful values of quantity $q_i$, arrival time $a_i$, and departure time $d_i$ in any round $t$.

Multi-agent Simulation
Our online DA market is considerably more complex than traditional DA markets, and its theoretical analysis is intractable. Therefore, we need to evaluate the market empirically using multi-agent simulations.

Simulation Settings
In the simulation, 15 sellers and 15 buyers, each of which has one unit of supply or demand, dynamically participate in the market every day, so that the total quantities of demand and supply are balanced in the market. We use seven types of markets with distinctive demand-supply curves. In the n-th market, buyer $j$ has a unit of demand, whose value is $v_j = 60 - 2(j - 2n)$, and seller $i$ has a unit of supply, whose value is $v_i = 60 + 2(i - 2n)$. Figure 1a-c represent markets with different risks of trade failure. It is noteworthy that all the markets have the same competitive equilibrium price of 60 despite their differing risks of trade failure. Each simulation runs for 28 days, in which the market is cleared every hour (i.e., the duration of a time round is 1 h). Agents submit their bids to the market at a random time every day. A submitted bid expires 48 h after its submission (i.e., the duration of a trading period is 48 time rounds for every bid). Therefore, every agent has two valid bids in the market unless successfully matched bids have been removed from the market.
In the time interval of a trading period, agents can freely modify the reported valuation in their unmatched bid. Fifty randomized trials are executed to simulate the diversified temporal patterns of agents' biddings in seven types of markets.
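The valuation formulas above can be checked with a short sketch. It assumes that agents are indexed 1 to 15 and markets n = 1 to 7, which the text implies but does not state outright; the function names are hypothetical.

```python
NUM_AGENTS = 15          # 15 sellers and 15 buyers, one unit each
MARKETS = range(1, 8)    # assumed market indices n = 1, ..., 7


def buyer_values(n):
    """True valuations of buyers j = 1..15 in the n-th market."""
    return [60 - 2 * (j - 2 * n) for j in range(1, NUM_AGENTS + 1)]


def seller_values(n):
    """True valuations of sellers i = 1..15 in the n-th market."""
    return [60 + 2 * (i - 2 * n) for i in range(1, NUM_AGENTS + 1)]


def efficient_trades(n):
    """Count rank-matched pairs whose bid is at least the ask under truthful bidding."""
    bids = sorted(buyer_values(n), reverse=True)   # highest bid first
    asks = sorted(seller_values(n))                # lowest ask first
    return sum(1 for bid, ask in zip(bids, asks) if bid >= ask)
```

Under these assumptions, demand and supply cross at agent index 2n, where both valuations equal 60, which agrees with the common equilibrium price stated above. The number of profitable pairs is 2n, so a smaller n leaves more truthful asks above every truthful bid, which is consistent with the later remark that markets with a small n carry a high risk of trade failure.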

Agent's Bidding Strategies
As discussed in the Related Research section, it is impossible to achieve incentive compatibility in the online DA market for perishable goods, in which sellers suffer a loss when their goods remain unsold. In such markets, sellers adopt markdown pricing strategies to discount their asks progressively. Responding to the sellers' discounting behavior, buyers try to find the lowest matchable valuation to bid, considering market conditions and their true valuation. However, in order to achieve stable and fair trades, markets should give buyers a plausible incentive to bid truthfully. Therefore, we plan to design a market in which "for buyers, always bidding untruthfully makes less profit than always bidding truthfully," even when sellers bid time-discounted values. We call this property quasi incentive compatibility on the buyer's side. To behave optimally in such a market, buyers cannot stick to untruthful bidding but should switch to truthful bidding depending on the market situation. However, we assume that in a complex and unpredictable online market, dynamically adapting the bidding strategy correctly is beyond the average buyer's capability. Consequently, if a market satisfies quasi incentive compatibility for buyers, the buyers are supposed to maintain truthful bidding in the hope of being better off. In addition, we expect that sellers can make proper profits thanks to buyers' truthful bidding.
Experimental analyses of complex markets generally require many cycles of simulations using a large population of heterogeneous agents with sophisticated strategies [20]. However, in this study, we do not consider any sophisticated learning-based strategy, in which an agent's bidding behavior is dynamically determined based on a statistically updated market model, for the following reasons:

• Since the agents' population fluctuates randomly in online markets, the accurate status of a market is statistically unpredictable in the simulation.
• Since the market condition and the agents' true valuations are fixed in each simulation run, agents can presume that the market is an approximately static environment in the simulation.

As for buyers, we assume that buyer $j$ constantly bids with a certain deviation from her true valuation $v_j^p$ as follows:

1. Constant difference strategy (COND α): Buyer $j$ always reports her valuation as $\hat v_j^t = v_j^p - \alpha$. The value of α is constant for buyer $j$ and non-negative.
2. Constant rate strategy (CONR β): Buyer $j$ always reports her valuation at a constant rate, determined by β, relative to her true valuation $v_j^p$. The value of β is constant for buyer $j$ and 0 ≤ β ≤ 1.0.

As for sellers, we restrict our attention only to strategies that do not bid higher than their true valuation because sellers have a considerable risk of trade failures under our market conditions in which total quantities of demand and supply are balanced. Hence, in the experiments, we use the following strategies for sellers:

1. Modest strategy (MODE): Seller $i$ always reports her valuation as $\hat v_i^t = 0$.
2. Truth-telling strategy (TRUE): Seller $i$ always reports her valuation truthfully as $\hat v_i^t = v_i^p$.
3. Step discount strategy (STEP): Seller $i$ discounts her ask in a single step during the trading period.
4. Monotonous discount strategy (MONO): Seller $i$ discounts her ask progressively over the trading period.

MODE strategy is developed to simulate one-sided auction markets in which only buyers submit their bids. STEP strategy is a stepwise discount rule, in which a seller reports zero valuation after the midpoint of trading periods. MONO strategy is a typical instance of markdown pricing that simulates sellers' progressive time-discounting behavior in perishable goods markets.
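The COND buyer strategy and the four seller strategies can be sketched as simple functions of the true valuation and the current round. Several details here are assumptions rather than the paper's definitions: STEP is assumed to be truthful up to and including the midpoint of the trading period, and MONO is assumed to decline linearly from the true valuation at arrival to zero at departure. All function names are hypothetical.

```python
def cond_bid(true_value, alpha):
    """COND alpha: report the true valuation minus a constant difference alpha."""
    return true_value - alpha


def mode_ask(true_value, t, arrival, departure):
    """MODE: always report zero, as in a one-sided auction."""
    return 0.0


def true_ask(true_value, t, arrival, departure):
    """TRUE: always report the true valuation."""
    return float(true_value)


def step_ask(true_value, t, arrival, departure):
    """STEP: report zero after the midpoint of the trading period.

    Assumption: the ask is truthful up to and including the midpoint.
    """
    midpoint = (arrival + departure) / 2
    return float(true_value) if t <= midpoint else 0.0


def mono_ask(true_value, t, arrival, departure):
    """MONO: progressive markdown over the trading period.

    Assumption: linear decline from the true valuation at arrival
    to zero at departure (requires departure > arrival).
    """
    remaining = (departure - t) / (departure - arrival)
    return true_value * max(0.0, min(1.0, remaining))
```

For a 48-round trading period starting at round 0 and a true valuation of 48, the assumed MONO rule asks 48 at arrival, 24 at the midpoint, and 0 at departure.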

Primary Market Design
In the perishable goods markets described in Section 2, agents have to manipulate their valuation carefully in order to increase their utility. Our goal is to design a market mechanism that secures desirable outcomes for both individual agents and the entire market without strategic bidding by the agents.
Market mechanisms have traditionally been developed by designers who formulate a problem mathematically and characterize desirable mechanisms analytically in that framework. However, the mechanism design of complex markets, such as our perishable goods markets, is analytically intractable. Recently, computational methodologies for mechanism design, such as automated mechanism design [21], have been advocated. They solve a constrained optimization problem to ensure desirable properties of the mechanism, such as strategy-proofness, for every possible input of agents. We cannot apply these methods to solve our problem because we need to find a satisficing but not optimal mechanism for our market, in which neither perfect incentive compatibility nor complete efficiency can be achieved. Instead, we take a heuristic approach known as incremental mechanism design [22], which starts with a naive mechanism and incrementally improves it over a sequence of iterative evaluations.

Naive DA Mechanism
As the naive DA mechanism, we adopt standard allocation and pricing policies, which are commonly used in many studies on DA, in order to delineate idiosyncratic problems in traders' behavior in perishable goods markets.
A common goal of the allocation policy in DA markets is to compute trades that maximize social surplus, which is the sum of the difference between bid and ask prices for all matched bids, with the assumption that traders bid truthfully. Thus, the standard allocation policy arranges the asks in ascending order of the seller's price and the bids in descending order of the buyer's price, and matches the asks and bids in sequence. We refer to this allocation rule as a greedy allocation policy.
The greedy allocation policy is monotonic and is efficient in durable goods markets, in which agents do not lose utility when they fail to trade. However, sellers of perishable goods lose the value of perished goods when they fail to sell them during the trading period. Consequently, in addition to increasing social surpluses from trades, increasing the number of successful trades is important in the perishable goods markets for maximizing social utility. In our previous research [19], we advocated a criticality-based allocation policy that gives higher priorities to the expiring bids. Since the criticality-based allocation policy is not monotonic, traders have an incentive to misreport their valuation and temporal information.
The pricing policy is important to secure truthfulness and increase market efficiency. Nevertheless, since obtaining truthfulness in online DA markets for perishable goods is impossible, we impose both budget balance and individual rationality, and we promote reasonable efficiency and adequate truthfulness. We adopt the k-DA pricing policy [23], in which a clearing price for both sellers and buyers is determined as $(1 - k)\hat v_j^t + k \hat v_i^t$, thereby making the naive DA market budget balanced (i.e., $s_{i,j}^t = b_{i,j}^t$). We set the value of $k$ as 0.5 for experiments in this paper.
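A single clearing round of the naive mechanism, greedy allocation followed by k-DA pricing, can be sketched as below. This is a simplified illustration under stated assumptions: every bid and ask is for one unit, the temporal terms of the matching condition are ignored, and the function names and tuple format are hypothetical.

```python
def k_da_price(bid_value, ask_value, k=0.5):
    """k-DA clearing price: (1 - k) * bid + k * ask."""
    return (1 - k) * bid_value + k * ask_value


def greedy_clear(asks, bids, k=0.5):
    """Clear one round with greedy allocation and k-DA pricing.

    asks, bids: lists of (agent_id, reported_value), one unit each.
    Returns a list of (seller_id, buyer_id, price) for the matched pairs.
    """
    sorted_asks = sorted(asks, key=lambda a: a[1])                # cheapest ask first
    sorted_bids = sorted(bids, key=lambda b: b[1], reverse=True)  # highest bid first
    trades = []
    for (seller, ask), (buyer, bid) in zip(sorted_asks, sorted_bids):
        if ask > bid:
            break  # no remaining pair can trade at a non-negative surplus
        trades.append((seller, buyer, k_da_price(bid, ask, k)))
    return trades
```

With asks of 50 and 70 and bids of 60 and 55, only the pair (50, 60) trades, at a price of 55 when k = 0.5. The seller receives exactly what the buyer pays, which reflects the budget balance property noted above.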

Analyzing Buyers' Behaviors
Our goal in this research is to empirically design a perishable goods market that satisfies quasi incentive compatibility for buyers. As an initial step toward the goal, we evaluate the vulnerability of the naive DA mechanism to buyers' strategic behavior by analyzing buyers' utility in the seven types of markets when they have asymmetric strategy profiles. In the experiments, sellers adopt MONO strategy and buyers adopt COND α strategy. In the n-th market of the experiments, seller $i$'s valuation is $v_i = 60 + 2(i - 2n)$ and all the buyers have a common valuation of $v_j = 60 - 2(1 - 2n)$. It should be noted that, in the following experiments in which buyers have asymmetric strategy profiles, buyer $j$ does not have an individual true valuation $v_j = 60 - 2(j - 2n)$ as explained in Section 3.1. With an incentive to increase the trade surplus, buyer $j$ is assumed to misreport her valuation as smaller than the true valuation by an idiosyncratic value of $\alpha = 2(j - 1)$.

Simulating Static Markets with Naive DA Mechanism
Before investigating online markets, we analyze buyers' utility in static markets using the naive mechanism composed of a greedy allocation policy and a k-DA pricing policy. In the static markets, every agent $k$ enters and leaves the market simultaneously as $a_k^p = 24(p - 1)$, $d_k^p = 24(p + 1)$. Figure 2a-c show buyers' surplus, resale profit and total utility, respectively. The x-axes of the graphs represent 15 buyers in the markets. A buyer with a larger number bids a lower value than its truthful valuation (i.e., $\hat v_j^t = v_j - 2(j - 1)$). The graphs in each figure show the seven results obtained in distinctive market conditions. In the simulations, all the bids can be matched within the trading periods, because (1) sellers discount their asks monotonically to zero; and (2) quantities of supply and demand are always balanced in the markets. Thus, buyers' resale profits are constant, as shown in Figure 2b. Furthermore, buyers' surpluses are increased by buyers' strategic bidding, as shown in Figure 2a, because low-valued bids match discounted asks and yield large surpluses from their true valuation. Therefore, as shown in Figure 2c, we find that in static markets with a naive DA mechanism, buyers can obtain larger utility by misreporting their value than by bidding truthfully, especially in markets with high risk of trade failures (i.e., markets with a small n value).

Simulating Online Markets with Naive DA Mechanism
As a next step, we investigate buyers' utility in seven types of online markets with the naive DA mechanism.
Figure 3a-c show buyers' surplus, resale profit, and total utility, respectively. Vertical lines in the graphs show standard deviations of the obtained results. In online markets, where demand and supply are not always balanced because the agents' population changes dynamically, low-valued buyers' bids are likely to find no sellers' asks that satisfy the matching condition (1), which is not the case in static markets when sellers adopt a MONO bidding strategy. Therefore, increases of buyers' surplus by misreporting their value saturate at certain points, as shown in Figure 3a, and buyers' misreporting deteriorates resale profits, as shown in Figure 3b. Adding the results of buyers' surplus and resale profit, Figure 3c shows that modestly untruthful buyers still have higher utility than the truth-telling buyers in any type of online market.
From the experimental results in this section, we find that the naive DA mechanism cannot satisfy quasi incentive compatibility for buyers in perishable goods markets.

Incremental Improvement of Market Mechanism
We incrementally improve the naive DA mechanism and investigate the agents' equilibrium behavior in the market using multi-agent simulations.

Imposing a Penalty on Buyers' Trade Failures
In perishable goods markets, sellers have an incentive to discount their bids to reduce unsold goods. Buyers tend to underbid in the markets and wait for sellers' prices to fall, as shown in the simulation results in Section 4. To encourage buyers to bid high valuations, we impose a penalty on buyers for their matching failures. We advocate a penalized payment policy $\{x^t\}_{t \in T}$, $x^t = (s^t, b^t, p^t)$, where $p_j^t(\hat\theta^{\le t}) \in \mathbb{R}_{> 0}$ represents a penalty imposed on buyer $j$ as a result of her trade failures until round $t$, given reports $\hat\theta^{\le t}$. Although imposing the penalty on buyers seems unrealistic in practical market situations, it can be considered as a market entry fee for the buyers, which is refunded when they successfully trade in the market.
The penalty of buyer j is an average price gap between a buyer's bid and sellers' asks in the past matching failures, which is updated every time the market is cleared, as shown in Algorithm 1.

Buyer j's penalty information is stored in ave and cyc. The penalized pricing policy requires a buyer to pay the sum of a clearing price and the penalty when the buyer's bid is executed. If the required amount exceeds the buyer's bid price, the buyer refuses the payment, cancels the execution, and proceeds to another round of bidding. Therefore, the penalized pricing policy does not violate buyers' individual rationality, and the matching condition of the market changes as follows.
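The penalty bookkeeping and the buyer's refusal rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm 1: `ave` is taken to be the running average price gap and `cyc` the number of recorded failures, the gap is assumed to be the ask minus the bid (floored at zero), and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuyerPenalty:
    """Penalty state of one buyer; the meaning of `ave` and `cyc` is assumed."""
    ave: float = 0.0  # assumed: average price gap over past matching failures
    cyc: int = 0      # assumed: number of matching failures recorded so far

    def record_failure(self, bid: float, ask: float) -> None:
        """Fold one failed match into the running average of the price gap."""
        gap = max(ask - bid, 0.0)  # assumption: gap = ask - bid, never negative
        self.cyc += 1
        self.ave += (gap - self.ave) / self.cyc  # incremental mean update


def settle(bid: float, clearing_price: float, penalty: BuyerPenalty) -> Optional[float]:
    """Return the buyer's payment, or None if the buyer refuses the execution."""
    payment = clearing_price + penalty.ave
    if payment > bid:
        # Paying more than the bid would violate individual rationality,
        # so the buyer cancels the execution and bids again later.
        return None
    return payment


state = BuyerPenalty()
state.record_failure(bid=50.0, ask=60.0)  # gap of 10
state.record_failure(bid=50.0, ask=70.0)  # gap of 20, average becomes 15
accepted = settle(bid=80.0, clearing_price=60.0, penalty=state)
refused = settle(bid=70.0, clearing_price=60.0, penalty=state)
```

In this sketch the first settlement is accepted because the clearing price plus the penalty stays below the bid, while the second is refused because the total exceeds the bid.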

Definition 4 (Matching condition with buyer's penalty). Seller i's ask $\hat{\theta}^t_i$ and buyer j's bid $\hat{\theta}^t_j = (\hat{v}^t_j, q^t_j, a^p_j, d^p_j)$ are matchable when the matching condition (4) is satisfied.
The last term in the matching condition (4), under which the clearing price plus the penalty may not exceed the buyer's bid, guarantees the buyer's individual rationality. When the penalized pricing policy is adopted, the buyer's utility is reduced by the penalty for trade failures as follows.
Definition 5 (Buyer j's penalized utility in time round t).
Hence, with the penalized pricing policy, buyers have an incentive to report a high valuation for avoiding matching failures.
We evaluate a penalized DA mechanism that replaces the simple k-DA pricing policy in the naive DA mechanism with the penalized pricing policy. We execute the same simulations described in Section 4.2.2 to investigate the effects of imposing a penalty on buyers when their bids fail to match sellers' asks.
Figure 4a,b,d show buyers' surplus, resale profit, and total utility, respectively, in the same way as in the previous experiments. Figure 4c shows the penalty imposed on each buyer by the mechanism.
Comparing Figures 3b and 4b, we find that, once penalties are imposed on buyers, misreporting decreases buyers' resale profits drastically, because the matching condition (4) increases matching failures and reduces resale opportunities for penalized buyers. In addition, Figure 4a,c show that the increase in buyers' surplus from misreporting is canceled out by the penalty imposed on the buyers. Therefore, a comparison of Figures 3c and 4d shows that the proposed penalized DA mechanism largely succeeds in eliminating the utility gain that buyers obtain by misreporting their values in the markets.

Analyzing Agents' Equilibrium Behavior
The experimental results of the penalized DA mechanism demonstrate that truth-telling buyers are better off than untruthful buyers in a heterogeneous population of buyers. As the next step, we need to understand the interactions between buyers and sellers with various bidding strategies to ensure quasi incentive compatibility for buyers, along with other favorable properties such as efficiency and stability, in our online markets of perishable goods.
The Nash equilibrium is an appropriate solution concept for understanding and characterizing the strategic behavior of self-interested agents. However, computing exact Nash equilibria is intractable for a dynamic market with non-deterministic aspects [24]. We therefore analyze the market through simulations across a restricted strategy space to evaluate its quasi incentive compatibility for buyers.
In the experiments, buyers adopt CONR β strategies, in which buyer j bids $\hat{v}^t_j = \beta v^p_j$, with five different values of β (i.e., 0.2, 0.4, 0.6, 0.8, and 1.0), and sellers use four types of bidding strategies (i.e., MODE, TRUE, MONO, and STEP) explained in Section 3.2. In the n-th market of the experiments, seller i's valuation is $v_i = 60 + 2(i - 2n)$ and buyer j's valuation is $v_j = 60 - 2(j - 2n)$, as explained in Section 3.1. Because agents on each side face the same utility maximization problem, they follow the same strategy in the equilibrium analysis. Therefore, we consider symmetric strategy profiles for each side of agents.
Table 1 shows the payoff matrix between sellers and buyers in the online market with a low risk of trade failures shown in Figure 1c. Buyers have more chances of increasing their surpluses by underbidding in the lower-risk markets.
Each cell in the table represents the result of interactions between the sellers and buyers with the corresponding strategies. The cell is separated into the following two parts:
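The valuation formulas and the CONR β bidding rule above can be written out directly. The following is a small sketch; the function names are hypothetical, and the ranges of the indices i, j, and n are not fixed here because they are defined in Section 3.1.

```python
# The five beta values used for the CONR strategies in the experiments.
BETAS = (0.2, 0.4, 0.6, 0.8, 1.0)


def seller_valuation(i: int, n: int) -> int:
    """Valuation of seller i in the n-th market: v_i = 60 + 2(i - 2n)."""
    return 60 + 2 * (i - 2 * n)


def buyer_valuation(j: int, n: int) -> int:
    """Valuation of buyer j in the n-th market: v_j = 60 - 2(j - 2n)."""
    return 60 - 2 * (j - 2 * n)


def conr_bid(true_value: float, beta: float) -> float:
    """CONR beta strategy: the buyer reports a fixed fraction of the true value."""
    return beta * true_value


# Example: the bids of one buyer under each CONR strategy.
value = buyer_valuation(j=4, n=1)
bids = [conr_bid(value, beta) for beta in BETAS]
```

With β = 1.0 the bid equals the true valuation, which is the truthful report that the mechanism aims to make a best response.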

• The upper part of the cell shows the following three types of information: (1) the average clearing price; (2) the average matching rate; and (3) the total utility, which is the sum of traders' utility and penalties paid by buyers to the auctioneer.

• The lower part of the cell represents the utility of trader agents. The bottom left corner shows the average utility of seller agents, and the top right corner shows that of buyer agents. The standard deviation of the average utility is shown in parentheses.
In the table, numbers in boldface represent utilities of the agent's best response to the other agent's bidding strategy, and cells in gray show Nash equilibria.
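The way best responses and Nash equilibria are read off such a payoff matrix can be sketched as below. This is an illustrative procedure only: the function name is hypothetical, and the payoff numbers are invented placeholders for a 2-by-2 game, not values taken from Table 1.

```python
from typing import List, Sequence, Tuple


def pure_nash_equilibria(
    seller_payoff: Sequence[Sequence[float]],
    buyer_payoff: Sequence[Sequence[float]],
) -> List[Tuple[int, int]]:
    """Return all (seller strategy, buyer strategy) index pairs in which each
    side plays a best response to the other side's strategy.

    Both matrices are indexed as payoff[seller_strategy][buyer_strategy].
    """
    n_seller = len(seller_payoff)
    n_buyer = len(seller_payoff[0])
    equilibria = []
    for s in range(n_seller):
        for b in range(n_buyer):
            # Seller's best response: no other row does better in column b.
            seller_best = all(
                seller_payoff[s][b] >= seller_payoff[other][b]
                for other in range(n_seller)
            )
            # Buyer's best response: no other column does better in row s.
            buyer_best = all(
                buyer_payoff[s][b] >= buyer_payoff[s][other]
                for other in range(n_buyer)
            )
            if seller_best and buyer_best:
                equilibria.append((s, b))
    return equilibria


# Invented placeholder payoffs (NOT from the paper's tables).
example_seller = [[1, 2], [3, 4]]
example_buyer = [[5, 1], [2, 6]]
example_equilibria = pure_nash_equilibria(example_seller, example_buyer)
```

In the placeholder game, only the profile with index (1, 1) survives both best-response checks, which corresponds to a gray cell in the paper's tables.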
It is noteworthy that the results in the first row of the table, where sellers adopt the MODE strategy, are the same as those of the naive DA mechanism, because there are no matching failures and thus no penalty for buyers when sellers bid zero valuation. Table 1 shows that there are two Nash equilibrium strategy profiles: (MODE, CONR 0.2) and (MONO, CONR 0.8). Of these, adopting the CONR 0.2 strategy is risky for buyers because it produces zero utility for them if sellers avoid the MODE strategy, which produces a large loss for sellers even when buyers bid high values. Therefore, (MONO, CONR 0.8) is the more feasible strategy profile in the low-risk market with the penalized DA mechanism. From this equilibrium analysis, we find that buyers obtain more profit by strategically reporting a value 20% below their true valuation, so the penalized DA mechanism fails to achieve quasi incentive compatibility for buyers.

Adjusting Buyers' Penalty Based on Market Condition
Since buyers are more likely to underbid to increase their surpluses in the lower-risk markets, as shown in Table 1, larger penalties must be imposed on buyers' trade failures in the lower-risk markets in order to prevent buyers' strategic bidding. For this purpose, we modify the buyer's penalty on matching failures as follows:
In Equation (6), MatchingRate() is a function that calculates the current successful matching rate of bids and asks in the market. If the matching rate is above 50%, the market is considered to have a low risk of trade failures and the adjusted buyer's penalty $\hat{p}^t_j$ has a larger value than the original penalty $p^t_j$. We call the DA mechanism with the modified penalty an adaptively penalized DA mechanism.
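One possible shape of such an adjustment is sketched below. The multiplier used here is a hypothetical choice and is not Equation (6): it is picked only because it satisfies the stated property that the adjusted penalty exceeds the original one when the matching rate is above 50%. The function names and the `threshold` parameter are likewise assumptions.

```python
def matching_rate(matched: int, submitted: int) -> float:
    """Fraction of submitted bids and asks that were successfully matched."""
    return matched / submitted if submitted else 0.0


def adjusted_penalty(penalty: float, rate: float, threshold: float = 0.5) -> float:
    """Scale the original penalty by the current matching rate.

    Assumed form (not the paper's equation): the multiplier equals 1 at the
    threshold and grows linearly with the matching rate, so a low-risk market
    (rate above the threshold) yields a larger penalty.
    """
    return penalty * (1.0 + rate - threshold)


rate = matching_rate(matched=3, submitted=4)   # 75% of orders matched
raised = adjusted_penalty(10.0, rate)          # low-risk market: penalty grows
unchanged = adjusted_penalty(10.0, 0.5)        # exactly at the threshold
```

Any monotonically increasing multiplier that crosses 1 at the 50% matching rate would satisfy the same property; the linear form is used only to keep the sketch short.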
Table 2 shows the payoff matrix between sellers and buyers in the low-risk market shown in Figure 1c with the adaptively penalized DA mechanism. The table shows that there are three Nash equilibrium strategy profiles: (MODE, CONR 0.2), (STEP, CONR 1.0), and (MONO, CONR 1.0). Note that we treat (MONO, CONR 1.0) as a Nash equilibrium because its seller's utility (i.e., 1051) is not significantly smaller than the seller's utility in (STEP, CONR 1.0) (i.e., 1058), given its standard deviation (i.e., 147). Of these, adopting the CONR 0.2 strategy is risky for buyers for the same reason explained in Section 5.2. Therefore, (STEP, CONR 1.0) and (MONO, CONR 1.0) are the more feasible strategy profiles in the low-risk market with the adaptively penalized DA mechanism. From this equilibrium analysis, we find that buyers maximize their profits by truthfully reporting their valuation, so the adaptively penalized DA mechanism achieves quasi incentive compatibility for buyers.
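The tolerance applied to (MONO, CONR 1.0) can be made explicit with a short check. The sketch below assumes that "not significantly smaller" means the shortfall is within one standard deviation; that threshold and the function name are assumptions, while the three numbers are those quoted above.

```python
def within_noise(candidate: float, best: float, std: float) -> bool:
    """True if `candidate` falls short of `best` by no more than `std`.

    Assumption: one standard deviation is used as the significance band.
    """
    return best - candidate <= std


# Seller's utility in (MONO, CONR 1.0) vs. (STEP, CONR 1.0), with the
# standard deviation reported for the MONO cell.
mono_is_best_response = within_noise(candidate=1051, best=1058, std=147)
```

The gap of 7 is far smaller than the standard deviation of 147, which is why both profiles are counted as equilibria.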
The adaptively penalized DA mechanism also succeeds in achieving quasi incentive compatibility for buyers in riskier market conditions, such as those shown in Figure 1a,b, as shown in Tables 3 and 4.
As explained, the results in the first row of the tables are the same as those of the naive DA mechanism. (MODE, CONR 0.2) is a dominant strategy equilibrium when sellers are not allowed to bid in the naive DA market, which is the case in conventional perishable goods markets. Therefore, by comparing the results of the (MONO, CONR 1.0) strategy profile with those of the (MODE, CONR 0.2) strategy profile, we can clarify the effects of applying the adaptively penalized DA mechanism to perishable goods markets. From these results, we find that the adaptively penalized DA mechanism increases sellers' utility while it depresses buyers' utility and decreases total utility, especially in higher-risk markets. This means that the adaptively penalized DA mechanism makes markets fairer, but it also makes markets less efficient in high-risk conditions because of low matching rates. However, since the inefficiency of markets largely comes from sellers' loss of perished goods (i.e., 7984 in the high-risk market and 3797 in the medium-risk market), sellers can improve their profits and the market's efficiency by controlling their supplies based on the estimated market condition. In the adaptively penalized DA market, sellers can make a reasonable estimate of the market condition from their matching rates, which is not possible in conventional markets, in which almost all of the supplies are sold at equally low prices regardless of the market condition (i.e., 4 in the high-risk market, 5 in the medium-risk market, and 7 in the low-risk market).
In perishable goods markets, it is intuitively reasonable for sellers to offer discounted asks when there is little time left for matching with buyers' bids. The equilibrium analysis results described above show that only the MONO strategy is a best response for sellers to buyers' truthful bidding in all three types of markets. We therefore investigate the buyers' best response to sellers' MONO strategy in seven types of markets.
The y-axes in Figure 5a-c show buyers' utility, sellers' utility, and total utility, respectively. The x-axis represents seven market conditions, with a larger number corresponding to a smaller risk of trade failures, as explained in Section 3.1. The graphs in each figure show the results obtained in trades with buyers adopting the CONR strategy with five different β values (i.e., 0.2, 0.4, 0.6, 0.8, and 1.0). Figure 5a shows that buyers obtain the largest utility by bidding truthfully (i.e., β = 1.0) in all market conditions. Figure 5b reveals that sellers also achieve the largest utility when buyers bid truthfully in every market condition. As a result, as Figure 5c shows, trades produce the largest social profit when buyers report their true valuation in all conditions of the market. Therefore, the experiments demonstrate that the adaptively penalized DA mechanism achieves quasi incentive compatibility for buyers and succeeds in improving the profits of sellers in perishable goods markets.

Conclusions
We incrementally developed an online DA mechanism for the market of perishable goods to realize fair resource allocation among traders by considering the losses due to trade failures. We explained that sellers have a high risk of losing money through trade failures because their goods are perishable. To reduce trade failures in the perishable goods market, our DA mechanism encourages buyers' truthful bidding by penalizing buyers' failed bids without violating their individual rationality. The experimental results using multi-agent simulation showed that our DA mechanism was effective in promoting truthful behavior of traders and in realizing a fair distribution of large utilities between sellers and buyers.
The experimental results in this study are too limited to support any comprehensive conclusion on the design of online DA mechanisms for perishable goods. For example, we assumed that buyers accept the penalty imposed by the mechanism as long as their payment does not exceed their bid (thus not violating their individual rationality). However, their tendency to accept the penalty might depend on the likelihood of their successful trades and the potential profits from their resales. The influence of traders' sophisticated behavior in the market needs to be investigated further in future research. We need to investigate other types of bidding practices and test them under various experimental settings. In addition, human behavior in the market must be examined carefully to evaluate the designed mechanism [25].
In developing countries, there is strong demand for improving efficiency in agricultural markets [26] and this is also true for rural societies in developed countries, including Japan.
As a real-life application of our online DA mechanism, we have been operating a substantiative e-marketplace for trading oysters cultivated in the north-eastern coastal area of Japan (Miyagi prefecture) for five years. Only local traders used to participate in the markets run by the fisheries cooperatives because the highly perishable nature of fishery products prevents their trade from being open to wider participants. This has led to low incomes and the collapse of local fishery industries, which also suffered severe damage from a massive earthquake and tsunami in 2011. Although we do not yet have sufficient data to show the effectiveness of our online DA mechanism, we hope to contribute to the successful deployment of electronic markets for fisheries and to improve the welfare of people in the area by attracting more traders online.

Table 1. Payoff matrix in the low-risk market. MODE: modest strategy; TRUE: truth-telling strategy; STEP: step discount strategy; MONO: monotonous discount strategy.

Table 2. Payoff matrix in the low-risk market.

Table 3. Payoff matrix in the high-risk market.

Table 4. Payoff matrix in the mid-risk market.