Multiobjective Games for Detecting Abnormally Expressed Genes †

Abstract: A class of multiobjective games with applications to a medical setting is studied. We consider the vector Shapley value and the vector Banzhaf value for a multicriteria game and apply them to a microarray game. We also give an axiomatic characterization.


Introduction
In this paper, a class of multiobjective games with applications to a medical setting is studied. Using mathematical game theory, we consider gene expression in order to investigate serious diseases such as cancer. Our goal is to propose a method for evaluating the relevance of genes as disease markers. The common application in medicine is to "teach" a classifier to distinguish between healthy and sick subjects on the basis of samples provided by doctors. For technical details we refer to Moretti et al., 2007. One method of feature selection uses cooperative game theory with transferable utility (TU games in the literature; see Peters [1], Gonzales Diaz et al. [2]).
Intuitively, each gene is considered within a coalition of genes, and each coalition is assigned a value which shows how strongly the expressions of these genes help to distinguish between healthy and sick subjects. Some research [3,4] applied mathematical game theory to analyze the results obtained with microarray techniques, which make it possible to take a snapshot of 1000 gene expressions in a single experiment. The starting point is to study the gene expression in a cell sample satisfying particular biological conditions (for example, the cells of a subject affected by a tumoral disease).
Mathematical game theory plays a fundamental role in defining the "microarray games" and in evaluating the relevance of genes in regulating or provoking the onset of a pathology, taking into account the interactions with other genes. It is well known that many diseases have a genetic origin. In the mathematical literature, some power indices, such as the Shapley value [5,6] and the Banzhaf value [7], have been studied to evaluate the relevance of genes. In this context the Shapley and Banzhaf values are used as a measure of the "importance" of a gene ("relevance index") in the diagnosis. We study the vector Shapley value for microarray multiobjective games, basing our study on the idea of a "partnership of genes" [8]. Intuitively, this is a group of genes with correlated characteristics that are very important for studying whether the disease is developing.
The experimental results have shown that the Shapley value is a valid tool to evaluate the expressions of genes and to predict a tumor disease.
The advantage of considering a coalitional game is the possibility of computing a numerical index, the so-called relevance index, which intuitively represents the relevance of each gene, taking into account the relevance of the others, when, for example, a tumor disease is developing. We study the microarray games and we generalize this problem to a vector one. We consider multicriteria (or multiobjective) games because we think that, by taking more objectives into consideration, the solution is more precise and allows a better understanding of the presence of a disease.
The importance of microarray games for medical problems is emphasized in the papers [3,4,9]. For more information about multiobjective games and their solutions, see: [10] for Vector Optimization; [11] for many interesting results on multicriteria games; [12] for the first step about multicriteria exact potential games and approximate solutions; [13] for the study of multicriteria fuzzy games; [14] for multicriteria partial cooperative games and applications to environmental models; [15] for a new concept of approximate solutions with improvement sets; [16] for multicriteria ordinal potential games and applications to peering games and telecommunication models; and [17] for some multicriteria games with a potential function.
In this paper, we consider the vector Shapley value and we extend the Banzhaf value as indices of the relevance of genes as disease markers; research is in progress on other solutions and on a comparison among the results. We follow an axiomatic approach to the vector solutions and we prove that both the vector Shapley value and the vector Banzhaf value are characterized by a suitable set of axioms.
These microarray games can also be applied to neurological diseases and allergies. As suggested by an anonymous referee, a similar application of game theory appears in reliability theory. We refer to the measures introduced by Barlow and Proschan [18] and by Birnbaum [19]. The first is related to the Shapley value and the second to the Banzhaf value (see [20]). A similar situation also appears in Szajowski [21], referring to voting systems.
The paper is organized as follows: Section 2 contains some background results, and Section 3 considers the multicriteria microarray games. In Section 4 we study an axiomatic approach to the Shapley value and in Section 5 an axiomatic approach to the Banzhaf value. Finally, Section 6 is devoted to conclusions and open problems.

Background
Given $x, y \in \mathbb{R}^n$ we consider the following inequalities on $\mathbb{R}^n$: $x \geqq y \Leftrightarrow x_i \geq y_i \ \forall i = 1, \dots, n$; $x \geq y \Leftrightarrow x \geqq y$ and $x \neq y$; $x > y \Leftrightarrow x_i > y_i \ \forall i = 1, \dots, n$. Analogously we define $\leqq$, $\leq$, $<$. We write $\mathbb{R}^n_{++} = \{x \in \mathbb{R}^n : x_i > 0 \ \forall i = 1, \dots, n\}$ and $\mathbb{R}^n_{+} = \{x \in \mathbb{R}^n : x_i \geq 0 \ \forall i = 1, \dots, n\}$.
Let us consider an m-multiobjective (or m-multicriteria) TU-game $\langle N, v \rangle$ (see [22]), where $N = \{1, 2, \dots, n\}$ is the set of players and $v : 2^N \to \mathbb{R}^m$ is the characteristic function of the game, with $v(\emptyset) = (0, \dots, 0)$. It assigns to each coalition $S \in 2^N$ an m-vector, m being the number of objectives, equal for all players. If all players cooperate, the grand coalition forms. Let us write

Definition 1. A multicriteria game $\langle N, v \rangle$ is convex if
for each S ⊂ T and for each i ∈ N.
We say that a cooperative game has the property of weak-superadditivity if there is no Let us define the imputation set for a multicriteria game.

Definition 4. An imputation of the game is a matrix
The set of all imputations is denoted by I(N, v). We write X S = ∑ i∈S X i , X N = v(N).

Definition 5.
From the scalar inequality "≥", two vectorial inequalities follow: "≧" and "≥". Consequently, it is possible to define two cores for the game $\langle N, v \rangle$. In this paper we use only the first.
Theorem 1. If a multicriteria cooperative game is convex then the core $C(N, v, \geqq) \neq \emptyset$.
Proof. See [22].
Let $G^m_n$ be the space of the multicriteria games with m objectives and n players; it is a vector space of dimension $(2^n - 1) \times m$, which can be defined by the unanimity games $u_S$ and by the identity games $i_S$. Each game can be written as a linear combination of these games, which form a basis.
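The three componentwise orderings on R^n defined at the start of this section can be sketched in Python as follows. The function names are hypothetical and chosen only for illustration; the sketch also shows that two vectors may be incomparable, which is why more than one notion of core arises.

```python
from typing import Sequence


def weakly_geq(x: Sequence[float], y: Sequence[float]) -> bool:
    """x ≧ y: every component of x is at least the matching component of y."""
    return len(x) == len(y) and all(a >= b for a, b in zip(x, y))


def geq(x: Sequence[float], y: Sequence[float]) -> bool:
    """x ≥ y: x ≧ y and x differs from y in at least one component."""
    return weakly_geq(x, y) and tuple(x) != tuple(y)


def strictly_greater(x: Sequence[float], y: Sequence[float]) -> bool:
    """x > y: every component of x is strictly larger."""
    return len(x) == len(y) and all(a > b for a, b in zip(x, y))


# (1, 0) and (0, 1) are incomparable: neither dominates the other.
incomparable = not weakly_geq((1, 0), (0, 1)) and not weakly_geq((0, 1), (1, 0))
```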
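As an illustration of the basis property, the following Python sketch decomposes a scalar game into unanimity games. It assumes the standard definition of the unanimity game (u_S(T) = 1 if S ⊆ T and 0 otherwise) and recovers the coefficients by Möbius inversion; the function names are hypothetical. For a game with m objectives the same computation would be applied to each component separately.

```python
from itertools import combinations


def coalitions(players):
    """All nonempty coalitions of `players`, as frozensets."""
    players = sorted(players)
    return [frozenset(c)
            for r in range(1, len(players) + 1)
            for c in combinations(players, r)]


def unanimity(S):
    """Unanimity game u_S (assumed standard form): 1 if S is contained in T, else 0."""
    return lambda T: 1.0 if S <= T else 0.0


def dividends(players, v):
    """Coefficients c_S such that v = sum over S of c_S * u_S.

    Coalitions are processed by increasing size, so the coefficients of
    all proper subsets of S are known when c_S is computed.
    """
    c = {}
    for S in sorted(coalitions(players), key=len):
        c[S] = v(S) - sum(c[R] for R in c if R < S)
    return c
```

With n players the dictionary returned by `dividends` has 2^n − 1 entries, matching the dimension of the space for a single objective.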

Multicriteria Microarray Games
We give some definitions about the microarray games with m objectives and N the player-genes set.
Let us consider n genes and k samples; starting from these we build an $n \times k$ matrix $A = (a^j_{ih})$, where

$$a^j_{ih} = \begin{cases} 1 & \text{if the gene } i \text{ is over- or under-expressed in the sample } h \text{ according to the criterion } j = 1, \dots, m, \\ 0 & \text{otherwise.} \end{cases}$$

For a fixed sample h, let us define the support of h w.r.t. the objective j, $\mathrm{spt}\, a^j_{\cdot h}$, as the set of players i such that $a^j_{ih} = 1$.
Intuitively, it identifies the set of abnormally expressed genes. So we can define the unanimity game following Definition 6, where $S = \mathrm{spt}\, a^j_{\cdot h}$.
Then the microarray game associated to $A = ((a^\ell_{ih})_\ell)_{i,h}$ is defined as $v_\ell = \frac{1}{|S_D|} \sum_{S \subset S_D} u_S$, where $\ell = 1, \dots, m$ and $|S_D|$ is the cardinality of the set $S_D$.
$S_D$ and $S_R$ are two sets of samples: the first contains the samples from individuals whom we consider to be without the disease, and the second contains samples from individuals whom we want to investigate.
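A minimal Python sketch of this construction is given below. It assumes the usual reading of a microarray game: for each criterion, the worth of a coalition T is the fraction of samples whose support is contained in T, that is, the average of the unanimity games on the supports. The function names and the convention of ignoring samples with an empty support are assumptions made for illustration, and the two 0/1 matrices are hypothetical toy data.

```python
def supports(B):
    """Support of each sample (column) of a 0/1 matrix B whose rows are genes."""
    n, k = len(B), len(B[0])
    return [frozenset(i for i in range(n) if B[i][h] == 1) for h in range(k)]


def microarray_game(matrices):
    """Vector game with one component per criterion.

    Component l of v(T) is the share of samples of matrices[l] whose
    (nonempty) support is contained in the coalition T.
    """
    spts = [supports(B) for B in matrices]

    def v(T):
        T = frozenset(T)
        # Assumption: a sample with empty support contributes nothing.
        return tuple(sum(1 for S in sp if S and S <= T) / len(sp) for sp in spts)

    return v


# Hypothetical toy data: 3 genes (rows 0..2), 2 samples, 2 criteria.
B1 = [[1, 0], [1, 1], [0, 1]]
B2 = [[0, 0], [1, 1], [0, 1]]
v = microarray_game([B1, B2])
```

Here `v({0, 1})` evaluates to `(0.5, 0.5)` and the grand coalition `{0, 1, 2}` is worth `(1.0, 1.0)`.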
Let us denote by M m n the set of microarray games. In the following example we consider a Microarray Experimental Situation (MES) with the tuple The discriminant method is the same for the two objectives.
We introduce the matrix M whose entries are pairs (a, b), where a, b ∈ {0, 1}: the value 0 means that the gene is normally expressed (intuitively, the disease is not present), and the value 1 means that the gene is abnormally expressed (intuitively, the disease is present). The pair (1, 1) indicates a high degree of danger and (0, 0) indicates no danger, whereas (0, 1) or (1, 0) calls for attention, because this is a warning situation that can become dangerous.
, otherwise it is 0. Referring to Example 1, the matrix M will be: So the microarray game will be: Intuitively in a partnership the proper subsets of genes of S are not important (for the disease).
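The notion of a partnership can be checked computationally. The sketch below assumes the standard definition from the literature on partnerships: S is a partnership in v if v(R ∪ T) = v(R) for every proper subset T of S and every R ⊆ N \ S. This formal condition is an assumption here, as are the function names; for a vector game the equality is required in every component.

```python
from itertools import combinations


def subsets(items):
    """All subsets of `items` (including the empty set), as frozensets."""
    items = sorted(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_partnership(S, players, v):
    """True if no proper subset of S adds anything to coalitions outside S."""
    S = frozenset(S)
    outside = frozenset(players) - S
    return all(v(R | T) == v(R)
               for T in subsets(S) if T != S
               for R in subsets(outside))


# Toy game (hypothetical): unanimity game on genes {0, 1} among genes {0, 1, 2}.
def u01(T):
    return 1.0 if frozenset({0, 1}) <= frozenset(T) else 0.0
```

In this toy game `{0, 1}` is a partnership, while `{0, 2}` is not, because gene 0 alone already changes the worth of the outside coalition `{1}`.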

Some Considerations about Multicriteria Microarray Games
When microarray games are used to evaluate which genes are more relevant as markers of a disease, an important role is played by the binarization of the data on the expression level of the genes in the RiboNucleic Acid (RNA) of the samples under investigation. Moretti et al. [9] defined an upper and a lower threshold of normal expression for each gene as the maximum and the minimum of the values observed in a significant sample of persons whose condition may be considered normal. Of course, these thresholds strongly depend on the sample used. For instance, a person could suffer from the disease without this being known, so the expression of some genes could be altered, which could significantly increase the upper threshold or decrease the lower threshold; or some genes may simply be overexpressed or underexpressed without the presence of the disease. This remark suggests defining different pairs of upper and lower thresholds, each with its own binarization of the gene expressions; different support matrices may then be built, leading to different games, each of which may be viewed as a different criterion. At first glance, it may seem that the different support matrices are strongly related to the thresholds, so that if a gene is abnormally expressed w.r.t. a given pair of thresholds, then it is also abnormally expressed w.r.t. a tighter pair of thresholds. This is not completely true, for two reasons, which we present below using suitable toy examples.

Example 2. Consider two samples and three genes with the following expressions:
          Sample 1   Sample 2
Gene 1      3.2        6.5
Gene 2      9.8        4.9
Gene 3      6.4        5.3

Using the upper thresholds 7, 8, 9 and the lower thresholds 4, 5, 6 for the three genes, respectively, we obtain the following support matrix:

The previous example shows that by decreasing the upper and lower thresholds for all the genes, the support matrices are completely uncorrelated; it is the same for the Shapley

Using the upper thresholds 6, 5, 7, 8, 7 and the lower thresholds 4, 3, 5, 6, 5 for the five genes, respectively, we obtain the following support matrix:

In view of the previous examples, we can consider different thresholds for binarizing the matrix of the expressions of the genes in the various samples as different criteria for building the multicriteria microarray game. Another possibility is to consider the data related to the DeoxyriboNucleic Acid (DNA) of the samples, instead of the RNA; finally, the different criteria may be obtained from other information.
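The binarization step can be sketched in Python on the data of Example 2. The sketch assumes that a gene is marked as abnormally expressed (value 1) when its expression is strictly below the lower threshold or strictly above the upper threshold; the function name is hypothetical. The second call uses thresholds shifted down by one unit, which are purely illustrative and are not the values used in the paper.

```python
def binarize(expr, lower, upper):
    """Return a 0/1 matrix: 1 where the expression falls outside [lower, upper]."""
    return [[1 if (x < lo or x > hi) else 0 for x in row]
            for row, lo, hi in zip(expr, lower, upper)]


# Expression data of Example 2: rows are genes, columns are samples.
expr = [[3.2, 6.5],
        [9.8, 4.9],
        [6.4, 5.3]]

# Thresholds stated in Example 2.
support = binarize(expr, lower=[4, 5, 6], upper=[7, 8, 9])
# support == [[1, 0], [1, 1], [0, 1]]

# Hypothetical thresholds, all lowered by one unit (illustration only).
shifted = binarize(expr, lower=[3, 4, 5], upper=[6, 7, 8])
# shifted == [[0, 1], [1, 0], [0, 0]]
```

Under these assumptions, lowering both thresholds changes the support matrix in a way that is not a simple refinement of the first one, which is consistent with the remark that each pair of thresholds can be treated as a separate criterion.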

Axiomatization of the Vector Values
In this section we recall axioms from the literature, adapting them to the setting of genes. The columns $Sh_i(v)$, for $i = 1, \dots, n$, are the vectors where s and n denote the cardinality of the coalitions S and N, respectively.
We recall that an allocation rule (or solution) ψ is a map which assigns to each $\langle N, v \rangle$ an element of $\mathbb{R}^{n \times m}$.
The Shapley value in its classical definition [5] is characterized by the following three axioms: Intuitively, if the dummies of a game leave it, the other players are not affected and the allocation does not change.
Let us introduce the vector Banzhaf value which is another point solution for cooperative games; it was introduced to measure the power of the members in a voting situation.

Definition 9.
Given $v \in G^m_n$, the Banzhaf value is the function $\beta : G^m_n \to \mathbb{R}^{m \times n}$ which associates to the vector game v an $m \times n$ matrix. The columns $\beta_i(v)$, for $i = 1, \dots, n$, are the vectors

For the game in Example 1 the Shapley and the Banzhaf values are:

In a real application we have to consider a large number of genes, so software such as MATLAB or R can help us.
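For small examples, both values can be computed by direct enumeration. The Python sketch below assumes the classical scalar formulas applied to each objective separately: the Shapley weight s!(n − s − 1)!/n! and the Banzhaf weight 1/2^(n−1) on the marginal contributions v(S ∪ {i}) − v(S), with S ranging over the coalitions not containing i. The function name and the toy game are hypothetical. The enumeration is exponential in n, so it is only meant for toy instances.

```python
from itertools import combinations
from math import factorial


def shapley_banzhaf(n, v, m):
    """Return (sh, bz): for each gene 0..n-1, a list of m values (one per objective).

    `v` maps a frozenset of genes to a tuple of m worths.
    """
    sh = [[0.0] * m for _ in range(n)]
    bz = [[0.0] * m for _ in range(n)]
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for r in range(n):  # r = size of the coalition S not containing i
            w = factorial(r) * factorial(n - r - 1) / factorial(n)
            for c in combinations(others, r):
                S = frozenset(c)
                with_i, without_i = v(S | {i}), v(S)
                for l in range(m):
                    marginal = with_i[l] - without_i[l]
                    sh[i][l] += w * marginal
                    bz[i][l] += marginal / 2 ** (n - 1)
    return sh, bz


def v_example(T):
    """Hypothetical 3-gene, 2-objective game built from unanimity games."""
    T = frozenset(T)
    c0 = (({0, 1} <= T) + ({1, 2} <= T)) / 2  # supports {0,1} and {1,2}
    c1 = (({1} <= T) + ({1, 2} <= T)) / 2     # supports {1} and {1,2}
    return (c0, c1)


sh, bz = shapley_banzhaf(3, v_example, 2)
```

In this toy game gene 1 receives the largest value under both objectives, since it belongs to every support.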

An Axiomatic Approach for the Shapley Value
Let $v \in G^m_n$. Let F be a generic solution with m components. Let us consider some axioms that are desirable for a good solution.
Intuitively, for each criterion, the solution F gives to the elements in the partnership S not less than the value of the partnership S.
Axiom 5. F has the Partnership Feasibility (PF) property if there is no partnership of genes $S \in 2^N \setminus \{\emptyset\}$ for the game v such that $F_S > v_\ell(S)$, for each $\ell \in \{1, \dots, m\}$.
Intuitively, for each criterion, the solution F gives to the elements in the partnership S no more than the grand coalition value.

Axiom 6. F has the Partnership Monotonicity (PM) if
Intuitively, if two different and disjoint partnerships of genes generate the same number of tumors in a sample, and if the set of genes outside the union of those partnerships is irrelevant for the illness, then the player-genes in the smaller partnership are more relevant than those in the bigger one.
Intuitively, each sample must have the same degree of reliability; for example, the power of a gene on p samples must be equal to the sum of its powers on each sample.

Axiom 8.
The solution F has the Null Gene Property (NG) if, for every null gene $i \in N$, $F_i(v) = 0$.
Intuitively, if a player-gene contributes nothing to each coalition then the solution gives to it a null relevance.
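The intuition behind the Null Gene Property can be made concrete with a small check. The sketch below assumes the usual definition of a null gene, namely that v(S ∪ {i}) = v(S) for every coalition S not containing i, in every objective; the function name and the toy game are hypothetical.

```python
from itertools import combinations


def is_null_gene(i, players, v):
    """True if adding gene i never changes the worth of any coalition."""
    others = sorted(p for p in players if p != i)
    return all(v(frozenset(c) | {i}) == v(frozenset(c))
               for r in range(len(others) + 1)
               for c in combinations(others, r))


def v_toy(T):
    """Hypothetical 2-objective game: unanimity on {0, 1} and on {1}."""
    T = frozenset(T)
    return (1.0 if {0, 1} <= T else 0.0, 1.0 if {1} <= T else 0.0)
```

In this toy game gene 2 is null, while gene 0 is not, because it contributes to the first objective together with gene 1.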
Axiom 9. F has the Equal Treatment Property (ETP) if, for each game $v \in M^m_n$, for every partnership of genes S and for each i, j ∈ S, $F_i(v) = F_j(v)$.
Intuitively, the allocation rule gives the same relevance to each element in a partnership.

Axiom 10. F has the Anonymity property
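Here $v^\sigma$ is the game with $v^\sigma(\sigma(U)) = v(U)$ for all $U \in 2^N$, or equivalently $v^\sigma(S) = v(\sigma^{-1}(S))$ for all $S \in 2^N$, and $\sigma^* : \mathbb{R}^n \to \mathbb{R}^n$ is defined by $(\sigma^*(x))_{\sigma(k)} = x_k$ for all $x \in \mathbb{R}^n$ and $k \in N$. All must be read componentwise.
(c) Since $Sh(v) \in C(v)$, we have $\sum_{i \in N \setminus S} Sh_i(v_\ell) \geq v_\ell(N \setminus S)$ and $\sum_{i \in N \setminus S} Sh_i(v_\ell) = v_\ell(N \setminus S)$, $\sum_{i \in S} Sh_i(v_\ell) \leq v_\ell(S)$, $\ell = 1, \dots, m$, and from this the (PF) property follows.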

Theorem 2.
There is one and only one solution for the microarray multicriteria game verifying the properties EFF, NG, AN, and ADD for the partnership. It is the Shapley value.
Proof. The proof is similar to the scalar case (see [5]).  Proof. The proof is similar to the scalar case, see [4].

Conclusions and Open Problems
In the present paper we have considered an approach to multiobjective microarray games. The idea of many objectives comes from considering that if there are many parameters to study then the expression analysis is more precise.
We have investigated the results via two solutions of the cooperative games: the Shapley value and the Banzhaf one.
Moretti et al. [9] considered them as relevance indices for genes, and many experiments in the mathematical literature show that they are a good choice. There are many problems still to investigate, among them: (1) consider other solutions, such as the nucleolus [23], the tau-value [24], the Alexia value [25] and the E-equilibrium [15], and compare the obtained results; (2) consider the problem via network games; (3) study the problem via machine learning; (4) another interesting application could be to use this strategic method of multicriteria microarray games to evaluate dangerous behaviors in a town or in a military zone, thereby giving unusual support to strategic engineering.
Some of these issues are work in progress.
Author Contributions: Conceptualization, V.F. and L.P.; methodology, V.F. and L.P.; formal analysis, V.F. and L.P.; investigation, L.P.; data curation, V.F.; writing-original draft preparation, V.F. and L.P.; writing-review and editing, L.P. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.