Quantum computing approach to realistic ESG-friendly stock portfolios

Finding an optimal balance between risk and returns in investment portfolios is a central challenge in quantitative finance, often addressed through Markowitz portfolio theory (MPT). While traditional portfolio optimization is carried out in a continuous fashion, as if stocks could be bought in fractional increments, practical implementations often resort to approximations, as fractional stocks are typically not tradeable. While these approximations are effective for large investment budgets, they deteriorate as budgets decrease. To alleviate this issue, a discrete Markowitz portfolio theory (DMPT) with finite budgets and integer stock weights can be formulated, but this results in an NP-hard (nondeterministic polynomial-time hard) problem. Recent progress in quantum processing units (QPUs), including quantum annealers, makes solving DMPT problems feasible. Our study explores portfolio optimization on quantum annealers, establishing a mapping between continuous and discrete Markowitz portfolio theories. We find that correctly normalized discrete portfolios converge to continuous solutions as budgets increase. Our DMPT implementation provides efficient frontier solutions, outperforming traditional rounding methods, even for moderate budgets. Responding to the demand for environmentally and socially responsible investments, we enhance our discrete portfolio optimization with ESG (environmental, social, governance) ratings for EURO STOXX 50 index stocks. We introduce a utility function incorporating ESG ratings to balance risk, return, and ESG-friendliness, and discuss implications for ESG-aware investors.


I. INTRODUCTION
Finding an optimal balance between risk and return of an investment is the primary goal for every investor. For investments in securities markets, this problem has been formalized by Markowitz [1] in the sense that one needs to find optimal weights for each security, so that the portfolio maximizes the return and minimizes the risk within a given universe of considered securities. Mathematically, this amounts to minimizing the utility function $Q_c: \mathbb{R}^k \to \mathbb{R}$ by finding the appropriate weights $\vec{x} \in \mathbb{R}^k$:

$$Q_c(\vec{x}) = -\vec{r}^T \vec{x} + \phi\, \vec{x}^T \Sigma\, \vec{x} \tag{1}$$

Here $\vec{r}$ denotes the expected returns of each portfolio component, $\phi$ controls the level of risk-aversion and $\Sigma \in \mathbb{R}^{k \times k}$ is the covariance matrix of the assets. The minimization is subject to the following constraints, which ensure that the entries of $\vec{x}$ can be interpreted as non-negative weights in a long-only portfolio:

$$\sum_{i=1}^{k} x_i = 1, \qquad x_i \geq 0 \quad \forall i \tag{2}$$

Since the covariance matrix is positive-definite and symmetric, the utility function $Q_c$ is convex, so that a solution to this optimization problem can be found in polynomial time with linear and quadratic programming algorithms [2]. The vector $\vec{x}$ contains non-negative real numbers, which represent the relative weights of capital allocation to the considered assets. These weights can be multiplied by the amount of available capital to obtain the capital allocation to the respective assets.

* daniel.guterding@th-brandenburg.de

This approach faces problems when implementing such portfolios in a realistic environment, where traded contracts are discrete and securities prices are finite. Hence, the theoretical capital allocation is in general not commensurate with the discrete securities prices. In practice, this challenge is easily overcome by rounding to the nearest multiple of the securities price. The consequences for large portfolios are mild, since the relative weights of asset allocation are hardly changed by the rounding. For small and intermediate portfolios, the rounding may affect the relative weighting significantly and create sub-optimal implementations of originally optimal portfolios.

Discrete extensions of the Markowitz portfolio optimization, where the discreteness of securities contracts is considered from the start, have been studied for a long time, because such discrete portfolios also facilitate the inclusion of further realistic features such as transaction costs or Boolean constraints on stock selection [3][4][5]. Intensive studies have been conducted on the problem of transaction costs and the optimal investment trajectory in a multi-period setting. These studies revealed that the discrete Markowitz portfolio theory (DMPT) is an NP-hard problem [6][7][8][9][10], even if the trajectory problem is only formulated for a single period [11].
The main problem is that the number of possible portfolio compositions grows factorially with the number of assets in the investment universe and the allowed number of assets in the portfolio. If our portfolio may contain $n$ not necessarily different lots out of an investment universe of $k$ different assets, where each asset can be bought multiple times, the total number of possible portfolio combinations $M$ is given by a binomial coefficient (eq. 3). For a moderate portfolio size of $n = 1000$ and a small investment universe of $k = 4$ stocks, the number of possible portfolios is already $M > 10^{10}$. If we extend the number of considered stocks to $k = 50$ and keep $n = 1000$, the number of possible portfolios grows to $M > 10^{86}$, which is larger than current estimates for the number of atoms in the entire universe. At the same time, there is no efficient algorithm for finding the optimal combination out of these $M$ combinations on a classical computer, so that only a brute force approach guarantees success. However, most realistic problems are too large to be solved by testing each of the $M$ combinations, so that finding the exact solution of large problems is not feasible on classical machines. Therefore, these problems have been approached using heuristic and approximate methods on classical computers, which do not guarantee an optimal solution [6, 12-15].

In recent years, the rapid progress in manufacturing of quantum processing units (QPUs) and the development of hybrid quantum-classical workflows, not only for universal quantum computers [16][17][18][19][20], but also for quantum annealers [21][22][23][24][25][26][27][28][29], has re-ignited interest in this type of problem. Meanwhile, quantum annealers have been shown to provide quantum advantage for certain classically intractable problems [30] and seem to provide a promising platform for solving quadratic binary optimization and integer quadratic optimization, even in the presence of hard and weak constraints. Based on these prospects, portfolio optimization is a natural application for quantum computing in finance, and in particular for quantum annealers. For a broader review of quantum computing applications in finance, see refs. 22, 29, and 31.

Recently, awareness of environmental, social and governance (ESG) aspects of investing has grown among private and institutional investors alike. A growing number of financial products caters to this demand and incorporates ESG aspects into the product design. The trend toward more ESG awareness is likely to be further amplified by regulatory updates on the international and national level. See for example ref. 32 for an overview of ESG regulation in the banking sector across Europe. In January 2023, European authorities agreed on a European implementation of the internationally developed Basel III update that will result in an updated capital requirements regulation (CRR) and capital requirements directive (CRD), including requirements on ESG awareness and inclusion into risk management. Whereas integrating ESG constraints in investment decisions has so far been a matter of individual preference, it can be expected to become a required standard in the EU in the near future. The inclusion of ESG risk as an additional risk factor besides historical covariance into the Markowitz framework (see eq. 1) is actively being investigated [33][34][35][36].
Meanwhile, ESG data in different formats are available from a number of data providers such as MSCI, ISS ESG, Refinitiv, Sustainalytics, and others. The approach for establishing ESG ratings varies. Some providers offer ESG ratings or scores that aim to capture investment risks by assessing how effectively a company manages ESG risks to its business ("financial"). Other providers aim to characterize the impact of a corporation on the environmental, social and governance dimensions, with the goal of facilitating informed decisions for investors ("impact"). The ISS ESG data used in this analysis captures both the financial and impact aspects of ESG ratings. The scores can be given on the level of environmental, social and governance dimensions, with focus on smaller sub-areas, or as a single aggregate score on company level, which seeks to represent the average of all relevant aspects. For a critical review of available data sets and methodologies see refs. 37 and 38.

How ESG ratings should best be included into the Markowitz framework is an open question. Both inclusion of the expected ESG score into the vector of expected returns [39][40][41][42] and optimizing the ESG score in the form of a multi-objective optimization [34, 43-45] have been investigated in the literature. Including the ESG score into the returns vector is intrinsically ambiguous, since it compares ESG score and monetary returns as if these quantities had the same units. This introduces a conversion law between returns and ESG scores, which depends on the exact form of the ESG score data, which may differ between various providers. However, it would be preferable to have a unique framework for incorporating ESG scores into the Markowitz utility function (see eq. 1). The multi-objective optimization approach, on the other hand, may be unable to control the interplay between returns, variance and ESG performance, depending on the exact implementation.
In this work, we extend the Markowitz portfolio theory to include the ESG scores directly in the utility function in a way that avoids the ambiguity in relation to the returns. Furthermore, we can investigate and control the interplay between returns, variance and ESG performance. Our formulation is applicable to standard (continuous) and also discrete mean-variance (Markowitz) portfolio optimization, to allow for application to realistic scenarios. We demonstrate the feasibility of our method on classical computers for the continuous case and quantum annealers for the discrete portfolio optimization case. Results are based on real market data of selected stocks from the EURO STOXX 50 index as well as the respective actual ESG scores from ISS ESG.
The paper is organized as follows: Section II contains the main results of our study. In subsection II A we establish the correct normalization approach for the discrete Markowitz problem, so that solutions for the continuous and the discrete formulation may be compared. We provide a relationship between the total number of stocks in the portfolio and the risk-aversion parameter, which needs to be considered. In subsection II B we introduce a budget constraint into the discrete portfolio problem, so that realistic scenarios with limited budget may be investigated. We compare the usual rounding approach to direct search of discrete optimal portfolios and find that rounding produces sub-optimal portfolios for small to medium investment budgets. In subsection II C we introduce a novel framework for including ESG scores into both continuous and discrete Markowitz portfolio optimization, which is applicable even to ESG data with heterogeneous scales. In section III we discuss our results and potential implications for ESG-aware investors. Finally, in section IV we summarize our results and provide an outlook on future research topics.

A. Discrete Markowitz portfolio theory and role of the risk-aversion parameter
Here we investigate the connection between the continuous and the discrete Markowitz portfolio theory. Naively, one could expect that the discrete approach should yield the same results as the continuous version for large portfolios, where the discreteness becomes less relevant. As mentioned in the introduction, already the single-period DMPT is an NP-hard problem [11, 15].
To explore this connection in detail, we formalize the DMPT problem with a fixed number of stocks in the portfolio in the following way:

$$Q_d(\vec{X}) = -\vec{r}^T \vec{X} + \phi\, \vec{X}^T \Sigma\, \vec{X} \tag{4}$$

The crucial difference to the continuous case (see eq. 1) lies in the discrete nature of the constraints:

$$\sum_{i=1}^{k} X_i = N_\mathrm{tot}, \qquad X_i \in \mathbb{N}_0 \quad \forall i \tag{5}$$

Note that the return vector $\vec{r}$ and the covariance matrix $\Sigma$ do not change their meaning, since these quantities are dimensionless. Therefore, no special care has to be taken when interpreting the return vector $\vec{r}$ or the covariance matrix $\Sigma$ in the continuous vs. the discrete case.
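For very small instances, the fixed-size discrete problem can be checked by exhaustive search. The following Python sketch assumes a utility of the form $Q(\vec{X}) = -\vec{r}^T \vec{X} + \phi\, \vec{X}^T \Sigma\, \vec{X}$ together with the constraint that the integer entries of $\vec{X}$ sum to $N_\mathrm{tot}$. The return vector, the covariance matrix and all function names are hypothetical and only serve as illustration; they are not the data or the solver used in this study.

```python
from itertools import combinations


def compositions(n_tot, k):
    """Yield all k-tuples of non-negative integers that sum to n_tot."""
    # Place k - 1 separators among n_tot + k - 1 slots; gaps are the parts.
    for bars in combinations(range(n_tot + k - 1), k - 1):
        edges = (-1,) + bars + (n_tot + k - 1,)
        yield tuple(edges[i + 1] - edges[i] - 1 for i in range(k))


def utility(x, r, sigma, phi):
    """Assumed utility: -r^T x + phi * x^T Sigma x."""
    k = len(x)
    ret = sum(r[i] * x[i] for i in range(k))
    var = sum(x[i] * sigma[i][j] * x[j] for i in range(k) for j in range(k))
    return -ret + phi * var


def brute_force(n_tot, r, sigma, phi):
    """Return the integer allocation with the lowest utility value."""
    return min(compositions(n_tot, len(r)),
               key=lambda x: utility(x, r, sigma, phi))


# Hypothetical three-asset universe (illustrative numbers only).
r = [0.10, 0.06, 0.04]
sigma = [[0.09, 0.01, 0.00],
         [0.01, 0.04, 0.01],
         [0.00, 0.01, 0.02]]

best = brute_force(10, r, sigma, phi=0.1)
print(best, sum(best))
```

Such an enumeration is only practical for a handful of assets and small $N_\mathrm{tot}$, which is the reason for turning to other solvers in realistic settings.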
If the raw solution of this naive approach (eqs. 4 and 5) is denoted as $\vec{X}_\mathrm{naive}$, we can calculate the relative portfolio weights $\vec{x}_{d,\mathrm{naive}}$, which may be compared to the solution $\vec{x}_c$ from the continuous case, by dividing by the portfolio size $N_\mathrm{tot}$:

$$\vec{x}_{d,\mathrm{naive}} = \frac{\vec{X}_\mathrm{naive}}{N_\mathrm{tot}} \tag{6}$$

If we calculate the Euclidean distance between the continuous solution $\vec{x}_c$ and these naive weights from eq. 6, we would expect the difference from $\vec{x}_c$ to vanish with increasing $N_\mathrm{tot}$:

$$\lim_{N_\mathrm{tot} \to \infty} \left\lVert \vec{x}_c - \vec{x}_{d,\mathrm{naive}} \right\rVert_2 = 0 \tag{7}$$

We calculated this difference with the formalism described so far. The continuous solution was extracted using the CVXPY [46, 47] software package for the Python programming language. For solving the discrete portfolio optimization problem, one could use a heuristic classical algorithm [48], an algorithm for gate-based quantum computers [5, 49], a quantum-inspired approximate algorithm for classical computers [5] or a quantum annealer [29, 50]. For an overview of the use of various computing approaches in portfolio optimization, see ref. 51.

D-Wave quantum annealers implement the Ising model, known from theoretical physics, in a specialized quantum processing unit. These annealers are not universal quantum computers. Therefore, their applicability is limited to problems that can be represented in terms of the Ising model [52]. The solution of the optimization problem is extracted via a physical annealing process, which gradually cools the quantum processing unit down to temperatures close to absolute zero. Subsequently, the quantum state of the system is measured and translated back to the original problem space.
We have decided to use a quantum annealer, because the discrete portfolio optimization problem can be rewritten as an Ising model [29, 52]. Therefore, the quantum annealer is a natural choice when solving discrete portfolio problems. Other previously mentioned approaches are also viable [51], but either cannot reach the problem sizes considered here or have no guarantee of providing an optimal solution. Nevertheless, heuristic approaches may yield very good results as demonstrated in ref. 48.
The Ising model is formulated in terms of discrete variables, which represent magnetic moments. These can be in one of two quantum states $s_i \in \{+1, -1\}$. A simple transformation (eq. 8) allows us to convert these magnetic moments into zeros and ones $a_i \in \{0, 1\}$, which can be used to represent integer numbers in binary encoding. This transformation between integer optimization problems and the Ising model is well-known [52] and automatically carried out in various software packages like D-Wave Ocean [53].

The optimization problem can be entered into Ocean in a declarative way using a domain-specific language. In particular, this means that no imperatively formulated solution algorithm is required. This software package also handles the transformation of constraints into penalty terms in a proprietary way. For details on the technical implementation of D-Wave solvers, see ref. 54. Note, however, that going beyond long-only portfolios would require a type of optimization constraint that is currently not supported by D-Wave software packages.

Now we estimate a theoretical upper bound to the number of qubits required by our approach. Since D-Wave Ocean uses binary encoding for integer variables, the upper bound for the required number of qubits $N_\mathrm{qubit}$ scales with the logarithm of the portfolio size $N_\mathrm{tot}$ and linearly in the number of assets $k$ within our investment universe (eq. 9). Of course, the proprietary algorithm of D-Wave may require an overhead of additional qubits to encode the problem on real-world hardware. Unfortunately, these details are not public and cannot be investigated here further.
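The encoding idea can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the spin-to-bit convention $a = (s + 1)/2$ is one common choice (the opposite sign convention is equally valid), the little-endian bit order is arbitrary, and the qubit estimate simply counts the bits needed to represent every integer from 0 to $N_\mathrm{tot}$ for each of the $k$ assets. The function names are hypothetical and do not correspond to any D-Wave Ocean API, whose internal encoding may differ.

```python
def spin_to_bit(s):
    """Map an Ising spin s in {+1, -1} to a bit in {1, 0} via (s + 1) / 2."""
    if s not in (1, -1):
        raise ValueError("spin must be +1 or -1")
    return (s + 1) // 2


def bits_to_int(bits):
    """Interpret a little-endian list of bits as a non-negative integer."""
    return sum(a << j for j, a in enumerate(bits))


def qubit_estimate(n_tot, k):
    """Bits needed to represent 0..n_tot, once per asset (assumed bound)."""
    return k * n_tot.bit_length()


spins = [+1, -1, +1, +1]
bits = [spin_to_bit(s) for s in spins]
value = bits_to_int(bits)  # 1*1 + 0*2 + 1*4 + 1*8

print(bits, value)
print(qubit_estimate(1000, 4))
```

Under these assumptions, a portfolio of up to 1000 stocks per asset needs 10 bits per asset, so the estimate grows only logarithmically with the portfolio size.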
The results of our calculations using CVXPY for the continuous problem and D-Wave Ocean for the discrete problem are shown in Fig. 1a). We found that the difference in eq. 7 does not converge to zero with growing portfolio size. This is because the risk-aversion parameters for the continuous and discrete portfolio cases are not directly comparable. This phenomenon does not depend on the exact value of $\phi > 0$.
If we view the continuous problem of eq. 1 as a particular discrete problem, in which the solution vector $\vec{x}$ is rescaled by $1/N_\mathrm{tot}$ and the limit $N_\mathrm{tot} \to \infty$ is applied, we can write it in the following way:

$$Q_c(\vec{x}) = \lim_{N_\mathrm{tot} \to \infty} \frac{1}{N_\mathrm{tot}} \left( -\vec{r}^T \vec{X} + \frac{\phi}{N_\mathrm{tot}}\, \vec{X}^T \Sigma\, \vec{X} \right), \qquad \vec{x} = \frac{\vec{X}}{N_\mathrm{tot}} \tag{10}$$

The constraints are the same as in eq. 5. It is clear that the discrete problem of eq. 10 can only converge to the continuous problem of eq. 1 if the additional factor of $1/N_\mathrm{tot}$ in front of the covariance term is absorbed into the risk-aversion parameter. Hence, the risk-aversion parameter of the discrete case $\phi_d$ is connected to the risk-aversion parameter of the continuous case $\phi_c$ in the following way:

$$\phi_d = \frac{\phi_c}{N_\mathrm{tot}} \tag{11}$$

Therefore, we need to respect the mapping for the normalized risk-aversion parameter in eq. 11 if we want to compare portfolios from continuous and discrete optimization. Doing this correctly and re-calculating the difference in eq. 7 with the normalized risk-aversion parameter, one obtains the second curve in Fig. 1a), which clearly converges to zero for a large number of stocks $N_\mathrm{tot}$. It is also instructive to examine the solutions of the naive approach for different portfolio sizes (without renormalizing $\phi$) and their position in volatility-return space. This is shown in Fig. 2.
FIG. 1. [...] DE0006231004). Data is taken from the period between 1 January 2010 and 1 January 2021. Lines are guides to the eye. a) Difference between continuous solution and naive discrete approach (circles) as well as the difference between continuous and normalized discrete solution (squares). The naive approach does not converge to the continuous solution, even for very large portfolios. The normalized discrete approach converges to the well-known continuous solution for large portfolios. The remaining differences in portfolio composition are purely due to the discreteness. b) Difference between continuous and naive discrete solutions for the modified utility function $Q_\mathrm{mod} = \vec{x}^T \Sigma\, \vec{x}$, which only includes the covariance term. It is clearly visible that both the continuous and discrete approaches converge to the same minimum variance portfolio for this modified utility function $Q_\mathrm{mod}$. Here too, the remaining differences in portfolio composition are purely due to the discreteness.

All solutions lie on the 'efficient frontier', as the surface of maximum return as a function of volatility is commonly called. This efficient frontier in the background was generated by sampling random portfolio compositions.
As we can see, the solutions from the naive discrete approach trend towards the minimum-variance solution if we naively fix $\phi = 1$. The reason is clear from eq. 11: with $N_\mathrm{tot} \gg 1$, we should have adapted the risk-aversion parameter to the portfolio size. For example, we should have used $\phi = 1/1000$ for $N_\mathrm{tot} = 1000$ in order to obtain comparable solutions. Fixing $\phi = 1$ irrespective of the portfolio size leads to portfolios for which risk-aversion becomes increasingly important as the size of the portfolio grows. Therefore, the naive approach always converges to the minimum-variance portfolio for $N_\mathrm{tot} \gg 1$. Also note how the scale of variations in volatility in Fig. 2 is small, already for $N_\mathrm{tot} = 10$, due to the over-emphasis on risk-aversion. The convergence to the minimum variance portfolio happens rapidly as a function of the number of stocks $N_\mathrm{tot}$. Already at $N_\mathrm{tot} = 1000$ the composition is practically indistinguishable from the minimum variance portfolio.
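The role of the rescaling can be verified numerically. The sketch below assumes the utility $Q(\vec{x}) = -\vec{r}^T \vec{x} + \phi\, \vec{x}^T \Sigma\, \vec{x}$ and checks that evaluating it on integer stock counts with $\phi_c / N_\mathrm{tot}$ reproduces $N_\mathrm{tot}$ times the continuous utility of the corresponding relative weights, whereas keeping $\phi_c$ unchanged does not. All numbers and names are hypothetical.

```python
def utility(x, r, sigma, phi):
    """Assumed utility: -r^T x + phi * x^T Sigma x."""
    k = len(x)
    ret = sum(r[i] * x[i] for i in range(k))
    var = sum(x[i] * sigma[i][j] * x[j] for i in range(k) for j in range(k))
    return -ret + phi * var


# Hypothetical three-asset universe (illustrative numbers only).
r = [0.10, 0.06, 0.04]
sigma = [[0.09, 0.01, 0.00],
         [0.01, 0.04, 0.01],
         [0.00, 0.01, 0.02]]
phi_c = 1.0

counts = (5, 3, 2)                       # integer stock counts
n_tot = sum(counts)
weights = [c / n_tot for c in counts]    # relative weights

q_cont = utility(weights, r, sigma, phi_c)            # continuous utility
q_scaled = utility(counts, r, sigma, phi_c / n_tot)   # rescaled phi
q_naive = utility(counts, r, sigma, phi_c)            # phi left unchanged

# With the rescaled phi, both utilities differ only by the factor n_tot,
# so they rank portfolios of equal size identically.
print(abs(q_scaled - n_tot * q_cont) < 1e-9)
print(abs(q_naive - n_tot * q_cont) < 1e-9)
```

A constant positive prefactor does not change the location of the minimum, which is why the rescaled discrete problem and the continuous problem favor the same relative composition.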
As a final test, we have carried out another calculation, in which we have neglected the term related to maximizing the return in the utility function (see eq. 4). If only the covariance term is considered, the optimization should always yield the minimum-variance portfolio irrespective of the portfolio size. This is clearly the case, as shown in Fig. 1b). The remaining difference in portfolio compositions between continuous and discrete solutions is purely due to the discrete stock allocation in the latter case. Thus, we have shown that renormalizing the risk-aversion parameter $\phi$ according to eq. 11 is crucial.

Now that we have established a discrete portfolio optimization approach that is comparable to the well-known continuous approach, we introduce budget constraints in the following subsection to mimic realistic portfolio selection problems.

B. Discrete Markowitz portfolio theory with limited investment budget
So far, we have only solved the portfolio problem with a limited number of stocks. In practice, the number of stocks that can be purchased is usually not limited directly, but indirectly via the total available investment budget. To make our study more realistic, we now fix the total available investment budget. This means that the algorithm will not optimize different stocks like for like, but rather optimize portfolios with many low-price stocks versus portfolios with few high-price stocks.
As explained in section II A, the total number of stocks in discrete Markowitz portfolio theory plays a crucial role in the risk-aversion parameter, which determines the compromise between risk and return of the portfolio. With the risk-aversion parameter for the continuous portfolio $\phi_c$, we write the utility function for the discrete portfolio theory in the following way:

$$Q_d(\vec{x}) = -\vec{r}^T \vec{x} + \frac{\phi_c}{N_\mathrm{tot}}\, \vec{x}^T \Sigma\, \vec{x} \tag{12}$$

The minimization of this utility function is subject to the following constraints:

$$\vec{p}^T \vec{x} \leq B, \qquad \sum_{i=1}^{k} x_i = N_\mathrm{tot}, \qquad x_i \in \mathbb{N}_0 \quad \forall i \tag{13}$$

Here, $\vec{p}$ is the vector that contains the price per stock for each stock. Therefore, $\vec{p}^T \vec{x}$ is the initial value of the portfolio. $B$ is the initially available investment budget. In this sense, we constrain the optimization to the space of those portfolios that can be purchased with the initially available budget. Since we also maximize return via the utility function (eq. 12), the algorithm will yield portfolios that use the available budget to the maximum extent.
In practice, we will study the problem defined by eqs. 12 and 13 at a fixed number of stocks $N_\mathrm{tot}$. If the number of stocks $N_\mathrm{tot}$ is chosen too small, the initial portfolio value will be far below the initial budget $B$. If we choose too large a number $N_\mathrm{tot}$, the number of possible portfolio combinations will exceed the capabilities of contemporary quantum hardware. Therefore, we start with low $N_\mathrm{tot}$ and gradually increase this number until the difference between portfolio value and available budget $\vec{p}^T \vec{x} - B$ becomes sufficiently small. As we will see, this approach yields good results even in realistic settings.
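The incremental search over $N_\mathrm{tot}$ can be sketched in Python for a toy universe. This is an illustration under stated assumptions rather than the procedure run on the quantum annealer: the inner solver is a brute-force enumeration, the utility is assumed to be $-\vec{r}^T \vec{x} + (\phi_c / N_\mathrm{tot})\, \vec{x}^T \Sigma\, \vec{x}$, the stopping tolerance `tol` is a hypothetical parameter, and prices, returns and covariances are made up.

```python
from itertools import combinations


def compositions(n_tot, k):
    """Yield all k-tuples of non-negative integers that sum to n_tot."""
    for bars in combinations(range(n_tot + k - 1), k - 1):
        edges = (-1,) + bars + (n_tot + k - 1,)
        yield tuple(edges[i + 1] - edges[i] - 1 for i in range(k))


def cost(x, prices):
    """Initial portfolio value p^T x."""
    return sum(p * xi for p, xi in zip(prices, x))


def utility(x, r, sigma, phi):
    """Assumed utility: -r^T x + phi * x^T Sigma x."""
    k = len(x)
    ret = sum(r[i] * x[i] for i in range(k))
    var = sum(x[i] * sigma[i][j] * x[j] for i in range(k) for j in range(k))
    return -ret + phi * var


def solve_fixed_size(n_tot, r, sigma, prices, budget, phi_c):
    """Best affordable portfolio with exactly n_tot stocks, or None."""
    feasible = [x for x in compositions(n_tot, len(r))
                if cost(x, prices) <= budget]
    if not feasible:
        return None
    return min(feasible, key=lambda x: utility(x, r, sigma, phi_c / n_tot))


def grow_portfolio(r, sigma, prices, budget, phi_c, tol):
    """Increase n_tot until the unused budget is at most tol,
    or until no portfolio of the next size is affordable."""
    last = None
    n_tot = 1
    while True:
        x = solve_fixed_size(n_tot, r, sigma, prices, budget, phi_c)
        if x is None:
            return last
        gap = budget - cost(x, prices)
        last = (n_tot, x, gap)
        if gap <= tol:
            return last
        n_tot += 1


# Hypothetical data (illustrative numbers only).
r = [0.10, 0.06, 0.04]
sigma = [[0.09, 0.01, 0.00],
         [0.01, 0.04, 0.01],
         [0.00, 0.01, 0.02]]
prices = [50.0, 20.0, 10.0]

result = grow_portfolio(r, sigma, prices, budget=500.0, phi_c=8.0, tol=5.0)
print(result)
```

The loop always terminates, because a portfolio with more stocks than the budget divided by the cheapest price can never be affordable.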
Of course, we would like to compare these discrete solutions to portfolios that are based on the usual continuous Markowitz theory. In the continuous case, the solution $\vec{x}_c$ provides a relative allocation of the available investment budget to the respective stocks. The actual portfolio is then usually constructed by multiplying the relative weights $\vec{x}_c$ with the available budget $B$. This gives the budget that is allocated to each stock. To obtain the integer number of shares that has to be bought of each stock, one divides by the price of the respective stock and rounds to the nearest integer, which is denoted as $\lfloor \cdot \rceil$. Therefore, we can write the integer portfolio composition based on the rounding approach as:

$$(\vec{x}_{c,r})_i = \left\lfloor \frac{(\vec{x}_c)_i\, B}{p_i} \right\rceil \tag{14}$$

Here, $(\vec{x}_c)_i$ is the $i$-th component of the vector of relative allocation from the continuous Markowitz theory and $p_i$ is the price of the $i$-th stock. Interestingly, the rounding approach according to eq. 14 yields portfolio compositions that are substantially different from the discrete approach using eqs. 12 and 13, even if consistent values for the risk-aversion parameter $\phi_c$ are used. Remember that these two approaches only coincide in the limit of infinite available budget, as explained in subsection II A.
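The rounding step is simple enough to state in a few lines of Python. The sketch below implements rounding to the nearest integer for non-negative values and applies it to hypothetical weights and prices; none of the numbers are taken from the study. Python's built-in `round` is avoided here because it rounds exact halves to the nearest even integer.

```python
def round_nearest(value):
    """Round a non-negative number to the nearest integer (halves go up)."""
    return int(value + 0.5)


def rounded_portfolio(weights, prices, budget):
    """Integer stock counts obtained from relative weights:
    multiply by the budget, divide by the price, round."""
    return [round_nearest(w * budget / p) for w, p in zip(weights, prices)]


# Hypothetical continuous solution and prices (illustrative numbers only).
weights = [0.25, 0.35, 0.30, 0.10]
prices = [90.0, 40.0, 15.0, 30.0]
budget = 1000.0

shares = rounded_portfolio(weights, prices, budget)
spent = sum(n * p for n, p in zip(shares, prices))

print(shares, spent, spent <= budget)
```

With these made-up numbers the rounded portfolio costs 1020, slightly more than the budget of 1000, which illustrates that rounding does not respect the budget constraint by construction.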
We have carried out continuous and discrete portfolio optimization with a risk-aversion parameter of $\phi_c = 8$ and a total investment budget of $B = 100000$ €. For simplicity, the investment universe is again limited to BMW (ISIN DE0005190003), Deutsche Post (ISIN DE0005552004), Deutsche Telekom (ISIN DE0005557508) and Infineon (ISIN DE0006231004). In the discrete case we have used $N_\mathrm{tot} = 3401$, which produces an initial portfolio value of $\vec{p}^T \vec{x}_d = 99999.87$ € for the optimal solution. The rounding approach (see eq. 14) may of course slightly violate the budget constraint. Thus, the rounded solution yields an initial portfolio value of $\vec{p}^T \vec{x}_{c,r} = 100006.32$ € and a total number of stocks $N_\mathrm{tot} = 4026$. The larger number of stocks for the rounded continuous case appears because the standard Markowitz approach puts a large relative weight on Deutsche Telekom, which has the lowest Euro value per stock within the considered investment universe. This means that a larger number of these stocks will be bought with the available budget.
The resulting portfolio compositions for the discrete and rounded continuous cases are shown in Fig. 3, both in terms of the number of stocks bought per ISIN and the invested budget per ISIN. We observe that the respective portfolio compositions are strikingly different. The rounded continuous approach yields a solution that is well diversified in terms of allocated budget. The discrete approach, on the other hand, yields a portfolio that is slightly more concentrated in terms of budget allocation. This effect is likely due to the strict budget constraint in the discrete case, which forces the optimization to pick allocations that fit the specific budget constraints.

FIG. 4. Position of the best portfolio compositions in volatility-return space for a budget of $B = 100000$ € and risk-aversion parameter of $\phi = 8$. The discrete solution is obtained by minimizing the utility function in eq. 12 under the constraints of eq. 13. We use $N_\mathrm{tot} = 3401$. The continuous results were obtained by multiplying the relative allocation by the available budget and rounding to integer stocks via eq. 14, which results in $N_\mathrm{tot} = 4026$. The investment universe comprises BMW (ISIN DE0005190003), Deutsche Post (ISIN DE0005552004), Deutsche Telekom (ISIN DE0005557508) and Infineon (ISIN DE0006231004). Data is taken from the period between 1 January 2010 and 1 January 2021. The discrete solution is clearly at the efficient frontier, while the rounded continuous solution is visibly sub-optimal. Arrows are guides to the eye.
We also investigated the position of the obtained portfolios in volatility-return space (see Fig. 4). The discrete solution is right at the efficient frontier, i.e. it yields an optimal return for the given volatility. The rounded continuous portfolio has a lower volatility, but also yields a significantly lower than optimal return. The deviation of nearly 2 percentage points in return is larger than one may expect from the seemingly harmless rounding approach. The effects of the rounding observed here are likely relevant in practical applications. In fact, one can expect even larger deviations for portfolios with a larger number of components.
Our results clearly show that the continuous and discrete approaches only converge to identical results in the limit of infinite portfolio size. For a limited investment budget, the approach of minimizing the utility function for the discrete case directly on the quantum computer yields results that are far superior to the widely used rounding method based on the standard Markowitz approach, even for moderately sized portfolios and a limited investment universe. We expect that the quantum computing approach will have even stronger appeal for large investment universes, since the discreteness of individual components will play an even more important role there.

Now that we have established the superiority of the quantum computing approach in the case of limited budget, we come to the main idea of our study: the inclusion of ESG data into the discrete portfolio optimization problem.

C. Incorporation of ESG data into Markowitz portfolio theory
We have to address two questions in order to include ESG data into Markowitz portfolio optimization: i) how to classify portfolios in terms of ESG scores and ii) how to incorporate such information into the optimization scheme. The current literature on this topic can be divided into two main approaches. The most commonly found way of including ESG data is to constrain the Markowitz optimization so that it yields a portfolio with a given weighted average of the expected ESG scores [34-36, 39-41, 43-45, 55, 56], which restricts the possible portfolio compositions. The second approach [42] employs an affine transformation between returns and ESG scores, which is controlled by an additional parameter.
Obviously, the composition-weighted average of expected ESG scores is not the only property that can be used to classify portfolios. In this subsection we introduce a novel scheme for classifying portfolios in terms of ESG score, which we incorporate into the discrete portfolio optimization scheme explained in subsection II B.
We assume that the value of the ESG score $S$ in every scoring system is bounded by the best and worst possible scores, $S \in [S_-, S_+]$. Let us consider a relative portfolio composition $\vec{\pi}$ with respect to the ESG scores of a given scoring system. Since the entries of $\vec{\pi}$ are non-negative and their sum is one, the entries of $\vec{\pi}$ can be interpreted as a probability distribution. As a reference point we take a portfolio that only contains stocks with the best possible ESG score $S_+$. Using the Wasserstein metric [57], distances of other relative portfolio compositions with respect to this best possible portfolio can be calculated. Note that there may be multiple portfolio compositions in terms of stock allocation that possess the best possible score $S_+$, e.g. if more than one stock in the investment universe has the best possible score. However, this possible degeneracy is irrelevant in our approach, as we will see.
If we use the Wasserstein metric in the case of two one-dimensional sets of measurements and take the limit of an infinite number of observations [58, 59], we can write the Wasserstein $p$-distance between a given relative portfolio composition $\vec{\pi}$ and the best possible portfolio in the following way:

$$D_\mathrm{ESG}(p, \vec{\pi}) = \left( \sum_i \pi_i \left| S_+ - S_i \right|^p \right)^{1/p} \tag{15}$$

If all constituents of a portfolio $\vec{\pi}$ have the worst possible score $S_-$, the distance measure is

$$D_\mathrm{ESG}(p, \vec{\pi}) = \left| S_+ - S_- \right|$$

For $p = 1$ our result in eq. 15 becomes the weighted average (up to a constant). Therefore, we may view eq. 15 as a generalized framework for classifying portfolios in terms of ESG scores. This framework does not depend on whether the best score $S_+$ has the lowest or the highest value in the respective scoring system. Also note that our distance measure may be generalized to work with heterogeneous data from different ESG data providers by using the relative distance of the single portfolio component within its pertinent ESG score range, i.e. by extending eq. 15 with an additional normalization factor:

$$D_\mathrm{ESG}(p, \vec{\pi}) = \left( \sum_i \pi_i \left| \frac{S_+(\pi_i) - S_i(\pi_i)}{S_+(\pi_i) - S_-(\pi_i)} \right|^p \right)^{1/p} \tag{16}$$

Here $S_+(\pi_i)$, $S_-(\pi_i)$ and $S_i(\pi_i)$ indicate respectively the best, the worst and the spot score in the ESG system pertinent to the component $\pi_i$. Although our methodology would enable us to mix multiple ESG scoring systems, we do not expand upon this topic in the present manuscript and leave it for future research instead. In the present manuscript we only use ESG data provided by ISS ESG.
So far we have written the distance measure in terms of the relative composition with respect to the ESG score. In order to include the ESG data into the discrete optimization framework, we need to establish the ESG distance measure in terms of the composition with respect to the allocation of individual stocks. It is easy to show that eq. 15 can equivalently be written as:

$$D_\mathrm{ESG}(p, \vec{x}) = \left( \frac{1}{N_\mathrm{tot}} \sum_{i=1}^{k} x_i \left| S_+ - S_i \right|^p \right)^{1/p} \tag{17}$$

Here, $x_i$ is the $i$-th component of the discrete portfolio allocation vector $\vec{x}$ and $S_i$ is the ESG score of the $i$-th component stock in the portfolio. All other quantities are defined as before.
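The two ways of writing the distance can be compared numerically. The following Python sketch assumes that the distance to the best possible portfolio takes the form $\left( \sum_i w_i |S_+ - S_i|^p \right)^{1/p}$, where the weights $w_i$ are either the probabilities of the score distribution or the relative stock counts $x_i / N_\mathrm{tot}$. Weighting by stock count rather than by invested budget is an assumption of this sketch, and the scores and function names are hypothetical.

```python
def esg_distance(x, scores, s_best, p=1):
    """Distance from the best-score portfolio, from stock counts x."""
    n_tot = sum(x)
    if n_tot == 0:
        raise ValueError("portfolio must contain at least one stock")
    total = sum(xi * abs(s_best - si) ** p for xi, si in zip(x, scores))
    return (total / n_tot) ** (1.0 / p)


def score_distribution(x, scores):
    """Relative composition with respect to the ESG score values."""
    n_tot = sum(x)
    pi = {}
    for xi, si in zip(x, scores):
        pi[si] = pi.get(si, 0.0) + xi / n_tot
    return pi


def esg_distance_from_distribution(pi, s_best, p=1):
    """Same distance, computed from the distribution over scores."""
    return sum(w * abs(s_best - s) ** p for s, w in pi.items()) ** (1.0 / p)


# Hypothetical scores on a scale from 1 (worst) to 4 (best).
scores = [4, 3, 3, 1]
x = [4, 3, 2, 1]

d_counts = esg_distance(x, scores, s_best=4, p=1)
d_dist = esg_distance_from_distribution(score_distribution(x, scores), 4, p=1)
d_worst = esg_distance([0, 0, 0, 5], scores, s_best=4, p=2)

print(d_counts, d_dist, d_worst)
```

Two stocks with the same score contribute to the same entry of the score distribution, which is why the degeneracy mentioned in the text does not affect the distance.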
The Wasserstein p-distance is defined for p ∈ [1, +∞). Since the function f(x) = x^p is strictly increasing for p ≥ 1 and x > 0, we know that D_ESG(p, ⃗ x) from eq. 17 has a global maximum equal to D_max = |S+ − S−|. Therefore, we can include a linear constraint on D_ESG(p, ⃗ x) into the discrete optimization problem from subsection II B for every p ≥ 1. The respective problem is given by the utility function in eq. 18, whose minimization is subject to the constraints in eq. 19. Note how the utility function in eq. 18 is unchanged compared to eq. 4 and eq. 12. The difference lies only in the additional constraint in eq. 19, which requires D_ESG(p, ⃗ x) ≤ D. Here, D is a non-negative constant. For D ≥ D_max this constraint has no effect on the optimal portfolio composition ⃗ x. For D = 0 only stocks with the best possible score are allowed. In between these two extremes, the constraint restricts possible solutions to the given maximum distance in ESG rating space. In practice, we use p = 1 and the latest ESG data in the period under investigation to calculate D_ESG(p = 1, ⃗ x). Exploring the effect of other choices for p is left for future studies.
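The effect of the ESG constraint can be sketched with a small classical brute-force search in Python. This is a stand-in for illustration only and not the quantum-annealer workflow used in the study. The sketch assumes a mean-variance utility on relative weights, Q = ϕ wᵀΣw − rᵀw with w = x / N_tot, which may differ in normalization from eq. 18. It also uses the observation that raising both sides of D_ESG ≤ D to the power p and multiplying by Σ_i x_i gives a condition that is linear in x. All function names and the toy data are hypothetical.

```python
from itertools import product


def esg_feasible(x, scores, s_best, d_max, p=1.0, tol=1e-12):
    """Check D_ESG(p, x) <= d_max in its linear form sum_i x_i (|S_i - S+|^p - D^p) <= 0."""
    lhs = sum(x_i * (abs(s_i - s_best) ** p - d_max ** p) for x_i, s_i in zip(x, scores))
    return lhs <= tol


def utility(x, returns, cov, phi):
    """Assumed mean-variance utility on relative weights (to be minimized)."""
    n_tot = sum(x)
    w = [x_i / n_tot for x_i in x]
    k = len(w)
    risk = sum(w[i] * cov[i][j] * w[j] for i in range(k) for j in range(k))
    gain = sum(w_i * r_i for w_i, r_i in zip(w, returns))
    return phi * risk - gain


def best_portfolio(prices, returns, cov, scores, s_best, budget, n_tot, phi, d_max):
    """Enumerate all integer allocations of n_tot stocks and return the feasible minimizer."""
    best_x, best_q = None, float("inf")
    for x in product(range(n_tot + 1), repeat=len(prices)):
        if sum(x) != n_tot:
            continue  # fixed total number of stocks
        if sum(p_i * x_i for p_i, x_i in zip(prices, x)) > budget:
            continue  # budget constraint
        if not esg_feasible(x, scores, s_best, d_max):
            continue  # ESG distance constraint
        q = utility(x, returns, cov, phi)
        if q < best_q:
            best_x, best_q = x, q
    return best_x


# Toy data (hypothetical, not the stocks analysed in the text).
common = dict(
    prices=[10.0, 20.0, 40.0],
    returns=[0.04, 0.08, 0.12],
    cov=[[0.01, 0.0, 0.0], [0.0, 0.02, 0.0], [0.0, 0.0, 0.05]],
    scores=[4.0, 3.0, 1.0],  # best possible score is 4, worst is 1
    s_best=4.0, budget=300.0, n_tot=10, phi=1.0,
)

for d_max in (3.0, 1.0, 0.0):
    print(d_max, best_portfolio(d_max=d_max, **common))
```

In this toy setting, D = 3 equals D_max and leaves the search unconstrained, while D = 0 admits only the stock with the best score. The enumeration scales exponentially with the number of assets, which is why it is only practical for very small universes.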
We now perform calculations with the following stock universe, reported in order from highest to lowest ESG score: Deutsche Telekom (ISIN DE0005557508), SAP (ISIN DE0007164600), Intesa Sanpaolo (ISIN IT0000072618) as well as EssilorLuxottica (ISIN FR0000121667). The portfolio optimization problem from eqs. 18 and 19 was again solved on a D-Wave quantum annealer for different values of the ESG constraint D. The ESG data were provided by ISS ESG. The grading system is on a scale from 4 to 1, where a higher number indicates better ESG performance. We use a budget of B = 100000 € and a risk-aversion parameter of ϕ = 8. The result in volatility-return space is shown in Fig. 5. We first set D = 5 and obtained a solution with D_ESG = 1.6. Hence, we find that the actual maximum reachable ESG distance within the given investment universe is D_ESG = 1.6. We gradually decreased D from there until the solution visibly departed from the efficient frontier. The latter was again calculated by sampling random portfolio compositions. At a certain point, stronger constraints on D_ESG(p, ⃗ x) produce portfolios that move farther away from the efficient frontier. This finding is consistent with the study by Cesarone et al. [44].

We also analyzed the portfolio composition for different values of D. The results are shown in Fig. 6. We found that decreasing the distance D from the best possible portfolio in ESG terms gradually increases the weight of stocks with better ESG scores compared to stocks with worse ESG scores, both in terms of relative composition and budget allocation. Stocks of Intesa Sanpaolo are not part of the optimal portfolios due to their relatively unfavorable returns (not driven by ESG scores). We had to vary N_tot somewhat as a function of D, so that the full budget can be allocated. As can be seen from Fig. 6, allocation to Deutsche Telekom increases with decreasing D. Since stocks of Deutsche Telekom have a much lower price per stock than SAP and EssilorLuxottica, a higher number of stocks has to be allocated, which requires a larger N_tot. The resulting budget allocations are summarized in Table I.
In this subsection, we have introduced a novel distance measure for portfolios within the space of ESG scores based on the Wasserstein metric. We use this distance measure to constrain the search for optimal portfolios in volatility-return space to a certain vicinity of the best possible portfolio in ESG space. We have demonstrated that our approach yields sensible and interesting results in combination with discrete portfolio optimization on a quantum annealer.

III. DISCUSSION
The approach we have presented here is based on historical data for covariance and returns. A further constraint such as the one on the ESG classification may not improve the performance of any portfolio within this framework. However, there is an ongoing discussion in the literature on whether ESG-aware investors generate higher returns than comparable non-ESG benchmarks in the long term and can realize a better performance during a global crisis. Results of investigations into the historically measured performance of stocks with strong and weak ESG ratings vary depending on markets, ESG data and time periods considered for analysis [44, 48, 60–67].
Cesarone et al. [44] investigate mean-variance-ESG optimal portfolios and show how portfolio mean returns systematically move away from the efficient frontier the more weight is put on optimizing the ESG scores of the respective portfolio (compare Fig. 5). These authors use a continuous Markowitz framework and obtain results consistent with ours. In addition, we show that optimal discrete portfolios can be obtained from modern quantum annealers under realistic circumstances.
Auer and Schuhmacher [60] study the impact of socially responsible investing on the performance of investment funds. They compare the returns of investment funds with different ESG ratings to the return of their respective benchmark index. They find that portfolios of European stocks with high ESG ratings often underperform with respect to their benchmark, while no consistent over- or under-performance was observed in the Asia-Pacific region and the United States. This approach differs from ours in that Auer and Schuhmacher use benchmark indices as their reference point, while we compare to portfolios on the mean-variance efficient frontier. In this sense, an over-performance of ESG-aware portfolios is possible in Auer and Schuhmacher's approach, since they compare to benchmark indices, which may have sub-optimal returns in the first place. Due to the different methodology, these authors' results are not directly comparable to ours. Nevertheless, our results and those of ref. 44 can help to rationalize these findings. Both studies find that the deviation of ESG-aware from mean-variance optimal portfolios depends on the emphasis that is put on the ESG optimization goal. In particular, the novel ESG distance measure we introduced could help to clarify the results of Auer and Schuhmacher in future studies.
Amon et al. [64] find that portfolios with good ESG ratings can be constructed at a small cost in terms of returns. This is consistent with our findings and those of ref. 44, which both show that many portfolios with close to optimal returns can at the same time have good ESG ratings. In future studies, our ESG distance measure could be used to quantify the deviation of these ESG-aware portfolios from the best possible portfolio in the respective rating system.
García et al. [48] investigate ESG ratings within a multi-objective optimization framework, focusing on portfolios comprised of component stocks from the Dow Jones Industrial Average (DJIA) index. These authors solve an NP-hard realistic portfolio problem similar to ours, but use a heuristic evolutionary algorithm where we employ a quantum annealer. They find that better ESG ratings generally imply lower returns. Nevertheless, many portfolios with good ESG ratings possess favorable risk-return profiles and may even outperform benchmark indices like the DJIA. These results are consistent with our present study.

Breedt et al. [61] perform a factor analysis based on a proprietary mean-variance optimization method. They find that ESG is not an independent factor, i.e. ESG information is already captured by other investment factors. They conclude that including ESG information into the investment process neither lowers nor improves the investment returns. We found that it is possible to construct ESG-aware portfolios that are very close to the efficient frontier. Hence, our results can be considered consistent with those of Breedt and coauthors.
Nofsinger and Varma [62] find that socially responsible investment portfolios over-perform in times of market crisis and under-perform in other periods. They performed regression using several factor models. This methodology is very different from ours and other mean-variance approaches. Furthermore, the ESG selection is based on a screening approach, not on optimization. Again, over- and under-performance was measured with respect to regional benchmarks. Therefore, these results are not directly comparable to ours.
Demers et al. [66] performed a similar study and concluded that ESG-aware investment does not protect against market crises. Their argument is similar to that of ref. 61, since they also conclude that ESG is not an independent investment factor.
Bae et al. [65] perform a regression analysis and conclude that corporate social responsibility does not affect the returns of US stocks during times of the COVID-19 market crisis. These authors also point out the possibility of firms having positive ESG values in certain rating systems, while actually acting against these goals in practice. Like other factor regression studies, these results are not directly comparable to ours.
La Torre et al. [63] find that ESG ratings weakly affect the returns of EURO STOXX 50 component stocks. Their analysis is based on regression of a factor model, which is only loosely related to our study. Again, our quantitative distance measure in ESG space could help to clarify these results in future studies.
Atz et al. [67] perform a meta-study on the impact of sustainability on investment returns. They argue that most studies find no discernible impact, while about one-third of all investigated studies find a positive impact. The positive impact is attributed to the possibility of capturing a climate risk premium and to higher robustness during times of crisis.
As explained before, the question of whether ESG-aware investing produces measurable effects on investment performance is beyond the scope of this work. Such effects would result from investment decisions guided by beliefs and values, which are not captured by the Markowitz framework used in our study.
In our opinion, the question is ultimately to which degree investor expectations about future developments are reflected in historical prices. As we explained in the introduction, the importance of informed investment decisions based on ESG data can be expected to grow. The degree to which non-ESG-aware investors are following these developments is, however, unclear. Therefore, stocks with good ESG scores may outperform stocks with worse ESG scores, as the public increasingly demands the publication of ESG data and enforces the adoption of ESG-aware investment strategies. This effect is not captured by Markowitz portfolio theory and would require a radically different approach.
We expect that interest in ESG topics will grow rapidly among investors, particularly regarding portfolio classification in terms of ESG scores. Our method of using ESG data enables ESG-aware investors to construct ESG-friendly portfolios without the need for further assumptions or additional parameters. In particular, we avoid assuming additivity of ESG data with other terms in the Markowitz utility function. In fact, we do not modify the utility function at all, so that ESG data only appears in the linear constraint we introduced. Thus, in our approach ESG preference, returns and volatility can be interpreted independently, as one would expect [33, 34, 41].
Our study also shows that portfolio optimization is an attractive case for combining classical and quantum workflows. While the discrete portfolio optimization problem can only be solved efficiently on a quantum computer, all data processing is still done efficiently on a classical computer. We believe that many quantum applications will be part of such hybrid quantum-classical workflows in the future. See refs. 23, 24, 50 and 68–70 for more examples of hybrid approaches to portfolio optimization.

IV. CONCLUSIONS
We have presented a study of Markowitz portfolio optimization in the presence of discrete stock allocations, limited budget and constraints on portfolio ESG scores. We have studied both the usual continuous formulation of the portfolio problem as well as a more realistic discrete version. The discrete version cannot be solved efficiently on classical computers, at least not by enumerating all possible portfolio combinations, although some progress has been achieved using simulated annealing on classical hardware [4]. Therefore, we have employed a D-Wave quantum annealer for solving the discrete portfolio problem.
We have established a mapping between continuous and discrete Markowitz portfolio theory, which allows us to compare results in a meaningful way. This mapping involves a re-scaling of the risk-aversion parameter ϕ. Importantly, we have also shown that if this re-scaling is not applied in the discrete case, the relative composition of discrete solutions does not converge to the continuous solution, even in the limit of infinite portfolio size, but rather converges to the minimum variance portfolio. Subsequently, we extended Markowitz portfolio theory to include a budget constraint. We showed that rounding of continuous portfolio compositions to the nearest integer number of stocks yields sub-optimal portfolios for small and medium investment budgets. In contrast, solutions from our discrete approach lie on the efficient frontier in volatility-return space.
Furthermore, we introduced a novel way to classify portfolios in terms of their ESG score via the Wasserstein p-distance by viewing relative portfolio compositions as discrete probability distributions. Using the Wasserstein metric, we measure a portfolio's distance with respect to the best possible portfolio by ESG score in the respective scoring system. Our method is a generalization of the weighted average classification scheme reported in the literature and is applicable to any ESG scoring system without further modification. Our framework can even be modified to accommodate ESG data from heterogeneous scoring systems. We incorporated the ESG data into the optimization process by constraining the portfolio search to a certain maximum distance from the portfolio with the best possible ESG score via a linear constraint, which is independent of the chosen metric.
We also reported case studies for portfolios using components of the well-known EURO STOXX 50 index. By decreasing the maximum distance from the best ESG portfolio, we found that portfolio compositions were gradually putting more weight on stocks with better ESG scores and less weight on stocks with worse ESG scores, both in terms of number of stocks and in terms of allocated budget.
Our results can help ESG-aware investors to include their preferences in an effective way, building on the widely used Markowitz portfolio theory. How these preferences are derived is a research field in itself and goes beyond our work.
So far, we have only studied the Wasserstein p-distance for p = 1. Future studies should clarify the role of p for the ESG portfolio problem. Furthermore, our method could be applied to larger portfolios and heterogeneous ESG data from different providers. We believe that the formalism we presented can be applied to many practical problems, such as finding tradeable ESG-optimized portfolios or constructing discrete ESG-aware portfolios as a basis for exactly hedgeable indices. These topics are left for future research.

FIG. 1. Euclidean norm of the difference vector between optimal relative portfolio weights ⃗ x for the continuous (⃗ x_c) and discrete optimization case (⃗ x_d). The risk-aversion parameter is set to ϕ = 8, but different choices of ϕ > 0 give similar results. The investment universe comprises BMW (ISIN DE0005190003), Deutsche Post (ISIN DE0005552004), Deutsche Telekom (ISIN DE0005557508) and Infineon (ISIN DE0006231004). Data is taken from the period between 1 January 2010 and 1 January 2021. Lines are guides to the eye. a) Difference between continuous solution and naive discrete approach (circles) as well as the difference between continuous and normalized discrete solution (squares). The naive approach does not converge to the continuous solution, even for very large portfolios. The normalized discrete approach converges to the well-known continuous solution for large portfolios. The remaining differences in portfolio composition are purely due to the discreteness. b) Difference between continuous and naive discrete solutions for the modified utility function Q_mod = ⃗ x^T Σ ⃗ x, which only includes the covariance term. Both the continuous and discrete approaches converge to the same minimum variance portfolio for this modified utility function Q_mod. Here too, the remaining differences in portfolio composition are purely due to the discreteness.

FIG. 2. Portfolio positions in volatility-return space for the naive discrete optimization approach as a function of the total number of stocks N_tot in the portfolio. The risk-aversion parameter is set to ϕ = 1. The investment universe comprises BMW (ISIN DE0005190003), Deutsche Post (ISIN DE0005552004), Deutsche Telekom (ISIN DE0005557508) and Infineon (ISIN DE0006231004). The light blue background is generated by randomly sampling the space of possible portfolios. The upper boundary of the light blue area is commonly called the 'efficient frontier'. Data is taken from the period between 1 January 2010 and 1 January 2021. Lines and arrows are guides to the eye.

FIG. 3. Best portfolio compositions for a budget of B = 100000 € and a risk-aversion parameter of ϕ = 8. The discrete solution is obtained by minimizing the utility function in eq. 12 under the constraints of eq. 13. We use N_tot = 3401. The continuous results were obtained by multiplying the relative allocation by the available budget and rounding to integer numbers of stocks via eq. 14, which results in N_tot = 4026. The investment universe comprises BMW (ISIN DE0005190003), Deutsche Post (ISIN DE0005552004), Deutsche Telekom (ISIN DE0005557508) and Infineon (ISIN DE0006231004). Data is taken from the period between 1 January 2010 and 1 January 2021. a) Portfolio composition in terms of number of stocks. b) Portfolio composition in terms of Euro value.

FIG. 5. Position of the best discrete portfolio compositions in volatility-return space for a budget of B = 100000 €, a risk-aversion parameter of ϕ = 8 and different values of the maximum allowed ESG distance D. The ESG data were provided by ISS ESG. The grading system is on a scale from 4 to 1, where a higher number indicates better ESG performance. The investment universe comprises Deutsche Telekom (ISIN DE0005557508), SAP (ISIN DE0007164600), Intesa Sanpaolo (ISIN IT0000072618) and EssilorLuxottica (ISIN FR0000121667), given in order from highest to lowest ESG score. Data is taken from the period between 1 January 2010 and 1 January 2021. The solution portfolios move away from the efficient frontier as we restrict them to a space that becomes gradually tighter around the best possible ESG score. Arrows are guides to the eye.

FIG. 6. Best discrete portfolio compositions for a budget of B = 100000 €, a risk-aversion parameter of ϕ = 8 and different values of the maximum allowed ESG distance D. The ESG data were provided by ISS ESG. The grading system is on a scale from 4 to 1, where a higher number indicates better ESG performance. The investment universe comprises Deutsche Telekom (ISIN DE0005557508), SAP (ISIN DE0007164600), Intesa Sanpaolo (ISIN IT0000072618) and EssilorLuxottica (ISIN FR0000121667), given in order from highest to lowest ESG score. Data is taken from the period between 1 January 2010 and 1 January 2021. Decreasing the maximum distance D to the portfolio with the best possible ESG score results in compositions that gradually contain higher amounts of stocks with better ESG scores such as Deutsche Telekom and SAP. a) Portfolio composition in terms of number of stocks. b) Portfolio composition in terms of Euro value.

Here, i enumerates the possible values S_i of the ESG score within the given portfolio ⃗ π. This vector contains the relative number of stock allocations to the respective ESG score S_i, π_i is the i-th component of the vector ⃗ π, and p ∈ [1, +∞) is the parameter of the Wasserstein p-distance. Note how the exact composition in terms of stocks is irrelevant in this approach. The distance measure D_ESG is only sensitive to the ESG scores of the respective constituents. Therefore, different allocations of stocks with the same ESG score do not affect D_ESG. Also note that comparison to a best possible individual allocation would have required us to know this specific portfolio. This target portfolio is, however, in general unknown. The point of our method is to find it. Hence, we have chosen an approach in which knowledge of this hard-to-find solution is not required. If all constituents of a portfolio ⃗ π have the best possible score S+, the distance measure is D_ESG = 0.

TABLE I. Number of stocks and allocated budget ⃗ p^T ⃗ x as a function of the maximum ESG distance D.