Diversifying Investments and Maximizing Sharpe Ratio: A Novel Quadratic Unconstrained Binary Optimization Formulation

Abstract: The optimization of investment portfolios represents a pivotal task within the field of financial economics. Its objective is to identify asset combinations that meet specified criteria for return and risk. Traditionally, the maximization of the Sharpe Ratio, often achieved through quadratic programming, has constituted a popular approach for this purpose. However, real-world scenarios frequently necessitate more complex considerations, particularly in relation to portfolio diversification with a view to mitigating sector-specific risks and enhancing stability. The incorporation of diversification alongside the Sharpe Ratio into the optimization model creates a joint optimization task, which can be formulated as Quadratic Unconstrained Binary Optimization (QUBO) and addressed using quantum annealing or hybrid computing techniques. These techniques offer promising solutions. We present a novel QUBO formulation for this optimization, detailing its mathematical formulation and demonstrating its advantages over classical methods, particularly in handling diversification objectives. By leveraging available QUBO solvers and hybrid approaches, we explore the feasibility of handling large-scale problems while highlighting the importance of diversification in achieving robust portfolio performance. Finally, we elaborate on the results showing the trade-off between the observed values of the portfolio's Sharpe Ratio and diversification, as a natural consequence of solving a multi-objective optimization problem.


Introduction
Portfolio optimization plays a pivotal role in the financial industry. Banks, insurance companies, and hedge funds exploit modern portfolio theory, formulated by Harry Markowitz, which earned him a Nobel Prize [1]. In portfolio optimization, we are given a set of assets to choose from. Although some portfolios are handcrafted according to the experience of financial advisors, a software market has emerged and continues to grow in tandem with technological advancements. By considering the evolving behavior of these assets' values, we can estimate the expected return (financial gain) and its volatility (risk). We then utilize this information to construct an optimization problem built on the maximization of returns and the minimization of risk. Numerous objective functions have been engineered, all aiming to construct the best portfolio. Among these, the Sharpe Ratio has emerged as one of the main indicators of portfolio quality [2]. The Sharpe Ratio is a measure of an investment's performance, calculated as the ratio between the portfolio's expected return and its risk, computed as the standard deviation. In its simplest form, the resulting optimization problem is convex and thus efficient to solve on a classical device.
In business scenarios, the Sharpe Ratio is often considered a fundamental, yet not fully exhaustive, measure to evaluate financial portfolios. Other characteristics are taken into account, based on the needs that the portfolio is required to meet. One such example is the degree of diversification: investments spread across multiple sectors and industries are preferred for risk-averse profiles, which can often be the case for large financial institutions. Modeling additional business needs into an optimization problem may lead to nontrivial tasks, raising the demand for efficient strategies able to find high-quality solutions. These needs lead to different objective functions, which are based on the Sharpe Ratio optimization but make the resulting problem not necessarily convex. A common business directive is diversification, understood as a preference for solutions that allocate budget across families of assets that are as weakly correlated (or anticorrelated) as possible. Stocks of different companies from the same industry, for example, real estate, may be somewhat linked, and the resulting portfolio may not appeal to an investor.
Recently, the improvement in computational capabilities promised by quantum computing has generated large interest in the financial sector [9]. One of the first applications of quantum annealers was proposed by the authors of [10], who built a Quadratic Unconstrained Binary Optimization (QUBO) formulation for portfolio optimization.
QUBO models represent a powerful tool for solving complex optimization problems across various domains [11,12]. In physics, they offer valuable insights into phenomena like spin glass theory, quantum magnetism, and lattice gauge theory [13][14][15]. In the fields of biology and chemistry, QUBO models are employed for tasks such as measuring similarity among molecular structures [16] and solving lattice protein folding problems [17]. In industrial contexts, QUBO models are indispensable for addressing optimization challenges across different sectors. They are applied, for instance, in logistics for optimizing vehicle routing problems and traffic flow [18,19], as well as in resource allocation and scheduling tasks [20].
The formulation in [10] is tailored to run with the limited capabilities of the quantum hardware available at the time of writing. This has resulted in modeling Sharpe maximization as a discrete problem, where each asset is either selected or not, which breaks the convexity of the continuous formulation and makes the problem nonconvex. However, with the hardware constantly improving, we are interested in developing methods for a wider scope [21].
By combining multiple notions and techniques widely discussed in the literature, such as the QUBO approach and mathematical modeling applied to financial allocation problems, we develop our work with the aim of obtaining a new formulation able to handle complex portfolio optimization tasks using quantum computing techniques. Specifically, we introduce a novel QUBO formulation that preserves the Sharpe Ratio and adds a diversification term to the optimization problem, which entails a complexity that, to the best of our knowledge, has not been formulated as QUBO and solved through a quantum computing approach.
Firstly, we modify the approach in [10] to go beyond the all-or-nothing selection of each asset, allowing the selection of a linear combination of investments on all assets with arbitrarily large precision: the more precise the linear combination becomes, the more binary variables are needed, and the formulation itself approaches the convex problem, which is recovered in the limit case of infinite precision. The possible convexity of such a formulation, which at this point can be solved efficiently by classical means, does not undermine the importance of the work because the formulation can serve as a baseline to add personalized, business-dependent constraints. Secondly, we propose diversification as one of the possible terms that might be taken into consideration. The diversification term in the objective function has the aim of penalizing, but potentially not completely ruling out, investments on assets belonging to the same market sector, guaranteeing a portfolio diversified over multiple sectors up to an arbitrary degree. This solution maximizes the expected return and, at the same time, minimizes the risk of loss given by potential market crashes in individual sectors. We show how to integrate such a term into the baseline QUBO formulation, proving that the diversification maximization leads to a nonconvex problem.
As a general remark, we focus on the problem of identifying the optimal amount of capital to allocate on long positions over the stocks considered. Throughout this study, we adopt the common assumption that trading or selling activities, such as borrowing or opening short positions on assets, are not allowed.
We test our approaches on the classical qbsolv algorithm and on the D-Wave Leap hybrid classical-quantum solver. In particular, the choice of such solvers allows us to abstract from all the hardware details that we need to consider when executing our instances on quantum annealers. The implication of porting our approach on noisy, small-scale quantum annealers is discussed. Alternative approaches in solving the portfolio optimization on quantum computers, although not exhaustive for the task that we face, are shown in [22][23][24][25].
The paper is structured as follows. Section 2 is devoted to reviewing the background on the topic. Section 3 describes the fine-grained QUBO formulations for Sharpe maximization. Section 4 shows the chosen modeling strategy for the diversification term. In Section 5, we report our experiments with various solvers, and we discuss the results. Finally, Section 6 concludes our work and outlines future directions.

Preliminaries
In this section, we briefly introduce the necessary background to render our work self-contained. For further details, one can refer to [26] for the definition of effective QUBO formulations and to [27] for quantum annealing.

Notation
The sets of real, complex, and Boolean values are denoted with $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{B} = \{0, 1\}$, respectively. Scalar variables and constants are denoted with lowercase alphabetic characters and lowercase Greek characters, respectively, e.g., $x$ and $\lambda$. Vectorial values are denoted with lowercase bold characters, e.g., $\mathbf{x}$, and are intended to be column vectors. Matrices and operators are denoted with uppercase alphabetic characters, e.g., $Q$. The notation $\sigma_x, \sigma_y, \sigma_z$ denotes the Pauli matrices, with the optional superscript indicating the qubits on which the operator acts.

Quadratic Unconstrained Binary Optimization
The Quadratic Unconstrained Binary Optimization (QUBO) is an NP-hard combinatorial optimization problem defined as follows: given a set of $n$ Boolean variables $x \in \mathbb{B}^n$ and an $n \times n$ matrix $Q$ of real values, typically in upper-triangular form, the problem consists of finding the value
$$x^* = \arg\min_{x \in \mathbb{B}^n} \sum_{i=1}^{n} \sum_{j=i}^{n} Q_{ij} x_i x_j \tag{1}$$
or equivalently
$$x^* = \arg\min_{x \in \mathbb{B}^n} x^T Q x. \tag{2}$$
Despite the simplicity of this approach, its NP-hardness guarantees the possibility of expressing a vast class of problems, such as those introduced in Section 1. It is notable that we can express problems over integers and rational variables using different encodings. For example, a positive integer $y$ can be defined over an $m$-bit variable $b$ through the binary encoding $y \equiv \sum_{i=0}^{m-1} 2^i b_i$ or through the unary encoding $y \equiv \sum_{i=0}^{m-1} b_i$. The use of different encodings will lead to diverse QUBO formulations, and their efficacy has to be assessed on a per-case basis. The same reasoning applies to differently structured data, e.g., fixed-point rational variables.
Constraints can be expressed implicitly by representing the QUBO in (2) as a linear combination of the main objective function and other terms, each corresponding to one constraint. In [28], it is shown how to design different kinds of constraint terms.
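As a minimal sketch of the definitions above, the following Python snippet evaluates the QUBO objective $x^T Q x$, finds its minimizer by exhaustive enumeration, and decodes the binary and unary integer encodings. The function names and the 3-variable matrix `Q_example` are illustrative assumptions and do not come from the paper; enumeration is only viable for very small $n$.

```python
from itertools import product


def qubo_energy(Q, x):
    """Return x^T Q x for a square matrix Q (list of lists) and a 0/1 vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def brute_force_qubo(Q):
    """Enumerate all 2^n binary assignments and return one with minimum energy."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))


def decode_binary(bits):
    """Binary encoding: y = sum_i 2^i * b_i."""
    return sum((2 ** i) * b for i, b in enumerate(bits))


def decode_unary(bits):
    """Unary encoding: y = sum_i b_i."""
    return sum(bits)


# Hypothetical upper-triangular instance: negative diagonal entries reward
# selecting a variable, positive off-diagonal entries penalize neighbours.
Q_example = [
    [-1.0, 2.0, 0.0],
    [0.0, -1.0, 2.0],
    [0.0, 0.0, -1.0],
]
best = brute_force_qubo(Q_example)
```

With $m$ bits, the binary encoding covers the integers $0, \dots, 2^m - 1$, whereas the unary encoding covers only $0, \dots, m$; this is one reason the choice of encoding affects the size of the resulting QUBO.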

Quantum Annealing
Quantum annealing is a heuristic optimization procedure proposed in [29]. It is carried out by initializing a quantum mechanical system in a superposition of all candidate solutions, which then evolves according to the Schrödinger equation under a time-varying Hamiltonian:
$$H(t) = \Gamma(t)\, H_i + \mathcal{J}(t)\, H_f. \tag{3}$$
Here, $t \in \mathbb{R}_{\ge 0}$ represents the time, $H_i$ is the Hamiltonian of the system at initialization, and $H_f$ corresponds to the problem formulation to be minimized. According to the adiabatic theorem [30], if the system evolves slowly enough, it remains in its ground state throughout the entire evolution. The evolution is controlled by the given schedule, specified by $\Gamma$ and $\mathcal{J}$. The Hamiltonian $H_f$ takes the form:
$$H_f = \sum_{i} h_i S_i + \sum_{i<j} J_{ij} S_i S_j, \tag{4}$$
where $h_i$ is called bias and represents the strength that leads the corresponding variable $S_i$ to take either value $-1$ or $1$, $J_{ij}$ is called coupler and encodes the relationship between pairs of variables $(S_i, S_j)$, and both are real-valued parameters. The measurement results in a vector $S$ of spin variables $S_i \in \{\pm 1\}$. It is worth noting that the QUBO formulation can be naturally expressed as an Ising Hamiltonian, where a change of variables $S_i \leftrightarrow 2x_i - 1$ establishes the equivalence.
Recently, several companies have developed large-scale quantum annealers, which are specialized quantum computers designed for solving Ising Hamiltonians. Examples of such systems include D-Wave [21] and Qilimanjaro [31]. The availability of quantum annealers is crucial to address the class of problems that can be solved through a QUBO model, aiming to achieve an advantage over other techniques in terms of both solution quality and computational speed.
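The change of variables $S_i = 2x_i - 1$ can be made concrete with a short sketch. The snippet below converts an upper-triangular QUBO matrix into Ising biases, couplers, and a constant offset, and checks numerically that both energies agree on every assignment. The function names and the example matrix are assumptions introduced for illustration only.

```python
from itertools import product


def qubo_to_ising(Q):
    """Map an upper-triangular QUBO matrix to Ising (h, J, offset) via x = (S + 1) / 2."""
    n = len(Q)
    h = [0.0] * n
    J = {}
    offset = 0.0
    for i in range(n):
        # Linear QUBO term Q_ii * x_i = Q_ii / 2 + (Q_ii / 2) * S_i
        h[i] += Q[i][i] / 2.0
        offset += Q[i][i] / 2.0
        for j in range(i + 1, n):
            q = Q[i][j]
            if q != 0.0:
                # Q_ij * x_i * x_j = (Q_ij / 4) * (1 + S_i + S_j + S_i * S_j)
                J[(i, j)] = q / 4.0
                h[i] += q / 4.0
                h[j] += q / 4.0
                offset += q / 4.0
    return h, J, offset


def qubo_energy(Q, x):
    """Energy of an upper-triangular QUBO (uses x_i * x_i = x_i on the diagonal)."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def ising_energy(h, J, offset, S):
    """Energy of the Ising model with spins S_i in {-1, +1}."""
    linear = sum(h[i] * S[i] for i in range(len(h)))
    quadratic = sum(c * S[i] * S[j] for (i, j), c in J.items())
    return offset + linear + quadratic


# Hypothetical instance used only to verify the equivalence numerically.
Q_example = [
    [-1.0, 2.0, 0.5],
    [0.0, -1.0, 2.0],
    [0.0, 0.0, -3.0],
]
h, J, offset = qubo_to_ising(Q_example)
max_gap = max(
    abs(qubo_energy(Q_example, x) - ising_energy(h, J, offset, [2 * b - 1 for b in x]))
    for x in product((0, 1), repeat=3)
)
```

The constant offset does not affect the minimizer, so it can be dropped when the model is submitted to a solver and added back when reporting energies.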

Combinatorial Optimization Techniques on Quantum Computers
The development of combinatorial optimization algorithms for quantum computers is of paramount interest. Most combinatorial optimization problems are NP-hard, which informally means they are at least as hard to solve as the hardest problems in NP. It is widely believed, based on reasonable computational complexity assumptions [32], that neither classical nor quantum computers can efficiently solve NP-hard optimization problems. However, significant theoretical speedup has been proven, and various techniques, including quantum annealing, have shown promise in providing improved solutions for certain classes of problems.
Regarding quantum annealing, the adiabatic theorem does not guarantee efficient convergence to the global optimum, as the time required to evolve the system may be exponential in the size of the problem instances. In comparison to the simulated annealing algorithm, the authors of [33] have proven that quantum annealing can leverage quantum tunneling effects to escape local minima, where simulated annealing would require exponential time to escape. On the other hand, [34] presents arguments in favor of simulated annealing. More recently, [35] has identified specific characteristics that can give quantum annealing an advantage over classical techniques, such as landscapes with many local minima separated by high but thin barriers.
Different possibilities are suggested by quantum algorithms for semidefinite programming [36], which can offer significant speedup over the current classical solution but require fault-tolerant quantum hardware. The authors of [37] have shown how to formulate semidefinite relaxations of QUBO problems.

Sharpe Ratio Maximization in Portfolio Optimization
Portfolio optimization involves selecting the optimal portfolio based on specific criteria, such as maximizing the expected return or profit from the portfolio while minimizing the associated risk. A potential approach for identifying a suitable portfolio is the maximization of the Sharpe Ratio. For an in-depth treatment of portfolio optimization techniques, one can refer to [38].
Formally, given a set of $n$ assets, let $\mu = (\mu_1, \dots, \mu_n)$ be the vector of expected returns of such assets and $w = (w_1, \dots, w_n)$ be a weight vector such that $\sum_{i=1}^{n} w_i = 1$. The total expected return of the portfolio is calculated as the weighted sum of the expected returns of each asset, i.e., $w^T \mu$.
The risk or volatility of the portfolio is quantified by the standard deviation, denoted as $\sigma$, which is the square root of the portfolio variance. The portfolio variance is computed as the quadratic form of the weight vector and the covariance matrix $\Sigma$, i.e., $w^T \Sigma w$.
It is worth noting that in high-dimensional scenarios, the covariance matrix may become singular, posing a challenge for classical solvers in convex optimization. Some solutions to this issue could involve replacing the inverse with the Moore-Penrose inverse or utilizing Stein-type compression techniques, as reported in [39]. Although the choice around the correlation matrix may involve deeper reasoning in some applications, we handle the problem as QUBO in a complex scenario, thus naturally dealing with a nonconvex optimization problem, meaning that the singularity of the matrix can be circumvented as a potential problem for this class of models and relatively to the scope of our work.
The Sharpe Ratio, denoted by $S$, which does not consider a risk-free rate, is defined as the ratio between the expected return of a portfolio and the square root of its variance. Then, the problem that we solve, which we will denote as the Max-Sharpe problem from now on, is formulated as
$$\max_{w} \; S(w) = \frac{\mu^T w}{\sqrt{w^T \Sigma w}} \quad \text{s.t.} \quad \sum_{i=1}^{n} w_i = 1, \quad w_i \ge 0 \;\; \forall i = 1, \dots, n. \tag{5}$$
The presence of additional constraints, such as regularization or sparsity conditions, may result in a problem more challenging to solve.
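To make the definition concrete, the sketch below computes the portfolio return $w^T \mu$, the variance $w^T \Sigma w$, and their ratio in plain Python. The two-asset figures are made up for illustration and are not taken from the dataset used in this work.

```python
import math


def portfolio_return(w, mu):
    """Expected portfolio return w^T mu."""
    return sum(wi * mi for wi, mi in zip(w, mu))


def portfolio_variance(w, cov):
    """Portfolio variance w^T Sigma w."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


def sharpe_ratio(w, mu, cov):
    """Sharpe Ratio without a risk-free rate: return divided by standard deviation."""
    return portfolio_return(w, mu) / math.sqrt(portfolio_variance(w, cov))


# Hypothetical two-asset example with uncorrelated assets.
mu = [0.10, 0.20]
cov = [[0.04, 0.00],
       [0.00, 0.09]]
w = [0.5, 0.5]

s = sharpe_ratio(w, mu, cov)
# Rescaling the weights by a positive constant leaves the ratio unchanged.
s_scaled = sharpe_ratio([3.0 * wi for wi in w], mu, cov)
```

The ratio is invariant under positive rescaling of $w$, since the numerator and the denominator both scale linearly; the budget constraint $\sum_i w_i = 1$ is what fixes the scale.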

Diversification of Investments over Multiple Sectors
While we model the Sharpe Ratio maximization as the first building block of our task, in some business scenarios multiple additional needs arise, which can be encoded as constraints in the optimization problem or as additional terms in the objective function. The presence of these terms could make the problem no longer convex and therefore not efficiently solvable via classical techniques.
The need for diversification over multiple sectors, when modeled as a term of the objective function, leads to a nonlinear and, in general, nonconvex optimization problem. Let us assume that we have a matrix $A$ describing 5 assets $(x_1, \dots, x_5)$ that belong to 3 sectors $(s_1, s_2, s_3)$, with rows indexed by assets and columns by sectors, defined in the following way:
$$A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
As can be seen from $A$, $x_1$ and $x_2$ belong to $s_1$, whereas $x_3$ and $x_4$ belong to $s_2$, and $x_5$ is the only asset that belongs to $s_3$. We formulate the term as a penalization of investments on assets belonging to the same sector, while incentivizing the allocation of capital on the individual assets:
$$\min_{w} \; \sum_{i=1}^{n} f_i w_i + \sum_{i=1}^{n} \sum_{j=1}^{n} D_{ij} w_i w_j, \tag{6}$$
where $f$, with $f_i < 0 \;\; \forall i = 1, \dots, n$, is the vector of components that drive the solution to invest a positive quantity, and $D \in \mathbb{B}^{n \times n}$ is the matrix incorporating the information on the assets' sectors, defined as follows:
$$D_{ij} = \begin{cases} 1 & \text{if assets } i \text{ and } j \text{ belong to the same sector} \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$
The rationale behind this formulation lies in the preference of having a term that penalizes investments on assets belonging to the same sector while simultaneously encouraging nonzero allocations across all assets. This approach drives the optimization process towards a solution that distributes investments as evenly as possible across all sectors.
Investors can benefit from spreading their assets across different sectors to reduce exposure to specific risks associated with a single sector. This diversification strategy helps mitigate the impact of adverse events or downturns affecting any particular sector since investors can potentially offset losses in one area with gains in another, leading to a more stable portfolio overall. By introducing diversification at the sector level, it may be possible to encode market dynamics, which could be difficult to capture from the correlation matrix alone.
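The sector matrix $D$ and the effect of the diversification term can be illustrated with a small sketch. The snippet below builds $D$ from sector labels for the 5-asset, 3-sector example and evaluates a term of the form $f^T w + w^T D w$. That exact functional form (including the diagonal of $D$), the choice $f_i = -1$, and all function names are assumptions made here for illustration.

```python
def same_sector_matrix(sectors):
    """D_ij = 1 if assets i and j carry the same sector label, 0 otherwise."""
    n = len(sectors)
    return [[1 if sectors[i] == sectors[j] else 0 for j in range(n)] for i in range(n)]


def diversification_term(w, f, D):
    """Assumed form of the term: linear incentive f^T w plus same-sector penalty w^T D w."""
    n = len(w)
    linear = sum(f[i] * w[i] for i in range(n))
    quadratic = sum(D[i][j] * w[i] * w[j] for i in range(n) for j in range(n))
    return linear + quadratic


# Five assets in three sectors, as in the example in the text.
sectors = ["s1", "s1", "s2", "s2", "s3"]
D = same_sector_matrix(sectors)
f = [-1.0] * 5  # hypothetical incentive coefficients, all negative

concentrated = [0.5, 0.5, 0.0, 0.0, 0.0]  # everything in sector s1
spread = [0.2, 0.2, 0.2, 0.2, 0.2]        # capital on every asset

term_concentrated = diversification_term(concentrated, f, D)
term_spread = diversification_term(spread, f, D)
```

Under this assumed form, $w^T D w$ equals the sum of squared sector weights, so for a fixed budget it is smallest when capital is spread evenly across sectors.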

QUBO Formulation for Sharpe Ratio Optimization
In [10], the authors proposed the de facto standard QUBO formulation for Sharpe Ratio maximization. Given a portfolio of $n$ assets, the QUBO is defined as follows:
$$\min_{q \in \mathbb{B}^n} \; -\sum_{i=1}^{n} a_i q_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} b_{ij} q_i q_j, \tag{8}$$
where $q_i$ is a binary variable representing whether the $i$-th asset is selected ($q_i = 1$) or not ($q_i = 0$), $a_i$ represents the expected risk-adjusted return ($a_i = \mu_i / \sigma_i$), and $b_{ij}$ represents the diversification penalties and rewards, corresponding to the correlation between assets $i$ and $j$.
Optionally, for $\lambda \in \mathbb{R}_{\ge 0}$, the formulation with an additional term formulated as
$$\lambda \left( \sum_{i=1}^{n} q_i - M \right)^2$$
would reward solutions having $M$ selected assets. Furthermore, the authors denote the need to group the values of $a_i$ into buckets of 11 evenly spread ranges, and $b_{ij}$ into buckets of non-evenly spread ranges. Such a step likely leads to a non-positive-definite correlation matrix, making the problem more complex for classical solvers.
Building on these considerations, we develop 2 different QUBO formulations for the Sharpe Ratio maximization. The first one, detailed in Section 3.1, consists of a natural expansion of the problem (8) to suit our goal of finding the optimal amount of investments; the second one, detailed in Section 3.2, considers a different modeling approach and encodes more faithfully the definition of Sharpe Ratio into a QUBO formulation. We will then use the first formulation as an auxiliary problem for benchmarking purposes and build on the second one to expand the model in order to handle the diversification maximization problem as well. This choice is motivated in Section 3.1. Finally, in Section 5.5, we draw considerations on the 2 formulations.

Sharpe Ratio Proxy Formulation
One important aspect to note in this approach is the coarse-grained selection, where each asset is either selected or not.
In order to solve a case with a wider scope where we are interested in finding the optimal quantity of investments over the assets, we propose the introduction of fractional weights. To do so, we substitute each 1-bit weight $q_i$ with a $p$-bit vector $x_i$, encoded according to the formula:
$$w_i = d \sum_{k=0}^{p-1} 2^k x_{ik}.$$
This encoding, with a chosen precision of $p = 9$, allows for discretizing the range $[0, 1]$ with a granularity of 0.002. This means that the minimum investment in a single asset is 0.2%. We will denote the discretization constant as $d$, which, in this example, has been set to $1/500$. This choice for the encoding is general in terms of range of applicability of the QUBO formulation; the only potential reduction in applicability is given by fixing a minimum percentage that can be invested in a given asset. However, we believe that such an approximation is reasonable in financial contexts applied to portfolio optimization, where the amounts invested can be much larger, and that it does not leave out portions of assets that can contribute significantly to the performance of a portfolio.
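The fixed-point encoding of a fractional weight can be sketched as follows, assuming the standard binary expansion $w_i = d \sum_k 2^k x_{ik}$ with $p = 9$ bits and a step of 0.002, as stated in the text. The helper names and the rounding rule in `encode_weight` are illustrative assumptions.

```python
P_BITS = 9       # number of bits per asset, as suggested in the text
D_STEP = 0.002   # discretization step: minimum investment of 0.2%


def decode_weight(bits, d=D_STEP):
    """Map a little-endian bit list to a weight: w = d * sum_k 2^k * bits[k]."""
    return d * sum((2 ** k) * b for k, b in enumerate(bits))


def encode_weight(w, p=P_BITS, d=D_STEP):
    """Round w to the nearest representable level and return its p bits (little-endian)."""
    level = round(w / d)
    level = max(0, min(level, 2 ** p - 1))  # clip to the representable range
    return [(level >> k) & 1 for k in range(p)]


bits = encode_weight(0.25)
recovered = decode_weight(bits)
largest = decode_weight([1] * P_BITS)  # all bits set: 511 * 0.002
```

With 9 bits, the largest representable value slightly exceeds 1, so the budget constraint $\sum_i w_i = 1$ still has to be enforced separately by a penalty term.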
The overall formulation is defined on the binary matrix of variables $x \in \mathbb{B}^{n \times p}$, with a suggested value of $p = 9$, in the form
$$\min_{x} \; \lambda_0 H_0 + \lambda_1 H_1,$$
where
$$H_0 = -\sum_{i=1}^{n} a_i w_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} b_{ij} w_i w_j, \qquad w_i = d \sum_{k=0}^{p-1} 2^k x_{ik},$$
is the main objective function to minimize, and
$$H_1 = \left( \sum_{i=1}^{n} w_i - 1 \right)^2$$
is the penalty term that favors solutions satisfying the constraint $\sum_i w_i = 1$. Note that our formulation only has a linear overhead in the number of variables required.
The constants $\lambda_0$ and $\lambda_1$ are hyperparameters that can be chosen through educated guesses, grid-search, or black-box optimization methods such as Bayesian optimization [40]. To ensure the satisfaction of the constraint, we should impose $\lambda_1 \gg \lambda_0$, which is convenient for classical devices with little numerical error on the encoding, but less convenient on a quantum annealer.
However, it is important to note that optimizing these formulations does not directly equate to maximizing the Sharpe Ratio as it is mathematically defined in (5). It is desirable to construct an objective function that accurately reflects the definition of the Sharpe Ratio.
Namely, defining for simplicity the variables $z_m = x_{ik} \;\; \forall m = 1, \dots, p \cdot n$ and knowing that $p = 9$: This is due to the lack of information related to the Sharpe Ratio of the overall portfolio. Instead, only the individual contributions of each asset's Sharpe Ratio and the correlation factor between assets are taken into account. Therefore, we propose a novel formulation that aims to maintain faithfulness to the original definition of the Sharpe Ratio.

Proposed Sharpe Ratio Formulation
Maximizing the Sharpe Ratio as defined in (5) is a nonlinear optimization problem. The authors of [38] provide a quadratic reformulation of this problem by introducing a change of variables; namely, problem (5) is substituted by the following equivalent model:
$$\min_{y, k} \; y^T \Sigma y \quad \text{s.t.} \quad (\mu - r_f e)^T y = 1, \quad \frac{y}{k} \in W, \quad k > 0,$$
where $r_f$ is the risk-free rate, $e$ is the vector of ones, and $W$ is the set of feasible portfolios such that $e^T w = 1 \;\; \forall w \in W$, under the assumption that $\exists\, \hat{w} \in W$ such that $\mu^T \hat{w} > r_f$. This way, given the optimal solution $(y, k)$, it is possible to retrieve the optimal portfolio allocation as $w = y / k$. In order to build our QUBO formulation for the problem above, we consider $r_f = 0$, we identify the set of feasible portfolios $W = \{ w \in \mathbb{R}^n \mid e^T w = 1, \; w_i \ge 0 \;\; \forall i = 1, \dots, n \}$, meaning that we do not introduce additional linear constraints, and finally we assume
$$\mu_i > 0 \quad \forall i = 1, \dots, n.$$
We rely on the latter in order to define the discretization of the new variables. This assumption, along with the constraint $\mu^T y = 1$, allows finding an upper bound for the $y$ variables equal to $1 / \mu_{\min}$, where $\mu_{\min}$ is the smallest (positive) expected return of our assets. This consideration is backed up by the following, which holds under our assumption since $y_i \ge 0$:
$$\mu_{\min}\, y_i \le \mu_i y_i \le \sum_{j=1}^{n} \mu_j y_j = 1 \quad \Longrightarrow \quad y_i \le \frac{1}{\mu_{\min}}.$$
The quadratic formulation of our optimization problem is as follows:
$$\min_{y} \; y^T \Sigma y \quad \text{s.t.} \quad \mu^T y = 1, \quad y_i \ge 0 \;\; \forall i = 1, \dots, n, \tag{16}$$
where $k = e^T y$ and the optimal portfolio allocation in terms of asset weights $w$ is found as $w = y / k$. In this study, we focused on the daily adjusted close prices of the S&P500 index over the period from 2013 to 2020, sourced from Yahoo! Finance [41] (for further details, see Section 5.2). Using these data, we calculated the minimum daily yield, finding it to be $\mu_{\min} = 0.00245$, which implies $1 / \mu_{\min} = 408.10190$. Consequently, we discretized quantities in our QUBO model within the range $[0, 408.10190]$, and defined the following coefficients:
$$y_i = \sum_{k=0}^{p-1} c_k x_{ik}, \qquad c_k = 0.1 \cdot 2^k, \quad k = 0, \dots, p - 1,$$
where $p = 12$ allows representing the range $[0, 408.10190]$ with a discretization step equal to 0.1. Finally, our proposal for a novel QUBO formulation that maximizes the Sharpe Ratio under the assumptions previously declared writes as follows:
$$\min_{x} \; \lambda_0 H_0 + \lambda_1 H_1, \tag{17}$$
where $\lambda_0$ and $\lambda_1$ are hyperparameters that must be tuned in order to find solutions that are both feasible and yield the highest value for the Sharpe Ratio, and
$$H_0 = y^T \Sigma y = \sum_{i=1}^{n} \sum_{j=1}^{n} \Sigma_{ij}\, y_i y_j, \tag{18}$$
$$H_1 = \left( \mu^T y - 1 \right)^2, \tag{19}$$
and $x \in \mathbb{B}^{p \cdot n}$.
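As a minimal sketch of how such a formulation turns into a QUBO matrix, the snippet below assumes the objective $\lambda_0\, y^T \Sigma y + \lambda_1 (\mu^T y - 1)^2$ with each $y_i$ written as a binary expansion with step $d$. The helper names, the tiny instance ($n = 2$, $p = 2$, $d = 0.5$), and the use of a full symmetric matrix instead of an upper-triangular one are assumptions for illustration; the identity $x^2 = x$ for binary variables moves the linear part onto the diagonal.

```python
from itertools import product


def build_sharpe_qubo(mu, cov, p, d, lam0, lam1):
    """Return (Q, offset, coeff, asset) such that x^T Q x + offset equals
    lam0 * y^T cov y + lam1 * (mu^T y - 1)^2, with y_i = sum_k d * 2^k * x_ik."""
    n = len(mu)
    asset = [i for i in range(n) for _ in range(p)]        # asset index of each bit
    coeff = [d * 2 ** k for _ in range(n) for k in range(p)]  # weight of each bit
    size = n * p
    g = [mu[asset[m]] * coeff[m] for m in range(size)]     # contribution to mu^T y
    Q = [[lam0 * coeff[a] * coeff[b] * cov[asset[a]][asset[b]] + lam1 * g[a] * g[b]
          for b in range(size)] for a in range(size)]
    for m in range(size):
        Q[m][m] -= 2.0 * lam1 * g[m]  # linear part of the squared constraint
    return Q, lam1, coeff, asset


def decode_y(x, coeff, asset, n):
    """Recover the discretized y vector from a bit assignment."""
    y = [0.0] * n
    for m, bit in enumerate(x):
        y[asset[m]] += coeff[m] * bit
    return y


# Hypothetical two-asset instance, small enough to check by enumeration.
mu = [0.4, 0.5]
cov = [[0.04, 0.01], [0.01, 0.09]]
lam0, lam1 = 1.0, 10.0
Q, offset, coeff, asset = build_sharpe_qubo(mu, cov, p=2, d=0.5, lam0=lam0, lam1=lam1)

max_gap = 0.0
for x in product((0, 1), repeat=len(Q)):
    y = decode_y(x, coeff, asset, n=2)
    direct = lam0 * sum(y[i] * cov[i][j] * y[j] for i in range(2) for j in range(2)) \
        + lam1 * (sum(m * v for m, v in zip(mu, y)) - 1.0) ** 2
    qubo = sum(Q[a][b] * x[a] * x[b] for a in range(len(Q)) for b in range(len(Q))) + offset
    max_gap = max(max_gap, abs(direct - qubo))
```

The enumeration at the end only confirms that the matrix and the direct objective agree on every assignment; for realistic sizes ($n$ in the hundreds, $p = 12$) the matrix would be handed to a QUBO solver instead.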

QUBO Formulation For Diversified Portfolio Optimization
In this section, we propose the QUBO formulation for (6) according to the variables defined in (17), (18), and (19). We then present the complete QUBO formulation for our task, comprising both the Sharpe Ratio, as defined by the terms (18) and (19), and the diversification maximization. Finally, we discuss the potential strategies required by classical techniques to tackle our problem, in particular the diversification term.
Building on the variable definition as in the model from (17), (18), and (19), we model the QUBO term in charge of maximizing the diversification as follows:
$$H_2 = \sum_{i=1}^{n} f_i y_i + \sum_{i=1}^{n} \sum_{j=1}^{n} D_{ij}\, y_i y_j, \tag{20}$$
where $f_i$ and $D_{ij}$ are defined as in Section 2.6. Finally, the complete QUBO model is formulated as:
$$\min_{x} \; \lambda_0 H_0 + \lambda_1 H_1 + \lambda_2 H_2, \tag{21}$$
where $\lambda_2$ is an additional hyperparameter, similar to $\lambda_0$ and $\lambda_1$, whose higher values lead to more diversified solutions, and $H_0$ and $H_1$ are as defined in Section 3.2.
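The diversification term can be expressed on the same binary variables as the Sharpe Ratio terms. The sketch below assumes a term of the form $\lambda_2 \left( \sum_i f_i y_i + \sum_{i,j} D_{ij} y_i y_j \right)$ with $y_i = \sum_k d\, 2^k x_{ik}$; this functional form, the helper names, and the small instance are assumptions for illustration.

```python
from itertools import product


def diversification_qubo(sectors, f, p, d, lam2):
    """Return (Q, coeff, asset) with x^T Q x = lam2 * (f^T y + y^T D y),
    where D_ij = 1 when assets i and j share a sector label."""
    n = len(sectors)
    asset = [i for i in range(n) for _ in range(p)]
    coeff = [d * 2 ** k for _ in range(n) for k in range(p)]
    size = n * p
    Q = [[0.0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            if sectors[asset[a]] == sectors[asset[b]]:
                Q[a][b] += lam2 * coeff[a] * coeff[b]   # same-sector penalty
        Q[a][a] += lam2 * f[asset[a]] * coeff[a]        # linear incentive (x^2 = x)
    return Q, coeff, asset


# Hypothetical instance: three assets, two sectors, two bits per asset.
sectors = ["s1", "s1", "s2"]
f = [-1.0, -1.0, -1.0]
lam2 = 2.0
Q, coeff, asset = diversification_qubo(sectors, f, p=2, d=0.5, lam2=lam2)

max_gap = 0.0
for x in product((0, 1), repeat=len(Q)):
    y = [0.0] * 3
    for m, bit in enumerate(x):
        y[asset[m]] += coeff[m] * bit
    same = sum(y[i] * y[j] for i in range(3) for j in range(3) if sectors[i] == sectors[j])
    direct = lam2 * (sum(fi * yi for fi, yi in zip(f, y)) + same)
    qubo = sum(Q[a][b] * x[a] * x[b] for a in range(len(Q)) for b in range(len(Q)))
    max_gap = max(max_gap, abs(direct - qubo))
```

Because every term is defined on the same bit vector, a matrix built this way could be added entrywise to the matrix of the Sharpe Ratio terms to obtain a single QUBO.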
As can be seen from ( 21), the possibility of having a QUBO model able to maximize both the Sharpe Ratio and the sectoral diversification allows potential investors to have a formulation that takes into account returns and volatility due to (17) and that diversifies the portfolio not only from the information of the covariance matrix, which does not take into account the concentration risk, but also from additional information such as the reference industries of the assets.
In order to quantify the degree of diversification of different portfolios, we employ the following indicator, which we call Diversification Entropy:
$$DE = -\frac{1}{\log S} \sum_{s=1}^{S} \left( \sum_{i} w_{i,s} \right) \log \left( \sum_{i} w_{i,s} \right), \tag{22}$$
where $\sum_i w_{i,s}$ is the quantity of investment allocated on sector $s$, $s = 1, \dots, S$, with $S$ being the total number of sectors. This measure takes values in $[0, 1]$, with 0 corresponding to portfolios not diversified at all, and 1 indicating an equal investment over all sectors.
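A compact way to compute such an indicator is sketched below, assuming it is the Shannon entropy of the sector allocations normalized by $\log S$, which is consistent with the stated range $[0, 1]$. The function name, the convention $0 \log 0 = 0$, and the example portfolios are illustrative assumptions; the weights are assumed to sum to 1.

```python
import math


def diversification_entropy(weights, sectors):
    """Normalized entropy of sector allocations (assumes weights sum to 1).

    Returns 0 when all capital sits in one sector and 1 when every sector
    in `sectors` receives the same share.
    """
    totals = {}
    for w, s in zip(weights, sectors):
        totals[s] = totals.get(s, 0.0) + w
    n_sectors = len(totals)
    if n_sectors < 2:
        return 0.0
    # Convention: a sector with zero allocation contributes 0 to the entropy.
    entropy = -sum(t * math.log(t) for t in totals.values() if t > 0.0)
    return entropy / math.log(n_sectors)


sectors = ["s1", "s1", "s2", "s2", "s3"]
de_single = diversification_entropy([1.0, 0.0, 0.0, 0.0, 0.0], sectors)
de_equal = diversification_entropy([1 / 6, 1 / 6, 1 / 6, 1 / 6, 1 / 3], sectors)
de_mixed = diversification_entropy([0.4, 0.4, 0.1, 0.1, 0.0], sectors)
```

Note that the sector count is taken from the labels of all assets, including those with zero weight, so a portfolio that ignores a sector entirely is scored below 1.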
Classical techniques for solving a portfolio optimization task with a term as in (20), or even more nonconvex terms modeling additional needs, may become inefficient as the problem size scales. Also, with a suitable change of variables, one could reformulate the objective function in order to linearize the problem and solve it through linear programming strategies. However, this method leads to a considerable increase in the number of variables, making it difficult for classical solvers to efficiently find satisfactory solutions for large-scale problems. As an example, considering $N = 500$ linear variables (the $w_i$ in our notation), one would need to build an additional variable for each pair $w_i, w_j$ in the model in order to associate a penalization coefficient to assets belonging to the same sector: this leads to a total of $\binom{500}{2} = 124{,}750$ additional variables that a quadratic programming model such as QUBO would not need. Moreover, additional constraints would be added in order to efficiently integrate the linearization variables within the problem. As the number of assets, and hence linear variables, scales, the effort to model a diversification term, or equivalently any nonlinear term, grows considerably. With the ongoing research and development on quantum computing, which is expected to find high-quality solutions in complex scenarios, it is of paramount importance to define combinatorial optimization tasks in such a way that quantum computers may solve the problem.

Experimental Assessment
In this section, we show the results of our experiments regarding two main aspects of our proposed QUBO formulation. On one hand, following the core scope of our work, we report the behavior of the complete model as we vary the parameters influencing the Sharpe Ratio and the diversification terms, discussing the trade-off between these measures for the optimized portfolios. On the other hand, as an additional investigation, we evaluate the performance of our formulation for the sole Sharpe Ratio maximization in comparison to other techniques, where the goal is to discuss the appropriateness of our model with respect to the so-called Sharpe Ratio proxy. For the second part of our study, we consider the closeness of the Sharpe Ratio values between the portfolios derived from QUBO formulations and the one yielded by a classical optimization, which is expected to efficiently solve the problem of maximizing the sole Sharpe Ratio. The benchmarking analysis is conducted on a real-world dataset of assets obtained from Yahoo! Finance.
To solve the QUBO formulations, we employ the classical solver qbsolv and the hybrid classical-quantum solver D-Wave Leap [42]. In order to obtain the solution from a classical strategy, we use PyPortfolioOpt [43], a state-of-the-art Python3 library for portfolio optimization.

Choice of the Solver
The choice of solver plays a crucial role in optimization and can significantly impact the quality of the obtained solutions. To ensure a fair comparison, we utilize two different solvers: D-Wave's qbsolv and the D-Wave Leap model hybrid_binary_quadratic_model_version2. The former is a classical technique that decomposes large QUBO matrices into smaller sub-instances, which are then solved using Tabu Search [44]. The latter is a hybrid classical-quantum solver that combines classical and quantum optimization methods.
The use of a hybrid solver is of great importance in overcoming the current limitations of quantum hardware. Notably, the D-Wave Quantum Processing Unit (QPU) imposes restrictions on the precision at which the QUBO formulation can be encoded, requiring various adjustments and approximations at the mathematical level to compensate for these limitations. Additionally, configuring the numerous settings of a quantum annealer, such as the schedule and annealing time, is nontrivial. The hybrid solver automatically configures the optimization problem to run on classical resources and on the QPU (in our case, the Advantage QPU based on the Pegasus topology with more than 5000 qubits), without the need for manual configuration. It is important to note that the specific details of how the computing resources are utilized within the solver are not publicly disclosed.

Dataset Specifications and Preprocessing
As outlined in Section 3, the input data consists of adjusted close prices with daily granularity for 505 assets in the S&P 500 index from 2013 to 2020. The dataset consists of 505 columns and 3135 observations (bank holidays and weekends are already removed by the provider). To prepare the data, we follow the procedure outlined below:

1. Remove null values and exclude assets whose time series contain null values on consecutive days.

2. Calculate the simple returns $R_t$, defined as $R_t = (P_t - P_{t-1}) / P_{t-1}$, and the log-returns $\log R_t$, defined as $\log R_t = \log(1 + R_t)$.

3. Compute the sample expected returns and sample covariance matrix, annualizing the results using a frequency factor of 252, which represents the assumed number of trading days in a year.

4. Remove assets with negative expected returns.
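Steps 2 to 4 of the procedure above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names and the toy prices are hypothetical, null handling (step 1) is omitted, and the actual experiments may rely on a dedicated library rather than on hand-written code.

```python
import math

TRADING_DAYS = 252  # annualization factor stated in the text


def log_returns(prices):
    """Daily log-returns log(1 + R_t), with R_t = (P_t - P_{t-1}) / P_{t-1}."""
    simple = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]
    return [math.log(1.0 + r) for r in simple]


def annualized_stats(returns_by_asset):
    """Annualized sample means and sample covariance, stored in dicts."""
    names = list(returns_by_asset)
    n_obs = len(returns_by_asset[names[0]])
    daily_mean = {a: sum(returns_by_asset[a]) / n_obs for a in names}
    mu = {a: daily_mean[a] * TRADING_DAYS for a in names}
    cov = {}
    for a in names:
        for b in names:
            s = sum((x - daily_mean[a]) * (y - daily_mean[b])
                    for x, y in zip(returns_by_asset[a], returns_by_asset[b]))
            cov[a, b] = s / (n_obs - 1) * TRADING_DAYS  # unbiased estimator
    return mu, cov


def preprocess(prices_by_asset):
    """Steps 2-4: returns, annualized statistics, drop negative expected returns."""
    rets = {a: log_returns(p) for a, p in prices_by_asset.items()}
    mu, cov = annualized_stats(rets)
    kept = [a for a in mu if mu[a] >= 0]
    kept_mu = {a: mu[a] for a in kept}
    kept_cov = {(a, b): cov[a, b] for a in kept for b in kept}
    return kept, kept_mu, kept_cov


# Toy, made-up prices (not the S&P 500 data used in this work).
toy_prices = {
    "UP": [100.0, 101.0, 102.0, 104.0],
    "DOWN": [100.0, 99.0, 97.0, 96.0],
}
kept, mu, cov = preprocess(toy_prices)
```

In this toy input, the asset labeled `DOWN` has a negative annualized expected log-return and is therefore discarded by step 4, while `UP` is retained.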
Assets were grouped according to the information reported by GICS, an industry analysis framework established in 1999 by Standard & Poor's and Morgan Stanley Capital International (MSCI) [45], into the following 11 sectors: Basic Materials, Communication Services, Consumer Cyclical, Consumer Defensive, Energy, Financial Services, Healthcare, Industrials, Real Estate, Technologies and Utilities. Table 1 shows summary statistics on average expected returns, volatility, and number of assets aggregated for each sector for the 432 assets considered after preprocessing the input data. The last preprocessing step (removing assets with negative expected returns) is not strictly necessary for a generic portfolio optimization strategy. However, it is an assumption of the QUBO formulation proposed in this work, needed to provide the upper bound on the $y$ variable as shown in the previous section.
After applying these steps, we are left with 460 assets in the case of simple returns and 432 assets in the case of log-returns. The discrepancy arises because 28 assets whose expected simple return is positive have an expected return that turns negative after the logarithmic transformation. We make the assumption that, for any arbitrary asset $i$, the series $\{R_{i,t}\}_{t \geq 1}$, or equivalently the series of log-returns, is a sequence of independent and identically distributed Gaussian random variables. This assumption allows us to use the sample mean and sample covariance matrix as estimators for the expected return and the covariance matrix of the assets, respectively. While other financial time series models could potentially be more suitable for the dataset, the exploration of such models is beyond the scope of this work. To obtain an acceptable precision, we discretize the formulation in (16) using $p = 12$ bits of precision for the variables.
Figure 1 shows the comparison between the distributions of simple and log-returns through a Quantile-Quantile plot. Given our assumptions on the Gaussian distribution of the assets, we also perform a Shapiro test on the two distributions of returns and base our optimization on the data that shows the highest probability of following a Gaussian distribution, which suggests the use of log-returns.
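The gap between the 460 and 432 surviving assets can be illustrated with a toy calculation. Since $\log(1 + r) \leq r$, the mean log-return of a series never exceeds its mean simple return, so an asset can pass the non-negativity filter under simple returns and fail it under log-returns. The two returns below are made up for illustration and are not taken from the dataset.

```python
import math

# Hypothetical two-period series: +50% followed by -40%.
simple_returns = [0.50, -0.40]

mean_simple = sum(simple_returns) / len(simple_returns)
mean_log = sum(math.log(1.0 + r) for r in simple_returns) / len(simple_returns)

# mean_simple is positive while mean_log is negative:
# the asset would be kept under simple returns and dropped under log-returns.
keep_with_simple = mean_simple >= 0
keep_with_log = mean_log >= 0
```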

Evaluation of the Solutions
It is important to note that our main focus in this work is to provide a novel QUBO formulation to tackle a complex instance of the portfolio optimization problem incorporating multiple business needs. We also stress the ease of modeling quadratic and nonconvex objective function terms in the QUBO model, in contrast to the effort required by classical techniques, as discussed in Section 4. Our interest is then to investigate the behavior of our formulation as the impact of a diversification term on the optimized portfolio intensifies, using the Sharpe Ratio and Diversification Entropy as indicators of the solutions' quality. Furthermore, we study the appropriateness of our QUBO formulation for the Sharpe Ratio maximization as a building block for the complete optimization model. Therefore, we do not emphasize the computational time required to obtain the solutions, as it is not the primary focus of our study. Instead, we draw attention to the quality of the results in terms of objective function value, which is of greater interest from a business perspective than an investigation of the time-wise scaling properties of the approach.

λ Hyperparameter Optimization
To ensure the generation of high-quality solutions, we conduct a calibration procedure to determine the optimal values for $\lambda_0$, $\lambda_1$, and $\lambda_2$ in our formulations, separately for each of the goals of our investigation. The objective of this calibration process is to study how the Sharpe Ratio and diversification measures differ for varying values of the hyperparameters, as well as to create a significant energy gap between feasible and infeasible solutions within the QUBO matrix. This gap facilitates the identification of correct and incorrect results in terms of constraint satisfaction. Additionally, the calibration aims to establish a correlation where higher-quality portfolios are associated with lower energy values, enabling the optimization process to produce improved solutions.
Regarding the analysis of the sole Sharpe Ratio maximization, we solved multiple QUBO instances with varying parameters and repeated the procedure for each combination of QUBO formulation and solver. A detailed explanation of our analysis can be found in Appendix A. The optimal values we found are as follows:

• For the Sharpe Ratio proxy formulation, solving with qbsolv: $\lambda_0 = 1.2631$, $\lambda_1 = 300$.

• For the Sharpe Ratio proxy formulation, solving with D-Wave Leap: $\lambda_0 = 1.2631$, $\lambda_1 = 300$.
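The role of the penalty coefficient in opening an energy gap can be illustrated on a deliberately tiny example. The sketch below is not the formulation of this work: it assumes a hypothetical three-variable problem with made-up linear rewards and a one-hot constraint, and it enumerates all states by brute force instead of calling a QUBO solver.

```python
from itertools import product

# Hypothetical linear rewards for three binary decision variables.
rewards = [0.3, 0.5, 0.4]


def energy(x, lam0, lam1):
    """Toy energy: weighted objective plus a quadratic one-hot penalty."""
    objective = -lam0 * sum(r * xi for r, xi in zip(rewards, x))
    penalty = lam1 * (sum(x) - 1) ** 2  # zero only when exactly one bit is set
    return objective + penalty


def energy_gap(lam0, lam1):
    """Lowest infeasible energy minus lowest feasible energy."""
    states = list(product((0, 1), repeat=len(rewards)))
    feasible = [energy(x, lam0, lam1) for x in states if sum(x) == 1]
    infeasible = [energy(x, lam0, lam1) for x in states if sum(x) != 1]
    return min(infeasible) - min(feasible)


weak_gap = energy_gap(lam0=1.0, lam1=0.1)     # penalty too small
strong_gap = energy_gap(lam0=1.0, lam1=10.0)  # penalty dominates
```

With the small penalty the gap is negative, meaning that the global minimum violates the constraint. With the large penalty the gap is positive, so every infeasible state lies above the best feasible one. Calibration looks for coefficients in the second regime without making the penalty so large that it swamps the objective.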

Results
Figure 2 shows the Sharpe Ratio and diversification measures of the optimized portfolio as the hyperparameters vary. For higher values of $\lambda_2$, whose purpose is to induce solutions with a higher degree of diversification, we report results in accordance with the expectations. As the diversification term weighs more and more heavily on the optimization, the Sharpe Ratio value tends to decrease: the optimization is decreasingly incentivized to maximize the Sharpe Ratio and favors solutions that allocate investments over different sectors while giving less weight to the expected return or covariance of the assets. The runs have been performed with fixed $\lambda_0 = 0.44$ and $\lambda_1 = 10{,}000$. A sufficiently high value for $\lambda_1$ ensures that the constraint expressed in (19) is met throughout all the runs. For our purposes, the exact value of $\lambda_0$ is not as relevant as the ratio between $\lambda_0$ and $\lambda_2$, which determines the magnitude of the impact of one measure over the other (respectively, Sharpe Ratio and diversification). For the problem at hand, the behavior shown in Figure 2 is expected. Although there is no mathematical guarantee that an increase in diversification corresponds to a decrease in the Sharpe Ratio, it is reasonable to expect such dynamics for a general dataset such as the one used. Having experimentally confirmed the expected behavior, and based on all considerations drawn throughout the work on the building of the model, we conclude that our proposed formulation, which encodes the Sharpe Ratio and diversification maximization, is appropriate for the purpose. It also lays the ground for future investigations on benchmarking different techniques and provides a model that can be extended to handle multiple complex optimization terms and solved via quantum computing.
In our investigation of the appropriateness of our Sharpe Ratio maximization formulation, in which the diversification term is discarded, we fix the optimal values for $\lambda_0$ and $\lambda_1$ and then retrieve 10 additional feasible solutions for each combination of QUBO formulation and solver. We gather statistics on the results in terms of Sharpe Ratio values. In Figure 3, we compare these results with the solution obtained by solving the Max-Sharpe problem implemented in the PyPortfolioOpt library [43]. For the Sharpe Ratio proxy formulation, feasibility is given by the sum of asset weights being equal to 1, while for the Proposed Sharpe Ratio formulation, we consider the constraint satisfied if $\mu^T y$ is in a neighborhood of 1, up to a factor equal to $2.5 \times 10^{-4}$.
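The feasibility criterion for the Proposed Sharpe Ratio formulation and the resulting Sharpe Ratio can be checked with a short sketch. It assumes that the decoded vector $y$ is proportional to the portfolio weights and that the risk-free rate is zero; both are assumptions made for illustration, and the numbers below are made up. Because the Sharpe Ratio is invariant to a positive rescaling of the weights, the normalization step does not change its value.

```python
import math

TOLERANCE = 2.5e-4  # neighborhood of 1 reported in the text


def is_feasible(mu, y, tol=TOLERANCE):
    """Constraint mu^T y = 1, accepted within the stated tolerance."""
    return abs(sum(m * yi for m, yi in zip(mu, y)) - 1.0) <= tol


def sharpe_ratio(mu, cov, y):
    """Sharpe Ratio of weights w = y / sum(y), zero risk-free rate (assumption)."""
    total = sum(y)
    w = [yi / total for yi in y]
    n = len(w)
    expected_return = sum(m * wi for m, wi in zip(mu, w))
    variance = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return expected_return / math.sqrt(variance)


# Hypothetical two-asset example.
mu = [0.10, 0.20]
cov = [[0.04, 0.00],
       [0.00, 0.09]]
y = [4.0, 3.0]  # mu^T y = 0.4 + 0.6 = 1

feasible = is_feasible(mu, y)
value = sharpe_ratio(mu, cov, y)
```

When $\mu^T y = 1$ holds exactly, the Sharpe Ratio reduces to $1 / \sqrt{y^T \Sigma y}$, which is why minimizing the quadratic risk term under this constraint maximizes the ratio.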
The Sharpe Ratio proxy and the Proposed Sharpe Ratio formulation result in 3888 and 5184 binary variables, respectively. Among the QUBO solutions, the best results are achieved by solving the Proposed Sharpe Ratio formulation with qbsolv. It is worth noting that the best performance for each of the two formulations is obtained by a different solver: D-Wave Hybrid in one case and qbsolv in the other. This discrepancy can be attributed not only to the different number of variables but also to the specific patterns of the QUBO matrices. With 432 assets, the block size is 9 variables for the Sharpe Ratio proxy formulation and 12 variables for the Proposed Sharpe Ratio formulation.
This difference in block size may influence the behavior of the solvers and lead to varying results. The gap between the classical and the best QUBO solutions can be attributed to the capabilities of the solvers. A finer discretization would allow for a closer representation of continuous values but may result in an increase in the number of variables, potentially affecting the solver's performance and leading to suboptimal results.
Figure 4 presents statistics regarding the number of assets selected using the two formulations and different solvers. With the optimization strategy fixed, the solutions exhibit a consistent pattern: the Proposed Sharpe Ratio formulation results in a decrease in both the mean number of assets selected and the variability. This behavior is observed across all solvers and can be attributed to the differences in the number of variables and block sizes between the two QUBO formulations, as discussed previously.
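The variable counts quoted above follow directly from the number of assets and the block sizes, reading "block size" as the number of binary variables allocated to each asset. The short check below only reproduces this arithmetic; the dictionary keys are labels chosen for illustration.

```python
N_ASSETS = 432  # assets retained after preprocessing (log-returns)

# Binary variables per asset ("block size") for each formulation.
BLOCK_SIZE = {
    "Sharpe Ratio proxy": 9,
    "Proposed Sharpe Ratio": 12,
}

# Total number of binary variables in each QUBO.
n_variables = {name: N_ASSETS * bits for name, bits in BLOCK_SIZE.items()}
```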

Conclusions
The portfolio optimization problem is a well-known task in the financial economy and has recently drawn attention within the quantum computing literature thanks to the applicability of quantum annealers to solve the problem.
In this work, we tackle a specific strategy to find the optimal allocation of investments over a set of assets, namely the Sharpe Ratio maximization, while optimizing a diversification measure that allows spreading investments across multiple sectors, leading to a nontrivial optimization task. When modeling the Sharpe Ratio, the first building block of our work, we extend a QUBO formulation proposed by [10], highlighting its benefits and potential drawbacks, and propose a novel QUBO formulation to address the drawbacks. Then, we formulate the complete QUBO by taking into account both measures of portfolio quality. We run our experiments on classical and quantum computing hardware and elaborate on the results both in terms of the QUBO formulation and in terms of the optimization solver. Future work might include a deeper investigation of different solvers, a comparison of the presented quantum method with alternative resolution strategies, including alternative classical methods, an assessment of the corresponding capabilities of the Proposed Sharpe Ratio formulation, and the formulation of additional common needs from the financial industry.

Figure 1. Quantile-Quantile plots for (a) simple returns and (b) log-returns.

Figure 2. Optimization runs on D-Wave's QBSolv (CPU) and D-Wave's Hybrid (QPU) of multiple QUBO instances as $\lambda_2$ varies. The plots report the behavior of Sharpe Ratio (blue line) and Diversification Entropy (red line), respectively, as $\lambda_2$ increases. All runs are feasible, considering the constraint $\mu^T y = 1$ satisfied if $\mu^T y$ is in a neighborhood of 1 (see Section 3.2), up to a factor equal to $2.5 \times 10^{-4}$, which is given by multiplying the minimum discretization coefficient by the minimum expected return.

Figure 3. Results provided by each combination of QUBO formulation and solver. The statistics are drawn from 10 feasible solutions with fixed values for the $\lambda$ coefficients. All solutions are feasible: for the Sharpe Ratio proxy formulation, feasibility is given by the sum of asset weights equal to 1, while for the Proposed Sharpe Ratio formulation, we consider the constraint satisfied if $\mu^T y$ is in a neighborhood of 1, up to a factor equal to $2.5 \times 10^{-4}$. The colour of each violin plot refers to the average Sharpe Ratio value of the runs of each configuration. The darker the colour, the higher the average Sharpe Ratio value.

Figure 4. Minimum, maximum, and mean (identified by the dot) of the number of assets selected over the 10 feasible solutions found. The left plot reports the statistics obtained by solving the two QUBO formulations using D-Wave Hybrid. The right plot, likewise, shows the results obtained with the QBSolv solver.

Figure A1. Percentage of feasible solutions obtained from 20 runs for each lambda configuration, for each combination of formulation and solver. Bar colour refers to the percentage of feasible runs for each configuration. The darker the colour, the higher the value.

Table 1. Summary statistics of sectors.