Data-Driven Sustainable Investment Strategies: Integrating ESG, Financial Data Science, and Time Series Analysis for Alpha Generation

Abstract: In today's investment landscape, the integration of environmental, social, and governance (ESG) factors with data-driven strategies is pivotal. This study delves into this fusion, employing sophisticated statistical techniques and Python programming to unveil insights often overlooked by traditional approaches. By analyzing extensive datasets, including S&P500 financial indicators from 2012 to 2021 and 2021 ESG metrics, investors can enhance portfolio performance. Emphasizing ESG integration for sustainable investing, the study underscores the potential for alpha generation. Time series analysis further elucidates market dynamics, empowering investors to align with both financial objectives and ethical values. Notably, the research uncovers a positive correlation between ESG risk and total risk, suggesting that companies with lower ESG risk tend to outperform those with higher ESG risk. Moreover, employing a long-short ESG risk strategy yields abnormal returns of approximately 4.37%. This integration of ESG factors not only mitigates risks associated with environmental, social, and governance issues but also capitalizes on opportunities for sustainable growth, fostering responsible investing practices and ensuring long-term financial returns, resilience, and value creation.

1. Introduction

1.1. Data-Driven Strategies, Environmental, Social, and Governance Integration, and Alpha Generation in Modern Investment Practices
In today's ever-evolving financial landscape, the adoption of data-driven strategies has become essential for making well-informed investment decisions. Through the utilization of financial data science, statistical analysis, and Python programming, investors can harness extensive data to uncover patterns and insights that traditional methods might miss. This shift towards data-driven approaches is driven by their proven ability to enhance decision-making, manage risks, and, ultimately, improve investment performance. At the forefront of this transformation lies ESG investing, which incorporates non-financial factors like environmental sustainability, social responsibility, and corporate governance into investment analysis. ESG considerations have gained momentum as investors recognize the importance of aligning their investments with broader societal and environmental goals.
In addition to financial gains, ESG factors offer insights into a company's long-term sustainability and reputation. Companies with robust ESG practices are better equipped to handle risks, attract capital, and seize emerging opportunities. Integrating ESG criteria into investment strategies not only drives positive societal impact but also aligns portfolios with investor values. Meanwhile, alpha generation remains a primary goal for investors seeking to outperform the market. Alpha represents returns above market performance and can be achieved through data-driven methodologies. By leveraging statistical techniques, time series models, and Python, investors can identify investment opportunities and enhance portfolio performance.
Alpha is crucial for portfolio diversification and risk management, signaling superior investment skill and performance. As investors increasingly prioritize sustainability and responsible investing, the role of ESG factors becomes paramount. Incorporating ESG considerations alongside data-driven strategies allows investors to pursue financial gains while contributing to positive environmental and social outcomes. This paper explores the importance of data-driven approaches, the role of ESG considerations in investments, and techniques for alpha generation. By examining the intersection of these areas, we aim to offer actionable insights for navigating today's complex financial markets.

Statistics and Trends in Environmental, Social, and Governance Investing
To underscore the significance of ESG investing, it is imperative to highlight the key statistics and trends shaping its growth in the financial landscape. Recent data reveal a substantial surge in the global sustainable investment market, with assets under management surpassing trillions of dollars (PwC 2022). This surge signifies a growing recognition among investors of the need to integrate environmental, social, and governance factors into their investment strategies. Moreover, empirical evidence suggests that companies demonstrating robust ESG performance consistently outperform their counterparts in terms of financial returns, risk mitigation, and long-term viability (Horton and Jessop 2022). By embracing ESG considerations in their investment approaches, stakeholders not only align their portfolios with ethical values but also potentially enhance their financial performance.
Furthermore, as highlighted by PwC (2022), ESG-focused institutional investment is projected to soar by 84% to USD 33.9 trillion in 2026, constituting 21.5% of assets under management. This substantial rise underscores the growing importance of ESG factors in driving investment decisions and reflects a significant shift in investor preferences towards sustainable and responsible investing practices.

Enhancing Data Science Techniques
While this study relies on Python programming and statistical methods, it is useful to briefly outline the data science techniques commonly used in investment analysis. Notably, machine learning algorithms such as decision trees, random forests, and support vector machines play a pivotal role in predictive modeling for investment. Decision trees offer a transparent and interpretable framework for decision-making by recursively partitioning data into segments based on feature values (Singh 2024). Random forests, which are ensembles of decision trees, handle large datasets well and mitigate overfitting, thus providing robust predictions (Singh 2024). Support vector machines (SVMs), on the other hand, are effective in classification tasks, separating data points into different classes through the use of hyperplanes in high-dimensional space (Singh 2024). These algorithms enable investors to analyze vast datasets and uncover intricate patterns that inform prudent investment decisions.
Furthermore, clustering techniques like K-means clustering or hierarchical clustering are instrumental in portfolio optimization. K-means clustering partitions data points into clusters based on similarity, enabling investors to group assets with comparable characteristics (Singh 2024). Hierarchical clustering, on the other hand, constructs a tree-like hierarchy of clusters, facilitating a deeper understanding of asset relationships and diversification strategies (Singh 2024). By grouping similar assets, investors can construct diversified portfolios that aim to minimize risk while maximizing returns. Incorporating these data science techniques into investment analysis not only enhances decision-making precision but also enables investors to seize market opportunities more effectively.
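As an illustration of how K-means can group assets with comparable characteristics, the following is a minimal sketch of Lloyd's algorithm written with NumPy. It is not the procedure used in this study: the function name `kmeans`, the choice of two features per asset (annualised return and volatility), and all numbers are hypothetical and chosen only to make the grouping easy to see.

```python
import numpy as np

def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct observations.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every asset to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical assets described by (annualised return, annualised volatility).
assets = np.array([
    [0.05, 0.10], [0.06, 0.12], [0.04, 0.11],   # low-risk group
    [0.15, 0.35], [0.18, 0.40], [0.16, 0.38],   # high-risk group
])
labels, centroids = kmeans(assets, k=2)
print(labels)
```

In practice the features would usually be standardised first, since K-means is sensitive to the scale of each input.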

Research Objectives
This study aims to explore the limited adoption of data science techniques in investment decision-making processes, which persists despite the availability of extensive financial data and technological advancements. It seeks to identify the inefficiencies inherent in traditional investment methods characterized by subjective judgments and limited quantitative analysis, thereby highlighting the potential for data-driven approaches to enhance decision precision and effectiveness. Additionally, the research endeavors to dissect the complexity of financial markets influenced by various factors such as economic indicators and geopolitical events, with a focus on leveraging data science techniques to uncover hidden patterns and insights within large datasets. Furthermore, it aims to underscore the importance of alpha generation as a primary objective for investment professionals in competitive markets and elucidate how innovative data-driven strategies can offer new avenues for identifying unique investment opportunities and generating alpha.
Explicitly addressing these objectives contributes to the existing body of knowledge by filling gaps identified in previous research. While traditional investment approaches have relied heavily on subjective judgments, this study recognizes the need to integrate data science techniques to overcome inherent inefficiencies and enhance decision-making accuracy. By shedding light on the untapped potential of data-driven methodologies in investment processes, the research offers insights into how modern technology can revolutionize traditional paradigms and contribute to alpha generation strategies.
Furthermore, through dissecting the complexity of financial markets and leveraging data science to uncover hidden patterns, the study contributes to a deeper understanding of market dynamics, addressing a gap in previous research by providing investors with actionable insights derived from rigorous quantitative analysis. This exploration of market intricacies enhances the potential for alpha generation. By highlighting the potential for data-driven approaches to identify unique investment opportunities and generate alpha, the research offers practical solutions to enhance investment performance in competitive markets.
Finally, the integration of financial data science techniques, including hypothesis testing and time series models, provides deeper insights into market dynamics and empowers investors to make informed decisions. This aspect of the study bridges the gap between traditional investment paradigms and modern data-driven methodologies, offering investors the tools and knowledge necessary to navigate financial landscapes with greater confidence and effectiveness while generating alpha.

1.5. Review of Existing Literature

1.5.1. Descriptive Statistics in Data Analysis
Dong (2023) emphasizes the pivotal role of descriptive statistics in simplifying complex datasets and facilitating meaningful conclusions for research and decision-making processes. Measures such as mean, median, standard deviation, and quartiles offer a concise summary of data and enable further analysis through univariate, bivariate, and multivariate methods (Dong 2023).

1.5.3. Financial Time Series Analysis
Chakraborti et al. (2007) contribute to financial time series analysis by highlighting the significance of stochastic methods. Their study emphasizes the prevalence of noisy and chaotic processes in financial data and the complexities involved in modeling asset price variations, shedding light on the challenges faced in this domain (Chakraborti et al. 2007).

Predictive Analysis Using Social Media Sentiments
A study by Asgarov (2023) explores the relationship between social media sentiments and stock prices. Employing LSTM neural networks for predictive analysis, the study utilizes sentiment analysis from Twitter data to predict stock price movements accurately, offering insights into the potential of social media sentiment analysis in financial forecasting (Asgarov 2023).

1.5.5. Data-Driven Investment Strategies
Ye (2021) investigates investment strategies for Peer-to-Peer (P2P) lending platforms, employing econometric modeling and machine learning techniques. The study provides valuable insights for investors in the P2P lending space, identifying optimal investment strategies through data-driven approaches (Ye 2021).

1.5.6. Environmental, Social, and Governance Factors in Investment Decisions
Cohen (2023) discusses the growing awareness of ESG factors in investment decisions and their impact on corporate value. This study highlights the shift towards sustainable practices in the financial market and emphasizes the importance of considering ESG factors in investment strategies (Cohen 2023).

1.5.7. Bibliometric Review of Environmental, Social, and Governance Factors and Risk
De Giuli et al. (2024) conduct a bibliometric review, analyzing literature on ESG factors and risk, identifying trends and key research areas. Their study underscores the critical role of finance in promoting sustainable growth and emphasizes the need for further research in understanding ESG risks and their implications for financial markets (De Giuli et al. 2024).

Data-Driven Approaches in Stock Market Investment
A study by Narukulla (2022) discusses the advantages of leveraging big data technologies and AI in stock market investment. The study highlights the transformative impact of these technologies on stock markets and financial services, paving the way for more sophisticated investment strategies (Narukulla 2022).

1.5.9. Integration of Artificial Intelligence and Data Science in Finance
Farooq and Chawla (2021) provide a comprehensive overview of the integration of AI and data science in the finance industry. Their study showcases the transformative impact of AI on financial services, highlighting significant cost-saving projections and operational improvements (Farooq and Chawla 2021).

1.5.10. Environmental, Social, and Governance Strategies and Dividend Payout Policies
Niccolò et al. (2020) analyze the role of ESG strategies in dividend payout policies, finding a negative impact of ESG practices on dividend payout. This suggests a trade-off between sustainability investments and dividend payments, highlighting the complexities involved in balancing these aspects (Niccolò et al. 2020).

1.5.11. Environmental, Social, and Governance Risk and Firm Value
Stiadi (2023) examines the role of ESG risk in moderating the relationship between investment decisions and firm value. This study underscores the importance of considering ESG factors in investment decisions and their implications for firm value (Stiadi 2023).

1.5.12. Factors Affecting Environmental, Social, and Governance Considerations and Their Investment Impact
Utilizing interpretive structural modeling, Aich et al. (2021) explore factors affecting ESG considerations that have an investment impact. Their study emphasizes the role of good governance in driving investment impact, providing insights into the structural relationships among influential factors (Aich et al. 2021).

1.5.13. Correlation between Environmental, Social, and Governance Ratings and Financial Variables
Gupta et al. (2021) discuss the correlation between ESG ratings and financial variables, demonstrating the importance of ESG parameters for investment decisions. Their study highlights the influence of ESG ratings on financial performance, offering insights into the relationship between sustainability and financial outcomes (Gupta et al. 2021).

1.5.14. Analyses and Tests Using the Vector Autoregressive Model
Akkaya (2021) explores different analyses and tests conducted using the Vector Autoregressive (VAR) model to gain insights into the relationships and interactions between variables. Their study contributes to the understanding of the dynamics within the VAR model, shedding light on its applications in financial analysis (Akkaya 2021).

1.5.15. Environmental, Social, and Governance Disclosure by Firms
Examining ESG disclosure by firms, Ehlers et al. (2022) provide insights into the challenges and potential benefits of focusing on specific ESG themes. Their study highlights the implications of ESG disclosure for investors and emphasizes the importance of considering ESG factors in investment decisions (Ehlers et al. 2022).

Developing an Investment Idea
Data-driven investing is a methodological approach where investment strategies are developed and implemented based on rigorous statistical analysis. This involves formulating hypotheses, collecting and pre-processing data, estimating relevant measures, and testing these hypotheses using empirical data. In the context of exploring the impact of environmental, social, and governance (ESG) factors on investment returns and risk, it is essential to structure our research questions and hypotheses carefully. ESG factors are often quantified and reported as an ESG Risk Score or ESG Score. A higher (lower) ESG Risk Score typically indicates weaker (stronger) governance, environmental, and social responsibility practices within a company. This quantification enables investors to assess and compare the ESG performance of different companies within their investment universe. It is common practice to express investment ideas as research questions to guide the analysis effectively.
Research Question 1: What is the relationship between corporate governance, environmental, and social responsibility and stock returns/risk? Are these factors correlated with financial performance, and, if so, to what extent?
Research Question 2: Assuming a relationship exists, can it be leveraged to achieve abnormal returns, i.e., returns that exceed what would be expected given the level of risk taken?
To address these questions, we formulate the following hypotheses:

Hypothesis 1 (H1):
There exists a positive relationship between ESG risk and expected returns.
In other words, companies with a higher ESG Risk Score may be associated with higher expected returns due to the perceived risk premium associated with their environmental, social, and governance practices.

Hypothesis 2 (H2):
There exists a positive relationship between ESG risk and total risk.
This hypothesis posits that companies with a higher ESG Risk Score may also exhibit higher levels of total risk, including both systematic and idiosyncratic risk factors.

Hypothesis 3 (H3):
The returns of firms with higher ESG risk are statistically greater than the returns of firms with lower ESG risk.
This hypothesis tests whether companies with a higher ESG Risk Score outperform those with a lower ESG Risk Score, suggesting a potential market inefficiency that can be exploited for investment gains.

Hypothesis 4 (H4): A long-short ESG risk strategy yields abnormal returns.
This hypothesis tests whether one can earn a higher return than the market portfolio by 'going long' in stocks with high ESG risk and 'shorting' those with low ESG risk.
By rigorously testing these hypotheses using appropriate statistical techniques and empirical data, we aim to gain insights into the relationship between ESG factors and investment performance and assess the feasibility of data-driven strategies for generating abnormal returns in the financial markets.

Sourcing Relevant Data: Data Collection and Pre-Processing
Testing our hypotheses necessitates two types of data: stock price data for analyzing returns and risk, and ESG data as a proxy for governance, environmental, and social responsibility factors. Obtaining this data is subject to certain constraints to ensure the validity and reliability of our analysis. These constraints include budget considerations, requiring data that are freely accessible; sample size requirements, necessitating a large dataset in terms of both time series and cross-sections; and the need for replicability, ensuring that results can be replicated within the same dataset.
Sourcing stock price data involves accessing information from various sources, both free and paid. Examples of such sources include NASDAQ Data Link (formerly Quandl), Pandas Datareader, and Google Finance via Google Sheets. For our analysis, we utilize the full sample of S&P500 stocks, ensuring a large dataset in both the time series and cross-section dimensions. The S&P500 index encompasses a broad selection of 500 leading companies in the United States, representing various sectors of the economy, including technology, healthcare, finance, and consumer goods. This diverse composition provides a robust representation of the overall market performance and is widely used for benchmarking and investment analysis purposes. When working with stock price data, it is essential to address outliers that may skew the analysis. Various pre-processing techniques, such as data normalization, handling missing values, and detecting and removing outliers, are employed to ensure the integrity of the data.
The S&P500 dataframe consists of 2367 rows and 506 columns. These columns represent various companies included in the S&P500 index, such as 3M (MMM), Abbott Laboratories (ABT), AbbVie Inc. (ABBV), and many more. The timeframe of the data spans from 2012 to 2021. Figure 1 gives an overview of the S&P500 stock price data in the form of a pandas dataframe. This extensive timeframe allows for a comprehensive analysis of stock price movements and trends over nearly a decade. The dataset provides valuable insights into the historical performance of a diverse range of companies across different sectors, serving as a reliable foundation for investment analysis and decision-making.
ESG data, on the other hand, can be obtained from sources such as Sustainalytics via Yahoo! Finance. ESG data provide insights into a company's environmental, social, and governance practices, allowing investors to evaluate its sustainability and ethical performance. The ESG dataset includes metrics such as the ESG Risk Score, Environment Risk Score, Social Risk Score, Governance Risk Score, and Controversy Level. These metrics reflect different aspects of a company's ESG performance and can vary over time. However, it is important to recognize the limitations of ESG data, particularly its cross-sectional nature, which reflects ESG metrics at a specific point in time. This implies an assumption that ESG risk is not time-varying, which may introduce limitations and biases into our analysis. Additionally, using ESG data as is may lead to look-ahead bias, because scores observed in 2021 are applied to returns that date back to 2012, compromising the integrity of our results. Acknowledging and addressing these limitations is crucial for ensuring the robustness and validity of our analysis. By incorporating both S&P500 and ESG data, our analysis encompasses a comprehensive understanding of the relationship between financial performance and ESG factors, enabling informed investment decision-making.
The ESG dataframe contains 505 rows and 8 columns. These columns include 'Symbol', 'Security', 'ESG Risk Score', 'Environment Risk Score', 'Social Risk Score', 'Governance Risk Score', 'Controversy Level', and 'Data Date'. Figure 2 gives an overview of the ESG data in pandas dataframe format. The 'ESG Risk Score' provides an overall assessment of a company's ESG performance, while the individual risk scores for environment, social, and governance factors offer insights into specific areas of concern. The 'Controversy Level' indicates the degree of controversy associated with the company's ESG practices. The timeframe of the data spans from 1 March 2021 to 1 May 2021, capturing a snapshot of ESG metrics during this period. This dataset serves as a crucial component in evaluating the ESG performance of companies and its implications for investment decision-making. Figure 3 illustrates the expected return, total risk, and ESG Risk Score, and Figure 4 explores correlations across all variables via a correlation matrix represented in a pandas dataframe.
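The expected return and total risk summarised per stock can be derived from the price panel in several ways, and the text does not state which definitions were used. The sketch below shows one common convention under explicit assumptions: simple daily returns, the sample mean as expected return, the sample standard deviation as total risk, and annualisation with 252 trading days. The function name `expected_return_and_risk` and the toy prices are hypothetical.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualisation factor, not stated in the text

def expected_return_and_risk(prices):
    """prices: 2-D array with one row per trading day and one column per stock."""
    prices = np.asarray(prices, dtype=float)
    # Simple daily returns: P_t / P_{t-1} - 1.
    daily_returns = prices[1:] / prices[:-1] - 1.0
    # Annualised mean return and annualised volatility per stock.
    expected_return = daily_returns.mean(axis=0) * TRADING_DAYS
    total_risk = daily_returns.std(axis=0, ddof=1) * np.sqrt(TRADING_DAYS)
    return expected_return, total_risk

# Hypothetical prices for two stocks over four days:
# the first grows 1% per day, the second is flat.
prices = np.array([
    [100.0, 50.0],
    [101.0, 50.0],
    [102.01, 50.0],
    [103.0301, 50.0],
])
expected_return, total_risk = expected_return_and_risk(prices)
print(expected_return, total_risk)
```

The resulting per-stock vectors can then be aligned with the ESG Risk Score by ticker symbol before any correlation or regression is run.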

Statistical Techniques and Time Series Analysis
In our analysis, we employ various statistical techniques and time series models to investigate the relationships between ESG factors and investment returns across all hypotheses. Initially, we utilize statistical techniques such as the t-statistic for correlation analysis, np.corrcoef from the numpy library of Python for calculating correlation coefficients, and the Pearson correlation coefficient (pearsonr) to measure the linear correlation between ESG Risk Score and expected returns for Hypothesis 1 and total risk for Hypothesis 2 (Abdey 2023; Cleff 2019; Lin et al. 2019). These techniques provide insights into the strength and direction of the relationship between ESG factors and financial performance for each hypothesis.
Furthermore, we apply Ordinary Least Squares (OLS) regression analysis to examine the impact of ESG risk on expected returns for Hypothesis 1, total risk for Hypothesis 2, and investment returns for Hypotheses 3 and 4 (Abdey 2023; Buckle and Beccalli 2011; Lin et al. 2019). OLS regression allows us to estimate the relationship between ESG Risk Score and financial performance measures, providing insights into the potential predictive power of ESG factors in determining investment outcomes and abnormal returns.
To assess whether ESG factors help predict investment returns (causality in the Granger sense) across all hypotheses, we conduct Granger causality tests using Vector Autoregression (VAR) models. Prior to this, we test for stationarity in the data using the Augmented Dickey-Fuller (ADF) test (Lin et al. 2019; Qi et al. 2022). Granger causality tests help determine whether past values of ESG Risk Score provide significant information for predicting future investment returns or total risk, thereby elucidating the dynamics between ESG factors and financial performance.
Additionally, for Hypothesis 3, which posits that returns of firms with higher ESG risk are statistically greater than the returns of firms with lower ESG risk, we sort ESG data into quintiles based on ESG Risk Score. This approach allows us to compare the investment returns of companies across different quintiles of ESG risk, providing empirical evidence to support or refute the hypothesis. For Hypothesis 4, which suggests that a long-short ESG risk strategy yields abnormal returns, we utilize the Capital Asset Pricing Model (CAPM) framework to test for alpha (Abdey 2023; Buckle and Beccalli 2011; Cleff 2019). By estimating alpha using the CAPM and incorporating OLS regression, we can evaluate whether the long-short ESG risk strategy yields abnormal returns, thus contributing to our understanding of the effectiveness of ESG factors in generating alpha in investment portfolios.
Each of these tests and models plays a role in our analysis, providing a framework for investigating the relationships between ESG factors and investment returns, assessing causality, and evaluating the potential for alpha generation within the context of ESG investing across all hypotheses. Figure 5 shows a block diagram of the proposed methodology.
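The quintile sort and the long-short construction described for Hypotheses 3 and 4 can be sketched as follows. This is an illustrative outline only: it assumes equal weighting within each quintile, uses a single cross-section of returns rather than a full time series, and all tickers, scores, and returns are made up. The helper name `quintile_portfolios` is hypothetical.

```python
def quintile_portfolios(esg_scores, returns):
    """Sort stocks by ESG Risk Score and return the mean return of each quintile.

    esg_scores, returns: dicts keyed by ticker (at least five tickers).
    The first element of the result is Q1 (lowest ESG risk), the last is Q5.
    """
    ranked = sorted(esg_scores, key=esg_scores.get)
    n = len(ranked)
    quintile_returns = []
    for q in range(5):
        members = ranked[q * n // 5:(q + 1) * n // 5]
        # Equal-weighted average return of the stocks in this quintile.
        quintile_returns.append(sum(returns[t] for t in members) / len(members))
    return quintile_returns

# Hypothetical universe of ten stocks.
esg_scores = {f"S{i}": 10 + 5 * i for i in range(10)}
returns = {f"S{i}": 0.01 * i for i in range(10)}

quintiles = quintile_portfolios(esg_scores, returns)
# Long the highest-ESG-risk quintile (Q5), short the lowest (Q1).
long_short = quintiles[4] - quintiles[0]
print(quintiles, long_short)
```

Repeating this calculation for every period yields a time series of long-short returns that can then be tested for alpha.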

t-Stat Correlation
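Testing H1 and H2 requires the following formula:
$$t = \frac{\rho\,\sqrt{n-2}}{\sqrt{1-\rho^{2}}}$$
where:
• ρ is the correlation coefficient between the two variables;
• n is the sample size.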

Variables
We investigate the relationship between environmental, social, and governance (ESG) risk factors and investment outcomes by conducting separate t-tests for each hypothesis at a significance level of 5%.
For H1:
• Dependent variable: expected returns (returns);
• Independent variable: ESG Risk Score (ESG).
For H2:
• Dependent variable: total risk (risk);
• Independent variable: ESG Risk Score (ESG).

Model Specification
For H1: We test the hypothesis that there exists a positive relationship between ESG risk and expected returns using the following t-test equation:
$$t\mathrm{Stat}_{\rho_{\mathrm{ESG,returns}}} = \frac{\rho_{\mathrm{ESG,returns}}\,\sqrt{n-2}}{\sqrt{1-\rho_{\mathrm{ESG,returns}}^{2}}}$$
where:
• tStat ρ ESG,returns represents the t-statistic for the correlation coefficient (ρ ESG,returns);
• ρ ESG,returns is the correlation coefficient between ESG risk and expected returns;
• n is the sample size.
For H2: We test the hypothesis that there exists a positive relationship between ESG risk and total risk using a similar t-test equation:
$$t\mathrm{Stat}_{\rho_{\mathrm{ESG,risk}}} = \frac{\rho_{\mathrm{ESG,risk}}\,\sqrt{n-2}}{\sqrt{1-\rho_{\mathrm{ESG,risk}}^{2}}}$$
where:
• tStat ρ ESG,risk represents the t-statistic for the correlation coefficient (ρ ESG,risk);
• ρ ESG,risk is the correlation coefficient between ESG risk and total risk;
• n is the sample size.

Interpretation
For each hypothesis, a statistically significant t-statistic with a p-value of less than 0.05 would indicate a meaningful relationship between ESG risk and the respective variable, providing evidence to support the hypothesis.
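To make the decision rule concrete, the sketch below computes the t-statistic of a correlation coefficient, t = ρ√(n−2)/√(1−ρ²), and compares it with a 5% critical value. Several choices here are assumptions rather than details from the study: the test is taken to be one-sided (because the hypotheses posit a positive relationship), the critical value uses a normal approximation to the t distribution, which is reasonable only for large samples, and the inputs ρ = 0.10 and n = 500 are hypothetical.

```python
import math
from statistics import NormalDist

def corr_t_stat(rho, n):
    """t-statistic for a correlation coefficient rho estimated from n observations."""
    return rho * math.sqrt(n - 2) / math.sqrt(1.0 - rho ** 2)

# Hypothetical inputs, roughly the size of an S&P500 cross-section.
rho, n = 0.10, 500
t_stat = corr_t_stat(rho, n)

# One-sided 5% critical value under a normal approximation (assumption).
critical = NormalDist().inv_cdf(0.95)
reject_null = t_stat > critical
print(round(t_stat, 3), round(critical, 3), reject_null)
```

With a few hundred stocks, even a modest correlation can be statistically significant, which is why the magnitude of ρ should be read alongside the test result.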

Conclusion
By conducting separate t-tests for each hypothesis and analyzing the correlation between ESG risk and investment outcomes at a 5% significance level, we aim to contribute to the understanding of the impact of ESG considerations on investment performance and risk management.

Correlation Analysis (Pearson Correlation Coefficient)
Correlation analysis, often utilizing the Pearson correlation coefficient, assesses the strength and direction of the linear relationship between two variables. It provides insights into how changes in one variable correspond to changes in another, aiding in understanding potential associations within the data.
$$r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^{2} - \left(\sum X_i\right)^{2}\right]\left[n\sum Y_i^{2} - \left(\sum Y_i\right)^{2}\right]}}$$
where:
• X i and Y i are individual data points;
• ∑ denotes the sum across all data points;
• n is the sample size.
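For H1: We test the hypothesis that there exists a positive relationship between ESG risk and expected returns using the following Pearson correlation coefficient formula:
$$r_{\mathrm{ESG,returns}} = \frac{n\sum \mathrm{ESG}_i\,\mathrm{returns}_i - \sum \mathrm{ESG}_i \sum \mathrm{returns}_i}{\sqrt{\left[n\sum \mathrm{ESG}_i^{2} - \left(\sum \mathrm{ESG}_i\right)^{2}\right]\left[n\sum \mathrm{returns}_i^{2} - \left(\sum \mathrm{returns}_i\right)^{2}\right]}}$$
where:
• r ESG, returns is the Pearson correlation coefficient between ESG risk and expected returns;
• ESG i and returns i represent the individual data points of ESG risk and expected returns, respectively;
• ∑ denotes the sum across all data points;
• n is the sample size.
For H2: We test the hypothesis that there exists a positive relationship between ESG risk and total risk using the following similar Pearson correlation coefficient formula:
$$r_{\mathrm{ESG,risk}} = \frac{n\sum \mathrm{ESG}_i\,\mathrm{risk}_i - \sum \mathrm{ESG}_i \sum \mathrm{risk}_i}{\sqrt{\left[n\sum \mathrm{ESG}_i^{2} - \left(\sum \mathrm{ESG}_i\right)^{2}\right]\left[n\sum \mathrm{risk}_i^{2} - \left(\sum \mathrm{risk}_i\right)^{2}\right]}}$$
where:
• r ESG, risk is the Pearson correlation coefficient between ESG risk and total risk;
• ESG i and risk i represent the individual data points of ESG risk and total risk, respectively;
• ∑ denotes the sum across all data points;
• n is the sample size.
For H3: To test the hypothesis regarding the significance of differences in returns between different ESG risk quintiles, we calculate the correlation between the returns of the first quintile (Q1) and the fifth quintile (Q5) ESG portfolios. This is carried out using the following Pearson correlation coefficient formula:
$$r_{\mathrm{Q1-Q5}} = \frac{n\sum \mathrm{Q1}_i\,\mathrm{Q5}_i - \sum \mathrm{Q1}_i \sum \mathrm{Q5}_i}{\sqrt{\left[n\sum \mathrm{Q1}_i^{2} - \left(\sum \mathrm{Q1}_i\right)^{2}\right]\left[n\sum \mathrm{Q5}_i^{2} - \left(\sum \mathrm{Q5}_i\right)^{2}\right]}}$$
where:
• r Q1-Q5 is the Pearson correlation coefficient between the returns of the Q1 and Q5 ESG portfolios;
• Q1 i and Q5 i represent the individual data points of returns for the Q1 and Q5 portfolios, respectively;
• ∑ denotes the sum across all data points;
• n is the sample size.
For H4: To test the hypothesis that a long-short ESG risk strategy yields abnormal returns, we calculate the correlation between the returns of the long-short ESG portfolio and the market returns. This is carried out using the following Pearson correlation coefficient formula:
$$r_{\mathrm{Long\text{-}Short,Market}} = \frac{n\sum \mathrm{LS}_i\,\mathrm{Market}_i - \sum \mathrm{LS}_i \sum \mathrm{Market}_i}{\sqrt{\left[n\sum \mathrm{LS}_i^{2} - \left(\sum \mathrm{LS}_i\right)^{2}\right]\left[n\sum \mathrm{Market}_i^{2} - \left(\sum \mathrm{Market}_i\right)^{2}\right]}}$$
where:
• r Long-Short, Market is the Pearson correlation coefficient between returns of the long-short ESG portfolio and market returns;
• LS i and Market i represent the individual data points of returns for the long-short ESG portfolio and market returns, respectively;
• ∑ denotes the sum across all data points;
• n is the sample size.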
These formulas are used to assess the relationships between different portfolio returns and market returns, providing insights into the effectiveness of the long-short ESG strategy.

Interpretation
For each hypothesis, a Pearson correlation coefficient (r) close to 1 would indicate a strong positive linear relationship, while a coefficient close to −1 would suggest a strong negative linear relationship. A coefficient close to 0 would indicate no linear relationship between the variables.
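The interpretation can be checked on a small example. The sketch below implements the Pearson coefficient in its computational (sums-based) form, which needs only the data points, their sums, and the sample size n. The function name `pearson_r` and the toy ESG scores and risk values are hypothetical; in practice a library routine such as NumPy's `corrcoef` would normally be used instead.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient via the computational (sums-based) form."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: total risk rises linearly with the ESG Risk Score.
esg = [10.0, 20.0, 30.0, 40.0, 50.0]
risk = [0.15, 0.20, 0.25, 0.30, 0.35]

r_positive = pearson_r(esg, risk)                   # close to +1
r_negative = pearson_r(esg, list(reversed(risk)))   # close to -1
print(r_positive, r_negative)
```

Real cross-sections are far noisier than this toy data, so coefficients well inside the interval (−1, 1) are the norm.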

Ordinary Least Squares Regression Analysis
Ordinary Least Squares (OLS) regression analysis is a statistical method used to estimate the relationship between one or more independent variables and a dependent variable. It minimizes the sum of the squared differences between the observed and predicted values, providing insights into the strength and significance of the relationships among variables.
Mathematical formulation: The Ordinary Least Squares (OLS) regression equation for a simple linear regression model with one independent variable x is:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
where:
• y is the dependent variable;
• x is the independent variable;
• β 0 is the intercept (constant term);
• β 1 is the slope coefficient;
• ε is the error term (residuals).
For H1: The Ordinary Least Squares (OLS) regression equation for testing the hypothesis regarding the relationship between ESG risk and expected returns is given by: where: • Expected Returns is the dependent variable; • ESG Risk is the independent variable; • β 0 is the intercept (constant term); • β 1 is the slope coefficient; • ε is the error term (residuals).
For H2: The Ordinary Least Squares (OLS) regression equation for testing the hypothesis regarding the relationship between ESG risk and total risk is given by: where: • Total Risk is the dependent variable; • ESG Risk is the independent variable; • β 0 is the intercept (constant term); • β 1 is the slope coefficient; • ε is the error term (residuals).
For H3: The Ordinary Least Squares (OLS) regression equation for testing the hypothesis regarding the differences in returns between different ESG risk quintiles is given by: where: • Returns is the dependent variable; • ESG Quintile is the independent variable representing different ESG risk quintiles; • β 0 is the intercept (constant term); • β 1 is the slope coefficient; • ε is the error term (residuals).For H4: The Ordinary Least Squares (OLS) regression equation for testing the hypothesis regarding abnormal returns generated by a long-short ESG risk strategy is given by: where: • Returns is the dependent variable; • Long-Short ESG Portfolio is the independent variable representing the long-short ESG portfolio; • β 0 is the intercept (constant term); • β 1 is the slope coefficient; • ε is the error term (residuals).
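The simple regressions above all share the closed-form OLS solution β_1 = S_xy / S_xx and β_0 = ȳ − β_1·x̄. The sketch below implements that solution with the standard library only. The helper name `ols_simple` and the ESG risk and return values are hypothetical and chosen purely for illustration.

```python
def ols_simple(x, y):
    """Fit y = beta0 + beta1 * x by ordinary least squares.

    Returns (beta0, beta1, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((b - (beta0 + beta1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return beta0, beta1, 1.0 - ss_res / ss_tot

# Hypothetical cross-section: ESG risk scores and mean daily returns.
esg_risk = [12.0, 18.5, 22.0, 27.5, 33.0, 40.0]
expected_return = [0.0009, 0.0008, 0.0008, 0.0006, 0.0006, 0.0004]

beta0, beta1, r2 = ols_simple(esg_risk, expected_return)
print(f"intercept = {beta0:.6f}, slope = {beta1:.3e}, R^2 = {r2:.3f}")
```

A full analysis would also need standard errors and p-values for the coefficients, which a statistics package provides and this sketch omits.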

Vector Autoregression
Vector Autoregression (VAR) is a statistical method used to model the dynamic relationship between multiple time series variables. It extends the concept of simple autoregression to multivariate data, allowing for the simultaneous analysis of interdependencies among variables over time. A VAR model of order p is written as:

$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \cdots + \Phi_p Y_{t-p} + e_t$$

where:
• $Y_t$ is a vector of endogenous variables at time t;
• c is a vector of intercept terms;
• $\Phi_i$ are coefficient matrices for lag i (for i = 1, 2, ..., p);
• $e_t$ is a vector of error terms at time t.

Model Specification
The general VAR form is estimated separately for each hypothesis:
For H1: We test the hypothesis that there exists a positive relationship between ESG risk and expected returns.
For H2: We test the hypothesis that there exists a positive relationship between ESG risk and total risk.
For H3: We test the hypothesis regarding the significance of differences in returns between different ESG risk quintiles.
For H4: We test the hypothesis that a long-short ESG risk strategy yields abnormal returns.

Interpretation
For each hypothesis, the VAR model estimates the coefficients $\Phi_{ESG}$ that capture the dynamic relationship between the endogenous variables and the ESG variable. A statistically significant coefficient would indicate a meaningful relationship between ESG risk and the respective variable, providing evidence to support the hypothesis.

Conclusion
By employing Vector Autoregression (VAR) models for each hypothesis, we aim to capture the dynamic interactions between ESG risk and investment outcomes, thereby enhancing our understanding of the impact of ESG considerations on portfolio performance and risk management.
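To make the VAR estimation step concrete, here is a minimal sketch that fits a VAR(p) equation by equation with least squares, using NumPy. It is not the study's implementation: the function name `fit_var`, the simulated two-variable system, and its "true" coefficients are assumptions made only so the example can check that the estimator recovers known values.

```python
import numpy as np

def fit_var(Y, p=1):
    """Estimate Y_t = c + Phi_1 Y_{t-1} + ... + Phi_p Y_{t-p} + e_t by OLS.

    Y has shape (T, k). Returns the intercept vector c (length k) and a
    list of p coefficient matrices, each of shape (k, k).
    """
    Y = np.asarray(Y, dtype=float)
    T, k = Y.shape
    # Each regressor row is [1, Y_{t-1}, ..., Y_{t-p}] for t = p, ..., T-1.
    X = np.hstack([np.ones((T - p, 1))] +
                  [Y[p - i:T - i] for i in range(1, p + 1)])
    B, *_ = np.linalg.lstsq(X, Y[p:], rcond=None)  # shape (1 + k*p, k)
    c = B[0]
    Phi = [B[1 + i * k:1 + (i + 1) * k].T for i in range(p)]
    return c, Phi

# Simulate a bivariate VAR(1) with known (made-up) parameters.
rng = np.random.default_rng(0)
true_c = np.array([0.1, -0.2])
true_Phi = np.array([[0.5, 0.1],
                     [0.0, 0.3]])
T = 2000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = true_c + true_Phi @ Y[t - 1] + rng.normal(scale=0.1, size=2)

c_hat, Phi_hat = fit_var(Y, p=1)
print("estimated intercepts:", np.round(c_hat, 3))
print("estimated Phi_1:\n", np.round(Phi_hat[0], 3))
```

In practice the lag order p would be chosen with an information criterion and the series checked for stationarity before estimation; both steps are left out here.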

Granger Causality Test (Vector Autoregression)
Mathematical formulation: The Granger causality test assesses whether one time series variable, X, Granger-causes another time series variable, Y. In the context of VAR models, the test involves estimating two models for Y: a restricted model that includes only lagged values of Y, and an unrestricted model that includes lagged values of both X and Y. An F-test then compares the fit of the two models; if the lagged values of X significantly improve the fit, X is said to Granger-cause Y.
For H1: We test the hypothesis that there exists a causal relationship between ESG risk and expected returns.
For H2: We test the hypothesis that there exists a causal relationship between ESG risk and total risk.
For H3: We test the hypothesis regarding the causal impact of ESG risk quintiles on portfolio returns.
For H4: We test the hypothesis that a long-short ESG risk strategy causally affects market returns.

Interpretation
For each hypothesis, the Granger causality test assesses whether ESG risk Granger-causes the respective variable of interest. A statistically significant result would indicate that past values of ESG risk help predict the variable, which we interpret as evidence of a causal relationship in the Granger sense and as support for the hypothesis.

Conclusion
By employing the Granger causality test within the framework of Vector Autoregression (VAR) models for each hypothesis, we aim to determine the direction and significance of causal relationships between ESG risk and investment outcomes, thereby enhancing our understanding of the impact of ESG considerations on portfolio performance and risk management.
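The F-test behind a Granger causality check can be sketched as a comparison of two nested regressions: one with only the target's own lags, and one that adds lags of the candidate cause. The code below is an illustrative NumPy version under stated assumptions: the names `lagged`, `rss`, and `granger_f` are hypothetical, and the data are simulated so that x drives y by construction.

```python
import numpy as np

def lagged(series, p):
    """Columns series_{t-1}, ..., series_{t-p}, aligned with t = p, ..., T-1."""
    T = len(series)
    return np.column_stack([series[p - i:T - i] for i in range(1, p + 1)])

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f(x, y, p=1):
    """F-statistic for H0: lags of x do not help predict y.

    Returns (F, numerator df, denominator df).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[p:]
    const = np.ones((len(target), 1))
    X_restricted = np.hstack([const, lagged(y, p)])            # own lags only
    X_unrestricted = np.hstack([X_restricted, lagged(x, p)])   # plus lags of x
    rss_r = rss(X_restricted, target)
    rss_u = rss(X_unrestricted, target)
    df_den = len(target) - X_unrestricted.shape[1]
    F = ((rss_r - rss_u) / p) / (rss_u / df_den)
    return F, p, df_den

# Simulated example: x feeds into y with a one-period lag, not the reverse.
rng = np.random.default_rng(1)
T = 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

F_xy, dfn, dfd = granger_f(x, y, p=1)
F_yx, _, _ = granger_f(y, x, p=1)
print(f"x -> y: F = {F_xy:.2f} on ({dfn}, {dfd}) df")
print(f"y -> x: F = {F_yx:.2f}")
```

The p-value follows from the upper tail of the F distribution with the returned degrees of freedom. A rejection means only that lagged x improves the forecast of y, not that x is a structural cause of y.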
2.3.6. Quintile Analysis (Sorting Environmental, Social, and Governance Data into Quintiles)
For hypothesis H3, our objective is to employ quintile analysis as a methodology for sorting ESG (environmental, social, and governance) data into quintiles based on their corresponding risk scores. This analytical approach involves dividing the dataset into five equal parts (quintiles), each representing 20% of the data, based on the ascending order of ESG risk scores. By categorizing the data in this manner, we aim to examine and compare investment returns across different levels of ESG risk. This enables us to investigate whether there are statistically significant variations in returns between portfolios with distinct degrees of ESG risk exposure. Through quintile analysis, we seek to provide insights into the relationship between ESG risk and investment performance, thereby contributing to a better understanding of sustainable investment strategies.
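The sorting step can be illustrated with a short standard-library sketch. The ten tickers, their ESG risk scores, and their mean returns are invented for the example, and `assign_quintiles` is a hypothetical helper. Quintile 1 holds the lowest ESG risk scores, following the ascending order described above.

```python
from collections import defaultdict

def assign_quintiles(scores):
    """Map each name to a quintile 1..5 by ascending score (1 = lowest risk)."""
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    return {name: (rank * 5) // n + 1 for rank, name in enumerate(ranked)}

# Hypothetical ESG risk scores and mean daily returns for ten made-up tickers.
esg_risk = {"AAA": 12.1, "BBB": 35.4, "CCC": 18.7, "DDD": 27.9, "EEE": 9.8,
            "FFF": 41.2, "GGG": 22.3, "HHH": 15.0, "III": 30.6, "JJJ": 25.5}
mean_return = {"AAA": 0.0009, "BBB": 0.0004, "CCC": 0.0008, "DDD": 0.0006,
               "EEE": 0.0010, "FFF": 0.0003, "GGG": 0.0007, "HHH": 0.0008,
               "III": 0.0005, "JJJ": 0.0006}

quintile = assign_quintiles(esg_risk)

# Equally weighted average return of the firms in each quintile.
grouped = defaultdict(list)
for name, q in quintile.items():
    grouped[q].append(mean_return[name])
quintile_return = {q: sum(v) / len(v) for q, v in sorted(grouped.items())}

for q, ret in quintile_return.items():
    print(f"Q{q}: {ret:.5f}")
```

With a DataFrame of scores, `pandas.qcut` with five bins performs the same bucketing; the rank-based rule here is one simple way to keep the groups equal in size.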

Capital Asset Pricing Model for Alpha Estimation
The Capital Asset Pricing Model (CAPM) is a fundamental framework in finance used to estimate the expected return on an asset based on its systematic risk. It assumes that investors are rational, risk averse, and have access to homogeneous information. According to the CAPM, the expected return of an asset is determined by the risk-free rate and the market risk premium, which is proportional to the asset's beta coefficient, representing its sensitivity to market movements. The CAPM provides insights into the risk-return relationship of individual assets within a diversified portfolio, aiding investors in making informed investment decisions and constructing efficient portfolios.
Testing and validating H4: Using the CAPM framework, we can test for alpha (i.e., test H4) using the following model:

$$r_{Q_i t} - r_f = \alpha + \beta\,(r_{mt} - r_f) + \varepsilon_{it}$$

where:
• $r_{Q_i t}$ = return on a quintile ESG portfolio i at time t;
• α = intercept term (abnormal return in this context);
• β = slope;
• $r_{mt}$ = return on the market portfolio;
• $r_f$ = risk-free rate;
• $\varepsilon_{it}$ = error term (residuals).
Based on these mathematical formulations outlined for each hypothesis, we establish a robust foundation for incorporating them into programming through various data science, time series analysis (TSA), and statistical analysis techniques. By leveraging these methodologies, we aim to conduct rigorous empirical analyses that enable us to test our hypotheses effectively. Through the application of programming languages and relevant libraries for statistical computing, we implement these methodologies on real-world datasets, allowing us to derive meaningful insights into the relationships between ESG factors and investment outcomes. This integration of mathematical formulations with programming methodologies enables us to conduct comprehensive and data-driven research, facilitating a deeper understanding of the impact of ESG considerations on investment performance and risk management.
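As one illustration of how the CAPM regression translates into code, the sketch below regresses excess portfolio returns on excess market returns and scales the daily intercept to an annual figure. Several things are assumptions rather than details from the study: the helper name `capm_alpha_beta`, the synthetic return series, and the annualization by 252 trading days (the paper does not state how its alpha was annualized).

```python
def capm_alpha_beta(portfolio, market, risk_free):
    """Estimate alpha and beta in (r_p - r_f) = alpha + beta * (r_m - r_f) + e."""
    ex_p = [p - f for p, f in zip(portfolio, risk_free)]
    ex_m = [m - f for m, f in zip(market, risk_free)]
    n = len(ex_p)
    mean_p = sum(ex_p) / n
    mean_m = sum(ex_m) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(ex_m, ex_p))
    var = sum((m - mean_m) ** 2 for m in ex_m)
    beta = cov / var
    alpha = mean_p - beta * mean_m
    return alpha, beta

TRADING_DAYS = 252  # assumed annualization factor for daily returns

# Synthetic daily data built so that alpha = 0.0002 and beta = 0.5 exactly.
market = [0.010, -0.005, 0.003, 0.007, -0.002]
risk_free = [0.0001] * len(market)
portfolio = [f + 0.0002 + 0.5 * (m - f) for m, f in zip(market, risk_free)]

alpha, beta = capm_alpha_beta(portfolio, market, risk_free)
annualized_alpha = alpha * TRADING_DAYS
print(f"daily alpha = {alpha:.6f}, beta = {beta:.3f}, "
      f"annualized alpha = {annualized_alpha:.4f}")
```

Simple multiplication is one common annualization convention; compounding the daily alpha is another, and the two give slightly different figures.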

Testing Hypothesis 1: There Exists a Positive Relationship between Environmental, Social, and Governance Risk and Expected Returns
In our analysis, we employ several statistical techniques and time series models to investigate the relationships between ESG factors and investment returns across all hypotheses. Firstly, correlation analysis reveals a modest negative linear relationship between expected return and ESG Risk Score, with a correlation coefficient of approximately −0.1997, as shown in Figure 6. This suggests that, as the ESG Risk Score increases, the expected return tends to decrease, contrary to the hypothesized positive relationship between the two variables. The t-statistic for the correlation, approximately −4.324, indicates that this negative relationship is statistically significant. The Pearson correlation coefficient calculated using the pearsonr method is likewise approximately −0.1997, with a p-value of approximately 1.89 × 10^−5; the very low p-value confirms that the result is statistically significant.

Furthermore, the Ordinary Least Squares (OLS) regression results indicate that ESG Risk Score has a statistically significant negative coefficient when regressed against expected return. The coefficient estimate suggests that, for every unit increase in ESG Risk Score, expected return decreases by approximately 9.394 × 10^−6 units, corroborating the negative relationship observed in the correlation analysis.

Finally, the Granger causality test conducted using Vector Autoregression (VAR) models examines whether ESG risk has predictive content for returns. The test fails to reject the null hypothesis that ESG Risk Score does not Granger-cause expected return at the 5% level of significance, suggesting that past values of ESG Risk Score do not significantly predict future expected returns.

Overall, the correlation analysis and the OLS regression consistently indicate a significant negative relationship between ESG risk and expected returns, contrary to the hypothesized positive relationship, while the Granger causality test finds no evidence that past ESG risk predicts future returns. These findings highlight the importance of thorough empirical analysis in understanding the complex dynamics between ESG factors and financial performance in investment decision-making.

Testing Hypothesis 2: There Exists a Positive Relationship between Environmental, Social, and Governance Risk and Total Risk
In investigating the relationship between ESG risk and total risk (Hypothesis 2), we employ the same statistical techniques and time series models. Figure 7 shows the relationship between expected return and total risk. The correlation analysis reveals a positive correlation coefficient of approximately 0.092 between total risk and ESG Risk Score, as shown in Figure 8, which suggests a weak positive linear relationship between the two variables. The associated t-statistic is approximately 1.97, with a p-value of approximately 0.049, indicating statistical significance at the 5% level. The Pearson correlation test likewise yields a coefficient of approximately 0.092, with a p-value of approximately 0.0495, further supporting the presence of a significant, though weak, correlation between total risk and ESG Risk Score.

Furthermore, an Ordinary Least Squares (OLS) regression is conducted to estimate the impact of ESG Risk Score on total risk. The regression results show a coefficient of approximately 5.894 × 10^−5 for ESG Risk Score, with a p-value of approximately 0.100, indicating at most marginal statistical significance (at the 10% level). The R-squared value of the regression model is approximately 0.005, suggesting that only a small proportion of the variation in total risk can be explained by the ESG Risk Score.

Additionally, we conduct a Granger causality test using the Vector Autoregression (VAR) model to assess whether past values of ESG Risk Score provide significant predictive power for future total risk. We fail to reject the null hypothesis at the 5% significance level, with a test statistic of approximately 0.028 and a p-value of approximately 0.867. This suggests that past values of ESG Risk Score do not Granger-cause changes in total risk, implying limited predictive power of ESG factors for total risk fluctuations.

Overall, the findings from the correlation analysis, OLS regression, and Granger causality test collectively suggest a weak positive relationship between ESG risk and total risk. While there is evidence of some association between the variables, the statistical significance and predictive power of ESG Risk Score for total risk are limited. These results underscore the importance of considering additional factors and employing robust analytical techniques in assessing the impact of ESG factors on investment risk profiles.

Given the inverse relationship found between ESG risk and expected returns, Hypothesis 3 requires revision, because the original hypothesis no longer holds. We now anticipate that firms with higher ESG risk will yield lower returns than those with lower ESG risk. Thus, the updated Hypothesis 3 posits that the returns of firms with lower ESG risk will be statistically greater than the returns of firms with higher ESG risk.
3.3. Testing Hypothesis 3 (Updated): Returns of Firms with Lower Environmental, Social, and Governance Risk Are Statistically Greater Than the Returns of Firms with Higher Environmental, Social, and Governance Risk
To assess Hypothesis 3 (Updated), we construct portfolios of firms categorized into higher and lower ESG risk segments and then test statistically for a difference between the returns of these portfolios. By segmenting firms into distinct ESG risk categories, we can compute the average return for each category on a daily basis, which corresponds to the return of an equally weighted ESG portfolio. Portfolio performance can be explored visually by estimating cumulative returns and plotting them. Figures 9 and 10 show total risk and expected return, respectively, across all quintile portfolios. Figure 11 displays the cumulative returns of portfolios grouped by ESG risk quintiles, as per the revised Hypothesis 3. Each line represents the cumulative returns of a specific quintile portfolio over time, with different colors distinguishing the quintile portfolios. The upward trajectory of the lines indicates positive cumulative returns, and the plot suggests that firms with lower ESG risk (the lower quintiles, such as Q1) tend to outperform those with higher ESG risk (the upper quintiles, such as Q5) throughout the analysis period.

The statistical analyses give a consistent picture. Firstly, the t-test for the difference in returns between the lower ESG risk (Q1) and higher ESG risk (Q5) portfolios yields a t-statistic of 1.721, which is above the one-sided 5% critical value of about 1.645 for large samples, though below the two-sided 5% threshold of 1.96. This supports the one-sided hypothesis that companies with lower ESG risk outperform those with higher ESG risk. Furthermore, the OLS regression of Q1 returns on Q5 returns gives a coefficient of 0.8684 for Q5: for every one unit increase in Q5 returns, the mean return of Q1 increases by 0.8684, showing that the two quintile portfolios move closely together. The R-squared value of 0.840 indicates that Q5 explains 84.0% of the variability in Q1's returns. Additionally, the Granger causality test rejects the null hypothesis at the 5% significance level, implying that past values of the ESG risk quintile returns help predict their future values.

Together, these findings underscore the importance of considering ESG factors in investment decision-making. Firms with lower ESG risk profiles not only exhibit superior performance, but historical ESG risk levels also provide predictive power for future performance. The evidence from the t-statistic, OLS regression, and Granger causality test therefore supports the hypothesis that firms with lower ESG risk profiles tend to outperform those with higher ESG risk profiles, which offers actionable guidance for investors and portfolio managers who wish to integrate ESG considerations into their strategies to enhance performance and manage risk.

To analyze Hypothesis 4, we incorporate S&P500 market return data along with quintile returns and obtain the risk-free rate. Additionally, we construct long-minus-short portfolios from the quintile portfolios (Q1 − Q5) and combine all relevant data into one dataframe. This allows us to assess the effectiveness of the long-short ESG risk strategy in generating abnormal returns relative to the market and the risk-free rate. Firstly, the correlation between the long-minus-short ESG portfolio and both the market return and the excess market return is approximately −0.0535. This weak negative correlation suggests that the portfolio's returns are not strongly influenced by overall market movements. Secondly, the Ordinary Least Squares (OLS) regression yields an R-squared value of 0.003. Although only a small portion of the variability in the long-minus-short ESG portfolio is explained by the excess market return, the coefficient for the excess market return is statistically significant (p < 0.01), which implies that there is a relationship between the excess market return and the performance of the long-minus-short ESG portfolio. Moreover, the Granger causality test indicates that the long-minus-short ESG portfolio does not Granger-cause the excess market return at the 5% level, with a p-value of 0.095. The portfolio's performance therefore does not significantly help predict the future behavior of the excess market return.

Lastly, the annualized alpha is approximately 4.37%. This indicates that the long-short ESG strategy generates abnormal returns beyond what would be expected based on the Capital Asset Pricing Model (CAPM), underscoring the potential profitability of the strategy. Overall, these findings support the notion that the long-short ESG strategy yields abnormal returns, offering investors a potential avenue for achieving enhanced returns while integrating environmental, social, and governance considerations into their investment strategies.
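The two portfolio calculations used in these tests, the long-minus-short return series and the cumulative returns plotted for each portfolio, can be sketched in a few lines. This is an illustration under assumptions: the short return series are invented, the helper names are hypothetical, and the long leg is taken to be the lowest ESG risk quintile (Q1) with the highest (Q5) as the short leg.

```python
def long_minus_short(long_returns, short_returns):
    """Per-period return of a portfolio long one leg and short the other."""
    return [l - s for l, s in zip(long_returns, short_returns)]

def cumulative_returns(returns):
    """Compound simple per-period returns into a cumulative return path."""
    growth = 1.0
    path = []
    for r in returns:
        growth *= 1.0 + r
        path.append(growth - 1.0)
    return path

# Invented daily returns for the lowest (Q1) and highest (Q5) ESG risk quintiles.
q1_returns = [0.01, 0.02, -0.01]
q5_returns = [0.00, 0.01, 0.01]

ls_returns = long_minus_short(q1_returns, q5_returns)
ls_cumulative = cumulative_returns(ls_returns)

for day, (r, c) in enumerate(zip(ls_returns, ls_cumulative), start=1):
    print(f"day {day}: long-short return = {r:+.4f}, cumulative = {c:+.6f}")
```

The resulting long-short series is what would then be correlated with, and regressed on, the excess market return.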

Discussion
Given the results obtained, it is crucial to consider their concordance with the existing literature in the field. Initially, our hypothesis posited a positive correlation between ESG risk and expected returns. However, empirical testing unveiled a negative correlation, deviating from our anticipated outcome. This observation contrasts with prior research, notably studies by Dong (2023) and Aydoğmuş et al. (2022), which emphasize the beneficial impact of specific ESG factors on investment returns. Conversely, Hypothesis 2 shed light on a weak positive correlation between ESG risk and total risk. This correlation was accompanied by a positive coefficient in the OLS regression, although that coefficient was only marginally significant (p ≈ 0.100), lending qualified support to the hypothesis. This observation aligns with the conclusions drawn in studies such as those by Cohen (2023) and Gupta et al. (2021), which emphasize the impact of ESG factors on financial performance. Furthermore, the results of Hypothesis 3 revealed a significant difference in returns attributable to ESG considerations, aligning with the research by Stiadi (2023) and De Giuli et al. (2024), which emphasizes the importance of considering ESG factors in investment decision-making processes. Lastly, for Hypothesis 4, the long-short ESG portfolio showed a weak negative correlation with market returns while producing a positive annualized alpha of 4.37%. This finding is intriguing and warrants further investigation, as it suggests that, even where a negative correlation exists, there may still be opportunities for alpha generation through data-driven ESG strategies. Overall, comparing our findings to the existing literature provides valuable insights into the complex relationship between ESG factors and investment performance, highlighting areas of agreement and divergence that merit further exploration and analysis.
While our study contributes valuable insights, several avenues for further research and exploration in the field of ESG investing warrant attention. Future research endeavors could delve deeper into understanding the underlying mechanisms driving the observed relationships between ESG factors and investment performance. This may entail conducting qualitative research to explore the qualitative aspects of ESG performance and their financial implications, providing richer context to complement quantitative analysis. Longitudinal studies could be pursued to analyze the long-term impact of ESG integration on investment outcomes, offering insights into the sustainability of ESG-driven investment strategies over extended time horizons. Additionally, future studies could focus on developing more sophisticated models and frameworks for integrating ESG factors into investment decision-making processes. This could involve exploring multi-factor models that combine ESG metrics with traditional financial indicators, as well as investigating alternative data sources and analytical techniques for assessing ESG performance. Furthermore, research efforts could be directed towards exploring the implications of emerging ESG trends, such as climate change considerations and diversity and inclusion metrics, on investment performance. As the field of ESG investing continues to evolve, further research and innovation will be essential to advance our understanding and application of ESG principles in the financial markets, ultimately contributing to more sustainable and responsible investment practices.

Conclusions
In conclusion, our study has provided detailed insights into the intricate relationship between environmental, social, and governance (ESG) factors and investment returns, as well as the efficacy of a long-short ESG strategy in generating abnormal returns, resulting in an annualized alpha of approximately 4.37%. The findings summarized in Table 1 offer a comprehensive overview of the hypothesis testing conducted in our study.

Summary of Key Findings
Firstly, for Hypothesis 1, our analysis revealed a negative correlation between ESG risk and expected returns (approximately −0.20), together with a negative coefficient in the OLS regression. This is the opposite of the positive relationship that the hypothesis posited, and the Granger causality test failed to reject its null hypothesis. Consequently, Hypothesis 1 was not supported.
Conversely, Hypothesis 2 highlighted a weak positive correlation between ESG risk and total risk (approximately 0.092). The OLS regression produced a positive coefficient that was only marginally significant (p ≈ 0.100), and the Granger causality null hypothesis was not rejected. Hypothesis 2 was therefore supported, though only weakly, suggesting a potential link between ESG risk and total risk.
Building upon these findings, the updated Hypothesis 3 posited a significant difference in returns attributable to ESG considerations. Our analysis confirmed a significant positive coefficient in the OLS regression, leading to the rejection of the null hypothesis and providing support for Hypothesis 3. This result underscores the importance of considering ESG factors in investment decision-making processes and highlights the potential for these factors to impact financial performance significantly.
Furthermore, Hypothesis 4 examined whether a long-short ESG risk strategy yields abnormal returns. The long-minus-short portfolio showed a weak negative correlation with market returns (approximately −0.0535), the OLS coefficient on the excess market return was statistically significant, and the Granger causality null hypothesis was not rejected (p = 0.095). The hypothesis was supported by a positive annualized alpha of 4.37%, suggesting that, even where a negative correlation exists, there may still be opportunities for alpha generation through data-driven ESG strategies, portfolio construction, and risk management.

Investor-Focused Strategic Recommendations for Environmental, Social, and Governance Integration
For investors interested in incorporating environmental, social, and governance (ESG) factors into their investment strategies, it is essential to undertake a comprehensive approach. Firstly, investors should conduct thorough research to understand the ESG metrics relevant to their investment goals, considering industry-specific standards and best practices. Secondly, integrating ESG data into the decision-making process is crucial, including utilizing ESG ratings, reports, and analysis to assess the sustainability performance of potential investments. Diversification of ESG investments across various asset classes, sectors, and geographical regions can help manage risk and capture opportunities effectively while also promoting portfolio resilience. Moreover, active engagement with companies through shareholder activism and direct dialogue enables investors to advocate for positive ESG practices and drive meaningful change. It is also important for investors to stay informed about emerging trends and regulatory developments in the ESG space so they can adapt their strategies accordingly. Additionally, aligning investment decisions with personal values ensures consistency with ethical beliefs and long-term commitment to responsible investing. Regular monitoring of the impact of ESG investments over time by tracking key performance indicators (KPIs) is vital for assessing progress and making necessary adjustments. Seeking professional advice from financial advisors or specialists in ESG investing can provide tailored guidance for optimizing portfolios based on individual preferences and objectives. By following these detailed recommendations, investors can effectively integrate ESG factors into their investment strategies, contributing to positive social and environmental outcomes while pursuing financial objectives.
Informed Consent Statement: Not applicable.
1.5.2. Impact of Environmental, Social, and Governance Performance on Firm Value
Aydoğmuş et al. (2022) delve into the impact of environmental, social, and governance (ESG) performance on firm value and profitability, revealing a positive relationship between ESG scores and firm profitability.

Figure 1. An overview of the S&P500 stock price data.

Figure 2. An overview of the ESG data.

Figure 3. Expected return, total risk, and ESG Risk Score overview.

Figure 5. Block diagram of proposed methodology.

Figure 9. Total risk across all quintile portfolios.

Figure 10. Expected return across all quintile portfolios.

Figure 11. Cumulative returns of portfolios grouped by ESG risk quintiles.
3.4. Testing Hypothesis 4: A Long-Short Environmental, Social, and Governance Risk Strategy Yields Abnormal Returns

Table 1. Summary of hypothesis testing results.