Public Sentiment Toward Solar Energy: Opinion Mining of Twitter Using a Transformer-Based Language Model

Public acceptance and support for renewable energy are important determinants of renewable energy policies and market conditions. This paper examines public sentiment toward solar energy in the United States using data from Twitter, a micro-blogging platform in which people post messages, known as tweets. We filtered tweets specific to solar energy and performed a classification task using Robustly optimized Bidirectional Encoder Representations from Transformers (RoBERTa). Analyzing 71,262 tweets during the period of late January to early July 2020, we find public sentiment varies significantly across states. Within the study period, the Northeastern U.S. region shows more positive sentiment toward solar energy than did the Southern U.S. region. Solar radiation does not correlate to variation in solar sentiment across states. We also find that public sentiment toward solar correlates to renewable energy policy and market conditions, specifically, Renewable Portfolio Standards (RPS) targets, customer-friendly net metering policies, and a mature solar market.


Introduction
ABSTRACT Public acceptance and support for renewable energy are important determinant s of renewable energy policies and market conditions.This paper examine s public sentiment toward solar energy in the United States using data from Twitter , a micro-blogging platform in which people post messages, known as tweets.We filtered tweets specific to solar energy and performed a classification task using Robustly optimi zed Bidire ctional Encoder Representation s from Transformers (RoBERTa).Analyzing 71,262 tweets during the period of late January to ear ly July 2020 , we find public sentiment varies significantly across state s.Within the study period, the Northeastern U.S. region shows more positive sentiment toward solar energy than did the Southern U.S. region.Solar radiation doe s not corre late to variation in solar sentiment acro ss states.We also find that public sentiment toward solar correlates to renewab le energy policy and market conditions , specifically, Renewable Portfolio Standards (RPS) targets, customer-fr iendly net metering policies, and a mature solar market.
For the first time in U.S. history, solar energy recently became more cost efficient than coal energy in the long term [36].Solar energy contributes to greenhouse gas (GHG) emission reduction, offers new business opportunities to landowners and energy providers, and allows utility customers to lower their utility bills.De spite these advantages, only about 2% of U.S. electricity generation currently emanates from solar [64].The low proportion of energy generated from solar may be attributed to multiple factors, including skepticism surrounding financial benefits, system reliability, and conflicts of interest with respec t to utility revenue preservation [73].
Although public sentiment and preferences regarding renewable energy have been studied in the last decade [ 43,59,20], few studies have documented public perception of solar energy specifically.Public opinion on solar energy is worth investigating separately because different renewable sources such as geothermal, hydroel ectricity, wind, and bioenergy have unique advantages and requirements depending on the existing climate conditions and state-speci fic energy policies.
Furthermore, most of the existing literature relies on surveys and interviews to understand public perception of renewable energy.Although surveys and interviews can provide targeted and relevant data , they can be susceptible to selection and respons e biase s [9].For example, people who are more supportive of renewable energy may be more likely to respond to surveys and interviews , which may yield bias. Social media is an increasin gly popular mean s to express opinions and preference s and can thus provide valuable information for understanding public opinion on solar energy.
In this paper, our first aim is to under stand public opinion using data from Twitter, a micro -blogging platform in which people can post and interact with me ssages known as "twee ts" .With over 50 million active users in the United States alone [2], Twitter provides an ideal platform for opinion mining as extensive amounts of data are collected across a wide range of demographics and geograp hical location s.To achieve this aim, we utilized Robustl y optimiz ed Bidirectional Encoder Repre sentation s from Transformer s (RoBERTa) [38] based on recent development in the fields of Natural Language Proces sing (NLP) and Machine Learning (ML).RoBERTa possesses an extensive pre-training phase which can later be fine-tuned for a domain-specific task, yielding highly accurate results.Our second aim in this paper is to examine whether the energy policies and market conditions explain public opinion on solar energy.In particular , we focus on state-level characteristics, including renewable energy generation capacity, renewable energy portfolio standards (RPS), net metering, renewable energy incentive s, and solar market maturity.
The results suggest that public opinion on solar energy varies widely across states and is more likely to be positive in states with aggressive RPS targets , customer-friendly net metering rules , and a more mature solar market.This study has policy implications in addressing spatial disparities in public support for solar energy and future opportunities for solar energy deployment.In particular, states which implemented specific renewable energy incentives are more supportive of solar energy, which may have implication s for solar deployment.We also discuss the benefits of using social media as a source of data on public opinion and applying RoBERTa for sentiment classification in gauging public sentiment toward solar energy.

Background
Public acceptance and awareness of renewable energy are critical to the development of renewable energy industries and technology [52].On one hand, lack of institutional, political, and community support is often identified as a key barrier to renewable energy development [73 ].Landscape modification and visual intrusion of facilities are highlighted as main reasons for residents' opposition to renewable energy [6,71 ].On the other hand , the contribution of renewabl e energy to the local and regional economy positively influences public support for renewable energy [47 , 7].Not surprisingly, individuals ' opinion and preferences regarding renewable energy are highly associated with their belief and perception of global warming , climate change , and environmental risk [16,43].Views on renewable energy are also determined by personal characteristics such as education, party identification , and age [20,56].
Government policies and public opinion on renewable energy may relate to each other in a bidirectional sense.Public acceptance and support for renewable energy may affect renewable energy policy adoption, which in turn encourages renewable energy deployment.Renewable energy programs and financial incentives can also mitigate uncertainties in energy transition , garnering public support and acceptance for renewable energy.For example , RPS policy design and framing strongly influence broad public support for renewable energy technologies [59].Additionally, net metering, or an electricity billing policy allowing renewable energy system owners (e.g., residential, commercial, and indu strial buildings) to use the renewable energy generated, has been shown to facilitate solar energy deployment [12,13].A well-designed , transparent net metering policy can mitigate market and revenue uncertaintie s and help gain support from key stakeholders , including utilities , solar businesses , and customers [5], despite compensation rate limitations negatively impacting solar energy preferences after reaching a certain tipping point [12].
Public beliefs about the effectiveness of the policy-making and planning process is an important predi ctor of public sentiment toward renewable energy.Public perceptions of fairness in decision makin g about the siting of renewable energy facilities can mitigate public resistance to renewable energy deployments [71].Having tru st in the policy makers responsible for renewable energy development has a direct effect on publi c opinion [25].Greater openness to information sharing about alternative energy options can enhance public support for renewable energy [78].However , lack of a common understanding about the planning proce ss contributes to public reluctance of renewable energy adoption [8].Public education and outreach around administrative and technolo gical aspects of solar energy can help reverse negative perceptions of renewable energy development [53,25].Leveraging social networks and facilitating participation in the planning and development proces s can also enhance public trust in solar energy development [44].
Electricity market characteristics and conditions also likely influence public opinion on renewable energy.Public preferences depend on the price of conventional electricity because renewable energy is considered an alternative to conventional electricity generation.While the availability of low cost fossil fuel-generated electricity can make it difficult to justify renewable energy development [10], increased energy prices can enhance public acceptance of renewable energy [61].In addition, high upfront costs and lack of adequate financing options are major barri ers to public support for renewable energy [44,25 ].Solar busine sses contribute to local economic development , by lowerin g solar installation costs and creating new jobs , which in turn leads to publi c support for renewable energy development.
Existing literature also documents temporal and spatial variations in preferences and opinions regarding renewable energy.Public opinion about renewable energy changes over time [66,1].For example, Hamilton et al. [20] finds that public acceptance of renewable energy shows clear upward trends from 2011 to 2018.Historic al events (e.g., energy shortages and electricity price increa ses) can affect public awareness about renewable energy technolog y [l].Public acceptance of renewable energy also varies geographically, by communities [11], states [37,20], and countries [58].Spatial and geographical characteristics, such as region-specific culture [4 ], solar radiation [57], and energy autonomy of the region [58], influence spatial variations in public opinion.
Most empirical studies have used surveys and interviews to measure public opinion , sentiment, awareness, and perception of renewable energy.The literature finds broad public support for renewable energy across the United States [59], Finland [33], Mexico [19], Spain [54], South Korea [29], Portugal [60], Greece [24], and worldwide [52,72].Surveys and interviews have advantages in gauging individual-level demographic information , such as gender, education, income [34 ], distance to renewable energy facilities, and previous experience with renewable energy technologies [58], which is one of the key determinants of individual s' preference s regarding renewable energy.
However, surveys and interviews are limited in gauging temporal dynamics and geographical variations in public opinions.Researchers are beginning to use social media, especially Twitter, to examine public sentiment toward renewable energy [l , 23 , 46, 37].Jain and Jain [23] compares five different machine learning technique s for sentiment analysis and finds that the Support Vector Machine (SVM) achieves higher accuracy than K-Nearest Neighbor, Naive Bayes, AdaBoost, and Bagging algorithms for sentiment classification on renewable energy related tweet s.Using both traditional and social media for opinion mining , Nuortimo and Harkonen [46 ] find that public opinion on solar and wind has been the most positive compared to other energy sources, including coal, nuclear , and bioma ss.Using Twitter data from 2014 to 2016, Abdar et al. [1] finds that Alaskans' energy preferences have become more supportive of renewable energy over time.

Materials and methods
Our data are from two main sources: Twitter [63] and the U.S. federal and state government agencies.We use Twitter data for sentiment analysis, or opinion mining, of solar energy.Twitter has been a valuable source for opinion mining, but the manual classification of a large amount of tweets is difficult and time-consuming.Thus, we use NLP and ML methods to detect public opinion on solar energy automatically.We discuss our data collection and preprocessing processes in the following section.Section 3.2 outlines recent developments and approaches in opinion mining.Section 3.3 explains our sentiment classification model built upon RoBERTa.Finally, section 3.4 summarize s energy policy and market data used in this study.

Twitter data collection and pre-processing
Twitter Application Program Interface (API) was used to collect tweets, which are posts created by individuals on Twitter, specific to solar energy.Tweepy, a python library for accessing the Twitter API is used to stream live tweets in real-time [63].We used ten keywords to stream live tweet s, including 'so lar energy', 'so lar panel ', 'so lar PV ', ' solar photovoltaic', 'solar battery ', ' solar thermal' , 'solar power ', 'solar-powered ', 'solar generation', and ' solar subsidies.'In total, 406,811 tweets specific to solar energy were collected between January and early July of 2020.
We removed URLs, "RT (ReTweets)", and images from original tweets using the library preprocessing [41 ], which aids proces sing text strings.We identified a list of words that make a tweet irrelevant to public sentiment toward solar energy.These words include 'Pokemon ', 'Superman', 'galaxy ', 'e clipse ', 'solar plexu s', 'solar-powered human ', and 'I will become your sun.'We excluded 16,245 tweets that included these words.We also excluded the tweets that included the 10 key words (e.g., 'solar energy ', 'solar panel ', 's olar PV' , ' solar photovoltaic ') only in the user identification (i.e., screen name and description) but not in the text, quoted text, or extended text.We excluded 64,253 tweets through this proce ss .
Not all tweets include geographic information that is essential to this study.Only about 40 % of the tweets have geographic information either where they are based or where users tweeted.Amon g this 40 %, about half of the tweets are from countries other than the United States .For the purpo se of this study, we only need the tweet s tweeted by users with geoloc ation s associated with the United State s and/or the tweets by the user s who identify themselves based in the U.S. Thus we extracted tweets with geographic information based on the self-reported locations in the user profiles as well as the latitude/longitude coordinates using Carmen, a library for geolocating tweets [15].Our final dataset includes 71,262 unlabelled tweets as a result of this extraction process.
We randomly selected 5,122 of the 71,262 tweets and manuaIJy classified them into one of the two groups: "po sitive toward solar" or "negative toward solar."Tweets that include job announcements and product advertisement in the solar industry are classified into the positive class as these tweet s support solar energy industry.Retweet s are included since retweeting is a way to express individuals' support or opposition to solar energy.We did not include a neutral class because there were not enough tweets (8% of collected tweets) which we would consider neutral.In this case, including a neutral class could have resulted in lower performance or prediction accuracy [68,48].The tweets classified as neutral were deleted from the manually annotated dataset(5,122 tweets).Table 1 illustrates examples of tweets belonging to the two classes , positive and negative solar sentiment.

Related opinion mining approaches
Sentiment Positive Negative ML approaches have been widely applied to opinion mining, also known as sentiment analysis.Opinion mining is the computational treatment of opinions, sentiments, and subjectivity of text [28].Sentiment classification techniques based on NLP and ML have been widely applied for opinion mining in di verse areas, including public health [79 ], movie reviews [27], airline service [67], political news [26], and the COVID-19 pandemi c [17,40 ].Sentiment classification is a sub-discipline of text classification , which is concerned with classifying a text to a class for analyzing opinion or sentiment in texts.Although human emotions and intent s are highly complex in real life, the current state-of-the-art sentiment analysis has achieved higher performance with a simpler classification task, such as classifying texts into two (e.g., positive and negative ) or three categories (e.g., positive , negative, and neutral).
Recent developments in NLP have produced effective ways for automatic sentiment analysis using ML algorithms [50].Lexicon-based approaches using supervised and unsupervised ML algorithms, including Nai"ve Bayes and Support Vector Machines (SVM) , have generated high accuracy in sentiment cla ssification.They have gained wide popularity in opinion mining [21].More recently , deep learning approaches including Gated Recurrent Units (GRUs), long short-term memory (LSTM), and Convolutional Neural Networks (CNN) have started using non-static word embedding and have shown even higher performan ce results than the previous ML approaches in multipl e sentiment classification tasks [39,49].
Although neural network models have established higher accuracy with their automated feature learning , the application of such models is often limited by their heavy reliance on manually annotated data for trainin g language models [50].Within neural network models , considerable progress has been achieved by the models using the Transformer architecture, which is based on an self-attention mechanism [65].While such advantages have given rise to the development of new models , BERT (Bidirectional Encoder Representations from Transformers) which uses contextual sentences and word-embeddings to address the limitations of RNNs and LSTMs , has notably improved the performance of sentiment analysis [14 ].BERT is also the first large-scale bidirectional unsupervised way of training the language model by using two main strategies, namely as "masked language model " (MLM) and "next sentence prediction " .A number of BERT-based model s have been developed since 2018 including Disti!BERT [55], AIBERT [35], XLNet [74], and RoBERTa [38].One of the key advantages of the BERT-based model s is that users do not need a large corpus of texts to train their models.Users only need to fine-tune a BERT-based model using area/ta sk-specific supervised training data (e.g., manually annotated tweets) because BERT is already pre-trained on large corpus from Wikipedia and books, while utilizing more rich contextual information.The sentence-level vectors are more convenient for downstream NLP tasks, including sentiment classification, detecting sentence similarity, and learning word vector embedding.

Our approach: RoBERTa-based sentiment classification
Our sentiment classification model is based on RoBERTa , a Robustly Optimized BERT [38].Like any other BERT-based models , RoBERTa is powerful as it pretrains the language model by a bidirectional representation of words , meaning that the model is not restricted to reading texts either right-to-left or left-to-right.This deep bidir ectional pre-training structure provides the model with more information about context.RoBERTa also achieves higher accuracy than all other previous BERT-based models including BERT , BERT-Large, and XLNet.This higher accuracy is achieved by four main modifications, including pre-training the language model with 8-times larger batches over 10-times more data; training on 5-times longer sequences; using Byte-Pair Encoding (BPE) vocabulary instead of the character-level vocabulary; removing the next sentence prediction (NSP); dynamically changing the masking pattern applied to the training data instead of static masking [38].Thu s, we chose RoBERTa as our base model, as it is considered the state-of-the-art sentiment classification approach as of January 2020.
In total , we manually annotated 5,122 tweets without duplicates, and the annotated tweets were used to construct our train (80 %), development (10%) and test (10%) sets.As shown in Figure l , we fine-tuned RoBERTa using 4,097 annotated tweets.During the annotation process, we detected sarcastic expre ssions (e.g., "Solar power ?Yeah, right.That is a top concern for millions of Americans during COVID " .)and classified the tweets including such expres sions into negative class.
We used the HuggingFace transformer library [70], which includes the standard RoBERTa-based architecture with 12 hidden states and 12 attention heads, but with some modifications to the hyper-parameter s, the parameter s in which values are used to control the learning process in machine learning.We also used the AdamW optimi zer [32] to minimize the cross-entropy loss.To optimize the parameters, we conducted a number of experiments testing multiple combinations of the hyper-parameters on the development set (10% of the annotated tweets).In our final model , we fine-tuned RoBERTa with a learning rate of 6e-6, an € of le-8 , and dropout of 0.1.We set the maximum sequence length to 128 tokens, used a batch size of 16 and train for 4 epochs on a Tesla PlO0 GPU.Using this final set of hyperparameters, our model achieves 90.7% accuracy with an Fl score of 0.927 on the test set (another 10% of annotated tweets).

Renewable energy policy and market
The second aim of this study is to examine associations between public sentiment on solar energy and renewable energy policy and market characteristics.The characteristics examined in this study include renewable energy generation, RPS , net metering , the number of existing renewable energy incentives and policies , electricity price , and solar energy market maturity.

Renewable generation
A state's existing renewable energy generation capacity may be positively associated with public acceptance of solar energy.Renewable generation is measured as the percentage of renewable energy (excluding hydroelectric energy) generated in a state in Megawatt-hours as of 2019.Electricity generation data are from the U.S. Energy Information Administration (EIA) [ 64].Distributed energy generation ( e.g., rooftop solar) is not included in the dataset due to a lack of nationwide data on distributed energy generation capacity.

RPS
RPS policy is a state-level mechanism that requires utilities to generate or purchase a certain percentage of energy from renewable sources.Although RPS policies share common characteri stics, the design features vary widely across states.Existing studies have developed and used different RPS measures , including a binary measure of whether the state has RPS policies [75], a RPS percentage target [22], an interaction between the percentage target and the target year [31], marginal RPS targets [30], and a RPS target combined with "free trade " of Renewable Energy Certificates (RECs) [77].In order to take into account such variation s, we construct a mea sure for RPS as follows: -----------' TargetYeari -2019 (1) States that already achieved a RPS target or do not have RPS policies as of 2019 received a score of 0. RPS data as of 2019's Quarter 4 are from state government agencies.Iowa and Texas do not have a target year, but both states already achieved RPS target and received a score of 0.

Net metering
Net metering data are from the North Carolina Clean Energy Technology Center [3].Some states have enacted net metering policies and other distributed generation (DG) compensation rules to allow electricity customers generatin g electricity (e.g., rooftop solar) to use that generated energy at any time.Solar system owner s send the excessive energy back to the electricity grid and are compensated at a specific rate.DG compensation policies provide substantial incentives for solar energy generation as commercial and residential building owners can use solar energy at night, generated during the sunny or cloudy days.Net metering policies can vary significantly in terms of design and features.Thus, we construct an additive measure capturing five key features:

Renewable incentives
States which enacted more renewable energy policie s and incentives may have greater public support for renew able energy.Thus, we included the number of renewable energy policies and incentive s by states as of October 2019.The data were collected from the database of State Incentives for Renewable & Efficiency [42 ].

Solar market maturity
Public sentiment on solar energy may be more positive in states with a more mature solar market as thi s creates jobs and supports local economic development.We measure solar market maturity by the number of solar indu stry jobs per 1 million people in the state population in 2019.The data wer e obtained from the National Solar Job Cen sus [62].

Electricity price
Higher electricity prices may be associated with higher public sentiment toward solar energy because distributed renewable energy generation helps reduce electricity bills in places with high electricity rates [61,10].Thus, we include in our analysis the average price of electricity to ultimate residential customers by the state as of August 2019.The electricity data are from the U.S. Energy Information Administration (EIA) [64].

7. Solar radiation
The level of solar radiation positively predicts solar photovoltaic system installations [57 ] and solar energy generation [22 ].As the solar energy system achieves greater efficiency and performance in regions with high solar radiation, we expect that annual average solar radiation is po sitively associated with se ntiment toward solar energy.Solar radiation is measured in KWh/m 2 /Day and is aggregated at the state level.The solar radiation data are from the National Renewable Energy Laboratory 's National Solar Radiation Database [45 ]. .03.18-.11The number of tweets from each state varies from 64 (North Dakota) to 11,788 (California) , with an overall mean of 1397 .294and standard deviation of 1967.161,resulting in larger confidence intervals for North Dakota , South Dakota, and Wyomi ng.However , the number of tweets per 1 million people in the state population has a lower standard deviation of 307.08 7 for a mean of242.996,indicating that the average number of tweets adjusted for state population is more evenly distributed.

Sentiment toward solar energy and renewable energy policy and market characteristics
Table 2 presents the descriptive statistics for all variables included in the statistical analysis.The average sentiment score of each state is calculated by averaging sentiment scores of all tweets from each state.Thus , the mean of the average sentiment scores (7 .546) in table 2 is slightly different from the national average (7 .835),as the means for states are based on non-weighted state averages.The highest correlation between the policy and market variables is .34,between solar market maturity and electricity price.
Table 3 presents the result s from the regression of sentiment score on the renewable energy policy and market variables.In models 1-7 , we examine bivariate associations between sentiment and each renewable energy policy/market variable.The full model (model 8) includes all predictor variables.Model 1 demonstrates that there is no significant relationship between public sentiment and current renewable energy generation.In model 2, public sentiment on solar energy, however, is more positive in states which enacted higher annual RPS targets (P = .178,p < .01).We also find that states with more customer-friendly net metering policies (e.g., statewide net metering mechani sms, no capacity limitations , year-to-year rollover) have a more positive sentiment toward solar energy (P = .167,p < .01) in model 3.This finding is consistent in the full model (P = .115,p < .01).Model 3 suggests that net metering explains 18.8% of the variation in public sentiment on solar energy.
The number of renewable energy incentives in states is positively correlated with public sentiment in model 4 (P = .005,p < .05),but this relationship is not statistically significant in the full model.In model 5, we find that solar energy is perceived more positively in states with a more mature solar market (P =.50 8, p < .01).This finding also hold s in the full model (P = .373,p < .01).The results from model 6 suggest average sentiment score is higher in state s with higher electricity price (P = .046,p < .05),but this association is not statistically significant in the full model.Lastly, solar radiation is not associated with public sentiment on solar energy.The full model has no multicolline arity issues (mean VIP = 1.15) and explain s 46.9 % of the total varianc e in the sentiment score.

State
No

Discussion
This paper aimed to understand how sentiment toward solar energy varies across states , and how policy and market factors explain suh sentiment.The results indicate overall sentiment toward solar energy is positive but substantial variation exists across states.As of early 2020, the Northea st region is most supportive of solar energy, while the South region is the lowest.Solar radiation, which determines the efficiency and performan ce of the solar energy generation system, does not explain geographical variation in public sentiment toward solar, although this may be a result of such radiation varying more widely in states with the largest land area.But we find that energy policie s and solar market maturity explain the variation in sentiment.This finding suggests that state energy policy programs could be effective measures to build public support for renewable energy.
We also detect temporal variation in sentiment for the 6-month period of Twitter data collection.As shown in Figure 4, events can potentially shift public opinion on solar energy.The sharp decline in sentiment score on March 23-25 may be motivated by the congress' discussion on the inclusion of tax incentives for renewable energy in the Coronavirus Aid, Relief, and Economic Security (CARES) Act of 2020.To ensure the robustness of our results, we employed an alternative model excluding the tweets from March 23-25.As shown in Table Al , the main findings from the alternative model were consistent with the original model , except for the estimated effect of renewabl e incentives.In the alternative model , renewable incentives are statistically significant in explaining solar sentiment in both the bivariate and full models.The alternative model also explains more of the total variation (R 2 = .499).
Our analysis finds the proportion of energy generated from renewable sources doe s not relate to public sentiment toward solar energy, but the RPS targets appe ar to matter.That is, public opinion on solar energy aligns more closely with state renewable energy policy goals but not current renewable energy generation .Although our analysis does not examine causal relationships, these findings suggest public opinion and policy may affect each other: public support for solar energy may impact RPS policy adoption and design, but at the same time, a progre ssive RPS policy may help build public support for solar energy.The design and framing of RPS policie s may shape public support for renewable energy [59).
Our analysis also finds that net metering policy is also a strong predictor of public opinion on solar energy.As net metering offers direct financial benefits to residential and commercial solar system owners, we suspect that state netmetering policie s influence public support for solar energy.Con sidering the impact of net metering rules and retail rate s on consumer-side solar system deployment [12), specific design elements (e.g., compensation rate, capacity limitations , subscriber size limitations) , as opposed to simply having net metering rules, may play an important role in fostering positive public perceptions of solar energy.For example, indefinite rollover of excess renewable generation credits and compensation at retail electricity price s provides more explicit incentive s, which in turn may promote positive   perception of solar energy.Solar market maturity, measured by the number of solar jobs per 1 million residents in the state population, is one of the most important predictors of public sentiment toward solar energy.This finding suggests public support of solar energy could positively influence solar energy market growth, and vice versa.Public acceptance of solar energy may be one of the factors that attracts solar businesses.At the same time, economies of scale in the solar market could enable solar firms to lower system installation costs and grid fees, generating "trickle down" effects that benefit solar energy customers and increasing public support for solar energy.
This study has a few key limitations.First, the Twitter data we collected may not be fully representative of all U.S. residents due to demographic disparities between Twitter users and non-users [2].Twitter users tend to be younger and more politically liberal than the general public [69], and at the same time, younger and more liberal populations tend to favor renewable energy development [20].Therefore, our estimation of the average sentiment score (7.835) is likely to be inflated compared to the true sentiment score.Considering sentiment classification can be highly dependent on the platform from which the training data are extracted, future research could address this limitation by incorporating data from multiple social media platforms.
Second , our analysis captures public sentiment toward solar energy between late January and early July 2020, amid a global pandemic, which may limit the generalizability of the findings.Although the number of tweets used to estimate the average sentiment score is over 70,000, and the findings are robust to inclusion/exclusion of outlier values from March 23-25, the sentiment score could be made less sensitive if the data were from a longer period of time.In future research , we aim to collect data from multiple years and explore the sequential order of public sentiment and renewable energy policies to answer the question : Does public perception of solar energy precede renewable energy policie s, or vice versa?
Finally, although our language model achieves high accuracy (90.7 %) with binary classification (positi ve or negative), it is not sufficient to capture fine-grained human emotions , such as happiness, joy , excitement, anger , sadness , frustration , fear, and sarcasm.As proposed by Abdar et al. [1], publi c preferences and attitudes on renewable energy can be gauged with multiple dimensions, including valence (positive versus negative) and arousal (high versu s low level of activation).Future research should aim to capture multidimensional emotion s, for example, by utilizing Sen-tiBert to capture compositional sentiment semantics [76], modeling irony and sarcasm detection [5 l ], and/or explicitly modeling label semantics in guiding an encoder network [18].

Conclusions
This study aimed to understand public sentiment toward solar energy in the United States and examine the relationship between sentiment and renewable energy policy and market characteristics.We collected 71,262 tweets from late January to early July 2020 and analyzed the data using RoBERTa , a state -of-the -art langu age model.We find that public sentiment on solar energy varies widely across states .RPS and net metering policie s and solar market maturity are important predictors of the public sentiment toward solar energy.
The main contributions of this study are as follows : i.We examine public opinion and sentiment specific to solar energy that has not been separately investigated as extensively by existing literature.ii.This study is one of few empirical studies using social media, ML, and NLP approache s to understand public opinion and sentiment regarding renewable energy.111.Applying RoBERTa, which is a state-of-the-art language model as of early 2020, we propo se a new method to examine public opinion and sentiment toward solar energy.Our sentiment classification model achieves high (90.7%)accuracy by using manually annotated tweets , which are area-specific and of higher quality, to fine-tune RoBERTa.iv.This study provides a comprehensive picture of the geographical variation in public sentiment regardin g solar energy across states.We demonstrate that this variation is explained by state policy and market characteristics while refuting the theory that it can be explained by local solar radiation amount s. v. Our analysis offers important policy implications as we provide empirical evidence of the positive relationship between public sentiment toward solar energy and renewable energy policy and market characteristi cs.States that wish to gain public support for solar energy may need to consider implementing consumer -friendly net metering policies and support state solar market growth.

Figure 1 :
Figure 1: Graphic representation of our model based on RoBERTa.The input embeddings are denoted as E, and the final hidden vector is denoted as T. Every input example starts with CLS, a special token.C is the final hidden vector of the special CLS token.Trm is Transformer.Adapted from Devlin et al. [14).

2 )Figure 2 :
Figure 2: Spatia I patterns in sentiment toward solar energy across the United States

Figure 2
Figure 2 presents the sentiment score findings across the United States.The District of Columbia (9.077), Utah (8.7 71 ), and Connecticut (8 .692)are most supportive of solar energy, while Mi ssissippi ( 4.962 ), Alabama (5 .7 45), and Oklahoma (6.224) are least supportive.There is a statistically significant difference (one-way ANOVA, F(3, 47) = 3.30, p = .028)across four U.S. Census regions (Northeast, Midwest , South , and West), but the Bartlett 's test with Bonferroni adjustment indicates that only the difference between the Northeast and the South is statistically significant (Northeast -South= .975,p = .042).Figure 3 displays average sentiment scores of 50 states and the District of Columbia (n = 51) between January

Figure 3
displays average sentiment scores of 50 states and the District of Columbia (n = 51) between January 12 and July 7, 2020.The average sentiment score can vary from O (all tweet s are negative toward solar energy) to 10 (all tweets are positive toward solar energy).The state average sentiment score range s from 4.962 (Mississippi ) to 9.077 (District of Columbia).The 95% confidence intervals depend on the number of tweets collected from each state.

Figure 4 :
Figure 4: Trend of public sentiment toward solar energy from January 12, 2020 to July 07, 2020

Table 1
Sample Tweets for Two ClassesTweets Solar energy has never been easier and more affordable to install.Oil is the the way of the past.Solar and other forms of clean energy is where you have to invest.Add solar and get paid twice -Equity and Energy!I've seen gopher tortoises and red-shouldered hawks rendered bereft of habitat because of solar panel farms.

Table 2
Descriptive statistics

Table 3
Public sentiment toward solar energy and renewable energy policy and market characteristics < 0.01, **p < 0.05 , *p < 0.1.Robust standard errors are in parentheses.