Electricity plays an increasingly important role in powering the U.S. transportation sector, with projections of 147–440 TWh of annual consumption by vehicles by 2050 [1]. This consumption corresponds to about 4–10% of the current total electricity consumption in the United States [3
]. Based on the dataset used in this study, accessed February 2021, there are more than 90,000 charging connectors available at more than 75,000 public charging locations for electric vehicles (EVs). Fueling infrastructure for EVs is unlikely to resemble conventional vehicle fueling infrastructure for a variety of reasons, including the time duration required for fueling, the physical and regulatory differences between electricity and liquid fuels, and the fact that EVs can be charged at home, at the workplace, or in public. Public EV charging infrastructure installed to date has been constructed and operated by a variety of entities under numerous business models. A comprehensive review of public charging prices and price models has not yet been conducted, although this type of summary might be valuable both to sellers and buyers of electricity, as well as to policymakers and other stakeholders. Consumer-facing articles have been published to explain public charging prices to EV drivers [4
Privately operated EV charging infrastructure has been installed and managed by at least 18 companies at public locations in all 50 states, including at grocery stores, hotels, shopping centers, and gas stations. Within and across companies, states, and locations, charging prices can vary greatly. This suggests that companies are pursuing disparate business models. For example, Tesla has installed a centralized network emphasizing long-distance travel that is compatible only with the vehicles Tesla produces, and has intermittently offered free and/or low-cost charging as an incentive for vehicle purchases. In contrast, other networks, such as ChargePoint, EVgo, and Blink, offer charging at low and high power at a variety of location types, at stations that are operated based on centralized or decentralized models.
Charging prices are assessed at fixed or variable rates during a charging session, as a function of time (seconds, minutes, or hours), energy (kilowatt-hours, kWh), or as a total price per charging session. The majority of charging connectors in the U.S. are Level 2 (L2) chargers, meaning power transfer occurs at an average rate between 6.6 and 19.2 kilowatts (kW) [7
]. The majority of the remaining stations have DC fast chargers (DCFC), which provide power at rates anywhere from 50 to 350 kW. Because fewer than 2% of connectors operate at the much slower Level 1, those charging locations are not included in the present analysis. Whereas L2 connectors are largely standardized under the Society of Automotive Engineers’ (SAE) J1772 standard, there are three major DCFC connector types that are not mutually compatible: the Tesla Supercharger, the SAE Combined Charging System (CCS), and CHAdeMO (short for “CHArge de MOve”), a standard which is being phased out in favor of CCS for new vehicles.
Although there is no official and comprehensive repository of charging price data for public EV charging stations, PlugShare [8
] has obtained price information and other metadata for a substantial portion of the stations in the U.S. via crowd-sourcing through its app and website, and through partnerships with charging station providers. These data (“the dataset”), which largely exist in textual form, are publicly accessible for individual stations via PlugShare’s app and website interfaces, but are not publicly accessible in the aggregate form necessary for the application of broad analytics. The authors obtained access to the dataset in aggregate form in order to conduct this study. Due to the many ways that a price signal can be written in textual form, we needed to employ ad hoc text mining and processing methods to reformat a majority of the dataset’s price information into quantitative data for analysis.
In Section 2, we present an overview of text mining from the literature. We then describe the dataset in more detail in Section 3 and discuss the text mining and data processing methods employed in our study in Section 4. The results of our analysis are reported in Section 5. The specific contributions of this work are as follows:
Ad hoc text mining techniques enable quantitative analysis of an otherwise opaque source of EV charging price data;
Descriptive analytics provide a high-level image of EV charging price variability in the United States; and
Discussion of trends in observed EV charging prices highlights decision-making implications for EV operators, charging station operators, policymakers, and business innovators.
2. Overview of Text Mining
The concept of text mining (text data mining, text analytics) originated with the ideas of natural language processing in the 1950s. However, it was not until the late 1990s that it began to assume a more prominent role across the analytics landscape. This development occurred in conjunction with a maturing data mining toolkit plus advances in computational power and speed capable of processing large unstructured data sets. More recently, text mining has evolved into a discipline of its own, with numerous applications throughout business, engineering, public health, the physical and social sciences, and other endeavors [9
The literature on text mining is now quite extensive in both the research and applications domains. Analytical advancements have progressed rapidly with the implementation of newer and faster algorithms and processing capabilities. Materials describing foundational ideas (e.g., [10]), as well as advanced methods (e.g., [13]), are widely available, and the various tools and techniques have been translated to accommodate a variety of computer languages and platforms (e.g., [14]). Madigan [18], Weiss et al. [19], and Sumathy and Chidambaram [20] provide excellent overviews of the text mining landscape from statistical and data science perspectives.
With its growing importance in the Big Data era, the definition of text mining has become more fluid, expanding to accommodate numerous analytical contexts, ranging from information and content extraction to lexical and sentiment analysis, pattern recognition/categorization, dimensionality reduction, and beyond. Perhaps the most common understanding of text mining in contemporary data analytics revolves around the extraction of word/phrase frequencies and relationships using various clustering and classification techniques [21
]. While text mining can logically be thought of as a means for parsing written artifacts for knowledge discovery, it also plays a significant role in the preprocessing and wrangling stages of Big Data analysis, such as reducing semantic, syntactic, and contextual ambiguity [22
Text mining is commonly used to extract, reduce, or regularize information contained in parcels of written material, free-form responses to questions or inquiries, or more conversational communications scraped from social media. It may also be used to effectively analyze text-based transactional records for relevant and recurring content, such as electronic medical reports (transcriptions of physicians’ notes pertaining to patient visits, conditions, diagnoses, etc.) [24], industrial maintenance files pertaining to failure times and modes [26], building maintenance work orders [28], court proceedings (including case files and docket entries) [29], customer service archives [30], and historical exchanges of real estate and mineral leases. The electric vehicle charging records in the dataset represent a similar type of transactional, textual, and numerical data that is amenable to text mining.
In these and other contexts, the approach is more closely aligned with the various aspects of content mining, such as concept extraction, named entity recognition, key word identification, differentiation of implicit or explicit actions and decisions, definition and capture of interesting phrases, and alignment and standardization of abbreviations [31]. It is these aspects that are most relevant to our investigation of electric vehicle charging costs. Accomplishing the tasks of text mining, however, often requires a more ad hoc, informal, or even “brute force” approach that involves a combination of human intervention, original scripting, and machine learning [33], particularly as the volume of data increases and encompasses more diverse entities. Our analysis of the dataset requires this kind of approach because of its compositional nature and the continuing flow of additional information into the database over time.
The data are semi-structured in the sense that they are organized in rows (representing individual charging connectors) and columns (representing variables or attributes pertaining to those charging connectors), although the data entries recorded for several of the attributes exist as words, phrases, or sentences (natural language) that must be refined to extract consistent and usable meanings. Although the database itself is semi-structured, the information associated with some attributes is completely unstructured. The documentation for the application programming interface (API) provides more information about the data organization [8
We received the data in two separate tranches: 74,237 observations in 2019 and an additional 19,312 observations in 2021, for a total of 93,549 observations. Each observation represents one connector, so a charging station with multiple connectors is represented by multiple observations. On average, a station hosts approximately 1.2 connectors. Records contain location information (city, state, zip code), charger information (connector type, network; whether charging is free), parking information (location type; whether parking is free), and unstructured price description information. Of these records, 30,756 have interpretable price information. A small sample of data with price descriptions is shown in Figure 1.
Price descriptions, in the form of unstructured text, vary widely in format and information content. This has several implications. Due to the nature of crowd-sourced data and the potential for user error, some of the price information may not be accurate or up-to-date. There is no standard way to specify whether price information applies to parking, charging, or both. Price descriptions thus may contain descriptions of prices for both parking and charging, for one or the other, for neither, or for one or multiple different charging levels, without means of resolving the ambiguity. Finally, prices, and their descriptions, do not follow a standard model. Manual interpretation is not feasible for a growing database of more than 30,000 stations with cost descriptions, so an ad hoc algorithmic text interpretation approach is used. Still, for some price descriptions which are inherently ambiguous (examples shown in Table 1
), neither algorithmic nor manual text interpretation succeeds in extracting meaningful price information.
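The kind of ad hoc pattern matching described above can be sketched with a small regular-expression parser. This is an illustrative sketch only, not the procedure used in this study: the unit vocabulary, the function name `parse_price`, and the example strings are all hypothetical, and real price descriptions are far more varied than these patterns cover.

```python
import re
from typing import Optional, Tuple

# Hypothetical unit vocabulary; the real price descriptions are far more varied.
_UNITS = [
    ("kWh", r"kwh|kilowatt[- ]?hours?"),
    ("hour", r"hours?|hrs?|h"),
    ("minute", r"minutes?|mins?"),
    ("session", r"sessions?"),
]
_UNIT_ALTERNATION = "|".join(f"(?P<{name}>{pattern})" for name, pattern in _UNITS)

# Matches strings such as "$0.25/kWh" or "$1.50 per hour".
_PRICE_RE = re.compile(
    r"\$\s*(?P<amount>\d+(?:\.\d+)?)\s*(?:/|per)\s*(?:" + _UNIT_ALTERNATION + r")\b",
    re.IGNORECASE,
)
_FREE_RE = re.compile(r"\bfree\b", re.IGNORECASE)


def parse_price(description: str) -> Optional[Tuple[float, str]]:
    """Return (amount, unit) for a recognized price, or None if nothing matches."""
    match = _PRICE_RE.search(description)
    if match:
        # Exactly one of the named unit groups participates in a match.
        unit = next(name for name, _ in _UNITS if match.group(name))
        return float(match.group("amount")), unit
    if _FREE_RE.search(description):
        return 0.0, "free"
    return None  # left for manual review or treated as uninterpretable


# Made-up example descriptions, not taken from the dataset.
examples = ["$0.25/kWh", "$1.50 per hour", "Free", "Pay at front desk"]
parsed = [parse_price(text) for text in examples]
```

A parser of this kind returns `None` for descriptions that match no pattern, which mirrors the way inherently ambiguous descriptions cannot be converted into a price.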
Whereas textual price data do not follow a standard format, the dataset does include standard specifications of whether fees exist for (a) parking (“Parking Type” in Figure 1) or (b) charging (“Cost” in Figure 1), or both. Thus, prices for stations with no textual price description but with both free charging and parking can, in theory, be inferred (i.e., the price is $0). However, since this inference is only possible for free stations, including these data in the general analysis would disproportionately weight free-charging locations. Instead, we assume that the sample of stations with price descriptions, including free stations, constitutes a representative sample of public EV charging stations, and therefore do not infer the price for stations marked as free. Furthermore, some records contain detailed descriptions of nonzero prices but are marked as having both free charging and free parking. In such cases, we assume that the price description is accurate.
An overview of how charging connectors are distributed across categories is shown in Figure 2. Among states, California hosts the greatest share by a substantial margin. Among network providers, ChargePoint hosts the greatest share of charging connectors.
Descriptive analytics, in the form of graphs of the interpreted data, are presented in this section. These analytics are intended to summarize the quantitative data extracted from the dataset, in part to demonstrate the utility and reliability of processing the data using the presented methods. They also provide a high-level overview of public EV charging prices and how they vary within the diverse U.S. public EV charging network. Price variability is present with respect to geography (Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8), network (Figure 9), location type (Figure 11), and power level (Figure 12).
5.1. Spatial Distribution
Figure 4 and Figure 5 show the total number of L2 stations and the median charging price for all associated connectors, by county, throughout the United States. Similarly, Figure 6 and Figure 7 show the number of DCFC stations and the median charging price for all associated connectors, by county, throughout the United States. Counties without any L2 station records (in the case of Figure 4) or DCFC station records (in the case of Figure 6
) are indicated as blank areas. Median prices encompass only those connectors for which unambiguous price information is available. Both L2 and DCFC stations are more highly concentrated on both coasts and in major metropolitan areas in the country’s interior. Median charging prices for L2 stations exhibit a somewhat different spatial distribution than do median charging prices for DCFC stations. The median charging price for L2 stations is somewhat more levelized across the country except, perhaps, in the northwest and mid-Atlantic areas, while the median charging price for DCFC stations is distinctly higher in the northwest and northeast regions, and in the upper midwest and northern Texas regions. Note that the disparate sizes of counties from east to west can visually bias perceptions about the spatial distributions, and that adopting more or less granular political jurisdictions can change those perceptions.
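The county-level aggregation described above can be sketched as a grouped median that skips connectors without an unambiguous price. The record layout (a county label paired with a price in $/kWh, or `None` when no price could be extracted) and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def median_price_by_county(records):
    """Map each county to the median price of its connectors with a known price."""
    prices = defaultdict(list)
    for county, price in records:
        if price is not None:  # exclude connectors lacking unambiguous price data
            prices[county].append(price)
    return {county: median(values) for county, values in prices.items()}


# Hypothetical records: (county, price in $/kWh or None).
sample = [("County A", 0.25), ("County A", None), ("County A", 0.75), ("County B", 0.30)]
county_medians = median_price_by_county(sample)
```

Counties whose connectors all lack price information do not appear in the result, which corresponds to blank areas on a median-price map.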
Among all L2 stations, the mean effective price to charge across the three cases is 0.277 $/kWh. Among all DCFC stations, the mean effective price to charge across the three cases is 0.318 $/kWh. (For reference, the mean cost of residential electricity in the U.S. is 0.133 $/kWh as of March 2021 [39].) However, effective prices span a wide range. DCFC is consistently more expensive on average than L2, but substantial price variability exists within and between states (Figure 8).
In Figure 8, the states on the horizontal axis are listed in decreasing order of count of records. Although one might expect that states hosting greater numbers of connectors would have lower prices due to increased competition, there is no obvious trend to suggest this is the case. However, it should be reemphasized here that California has many times more records than any other state (more than the total in all 40 states represented by “Other”; see Table A4), and therefore that every state’s data are sparse in comparison to California’s.
Additionally, note that in Figure 8 and subsequent similar representations, data distributions are depicted as traditional box-and-whisker plots showing the minimum, maximum, and median values, plus the first and third quartiles. The median, shown as a bold line, may be equal to one or both quartiles if the mode accounts for a sufficiently large fraction of the data.
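The coincidence of the median with a quartile can be illustrated with a small five-number summary. The prices below are invented for illustration, and the "inclusive" quantile method is one of several conventions; the plotting software used for the figures may apply a different one.

```python
from statistics import median, quantiles

# Made-up prices in $/kWh: six of ten connectors share the modal value of 0.0.
prices = [0.0] * 6 + [0.25, 0.30, 0.40, 0.50]

q1, _, q3 = quantiles(prices, n=4, method="inclusive")
summary = {
    "min": min(prices),
    "q1": q1,
    "median": median(prices),
    "q3": q3,
    "max": max(prices),
}
```

Because the modal value covers more than half of the observations, the minimum, first quartile, and median all fall on it, so the lower half of the box collapses onto the median line.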
Distributions of price by network are shown in Figure 9. Similar to Figure 8, the networks on the horizontal axis are listed in decreasing order based on plug count. If price data are sparse for a network, the price distributions shown may be misleading (see next section). Again referencing Figure 2, connector records are heavily concentrated in the top network, which has even more connector records listed than the state of California.
Still, unlike in the comparison of states in Figure 8, it is clear that some networks have narrower price ranges than others. These differences in price variability may reflect a combination of networks’ spatial span, where widely distributed networks may be subject to a wide variety of utility rates resulting in high price variability, and the extent to which networks impose centralized, network-set pricing, as opposed to station-host pricing.
Missing DCFC Data
When taking into account all levels of charging, Tesla, via its Supercharger and Tesla Destination networks, hosts the second-most stations of any network. However, if considering only DCFC, Tesla accounts for the overwhelming majority of networked chargers (Figure 10). Since the Tesla network is only available to Tesla drivers through a proprietary app and vehicle interface, Tesla has little incentive to provide accurate pricing information on public-facing third-party apps, such as PlugShare. Accordingly, only a small fraction of its charging connectors have price information in the dataset, and even these prices may be out of date. Our lack of access to most of Tesla’s prices, and those of other DCFC networks, is a major limitation to the DCFC portion of this analysis.
5.3. Location Type
In the data, 44 types of charger location, or “places of interest”, are distinguished (Figure 11). While variability between categories appears to be limited relative to that between states or networks, some categories stand out. For example, whereas median prices at hotels are high, median prices at schools are comparatively modest. This may reflect the role that the necessity of charging plays in setting prices. Visitors to hotels, who are less likely to be near home, presumably have a greater need to charge than do visitors to other location types. Again, sparsity of data should be taken into account (Table A6 in Appendix B). There are more than five times as many records for parking garages/lots (the most populous category shown) as for restaurants (the least populous category shown).
5.4. Power Level and Units
Variability exists between power levels (DCFC is generally more expensive per kWh than L2) and as a function of the original unit of assessment. As shown in Figure 12, session-based prices vary widely when expressed as regularized prices in $/kWh. This may be an artifact of the method for regularizing price: since the regularized price is the mean over the three scenarios (Table 3), charging sessions can only range between 1 and 3 h, for L2, and between 15 min and 1 h, for DCFC. It may be rare, for example, that a driver pays an expensive session price to charge for only 15 min, but the price for such a scenario (Scenario 1 for DCFC) is included in the regularized price calculation shown in these results.
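The effect of the shortest scenario on a regularized session price can be sketched as follows. This is a simplified illustration, not the calculation defined in Table 3: only the endpoints of the session ranges (1 and 3 h for L2, 15 min and 1 h for DCFC) come from the text, while the intermediate durations, the constant power levels (6.6 kW and 50 kW), and the function name are assumptions.

```python
from statistics import mean

# Assumed scenario durations in hours; only the endpoints are stated in the text.
SCENARIO_HOURS = {"L2": (1.0, 2.0, 3.0), "DCFC": (0.25, 0.5, 1.0)}
# Assumed constant charging power in kW for each level.
ASSUMED_POWER_KW = {"L2": 6.6, "DCFC": 50.0}


def scenario_prices(fee_usd, level):
    """Effective $/kWh of a flat session fee in each assumed scenario."""
    power_kw = ASSUMED_POWER_KW[level]
    return [fee_usd / (power_kw * hours) for hours in SCENARIO_HOURS[level]]


def regularized_session_price(fee_usd, level):
    """Mean effective $/kWh across the assumed scenarios."""
    return mean(scenario_prices(fee_usd, level))


# Hypothetical 10.00 $ flat fee at a DCFC station.
dcfc_scenarios = scenario_prices(10.00, "DCFC")
dcfc_regularized = regularized_session_price(10.00, "DCFC")
```

Under these assumptions the 15 min scenario yields an effective price four times that of the 1 h scenario, so it pulls the mean upward even if such short sessions are rare in practice.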
Once again, it should be noted that some of the boxes in Figure 12 represent sparse data (see Table A7 in Appendix B). For example, only 487 of 6834 DCFC stations use a price in units of $ per hour. The low apparent price for hourly DCFC may thus be an artifact of data sparsity. Alternatively, the sparsity and low apparent prices for hourly DCFC might reflect a psychological aspect of pricing. Relative to L2 prices, DCFC prices expressed as $/h may appear unusually high to EV operators due to the much higher rate of energy delivery. For example, to deliver energy at an effective price of 0.30 $/kWh, an L2 station’s hourly price would be 1.98 $/h, whereas a DCFC station’s hourly price would be 15.00 $/h. The equivalent price advertised as a price per minute (0.25 $/min) may be more attractive to EV operators.
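The arithmetic behind this comparison is a unit conversion from an energy price to a time price. The power levels used below (6.6 kW for L2 and 50 kW for DCFC) are not stated in the example itself; they are inferred here because they reproduce the quoted hourly prices and match the lower ends of the power ranges given earlier.

```python
def hourly_price(price_per_kwh, power_kw):
    """Convert an energy price ($/kWh) to a time price ($/h) at constant power."""
    return price_per_kwh * power_kw


EFFECTIVE_PRICE = 0.30  # $/kWh, from the example in the text

l2_hourly = hourly_price(EFFECTIVE_PRICE, 6.6)     # assumed 6.6 kW L2 connector
dcfc_hourly = hourly_price(EFFECTIVE_PRICE, 50.0)  # assumed 50 kW DCFC connector
dcfc_per_minute = dcfc_hourly / 60                 # same price expressed per minute
```

Rounded to cents, these evaluate to 1.98 $/h, 15.00 $/h, and 0.25 $/min, which match the figures quoted above.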
5.5. Dwell Incentive
Prices can be used as signals to encourage EV operators to extend or shorten the duration of charging sessions. We refer to this as a positive or negative “dwell incentive”. As previously illustrated in Figure 3, for some pricing structures, the effective overall price can change as a function of charging session length. For example, when charging costs are applied as a flat per-session fee, the effective price of energy decreases throughout a charging session. This may serve as an incentive for EV operators to extend charging sessions, potentially to the benefit of nearby retailers. Alternatively, some pricing structures deliberately increase the price of charging during a session, providing an incentive for shorter charging sessions, potentially to the benefit of electricity providers. These are examples of strategies, as highlighted in a 2019 study, to leverage EV operators’ flexibility to adjust the duration and energy consumption of charging sessions [40
We use a measure of dwell incentive to demonstrate where and how dynamic price structures are implemented. The dwell incentive is calculated by assessing the change in effective price, in $/kWh delivered, as the session duration increases. If the effective price remains constant irrespective of session duration, the dwell incentive at that station is “neutral”; if the price increases with session duration, the dwell incentive is negative; and if the price decreases with session duration, the dwell incentive is positive.
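The classification rule in the preceding paragraph can be sketched as a comparison of effective prices at a short and a long session. The two comparison durations, the tolerance, the 6.6 kW power level, and the three example price structures are hypothetical choices made for illustration; they are not the exact procedure or data used in this study.

```python
from typing import Callable

ASSUMED_L2_POWER_KW = 6.6  # assumption used only by the example structures below


def dwell_incentive(effective_price: Callable[[float], float],
                    short_h: float = 1.0,
                    long_h: float = 3.0,
                    tol: float = 1e-9) -> str:
    """Classify a price structure by how its effective $/kWh changes with duration."""
    change = effective_price(long_h) - effective_price(short_h)
    if abs(change) <= tol:
        return "neutral"
    # A rising effective price discourages long sessions (negative incentive).
    return "negative" if change > 0 else "positive"


def flat_energy_price(hours: float) -> float:
    return 0.30  # $/kWh regardless of duration


def flat_session_fee(hours: float) -> float:
    return 5.00 / (ASSUMED_L2_POWER_KW * hours)  # 5.00 $ spread over more energy


def escalating_hourly(hours: float) -> float:
    # 1.00 $/h for the first hour, then 3.00 $/h for each additional hour.
    cost = 1.00 * min(hours, 1.0) + 3.00 * max(hours - 1.0, 0.0)
    return cost / (ASSUMED_L2_POWER_KW * hours)


labels = {
    "flat_energy_price": dwell_incentive(flat_energy_price),
    "flat_session_fee": dwell_incentive(flat_session_fee),
    "escalating_hourly": dwell_incentive(escalating_hourly),
}
```

With these examples, the per-kWh price is classified as neutral, the flat session fee as positive, and the escalating hourly rate as negative, consistent with the definitions above.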
As shown in Figure 13, the dwell incentive appears to correlate with effective price. On average, stations with a positive dwell incentive charge high effective prices relative to other stations. This suggests a strategy of maximizing revenue per customer (i.e., the drivers who plug in, despite the high price, are incentivized to stay longer), potentially at the expense of fewer customers (some are turned away by the high prices, or because the plug is in use). In contrast, the low average prices in negative dwell incentive structures suggest a strategy of maximizing revenue by increasing plug utilization: the low price encourages drivers to plug in, but the price increases with time to encourage vacating for the next vehicle.
shows that very few stations employ price structures with non-neutral dwell incentives, and in particular, only a few of those employ a negative dwell incentive. It is plausible that pricing mechanisms for influencing dwell behavior, such as idle fees, are assessed more commonly than they appear in price descriptions in the dataset. Still, the typical configuration of EV charging stations, where payment and energy flow are both managed electronically, provides a unique opportunity to use price signals for load management or utilization improvement purposes.
5.6. Comparison with Levelized Cost of Charging
Levelized cost of charging (LCOC) is a metric representing the average cost paid by a station operator to provide charging energy, including initial installation costs and ongoing, time-varying costs throughout the lifetime of the charging equipment. Calculating the difference between LCOC paid by station operators and the average price paid by EV operators is one method for estimating the profit that a station earns.
Median prices obtained from the dataset are higher in every state than the LCOC estimated for station operators. This is illustrated in Figure 15, which compares the prices extracted from the dataset to estimated values of LCOC for different varieties of charging.
The LCOC values shown in Figure 15 are taken from a study of 2019 EV charging economics [41]. In this study, researchers detailed the variability of EV charging economics across different charging sites, regions, power levels, and other variables. They estimated LCOC, for an individual charging site, as a function of (a) retail electricity prices, (b) capital and operating costs for the charging equipment, and (c) energy supplied during the lifetime of the equipment. Two sensitivity scenarios (upper and lower) aimed to capture variability in these parameters, leading to higher and lower costs than the baseline scenario.
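How the three inputs listed above combine into a per-kWh cost can be sketched with a deliberately simplified, undiscounted calculation. This is not the formulation used in [41]: the function name, the absence of discounting, the constant annual energy, and all numerical inputs are assumptions made for illustration.

```python
def levelized_cost_of_charging(electricity_price, capital_cost,
                               annual_operating_cost, annual_energy_kwh,
                               lifetime_years):
    """Simplified LCOC in $/kWh: energy price plus fixed costs spread over lifetime energy."""
    lifetime_energy_kwh = annual_energy_kwh * lifetime_years
    fixed_costs = capital_cost + annual_operating_cost * lifetime_years
    return electricity_price + fixed_costs / lifetime_energy_kwh


# Illustrative inputs only; none of these values come from the dataset or from [41].
lcoc = levelized_cost_of_charging(
    electricity_price=0.10,      # $/kWh paid by the station operator
    capital_cost=5000.0,         # $ for equipment and installation
    annual_operating_cost=500.0, # $ per year
    annual_energy_kwh=10000.0,   # kWh delivered per year
    lifetime_years=10,
)
margin = 0.277 - lcoc  # mean L2 effective price from this study minus the sketch LCOC
```

Because fixed costs are divided by lifetime energy, lower utilization raises the result directly, which is one reason utilization assumptions matter when comparing LCOC with observed prices.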
The comparison in Figure 15 thus serves to emphasize the substantial difference between the estimated LCOC and the actual prices assessed, throughout the U.S., for both L2 and DCFC. One implication of this difference is that the value of energy from a public charging station is substantially higher to a typical EV driver than the cost paid by station operators to provide it. This calls attention to three attributes of public EV charging. First, most EV drivers do not have to rely on public infrastructure for the majority of their driving energy, resulting in a different value proposition for drivers at public charging locations relative to home charging or gasoline/diesel refueling. Second, utilization may be limited in an early EV market due to the complex means by which infrastructure availability both spurs and reacts to adoption of EVs, representing a restriction to supply that may exert upward pressure on prices. Third, station operators may pay a higher electricity price than nominal retail electricity prices due to pricing mechanisms, such as peak demand tariffs or time-of-use rate schedules, in which case the LCOC would be higher in reality than the estimated values. Each of these attributes is discussed further in the following paragraphs.
5.6.1. Value Proposition for EV Drivers
EV drivers choose from a broader set of refueling locations than do drivers of conventional vehicles, who are confined to refueling at commercial gasoline/diesel stations. This highlights a fundamental difference between the business cases for public charging stations and petroleum refueling stations. Most EV drivers are able to charge at home, and some can charge at the workplace, both of which are likely to be cheaper and more convenient than stopping at a public charging station for either L2 charging or DCFC. Public stations thus serve (a) to enable trips exceeding the EV battery range and/or (b) to provide faster charging than drivers have available at home or work. From the perspective of EV drivers, the value of charging can therefore be considered to be the sum of the direct value of energy and the indirect value of range extension and faster charging (convenience and/or preference), so drivers may be willing to pay a higher price than the LCOC. An analogous product for which the willingness to pay can be dramatically influenced by differences in convenience and/or preference is water, which usually comes at a significant premium, in bottled form, relative to the price of tap water at home.
5.6.2. Station Utilization in an Early EV Market
Public charging infrastructure and EVs are closely interdependent in that each increases the value and viability of the other. This is an example of the commonly cited “chicken-or-egg” problem. If charging is not sufficiently ubiquitous to enable long-distance travel, most people may be unlikely to adopt EVs, but some stations providing widespread charging in an early market will experience low utilization while EV populations are low. In [41], public L2 connectors were assumed to be utilized 4.5 h per day, whereas DCFC connectors were modeled at varying levels of utilization, from 1–2 charges per day to over 20% utilization. At present, however, these utilization assumptions may be overestimates for many stations.
5.6.3. Peak Demand and Time-of-Use Electricity Tariffs
Finally, electricity prices are often designed to discourage high local and aggregate power demands via peak demand and time-of-use tariffs, which can result in high prices for EV charging, especially DCFC. The authors of [41] accounted for the effect of tariff variations on DCFC by testing a total of more than 4000 commercial rates and reporting the overall average price for each state. Still, they report that the effective price of electricity for DCFC can exceed $2 per kWh [42]. As utility companies continue to adapt to the emerging demands of EV charging, some charging stations may continue to pay electricity prices according to structures that result in expensive refueling using DCFC infrastructure. Alternative solutions, such as installing means of electricity generation (solar panels) or storage (stationary batteries) to minimize or offset power demands, have been proposed to reduce the cost of electricity and mitigate other challenges with the interactions between the electric grid and EV charging stations [37
6. Discussion and Future Directions
Access to a comprehensive source of EV charging price data can facilitate decision-making for EV operators, charging station operators, policymakers, and business innovators. However, such data do not yet exist in an aggregated and accessible format. PlugShare’s crowdsourced U.S. dataset is an attractive source of nationwide charging price data, but the unstructured textual format of its price data has hindered its usability. By employing ad hoc text mining to convert the data into a format amenable to direct analysis, this work lays the foundation for studies of a previously underutilized source of data. Descriptive analytics of the converted dataset provide a high-level image of the state of public EV charging across the United States, with emphasis on the wide variability of charging prices in terms of geographic location, network operator, and location type.
EV charging stations operate under a variety of business models and pricing structures that are vastly different from those associated with commercial petroleum fueling stations. The flexibility in price design equips operators with tools to provide incentives for desired charging behaviors, such as ramping prices to discourage long charging sessions. Our analysis suggests that these tools are not yet being used by the majority of EV charging station operators. Further research to understand the effects of potential price designs on customer choices may provide valuable direction for station operators, especially as charging demand increases.
Because it is often an alternative to at-home charging, the business case for public EV charging is distinct from that for conventional fueling. Our research suggests that prices at most stations exceed estimates for the LCOC paid by station operators, resulting in prices well above what consumers would pay at home and highlighting the unique value proposition of public EV charging. This premium in price represents value beyond that of energy, such as convenience, speed, or necessity, but it remains to be seen what prices consumers will accept in a mature EV charging market.
From the perspective of station owners, EV charging infrastructure comes at a high capital cost that must be recouped, whether through a revenue margin on electricity above the LCOC or by other methods, such as increased revenues at an associated business. The wide variety in approaches to public charging suggests that the electric transportation system remains in its developing stages.
Data wrangling and preprocessing can be tedious, time-consuming, and sometimes unproductive pursuits; working with large volumes of unstructured textual information further exacerbates these issues [45]. While text mining provides computational and statistical tools to address the problem, there is still no fully automated way to reduce natural language to numerical data that can be used for quantitative analysis. As illustrated in our work, such circumstances require the use of creative ad hoc approaches to extract useful analytical information. However, we hasten to underscore the imperfections in such approaches and the implications they may ultimately have for modeling results and conclusions. Given the growing interest in EVs and infrastructure to support them, we note the necessity of securing reliable and consistent data on which to construct models for operations and business planning.
In this study, we address one limitation to the usability of the dataset, but it suffers other limitations that we are unable to correct. Because the data are not publicly and freely available, the potential for research using the data is limited to those able to pay the access fees. Furthermore, the restrictions imposed by licensing agreements for non-public datasets inhibit the ability of researchers to provide transparent and reproducible work to the public.
An additional limitation to the usability of the dataset is its method of sourcing. By distributing the labor and costs required to obtain data, crowd-sourcing can generate large volumes of data that may not be obtainable by other means. However, due to its decentralized sourcing, the value and quality of crowd-sourced data can be questioned. Particularly when the data are not made public and open-sourced, the ability of researchers to assess value and quality is limited [46]. This study provides an assessment of the value and quality of the dataset in the form of descriptive summaries and analytics.
Even with its limitations, the dataset presently represents one of the best and most current sources of information about charging costs that can be used to inform consumers and operators alike. As described here, the challenge is to reduce the dataset (and similar information sources) into a comprehensible and analytical format that can be effectively employed for decision making. To date, our work has primarily focused on describing the present status of public charging prices in the U.S.; however, we believe continued expansion of the dataset and fine-tuning (training) of our information extraction algorithm will support further investigations that are more predictive and prescriptive in nature. Future modeling work will incorporate the regularized and cleaned data with various operating parameters to help guide the establishment of best practices to promote EV adoption and investment in infrastructure build-out relative to the cost of EV charging.