The SERL Observatory Dataset: Longitudinal Smart Meter Electricity and Gas Data, Survey, EPC and Climate Data for over 13,000 Households in Great Britain
2. Research Design
2.2. Participant Recruitment
- Determine the number of households in each region and IMD quintile in the UK Address Base dataset, after filtering out addresses with an organisation name, those listed as not ‘in use’, and those without an approved delivery point or a geographical (local authority) address, so that the sampling frame is as representative as possible of the domestic housing stock. The percentage in each region-quintile gave us our target percentage to recruit for SERL;
- Query a large random sample of these addresses via the DCC gateway to find those which returned a DCC-accessible smart meter, from which we could (in theory) collect data if we had consumer consent. We achieved a positive match with around 2–3% of addresses queried;
- Estimate the likely response rate from contacted households to provide an upper bound on the number of households to contact. For wave 1 we estimated a 5–20% response rate and the actual response rate was 9.5%. Subsequent response rate estimates were based on the response rates from earlier waves and were IMD quintile-specific;
- Query additional random samples of UK Address Base addresses as required in order to achieve a stratified random sample with the number in each region-quintile determined by the expected response rate and number recruited to date.
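The arithmetic implied by the steps above can be sketched as follows. This is a minimal illustration only, not the authors' procedure: the function name, its parameters and the simple division-based sizing are assumptions made for the example. It takes a per-stratum recruitment target, the number recruited to date, an expected response rate and the DCC match rate, and returns how many households to contact and how many addresses to query.

```python
import math


def addresses_to_query(target, recruited, response_rate, match_rate):
    """Size the next sampling round for one region-quintile stratum.

    Hypothetical helper (not from the SERL code base).
    target        -- number of households wanted in this stratum
    recruited     -- number already recruited in this stratum
    response_rate -- expected fraction of contacted households that consent
    match_rate    -- fraction of queried addresses with a DCC-accessible meter
    Returns (households_to_contact, addresses_to_query).
    """
    if not (0 < response_rate <= 1 and 0 < match_rate <= 1):
        raise ValueError("rates must be in the interval (0, 1]")
    remaining = max(target - recruited, 0)
    # A lower assumed response rate yields a larger (upper-bound) contact count.
    contacts = math.ceil(remaining / response_rate)
    # Only a small share of queried addresses return a usable smart meter.
    queries = math.ceil(contacts / match_rate)
    return contacts, queries


if __name__ == "__main__":
    # Illustrative inputs only: the low end of the wave 1 response estimate (5%)
    # and a 2% match rate; the target of 150 is made up.
    print(addresses_to_query(target=150, recruited=0,
                             response_rate=0.05, match_rate=0.02))
```

Using the low end of the expected response range gives the largest contact count, which is one way to read the "upper bound" mentioned in the text.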
2.3. Data Governance, Ethics and Consent
- Safe people: all researchers must obtain Office for National Statistics (ONS) Accredited Researcher status;
- Safe projects: all projects must be approved by the independent SERL DGB;
- Safe setting: data available via the UKDS secure lab environment;
- Safe data: data are pseudo-anonymised appropriately for the secure lab environment;
- Safe outputs: all outputs undergo Statistical Disclosure Control (SDC) checks prior to release.
3. Data Records
3.1. Smart Meter Data
- serl_smart_meter_documentation_edition03.pdf—describes the smart meter datasets in detail including all variables provided.
- serl_smart_meter_data_quality_report_edition03.pdf—describes the data quality analysis done and data availability.
3.2. Survey Data
- serl_survey_documentation_edition03.pdf—describes the variables in the SERL survey dataset;
- serl_pilot_recruitment_survey_copy.pdf—the postal version of the survey sent to wave 1 participants;
- serl_main_recruitment_survey_copy.pdf—the postal version of the survey sent to wave 2 and 3 participants;
- serl_survey_response_frequencies_edition03.csv—table of frequencies of responses to the SERL survey questions, merged or rounded for SDC checking where values are fewer than ten.
3.3. Energy Performance Certificate (EPC) Data
3.4. Climate Data
3.5. BST Dates
4. Data Quality Analysis
4.1. Smart Meter Data
4.2. Survey Data
- A5—What temperature do you set your controller to in the winter months for the late afternoons or evenings? A box with °C was provided in the wave 1 survey, but it was clear that some respondents were reporting temperatures in °F. Reported temperatures above 35 were assumed to be in °F and were converted to °C. In subsequent surveys two boxes (°C or °F) were provided. Some issues remained despite this, including: °F continuing to be reported in the °C box, a range of temperatures being provided, temperatures reported in both °C and °F but not matching within 1 °C, and excessively high or low temperatures (defined as less than 10 °C, or more than 35 °C after assuming answers above 35 were given in °F); all these issues have been flagged as errors in the dataset. After converting the responses above 35 in the °C box (1.6% of responses, assumed to be in °F), 0.4% of responses remain flagged as errors.
- A16—Which of the following, if any, is your household considering replacing or adding to your heating or energy supply in the next 12 months? This question had several possible options to tick and a free text “other” option. The free text response contained comments about various energy related issues, including describing works previously completed, comments on why various options are not possible and comments regarding installed solar panels. This may be useful data and has been retained in the dataset, but it is not necessarily the case that a participant answering ‘other’ is considering a change to their heating or energy supply in the next 12 months.
- B3—How many other households do you share with at the moment? This question is applicable only to those who responded in the previous question that their accommodation is not self-contained. 0.6% of respondents who answered that their accommodation is not self-contained (i.e., shared with other households) went on to contradictorily indicate that they lived with zero other households.
- B5—How many rooms are available for use only by this household? and B6-How many of these rooms are bedrooms? Some respondents reported more bedrooms than rooms; these have been flagged as errors (0.4%). Some reported zero bedrooms; these have been edited to one bedroom, the same imputation as used in the UK 2011 Census.
- C1—How many people currently live in your household, including you? and C2-Including you, how many males and females are there in each of the following age groups in your household? In some cases, the total number of householders reported in these two questions did not match; it appears that some respondents reported the ages of each householder, rather than the number of householders in each age and gender bracket. 1.7% of all respondents have C2 flagged as an error.
- C3—Thinking about the working situation of each member of your household aged 16 and over, including you, how many would you say fall into each category below? In some cases, the numbers reported were larger than the number of occupants reported in the previous two questions. Sometimes it appears that people were reporting the number of hours worked in each category. 5.3% of all respondents have C3 flagged as an error.
- C4—Including you, how many people in your household hold a degree (e.g., BA, BSc) or higher qualification (e.g., MA, PhD, PGCE)? In some cases more people with degrees were reported than people living in the house (C1), and in some cases non-numeric responses were given. 0.1% of all respondents have C4 flagged as an error.
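Two of the cleaning rules described above (A5 and B5/B6) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SERL cleaning code: the function names and return shapes are hypothetical, only the single-box (wave 1 style) A5 case is handled, and the order in which the B6 imputation and the consistency check are applied is assumed.

```python
def clean_setpoint(reported):
    """Apply the A5 rule to a value entered in the °C box.

    Values above 35 are assumed to be °F and converted to °C.
    The result is flagged as an error if it is below 10 °C or above 35 °C.
    Returns (temperature_c, was_converted, error_flag).
    """
    converted = reported > 35
    temp_c = (reported - 32) * 5 / 9 if converted else float(reported)
    error = temp_c < 10 or temp_c > 35
    return temp_c, converted, error


def check_rooms(rooms, bedrooms):
    """Apply the B5/B6 rule.

    Zero bedrooms is edited to one (the imputation described in the text),
    then more bedrooms than rooms is flagged as an error.
    Returns (bedrooms_after_imputation, error_flag).
    """
    if bedrooms == 0:
        bedrooms = 1  # assumption: imputation happens before the check
    error = bedrooms > rooms
    return bedrooms, error


if __name__ == "__main__":
    for value in (20, 68, 5, 212):
        print(value, clean_setpoint(value))
    print(check_rooms(rooms=5, bedrooms=0))
```

Returning the conversion flag alongside the error flag keeps converted responses distinguishable from those still flagged as errors, which mirrors the two percentages reported for A5.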
5. Sample Bias and Representativeness
5.1. All SERL Observatory Participants
5.2. SERL Observatory Survey Respondents: Households
5.3. SERL Observatory Survey Respondents: Dwellings
5.4. SERL Observatory Participants with EPC Data
6. Usage Notes and Code Availability
- “Accredited researcher status (safe researcher training and exam): 1 month;
- University ethics approval: this varies by institution but allow at least 3–4 weeks;
- Project application (UKDS triage and SERL Data Governance Board review): 4–6 weeks”.
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. UK Domestic Energy Datasets
| | Smart Energy Research Lab (SERL) Observatory | Solent Achieving Value from Efficiency (SAVE) Data [14,38] | IDEAL Household Energy Dataset [15,35] | DEFACTO: Digital Energy Feedback and Control Technology Optimisation | Smart Meter Energy Consumption Data in London Households | Customer Led Network Revolution | North East Scotland Energy Monitoring Project, 2010–2012 [12,13] |
|---|---|---|---|---|---|---|---|
| Number of homes * | >13,000 | >4000 | 255 | 393 ++ | 5567 | ~11,000 | 215 |
| Sampling method ** | Stratified random sample | Stratified random sample | Mixed methods, with quota sampling. | Stratified random sample | Participants of UK Power Networks Low Carbon London project: “balanced sample representative of the Greater London population” | CLNR trial participants | Purposive selection/case studies |
| Geographic coverage | Mainland England, Scotland and Wales | Hampshire, Southampton, Portsmouth, Isle of Wight | Edinburgh, Lothians and S. Fife, Scotland | English Midlands | London | GB | North East Scotland |
| Temporal coverage ** | August 2018–August 2022 and beyond | January 2017–December 2018 | August 2016–June 2018 | 2015–2018 | November 2011–February 2014 | 2011–2014 | 2010–2012 |
| Whole-home energy data (fuels and resolution) | Electricity (import and export) and gas, 30 min | Electricity, 15 min + | Electricity, 1 s. Gas, 1 reading per 1 dm³ or 1 ft³ | Electricity, 2 min. Gas, 30 min | Electricity, 30 min | Electricity, 30 min | Electricity, 5 min |
| Building and room data | Dwelling attributes | Dwelling attributes | Property type, age, entry floor, outdoor space. T, H, L per room. Room type, external doors and windows, floor area, height, radiators, thermostat presence, per room. | Property floor plan. Home energy survey, including a domestic energy assessment. EPC + input data (project-collected). T per room. | None | Indoor temperature (homes with air source heat pumps only) | Temperature, 1 room |
| Occupant data | Occupant characteristics, sociodemographics, self-reported energy awareness and behaviours | Occupant characteristics and household behaviours | Sociodemographics, values, attitudes, self-reported energy awareness and behaviours, including occupancy, household income band and stability | Demographic and occupancy. Heating system and appliance usage. Self-reported zonal control usage | None | Mosaic consumer classification | Demographic, “psycho-social measures including individual environmental attitudes, household characteristics, and everyday behaviours” |
| Contextual data | IMD quintile, region, LSOA, EPC, 24 weather variables | Urban-rural classification; IMD decile; modelled % of LSOA in fuel poverty | Weather, urban-rural classification | Weather | None | External temperature (homes with air source heat pumps only) | Urban-rural classification |
| Appliance data | Presence of 14 appliances | None | Inventory, presence of smart systems; T for boiler pipes, radiator pipes, hot water outlets, fires, cookers; electricity for selected appliances, main sub-circuits. | Heating system details. | None | Homes with and without: solar PV (with automatic or with manual in-premises balancing); air source heat pumps; EVs. Electricity for EV charge points | None |
| Other data | Potential for linking | Time use diary | Tariffs, meter readings | Changes to property near end of trial vs start. | Tariff: approx. 1100 had a dynamic Time of Use tariff; remainder flat rate. | Tariff type, flat or Time of Use | Carbon footprint questionnaire |
| Availability | Accredited researchers on approved projects (see Section 6) | Registered users via UK Data Archive | Open access, CC BY 4.0 | Contact data controllers at Loughborough University regarding access | Open access, unspecified license | Open Access, CC BY-SA 4.0 | UK Data Archive, safeguarded |
Appendix B. Supporting Tables and Figures
|East of England||1151||8.6%|
References
- Department for Business Energy & Industrial Strategy. Annex: 2019 UK Greenhouse Gas Emissions, Final Figures by end User and Fuel Type; Department for Business Energy & Industrial Strategy: London, UK, 2019.
- National Grid. System Operability Framework 2015; National Grid: Warwick, UK, 2015.
- Strbac, G. Demand side management: Benefits and challenges. Energy Policy 2008, 36, 4419–4426.
- Golubchikov, O.; O’Sullivan, K. Energy periphery: Uneven development and the precarious geographies of low-carbon transition. Energy Build. 2020, 211, 109818.
- Webborn, E.; Oreszczyn, T. Champion the energy data revolution. Nat. Energy 2019, 4, 624–626.
- Hamilton, I.; Oreszczyn, T.; Summerfield, A.; Steadman, P.; Elam, S.; Smith, A. Co-benefits of Energy and Buildings Data: The Case for supporting Data Access to Achieve a Sustainable Built Environment. Procedia Eng. 2015, 118, 958–968.
- Summerfield, A.J.; Lowe, R. Challenges and future directions for energy and buildings research. Build. Res. Inf. 2012, 40, 391–400.
- Department for Business Energy & Industrial Strategy. Smart Meter Statistics in Great Britain: Quarterly Report to end March 2021; Department for Business Energy & Industrial Strategy: London, UK, 2021.
- Department for Business Energy & Industrial Strategy. Digitalising Our Energy System for Net Zero: Strategy and Action Plan 2021; Department for Business Energy & Industrial Strategy: London, UK, 2021.
- SECAS. Smart Energy Code; Smart Energy Code Administrator & Secretariat: London, UK, 2015; Volume 2015.
- Smart Energy Research Lab. Welcome to the Smart Energy Research Lab. Available online: www.serl.ac.uk (accessed on 18 June 2020).
- Craig, T.; Dent, I. North East Scotland Energy Monitoring Project, 2010–2012, Study Level Documentation; UK Data Service: Southampton, UK, 2016.
- Craig, T.; Dent, I. North East Scotland Energy Monitoring Project, 2010–2012 [Data Collection]. SN: 8122; UK Data Service: Southampton, UK, 2017.
- Rushby, T.; Anderson, B.; James, P.; Bahaj, A. Solent Achieving Value from Efficiency (SAVE) Data, 2017–2018 [data collection]. SN: 8676; UK Data Service: Southampton, UK, 2020.
- Goddard, N.; Kilgour, J.; Pullinger, M.; Arvind, D.K.; Lovell, H.; Moore, J.; Shipworth, D.; Sutton, C.; Webb, J.; Berliner, N.; et al. IDEAL Household Energy Dataset; Edinburgh DataShare: Edinburgh, UK, 2021.
- Cooper, A.; Shipworth, D.; Humphrey, A. UK Energy Lab: A Feasibility Study for A Longitudinal, Nationally Representative Sociotechnical Survey of Energy Use Synthesis Report; UK Energy Lab.: London, UK, 2014.
- UK Data Service. Training Requirements to Access SecureLab; UK Data Service: Southampton, UK; Available online: https://ukdataservice.ac.uk/help/secure-lab/training-requirements/ (accessed on 19 October 2021).
- Webborn, E.; McKenna, E.; Elam, S.; Anderson, B.; Cooper, A.; Oreszczyn, T. Increasing response rates and improving research design: Learnings from the Smart Energy Research Lab in the United Kingdom. Energy Res. Soc. Sci. 2021, in press.
- McKenna, E.; Frerk, M.; Elam, S. Smart Energy Research Lab (SERL) Data Governance Board (DGB) Terms of Reference; SERL: London, UK, 2020.
- Corti, L.; Welpton, R. Access to sensitive data for research: ‘The 5 Safes,’ Data Impact blog. 2015. Available online: https://blog.ukdataservice.ac.uk/access-to-sensitive-data-for-research-the-5-safes/ (accessed on 13 September 2021).
- Ministry of Housing Communities & Local Government. Domestic Energy Performance Certificates API Energy Performance of Buildings Data England and Wales. 2021. Available online: https://epc.opendatacommunities.org/docs/api/domestic (accessed on 19 October 2021).
- ECMWF. ERA5 ECMWF. Available online: https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5 (accessed on 19 October 2021).
- ECMWF. ERA5: Data Documentation; ECMWF: Reading, UK, 2021; Available online: https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation (accessed on 19 October 2021).
- Crawley, J.; Biddulph, P.; Northrop, P.J.; Wingfield, J.; Oreszczyn, T.; Elwell, C. Quantifying the measurement error on England and Wales EPC ratings. Energies 2019, 12, 18.
- Jenkins, D.; Simpson, S.; Peacock, A. Investigating the consistency and quality of EPC ratings and assessments. Energy 2017, 138, 480–489.
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2019.
- Smart Energy Research Lab. SERL Observatory Data; GitHub: San Francisco, CA, USA, 2021; Available online: https://github.com/smartEnergyResearchLab/observatoryData (accessed on 19 October 2021).
- Office for National Statistics. 2011 Census. 2011. Available online: https://www.ons.gov.uk/census/2011census (accessed on 2 September 2021).
- Ordnance Survey. AddressBase. Available online: https://www.ordnancesurvey.co.uk/business-government/products/addressbase (accessed on 2 September 2021).
- Ministry of Housing Communities & Local Government. English Housing Survey. 2021. Available online: https://www.gov.uk/government/collections/english-housing-survey (accessed on 2 September 2021).
- Institute for Social and Economic Research. Wave 10 Data Released; Understanding Society: Colchester, UK, 2020; Available online: https://www.understandingsociety.ac.uk/2020/11/26/wave-10-data-released (accessed on 2 September 2021).
- Shipworth, M.; Firth, S.K.; Gentry, M.I.; Wright, A.J.; Shipworth, D.T.; Lomas, K.J. Central heating thermostat settings and timing: Building demographics. Build. Res. Inf. 2010, 38, 50–69.
- Ministry of Housing Communities & Local Government. English Housing Survey 2019 to 2020: Headline Report-Section 2 Housing Stock Tables; Ministry of Housing Communities & Local Government: London, UK, 2020.
- UK Data Service. Apply to Access Smart Energy Research Lab Data; UK Data Service: Southampton, UK, 2021; Available online: https://ukdataservice.ac.uk/find-data/access-conditions/secure-application-requirements/apply-to-access-serl/ (accessed on 8 September 2021).
- Frerk, M. Smart Meter Energy Data: Public Interest Advisory Group Final Report-Phase 2; Sustainability First & Centre for Sustainable Energy: Bristol, UK, 2021.
- Mahmood, A.; Javaid, N.; Asghar Khan, M.; Razzaq, S. An overview of load management techniques in smart grid. Int. J. Energy Res. 2015, 39, 1437–1450.
- Chicco, G. Overview and performance assessment of the clustering methods for electrical load pattern grouping. Energy 2012, 42, 68–80.
- Kipping, A.; Trømborg, E. Modeling and disaggregating hourly electricity consumption in Norwegian dwellings based on smart meter data. Energy Build. 2016, 118, 350–369.
- Adams, J.N.; Bélafi, Z.D.; Horváth, M.; Kocsis, J.B.; Csoknyai, T. How smart meter data analysis can support understanding the impact of occupant behavior on building energy performance: A comprehensive review. Energies 2021, 14, 9.
- Hurst, W.; Montanez, C.A.C.; Shone, N. Towards an approach for fuel poverty detection from gas smart meter data using decision tree learning. ACM Int. Conf. Proceeding Ser. 2020, 23–28.
- Wang, Y.; Chen, Q.; Hong, T.; Kang, C. Review of Smart Meter Data Analytics: Applications, Methodologies, and Challenges. IEEE Trans. Smart Grid. 2018, 10, 3125–3148.
- McKenna, E.; Few, J.; Webborn, E.; Anderson, B.; Elam, S.; Shipworth, D.; Cooper, A.; Pullinger, M.; Oreszczyn, T. Explaining daily total energy demand in British housing using linked smart meter and socio-technical data in a bottom-up statistical model. OSF Prepr. 2021.
- Huebner, G.; Watson, N.; Direk, K.; McKenna, E.; Webborn, E.; Elam, S.; Oreszczyn, T. Self-reported energy use in UK homes during the first COVID-19 lockdown: A survey study. SocArXiv 2021.
- Elam, S. Smart Meter Data and Public Interest Issues—The National Perspective: Discussion Paper 1; Annex A—Existing data: London, UK, 2016.
- CEEDS: The Centre for Energy Epidemiology Data Service. CEE Data Asset Register (2017-08); RCUK Centre for Energy Epidemiology: London, UK, 2017.
- Pullinger, M.; Kilgour, J.; Goddard, N.; Berliner, N.; Webb, L.; Dzikovska, M.; Lovell, H.; Mann, J.; Sutton, C.; Webb, J.; et al. The IDEAL household energy dataset, electricity, gas, contextual sensor data and survey data for 255 UK homes. Sci. Data 2021, 8, 146.
- Georgia Tech. Smart Meter Data Portal. Available online: https://smda.github.io/smart-meter-data-portal/ (accessed on 26 August 2021).
|… recruited in wave 1 (August/September 2019)||1708||12.8|
|… recruited in wave 2 (August/September 2020)||3169||23.8|
|… recruited in wave 3 (January/February 2021)||8444||63.4|
|… with an electricity smart meter||13,320||100.0|
|… with an electricity smart meter with any valid data and still ‘active’ on 31 May 2021||12,823||96.3|
|… with a gas smart meter||10,202||76.6|
|… with a gas smart meter with any valid data and still ‘active’ on 31 May 2021||9730||73.0|
|… with SERL survey data||12,977||97.4|
|… with EPC data||6921||52.0|
|… with weather data||13,321||100.0|
|… with region known||13,321||100.0|
|… with LSOA known||13,321||100.0|
|… with IMD quintile known||13,321||100.0|
|Type||Resolution||Details||Units||Households *||Earliest First Read Date||Mean First Read Date||Mean Availability **||Median Availability **|
|3||Ignore||Invalid read time, no read. The row exists for a different read type that was taken at the wrong time so we do not require a read for this time stamp.|
|2||No meter||The meter does not exist in the DCC system; the read is not actually ‘missing’ because we do not expect it. For example, when there is no gas meter all gas read flags will be ‘2’ for rows that exist because there are electricity reads.|
|1||Valid||The read exists, does not meet any of the other error flag criteria, and has valid_read_time = TRUE; it is thus presumed valid.|
|0||Missing||The read should exist as far as we are aware but it is missing.|
|−1||Max read||The largest number storable on the meter (16,777,215, i.e., 2^24 − 1, which is all 1s in 24-bit binary; the 64-bit equivalent is converted to 32-bit to save memory). Likely due to a technical fault.|
|−2||Very high but not max||Table S5 in the serl_smart_meter_documentation_edition03.pdf (Supplementary File) shows the threshold for flagging a read as larger than we (cautiously) deem plausible, excluding those flagged as ‘max reads’.|
|−3||Negative||Value less than 0 (no occurrences).|
|−4||Elec in kWh||Electricity reads are required to be recorded in Wh but for some meters we believe that the readings have incorrectly been stored in kWh. Some properties do not suffer from this problem for the entire period of collection; the issue may start or stop when a meter is replaced. Due to the difficulty in automatically assessing this issue, any meter with at least 5 rows of electricity data where the daily read is approximately 1/1000th of the sum of the half-hourly reads for that day is flagged with this error for its entire recording period. A new column with unit correction has been included, but since the reads are rounded to the nearest kWh instead of the nearest Wh researchers may wish to exclude such data.|
|−5||Valid read, invalid read time||Originally flagged as valid (1) but valid_read_time = FALSE, so we cannot say over what time period the energy has been recorded. For example, at 15:01 we may wish to keep the read, and assume it is for 14:30–15:01. However, at 15:15 perhaps it is 14:30–15:15 but it may be 15:00–15:30, depending on the previous read. We suggest researchers filter out reads at invalid times.|
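The flag values above lend themselves to simple filtering. The sketch below is illustrative only and is not taken from the SERL code repository: the record layout (a `flag` key per read), the function names and the 10% tolerance used to decide what counts as "approximately 1/1000th" are all assumptions. It keeps only reads flagged as valid (1) and mimics the kind of check described for flag −4.

```python
VALID = 1  # flag value for a valid read at a valid time, per the table above


def valid_reads(rows):
    """Keep only rows whose error flag is 1; 'flag' is a hypothetical key name."""
    return [row for row in rows if row["flag"] == VALID]


def looks_like_kwh(daily_pairs, min_rows=5, rel_tol=0.1):
    """Heuristic in the spirit of flag -4.

    daily_pairs -- iterable of (daily_read, sum_of_half_hourly_reads) per day
    min_rows    -- number of matching days needed (5, as stated in the table)
    rel_tol     -- assumed tolerance for 'approximately 1/1000th'
    Returns True if the daily read is about 1/1000th of the half-hourly sum
    on at least min_rows days.
    """
    hits = 0
    for daily, half_hourly_sum in daily_pairs:
        if half_hourly_sum <= 0:
            continue  # nothing to compare against
        expected = half_hourly_sum / 1000
        if abs(daily - expected) <= rel_tol * expected:
            hits += 1
    return hits >= min_rows


if __name__ == "__main__":
    reads = [
        {"flag": 1, "value": 120},
        {"flag": 0, "value": None},
        {"flag": -5, "value": 95},
    ]
    print(valid_reads(reads))
    print(looks_like_kwh([(10, 10000)] * 5))
```

Filtering on flag 1 alone also drops reads at invalid times (−5), in line with the suggestion in the table.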
|Flag||Half-Hourly Reads||Daily Reads|
|Elec in kWh||0||0||0||0||0||594,690||0|
|Valid but invalid time||228||18||186||18||69,431||6||1260|
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Webborn, E.; Few, J.; McKenna, E.; Elam, S.; Pullinger, M.; Anderson, B.; Shipworth, D.; Oreszczyn, T. The SERL Observatory Dataset: Longitudinal Smart Meter Electricity and Gas Data, Survey, EPC and Climate Data for over 13,000 Households in Great Britain. Energies 2021, 14, 6934. https://doi.org/10.3390/en14216934