UK Domestic Appliance-Level Electricity (UK-DALE) dataset
April 2017 release
This dataset records the power demand from five houses. In each house we record both the whole-house mains power demand and the power demand from individual appliances, each every six seconds. In three of the five houses (houses 1, 2 and 5) we also record the whole-house voltage and current at 16 kHz.
To download the disaggregated data as zipped CSV files, please download ukdale.zip from the UKERC EDC. It's 3.5 GBytes in size, so it will take a while to download! Please note that I do not host the dataset; it is hosted by the UKERC EDC. If the download link doesn't work, please check the UKERC EDC status page. If the status page reports issues, please wait until they are resolved. If it doesn't report any issues, please contact the UKERC EDC. For other formats, please keep reading...
Each release of the dataset is labelled with the month and year. The most recent (and final) release is for April 2017. UK-DALE now includes 4.3 years of data for house 1.
Paper
The following paper describes the data recording system and the January 2015 release of the dataset. Please cite this paper if you use the dataset or the recording hardware:
Jack Kelly and William Knottenbelt. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Scientific Data 2, Article number: 150007, 2015. DOI:10.1038/sdata.2015.7
BibLaTeX
@Article{UK-DALE,
  Title        = {The {UK-DALE} dataset, domestic appliance-level electricity demand and whole-house demand from five {UK} homes},
  Author       = {Jack Kelly and William Knottenbelt},
  Journaltitle = {Scientific Data},
  Year         = {2015},
  Date         = {2015-03-31},
  Number       = {150007},
  Volume       = {2},
  Doi          = {10.1038/sdata.2015.7}
}
BibTeX
@Article{UK-DALE,
  title   = {The {UK-DALE} dataset, domestic appliance-level electricity demand and whole-house demand from five {UK} homes},
  author  = {Jack Kelly and William Knottenbelt},
  journal = {Scientific Data},
  year    = {2015},
  date    = {2015/03/31},
  number  = {150007},
  volume  = {2},
  doi     = {10.1038/sdata.2015.7}
}
Small correction to the paper
The paper states:
The uncompressed 16 kHz 24-bit files would require 28.8 GBytes per day so we compress the files using the Free Lossless Audio Codec (FLAC) to reduce the storage requirements to ≈ 4.8 GBytes per day.
In fact, the uncompressed 16 kHz 24-bit files require 8.3 GBytes per day, not 28.8 GBytes per day!
Also, for some further analysis of the energy used by the individual appliance monitors, and the effect this has on the "proportion of energy submetered", please see this blog post.
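The corrected figure of 8.3 GBytes per day can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes two channels (voltage and current, as in the stereo FLAC files) and that 1 GByte means 10^9 bytes.

```python
# Uncompressed size of one day of 16 kHz, 24-bit, two-channel recording.
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 24 // 8       # 24-bit samples
CHANNELS = 2                     # assumption: one voltage + one current channel
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * SECONDS_PER_DAY
gbytes_per_day = bytes_per_day / 1e9  # assumption: 1 GByte = 10^9 bytes

print(f"{gbytes_per_day:.1f} GBytes per day")  # prints "8.3 GBytes per day"
```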
Brief description of the data formats available
1 second and 6 second data
All five homes have whole-home power and appliance-level power recorded every six seconds. Homes 1, 2 and 5 also have whole-home active power and apparent power at 1 second resolution. The six-second and one-second data are stored in CSV files where the first column is the UNIX timestamp.
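As a rough illustration of reading these files, the sketch below parses rows whose first column is a UNIX timestamp and converts that column to a UTC datetime. The function name read_power_file is hypothetical, the comma delimiter is an assumption (pass delimiter=" " if your files turn out to be space-separated), and the two sample rows are made up.

```python
import csv
import io
from datetime import datetime, timezone


def read_power_file(lines, delimiter=","):
    """Yield (UTC datetime, list of float readings) for each non-empty row.

    `lines` is any iterable of text lines, e.g. an open file.
    The delimiter is an assumption; adjust it to match the actual files.
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue
        timestamp = datetime.fromtimestamp(float(row[0]), tz=timezone.utc)
        yield timestamp, [float(value) for value in row[1:]]


# Two made-up rows, six seconds apart, standing in for a real file.
example = io.StringIO("1352500095,599\n1352500101,582\n")
for when, readings in read_power_file(example):
    print(when.isoformat(), readings)
```

For a real file, pass an open file object instead of the StringIO, for example `open(path, newline="")`.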
NILMTK HDF5 version
An HDF5 version of the 1-second and 6-second data (for use with NILMTK) is available on the UKERC EDC. See below for how to download it.
Utility meters
Gas and electricity utility meter readings for house 1 are available in two formats:
16 kHz voltage and current from homes 1, 2 and 5
The complete April 2017 version of the 16kHz dataset occupies 7.6 TBytes.
The 16 kHz data are stored as a sequence of stereo FLAC files ("FLAC" stands for "Free Lossless Audio Codec"). Each FLAC file is about 200 MBytes. One channel is whole-house voltage, the other is whole-house current.
The name of each FLAC file is the UNIX timestamp at the start of the recording for that FLAC file. The underscore in the filename should be interpreted as a decimal mark (i.e. it separates the integer part from the fractional part of the UNIX timestamp).
For more info about the high frequency data, please see our paper and the snd_card_power_meter github repository (the code we used to record the high frequency data.)
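The filename convention described above (underscore as decimal mark) could be decoded along these lines. This is a sketch, not code from the recording software: the function name flac_start_time and the example filename are hypothetical, and the regular expression simply looks for the first "digits_digits" group in the file's base name.

```python
import re
from datetime import datetime, timezone
from pathlib import PurePath

# Integer part, underscore (standing in for the decimal mark), fractional part.
_TIMESTAMP = re.compile(r"(\d+)_(\d+)")


def flac_start_time(filename):
    """Return the recording start time encoded in a FLAC filename, in UTC."""
    match = _TIMESTAMP.search(PurePath(filename).name)
    if match is None:
        raise ValueError(f"no UNIX timestamp found in {filename!r}")
    seconds = float(f"{match.group(1)}.{match.group(2)}")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical filename, used only to show the conversion.
print(flac_start_time("1352500095_250000.flac").isoformat())
```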
Converting from FLAC files to volts and amps
First you probably want to convert from FLAC (a lossless audio compression format) to WAV. There are many audio tools that can convert from FLAC to WAV. I often use sox.
Once you have the WAV file, you'll need to convert from the [-1,1] range of values in the WAV file to volts and amps. In Python, you can load WAV files using Python's built-in wave module. You'll need the calibration.cfg file for the house in question (found here). This file specifies an amps_per_adc_step parameter and a volts_per_adc_step parameter. To calculate volts from the WAV files, use this formula: volts_per_adc_step × number_of_ADC_steps × value_from_wav_file. The variable number_of_ADC_steps = 2^31 for houses 1 and 2 and number_of_ADC_steps = 2^15 for house 5. Use a similar formula for amps. (The software we use for recording the data used 32-bit integers to capture the audio signal for houses 1 and 2 and 16-bit integers for house 5. Hence, for houses 1 and 2, there are 2^32 ADC steps for the full range from [-1,1] and 2^31 ADC steps for half the range from [0,1] or [-1,0].) You can safely ignore the 'phase_difference' parameter and just assume that the measurement hardware introduces no significant phase shift.
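A minimal sketch of that conversion, using only the standard library, might look like the following. Several things here are assumptions rather than facts from the dataset: the function names are hypothetical, the wave module returns raw integer samples (so the code first scales them into the [-1,1] range by dividing by the WAV file's own full-scale value), and the calibration values are expected to be read from calibration.cfg by the caller. Which of the two channels holds voltage and which holds current is not stated here, so check that before relying on the order.

```python
import wave

# number_of_ADC_steps: 2^31 for houses 1 and 2, 2^15 for house 5.
ADC_STEPS = {1: 2**31, 2: 2**31, 5: 2**15}


def read_normalised_channels(wav_file):
    """Read a stereo PCM WAV file; return two lists of floats in [-1, 1).

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        n_channels = wav.getnchannels()
        width = wav.getsampwidth()  # bytes per sample
        raw = wav.readframes(wav.getnframes())
    if n_channels != 2 or width < 2:
        raise ValueError("expected stereo PCM with at least 16 bits per sample")
    full_scale = float(2 ** (8 * width - 1))
    channels = ([], [])
    # Samples are interleaved: channel 0, channel 1, channel 0, ...
    for i in range(0, len(raw) - width + 1, width):
        sample = int.from_bytes(raw[i:i + width], "little", signed=True)
        channels[(i // width) % 2].append(sample / full_scale)
    return channels


def to_physical(values, units_per_adc_step, house):
    """Apply units_per_adc_step x number_of_ADC_steps x value_from_wav_file."""
    scale = units_per_adc_step * ADC_STEPS[house]
    return [scale * value for value in values]


# Usage (the path and the calibration variables are placeholders):
#   first, second = read_normalised_channels("recording.wav")
#   volts = to_physical(first, volts_per_adc_step, house=1)
#   amps = to_physical(second, amps_per_adc_step, house=1)
```

Decoding sample by sample in pure Python is slow for files of this size; it is written this way only to keep the sketch dependency-free and to handle 16-, 24- and 32-bit WAV files with the same code.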
Download
January 2015 version from the UK Energy Research Council's Energy Data Centre
The UKERC EDC currently holds the Jan 2015 version of UK-DALE. The EDC will soon have the Apr 2017 version too. Please cite the data DOI if you use the dataset!
- The January 2015 release of the 1 second and 6 second data are available from the UK Energy Research Council's Energy Data Centre using our dataset DOI:10.5286/UKERC.EDC.000001
- The Jan 2015 release of the 16 kHz data can be downloaded from the UKERC EDC via dataset DOI:10.5286/UKERC.EDC.000002
April 2017 from the UKERC EDC
The April 2017 version of UK-DALE is available from the UKERC EDC. Please cite the dataset DOIs of 10.5286/UKERC.EDC.000003 for the 16 kHz data and 10.5286/UKERC.EDC.000004 for the disaggregated data.
Dataset license
This data is made freely available under Creative Commons Attribution 4.0 International (CC BY 4.0). See more at https://creativecommons.org/licenses/by/4.0/
Change log
April 2017 release
- House 1 now includes 4.3 years of data (starting on 09/11/2012 22:28:15 GMT and ending on 26/04/2017 18:35:53 BST).
- The FLAC files have been moved into a directory structure of the form house_1/2015/wk04. This change is required to match the directory structure used by the UKERC EDC.
- BUG FIX: In previous versions of UK-DALE, the directory storing 16 kHz FLAC files for house_5 incorrectly had some FLAC files which were actually recorded from house_1. These files were recorded from house_1 during 2015, weeks 34-41. These files have been moved to the house_1 directory, where they belong!
- BUG FIX: In previous versions, the directory storing 16 kHz FLAC files for house_5 contained five WAV files. These have now been converted to FLAC. However, these five files are almost certainly too short. These files are listed in house_5/KNOWN_BAD_FILES.txt
- The ZIP file in previous versions contained some cruft that wasn't required. For example, it contained a file building1/channel_54.dat which should be ignored. The new ZIP file is nice and clean!
May 2016 release
- House 1 now includes 3.5 years of data (starting on 09/11/2012 22:28:15 GMT and ending on 13/05/2016 12:11:37 BST).
August 2015 release
- House 1's FLAC files and 6-second files have been updated to 17th August 2015. So there are now 2.5 years of data for house 1!
- The fridges, kettles, washing machines, microwaves and dish washers now have some additional metadata: on_power_threshold, max_power, min_on_duration and min_off_duration. This metadata helps NILMTK's Electric.get_activations() function to extract individual appliance activations.
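To give a feel for how on_power_threshold, min_on_duration and min_off_duration could be used, here is a simplified sketch of threshold-based activation detection. It is not NILMTK's implementation of Electric.get_activations(); the function name find_activations is hypothetical, durations are assumed to be in seconds, and max_power is not used.

```python
def find_activations(samples, on_power_threshold,
                     min_on_duration=0, min_off_duration=0):
    """Return (start, end) timestamps of appliance activations.

    `samples` is an iterable of (unix_timestamp, power) pairs in time order.
    """
    # 1. Collect runs of consecutive samples at or above the threshold.
    runs = []
    start = last_on = None
    for timestamp, power in samples:
        if power >= on_power_threshold:
            if start is None:
                start = timestamp
            last_on = timestamp
        elif start is not None:
            runs.append((start, last_on))
            start = None
    if start is not None:
        runs.append((start, last_on))

    # 2. Merge runs separated by an off period shorter than min_off_duration.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_off_duration:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)

    # 3. Drop activations shorter than min_on_duration.
    return [(s, e) for s, e in merged if e - s >= min_on_duration]


# Made-up six-second samples: two nearby bursts and one isolated blip.
powers = [0, 100, 100, 0, 100, 0, 0, 0, 0, 100, 0]
samples = [(6 * i, p) for i, p in enumerate(powers)]
print(find_activations(samples, 50, min_on_duration=12, min_off_duration=30))
# prints [(6, 24)]
```

The two bursts are merged because the gap between them is shorter than min_off_duration, and the isolated blip is discarded because it is shorter than min_on_duration.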
January 2015 release
- We now have five homes in the dataset (up from four in the last release).
- We have 16 kHz recordings of mains voltage and current from houses 1, 2 and 5. This data is available for download over FTP. In total, we now have 6 TBytes of data (that's compressed)!
- House 1 has 655 days of recordings and 54 meters installed.
- The new revision of the paper includes lots of plots describing the data. Most plots were produced with NILMTK, and here are the scripts to produce the plots in the paper.
- The metadata has been updated to include some more information about each house (number of occupants, year the house was built etc).
- The 'ragged' third column in the IAM .dat files recording button press data has been moved to channel_X_button_press.dat files. So none of the .dat files have ragged columns any more. This should make the data easier to load.
Contact
Email [email protected], the guy who maintains this dataset.