April 2017 release

This dataset records the power demand from five houses. In each house we record both the whole-house mains power demand and the power demand of individual appliances, each every six seconds. In three of the five houses (houses 1, 2 and 5) we also record the whole-house voltage and current at 16 kHz.

To download the disaggregated data as zipped CSV files, please download ukdale.zip from the UKERC EDC. It is 3.5 GBytes in size, so it will take a while to download! For other formats, please keep reading...

Each release of the dataset is labelled with the month and year. The most recent (and final) release is for April 2017. UK-DALE now includes 4.3 years of data for house 1.

Paper

The following paper describes the data recording system and the January 2015 release of the dataset. Please cite this paper if you use the dataset or the recording hardware:

Jack Kelly and William Knottenbelt. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Scientific Data 2, Article number:150007, 2015, DOI:10.1038/sdata.2015.7

BibLaTeX

@Article{UK-DALE,
  Title        = {The {UK-DALE} dataset, domestic appliance-level
                  electricity demand and whole-house demand from five {UK} homes},
  Author       = {Jack Kelly and William Knottenbelt},
  Journaltitle = {Scientific Data},
  Year         = {2015},
  Date         = {2015/03/31},
  Number       = {150007},
  Volume       = {2},
  Doi          = {10.1038/sdata.2015.7}
}

BibTeX

@Article{UK-DALE,
  title        = {The {UK-DALE} dataset, domestic appliance-level
                  electricity demand and whole-house demand from five {UK} homes},
  author       = {Jack Kelly and William Knottenbelt},
  journal      = {Scientific Data},
  year         = {2015},
  date         = {2015/03/31},
  number       = {150007},
  volume       = {2},
  doi          = {10.1038/sdata.2015.7}
}

Small correction to the paper

The paper states:

The uncompressed 16 kHz 24-bit files would require 28.8 GBytes per day so we compress the files using the Free Lossless Audio Codec (FLAC) to reduce the storage requirements to ≈ 4.8 GBytes per day.

In fact, the uncompressed 16 kHz 24-bit files require 8.3 GBytes per day, not 28.8 GBytes per day!
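The corrected figure can be checked with a few lines of arithmetic. This is only a sketch: it assumes two channels (voltage and current, as in the stereo FLAC files), counts one GByte as 10^9 bytes, and ignores any file header overhead.

```python
# Back-of-envelope check of the uncompressed data rate.
SAMPLE_RATE_HZ = 16_000           # 16 kHz sampling
BYTES_PER_SAMPLE = 3              # 24 bits per sample
CHANNELS = 2                      # assumed: one voltage channel, one current channel
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * SECONDS_PER_DAY
gbytes_per_day = bytes_per_day / 1e9   # assumed: 1 GByte = 10**9 bytes

print(f"{gbytes_per_day:.1f} GBytes per day")
```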

Also, for some further analysis of the energy used by the individual appliance monitors, and the effect this has on the "proportion of energy submetered", please see this blog post.

Brief description of the data formats available

1 second and 6 second data

All five homes have whole-home power recorded every six seconds, and the appliance-level data is also at six-second resolution. Homes 1, 2 and 5 also have whole-home active power and apparent power at one-second resolution. The six-second and one-second data are stored in CSV files where the first column is the UNIX timestamp.
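As a rough illustration of reading such a file, the sketch below parses rows whose first column is a UNIX timestamp and converts it to a UTC datetime. The function name read_power_rows is hypothetical, the delimiter is an assumption (pass a different one if your files are not comma-separated), and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io
from datetime import datetime, timezone


def read_power_rows(lines, delimiter=","):
    """Yield (utc_datetime, [float, ...]) for each non-empty row.

    The first column is treated as a UNIX timestamp; every remaining
    column is parsed as a float.
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue
        timestamp = datetime.fromtimestamp(float(row[0]), tz=timezone.utc)
        yield timestamp, [float(value) for value in row[1:]]


# Made-up sample rows for illustration only (not real UK-DALE readings).
sample = io.StringIO("1352500095,599\n1352500101,582\n")
rows = list(read_power_rows(sample))
for when, values in rows:
    print(when.isoformat(), values)
```

For a real file, pass an open file object instead of the StringIO, for example open(path, newline="").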

NILMTK HDF5 version

An HDF5 version of the 1-second and 6-second data (for use with NILMTK) is available on the UKERC EDC. See below for how to download it.

Utility meters

Gas and electricity utility meter readings for house 1 are available in two formats:

16 kHz voltage and current from homes 1, 2 and 5

The complete April 2017 version of the 16 kHz dataset occupies 7.6 TBytes.

The 16 kHz data are stored as a sequence of stereo FLAC files ("FLAC" stands for "Free Lossless Audio Codec"). Each FLAC file is about 200 MBytes. One channel is whole-house voltage, the other is whole-house current.

The name of each FLAC file is the UNIX timestamp at the start of the recording for that FLAC file. The underscore in the filename should be interpreted as a decimal mark (i.e. it separates the integer part from the fractional part of the UNIX timestamp).
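A minimal sketch of decoding such a filename follows. The helper name flac_start_time and the example filename are hypothetical; the regular expression simply looks for the first "digits, underscore, digits" group in the file's base name, so it should tolerate a prefix or extension if the real names have one.

```python
import re
from datetime import datetime, timezone
from decimal import Decimal
from pathlib import Path

# Integer part, underscore (acting as the decimal mark), fractional part.
_TIMESTAMP = re.compile(r"(\d+)_(\d+)")


def flac_start_time(filename):
    """Return the recording start time as a Decimal UNIX timestamp."""
    match = _TIMESTAMP.search(Path(filename).name)
    if match is None:
        raise ValueError(f"no '<integer>_<fraction>' timestamp in {filename!r}")
    return Decimal(f"{match.group(1)}.{match.group(2)}")


# Hypothetical filename, used only to illustrate the naming rule.
start = flac_start_time("1352500095_250000.flac")
start_utc = datetime.fromtimestamp(float(start), tz=timezone.utc)
print(start, start_utc.isoformat())
```

Decimal is used so that the fractional digits in the filename are kept exactly; converting to float is only done at the point of building a datetime.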

For more information about the high-frequency data, please see our paper and the snd_card_power_meter GitHub repository (the code we used to record the high-frequency data).

Converting from FLAC files to volts and amps

First you probably want to convert from FLAC (a lossless audio compression format) to WAV. There are many audio tools that can convert from FLAC to WAV. I often use sox.

Once you have the WAV file, you'll need to convert from the [-1,1] range of values in the WAV file to volts and amps. In Python, you can load WAV files using Python's built-in wave module. You'll need the calibration.cfg file for the house in question (found here). This file specifies an amps_per_adc_step parameter and a volts_per_adc_step parameter. To calculate volts from the WAV files, use this formula: volts_per_adc_step × number_of_ADC_steps × value_from_wav_file. The variable number_of_ADC_steps = 2^31 for houses 1 and 2 and number_of_ADC_steps = 2^15 for house 5. Use a similar formula for amps. (The software we use for recording the data used 32-bit integers to capture the audio signal for houses 1 and 2 and 16-bit integers for house 5. Hence, for houses 1 and 2, there are 2^32 ADC steps for the full range from [-1,1] and 2^31 ADC steps for half the range from [0,1] or [-1,0].) You can safely ignore the 'phase_difference' parameter and just assume that the measurement hardware introduces no significant phase shift.
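The formula above can be sketched in Python as follows. All names here (ADC_STEPS, normalise_pcm, to_physical_units) are hypothetical, and EXAMPLE_VOLTS_PER_ADC_STEP is a placeholder rather than a real calibration value; take the real one from the calibration.cfg for the house. Note that Python's wave module returns raw integer PCM frames, so the normalise_pcm helper assumes the conventional scaling of signed integers into [-1, 1) by dividing by half the integer range.

```python
# number_of_ADC_steps per house, as stated in the text above.
ADC_STEPS = {1: 2**31, 2: 2**31, 5: 2**15}


def normalise_pcm(raw_samples, sample_width_bytes):
    """Scale signed integer PCM samples into the range [-1, 1).

    Assumption: samples are signed integers of the given width.
    """
    full_scale = 2 ** (8 * sample_width_bytes - 1)
    return [sample / full_scale for sample in raw_samples]


def to_physical_units(normalised, units_per_adc_step, house):
    """Apply units_per_adc_step * number_of_ADC_steps * value_from_wav_file.

    Use volts_per_adc_step for the voltage channel and
    amps_per_adc_step for the current channel.
    """
    steps = ADC_STEPS[house]
    return [units_per_adc_step * steps * value for value in normalised]


# Placeholder only: NOT a real calibration value.
EXAMPLE_VOLTS_PER_ADC_STEP = 1.0e-7

raw = [0, 2**22, -(2**23)]            # made-up signed 24-bit samples
normalised = normalise_pcm(raw, 3)    # 3 bytes per sample
volts = to_physical_units(normalised, EXAMPLE_VOLTS_PER_ADC_STEP, house=1)
print(normalised, volts)
```

If your audio tool already gives you floating-point values in [-1, 1], skip normalise_pcm and pass those values straight to to_physical_units.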

Download

January 2015 version from the UK Energy Research Council's Energy Data Centre

The UKERC EDC currently holds the Jan 2015 version of UK-DALE. The EDC will soon have the Apr 2017 version too. Please cite the data DOI if you use the dataset!

  • The January 2015 release of the 1 second and 6 second data is available from the UK Energy Research Council's Energy Data Centre using our dataset DOI:10.5286/UKERC.EDC.000001
  • The Jan 2015 release of the 16 kHz data can be downloaded from the UKERC EDC via dataset DOI:10.5286/UKERC.EDC.000002

April 2017 from the UKERC EDC

The April 2017 version of UK-DALE is available from the UKERC EDC. Please cite the dataset DOIs of 10.5286/UKERC.EDC.000003 for the 16 kHz data and 10.5286/UKERC.EDC.000004 for the disaggregated data.

Dataset license

This data is made freely available under Creative Commons Attribution 4.0 International (CC BY 4.0). See more at https://creativecommons.org/licenses/by/4.0/

Change log

April 2017 release

  • House 1 now includes 4.3 years of data (starting on 09/11/2012 22:28:15 GMT and ending on 26/04/2017 18:35:53 BST).
  • The FLAC files have been moved into a directory structure of the form house_1/2015/wk04. This change is required to match the directory structure used by the UKERC EDC.
  • BUG FIX: In previous versions of UK-DALE, the directory storing 16 kHz FLAC files for house_5 incorrectly had some FLAC files which were actually recorded from house_1. These files were recorded from house_1 during 2015, weeks 34-41. These files have been moved to the house_1 directory, where they belong!
  • BUG FIX: In previous versions, the directory storing 16 kHz FLAC files for house_5 contained five WAV files. These have now been converted to FLAC. However, these five files are almost certainly too short. These files are listed in house_5/KNOWN_BAD_FILES.txt
  • The ZIP file in previous versions contained some cruft that wasn't required. For example, it contained a file building1/channel_54.dat which should be ignored. The new ZIP file is nice and clean!

May 2016 release

  • House 1 now includes 3.5 years of data (starting on 09/11/2012 22:28:15 GMT and ending on 13/05/2016 12:11:37 BST).

August 2015 release

  • House 1's FLAC files and 6-second files have been updated to 17th August 2015. So there are now 2.5 years of data for house 1!
  • The fridges, kettles, washing machines, microwaves and dishwashers now have some additional metadata: on_power_threshold, max_power, min_on_duration and min_off_duration. This metadata helps NILMTK's Electric.get_activations() function to extract individual appliance activations.
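To give a feel for how on_power_threshold, min_on_duration and min_off_duration could be used, here is a simplified sketch of activation extraction. It is not NILMTK's implementation: the function find_activations is hypothetical, the durations are assumed to be in seconds, and the sample readings are made up.

```python
def find_activations(samples, on_power_threshold,
                     min_on_duration=0, min_off_duration=0):
    """Return (start_time, end_time) pairs for each appliance activation.

    samples: list of (unix_time, watts) tuples, sorted by time.
    """
    # 1. Collect runs of consecutive samples at or above the threshold.
    runs = []
    start = last = None
    for time, watts in samples:
        if watts >= on_power_threshold:
            if start is None:
                start = time
            last = time
        elif start is not None:
            runs.append((start, last))
            start = None
    if start is not None:
        runs.append((start, last))

    # 2. Merge runs separated by an off period shorter than min_off_duration.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_off_duration:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)

    # 3. Drop activations shorter than min_on_duration.
    return [(s, e) for s, e in merged if e - s >= min_on_duration]


# Made-up six-second readings for illustration only.
watts = [0, 2000, 2000, 0, 2000, 0, 0, 0, 0, 2000, 0]
samples = [(6 * i, w) for i, w in enumerate(watts)]
activations = find_activations(samples, on_power_threshold=1000,
                               min_on_duration=12, min_off_duration=30)
print(activations)
```

In this example the brief dip at 18 s is bridged because it is shorter than min_off_duration, while the isolated spike at 54 s is discarded because it is shorter than min_on_duration.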

January 2015 release

  • We now have five homes in the dataset (up from four in the last release).
  • We have 16 kHz recordings of mains voltage and current from houses 1, 2 and 5. This data is available for download over FTP. In total, we now have 6 TBytes of data (that's compressed)!
  • House 1 has 655 days of recordings and 54 meters installed.
  • The new revision of the paper includes lots of plots describing the data (most plots were produced with NILMTK, and here are the scripts to produce the plots in the paper).
  • The metadata has been updated to include some more information about each house (number of occupants, year the house was built etc).
  • The 'ragged' third column in the IAM .dat files recording button press data has been moved to channel_X_button_press.dat files. So none of the .dat files have ragged columns any more. This should make the data easier to load.

Contact

Email jack@jack-kelly.com, the guy who maintains this dataset.