April 2017 release

This dataset records the power demand from five houses. In each house we record both the whole-house mains power demand and the power demand from individual appliances every six seconds. In three of the five houses (houses 1, 2 and 5) we also record the whole-house voltage and current at 16 kHz.

Each release of the dataset is labelled with the month and year. The most recent (and probably final) release is for April 2017. UK-DALE now includes 4.3 years of data for house 1.

Paper

The following paper describes the data recording system and the January 2015 release of the dataset. Please cite this paper if you use the dataset or the recording hardware:

Jack Kelly and William Knottenbelt. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Scientific Data 2, Article number:150007, 2015, DOI:10.1038/sdata.2015.7

BibLaTeX

@Article{UK-DALE,
  Title        = {The {UK-DALE} dataset, domestic appliance-level
                  electricity demand and whole-house demand from five {UK} homes},
  Author       = {Jack Kelly and William Knottenbelt},
  Journaltitle = {Scientific Data},
  Year         = {2015},
  Date         = {2015/03/31},
  Number       = {150007},
  Volume       = {2},
  Doi          = {10.1038/sdata.2015.7}
}

BibTeX

@Article{UK-DALE,
  title        = {The {UK-DALE} dataset, domestic appliance-level
                  electricity demand and whole-house demand from five {UK} homes},
  author       = {Jack Kelly and William Knottenbelt},
  journal      = {Scientific Data},
  year         = {2015},
  date         = {2015/03/31},
  number       = {150007},
  volume       = {2},
  doi          = {10.1038/sdata.2015.7}
}

Small correction to the paper

The paper states:

The uncompressed 16 kHz 24-bit files would require 28.8 GBytes per day so we compress the files using the Free Lossless Audio Codec (FLAC) to reduce the storage requirements to ≈ 4.8 GBytes per day.

In fact, the uncompressed 16 kHz 24-bit files require 8.3 GBytes per day, not 28.8 GBytes per day!
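The corrected figure can be checked with a few lines of arithmetic. This is only a sketch: it assumes two channels (voltage and current, as in the stereo FLAC files), 3 bytes per 24-bit sample, and that a GByte means 10^9 bytes.

```python
SAMPLE_RATE_HZ = 16_000          # samples per second, per channel
CHANNELS = 2                     # assumption: voltage and current
BYTES_PER_SAMPLE = 3             # 24 bits
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE * SECONDS_PER_DAY
gbytes_per_day = bytes_per_day / 1e9   # assumption: 1 GByte = 10^9 bytes

print(f"{gbytes_per_day:.1f} GBytes per day")  # prints "8.3 GBytes per day"
```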

Also, for some further analysis of the energy used by the individual appliance monitors, and the effect this has on the "proportion of energy submetered", please see this blog post.

Brief description of the data formats available

1 second and 6 second data

All five homes have whole-home power and appliance-level power recorded every six seconds. Homes 1, 2 and 5 also have whole-home active power and apparent power at 1 second resolution. The six-second and one-second data are stored in CSV files in which the first column is the UNIX timestamp.
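As a rough illustration of reading these files, the sketch below turns the first column into a UTC datetime. The space delimiter, the meaning of the second column (power in watts) and the sample rows are assumptions made for this example; check the actual files before relying on them.

```python
import csv
import io
from datetime import datetime, timezone


def read_power_rows(lines, delimiter=" "):
    """Yield (UTC datetime, power) pairs from rows of 'timestamp power'.

    The delimiter and the two-column layout are assumptions.
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        timestamp = datetime.fromtimestamp(float(row[0]), tz=timezone.utc)
        yield timestamp, float(row[1])


# Made-up sample rows, six seconds apart (not real UK-DALE readings).
sample = io.StringIO("1352500095 599\n1352500101 582\n")
rows = list(read_power_rows(sample))
for timestamp, power in rows:
    print(timestamp.isoformat(), power)
```

Passing an open file object instead of the io.StringIO sample should work the same way, since csv.reader accepts any iterable of lines.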

NILMTK HDF5 version

An HDF5 version of the 1-second and 6-second data (for use with NILMTK) is available on the UKERC EDC and via my FTP server.

Utility meters

Gas and electricity utility meter readings for house 1 are available in two formats:

16 kHz voltage and current from homes 1, 2 and 5

The complete April 2017 version of the 16 kHz dataset occupies 7.6 TBytes.

The 16 kHz data are stored as a sequence of stereo FLAC files ("FLAC" stands for "Free Lossless Audio Codec"). Each FLAC file is about 200 MBytes. One channel is whole-house voltage, the other is whole-house current.

The name of each FLAC file is the UNIX timestamp at the start of the recording for that FLAC file. The underscore in the filename should be interpreted as a decimal mark (i.e. it separates the integer part from the fractional part of the UNIX timestamp).
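The filename convention could be decoded along these lines. The function name and the example filename are hypothetical. The sketch looks for "digits, underscore, digits" at the end of the file stem, so any text before the digits is ignored.

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def flac_start_time(filename):
    """Return the recording start time (UTC) encoded in a FLAC filename."""
    stem = Path(filename).stem
    match = re.search(r"(\d+)_(\d+)$", stem)
    if match is None:
        raise ValueError(f"no UNIX timestamp found in {filename!r}")
    seconds, fraction = match.groups()
    # The underscore is the decimal mark, so rebuild "seconds.fraction".
    unix_time = float(f"{seconds}.{fraction}")
    return datetime.fromtimestamp(unix_time, tz=timezone.utc)


start = flac_start_time("1352500095_250000.flac")  # hypothetical filename
print(start.isoformat())
```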

For more info about the high frequency data, please see our paper and the snd_card_power_meter GitHub repository (the code we used to record the high frequency data).

Converting from FLAC files to volts and amps

First you probably want to convert from FLAC (a lossless audio compression format) to WAV. There are many audio tools that can convert from FLAC to WAV. I often use sox.

Once you have the WAV file, you'll need to convert from the [-1,1] range of values in the WAV file to volts and amps. You'll need the calibration.cfg file for the house in question. This file specifies an amps_per_adc_step parameter and a volts_per_adc_step parameter. To calculate volts from the WAV files, use this formula: volts_per_adc_step × number_of_ADC_steps × value_from_wav_file. The variable number_of_ADC_steps=2^31 for houses 1 and 2 and number_of_ADC_steps=2^15 for house 5. Use a similar formula for amps. (The software we use for recording the data used 32-bit integers to capture the audio signal for houses 1 and 2 and 16-bit integers for house 5. Hence, for houses 1 and 2, there are 2^32 ADC steps for the full range from [-1,1] and 2^31 ADC steps for half the range from [0,1] or [-1,0].) You can safely ignore the 'phase_difference' parameter and just assume that the measurement hardware introduces no significant phase shift.
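The formula above can be sketched as follows, taking number_of_ADC_steps as 2^31 for houses 1 and 2 and 2^15 for house 5, as the 32-bit and 16-bit integer explanation implies. The function name is hypothetical, and the volts_per_adc_step value is a placeholder rather than a real calibration figure; read the real value from the house's calibration.cfg.

```python
# Half-range ADC step counts per house (32-bit capture vs 16-bit capture).
NUMBER_OF_ADC_STEPS = {1: 2**31, 2: 2**31, 5: 2**15}


def wav_to_physical(value_from_wav_file, units_per_adc_step, house):
    """Scale one WAV sample in [-1, 1] to volts or amps.

    Pass volts_per_adc_step for the voltage channel and
    amps_per_adc_step for the current channel.
    """
    if not -1.0 <= value_from_wav_file <= 1.0:
        raise ValueError("WAV sample must lie in [-1, 1]")
    return units_per_adc_step * NUMBER_OF_ADC_STEPS[house] * value_from_wav_file


volts_per_adc_step = 1.0e-7  # placeholder, NOT a real calibration value
volts = wav_to_physical(0.5, volts_per_adc_step, house=1)
print(volts)
```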

Download

January 2015 version from the UK Energy Research Centre's Energy Data Centre

The UKERC EDC currently holds the Jan 2015 version of UK-DALE. The EDC will soon have the Apr 2017 version too. Please cite the data DOI if you use the dataset!

  • The January 2015 release of the 1 second and 6 second data is available from the UK Energy Research Centre's Energy Data Centre using our dataset DOI:10.5286/UKERC.EDC.000001
  • The Jan 2015 release of the 16 kHz data can be downloaded from the UKERC EDC via dataset DOI:10.5286/UKERC.EDC.000002

April 2017 version from my Imperial FTP server

The April 2017 version of UK-DALE is available from my Imperial FTP server.

The Imperial FTP server will be disconnected some time in Summer 2017. I have transferred the April 2017 version of UK-DALE to the UKERC EDC, and it is partially available already. When my FTP server is disconnected, UK-DALE will only be available from the UKERC EDC.

I'd recommend using a 'proper' FTP client program to connect (because it will give useful error messages and be able to resume downloading if the network connection fails). For example, you could use FileZilla (it's free and open-source). Or, if you're feeling lazy, you could try just pointing your web browser to ftp://achillea.doc.ic.ac.uk:55555/UK-DALE/

You can establish an anonymous FTP connection using these details (please note the unusual port!):

      host: achillea.doc.ic.ac.uk
      port: 55555
  username: anonymous
  password: <your email address>
 directory: /UK-DALE/
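For scripted access, the same connection details could be used with Python's standard ftplib module. This is a minimal sketch: the function name is hypothetical, and the server may no longer respond given the planned disconnection.

```python
from ftplib import FTP

HOST = "achillea.doc.ic.ac.uk"
PORT = 55555            # note the unusual port
DIRECTORY = "/UK-DALE/"


def list_uk_dale(email, timeout=30):
    """Return the names in the top-level UK-DALE directory via anonymous FTP."""
    with FTP(timeout=timeout) as ftp:
        ftp.connect(HOST, PORT)
        ftp.login(user="anonymous", passwd=email)  # password is your email address
        ftp.cwd(DIRECTORY)
        return ftp.nlst()
```

Calling list_uk_dale("you@example.com") would attempt the connection. For large downloads a full FTP client is still the better choice because it can resume interrupted transfers.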

April 2017 from the UKERC EDC

At the time of writing (June 22, 2017), the April 2017 data is still being transferred to the UKERC EDC. The data transferred so far is visible on the EDC.

Dataset license

This data is made freely available under Creative Commons Attribution 4.0 International (CC BY 4.0). See more at https://creativecommons.org/licenses/by/4.0/

Change log

April 2017 release

  • House 1 now includes 4.3 years of data (starting on 09/11/2012 22:28:15 GMT and ending on 26/04/2017 18:35:53 BST).
  • The FLAC files have been moved into a directory structure of the form house_1/2015/wk04. This change is required to match the directory structure used by the UKERC EDC.
  • BUG FIX: In previous versions of UK-DALE, the directory storing 16 kHz FLAC files for house_5 incorrectly had some FLAC files which were actually recorded from house_1. These files were recorded from house_1 during 2015, weeks 34-41. These files have been moved to the house_1 directory, where they belong!
  • BUG FIX: In previous versions, the directory storing 16 kHz FLAC files for house_5 contained five WAV files. These have now been converted to FLAC. However, these five files are almost certainly too short. These files are listed in house_5/KNOWN_BAD_FILES.txt
  • The ZIP file in previous versions contained some cruft that wasn't required. For example, it contained a file building1/channel_54.dat which should be ignored. The new ZIP file is nice and clean!

May 2016 release

  • House 1 now includes 3.5 years of data (starting on 09/11/2012 22:28:15 GMT and ending on 13/05/2016 12:11:37 BST).

August 2015 release

  • House 1's FLAC files and 6-second files have been updated to 17th August 2015. So there are now 2.5 years of data for house 1!
  • The fridges, kettles, washing machines, microwaves and dish washers now have some additional metadata: on_power_threshold, max_power, min_on_duration and min_off_duration. This metadata helps NILMTK's Electric.get_activations() function to extract individual appliance activations.
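To give a feel for how on_power_threshold, min_on_duration and min_off_duration might be used, here is a simplified sketch in plain Python. It is not NILMTK's implementation of Electric.get_activations(). The function name, the choice of watts and seconds as units, and the sample readings are all assumptions for illustration.

```python
def find_activations(samples, on_power_threshold,
                     min_on_duration=0, min_off_duration=0):
    """Return (start, end) timestamps of periods when an appliance looks on.

    samples: iterable of (timestamp_seconds, power_watts), sorted by time.
    """
    # 1. Collect runs of consecutive samples at or above the threshold.
    runs = []
    current = None
    for timestamp, power in samples:
        if power >= on_power_threshold:
            if current is None:
                current = [timestamp, timestamp]
            else:
                current[1] = timestamp
        elif current is not None:
            runs.append(current)
            current = None
    if current is not None:
        runs.append(current)

    # 2. Merge runs separated by an off period shorter than min_off_duration.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_off_duration:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    # 3. Drop activations shorter than min_on_duration.
    return [(start, end) for start, end in merged
            if end - start >= min_on_duration]


# Made-up six-second readings (not real UK-DALE data).
readings = [(0, 1), (6, 90), (12, 95), (18, 2), (24, 92), (30, 91),
            (36, 1), (42, 1), (48, 1), (54, 85), (60, 0)]
activations = find_activations(readings, on_power_threshold=50,
                               min_on_duration=6, min_off_duration=18)
print(activations)  # prints [(6, 30)]
```

In this example the brief dip at t=18 is bridged because the off period is shorter than min_off_duration, and the single-sample spike at t=54 is discarded because it is shorter than min_on_duration.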

January 2015 release

  • We now have five homes in the dataset (up from four in the last release).
  • We have 16 kHz recordings of mains voltage and current from houses 1, 2 and 5. This data is available for download over FTP. In total, we now have 6 TBytes of data (that's compressed)!
  • House 1 has 655 days of recordings and 54 meters installed.
  • The new revision of the paper includes lots of plots describing the data. Most plots were produced with NILMTK, and the scripts that produce the plots in the paper are available.
  • The metadata has been updated to include some more information about each house (number of occupants, year the house was built etc).
  • The 'ragged' third column in the IAM .dat files recording button press data has been moved to channel_X_button_press.dat files. So none of the .dat files have ragged columns any more. This should make the data easier to load.

Contact

Email jack@jack-kelly.com, the guy who maintains this dataset.