Not currently working on the energy disaggregation competition

Towards the end of last year, I was lucky enough to have a short postdoc paid for by EDF Energy. The main focus of the postdoc was on looking at ways to design a competition to compare the performance of different disaggregation algorithms. This postdoc finished in January 2017 so I am not currently working on the disaggregation competition (although I strongly believe that finding a good way to compare NILM algorithms is one of the most important unsolved problems in NILM).

Very briefly: the main challenge in designing a NILM competition is getting enough clean, private testing data. It turns out that the performance of NILM algorithms can be quite inconsistent across houses: an algorithm might work well on some houses but badly on others. Also, one of the promising uses of NILM is to identify "extreme" energy behaviour (such as leaving your electric oven on constantly just in case you fancy doing some baking). Identifying "extreme" behaviour is useful because users can save large sums of money with a single, simple change in behaviour. But, by definition, "extreme" behaviour is rare. Hence we need a large testing dataset (maybe 100 houses) to be confident that we're accurately capturing the performance of each algorithm, and that each algorithm can recognise "extreme" energy behaviour. Recording this quantity of real data would be very expensive and time-consuming. Hence we could consider building a high-quality simulator to generate realistic data. But this raises a whole host of additional challenges!
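To make the sample-size intuition concrete, here's a back-of-the-envelope sketch. The 3% prevalence is an illustrative assumption, not a measured figure: if "extreme" behaviour occurs in a fraction p of houses, the chance that a test set of n houses contains at least one such house is 1 - (1 - p)^n.

```python
# Back-of-the-envelope check: if "extreme" behaviour occurs independently
# in a fraction `prevalence` of houses, how likely is a test set of
# n_houses to contain at least one such house?
# P(at least one) = 1 - (1 - p)**n.
# The 3% prevalence is an illustrative assumption, not a measured figure.

def p_at_least_one_extreme(n_houses: int, prevalence: float = 0.03) -> float:
    return 1 - (1 - prevalence) ** n_houses

for n in (10, 20, 100):
    print(f"{n:4d} houses -> P(>=1 extreme house) = {p_at_least_one_extreme(n):.2f}")
```

With these illustrative numbers, a 10-house dataset has only about a 1-in-4 chance of containing even one "extreme" house, whereas 100 houses gets you above 95%.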

Simulating disaggregated electricity data

To do rigorous NILM research, we need lots of high-quality disaggregated electricity data. This is especially true if we want to run a good NILM competition.

There are now 20 public datasets listed on the NILM wiki. But all real datasets suffer from issues that make them problematic for use in a NILM competition. These issues include:

Survey launched: Please help us to design a competition for energy disaggregation algorithms!

We are working on a competition for energy disaggregation algorithms. Please help us to design this competition by filling in this survey!

A competition for energy disaggregation algorithms

Now that I've (finally!) submitted my PhD thesis, I can focus on designing and implementing a competition for energy disaggregation algorithms. EDF Energy have kindly given me post-doc funding from now until the end of December 2016 to work on the NILM competition.

The broad plan is to first consult with the NILM community and create a specification for the NILM competition which works for everyone. Then I plan to implement a web application which can run the NILM competition.

Right now, I'm writing a survey on the design of a competition for energy disaggregation algorithms. The aim of the survey is to systematically collect feedback about the design of the competition. I plan to launch the survey soon. Prior to the launch, I'm really eager to hear feedback on the survey itself. For example: is the survey missing any vital questions? Do some questions not provide sufficient options? Do some questions not make sense?!

Please note that, prior to the launch of the survey, my aim is to get feedback on the design of the survey itself. So please don't actually submit any answers yet! Feel free to select options and click "next" but just please don't click "submit" at the end of the survey. I'll write another blog post when the survey is ready to accept answers.

It's probably best to provide feedback about the survey in public on the relevant thread on the Energy Disaggregation Google Group. If you want your feedback to be private then, by all means, email me directly!

And please do get in touch if you have feedback on any aspect of the proposed NILM competition.

Please help design a competition for energy disaggregation algorithms!

Has disaggregation accuracy improved since the 1980s? Which algorithms are most accurate for a given use-case? Which (if any) use-cases are well served by NILM already?

It's pretty much impossible to answer any of these questions with confidence (unless you only consider the tiny number of algorithms for which you have access to executable code). We can't directly compare published results across papers because, when testing the disaggregation accuracy of NILM algorithms, each paper uses different datasets, different metrics, different pre-processing, etc.
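To illustrate why mixed metrics alone are enough to block comparison, here is a toy sketch: the same pair of disaggregation outputs is ranked differently by an energy-based metric and a state-based metric. Both metrics and all the numbers are illustrative, not taken from any particular paper.

```python
# Toy illustration (hypothetical numbers): the same two disaggregation
# outputs can be ranked differently by two common styles of NILM metric.
# One-reading-per-timestep power traces for a single appliance, in watts.

truth  = [0, 0, 100, 100, 100, 0, 0, 0]
algo_a = [0, 0, 60, 60, 60, 0, 0, 0]     # right on/off times, wrong power
algo_b = [100, 100, 0, 0, 100, 0, 0, 0]  # right total energy, wrong times

def energy_error(est, gt):
    """Relative error in total estimated energy."""
    return abs(sum(est) - sum(gt)) / sum(gt)

def state_accuracy(est, gt, threshold=10):
    """Fraction of timesteps where the on/off state (power > threshold) matches."""
    matches = sum((e > threshold) == (g > threshold) for e, g in zip(est, gt))
    return matches / len(gt)

for name, est in (("A", algo_a), ("B", algo_b)):
    print(name, round(energy_error(est, truth), 2), state_accuracy(est, truth))
# Algorithm B "wins" on energy error (0.0 vs 0.4), while
# Algorithm A "wins" on state accuracy (1.0 vs 0.5).
```

So "algorithm X reports 85%, algorithm Y reports 80%" tells us nothing unless both numbers come from the same metric, on the same data, with the same pre-processing.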

This means that we can't measure progress over time. Nor can we decide which NILM algorithms are most promising and which might be dead-ends.

These are bad problems. Let's work towards fixing them.

Some other machine learning communities have had great success running yearly competitions. For example, the ImageNet "Large Scale Visual Recognition Challenge" has been running yearly since 2010. Some regard this competition as having played a crucial role in the recent dramatic increase in the accuracy of image classification algorithms.

The idea of running a NILM competition has been rumbling around for several years. But designing and implementing a NILM competition is hard. The community uses sample rates ranging from monthly to MHz. No single metric is informative for all use-cases. Collecting ground truth data (the power demand of individual appliances) is expensive and time-consuming.
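The sample-rate point can be made concrete with a tiny sketch (the kettle numbers are illustrative): a short, high-power appliance event that is obvious at one-minute resolution nearly disappears when averaged into an hourly reading, so no single competition dataset can serve every sample rate.

```python
# Sketch (illustrative numbers): why one dataset can't serve every sample
# rate.  A kettle drawing 2 kW for 3 minutes is unmissable at 1-minute
# resolution but is smeared out almost entirely in an hourly average.

minute_watts = [0] * 60
minute_watts[10:13] = [2000, 2000, 2000]  # a 3-minute kettle boil

hourly_mean = sum(minute_watts) / len(minute_watts)
print(hourly_mean)  # 100.0 W: the 2 kW spike becomes a gentle 100 W bump
```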

Maybe we can pull this off. The first step is to decide on a design which will work for everyone.

To give us something concrete to debate, we'll outline one way this could work. This is not meant to be definitive! Think of this as the DNA for a clumsy, inefficient animal 500 million years ago. Together, we need to evolve this design into an elegant, efficient beast, well adapted to its environment.

Please shoot holes in this proposal! What won't work for you? What's impractical? What's unfair? What opens the competition up to cheating? How can we make the competition more attractive to researchers? How can we make the competition more informative for the community? How can we simplify the process?

The draft proposal is available on Google Docs. I've linked to a Google Doc rather than copying-and-pasting the proposal into this post so that we can update the proposal as the discussion develops. Please add your comments to the mailing list discussion, or to the Google Doc (please sign your comments with your name, unless you deliberately want to be anonymous), or, if you want to keep your comment private, email me.

Thanks (in no particular order): Jack, Mario, Oli, Stephen, Grant, Marco, Peter

3rd International Workshop on NILM -- SAVE THE DATE!

Dear NILM researchers,

The 3rd International Workshop on Non-Intrusive Load Monitoring (NILM) will be held in Vancouver, Canada from May 14 to 15, 2016. The venue for the workshop is still under consideration. The previous workshop was held in June 2014 at the University of Texas at Austin in Austin, TX.

Live stream of London NILM Workshop on Weds 8th July

It's almost time for the London NILM Workshop!

We'll be live streaming the event via a Google Hangout On Air. The stream will start at 10:00 BST (UTC+1), but do keep an eye on NILM_Workshop on Twitter for any new links if we encounter technical difficulties on the day!

Announcing the 2015 European NILM workshop

The Second European Workshop on Non-intrusive Load Monitoring will be held on 8th July 2015 at Imperial College London. More details on Oli's blog.

Energy disaggregation online discussion forum

I've finally gotten round to putting together an online discussion forum for energy disaggregation! Feel free to join! Also, while we're finding our feet with this Google Group, please feel free to discuss the Google Group settings on the Energy Disaggregation forum.

MSc project proposal: An online competition for comparing energy disaggregation algorithms

A sizeable challenge in the energy disaggregation community is comparing NILM algorithms from different researchers. In other words, if one paper reports an accuracy of 80% and another reports 85%, we cannot infer that the second algorithm is better, because the authors used different datasets, different pre-processing, etc. Hence we are working on a project proposal for the consideration of Imperial Computer Science MSc students. If a group of students selects the project then they'll work on it for the duration of next term. Here's the full, draft project specification. Comments most welcome!

