Hacking the Current Cost

I have almost 30 Current Cost Individual Appliance Monitors (I need to monitor the power consumption of every appliance in my home for my PhD project).  Unfortunately, I sometimes see drop-outs on a single channel lasting thousands of seconds, which is simply unacceptable.  I see these epic drop-outs even if the IAM is within a meter of its EnviR.  So data is being lost somewhere between the IAM transmitting its packet and the EnviR sending the reading out as XML.  I think the IAMs simply squirt a reading onto the RF carrier every 6 seconds without waiting for a "ping" from the EnviR.  There are two possible places where the packets are being lost:  1) the EnviR drops packets or 2) the packets collide in the air (RF collisions).
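
To put numbers on drop-outs like these, here is a minimal sketch in Python. It assumes you already have the UNIX timestamps of the readings logged for one channel; the function name find_dropouts and the 60 second threshold are purely illustrative choices, and the only figure taken from the text above is the nominal 6 second period.

```python
def find_dropouts(timestamps, max_gap=60):
    """Return (start, duration) for every gap longer than max_gap seconds.

    timestamps: sorted UNIX timestamps (in seconds) of the readings
    received on a single channel.  max_gap is an arbitrary threshold.
    """
    dropouts = []
    for previous, current in zip(timestamps, timestamps[1:]):
        gap = current - previous
        if gap > max_gap:
            dropouts.append((previous, gap))
    return dropouts


# Made-up data: readings every 6 s, then silence for 3000 s.
readings = [0, 6, 12, 18, 3018, 3024]
for start, duration in find_dropouts(readings):
    print(f"drop-out starting at t={start} lasting {duration} s")
```

Running this per channel makes it easy to see whether the long drop-outs cluster on particular IAMs or at particular times.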

1) EnviR drops packets
If the EnviR is busy processing a packet of RF data when a new packet of RF data arrives then maybe it will fail to receive the new packet.  So if two IAMs send packets in quick succession then the second to send will be ignored.  The EnviR's RFM01 receiver module only has a 16 bit buffer so it could easily overflow.  I have experimented with setting two EnviRs to receive data from a single IAM.  Sometimes both EnviRs receive a packet; sometimes only one will receive a packet and sometimes both will drop the packet.  I take this as evidence that sometimes an EnviR will drop a packet because it's too busy.
2) RF collisions
An alternative explanation for the long drop-outs is that some of the failed IAM transmissions are caused by RF collisions.  How likely are RF collisions?  Apparently the Current Cost devices use a 4kbps data rate.  So a single bit takes 1/4000 of a second to transmit and a single byte takes 8/4000 seconds = 2ms.  The RF packets on the CC transmitter are 16 bytes long, so a single packet takes 16 x 2ms = 32ms.  So about 30 packets can fit into a second and 180 can fit into the 6 second gap between IAM transmissions.  Let's make the maths simple and assume that we have 180 discrete time slots per 6 second cycle.  The chance of a single IAM transmitting in any given time slot is 1/180.  If we had only two IAMs then the chance of them sharing a time slot (and hence colliding) is also 1/180: the first IAM can land in any slot and the second just has to pick the same one.  (1/180 x 1/180 = 1/32400 is the chance of both landing in one particular slot.)  But we have 30 IAMs, hence a total of 30-choose-2 = 435 pairs, so we should expect about 435/180 = 2.4 colliding pairs in every 6 second cycle, and any single IAM has a 1 - (179/180)^29 = 15% chance of colliding with one of the other 29 on each transmission; which is far too high for comfort given that I want this logging to run for months and months.  And of course there are several reasons to believe the chance of a collision is even higher: we don't have discrete time slots, so two packets only need to overlap partially to collide.  Ick.
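
A quick way to sanity-check this estimate is to compute it directly under the same simplified model (180 discrete slots, every IAM picking a slot independently and uniformly). One thing to watch: two given IAMs share a slot with probability 1/180, because the first can land anywhere, rather than 1/180 x 1/180. The sketch below is only as good as that slot model, and the function name collision_stats is an illustrative choice.

```python
from math import comb, prod

BIT_RATE = 4000      # bits per second, as quoted above
PACKET_BYTES = 16    # packet length, as quoted above
PERIOD_S = 6         # seconds between transmissions from one IAM

packet_s = PACKET_BYTES * 8 / BIT_RATE   # airtime of one packet in seconds
raw_slots = PERIOD_S / packet_s          # the text rounds this down to 180


def collision_stats(n_iams=30, slots=180):
    """Collision figures for n_iams transmitters choosing among slots."""
    pairs = comb(n_iams, 2)
    # Each pair shares a slot with probability 1/slots.
    expected_colliding_pairs = pairs / slots
    # Birthday problem: probability that at least two IAMs share a slot.
    p_any_collision = 1 - prod(1 - k / slots for k in range(n_iams))
    # Probability that one particular IAM shares its slot with another.
    p_given_iam_collides = 1 - (1 - 1 / slots) ** (n_iams - 1)
    return pairs, expected_colliding_pairs, p_any_collision, p_given_iam_collides


pairs, expected, p_any, p_one = collision_stats()
print(f"packet airtime: {packet_s * 1000:.0f} ms, slots per cycle: {raw_slots:.1f}")
print(f"pairs: {pairs}, expected colliding pairs per cycle: {expected:.2f}")
print(f"P(at least one collision per cycle): {p_any:.1%}")
print(f"P(a given IAM collides per cycle): {p_one:.1%}")
```

Changing n_iams shows how quickly the picture improves with fewer transmitters sharing the channel.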
My plan

First I'm going to assume that the main problem is that the EnviR drops packets because it's too busy.  Hence I want to connect an RF receiver directly to my laptop in order to sniff IAM data directly from the air without having to use an EnviR.  I'm somewhat out of my depth here!  After a bit of googling, I came across this Nanode IRC conversation about sniffing the SPI bus of a Current Cost to reverse engineer their protocol.  I assume I just need a Bus Pirate to sniff the SPI bus of the EnviR to get the initialisation commands the EnviR sends to its on-board RFM01 RF module; and then I can buy an RFM01 module and connect this to the Bus Pirate's SPI bus to communicate directly with the RFM01 from my laptop.

If I find that RF collisions are a major problem then I may investigate the EDF wireless transmitter plugs.  These are similar to the Current Cost IAMs except, crucially, the EDF models use transceivers and not just transmitters.  The EDF Eco Manager base station "pings" each transmitter plug in sequence and the transmitter plug responds within about 20ms.  This should totally avoid RF collisions.  The problem is that I already have 30 Current Cost IAMs!  I'm planning to take one apart to see if there's any possibility of converting it to a transceiver type (the IAMs say "transmitter only" on the back).  If not then I guess I'll have to try to return or eBay my IAMs and buy EDF transmitter plugs.  I'll still have to build my own transceiver because each Eco Manager can only handle 14 transmitter plugs.  If I use multiple Eco Managers then RF collisions will become possible again.

Below are some notes on tools and forums...

SPI to USB converters

Logic Analysers

RF modules

Forum threads and blog posts

Current Cost specs

udev rules for Current Costs

Update Oct 11th 2012

It may not be necessary to add a new udev rule to access a CurrentCost (or program a Nanode). Instead it may just be necessary to add yourself to the dialout group. Haven't tested this on a new installation yet.

udev manages the /dev/ filesystem on a modern Linux machine. When you first connect a Current Cost USB cable to a Linux machine, you may find that the relevant /dev/ttyUSB[0-9] file has not got the correct permissions to allow you to access it as a normal user.

These are the steps that allowed me to connect my Current Cost to my Ubuntu Server 12.04 machine:

  1. I added my username, jack, to the fuse group with the command sudo usermod -a -G fuse jack. (The -a (append) is ESSENTIAL! If you forget the -a then your username will only be in the fuse group. If you accidentally forget the -a then boot into Ubuntu recovery mode, then run a disk check (to mount the filesystem as read/write), then drop into root command line mode, then issue the command usermod -a -G sudo username as per the instructions here and then you can add yourself to the default groups listed here.)
  2. Log out and log back in again for the group changes to take effect (it isn't sufficient to close the terminal and open it again).
  3. I created a new file /etc/udev/rules.d/current-cost.rules. This file contains the following text:
    SUBSYSTEM!="usb_device", ACTION!="add", GOTO="currentcost_rules_end"
    # Current Cost (the following rule should be all on one line)
    ATTRS{idVendor}=="067b", ATTRS{idProduct}=="2303", MODE="660", GROUP:="fuse"
    # Target of the GOTO on the first line
    LABEL="currentcost_rules_end"
  4. After saving this file, I unplugged all Current Cost EnviR monitors from my machine and plugged them in again. (Note that running sudo service udev restart doesn't appear to be sufficient or necessary to start using the new udev rules.)
  5. Check that the permissions have been set correctly by running ls -ld /dev/ttyUSB?. The result should be something like this: crw-rw---- 1 root fuse 188, 0 Aug 10 08:43 /dev/ttyUSB0 (the crucial things to check are that group has read and write permissions and that the group is set to fuse)
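
The check in step 5 can also be scripted. The following is a rough sketch using only the Python standard library; the helper names group_can_read_write and describe_device are illustrative, and the device path /dev/ttyUSB0 is simply the example used above, so adjust it for your machine.

```python
import os
import stat


def group_can_read_write(mode):
    """True if both the group-read and group-write bits are set in mode."""
    wanted = stat.S_IRGRP | stat.S_IWGRP
    return (mode & wanted) == wanted


def describe_device(path):
    """Return (permission string, group name) for a device node (Unix only)."""
    import grp  # only available on Unix-like systems
    info = os.stat(path)
    return stat.filemode(info.st_mode), grp.getgrgid(info.st_gid).gr_name


# MODE="660" on a character device corresponds to this permission string:
print(stat.filemode(stat.S_IFCHR | 0o660))
print(group_can_read_write(0o660), group_can_read_write(0o600))

device = "/dev/ttyUSB0"  # example path from the steps above
if os.path.exists(device):
    print(describe_device(device))
```

If the group reported for the device is not the one named in the udev rule, the rule did not match, which usually means the vendor or product ID is different.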

A quick explanation of the udev rules

Each line is a rule. udev checks the truth of all == and !=; if those checks all succeed then it evaluates the assignment operators = and := (the second of which ensures that the value assigned to that key isn't changed by a subsequent rule). Let's consider the line which starts ATTRS.... This will set MODE="660" (owner and group have read and write permissions) and will set GROUP:="fuse" (:= ensures that the group will not be changed later) for all devices where ATTRS{idVendor}=="067b" and ATTRS{idProduct}=="2303".
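
To make the match/assign distinction concrete, here is a toy Python sketch that splits the rule above into its match keys and its assignment keys. It is emphatically not how udev parses rules (the regular expression only handles the simple key-operator-"value" form used here), and the function name split_rule is made up for illustration.

```python
import re

# key (with optional {attribute}), operator, then a double-quoted value
TOKEN = re.compile(r'(\w+(?:\{\w+\})?)\s*(==|!=|:=|\+=|=)\s*"([^"]*)"')


def split_rule(rule):
    """Split one rule line into (matches, assignments)."""
    matches, assignments = [], []
    for key, op, value in TOKEN.findall(rule):
        if op in ("==", "!="):
            matches.append((key, op, value))       # conditions to test
        else:
            assignments.append((key, op, value))   # applied if all tests pass
    return matches, assignments


rule = 'ATTRS{idVendor}=="067b", ATTRS{idProduct}=="2303", MODE="660", GROUP:="fuse"'
matches, assignments = split_rule(rule)
print("match keys:     ", matches)
print("assignment keys:", assignments)
```

For the Current Cost rule this gives two match keys (the vendor and product IDs) and two assignment keys (MODE and GROUP).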

How do we know which attributes to check for? If we only care about idVendor and idProduct then plug the device in and run lsusb:

jack@lenovo:~/workingcopies/iam_logger$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 004: ID 5986:0241 Acer, Inc BisonCam, NB Pro
Bus 001 Device 006: ID 0bda:0158 Realtek Semiconductor Corp. USB 2.0 multicard reader
Bus 002 Device 026: ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port
Bus 003 Device 002: ID 0a5c:2150 Broadcom Corp. BCM2046 Bluetooth Device

The line Bus 002 Device 026: ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port is my Current Cost USB cable. If we want finer control then run udevadm info --name=/dev/ttyUSB0 --attribute-walk to see a long list of key, value pairs you could use in your udev rules.
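
If you want to pull the vendor and product IDs out of lsusb output programmatically, a small Python sketch along these lines should do. It assumes the output format shown above (Bus NNN Device NNN: ID vvvv:pppp name); the function name parse_lsusb is illustrative and the sample text is copied from the listing above.

```python
import re

LSUSB_LINE = re.compile(
    r"Bus (\d+) Device (\d+): ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\s*(.*)"
)


def parse_lsusb(text):
    """Return one dict per device line found in lsusb-style output."""
    devices = []
    for line in text.splitlines():
        found = LSUSB_LINE.match(line.strip())
        if found:
            bus, device, vendor, product, name = found.groups()
            devices.append({"bus": bus, "device": device, "vendor": vendor,
                            "product": product, "name": name})
    return devices


sample = """\
Bus 001 Device 006: ID 0bda:0158 Realtek Semiconductor Corp. USB 2.0 multicard reader
Bus 002 Device 026: ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port
Bus 003 Device 002: ID 0a5c:2150 Broadcom Corp. BCM2046 Bluetooth Device
"""

# Pick out the PL2303 cable by the IDs used in the udev rule.
for dev in parse_lsusb(sample):
    if (dev["vendor"], dev["product"]) == ("067b", "2303"):
        print(dev["bus"], dev["device"], dev["name"])
```

To run it against the live system rather than the sample, the text could come from subprocess.run(["lsusb"], capture_output=True, text=True).stdout.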

More info:

Windows Notes

Just some random notes about using Windows

Backup batch script

Below is the batch script I use to achieve the following behaviour:

  • Copy files from hard disk to a USB disk.
  • The backup process must never delete a file from either source or destination, except for the Photos directory.
  • The end result is that the destination ends up having more files than the source because files are never deleted by the backup process. But the Photos directory on both disks should be exact copies.
REM tutorial:
REM options:
REM /e    = copy all sub-folders, even empty ones
REM /mir  = mirror the directory tree (like /e plus /purge, so files missing from the source are deleted from the destination)
REM /np   = no progress counter
REM /log: = create a logfile
REM /xo   = exclude older files
REM /xd   = exclude directory
robocopy D:\ I:\Backup\Storage /e /np /log:backup_log.txt /xo /xd $RECYCLE.BIN RECYCLER "System Volume Information" Photos
robocopy D:\Photos I:\Backup\Storage\Photos /e /mir /np /log:backup_photos_log.txt

Setting up Emacs for Python development

  • Ubuntu packages to install: emacs autocutsel texinfo git mercurial (git and texinfo are required by el-get; mercurial is required to install pymacs)
  • To set the font size for just this session: press M-: and then type (set-face-attribute 'default nil :height 100) (taken from Stack Overflow)

Domestic power consumption data on github

I've just put all my existing smart meter data on github.

This dataset isn't especially useful for NILM work yet because I don't have a "ground truth" record of each appliance's state change.  This will change when I install my 24 individual appliance monitors.

Monitoring individual appliances

For some time I've been monitoring my home's aggregate power consumption using a CurrentCost EnviR.  I'm now planning to upgrade my monitoring hardware.  Firstly, I want to install CurrentCost Individual Appliance Monitor plugs on my appliances (£13.33 each).  Secondly, I want to measure aggregate real power, reactive power and voltage using an Open Energy Monitor.

List of appliances to monitor

(each Current Cost ENVI display can only cope with 9 IAMs)


A (livingroom)
  1. TV
  2. Amp
  3. Subwoofer
  4. HTPC
  5. Washing machine
  6. ADSL modem
  7. Livingroom lamp1
  8. Livingroom lamp2
  9. Livingroom lamp3
B (livingroom)
  1. Bedroom1 lamp1
  2. Bedroom1 lamp2
  3. Bedroom2 lamp
  4. Bedroom DAB radio etc
  5. Hair dryer
  6. Hair straighteners
  7. Iron
  8. Hoover
C (in study)
  1. Toaster
  2. Kettle
  3. Coffee Maker / Bread Maker
  4. Microwave
  5. Fridge
  6. Kitchen Radio
  7. Dish washer
  8. Kitchen lamp
D (in study)
  1. Laptop
  2. Desktop
  3. 24" LCD
  4. Office HiFi
  5. Study lamp1 & lamp2 (sharing a plug)
  6. Printer
  7. GigE switch
  8. Fan
  9. Battery charger


Update 21/6/2012

I bought 3 CurrentCost Individual Appliance Monitors to test.  They seem to work well.  My main concern was that the wireless range would be too short to allow me to monitor every appliance in my house but the wireless range seems fine.  Sure, the system drops a few more samples from the wireless monitor that's furthest from the CurrentCost EnviR but the data is entirely usable.  I've modified my Python logging script to handle multiple sensors.

Python notes

Documenting code

Python libraries

    Paper accepted into Imperial College Energy and Performance Colloquium 2012

    My submission to the Imperial College Energy and Performance Colloquium 2012 has been accepted. It's just an extended abstract which briefly outlines some ideas for my PhD research.

    The paper is:

    • Kelly J, Knottenbelt W. Disaggregating Multi-State Appliances from Smart Meter Data. Imperial College Energy and Performance Colloquium. 29 May - 1 June 2012.  PDF


    Smart electricity meters record the aggregate consumption of an entire building.  However, appliance-level information is more useful than aggregate data for a variety of purposes including energy management and load forecasting. Disaggregation aims to decompose an aggregate signal into appliance-by-appliance information.

    Existing disaggregation systems tend to perform well for single-state appliances like toasters but perform less well for multi-state appliances like dish washers and tumble driers.

    In this paper, we propose an expressive probabilistic graphical modelling framework with two main design aims: 1) to represent and disaggregate multi-state appliances and 2) to use as many features from the smart meter signal as possible to maximise disaggregation performance.

    A new language for mathematical computing: Julia

    Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library, mostly written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, FFTs, and string processing. 

    More info: The Julia Language and Why We Created Julia and A Matlab Programmer's Take on Julia.  Sounds pretty awesome.

    Incidentally, the third link includes a quote which pretty much exactly captures my current feelings about Matlab:

    The Matlab language is slow, it is crufty, and has many idiosyncracies... I strongly disagree, however, with the opinion, common among some circles, that Matlab is to be dismissed just because it is crufty or "not well designed". It is actually a very productive language that is very well suited to numerical computing and algorithm exploration. Cruftiness and slowness are the price we pay for its convenience and flexibility.

    I fundamentally disagree with the last statement though.  Cruftiness and slowness should not be the price we pay for convenience and flexibility.  Matlab could've been designed to be both high-performance and productive.  For example: one source of slowness and cruftiness is that objects are usually passed by value, not by reference (yes, I know MATLAB does copy-on-write... which is great... until you want to write to an object).  I think that defaulting to pass-by-value is simply a design mistake.  Pass by reference wouldn't prevent MATLAB from doing the things it does, and would make it faster.

    Awesome stats, machine learning & information theory videos on YouTube

    I'm still very much enjoying the Coursera / Stanford Probabilistic Graphical Models course but occasionally I need to turn to another source to help explain the concepts.  I've just rediscovered MathematicalMonk on YouTube.  He has over 200 videos on machine learning, information theory and stats.  The videos I've sampled so far have been excellent.  Very lucid.

