blog

Summer schools & workshops on smart energy / disaggregation

This is just a stub entry for now... I will flesh it out in coming months.  I aim to list any summer schools, workshops and conferences which are relevant to smart meter disaggregation.

Concrete example of floating point arithmetic behaving in unexpected ways

I've heard lots of people say that it's best to use a floating point number only when you really need to.  During my MSc we learnt about how floating point numbers are encoded and did little pencil-and-paper exercises to demonstrate how decimal fractions are converted into surprisingly odd floating point representations.  I've read about computer arithmetic errors causing the failure of a Patriot missile.  But the following little problem that I've just bumped into seems to be a very clean, concrete way to demonstrate that floating point numbers are to be handled with care.   Here's the example... if I subtract 0.8 from 1, the remainder is 0.2, right?  So let's try asking Matlab or C++.  Try evaluating the following:

(1 - 0.8) == 0.2

This expression will return a boolean.  It's simply subtracting 0.8 from 1 and then asking if the answer is equal to 0.2.   Rather surprisingly, it returns false.  Why?  Because neither 0.8 nor 0.2 can be precisely represented in binary floating point: written in binary, 0.2 is 0.001100110011... with the pattern 0011 recurring forever, so it has to be rounded to fit.  The rounded result of 1 - 0.8 is not the same bit pattern as the rounded 0.2, so the comparison fails.  To see the rounding concretely: 0.2 decimal = 3E4CCCCD in 32-bit floating point (hex representation). Now if we convert from binary floating point back to decimal, we get: 3E4CCCCD = 2.0000000298023223876953125E-1   (You can learn more about floating point arithmetic on Wikipedia and tinker with this nifty floating point converter applet.)  The bottom line is: if the quantity you're trying to represent can easily be represented using integers, then it's probably best to do so.  e.g. if you're trying to represent monetary values in C++, and you know you'll only be interested in values of a specific precision (like 0.1 pence) then you could build a simple Money class which internally represents money as integers.
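The numbers quoted above can be checked with a few lines of Python.  This is only a minimal sketch: the helper names float32_hex and float32_exact are mine, and it assumes a standard Python 3 interpreter, where floats are 64-bit doubles (the same default as Matlab and as unsuffixed literals in C++).  The 32-bit encoding is reproduced by round-tripping through the struct module.

```python
import struct
from decimal import Decimal


def float32_hex(x):
    """Return the IEEE 754 single-precision encoding of x as 8 hex digits."""
    return struct.pack(">f", x).hex().upper()


def float32_exact(x):
    """Return the exact decimal value of x after rounding to single precision."""
    (rounded,) = struct.unpack(">f", struct.pack(">f", x))
    # Decimal(float) converts the binary value exactly, with no further rounding.
    return Decimal(rounded)


print((1 - 0.8) == 0.2)    # False
print(float32_hex(0.2))    # 3E4CCCCD
print(float32_exact(0.2))  # 0.20000000298023223876953125

# Exact values of the two 64-bit doubles being compared; they differ.
print(Decimal(1 - 0.8))
print(Decimal(0.2))
```

The last two lines print the exact values held by the two doubles, which makes it easy to see that they are different numbers even though both are "nearly 0.2".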

There's lots of good discussion (and links) of the limitations of floating point here: http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems
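The integer-backed Money class suggested above might look something like the following.  I've sketched it in Python rather than C++ to keep it short, and everything about it is an assumption for illustration: the class name, the choice of tenths of a penny as the internal unit, the from_pounds_and_pence helper and the "GBP" output format.

```python
class Money:
    """Hypothetical money type storing a whole number of tenths of a penny."""

    UNITS_PER_PENNY = 10
    UNITS_PER_POUND = 100 * UNITS_PER_PENNY

    def __init__(self, units):
        if not isinstance(units, int):
            raise TypeError("units must be an int (tenths of a penny)")
        self.units = units

    @classmethod
    def from_pounds_and_pence(cls, pounds, pence=0):
        """Build a Money value from whole pounds and whole pence."""
        return cls(pounds * cls.UNITS_PER_POUND + pence * cls.UNITS_PER_PENNY)

    def __add__(self, other):
        return Money(self.units + other.units)

    def __sub__(self, other):
        return Money(self.units - other.units)

    def __eq__(self, other):
        return isinstance(other, Money) and self.units == other.units

    def __hash__(self):
        return hash(self.units)

    def __str__(self):
        sign = "-" if self.units < 0 else ""
        pounds, rest = divmod(abs(self.units), self.UNITS_PER_POUND)
        return f"{sign}{pounds}.{rest:03d} GBP"


one_pound = Money.from_pounds_and_pence(1)
eighty_pence = Money.from_pounds_and_pence(0, 80)
twenty_pence = Money.from_pounds_and_pence(0, 20)

print(one_pound - eighty_pence == twenty_pence)  # True
print(one_pound - eighty_pence)                  # 0.200 GBP
```

Because all arithmetic happens on integers, the subtraction is exact and the equality test behaves the way you'd expect.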

Update 18/6/2012

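I've just learnt that Python can cope with decimal numbers if you use the Decimal type from the decimal module.  Note that importing the module isn't enough on its own: ordinary literals like 0.8 are still binary floats, so the values have to be built as Decimal objects (from strings, so that no binary rounding sneaks in):

    >>> 1 - 0.8
    0.19999999999999996
    >>> (1 - 0.8) == 0.2
    False
    >>> from decimal import Decimal
    >>> Decimal('1') - Decimal('0.8')
    Decimal('0.2')
    >>> (Decimal('1') - Decimal('0.8')) == Decimal('0.2')
    True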

Update 21/11/2013

This is a good explanation of the "leakiness" of FP: John D. Cook: Floating point numbers are a leaky abstraction.

Stanford's free online Probabilistic Graphical Models course

Just a very quick note to say I'm a week into Stanford's free online Probabilistic Graphical Models course.  It's really, really good and I'm learning loads (although it does require a fair amount of work).  The online course covers the same content as Stanford's postgraduate PGM course (it's not watered down like Stanford's free online Machine Learning course) and has interesting programming assignments.  Very juicy stuff and it should substantially improve my ability to refine and implement some of my hand-wavy ideas.

This is the first online course I've taken and I'm very impressed.  It seems to be a near-perfect mix of the best bits from "real" lectures and the best bits from studying alone with a textbook, i.e. it's engaging and "human" like a lecture, but you also have the option to pause / rewind (like reading a textbook) to think things through.

Ubuntu notes

Just some notes on useful Ubuntu tweaks

Which programming language for my Disaggregation system? Matlab versus Python; Graphical Models.

Over the course of my PhD, I intend to write a smart meter disaggregation system.  Maybe this system will end up as a web service; maybe not.  At the very least, it will need to play nicely with existing web services like Pachube.  I've been wondering which language(s) I should use to build my system.  My current answer to this question is to write a complete prototype of the "backend" in Python, with the front-end written in JavaScript, HTML5 and SVG.  It's likely that parts of the "backend" will run rather slowly in Python; but luckily it's easy to get Python to play well with C++ code, so I'd plan to re-write computationally intensive sections in C++.

My initial plan was to use Matlab.  But after writing several thousand lines of Matlab, I couldn't help but feel uncomfortable with it.  There are some seriously ugly bits of the language; and in general it has a rather "hacked together" feel to it.  It turns out I'm not the only one who feels uncomfortable with Matlab: there's a blog called "Abandon MATLAB" with gems like "[Mathworks] even updated the docs for “getframe” to clarify that you need to turn off the fucking screen saver and walk away from the computer like it’s 1992.".  One especially interesting post in "Abandon MATLAB" links to the results of a survey which compares attitudes to MATLAB to attitudes to Python.  Basically, I feel content that I wasn't completely crazy to abandon Matlab in favor of Python and C++.  I'll admit that I'm struggling a bit to wrap my head around JavaScript but I'm getting there with the help of Douglas Crockford's excellent book "JavaScript: The Good Parts".

Using iPhone photo GPS coordinates in Google Maps

Say you have a photo taken on an iPhone and you want to find out where it was taken.  How can this be done?  Easy:

  1. Open the photo in an image viewer or editor which lets you view the metadata.  For example, gwenview works well on Ubuntu.
  2. Find the following attributes (I'll give specific values to make this example concrete)
    1. GPS Longitude Reference = West
    2. GPS Latitude Reference = North
    3. GPS Longitude = 0deg 3.19000'
    4. GPS Latitude = 51deg 27.72000'
  3. Given the data above, you'd enter 51 27.72', -0 3.19' into Google Maps.  The minus before the longitude is there because the longitude reference is "West" instead of "East".
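If you'd rather work in signed decimal degrees (which Google Maps also accepts), the conversion from the steps above is just degrees plus minutes divided by 60, negated for South or West.  Here's a minimal sketch in Python; the function name to_decimal_degrees is my own, and it assumes the reference is spelled out as "North", "South", "East" or "West" as in the example metadata.

```python
def to_decimal_degrees(degrees, minutes, reference):
    """Convert degrees + decimal minutes and a compass reference to signed degrees."""
    if reference not in ("North", "South", "East", "West"):
        raise ValueError(f"unexpected GPS reference: {reference!r}")
    value = degrees + minutes / 60.0
    # South latitudes and West longitudes are negative by convention.
    if reference in ("South", "West"):
        value = -value
    return value


latitude = to_decimal_degrees(51, 27.72, "North")
longitude = to_decimal_degrees(0, 3.19, "West")
print(f"{latitude:.5f}, {longitude:.5f}")  # 51.46200, -0.05317
```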

The Geological Record of Ocean Acidification

A rather worrisome paper has just been published in Science magazine. The authors conclude:

[T]he current rate of (mainly fossil fuel) CO2 release stands out as capable of driving a combination and magnitude of ocean geochemical changes potentially unparalleled in at least the last ~300 [million years] of Earth history, raising the possibility that we are entering an unknown territory of marine ecosystem change.

An excellent summary is available on Ars Technica (which is where I first read about this paper).  The paper is B. Hönisch et al, ‘The Geological Record of Ocean Acidification’, Science, vol. 335, no. 6072, pp. 1058–1063, Mar. 2012. DOI: 10.1126/science.1208277.

On the plus side, reading about this research has given me more enthusiasm to finish insulating our bedrooms this weekend!

Where to make notes whilst learning a new programming language

For the past few days I've been teaching myself JavaScript for a PhD project. I'm using the excellent book "JavaScript: The Good Parts" by Douglas Crockford.  To begin with, I took notes in my hand-written notebook.  But that was slow and clunky.  So I started making notes in Google Docs.  But that doesn't have syntax highlighting.  Then it finally dawned on me: the best place to make notes whilst learning a new language is in code!  This feels so blindingly obvious now that I feel dumb mentioning it, but it took me a little while to figure out.  Of course, we all tinker with code snippets whilst learning a new language.  But I'm now trying to get into the habit of creating a new file for each topic, and putting lots of comments in the code to explain each new language feature that I learn.  The code will be my (runnable) notes.

For example, here's my file on the topic of function invocation:

Making graphs for websites and web apps

I've been doing a little research into creating interactive graphs on web pages.  Some quick notes from my research (this isn't meant to be an exhaustive list by any means):
