# programming

## Non-Intrusive Load Monitoring ToolKit (nilmtk)

Nipun, Oli and I have just started work on an open source toolkit for non-intrusive load monitoring called nilmtk. We're pretty excited about it! It's only in the very, very earliest stages (the code repository currently has precisely zero lines of code in it!) although we've started to flesh out the design on the project's wiki.

## Using computer science to improve environmental sustainability

I believe that computer science geeks have an important role to play in engineering our way out of the various environmental problems we currently find ourselves lumbered with.

This blog post is a non-exhaustive list of interesting subject areas for folks who, like me, are interested in both computer science and environmental sustainability. It aims to provide an (incomplete) answer to the question: how can a computer scientist contribute measurable improvements to our environmental sustainability?

## C++ notes

### makefiles

#### Automatic dependency generation

## rfm_edf_ecomanager code now works with the Arduino IDE

Just a very quick update: my rfm_edf_ecomanager C++ AVR code should now compile within the Arduino IDE.

## Setting up Emacs for Python development

- Ubuntu packages to install: `emacs autocutsel texinfo git mercurial` (git and texinfo are required by el-get; mercurial is required to install pymacs)
- To set the font size for just this session: press `M-:` and then type `(set-face-attribute 'default nil :height 100)` (taken from Stack Overflow)

## Python notes

### Documenting code

- sphinx-apidoc "is a tool for automatic generation of Sphinx sources that, using the autodoc extension, document a whole package in the style of other automatic API documentation tools."
- Math support in Sphinx
- Example documentation markup from An Example PyPi project

### Python libraries

## A new language for mathematical computing: Julia

Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library, mostly written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, FFTs, and string processing.

More info: The Julia Language and Why We Created Julia and A Matlab Programmer's Take on Julia. Sounds pretty awesome.

Incidentally, the third link includes a quote which pretty much exactly captures my current feelings about Matlab:

The Matlab language is slow, it is crufty, and has many idiosyncracies... I strongly disagree, however, with the opinion, common among some circles, that Matlab is to be dismissed just because it is crufty or "not well designed". It is actually a very productive language that is very well suited to numerical computing and algorithm exploration. Cruftiness and slowness are the price we pay for its convenience and flexibility.

I fundamentally disagree with the last statement though. Cruftiness and slowness *should not be the price we pay* for convenience and flexibility. Matlab could have been designed to be *both* high-performance *and* productive. For example: one source of slowness and cruftiness is that objects are usually passed by value, not by reference (yes, I know MATLAB does copy-on-write... which is great... until you want to write to an object). I think that defaulting to pass-by-value is simply a design mistake. Pass by reference wouldn't prevent MATLAB from doing the things it does, and would make it faster.
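To make the distinction concrete, here is a minimal Python sketch (Python, not MATLAB, and it says nothing about MATLAB's internals). Python passes references to objects, so a function can modify the caller's data without any copy; value semantics have to be requested explicitly. The function names `scale_in_place` and `scale_copy` are made up for this illustration.

```python
import copy


def scale_in_place(samples, factor):
    """Mutate the caller's list directly; no copy of the data is made."""
    for i in range(len(samples)):
        samples[i] *= factor


def scale_copy(samples, factor):
    """Emulate value semantics: work on a private copy and return it."""
    result = copy.copy(samples)
    for i in range(len(result)):
        result[i] *= factor
    return result


readings = [1.0, 2.0, 3.0]

scale_in_place(readings, 2)
print(readings)  # [2.0, 4.0, 6.0] (the caller's list was changed)

scaled = scale_copy(readings, 2)
print(readings)  # [2.0, 4.0, 6.0] (unchanged this time)
print(scaled)    # [4.0, 8.0, 12.0]
```

The in-place version is the cheap one for large arrays; the copying version is what pass-by-value effectively costs whenever the function writes to its argument.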

## Concrete example of floating point arithmetic behaving in unexpected ways

I've heard lots of people say that it's best to use a floating point number only when you really need to. During my MSc we learnt about how floating point numbers are encoded and did little pencil-and-paper exercises to demonstrate how decimal fractions are converted into surprisingly odd floating point representations. I've read about computer arithmetic errors causing the failure of a Patriot missile. But the following little problem that I've just bumped into seems to be a very clean, concrete way to demonstrate that floating point numbers are to be handled with care. Here's the example... if I subtract 0.8 from 1, the remainder is 0.2, right? So let's try asking Matlab or C++. Try evaluating the following:

`(1 - 0.8) == 0.2`

This expression will return a boolean. It's simply subtracting 0.8 from 1 and then asking if the answer is equal to 0.2. Rather surprisingly, it returns false. Why? Because neither 0.8 nor 0.2 can be precisely represented in binary floating point. In binary, 0.2 is 0.001100110011..., with the pattern 0011 recurring forever, so it has to be rounded to fit. For example, 0.2 decimal = 3E4CCCCD in 32-bit floating point (hex representation). Now if we convert from binary floating point back to decimal, we get: 3E4CCCCD = 2.0000000298023223876953125E-1. The same thing happens, with a smaller error, in the 64-bit doubles that Matlab and C++ use by default. The rounded result of 1 - 0.8 and the rounded representation of 0.2 end up being two slightly different numbers, so the comparison fails.

(You can learn more about floating point arithmetic on Wikipedia and tinker with this nifty floating point converter applet.)

The bottom line is: if the quantity you're trying to represent *can* easily be represented using integers, then it's probably best to do so. e.g. if you're trying to represent monetary values in C++, and you know you'll only be interested in values of a specific precision (like 0.1 pence) then you could build a simple Money class which internally represents money as integers.
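The hex and decimal figures above can be checked with a few lines of Python. This is just a sketch using the standard `struct` and `decimal` modules; the helper names `float32_hex` and `float32_exact` are invented for this example. Note that Python's own `float` is a 64-bit double, so the final comparison is done in double precision.

```python
import struct
from decimal import Decimal


def float32_hex(value):
    """Return the IEEE 754 single-precision encoding of value as a hex string."""
    return struct.pack(">f", value).hex().upper()


def float32_exact(value):
    """Return the exact decimal value stored once value is rounded to 32 bits."""
    (rounded,) = struct.unpack(">f", struct.pack(">f", value))
    # Decimal(float) converts the binary value exactly, with no further rounding.
    return Decimal(rounded)


print(float32_hex(0.2))    # 3E4CCCCD
print(float32_exact(0.2))  # 0.20000000298023223876953125
print((1 - 0.8) == 0.2)    # False (64-bit doubles have the same problem)
```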

There's lots of good discussion (and links) of the limitations of floating point here: http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems
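The integer-backed Money class suggested above could look roughly like the following. The post talks about C++; this sketch is in Python to keep it short, and the class layout (a single integer counting tenths of a penny, a `from_pounds_and_pence` constructor) is purely an assumption made for illustration, not a recommended design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """An amount of money stored as a whole number of tenths of a penny."""

    tenths_of_pence: int

    @classmethod
    def from_pounds_and_pence(cls, pounds, pence=0):
        """Build an amount from integer pounds and pence (no floats involved)."""
        return cls((pounds * 100 + pence) * 10)

    def __add__(self, other):
        return Money(self.tenths_of_pence + other.tenths_of_pence)

    def __sub__(self, other):
        return Money(self.tenths_of_pence - other.tenths_of_pence)

    def __str__(self):
        sign = "-" if self.tenths_of_pence < 0 else ""
        pounds, rest = divmod(abs(self.tenths_of_pence), 1000)
        return f"{sign}£{pounds}.{rest:03d}"


one_pound = Money.from_pounds_and_pence(1)
eighty_pence = Money.from_pounds_and_pence(0, 80)
twenty_pence = Money.from_pounds_and_pence(0, 20)

print(one_pound - eighty_pence)                  # £0.200
print(one_pound - eighty_pence == twenty_pence)  # True
```

Because every operation is integer addition or subtraction, the 1 - 0.8 comparison that fails with floats comes out exactly right here.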

**Update 18/6/2012**

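I've just learnt that Python can do exact decimal arithmetic with the `decimal` module, as long as the values are built as `Decimal` objects from strings rather than from float literals:

`1 - 0.8`

`0.19999999999999996`

`(1 - 0.8) == 0.2`

`False`

`from decimal import Decimal`

`Decimal('1') - Decimal('0.8')`

`Decimal('0.2')`

`(Decimal('1') - Decimal('0.8')) == Decimal('0.2')`

`True`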

**Update 21/11/2013**

This is a good explanation of the "leakiness" of FP: John D. Cook: Floating point numbers are a leaky abstraction.

## Which programming language for my Disaggregation system? Matlab versus Python; Graphical Models.

Over the course of my PhD, I intend to write a smart meter disaggregation system. Maybe this system will end up as a web service; maybe not. At the very least, it will need to play nicely with existing web services like Pachube. I've been wondering which language(s) I should use to build my system. My current answer to this question is to write a complete prototype of the "backend" in Python, with the front-end written in JavaScript, HTML5 and SVG. It's likely that parts of the "backend" will run rather slowly in Python; but luckily it's easy to get Python to play well with C++ code, so I'd plan to re-write computationally intensive sections in C++.

My initial plan was to use Matlab. But after writing several thousand lines of Matlab, I couldn't help but feel uncomfortable with it. There are some seriously ugly bits of the language, and in general it has a rather "hacked together" feel to it. It turns out I'm not the only one who feels uncomfortable with Matlab: there's a blog called "Abandon MATLAB" with gems like "*[Mathworks] even updated the docs for “getframe” to clarify that you need to turn off the fucking screen saver and walk away from the computer like it’s 1992.*". One especially interesting post in "Abandon MATLAB" links to the results of a survey which compares attitudes to MATLAB with attitudes to Python. Basically, I feel content that I wasn't completely crazy to abandon Matlab in favor of Python and C++. I'll admit that I'm struggling a bit to wrap my head around JavaScript but I'm getting there with the help of Douglas Crockford's excellent book "JavaScript: The Good Parts".