Python notes
Installing Emacs on Ubuntu and Python in 2017
- The aim is to use python3 and virtualenv for python development.
sudo apt install emacs25-lucid python-pip python3-dev
- Install elpy
  - copy elpy-profile.el as described in this elpy issue
- Now install Python packages required by elpy. The idea is to deliberately keep these “elpy requirements” separate from the packages required by each Python project. This way, we can update the “elpy packages” across the entire system in one go, and we can list just the packages required by each project using pip freeze --local.
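To make the "list just the project packages" idea concrete, here is a minimal sketch that wraps pip freeze --local and parses its output. The function names (parse_freeze, local_packages) are made up for these notes, and skipping every line that is not a plain name==version pin is my own simplification, not pip behaviour.

```python
import subprocess
import sys


def parse_freeze(text):
    """Parse `pip freeze` output into {package_name: version}.

    Only plain "name==version" lines are kept; anything else
    (editable installs, comments, blank lines) is skipped.
    """
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "-")) or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name] = version
    return packages


def local_packages():
    """Run `pip freeze --local` with the current interpreter and parse it."""
    output = subprocess.check_output(
        [sys.executable, "-m", "pip", "freeze", "--local"],
        universal_newlines=True,
    )
    return parse_freeze(output)
```

Calling local_packages() from inside an activated virtualenv should then return only that project's packages, assuming pip is installed for that interpreter.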
Add this to the end of ~/.profile:
# Python virtualenvwrapper
export PATH="$PATH:$HOME/.local/bin"
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/workingcopies
export VIRTUALENVWRAPPER_VIRTUALENV_ARGS='--system-site-packages'
Add this to the end of ~/.bashrc:
# Python virtualenvwrapper
source $HOME/.local/bin/virtualenvwrapper.sh
Log out, then log back in.
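After logging back in, something like the following could confirm that the settings from ~/.profile were picked up. It is only a sketch: the variable names come from the snippet above, but the helper names (missing_vars, local_bin_on_path) are invented for these notes.

```python
import os

# Variables exported in the ~/.profile snippet above.
EXPECTED_VARS = (
    "VIRTUALENVWRAPPER_PYTHON",
    "WORKON_HOME",
    "PROJECT_HOME",
    "VIRTUALENVWRAPPER_VIRTUALENV_ARGS",
)


def missing_vars(environ=None):
    """Return the expected variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in EXPECTED_VARS if not environ.get(name)]


def local_bin_on_path(environ=None):
    """Check whether $HOME/.local/bin is one of the PATH entries."""
    environ = os.environ if environ is None else environ
    local_bin = os.path.join(environ.get("HOME", ""), ".local", "bin")
    return local_bin in environ.get("PATH", "").split(os.pathsep)


if __name__ == "__main__":
    print("missing variables:", missing_vars())
    print("~/.local/bin on PATH:", local_bin_on_path())
```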
~/.emacs should contain this:
(require 'package)
(add-to-list 'package-archives
'("elpy" . "http://jorgenschaefer.github.io/packages/"))
(custom-set-variables
;; custom-set-variables was added by Custom.
;; If you edit it by hand, you could mess it up, so be careful.
;; Your init file should contain only one such instance.
;; If there is more than one, they won't work right.
'(package-selected-packages (quote (elpy)))
)
;; elpy
(package-initialize)
(elpy-enable)
(elpy-use-ipython)
Documenting code
- sphinx-apidoc “is a tool for automatic generation of Sphinx sources that, using the autodoc extension, document a whole package in the style of other automatic API documentation tools.”
- Math support in Sphinx
- Example documentation markup from An Example PyPi project
Python libraries
- Monary - need to install these Ubuntu packages, then install mongo-c-driver, then create /etc/ld.so.conf.d/libmongoc.conf and write /usr/local/lib in that file. Then run sudo ldconfig. Then run make test. Then do pip install pkgconfig monary.
- Theano - “Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.”
- Pandas - “an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language… Time series-functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging. Even create domain-specific time offsets and join time series without losing data;”
- GPUStats - “gpustats is a PyCUDA-based library implementing functionality similar to that present in scipy.stats. It implements a simple framework for specifying new CUDA kernels and extending existing ones. Here is a (partial) list of target functionality: Probability density functions (pdfs). These are intended to speed up likelihood calculations in particular in Bayesian inference applications, such as in PyMC, Random variable generation using CURAND”
- PyOpenCL
- PyMC - “PyMC is a python module that implements Bayesian statistical models and fitting algorithms, including Markov chain Monte Carlo. Its flexibility and extensibility make it applicable to a large suite of problems. Along with core sampling functionality, PyMC includes methods for summarizing output, plotting, goodness-of-fit and convergence diagnostics.”
- SciKits - “Welcome to SciKits! Here you’ll find a searchable index of add-on toolkits that complement SciPy, a library of scientific computing routines. The SciKits cover a broad spectrum of application domains, including financial computation, audio processing, geosciences, computer vision, engineering, machine learning, medical computing and bioinformatics.”
- scikit-learn: machine learning in Python
- NetworkX - “NetworkX is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.”
Python 2 versus 3
GUIs
- From some very superficial searching, it looks like wxPython is the preferred Python GUI for use with matplotlib (I could be wrong though). One big disadvantage is that wxPython isn’t packaged as standard with Python, whilst Tkinter is.
- wxPython screenshots
- list of Python GUI toolkits
Plotting
- Nice examples by Eli Bendersky (including live data and interactivity) of using matplotlib with wxPython GUIs
- embedding matplotlib plots in GUI apps
Tutorials & videos
- Advanced Statistical Computing at Vanderbilt University’s Department of Biostatistics by Chris Fonnesbeck (lead dev. PyMC)
- Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
- Python Scientific Lecture Notes
- goatbar’s iPython “research tools” videos
- Data analysis in Python with Pandas (3hr video)
- Ubuntu tutorial videos on using Python with GTK and GObject and Glade
Optimised compilers
- Numba: a NumPy-aware optimised compiler for Python (and here’s a Numba vs Cython blog post by jakevdp)
Development tools
- Using iPython from within PyDev (Eclipse)
Statistics and graphical models
- pebl - Python Environment for Bayesian Learning
Installing an up-to-date scientific Python stack on Ubuntu when you do have root permissions
sudo apt-get install python-dev python-pip python-sphinx libzmq-dev python-matplotlib python-scipy
sudo pip install cython pandas pyzmq jinja2 ipython sphinx
libzmq-dev, pyzmq and jinja2 are all required for iPython notebook.
For scipy:
sudo apt-get install libatlas-base-dev gfortran python-pip
sudo pip install scipy
If you get the following error when trying to import scipy
libatlas.so.3: cannot open shared object file: No such file or directory
then run
sudo update-alternatives --config libblas.so.3
and select /usr/lib/atlas-base/atlas/libblas.so.3
(this tip taken from Daniel Nouri’s blog post on libblas)
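A quick way to see whether the stack above actually imports (and to catch shared-library errors like the libatlas one, which surface as ImportError) might look like this. It is a sketch: the helper name failed_imports is invented, and the module names in STACK are the import names I believe correspond to the packages installed above (e.g. pyzmq imports as zmq).

```python
import importlib

# Import names for the packages installed above (pyzmq -> zmq, ipython -> IPython).
STACK = ("Cython", "pandas", "zmq", "jinja2", "IPython", "sphinx", "matplotlib", "scipy")


def failed_imports(module_names=STACK):
    """Try to import each module; return {module_name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as error:
            failures[name] = str(error)
    return failures
```

Running failed_imports() in a fresh interpreter returns an empty dict when everything is importable; otherwise the error messages show which package to look at.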
Installing Python when you don’t have root permissions
./configure --prefix=/data/usr
- Then edit Makefile and add -fPIC to the end of the line that starts CC= (as per this SO answer). -fPIC is required so that libxml2 compiles correctly.
make -j8
make install
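Once the build is installed, the new interpreter can be asked which compiler command it recorded, which gives a rough check that the -fPIC edit made it into the build. This is a sketch under the assumption that sysconfig reflects the edited CC= line; the helper names are invented for these notes.

```python
import shlex
import sys
import sysconfig


def has_flag(command, flag="-fPIC"):
    """Return True if `flag` is one of the words in a compiler command line."""
    if not command:
        return False
    return flag in shlex.split(command)


def interpreter_cc():
    """Compiler command recorded at build time (None on some platforms)."""
    return sysconfig.get_config_var("CC")


if __name__ == "__main__":
    print("prefix:", sys.prefix)  # expected to be /data/usr for the build above
    print("CC:", interpreter_cc())
    print("built with -fPIC:", has_flag(interpreter_cc()))
```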
Install stuff for GTK+ development (when you don’t have root permissions)
- Compile Python as above
- Install libxml2:
  - download libxml2-2.X.Y.tar.gz from xmlsoft.org/libxml2
  ./configure --prefix=/data/usr
  make -j8
  make install
  cd python
  setenv LD_LIBRARY_PATH "/data/usr/lib"
  python setup.py build
  python setup.py install
- Make sure LD_LIBRARY_PATH is set as above
- Follow “Installing from Source” instructions from here (install jhbuild, then install pygobject using jhbuild). Some notes on that process:
  - I added the following two lines to ~/.config/jhbuildrc:
    prefix = "/homes/dk3810/.local/opt"
    modulesets_dir = "/homes/dk3810/.local/modulesets"
  - I copied the *.modules files from releng to /homes/dk3810/.local/modulesets
Profiling
- add the following to ~/.bash_aliases:
  alias profile='python -m cProfile -s time'
  (from SO)
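The same kind of report can be produced from inside Python when only one call needs profiling rather than a whole script. A minimal sketch using the standard cProfile and pstats modules; slow_sum is a toy workload and profile_call is a helper name invented for these notes.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Toy workload so there is something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, report sorted by internal time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "time" is the same sort key as `-s time` in the shell alias.
    stats.sort_stats("time").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum, 100000)
    print(report)
```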
Packaging
- distribute aims to supersede setuptools. distribute is compatible with Python 3, setuptools isn’t. My setup.py files are simple enough that I don’t need to modify anything to allow users to use either setuptools or distribute.
- Building and Distributing Packages with Distribute
- See here for details of where files are installed by pip
- Official Python documentation on modules and packages and directory layout
- pip documentation
- Non-recursive upgrades using pip
Integrating git workflow with the Python package publishing process
- SO: How to configure setup.py to have pip install from GitHub master?
- SO: Automatic version number both in setup.py (setuptools) AND source code?
- Blog post on cberner.com on Git revision numbers for setuptools packages using a simple bash script.
- Blog post on dcreager.net on Extracting setuptools version numbers from your git using a small Python script
- setuptools manual on using “tagging” (but this doesn’t integrate directly with git)
Notes for creating a package
Aims & Overview:
- Upload just the description of the project to pypi using python setup.py register.
- Don’t upload code to pypi. Instead use download_url in setup.py to point to github, e.g.:
  download_url = "https://github.com/JackKelly/rfm_ecomanager_logger/tarball/master#egg=rfm_ecomanager_logger-dev"
- Use git tags to track version numbers.
- Automatically suck these version numbers into Python’s packaging system and also into the project’s __version__ attribute.
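One possible way to get from git tags to a version string is to parse the output of git describe --tags. The sketch below is not taken from any of the linked posts: the function names are invented, and mapping "N commits after the tag" to a .postN suffix is my own assumed convention.

```python
import re
import subprocess

# `git describe --tags` prints either "TAG" or "TAG-N-gSHA" (N commits after TAG).
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+?)(?:-(?P<count>\d+)-g(?P<sha>[0-9a-f]+))?$")


def version_from_describe(described):
    """Convert `git describe --tags` output into a version string."""
    match = _DESCRIBE_RE.match(described.strip())
    if match is None:
        raise ValueError("unrecognised git describe output: %r" % described)
    tag = match.group("tag")
    if tag.startswith("v"):
        tag = tag[1:]  # allow tags written as "v1.2"
    count = match.group("count")
    if count and int(count) > 0:
        return "%s.post%d" % (tag, int(count))
    return tag


def git_version(repo_dir="."):
    """Ask git for the nearest tag of the checkout in repo_dir."""
    described = subprocess.check_output(
        ["git", "describe", "--tags"], cwd=repo_dir, universal_newlines=True
    )
    return version_from_describe(described)
```

The result of git_version() could then be passed as version= in setup.py and also written to the package's __version__, so both come from the same tag.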
Details:
- Setup directory structure etc. as described in The Hitchhiker’s Guide to Packaging and “Dive Into Python 3: Chapter 16, Packaging Python Libraries”
- Use git tag -s VERSION.NUMBER for version numbers and push these tags to github with git push --tags (read tagging to learn how to use git tagging)
- Read blog post on dcreager.net on Extracting setuptools version numbers from your git using a small Python script
- Figure out how to use these version numbers for __version__ as well as for setuptools. See SO: Automatic version number both in setup.py (setuptools) AND source code?
- Might need to use ConfigParser to parse setup.cfg (see this example) to extract the version number from setup.cfg
- Figure out how to point download_url to the correct tag
- It appears that two things are necessary to get upgrading to work correctly: version (in setup.py) needs to increment and download_url needs to point to a URL with #egg=PROJECT-VERSION appended to it (or upload all the files to pypi instead of downloading from github, but duplicating lots of files feels rather ugly)
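The download_url pattern above can be generated rather than hand-edited for each release. A small sketch: the helper name github_download_url and its parameters are invented for these notes, and the URL layout simply copies the tarball/...#egg=... example earlier in this section.

```python
def github_download_url(user, repo, ref, project, version):
    """Build a GitHub tarball URL ending in #egg=PROJECT-VERSION.

    `ref` is a branch or tag name; `project` is the name used in setup.py.
    """
    return "https://github.com/{0}/{1}/tarball/{2}#egg={3}-{4}".format(
        user, repo, ref, project, version
    )


# Hypothetical use in setup.py, assuming the git tag name equals the version:
#   version = "0.2"
#   setup(name="rfm_ecomanager_logger",
#         version=version,
#         download_url=github_download_url(
#             "JackKelly", "rfm_ecomanager_logger", version,
#             "rfm_ecomanager_logger", version))
```

Deriving both version and download_url from the same value keeps the two things that need to change on each release in step.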