Python
Must read
If you are new to Python, this is a good tutorial.
To keep good coding standards we try to stick to PEP 8 (Python Enhancement Proposal 8, the most widely used Python style guide). If you haven't read it already, give it a try.
Also, it might be a good idea to have a look at this link: Code Like a Pythonista: Idiomatic Python. It is a bit old, but definitely a good reference for coding style, tips and good practices.
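As a small taste of what PEP 8 and idiomatic Python look like in practice, here is a minimal sketch. The function and variable names (mean_read_length, reads, labelled) are made up for illustration and do not come from any of the guides above.

```python
def mean_read_length(reads):
    """Return the average length of the sequences in `reads` (0.0 if empty)."""
    # Empty containers are falsy, so there is no need for len(reads) == 0.
    if not reads:
        return 0.0
    # A generator expression avoids building a temporary list.
    return sum(len(read) for read in reads) / len(reads)


# snake_case for functions and variables, 4-space indentation.
reads = ["ACGT", "ACGTAC", "AC"]

# enumerate() instead of indexing with range(len(reads)).
labelled = [f"{index}:{read}" for index, read in enumerate(reads)]

print(mean_read_length(reads))
print(labelled)
```

Both guides cover many more conventions (imports, line length, docstrings); this only shows the flavor.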
Other good resources are listed in The Hitchhiker's Guide to Python (you can scroll down to where additional learning resources are listed).
A nice list of Python-based resources to "do stuff" is maintained on GitHub under "Awesome Python".
Matplotlib
Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells (à la MATLAB® or Mathematica®), web application servers, and six graphical user interface toolkits.
Reference manual link
Scikit-learn
Machine Learning in Python. Simple and efficient tools for data mining and data analysis.
http://scikit-learn.org/stable/index.html
IDEs
Python code and programs can be written entirely with your favorite text editor (Vim, Emacs, Nano, Notepad, etc.). However, you might also use an Integrated Development Environment (IDE), which facilitates development and helps you in several ways (code styling, refactoring, exploratory analysis, etc.). The main IDEs used in the lab are:
- Canopy: Enthought Canopy is a comprehensive Python analysis environment with easy installation and updates of the proven Enthought Python distribution - all part of a robust platform you can explore, develop and visualize on.
- PyCharm: The intelligent Python IDE with unique code assistance and analysis, for productive Python development on all levels.
IPython Notebook
IPython Notebook is a nice web-based server-client tool for reporting and exploratory analysis (although it takes a very different approach, it could be compared with Canopy). I've successfully set up a locally accessible IPython Notebook server on my machine (cronopio, Mac OS X) and a public instance on a Linux machine (naxos2, CentOS 7). Setting up the server to work locally is pretty straightforward. First you need to install ipython and the notebook extension using pip:
pip install "ipython[notebook]"
Then, you can start the server with:
ipython notebook
and access notebooks with your favorite browser via http://localhost:8888.
Currently, only naxos2 is configured. To start your own instance of the service:
> ipython profile create nbserver
This will create a profile folder under your $HOME/
> cp /data/ipython-nb/ $HOME/.ipython/profile_nbserver/ipython_notebook_config.py
> ipython notebook --profile=nbserver
Access it at https://naxos2:9999; the password is biociphers.
Configuring CentOS7 for Public Notebooks
Setting up a public instance is a little more involved. Here is a good reference to follow, along with some notes from my experience with naxos2 (CentOS 7):
- IPython Notebook has two dependencies that you will most likely need to install yourself:
  - czmq: try sudo yum install czmq-devel
  - tornado: sudo yum install python-tornado
- With those dependencies installed, you should be able to install ipython[notebook].
- Follow the reference instructions (set up a password, create a certificate, and create a profile instance). Note the recommendation to change the default port 8888 to a known one (9999 in the manual).
- Opening ports in CentOS 7.
  - The way to open ports has changed a little from previous CentOS versions. Instead of iptables you will need to use firewall-cmd. To open port 9999:
    - sudo firewall-cmd --zone=public --add-port=9999/tcp --permanent
    - sudo firewall-cmd --reload
- To preview your reports, you will still need a couple of extra packages:
  - pygments: sudo pip install pygments
  - pandoc: sudo yum install pandoc
- Check that you can access the running instance via https://[server-name]:9999
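If the browser cannot reach the server, it helps to know whether the port itself is reachable. The following is a minimal sketch using only the standard library; port_is_open is a hypothetical helper name, and "localhost" is a placeholder for your server name.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host: replace with your server name when testing remotely.
    print(port_is_open("localhost", 9999))
```

A True result only means the firewall lets the connection through and something is listening on the port. It says nothing about the certificate or the notebook password.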
Virtual Environments
If you want to install a specific version of a Python package on any of our machines, you will need to do so in a Python virtual environment.
Setup
python -m venv ./my_environment
. ./my_environment/bin/activate
# Now your python and pip commands will call the executables in ./my_environment/bin/.
pip install -U pip setuptools # Make sure you have the latest versions of pip and setuptools.
pip install {package} {package} ...
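If you are not sure whether the environment is really active, you can ask the interpreter. This is a small sketch that assumes the environment was created with python -m venv as above; in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

Run it with the python command from your shell: after activation, the interpreter path should be inside ./my_environment/bin/.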
Migrating a virtual environment from one machine to another
# On the origin machine
pip freeze > requirements.txt
# Copy this file to the new machine
# On the new machine, run
pip install -U -r requirements.txt
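To confirm that the new machine ended up with the same versions, you can run pip freeze there as well and compare the two files. Here is a minimal sketch under the assumption that both files contain plain name==version lines; parse_pins and diff_pins are hypothetical helpers, and the package versions are sample data.

```python
def parse_pins(text):
    """Map package name -> pinned version for every `name==version` line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not pinned with "==".
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def diff_pins(origin, target):
    """Return {name: (origin_version, target_version)} for every mismatch."""
    names = sorted(set(origin) | set(target))
    return {
        name: (origin.get(name), target.get(name))
        for name in names
        if origin.get(name) != target.get(name)
    }


# Sample data standing in for the contents of two requirements files.
origin = parse_pins("numpy==1.26.4\nscipy==1.11.4\n")
target = parse_pins("numpy==1.26.4\nscipy==1.12.0\npandas==2.2.0\n")
print(diff_pins(origin, target))
```

A version of None means the package is missing on that side. Editable installs and URL-based entries in the freeze output are ignored by this sketch.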