Using IPython notebooks and PyCharm together


IPython notebooks have become an indispensable tool for many Python developers. They are a reasonably good environment for interactive computing, can contain inline data visualisations and can be hosted remotely for sharing results or working together with other developers. In many academic environments, and increasingly in industry, IPython notebooks are used for data visualisation work and exploratory programming, relying on the IPython interactive environment for fast prototyping of ideas.

As nice an environment as we have in IPython, I often wish for the features of a full-fledged IDE. Here at Comperio we use PyCharm a lot, which has excellent code editing, semantic completion, a graphical debugger and efficient code navigation capabilities. In this blog post I’m going to show how you can work on code in both the IDE and an IPython notebook or interactive shell at the same time, while keeping the running notebook and the IDE project in sync.

Hey, PyCharm already has IPython notebook integration. What about that? Personally I find that the IPython notebook integration in the latest PyCharm (version 4.0.6) still isn’t adequate for serious work. You get the completion and code navigation from PyCharm, but editing and navigation are reduced to half a dozen buttons. Furthermore, some functionality such as debugging appears to be plainly non-functional. Regardless, there are other very nice IDEs for Python such as Wing or Eclipse, and the approach here will work equally well with them.

This cunning recipe consists of two spicy ingredients. Both are neat tricks on their own, but together they form a smooth workflow bridging exploratory programming and more structured software engineering. We are going to:

Install our code as an editable Pip package.
Use the IPython autoreload extension to dynamically reload code.

So let’s get cooking!

Editable Pip packages

We are going to organise our code in a Python package and install it with Pip using the -e or --editable option. This installs the package so that it points to our project directory, which means we are always importing the code that we are editing. We could also accomplish this with some hacking on sys.path or PYTHONPATH, but having our code available as a package is a lot more seamless. It makes sense to use virtualenv (or EPD/Anaconda environments) to isolate your system Python from your development packages.

First we create a Python project in PyCharm and add a source folder with a setup.py defining a basic Python package.
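For readers who want to see what such a project might contain, here is a minimal sketch. It writes a bare-bones setup.py and an empty package into a temporary folder. The package name "myproject" and the flat layout are assumptions made for illustration; the original project may well be organised differently.

```python
import os
import tempfile

# Placeholder name for illustration only, not the name of the original project.
PACKAGE = "myproject"

# A bare-bones setup.py: just enough metadata for pip to install the package.
SETUP_PY = '''\
from setuptools import setup, find_packages

setup(
    name="myproject",
    version="0.1",
    packages=find_packages(),
)
'''


def scaffold(root):
    """Create setup.py and an empty package folder under root."""
    os.makedirs(os.path.join(root, PACKAGE))
    with open(os.path.join(root, "setup.py"), "w") as handle:
        handle.write(SETUP_PY)
    # An empty __init__.py is enough to make the folder an importable package.
    open(os.path.join(root, PACKAGE, "__init__.py"), "w").close()
    return sorted(os.listdir(root))


project_root = tempfile.mkdtemp()
print(scaffold(project_root))  # ['myproject', 'setup.py']
```

In practice you would create these files through the IDE; the point is only that a setup.py next to a package folder is all Pip needs.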

Then we create a stub file with the following code in our Python module, and a folder for our notebooks.

def get_page():
    print("Don't know how to do this yet.")


Then we activate our virtualenv/Conda environment and run pip install -e . in the folder containing setup.py.
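A quick way to check that the editable install really points at the project directory is to ask Python where it would import the package from. The helper below is a small sketch using only the standard library; package_location is a name made up for this post, and "json" merely stands in for your own package name.

```python
import importlib.util
import os


def package_location(name):
    """Return the directory a top-level package or module would be imported from."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None  # not installed, or no file on disk (e.g. a built-in module)
    return os.path.dirname(os.path.abspath(spec.origin))


# "json" is a stand-in; pass the name of your own package here.
print(package_location("json"))
```

After an editable install, the path printed for your package should be inside the project folder you have open in the IDE rather than inside site-packages.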

Pip and Git: If you install your package with -e from a Git repository it may think that you want to install from Git even if you’re giving it a file path. This is usually not what you want when you’re developing since you would have to commit your code for the package to update itself. An ugly but practical way to avoid this behavior is to move the .git folder out of the way when installing the package.

%autoreloading code

Now to the important part which is dynamically loading the IDE project into our IPython notebook. Let’s first fire up the notebook.

Start IPython and create a notebook.

You have probably used reload(module) to update the Python environment at runtime. This hardly ever works for more than five minutes and results in an inscrutable mess of old and new stuff in your modules and classes. There is however a bag of neat tricks that take care of at least the majority of the problems around reloading Python code or modules (see http://pyunit.sourceforge.net/notes/reloading.html), and the IPython developers have collected these into their autoreload extension. Let’s look at it in action.
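To make the "old and new stuff" problem concrete, here is a small self-contained sketch. It uses importlib.reload, which is where reload lives in Python 3, and a throwaway module called reload_demo that is written to a temporary folder purely for this demonstration.

```python
import importlib
import os
import sys
import tempfile

# Skip bytecode caching so the quick rewrite below is always picked up from source.
sys.dont_write_bytecode = True


def write_module(directory, body):
    """Write the source of the throwaway reload_demo module."""
    with open(os.path.join(directory, "reload_demo.py"), "w") as handle:
        handle.write(body)


workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)

write_module(workdir, "def get_page():\n    return 'old'\n")
import reload_demo
from reload_demo import get_page  # a second, direct reference to the function

# Simulate editing the file in the IDE, then reload by hand.
write_module(workdir, "def get_page():\n    return 'updated'\n")
importlib.invalidate_caches()
importlib.reload(reload_demo)

via_module = reload_demo.get_page()  # looked up on the reloaded module
via_old_name = get_page()            # still bound to the pre-reload function
print(via_module, via_old_name)      # updated old
```

The name imported with from ... import keeps pointing at the old function object after a plain reload. This kind of stale reference is one of the problems the autoreload extension is designed to paper over.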

Here we set up the autoreload module and import our stub function in the first two cells. In the third we run our function. We then change the function definition in the IDE and save the file.

def get_page():
    print("Hey I'm updated.")


And when we run the function again in cell four the updated code is run.
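The setup cell described above can be sketched roughly as follows. In a notebook you would normally just type the two magics shown in the docstring; the helper wraps the same calls in plain Python so that it also runs harmlessly outside IPython. The mode 2 setting (reload all modules before each execution) is an assumption about the original notebook, and enable_autoreload is a name made up for this post.

```python
def enable_autoreload():
    """Turn on IPython's autoreload extension when running inside IPython.

    Equivalent to typing these two magics in a notebook cell:
        %load_ext autoreload
        %autoreload 2
    Returns True if the extension was enabled, False otherwise.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed
    shell = get_ipython()
    if shell is None:
        return False  # plain Python interpreter, so there are no magics to run
    shell.run_line_magic("load_ext", "autoreload")
    shell.run_line_magic("autoreload", "2")
    return True


enabled = enable_autoreload()
print(enabled)
```

With autoreload enabled, the second cell is an ordinary import of get_page from your package, and the remaining cells simply call get_page() before and after saving the file in the IDE.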

From notebook to project and back

Combining editable Pip packages and the autoreload module, we have a way to seamlessly load our project code in the notebook. When we are ready to move our exploratory programming back to the project, we can move our code over, import any new definitions and refine our implementation while using it in our notebook. In this way we can quickly move from noodling around in the notebook to developing and testing in the IDE, and then move back to the notebook to use our project code in further unstructured meanderings.

In the next post we will demonstrate this in more detail.