Getting Started With the Neo4j Graph Database – Linking Neo4j and Jupyter SciPy Docker Containers Using Docker Compose

Pondering the Sunday Times Panama Papers directors/companies database yesterday (Panama Papers, Quick Start in SQLite3), I thought it was about time I got my head round using a graph database to store this sort of relational information.

Getting started with the neo4j database has been on my to-do list for some time, so when I was pointed to a post on the neo4j blog about Analyzing the Panama Papers with Neo4j: Data Models, Queries & More, I thought I should give it a go.

So here’s a quick start to the first part: getting a working environment up and running. A quick search turned up a set of examples by Nicole White showing how to get started with Neo4j from Jupyter notebooks, so I opted for a notebook/neo4j combination. The following docker-compose.yml file fires up a notebook server and a neo4j server in separate docker containers and links them together:

neo4j:
  image: kbastani/docker-neo4j:latest
  ports:
    - "7474:7474"
    - "1337:1337"
  volumes:
    - /opt/data

jupyterscipy:
  image: jupyter/scipy-notebook
  ports:
    - "8888:8888"
  links:
    - neo4j:neo4j
  volumes:
    - .:/home/jovyan/work

Launching the docker CLI from Kitematic, I can cd into the directory containing the docker-compose.yml file and run the command docker-compose up -d to download and launch the containers.

In the browser, launch a Jupyter terminal and pull down the example notebooks by running the command:

git clone https://github.com/nicolewhite/neo4j-jupyter.git

The notebooks will be downloaded into the folder neo4j-jupyter. Still in the terminal, create a figure directory inside that folder, as required by the hello-world.ipynb notebook:

mkdir -p neo4j-jupyter/figure

You’ll also need to install the py2neo Python package. By default, this will be installed into the Python 3 path:

pip install py2neo

The example notebooks run in a Python 2 kernel, so we need to install the package into that environment too:

source activate python2
pip install py2neo
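
If you want to check that the install reached the kernel a notebook is actually using, something like the following can be run in a notebook cell. This is just a minimal sketch: the package_available helper is a name I've made up for illustration, and it only tries an import, so it should behave the same in the Python 2 and Python 3 kernels.

```python
import importlib
import sys


def package_available(name):
    """Return True if the interpreter running this code can import `name`."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Show which interpreter the current kernel is running on,
# then report whether py2neo can be imported from it.
print(sys.executable)
print("py2neo available: %s" % package_available("py2neo"))
```

If it reports False in the Python 2 kernel, the package probably went into the wrong environment and the source activate python2 step needs repeating.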

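Now you should be able to run the example notebooks. One thing to note though: you will need to change the connection details for the neo4j database slightly, because the database is running in a separate container that the notebook reaches via the neo4j alias set in the links section of the docker-compose.yml file. In the appropriate notebook code cells, change the default graph = Graph() connection to: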

graph = Graph("http://neo4j:7474/db/data/")
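
If the connection fails, it can help to check whether the notebook container can reach the neo4j container at all before blaming py2neo. Here's a rough sketch using only the standard library; the neo4j_data_url and port_open helpers are names I've made up for illustration, and the defaults assume the neo4j alias and port 7474 from the docker-compose.yml file above.

```python
import socket


def neo4j_data_url(host="neo4j", port=7474):
    """Build the REST endpoint URL in the form passed to Graph() above."""
    return "http://%s:%d/db/data/" % (host, port)


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Covers refused connections, timeouts and unresolvable hostnames.
        return False
    conn.close()
    return True
```

In a notebook cell, port_open("neo4j", 7474) should come back True if the containers are linked properly, in which case Graph(neo4j_data_url()) is the same connection as the line above. Note that this only tests that the port accepts connections, not that neo4j is fully up and answering queries.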

I’ve run out of time to do any more just now. In the next post on this topic, I’ll see if I can work out how to get the Sunday Times data into neo4j.

5 comments

  1. monette5

    Really interesting. I’ll have to put it down for today because of a localhost connection error that I can’t figure out, but I’ll definitely come back to it :)

    • Tony Hirst

      Monette5: don’t use localhost. If the issue is reaching neo4j from the notebook, use the /neo4j/ alias. If the issue is opening the notebook itself, use the IP address assigned to the container, as listed via e.g. Kitematic, or run something like /docker-machine env default/ to look up the environment variables (including the IP address) for the docker virtual machine.