Building a simple Neo4j client for Clojure

So we had this idea of using a graph database, more precisely Neo4j, for a new piece of software we were designing at Full Spectrum IVS, even though we didn’t have any previous experience with it.

This led to the making of what we think is a simple, idiomatic Clojure client for Neo4j that sends CYPHER queries over the Bolt protocol.

Getting our feet wet

Due to our lack of experience with graph databases, we first had to get our heads around the basics and then the actual inner workings of Neo4j and its query language CYPHER.

This alone was a rather large undertaking, especially due to the complexity of the graph we had in mind for the project.

After several workshops and some PoCs, we came to the conclusion that the Neo4j database was a match made in heaven for our application, and then the real work started.

We were already primarily using Clojure at this time, and on the official Neo4j web page I could see two projects that enabled the use of Neo4j from within Clojure. Both are described below.

Neocons

So I started out testing the more mature (based on the project’s age) client of the two. The client is called Neocons and is written by the very talented developer Michael Klishin, who is also the man associated with projects like Monger and Langohr.

First of all I have to say that Neocons worked perfectly. It was a breeze to set up a connection and start querying the database. It is also fantastic if you want to use REST to communicate with Neo4j.

So why didn’t we just go with Neocons for our project? Well, there were some limitations, such as the lack of support for labels in the helper functions, and we also wanted to use the Bolt protocol, which is only sparsely implemented in Neocons.

neo4j-clj

So next up was the neo4j-clj client from Gorillalabs. Again talented people at work here. neo4j-clj was, like Neocons, a breeze to get started with, and it actually had better support for Bolt. It also had some nice extra features missing in Neocons, such as converting the result into Clojure maps.

One of the problems we had with neo4j-clj was that everything was done as raw queries, whether you had to create or fetch data from the database. Even though neo4j-clj offers a defquery macro which allows you to define reusable queries, it still felt a bit too low level for our usage.

Conclusion on existing projects

So after testing the existing clients, we came to the conclusion that we liked the basics from Neocons and neo4j-clj, but we wanted a more “Clojurish” way of doing things and a higher level of abstraction. So this became the start of our Neo4clj client.

The making of Neo4clj

The next part of this post describes some of the process we went through in making the Neo4clj client and gives an example of its usage.

Acknowledgements

First of all I would like to acknowledge the work of the two other clients described above and the inspiration they provided for the basic implementation of working with Bolt and Neo4j from within Clojure.

The Vision

Our vision for our own Neo4j client is listed below.

  • Needs to feel like a part of Clojure
  • Allow the user to represent data in best practice Clojure structures
  • Have basic CRUD operations
  • Support transactions

This was our vision, and we tried to accomplish all of it in the initial version. In retrospect, we should probably have reduced the functionality and split the development into smaller iterations.

Data representations

We quickly found out we needed some kind of data representation for the nodes and relationships in Neo4j. So we started out trying to identify the different data existing on the entities. This resulted in the following representations:

Node

{:id 34
 :labels [:example-node :first-level]
 :props {:property-1 123
         :property-2 "something"}}

Relationship

{:id 12
 :type :example-relationship
 :from 15
 :to 1
 :props {:property-1 123
         :property-2 "something"}}

With these two representations we had an idea of how we wanted the end result to look. We then started working on the basic CRUD operations.
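To make the two shapes concrete, here is a minimal sketch of predicates that check whether a map looks like one of the representations above. The names node? and relationship? are hypothetical helpers made up for this post and are not part of Neo4clj.

```clojure
;; Hypothetical helpers, not part of Neo4clj: they only check that a map
;; has the shape of the node and relationship representations shown above.

(defn node?
  "True when m has an integer :id, a vector of keyword :labels
  and a map of :props."
  [m]
  (and (map? m)
       (integer? (:id m))
       (vector? (:labels m))
       (every? keyword? (:labels m))
       (map? (:props m))))

(defn relationship?
  "True when m has an integer :id, a keyword :type, integer :from and :to
  node ids and a map of :props."
  [m]
  (and (map? m)
       (integer? (:id m))
       (keyword? (:type m))
       (integer? (:from m))
       (integer? (:to m))
       (map? (:props m))))

(def example-node
  {:id 34
   :labels [:example-node :first-level]
   :props {:property-1 123
           :property-2 "something"}})

(def example-relationship
  {:id 12
   :type :example-relationship
   :from 15
   :to 1
   :props {:property-1 123
           :property-2 "something"}})

(node? example-node)                 ;; => true
(relationship? example-relationship) ;; => true
(node? example-relationship)         ;; => false, it has no :labels vector
```

Because both representations are plain maps, checks like these need nothing beyond clojure.core.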

Basic CRUD operations

To make the client easy to use for people not used to writing CYPHER queries, we wanted to be able to do basic CRUD operations on nodes and relationships through simple Clojure functions using our defined data representations.

Later in the process we extended the functionality to not only work on single entities, but to work on whole graphs as well. This allowed us to create nodes and relationships in a single CYPHER query as well as fetch simple graphs directly by using our data representations.

During our work on the CRUD operations we ran into several problems, some of which are described in more detail below.

Getting the right feel and following style guides

While working on the CRUD operations, we found out we needed to do some conversion between our Clojure data representations and the generated CYPHER queries. In doing so, we decided to generate CYPHER that conforms to the CYPHER style guide.

This ensured we got the right feel, not only in Clojure but also in CYPHER, which made debugging messages much easier to read.

Below is an example of a call to create-node! and the corresponding CYPHER query generated by the client.

(client/create-node! conn {:ref-id "p"
                           :labels [:person]
                           :props {:first-name "Neo"
                                   :last-name "Anderson"}})

CREATE (p:Person {firstName: 'Neo', lastName: 'Anderson'}) RETURN p

As you can see above, all keywords have been converted to their CYPHER equivalents based on the CYPHER style guide, and the values have been converted to match the corresponding CYPHER representations.

The names don’t need to be keywords. Strings are converted in the same way when going from Clojure to CYPHER through the client.
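The conversion described above can be sketched roughly as follows. This is not the client's actual implementation; every function name here is made up for illustration. It assumes labels become PascalCase and property keys become camelCase, as in the generated query shown earlier, and that relationship types become upper snake case, as the CYPHER style guide recommends.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helpers for illustration; these are not Neo4clj's internals.

(defn words
  "Splits a keyword or string such as :first-name into its dash-separated parts."
  [x]
  (str/split (name x) #"-"))

(defn ->pascal-case
  "Label style: :person -> \"Person\"."
  [x]
  (apply str (map str/capitalize (words x))))

(defn ->camel-case
  "Property key style: :first-name -> \"firstName\"."
  [x]
  (let [[w & ws] (words x)]
    (apply str w (map str/capitalize ws))))

(defn ->upper-snake-case
  "Relationship type style: :works-at -> \"WORKS_AT\"."
  [x]
  (str/upper-case (str/replace (name x) "-" "_")))

(defn ->cypher-value
  "Renders a Clojure value as a CYPHER literal. Strings are single-quoted
  with backslashes and quotes escaped; everything else goes through str."
  [v]
  (cond
    (nil? v)    "null"
    (string? v) (str "'"
                     (-> v
                         (str/replace "\\" "\\\\")
                         (str/replace "'" "\\'"))
                     "'")
    :else       (str v)))

(defn create-node-query
  "Builds a CREATE query from a node representation with a :ref-id."
  [{:keys [ref-id labels props]}]
  (let [label-str (apply str (map #(str ":" (->pascal-case %)) labels))
        prop-str  (str/join ", "
                            (map (fn [[k v]]
                                   (str (->camel-case k) ": " (->cypher-value v)))
                                 props))]
    (str "CREATE (" ref-id label-str " {" prop-str "}) RETURN " ref-id)))

(create-node-query {:ref-id "p"
                    :labels [:person]
                    :props {:first-name "Neo"
                            :last-name "Anderson"}})
;; => "CREATE (p:Person {firstName: 'Neo', lastName: 'Anderson'}) RETURN p"
```

Since name works on both keywords and strings, "first-name" and :first-name end up as the same firstName property key in this sketch.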

Referencing entities

In the beginning, when we only supported single entities, we had hard-coded the aliases used in the queries. As we extended the client to also work with complete graphs, we found it made more sense to have a :ref-id key on the data representations.

The addition of the reference identifier also paved the way for a new data representation type we call lookup. This is a way of describing a node by several optional values such as id, labels or properties. We even added simple boolean algebra to it allowing and/or logic in the properties.

An example of a data representation to find all nodes with the label :person where the first name is either Erik or Kasper is given below.

{:ref-id "p"
 :labels [:person]
 :props [{:first-name "Erik"}
         {:first-name "Kasper"}]}
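To illustrate how such a lookup could map to CYPHER, here is a rough sketch. It reads a vector of property maps as alternatives joined with OR, matching the example above, and assumes that the entries inside a single map are joined with AND. The function names and the exact query text are made up for illustration and are not taken from the client.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helpers for illustration; these are not Neo4clj's internals.

(defn pascal-case [x]
  (apply str (map str/capitalize (str/split (name x) #"-"))))

(defn camel-case [x]
  (let [[w & ws] (str/split (name x) #"-")]
    (apply str w (map str/capitalize ws))))

(defn cypher-value
  "Single-quotes strings; no escaping is done in this sketch."
  [v]
  (if (string? v) (str "'" v "'") (str v)))

(defn and-clause
  "All entries of one property map must match (assumed AND semantics)."
  [ref-id props]
  (str/join " AND "
            (map (fn [[k v]]
                   (str ref-id "." (camel-case k) " = " (cypher-value v)))
                 props)))

(defn where-clause
  "props is either a single map, or a vector of maps where any one may match."
  [ref-id props]
  (if (map? props)
    (and-clause ref-id props)
    (str/join " OR " (map #(str "(" (and-clause ref-id %) ")") props))))

(defn lookup-query
  "Builds a MATCH query from a lookup representation."
  [{:keys [ref-id labels props]}]
  (str "MATCH (" ref-id (apply str (map #(str ":" (pascal-case %)) labels)) ")"
       (when (seq props)
         (str " WHERE " (where-clause ref-id props)))
       " RETURN " ref-id))

(lookup-query {:ref-id "p"
               :labels [:person]
               :props [{:first-name "Erik"}
                       {:first-name "Kasper"}]})
;; => "MATCH (p:Person) WHERE (p.firstName = 'Erik') OR (p.firstName = 'Kasper') RETURN p"
```

Wrapping each alternative in parentheses keeps the AND groups intact when a map in the vector has more than one entry.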

Initial version limitations

The initial version has basic CRUD operations for nodes, relationships and simple graphs, although some of the operations, such as delete and update, are very simple and shared between the two entity types.

Also it still requires some work from the developer to use returned instances directly in new operations, due to missing reference ids and requirements of specific representations in the different operations.

How does it work

Now for the part everyone has been waiting for: how do we use it? Well, at the time of writing we have released a 1.0.0-SNAPSHOT version, and its basic functionality is described below. For more information, please refer to the examples in the project’s documentation.

Example

To follow this example please add neo4clj to your project file by adding the following line to dependencies.

[fullspectrum/neo4clj "1.0.0-SNAPSHOT"]

The example below is run from the default user namespace, and the calls assume the Neo4clj client namespace has already been required under the alias client. All code executed in the REPL starts with user>, while the result of the code is shown directly after.

First we connect to Neo4j on localhost with basic authentication. We then create a single node and afterwards fetch it from the Neo4j database again, storing it in the var claus.

user> (def conn (client/connect "bolt://localhost:7687" "neo4j" "password"))

#'user/conn

user> (client/create-node! conn {:ref-id "p"
                                 :labels [:person]
                                 :props {:first-name "Claus"
                                         :last-name "Engel-Christensen"}})

{:labels (:person),
 :id 60,
 :props {:first-name "Claus", :last-name "Engel-Christensen"}}

user> (def claus (first (client/find-nodes! conn {:ref-id "p"
                                                  :labels [:person]
                                                  :props {:first-name "Claus"}})))

#'user/claus

We then create a small graph referencing the previously created node, stored in the var claus, and then extract all the nodes that have a relationship of the type :works-at. Lastly, we disconnect from the Neo4j database again.

user> (client/create-graph!
       conn
       {:lookups [(assoc claus :ref-id "p")]
        :nodes [{:ref-id "c" :labels [:company] :props {:name "Full Spectrum IVS"}}]
        :relationships [{:ref-id "r"
                         :type :works-at
                         :from {:ref-id "p"}
                         :to {:ref-id "c"}
                         :props {:position "Developer"}}]
        :returns ["c" "p" "r"]})

({"c"
  {:labels (:company), :id 61, :props {:name "Full Spectrum IVS"}},
  "p"
  {:labels (:person),
   :id 60,
   :props {:last-name "Engel-Christensen", :first-name "Claus"}},
  "r"
  {:end-id 61,
   :type :works-at,
   :start-id 60,
   :id 62,
   :props {:position "Developer"}}})

user> (client/get-graph
       conn
       {:nodes [{:ref-id "c"}
                {:ref-id "p"}]
        :relationships [{:ref-id "r"
                         :type :works-at
                         :from {:ref-id "p"}
                         :to {:ref-id "c"}}]
        :returns ["c" "p" "r"]})

({"c"
  {:labels (:company), :id 61, :props {:name "Full Spectrum IVS"}},
  "p"
  {:labels (:person),
   :id 60,
   :props {:last-name "Engel-Christensen", :first-name "Claus"}},
  "r"
  {:end-id 61,
   :type :works-at,
   :start-id 60,
   :id 62,
   :props {:position "Developer"}}})

user> (client/disconnect conn)

nil

Conclusion

In the end I really think this client accomplishes what we wanted it to do. Yes, it’s a very crude initial version which still needs some polish, but I hope this blog post and the example shown above have inspired you to go play with the client yourself.

The project can be found at GitHub here