Graph Mining for HPC Analytics – Neo4j and Property Graphs

Hi everyone, and welcome back to another blog post. In this one I will explain the basics of what makes up a property graph and how to use a tool called Neo4j to create a database based on property graphs.

What is a property graph? It has the same basic definition as most graphs: it is composed of vertices and edges, G = (V, E). In property graphs these are called nodes and relationships. What makes a property graph distinctive is its properties: pieces of information that can be attached to either the nodes or the relationships.

Let’s discuss how nodes are defined. Nodes represent entities. They can hold any number of key-value pairs (properties). Nodes can also be tagged with labels, which specify the domain of the graph (sub-graph) they belong to.
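To make the node definition concrete, here is a minimal sketch in plain Python. It is not how Neo4j stores nodes internally; the Node class, the nodes_with_label helper, and the HPC-flavored sample data (job and user names, core counts) are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    """Illustrative property-graph node: an entity with labels and properties."""
    node_id: int
    labels: Set[str] = field(default_factory=set)                # e.g. {"Job"}
    properties: Dict[str, Any] = field(default_factory=dict)     # key-value pairs


def nodes_with_label(nodes: List[Node], label: str) -> List[Node]:
    """Return the nodes carrying the given label (a label-defined sub-graph)."""
    return [n for n in nodes if label in n.labels]


# Hypothetical sample data, invented for this sketch.
nodes = [
    Node(1, {"Job"}, {"name": "sim_run", "cores": 128}),
    Node(2, {"User"}, {"name": "alice"}),
    Node(3, {"Job"}, {"name": "post_process", "cores": 4}),
]

job_nodes = nodes_with_label(nodes, "Job")
print([n.properties["name"] for n in job_nodes])
```

Filtering by the "Job" label picks out only the job entities, which is the sense in which labels group nodes into sub-graphs.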

Relationships are similar to edges in a traditional graph, but a property graph places more requirements on them. Every relationship must have a start node, an end node, a direction, and a type. Just as nodes have properties, relationships can also hold properties. Moreover, even though every relationship has a direction, it can be traversed just as easily in either direction.
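The relationship rules above can be sketched the same way. Again, this is only an illustration in plain Python under assumed names: the Relationship class, the neighbors function, and the sample SUBMITTED and RAN_ON relationship types are hypothetical and not part of any Neo4j API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Relationship:
    """Illustrative relationship: start node, end node, type, and properties.

    The direction is implied by the order start -> end.
    """
    start: str
    end: str
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


def neighbors(rels: List[Relationship], node: str, direction: str = "both") -> List[str]:
    """Return nodes adjacent to `node`, following relationships
    outward ("out"), inward ("in"), or regardless of direction ("both")."""
    found = []
    for r in rels:
        if direction in ("out", "both") and r.start == node:
            found.append(r.end)
        if direction in ("in", "both") and r.end == node:
            found.append(r.start)
    return found


# Hypothetical sample data, invented for this sketch.
rels = [
    Relationship("alice", "sim_run", "SUBMITTED", {"queue": "batch"}),
    Relationship("sim_run", "node042", "RAN_ON"),
]

outgoing = neighbors(rels, "sim_run", "out")
incoming = neighbors(rels, "sim_run", "in")
either = neighbors(rels, "sim_run")
print(outgoing, incoming, either)
```

Each relationship is stored once with a fixed direction, yet the same record answers both "what did sim_run run on?" and "who submitted sim_run?", which is what traversing in either direction means here.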

Where does Neo4j come into play? Neo4j is a native graph database that implements the property graph data structure described above. It is an open-source NoSQL database that provides the kind of ACID-compliant backend found in many other databases. In the next blog post we will dive into more specifics of Neo4j, including how to use Cypher, a declarative query language that is similar in many ways to SQL but is optimized for working with graph databases.

