Graph technology: From SQL to neo4j

Image by Pete Linforth from Pixabay
-----------------------------------------------------------------
| ID | Firstname | Lastname | Date of birth | Notes |
-----------------------------------------------------------------
| 1 | John | Doe | 1984-01-30 | My best buddy |
| 2 | Peter | Piper | 1975-03-13 | Super skilled guy |
| 3 | Jennifer | Jones | 1983-09-15 | |
-----------------------------------------------------------------
-----------------------------------------------------------------
| PersonId | Street | ZIP | City | Country |
-----------------------------------------------------------------
| 1 | Samplestreet 3 | 1234 | Gotham | US |
| 3 | Kings-Avenue 9 | 99123 | Whoville | US |
-----------------------------------------------------------------

Relationships on data level

The flight example

Flight UA703 connects New York JFK and Los Angeles LAX on January 6, 11.36am UTC 
SELECT * FROM airports
Results for all-airports-query
SELECT
f.airline,
f.flight_number,
f.tail_number,
f.origin_airport,
f.destination_airport,
f.year,
f.month,
f.day,
f.scheduled_departure
FROM flights f
LEFT JOIN airports a ON f.origin_airport = a.iata_code
WHERE a.city = 'New York'
Results for querying flights departing from New York
SELECT
f.airline,
f.flight_number,
f.tail_number,
f.origin_airport,
f.destination_airport,
f.year,
f.month,
f.day,
f.scheduled_departure
FROM flights f
LEFT JOIN airports a ON f.origin_airport = a.iata_code
LEFT JOIN airports b ON f.destination_airport = b.iata_code
WHERE
a.city = 'New York'
and b.city = 'San Francisco'
SELECT
f.airline,
f.flight_number,
f.origin_airport,
f.destination_airport,
f.year,
f.month,
f.day,
f.scheduled_departure,
f2.airline,
f2.flight_number,
f2.origin_airport,
f2.destination_airport,
f2.year,
f2.month,
f2.day,
f2.scheduled_departure
FROM flights f
LEFT JOIN flights f2 ON f.destination_airport = f2.origin_airport
LEFT JOIN airports a ON f.origin_airport = a.iata_code
LEFT JOIN airports b ON f2.destination_airport = b.iata_code
WHERE
a.city = 'New York'
and b.city = 'San Francisco'
Result for the 2-hop query from New York to San Francisco

So what if a the data model would be designed with relationships in mind, rather than flights or airports?

Hello GraphDB, hello Neo4J

****************************************
| SQL DB | Graph DB *
****************************************
| Database | Graph *
| Table | Does not exist *
| Row | Node *
| Column | Property on node *
| M:M Table | Edge *
| M:M Table Column | Property on edge *
****************************************
  • Airports are created as nodes labeled as Airport with their name shown and colored orange.
  • Flights are created as nodes labeled as Flight with their flight-number shown and colored in blue.
  • Flights and airports are connected with two relationships with dedicated labels for ARRIVING_AT and DEPARTING_FROM
Airports and flights from the previous example turned into a graph

Graphification of the airports and flights

Neo4j sandbox browser

Let’s start loading the airports

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/martinhorvath/com.primebird.graphplayground/main/flightnet_airports.csv' AS row
MERGE (airport:Airport {iata_code: row.iata_code})
ON CREATE SET airport.name = row.airport, airport.city = row.city, airport.state = row.state, airport.country = row.country, airport.latitude = toFloat(row.latitude), airport.longitude = toFloat(row.longitude);
LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/martinhorvath/com.primebird.graphplayground/main/flightnet_airports.csv' AS row
MERGE (airport:Airport {iata_code: row.iata_code})
ON CREATE SET airport.name = row.airport, airport.city = row.city, airport.state = row.state, airport.country = row.country, airport.latitude = toFloat(row.latitude), airport.longitude = toFloat(row.longitude);
MATCH(a:Airport)
return a

Load all flights

LOAD CSV WITH HEADERS FROM 'https://github.com/martinhorvath/com.primebird.graphplayground/raw/main/flightnet_flights.csv' AS row
MERGE (flight:Flight {id: row.id})
ON CREATE SET flight.flight_number = row.flight_number, flight.airline = row.airline, flight.tail_number = row.tail_number, flight.origin_airport = row.origin_airport, flight.destination_airport = row.destination_airport, flight.air_time = row.air_time,flight.departureDate = date(row.year + '-' + row.month + '-' + row.day), flight.departureTime = row.departure_time, flight.departureDate = date(row.year + '-' + row.month + '-' + row.day), flight.arrivalTime = row.arrival_time
MATCH(f:Flight)
return f

Relationships ARRIVING_AT and DEPARTING_FROM

MATCH (a:Airport),(f:Flight)
WHERE a.iata_code = f.origin_airport
CREATE (f)<-[r:DEPARTING_FROM]-(a)
RETURN a,r,f
MATCH (a:Airport),(f:Flight)
WHERE a.iata_code = f.origin_airport
CREATE (f)<-[r:DEPARTING_FROM]-(a)
RETURN a,r,f
MATCH (a:Airport),(f:Flight)
WHERE a.iata_code = f.destination_airport
CREATE (f)-[r:ARRIVING_AT]->(a)
RETURN a,r,f

Experience the power of graph queries

MATCH(a)
WHERE a:Airport
RETURN a;
MATCH(a:Airport)-[r:DEPARTING_FROM]->(f:Flight)
WHERE a.city = "San Francisco"
RETURN a,f;
Flights departing from San Francisco
MATCH(f:Flight)-[r:ARRIVING_AT]->(a:Airport)
WHERE a.city = "New York"
RETURN a,f
Flights arriving at New York
MATCH(a:Airport)-[r:DEPARTING_FROM]->(f:Flight)-[r2:ARRIVING_AT]->(a2:Airport)-[r3:DEPARTING_FROM]->(f2:Flight)-[r4:ARRIVING_AT]->(a3:Airport)
WHERE a.city = "New York" and a3.city = "San Francisco"
RETURN a,f,f2,a2,a3
match p=(a)-[:DEPARTING_FROM|ARRIVING_AT*1..5]->(b)
where a.city='New York' and b.city = 'San Francisco'
return p
match p=shortestPath((a)-[:DEPARTING_FROM|ARRIVING_AT*]->(b))
where a.city='New York' and b.city = 'San Francisco'
return p, length(p)

Manufacturing industry

Utilities and maintenance

Closing word

--

--

--

I'm a consultant working on international projects in the field of geospatial data and customer experience, with a passion for technology and mountaineering.

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Skirmish Simulator

Light at the end of the COVID-19 tunnel

Grindstone Fire continues to engulf Pilot Mountain; NC Forest Service to hold virtual public…

I Never Thought That I Will Become An IT Consultant

An Alternative Guide to Becoming a Kick-Ass Junior Dev

How to own a professional website at zero costs! (Part 1)

Understanding MQTT Retained messages

Windows MQTT Client MqttDesk

Multi-region serverless backend — reloaded

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Martin Horvath

Martin Horvath

I'm a consultant working on international projects in the field of geospatial data and customer experience, with a passion for technology and mountaineering.

More from Medium

Why Data Economy is subsuming the API Economy

Using TomTom Data via APIs to Improve Delivery ETA Accuracy

Finding Curve Inflection Points in PostGIS

Disney Streaming Embraces 3 Key Tenets of Experimentation