This project is a showcase of Spring Data, Neo4j, REST, and HATEOAS.
The Spring Boot based application provides the following features:
-
Performs an unattended download of General Transit Feed Specification (GTFS) data from New Jersey Transit.
-
Loads the downloaded GTFS files into Neo4j.
-
Provides web services, such as a trip planner API, to allow you to interact with the data.
You need an account as an NJ Transit developer. You can sign up here.
This project is largely derived from the work done by Rick Van Bruggen (GitHub account rvanbruggen).
Loading the data is based on his blog entry "Loading General Transport Feed Spec (GTFS) files into Neo4j - part 1/2".
Querying the data is based on his blog entry "Querying GTFS data - using Neo4j 2.3 - part 2/2".
-
About 15 minutes
-
You need access to GTFS files. You have two approaches to load this data:
-
Create a developer account at NJ Transit, and have neo4j-gtfs download and load the file for you.
-
Download a GTFS feed file from any transportation provider, and have neo4j-gtfs load it for you.
-
JDK 1.8 or later
-
You can also import the code straight into your IDE:
Before you can build this application, you need to set up a Neo4j server.
Neo4j has an open source server you can install for free:
On a Mac, just type:
$ brew install neo4j
Also on a Mac, make sure the Neo4j import folder has write permissions. This should take care of it (adjust the version in the path to match your installation):
chmod -R g+w /usr/local/Cellar/neo4j/3.1.4/libexec/import
For other options, visit https://neo4j.com/download/community-edition/
Once installed, launch it with its default settings:
$ neo4j start
You should see a message like this:
Starting Neo4j.
Started neo4j (pid 96416). By default, it is available at http://localhost:7474/
There may be a short delay until the server is ready.
See /usr/local/Cellar/neo4j/3.0.6/libexec/logs/neo4j.log for current status.
By default, Neo4j has a username/password of neo4j/neo4j. However, it requires that the default password be changed before use. To do so, execute the following command:
$ curl -v -u neo4j:neo4j -X POST localhost:7474/user/neo4j/password -H "Content-type:application/json" -d "{\"password\":\"secret\"}"
This changes the password from neo4j to secret (something you should NOT do in production!). With that completed, you should be ready to run this guide.
Neo4j Community Edition requires credentials to access it. This can be configured with a couple of properties.
spring.data.neo4j.username=neo4j
spring.data.neo4j.password=secret
This includes the default username neo4j and the newly set password secret we picked earlier.
Similarly, your login for downloading the GTFS files from the New Jersey Transit website is stored in this file.
njgtfs.login=foo
njgtfs.password=bar
Warning: Do NOT store real credentials in your source repository. Instead, configure them in your runtime using Spring Boot’s property overrides.
You can run the application from the command line with Gradle or Maven. Or you can build a single executable JAR file that contains all the necessary dependencies, classes, and resources, and run that. This makes it easy to ship, version, and deploy the service as an application throughout the development lifecycle, across different environments, and so forth.
If you are using Gradle, you can run the application using ./gradlew bootRun. Or you can build the JAR file using ./gradlew build. Then you can run the JAR file:
java -jar build/libs/neo4j-gtfs-0.1.0.jar
If you are using Maven, you can run the application using ./mvnw spring-boot:run. Or you can build the JAR file with ./mvnw clean package. Then you can run the JAR file:
java -jar target/neo4j-gtfs-0.1.0.jar
Note: The procedure above will create a runnable JAR. You can also opt to build a classic WAR file instead.
The server comes up on http://localhost:8080 by default.
With this in place, let’s load up the data and interact with it.
Two endpoints are provided:
-
Download and import the data automatically from the NJ Transit developer website:
http://localhost:8080/customrest/LoadData
-
The NJ GTFS download is clickwrapped, so the automated download has been temperamental.
If the /customrest/LoadData endpoint does not succeed, you can import a pre-downloaded zip file by:
-
Placing it in the same directory as the Spring Boot app server (default filename rail_data.zip).
-
Then importing it into Neo4j by calling this URL:
http://localhost:8080/customrest/LoadPrefetched
-
By default, all the endpoints exposed via spring-data-rest are left in place. You can traverse them starting from the root of the app server.
To understand how this works, read the page Understanding HATEOAS prepared by the Spring community.
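As a rough illustration of that traversal, the sketch below follows link relations in a HAL-style document of the kind Spring Data REST typically returns at its root. The sample document, the `stops` relation name, and the helper names are assumptions made for illustration; they are not taken from this project's actual output.

```python
import json
import re

# Hypothetical root document; the real one depends on the repositories
# the application exposes.
SAMPLE_ROOT = json.loads("""
{
  "_links": {
    "stops": {"href": "http://localhost:8080/stops{?page,size,sort}", "templated": true},
    "profile": {"href": "http://localhost:8080/profile"}
  }
}
""")


def link_relations(document):
    """Return the names of the link relations advertised by a HAL document."""
    return sorted(document.get("_links", {}))


def resolve_link(document, rel):
    """Return the href for a relation, dropping any URI-template suffix."""
    link = document["_links"][rel]
    href = link["href"]
    if link.get("templated"):
        # Strip template expressions such as {?page,size,sort}.
        href = re.sub(r"\{[^}]*\}", "", href)
    return href


print(link_relations(SAMPLE_ROOT))
print(resolve_link(SAMPLE_ROOT, "stops"))
```

A client would fetch the resolved URL, read the `_links` of that response, and repeat, rather than hard-coding paths.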
The application also exposes web services hosting custom Cypher queries for trip planning. Currently only one such endpoint exists, and it is purpose-built to provide trip options from one station to another given departure and arrival time criteria:
curltests/planTrip.sh
#!/usr/bin/env bash
curl -H "Content-Type: application/json" -X POST --data @TripPlanNoTransfer.json http://localhost:8080/customrest/planTripNoTransfer | python -m json.tool
curltests/TripPlan.json
{
  "serviceId": "4",
  "origStation": "WOOD-RIDGE",
  "origArrivalTimeLow": "07:00:00",
  "origArrivalTimeHigh": "08:10:00",
  "destStation": "HOBOKEN",
  "destArrivalTimeLow": "06:30:00",
  "destArrivalTimeHigh": "10:00:00"
}
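For callers that prefer not to shell out to curl, the same request can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function name `build_trip_request` and its parameters are hypothetical, while the URL and field names are copied from the script and JSON file above.

```python
import json
import urllib.request

# Endpoint taken from the curl script above; adjust host and port as needed.
PLAN_TRIP_URL = "http://localhost:8080/customrest/planTripNoTransfer"


def build_trip_request(service_id, orig, dest, orig_window, dest_window, url=PLAN_TRIP_URL):
    """Build (but do not send) a POST request carrying a trip-plan payload."""
    payload = {
        "serviceId": service_id,
        "origStation": orig,
        "origArrivalTimeLow": orig_window[0],
        "origArrivalTimeHigh": orig_window[1],
        "destStation": dest,
        "destArrivalTimeLow": dest_window[0],
        "destArrivalTimeHigh": dest_window[1],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_trip_request(
    "4", "WOOD-RIDGE", "HOBOKEN",
    orig_window=("07:00:00", "08:10:00"),
    dest_window=("06:30:00", "10:00:00"),
)
# Sending requires the application to be running:
# with urllib.request.urlopen(request) as response:
#     trips = json.load(response)
```

Building the request separately from sending it keeps the payload easy to inspect before the server is involved.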
Response: two possible trips, one leaving at 7:43 and the other at 7:27:
[
  [
    { "arrivalTime": "07:43:00", "departureTime": "07:43:00", "stopName": "WOOD-RIDGE", "stopSequence": 15, "tripId": "2815" },
    { "arrivalTime": "07:54:00", "departureTime": "07:54:00", "stopName": "FRANK R LAUTENBERG SECAUCUS LOWER LEVEL", "stopSequence": 16, "tripId": "2815" },
    { "arrivalTime": "08:05:00", "departureTime": "08:05:00", "stopName": "HOBOKEN", "stopSequence": 17, "tripId": "2815" }
  ],
  [
    { "arrivalTime": "07:27:00", "departureTime": "07:27:00", "stopName": "WOOD-RIDGE", "stopSequence": 16, "tripId": "2821" },
    { "arrivalTime": "07:38:00", "departureTime": "07:38:00", "stopName": "FRANK R LAUTENBERG SECAUCUS LOWER LEVEL", "stopSequence": 17, "tripId": "2821" },
    { "arrivalTime": "07:49:00", "departureTime": "07:49:00", "stopName": "HOBOKEN", "stopSequence": 18, "tripId": "2821" }
  ]
]
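The response is a list of trips, each of which is a list of stops. One way a client might condense it is sketched below; the `summarize` helper is hypothetical, and the sample data is an abbreviated copy of the response above with the intermediate stop omitted.

```python
import json

# Abbreviated copy of the response above: one inner list per trip.
RESPONSE = json.loads("""
[
  [
    {"arrivalTime": "07:43:00", "departureTime": "07:43:00", "stopName": "WOOD-RIDGE", "stopSequence": 15, "tripId": "2815"},
    {"arrivalTime": "08:05:00", "departureTime": "08:05:00", "stopName": "HOBOKEN", "stopSequence": 17, "tripId": "2815"}
  ],
  [
    {"arrivalTime": "07:27:00", "departureTime": "07:27:00", "stopName": "WOOD-RIDGE", "stopSequence": 16, "tripId": "2821"},
    {"arrivalTime": "07:49:00", "departureTime": "07:49:00", "stopName": "HOBOKEN", "stopSequence": 18, "tripId": "2821"}
  ]
]
""")


def summarize(stops):
    """Describe one list of stops by its first departure and last arrival."""
    ordered = sorted(stops, key=lambda stop: stop["stopSequence"])
    first, last = ordered[0], ordered[-1]
    return (first["tripId"], first["stopName"], first["departureTime"],
            last["stopName"], last["arrivalTime"])


# Earliest departure first.
summaries = sorted((summarize(trip) for trip in RESPONSE), key=lambda s: s[2])
for trip_id, origin, departs, destination, arrives in summaries:
    print(f"trip {trip_id}: {origin} {departs} -> {destination} {arrives}")
```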
curltests/planTripOneStop.sh
#!/usr/bin/env bash
curl -H "Content-Type: application/json" -X POST --data @TripPlanOneTransfer.json http://localhost:8080/customrest/planTripOneTransfer | python -m json.tool
curltests/TripPlanOneStop.json
{
  "serviceId": "4",
  "origStation": "WOOD-RIDGE",
  "origArrivalTimeLow": "06:30:00",
  "origArrivalTimeHigh": "07:10:00",
  "destStation": "RUTHERFORD",
  "destArrivalTimeLow": "06:30:00",
  "destArrivalTimeHigh": "10:00:00"
}
Response: one trip with one transfer. It leaves the origin at 6:46 and arrives at the midpoint at 6:56; the transfer train leaves at 8:09 and arrives at the destination at 8:17:
[
  [
    { "arrivalTime": "06:46:00", "departureTime": "06:46:00", "stopName": "WOOD-RIDGE", "stopSequence": 16, "tripId": "2820" },
    { "arrivalTime": "06:56:00", "departureTime": "06:56:00", "stopName": "FRANK R LAUTENBERG SECAUCUS LOWER LEVEL", "stopSequence": 17, "tripId": "2820" }
  ],
  [
    { "arrivalTime": "08:09:00", "departureTime": "08:09:00", "stopName": "FRANK R LAUTENBERG SECAUCUS LOWER LEVEL", "stopSequence": 2, "tripId": "1249" },
    { "arrivalTime": "08:17:00", "departureTime": "08:17:00", "stopName": "RUTHERFORD", "stopSequence": 3, "tripId": "1249" }
  ]
]
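With a transfer, the wait at the midpoint matters as much as the ride itself. The sketch below works out the layover and the overall travel time from the times in the response above. The `minutes_between` helper is hypothetical and assumes both times fall on the same day with hours below 24.

```python
from datetime import datetime


def minutes_between(earlier, later):
    """Minutes from one HH:MM:SS time to a later one on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return int(delta.total_seconds() // 60)


# Times taken from the response above.
first_leg = {"departs": "06:46:00", "arrives": "06:56:00"}
second_leg = {"departs": "08:09:00", "arrives": "08:17:00"}

layover = minutes_between(first_leg["arrives"], second_leg["departs"])
total = minutes_between(first_leg["departs"], second_leg["arrives"])
print(f"layover: {layover} min, door to door: {total} min")
```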
Open Neo4j’s own Cypher query tool in your browser at http://localhost:7474/ and start writing Cypher queries like the ones below:
MATCH (orig:Stop {name: "WESTWOOD"})--(orig_st:Stoptime)-[r1:PART_OF_TRIP]->(trp:Trip)
WHERE orig_st.departure_time > "06:30:00"
  AND orig_st.departure_time < "07:10:00"
  AND trp.service_id="4"
WITH orig, orig_st
MATCH (dest:Stop {name:"HOBOKEN"})--(dest_st:Stoptime)-[r2:PART_OF_TRIP]->(trp2:Trip)
WHERE dest_st.arrival_time < "08:00:00"
  AND dest_st.arrival_time > "07:00:00"
  AND dest_st.arrival_time > orig_st.departure_time
  AND trp2.service_id="4"
WITH dest,dest_st,orig, orig_st
MATCH p = allshortestpaths((orig_st)-[*]->(dest_st))
WITH nodes(p) as n
UNWIND n as nodes
//MATCH
//  p=((nodes)-[loc:LOCATED_AT]->(stp:Stop))
OPTIONAL MATCH p=(nodes)-[r:PRECEDES|LOCATED_AT]->(next)
RETURN p, COALESCE(nodes.stop_sequence, next.stop_sequence) AS stopSequence
ORDER BY stopSequence;
//find the route and the stops for the indirect route
MATCH (t:Stop),(a:Stop)
WHERE t.name = "WESTWOOD" AND a.name="HOBOKEN"
WITH t,a
MATCH p = allshortestpaths((t)-[*]-(a))
WHERE NONE (x in relationships(p) where type(x)="OPERATES")
RETURN p
LIMIT 10;
MATCH
  p3=(orig:Stop {name:"WESTWOOD"})<-[:LOCATED_AT]-(st_orig:Stoptime)-[r1:PART_OF_TRIP]->(trp1:Trip),
  p4=(dest:Stop {name:"RUTHERFORD"})<-[:LOCATED_AT]-(st_dest:Stoptime)-[r2:PART_OF_TRIP]->(trp2:Trip),
  p1=(st_orig)-[im1:PRECEDES*]->(st_midway_arr:Stoptime),
  p5=(st_midway_arr)-[:LOCATED_AT]->(midway:Stop)<-[:LOCATED_AT]-(st_midway_dep:Stoptime),
  p2=(st_midway_dep)-[im2:PRECEDES*]->(st_dest)
WHERE
  st_orig.departure_time > "07:30:00"
  AND st_orig.departure_time < "09:00:00"
  AND st_dest.arrival_time < "10:30:00"
  AND st_dest.arrival_time > "09:00:00"
  AND st_midway_arr.arrival_time > st_orig.departure_time
  AND st_midway_dep.departure_time > st_midway_arr.arrival_time
  AND st_dest.arrival_time > st_midway_dep.departure_time
  AND trp1.service_id = "4"
  AND trp2.service_id = "4"
WITH st_orig, st_dest, nodes(p1) + nodes(p2) AS allStops1
ORDER BY (st_dest.arrival_time_int-st_orig.departure_time_int) ASC
SKIP 0 LIMIT 1
UNWIND allStops1 AS stoptime
MATCH
  p6=(loc:Stop)<-[r:LOCATED_AT]-(stoptime)-[r2:PART_OF_TRIP]->(trp5:Trip),
  (stoptime)-[im1:PRECEDES*]->(stoptime2)
RETURN p6, im1
ORDER BY stoptime.departure_time_int ASC
;
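The WHERE clauses in these queries compare times as plain strings. That works because zero-padded HH:MM:SS strings sort in chronological order. GTFS also accepts H:MM:SS, and a feed using that form would break the comparisons unless the values are padded during loading. The sketch below illustrates the point; `normalize_time` is a hypothetical helper and not part of this project's loader.

```python
def normalize_time(value):
    """Zero-pad a GTFS time such as '7:43:00' to '07:43:00'."""
    hours, minutes, seconds = value.strip().split(":")
    return f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d}"


# Zero-padded strings sort chronologically, which the WHERE clauses rely on.
assert "06:30:00" < "07:10:00" < "10:00:00"

# An unpadded value breaks that property: '7:43:00' sorts after '10:00:00'.
assert "7:43:00" > "10:00:00"
assert normalize_time("7:43:00") < "10:00:00"

# GTFS allows hours past 24 for service that runs beyond midnight;
# padded strings still sort correctly in that case.
ordered = sorted(normalize_time(t) for t in ["25:10:00", "7:43:00", "10:00:00"])
print(ordered)
```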
Add new queries to the repository com.popameeting.gtfs.neo4j.repository and interact with them via the web services provided by Spring Data REST. If you need the data presented differently, see the projections in com.popameeting.gtfs.neo4j.entity.projection and how they are used in the URL above.