Search API #5
Comments
First off, loading the HRS JSON files into Elasticsearch:
This combines all the JSON files into one large file, which is then ingested through the Elasticsearch `_bulk` endpoint. The JSON documents are indexed on insertion, and I believe the default is to index all fields. Individual documents can be retrieved (I've made their IDs the statute references):
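The combine-and-ingest step described above could be sketched as follows. This is a minimal sketch, not the author's actual script: the index name `hrs` and the `reference` field reused as the document ID are assumptions for illustration.

```python
import json
from pathlib import Path

def build_bulk_payload(doc_dir: str, index: str = "hrs") -> str:
    """Combine per-statute JSON files into one Elasticsearch _bulk body.

    The _bulk format is newline-delimited JSON: an action line naming the
    index and document ID, followed by the document source itself. Here the
    statute reference is assumed to live in a "reference" field and is
    reused as the document ID (the field name is a guess, not confirmed).
    """
    lines = []
    for path in sorted(Path(doc_dir).glob("*.json")):
        doc = json.loads(path.read_text())
        action = {"index": {"_index": index, "_id": doc["reference"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # _bulk requires the body to end with a newline
    return "\n".join(lines) + "\n"
```

The resulting payload would then be POSTed to `http://localhost:9200/_bulk` with the `Content-Type: application/x-ndjson` header, after which a single statute can be fetched by its ID via a plain GET on the index.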
Great start @rhydomako! A few questions looking forward:
For 1), as long as we limit it to GET/POST requests to the search endpoint, I don't think there is much harm in exposing the ES API. I don't know what the optimal amount of RAM is, but I agree that the optimum is the smallest amount we can get away with. Experimenting with my local Docker containers, it seems to run OK with a heap of 128m (only the HRS docs loaded so far). So I would suggest setting it even lower, and adjusting if we run into problems.
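The heap-size experiment described above could be reproduced with a Docker Compose service along these lines. This is a sketch only: the image tag, port mapping, and single-node discovery setting are assumptions, not details taken from the thread.

```yaml
# docker-compose.yml sketch (image tag and port are assumptions)
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
    environment:
      - discovery.type=single-node
      # 128 MB heap as a starting point; raise if ingest or queries fail
      - "ES_JAVA_OPTS=-Xms128m -Xmx128m"
    ports:
      - "9200:9200"
```

Elasticsearch reads its JVM heap bounds from `ES_JAVA_OPTS`, so adjusting the `-Xms`/`-Xmx` values here is enough to retest with a different heap.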
I think we've run it on 2 GB of RAM on a small DigitalOcean box. The raw JSON dump takes up almost 100 MB of local storage, so a 128 MB heap is probably not enough with the current structure.
This issue is to track the search API.
Modifications and improvements to search-related functionality shall be discussed here.
@rhydomako has been looking into importing the HRS_Index and Supplemental index data into Elasticsearch.