Window Size Configuration #27
Comments
I think this is a nice-to-have that isn't necessary. The only scenario where it becomes necessary is if we have a value that needs to change across more than one AWS environment. Right now we're only in one, so skipping this feature should be fine.
Is there anything specific about the user in these requests? For example, if two users provide the same inputs to this service, is it OK for them to get the same outputs? Mostly asking because I'm trying to determine what layers of caching are acceptable. If different users can share a cache, I think we might be able to borrow Nginx's caching capabilities, which have lots of tunables.
Not inherently, no. There may be some shared state we encounter as multiple people use the site, but the requests don't carry any user information, and if two users generate the same requests, they should get the same results. Having one user cache a result for another would be helpful in this case.
Confirmed: there is no user-specific state at the HTTP level, so Nginx is totally an option. An S3 cache on the filesystem would help only in the very specific case where the service restarts without blowing away the filesystem. It would provide a warm start for operations that have not been issued yet, and so can't be covered by caching the TMS tiles.
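Since responses carry no per-user state, a shared Nginx response cache could look something like the sketch below. All paths, ports, zone names, and TTLs here are assumptions for illustration, not project settings:

```nginx
# Hypothetical sketch: shared response cache in front of the service.
# Safe only because responses carry no per-user state.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=modellab_cache:10m
                 max_size=1g inactive=60m;

server {
  listen 80;

  location / {
    proxy_pass        http://127.0.0.1:8080;   # assumed upstream service port
    proxy_cache       modellab_cache;
    proxy_cache_key   "$scheme$request_method$host$request_uri";
    proxy_cache_valid 200 10m;                 # assumed TTL; a tunable
    add_header        X-Cache-Status $upstream_cache_status;
  }
}
```

The `X-Cache-Status` header makes HIT/MISS behavior visible during tuning.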
There are a number of parameters that modellab-geoprocessing uses that have some impact on performance in the typical case. These should be configurable so they can be tuned in deployment.
Default values of these should be placed in `reference.conf`, so they can be overridden by `application.conf` in the application directory. Consult for reference: http://doc.akka.io/docs/akka/snapshot/general/configuration.html

It may be nice if there were a fallback to ENV variables when `application.conf` is not present; @hectcastro may have input on this.

akka-config should be used to read the values, as in here: https://github.com/azavea/modellab-geoprocessing/blob/develop/src/main/scala/com/azavea/modellab/Instrumented.scala#L14
Structure of utility class may need to change.
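The fallback precedence described above (ENV variable, then `application.conf`, then the `reference.conf` default) can be sketched in plain Scala. In the real service the layers would come from Typesafe Config (`ConfigFactory.load()` already merges `application.conf` over `reference.conf`) and `sys.env`; the maps and key names here are stand-ins for illustration:

```scala
// Minimal sketch of the desired precedence:
// ENV variable > application.conf > reference.conf default.
object ConfigFallback {
  // The three layers are modeled as Maps here; in the service they would be
  // sys.env plus Typesafe Config objects. Key/env-var names are assumptions.
  def resolve(env: Map[String, String],
              applicationConf: Map[String, Int],
              referenceConf: Map[String, Int],
              key: String,
              envVar: String): Int =
    env.get(envVar).map(_.toInt)              // 1. ENV variable, if set
      .orElse(applicationConf.get(key))       // 2. application.conf override
      .getOrElse(referenceConf(key))          // 3. reference.conf default
}
```

For example, with `WINDOW_SIZE=16` in the environment the ENV value wins; with no ENV and no `application.conf` entry, the `reference.conf` default is used.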
Here are the parameters that need to be tuned:
Window Size:
This is the size of the chunks of base layers that get loaded from S3. The larger the chunk, the slower the initial load, but also the more efficient the operations themselves:
https://github.com/azavea/modellab-geoprocessing/blob/develop/src/main/scala/com/azavea/modellab/LayerRegistry.scala#L16
Operation Size:
The size at which the operations are evaluated. Each operation window results in a call to the base-layer window above. This should be smaller than the window size, but I'm not sure by how much:
https://github.com/azavea/modellab-geoprocessing/blob/develop/src/main/scala/com/azavea/modellab/LayerRegistry.scala#L18
NOTE: We measure these in storage tiles, which are 512×512.
Tile Cache:
If a local path is given, the S3 catalog should use the tile cache: https://github.com/azavea/modellab-geoprocessing/blob/develop/src/main/scala/com/azavea/modellab/Catalog.scala#L35
This will matter only across service restarts that do not reboot the machine, since these files go into a temp folder. The window reader caches things in memory; caching tiles to disk would be useful when the in-memory cache is evicted or the service has restarted. @caseypt and @hectcastro should +1 this feature. I am not totally sure whether this is practically useful in the current deployment.
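Putting the three parameters above together, the `reference.conf` defaults might look something like the fragment below. The key names, default values, and nesting are all assumptions for illustration, not the project's actual configuration:

```hocon
# Hypothetical reference.conf defaults (names and values are assumptions):
modellab {
  # Storage tiles per base-layer window read from S3; larger = slower
  # initial load but more efficient operations.
  window-size = 8

  # Storage tiles per operation window; should be smaller than window-size.
  operation-size = 4

  # Optional local path; when set, the S3 catalog uses the tile cache.
  tile-cache-dir = null
}
```

Each key could then be overridden in `application.conf` or, if the ENV fallback is added, by an environment variable.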