
Reduce memory usage #83

Open · wants to merge 4 commits into master

Conversation

@mem48 (Contributor) commented Sep 8, 2023

WIP so just a placeholder, but a few tweaks to reduce memory usage.

tweaks to reduce memory use, still WIP
@@ -12,13 +12,12 @@ batch_read = function(
cols_to_keep = c(
"name", # not used currently but could be handy
"distances",
"gradient_smooth",
Collaborator

We ultimately need that column. Fine if it works, but I'd be surprised if it does after this change.

@Robinlovelace (Collaborator) left a comment

Great to see this attempt to reduce memory usage. My thinking is that query could be used to ignore all the keys we don't use. Any idea of the impact on memory use after this change? A benchmark could help.
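For context, one way the memory impact could be checked is with bench::mark, which records allocations per expression. A minimal sketch, not from the PR; "routes_raw.csv" and the selected column are placeholders for whatever batch_read() actually reads:

# Sketch only: compare allocations of the old readr path against the new
# data.table path on the same file. "routes_raw.csv" is a placeholder name.
library(bench)

bm = bench::mark(
  readr = readr::read_csv("routes_raw.csv", show_col_types = FALSE),
  fread = data.table::fread("routes_raw.csv", select = "json"),
  check = FALSE,   # the two calls intentionally return different shapes
  iterations = 5
)
bm[, c("expression", "median", "mem_alloc")]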

res = readr::read_csv(file, show_col_types = FALSE)
n_char = nchar(res$json)

res = data.table::fread(file, select = "json")
Collaborator

I agree with this; readr, which uses vroom, seems to be unreliable.
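For what it's worth, the select argument is likely where most of the saving comes from, since fread then only materialises that one column. A hedged sketch, assuming a placeholder file name rather than the PR's actual input:

# Sketch only: reading every column vs. just the json column we parse later.
# "routes_raw.csv" is a placeholder for batch_read()'s input file.
all_cols  = data.table::fread("routes_raw.csv")                    # whole table in RAM
json_only = data.table::fread("routes_raw.csv", select = "json")   # one column only
lobstr::obj_size(all_cols)
lobstr::obj_size(json_only)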

@@ -90,17 +88,18 @@ json2sf_cs = function(
message(results_error$Freq[msgs],'x messages: "',results_error$results_error[msgs],'"\n')
}
}

results = RcppSimdJson::fparse(results_raw, query = "/marker", query_error_ok = TRUE, always_list = TRUE)
Collaborator

Is there a way to reduce what is read in here with a different query?
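One possible narrowing, purely as a sketch: fparse queries use JSON Pointer syntax, so pointing deeper into the document should keep unused branches from being materialised. The path below is an assumed structure, not verified against the actual CycleStreets response:

# Sketch only: a deeper JSON Pointer means less of each document is parsed
# into R objects. "/marker/0/@attributes" is an assumed path, not taken
# from the PR or the CycleStreets docs.
segment_attributes = RcppSimdJson::fparse(
  results_raw,
  query = "/marker/0/@attributes",
  query_error_ok = TRUE,
  always_list = TRUE
)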

@@ -147,6 +146,7 @@ cleanup_results <- function(x, cols_to_keep){
x = add_columns(x)
x = sf::st_as_sf(x)
x$SPECIALIDFORINTERNAL2 <- NULL
cols = cols_to_keep %in% names(x)
x[cols_to_keep]
cols_to_keep3 = unique(c(cols_to_keep,"gradient_segment","elevation_change","gradient_smooth"))
Collaborator

What if any of these are not needed? For NPT, gradient_smooth is the only one we need.
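One way to make that optional, sketched here as a suggestion rather than taken from the PR: intersect the requested columns with what is actually present, so a caller that only needs gradient_smooth can pass a shorter cols_to_keep without the other gradient columns being forced back in.

# Sketch only: keep just the requested columns that exist in x, rather than
# always re-adding gradient_segment / elevation_change / gradient_smooth.
cols_present = intersect(cols_to_keep, names(x))
x = x[cols_present]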

@@ -44,10 +44,10 @@ route_rolling_average = function(x, n = 3) {


get_values = function(v, fun) {
sapply(v, function(x) fun(as.numeric(x)))
vapply(v, function(x) fun(as.numeric(x)), 1)
Collaborator

Same outcome, what's the advantage?
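For reference, the usual argument for vapply over sapply (illustrative, not from the PR): the FUN.VALUE template (the 1 here, i.e. a length-one numeric) pins the return type, so a malformed element errors immediately instead of silently turning the result into a list, and the output vector can be preallocated.

# Sketch: both return the same numeric vector on clean input, but only
# vapply() guarantees a numeric result and fails fast otherwise.
v = list(c("1", "2", "3"), c("4", "5"))
sapply(v, function(x) max(as.numeric(x)))              # type is guessed
vapply(v, function(x) max(as.numeric(x)), numeric(1))  # type is enforced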

}

extract_values = function(x) stringr::str_split(x, pattern = ",")
extract_values = function(x) stringi::stri_split_fixed(x, pattern = ",")
Collaborator

Sounds reasonable; what's the thinking behind this?
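Presumably the point, sketched here as an assumption rather than from the PR description: stri_split_fixed treats the pattern as a literal string, skipping the regex engine and the stringr wrapper layer, which can matter when splitting many coordinate strings.

# Sketch: identical output on this input; the fixed-pattern split avoids
# compiling "," as a regular expression for every call.
x = c("1.2,3.4,5.6", "7.8,9.0")
stringr::str_split(x, pattern = ",")
stringi::stri_split_fixed(x, pattern = ",")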

@Robinlovelace (Collaborator)

Can you also try to make the GitHub Actions checks happy, Malcolm?
