WIP: Memory Usage Improvements (to fix #65) #197

Open

kputnam wants to merge 38 commits into master from gh-65
Conversation

kputnam (Owner) commented Jun 16, 2019

This is still WIP. Hoping for only minimal API changes.

kputnam (Owner, Author) commented Jun 16, 2019

Reading an example 4.4 MB file on master has a peak memory usage of 921.5 MB. With the current work in progress, this is down to 782.23 MB. Most of the problem remains in the tokenizer, not the parser, which is a relief.

kputnam changed the title from "Memory Usage Improvements (to fix #65)" to "WIP: Memory Usage Improvements (to fix #65)" on Jun 22, 2019
kputnam (Owner, Author) commented Jun 22, 2019

The tokenization code was the biggest culprit for memory allocation. There weren't many retained objects to suggest a leak, but the tokenizer allocated so much short-term garbage that the GC struggled to keep up. It also consumed a lot of stack frames, because it was written using recursion rather than an imperative loop.
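To illustrate the shape of that change, here's a hypothetical sketch (not the actual tokenizer; `read_segment` is an assumed helper returning the next segment and the remaining input):

```ruby
# Recursive style: one stack frame per segment, and each frame pins its
# locals until the entire file has been consumed.
def each_segment_recursive(input, &block)
  return if input.empty?
  segment, rest = read_segment(input)   # assumed helper
  block.call(segment)
  each_segment_recursive(rest, &block)  # one frame per segment
end

# Imperative loop: the stack stays flat, and each iteration's garbage
# becomes collectible as soon as the next iteration begins.
def each_segment(input)
  until input.empty?
    segment, input = read_segment(input)
    yield segment
  end
end
```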

The build for this branch is currently broken. I've made some drastic (but incomplete) changes that should reduce String allocations, along with other changes to further shrink the memory footprint of various classes. Once I have the parts working separately and unit tests written, I'll fit the new tokenizer into the rest of the library. That might result in user-facing API changes, but I'm aiming to minimize them and provide shims where possible.
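For context, the usual way to cut String allocations in a tokenizer is to pass around pointers into one shared buffer instead of copied substrings. A minimal sketch of that idea, assuming it's roughly what Reader::StringPtr does (this is not the actual implementation):

```ruby
class StringPtr
  attr_reader :offset, :length

  def initialize(string, offset = 0, length = string.length - offset)
    @string, @offset, @length = string, offset, length
  end

  # O(1) and copy-free: a new pointer into the same backing String
  def drop(n)
    StringPtr.new(@string, @offset + n, @length - n)
  end

  def take(n)
    StringPtr.new(@string, @offset, n)
  end

  # Only materialize a real String when the caller truly needs one
  def to_s
    @string[@offset, @length]
  end
end
```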

kputnam (Owner, Author) commented Jun 23, 2019

Here are some preliminary results. The test script reads a file, hands it to the tokenizer, and iterates through each SegmentTok; no higher-level parsing is done.
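The script itself isn't included in this thread; here's a rough reconstruction of its shape (the checkpoint placement, `ps`-based RSS sampling, and tokenizer construction are all assumptions):

```ruby
#!/usr/bin/env ruby

def rss_mib
  # Resident set size of this process in MiB (ps reports KiB)
  `ps -o rss= -p #{Process.pid}`.to_i / 1024
end

puts "Pre-init: #{rss_mib} MiB"

require "stupidedi"
input = File.read(ARGV.last)

puts "Post-init: #{rss_mib} MiB"
started = Time.now

# Walk each SegmentTok without retaining it; no higher-level parsing.
# (Constructor approximated; the real tokenizer API may differ.)
Stupidedi::Reader::Tokenizer.build(input).each do |segment_tok|
  # iterate only
end

puts "Finish: #{rss_mib} MiB"
puts "%.3f seconds" % (Time.now - started)
```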

In the output below, "Finish" is ultimately the metric that matters to the user, but the tokenizer's own contribution to total memory use is closer to "Finish" minus "Post-init". The tokenizer may actually consume even less than that, since definitions and other classes are lazy-loaded while the tokenizer runs.

Some of the post-init memory is likely the input file itself, which is read into memory up front; that would explain why the larger file has a larger post-init size than the smaller one.

master: 450KB input

» gtime -v ./memprof-tk.rb --fast file-s.x12 >/dev/null
Pre-init: 17 MiB
Post-init: 18 MiB
Finish: 44 MiB      <-- 26MB from tokenizer
1.992 seconds       <--
	Command being timed: "./memprof-tk.rb --fast file-s.x12"
	...
	Maximum resident set size (kbytes): 44516

wip: 450KB input

» gtime -v prof/memprof-tk.rb --fast prof/file-s.x12 >/dev/null
Pre-init: 17 MiB
Post-init: 18 MiB
Finish: 29 MiB      <-- 11MB from tokenizer
0.884 seconds       <--
	Command being timed: "prof/memprof-tk.rb --fast prof/file-s.x12"
	...
	Maximum resident set size (kbytes): 29284

The difference is more dramatic on larger files.

master: 4.4MB input

» gtime -v ./memprof-tk.rb --fast file-l.x12 >/dev/null
Pre-init: 17 MiB
Post-init: 22 MiB
Finish: 198 MiB      <-- 176MB from tokenizer
19.207 seconds       <--
	Command being timed: "./memprof-tk.rb --fast file-l.x12"
	...
	Maximum resident set size (kbytes): 199016

wip: 4.4MB input

» gtime -v prof/memprof-tk.rb --fast prof/file-l.x12 >/dev/null
Pre-init: 17 MiB
Post-init: 22 MiB
Finish: 33 MiB      <-- 11MB from tokenizer
7.932 seconds       <--
	Command being timed: "prof/memprof-tk.rb --fast prof/file-l.x12"
	User time (seconds): 7.79
	System time (seconds): 0.21
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:08.07
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 33700

When I have time, I'd like to plot runtime and memory usage as file size increases. One important caveat: these gains might be diminished once the parser is changed to accommodate the new tokenizer.

There are also further savings from not adding the FunctionalGroupDef to the config, which saves a few MB by not loading all the SegmentDefs and ElementDefs. With that change, this branch runs in 19 MB on the small file and 23 MB on the large one; subtracting the post-init sizes suggests the tokenizer itself uses 1 MB or less on both.
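Roughly, that configuration looks like the following (the registration calls are approximated from memory and may not match the real Stupidedi::Config API exactly):

```ruby
require "stupidedi"

config = Stupidedi::Config.new

# Tokenizing still needs the interchange definition to locate separators:
config.interchange.register("00501") do
  Stupidedi::Versions::Interchanges::FiveOhOne::InterchangeDef
end

# Deliberately *not* registered: the FunctionalGroupDef. Leaving it out
# means the SegmentDefs and ElementDefs are never lazy-loaded, which is
# where the few MB of savings come from.
# config.functional_group.register("005010") { ... }
```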

kputnam added 8 commits June 23, 2019 01:34
- Reader::Tokenizer returns fatal error when zero ISA's found or
  when an error occurs inside of an ISA..IEA envelope; other errors
  are not Reader::Result#fatal?
- Create Reader::Input to track the position of tokenized elements
- Fix subtle bugs in Reader::Pointer and/or Reader::StringPtr
- Parser::StateMachine#insert now destructively updates tokenizer state (separators and segment_dict)
- Parser::BuilderDsl uses new Reader::StacktracePosition
- Parser::DslReader has been merged into Parser::BuilderDsl
- Now ElementVal::AN and ElementVal::ID use a StringPtr where possible,
  but ElementVal::TM, ::DT, ::Nn, and ::R all need to "parse" the string,
  so they still require allocating the substring first (TODO; see the sketch below)
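A sketch of the distinction in that last item (class bodies abbreviated and hypothetical):

```ruby
module ElementVal
  # String-like types can wrap the pointer directly -- zero bytes copied:
  class AN
    def initialize(ptr)
      @ptr = ptr  # a StringPtr into the original input
    end
  end

  # Numeric types must parse, which currently forces the substring to be
  # materialized first -- the TODO above:
  class Nn
    def initialize(ptr)
      @value = Integer(ptr.to_s, 10)  # ptr.to_s allocates the substring
    end
  end
end
```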
kputnam force-pushed the gh-65 branch 3 times, most recently from 65d8e65 to 2424ce6 on July 19, 2019
- Don't use def_delegators on methods called frequently; it allocates *args
  on every call (see the sketch below)
- Build a native extension String.strncmp to compare two substrings, which
  results in faster StringPtr#== and StringPtr#start_with?
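The def_delegators point, sketched (class and attribute names are hypothetical):

```ruby
require "forwardable"

class SegmentTok
  extend Forwardable

  # Forwardable generates roughly:
  #   def id(*args, &block); @definition.id(*args, &block); end
  # so every call allocates an Array for *args -- costly on a hot path.
  # def_delegators :@definition, :id

  # Hand-written delegation allocates nothing per call:
  def id
    @definition.id
  end
end
```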
- Implement zero-allocation lstrip, rstrip, is_control_character_at?(offset),
  and lstrip_control_characters_offset(offset) (see the sketch below)
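The offset-based idea, sketched with assumed semantics (only ASCII spaces handled here, for brevity):

```ruby
# Instead of `string[offset..-1].lstrip`, which allocates two Strings,
# return the index where the stripped content begins -- no allocation.
def lstrip_offset(string, offset)
  offset += 1 while offset < string.length && string.getbyte(offset) == 0x20
  offset
end

lstrip_offset("AB   CD", 2)  #=> 5
```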
kputnam force-pushed the gh-65 branch 6 times, most recently from f747e4b to e64b6d2 on September 11, 2019
kputnam force-pushed the gh-65 branch 4 times, most recently from 23db4b5 to 75756c6 on September 12, 2019