-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New command: ref-explore #33
base: main
Are you sure you want to change the base?
New command: ref-explore #33
Conversation
a4ccb76
to
7ea9671
Compare
This has been incredibly helpful for us in figuring out the same issue with ActiveRecord objects being retained, thank you! 🙇 I'd definitely like to see this upstreamed. However, there have been some cases where I added this and it showed me a bunch of IMEMO objects that gave me a path to the object I was inspecting. def add_class_references(o, addr)
return unless o.key?('class')
ref = o.fetch('class').to_i(16)
(@reverse_references[ref] ||= []) << addr
end I'm not sure if this correctly models what CRuby GC does though. |
lib/heapy/reference_explorer.rb
Outdated
@reverse_references = {} | ||
@virtual_root_address = 0 | ||
File.open(filename) do |f| | ||
f.each.with_index do |line, i| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: trailing whitespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed this and many other whitespace issues by saving the file once more in my current editor configuration :D
elsif obj['type'] == 'ARRAY' | ||
"#{obj['length']} items" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For IMEMO
objects it would be useful to show obj['imemo_type']
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trusting you here, I blindly added this. The heap dump included in this repo does not seem to contain IMEMO
s. Feel free to double check that this is the desired result and maybe share an example how it is rendered :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this to work, 'imemo_type'
needs to be added here:
simple_object = o.slice('type', 'file', 'name', 'class', 'length')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙈 I clearly did not work on this piece of code for some time :D Fixed.
lib/heapy/reference_explorer.rb
Outdated
loop do | ||
print 'Enter address to inspect: ' | ||
|
||
drill_down($stdin.gets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: a proper readline-based REPL would be nicer, as this doesn't have history, exits on Ctrl-C, crashes on Ctrl-D, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point and something I did not look into before. I will check this out and depending on complexity add it to this PR, to make this feature more "production ready".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm... I just realized how dead simple readline is :D Added it.
7ea9671
to
f7b3b86
Compare
@oleg-vinted Thanks a lot for your feedback and review. I am happy that this was already helpful to someone. @schneems if there is anything I can do to help with heapy, feel free to reach out. I like this project a lot. |
3064ed2
to
863ed08
Compare
tl;dr I suggest we add this: def add_class_references(o, addr)
return unless o.key?('class')
return unless o['type'] != 'IMEMO'
ref = o.fetch('class').to_i(16)
(@reverse_references[ref] ||= []) << addr
end I dug into the Ruby source, and the GC does indeed mark every object's class, which makes sense - if an object is referenced from root (directly or indirectly), its class cannot be garbage collected. (The exception is IMEMO types, and since not-so-long-ago, there's no For some reason, however, the |
Here's a repro: require 'objspace'
ObjectSpace.trace_object_allocations_start
class MyClass; end
my_instance = MyClass.new
Object.send(:remove_const, :MyClass)
3.times { GC.start }
File.open('heap.json', 'w') do |f|
ObjectSpace.dump_all(output: f)
end
puts "Check this object's address using ref-explore:"
puts ObjectSpace.dump(my_instance.class) Without
With the patch:
|
863ed08
to
dac5992
Compare
Thanks for validating the assumptions in Ruby source code and giving me a reproduction example. Integrated your proposal. |
Allowing to perform analysis of references to an object.
dac5992
to
3618c02
Compare
I also used the heap dump generated by the example script that you provided to write some very basic specs. These test the basic functionality of the reference exploring, as well as the special case with the unreferenced class. |
Figured out the last missing part: the class reference is absent in the |
It seems like a helpful idea. Sorry for the silence. The tests are failing. Can you take a look? as an fyi Heap dumps have some odd behavior see: Basically: just because a reference exists in the dump doesn’t mean it is still retained. I think that a tool like yours is a good idea for when there is definitely a problem, but it might show false positives if a reference is not actually being retained but is not showing up in the dump. Maybe we could add something into the docs to educate people so they know the limitations of the tool (or they don’t open issues saying “heapy is claiming this object is retained in ref explore but it’s not”). |
# so we manually introduce the back-reference. | ||
def add_class_references(o, addr) | ||
return unless o.key?('class') | ||
return if o['type'] == 'IMEMO' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out this type filter is not completely correct either, however, as https://github.com/ruby/ruby/blob/d5ef373b1194bac64784ae316d125d7a2cf1988a/gc.c#L7026 can sometimes mark the class
depending on the type of IMEMO.
This would still work just fine in newer version of Ruby because the class would appear in the references array if there was indeed a reference.
Are you sure this is the case and not the other way around? 🤔 I.e. |
Looks like the newly added spec was incompatible with older Rubies.
We should be able to find that out using the |
|
The slice method apparently was added to Hashes later in the lifetime of Ruby. Versions 2.3 and 2.4 do not recognize this method. do ... ensure ... end blocks were also added in 2.5, previously requiring a block started with begin.
af1749f
to
946a9ce
Compare
Also replaced |
|
||
return unless addr | ||
|
||
simple_object = o.select { |k, _v| %w[type file name class length imemo_type].include?(k) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I believe %w[type file name class length imemo_type]
gets allocated at every iteration, consider extracting this into a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On that note: # frozen_string_literal: true
would prevent some more string allocations. Though I'm not sure what the coding style of this gem dictates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively: @schneems how would you feel about dropping support for EOL rubies from heapy
?
All CI failures so far were caused by tests for the older versions and this nit is the consequence of me dodging to use slice
.
Also considering that heapy
is not typically a dependency of a production application, but rather a tool running on a developer's machine I believe that it can allow itself to ask for recent Ruby versions.
Hello 👋
First of all: Thanks for heapy. It was already helpful a bunch of times when figuring out where a memory leak originated.
What's in this PR?
This pull request adds a new feature to heapy that allows to follow the references to a given object.
Use case: You are analyzing a memory bloating issue and found out that a lot of very old GC generations still contain hundreds of
ActiveRecord
object instances. This information alone does not yet tell you, why you have so many objects that have not been garbage collected. You just know what is retained, but not why.If you find an address of one of these objects (note: finding an address is out of scope for this PR), you can now follow the references to it and see why it has not been collected.
Example
Taking the fixture included in this repo as an example. The new command can run non-interactively (as all heapy commands do so far):
However, since the initial loading can take some time, there is also an interactive mode, where the dump is loaded once and the user can enter address after address:
In the examples above we can see that the object in question (an AR attribute) was not yet collected, because the request is still running, which is still referenced by the response and a
PagesController
instance.Background & considerations
I wrote this as a simple ruby script to be able to diagnose a memory bloat where I had no clue why objects were not garbage collected. Since it worked out and was helpful, I thought I could try to propose this as an upstream addition to heapy. Please let me know if you think this could be useful.
Things that will probably need further extension/talking about:
print "#{f.pos / (1024 * 1024)} MiB\r" if line_number % 1000 == 0
)grep generation\":100 my.dump
)