New command: ref-explore #33
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,161 @@ | ||
require 'json' | ||
require 'readline' | ||
require 'set' | ||
|
||
module Heapy | ||
|
||
# Follows references to given object addresses and prints | ||
# them as a reference stack. | ||
# Since multiple reference stacks are possible, it will preferably | ||
# try to print a stack that leads to a root node, since reference chains | ||
# leading to a root node will make an object non-collectible by GC. | ||
# | ||
# In case no chain to a root node can be found one possible stack is printed | ||
# as a fallback. | ||
class ReferenceExplorer | ||
def initialize(filename) | ||
@objects = {} | ||
@reverse_references = {} | ||
@virtual_root_address = 0 | ||
File.open(filename) do |f| | ||
f.each.with_index do |line, i| | ||
o = JSON.parse(line) | ||
addr = add_object(o) | ||
add_reverse_references(o, addr) | ||
add_class_references(o, addr) | ||
end | ||
end | ||
end | ||
|
||
def drill_down_list(addresses) | ||
addresses.each { |addr| drill_down(addr) } | ||
end | ||
|
||
def drill_down_interactive | ||
while buf = Readline.readline("Enter address > ", true) | ||
drill_down(buf) | ||
end | ||
end | ||
|
||
def drill_down(addr_string) | ||
addr = addr_string.to_i(16) | ||
puts | ||
|
||
chain = find_root_chain(addr) | ||
unless chain | ||
puts 'Could not find a reference chain leading to a root node. Searching for a non-specific chain now.' | ||
puts | ||
chain = find_any_chain(addr) | ||
end | ||
|
||
puts '## Reference chain' | ||
chain.each do |ref| | ||
puts format_object(ref) | ||
end | ||
|
||
puts | ||
puts "## All references to #{addr_string}" | ||
refs = @reverse_references[addr] || [] | ||
refs.each do |ref| | ||
puts " * #{format_object(ref)}" | ||
end | ||
|
||
puts | ||
end | ||
|
||
def inspect | ||
"<ReferenceExplorer #{@objects.size} objects; #{@reverse_references.size} back-refs>" | ||
end | ||
|
||
private | ||
|
||
def add_object(o) | ||
addr = o['address']&.to_i(16) | ||
if !addr && o['type'] == 'ROOT' | ||
addr = @virtual_root_address | ||
o['name'] ||= o['root'] | ||
@virtual_root_address += 1 | ||
end | ||
|
||
return unless addr | ||
|
||
simple_object = o.select { |k, _v| %w[type file name class length imemo_type].include?(k) } | ||
simple_object['class'] = simple_object['class'].to_i(16) if simple_object.key?('class') | ||
simple_object['file'] = o['file'] + ":#{o['line']}" if o.key?('file') && o.key?('line') | ||
|
||
@objects[addr] = simple_object | ||
|
||
addr | ||
end | ||
|
||
def add_reverse_references(o, addr) | ||
return unless o.key?('references') | ||
o.fetch('references').map { |r| r.to_i(16) }.each do |ref| | ||
(@reverse_references[ref] ||= []) << addr | ||
end | ||
end | ||
|
||
# An instance of a class keeps that class marked by the GC. | ||
# This is not directly indicated as a reference in a heap dump, | ||
# so we manually introduce the back-reference. | ||
def add_class_references(o, addr) | ||
return unless o.key?('class') | ||
return if o['type'] == 'IMEMO' | ||
Review comment: Turns out this type filter is not completely correct either, however, as https://github.com/ruby/ruby/blob/d5ef373b1194bac64784ae316d125d7a2cf1988a/gc.c#L7026 can sometimes mark the […]. This would still work just fine in newer versions of Ruby because the class would appear in the references array if there was indeed a reference. |
||
|
||
class_addr = o.fetch('class').to_i(16) | ||
(@reverse_references[class_addr] ||= []) << addr | ||
end | ||
|
||
def find_root_chain(addr, known_addresses = Set.new) | ||
known_addresses << addr | ||
|
||
return [addr] if addr < @virtual_root_address # assumption: only root objects have smallest possible addresses | ||
|
||
references = @reverse_references[addr] || [] | ||
|
||
references.reject { |a| known_addresses.include?(a) }.each do |ref| | ||
path = find_root_chain(ref, known_addresses) | ||
return [addr] + path if path | ||
end | ||
|
||
nil | ||
end | ||
|
||
def find_any_chain(addr, known_addresses = Set.new) | ||
known_addresses << addr | ||
|
||
references = @reverse_references[addr] || [] | ||
|
||
next_ref = references.reject { |a| known_addresses.include?(a) }.first | ||
if next_ref | ||
[addr] + find_any_chain(next_ref, known_addresses) | ||
else | ||
[addr] | ||
end | ||
end | ||
|
||
def format_path(path) | ||
return '' unless path | ||
|
||
path.split('/').reverse.take(4).reverse.join('/') | ||
end | ||
|
||
def format_object(addr) | ||
obj = @objects[addr] | ||
return "<Unknown 0x#{addr.to_s(16)}>" unless obj | ||
|
||
desc = if obj['name'] | ||
obj['name'] | ||
elsif obj['type'] == 'OBJECT' | ||
@objects.dig(obj['class'], 'name') | ||
elsif obj['type'] == 'ARRAY' | ||
"#{obj['length']} items" | ||
Comment on lines +151 to +152
Review comment: For […]
Review comment: Trusting you here, I blindly added this. The heap dump included in this repo does not seem to contain […]
Review comment: For this to work, […] simple_object = o.slice('type', 'file', 'name', 'class', 'length')
Review comment: 🙈 I clearly did not work on this piece of code for some time :D Fixed. |
||
elsif obj['type'] == 'IMEMO' | ||
obj['imemo_type'] | ||
end | ||
desc = desc ? " #{desc}" : '' | ||
addr = addr ? " 0x#{addr.to_s(16).upcase}" : '' | ||
"<#{obj['type']}#{desc}#{addr}> (allocated at #{format_path obj['file']})" | ||
end | ||
end | ||
end |
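
The approach described in the class comment (invert the `references` arrays of the dump, then walk the back-references until a root is reached) can be illustrated on a tiny in-memory dump. This is only a sketch under stated assumptions, not the code from this PR: the three JSON lines are made up, the helper names `build_reverse_index` and `chain_to_root` are hypothetical, ROOT entries get synthetic negative ids, and the search is an iterative breadth-first walk instead of the recursive search in the diff.

```ruby
require 'json'
require 'set'

# Builds a map from each referenced address to the ids of the objects that
# reference it. ROOT entries carry no address in the dump, so this sketch
# assigns them synthetic negative ids (an assumption made for illustration).
def build_reverse_index(lines)
  reverse = Hash.new { |hash, key| hash[key] = [] }
  roots = Set.new
  lines.each do |line|
    obj = JSON.parse(line)
    if obj['type'] == 'ROOT'
      id = -(roots.size + 1)
      roots << id
    elsif obj['address']
      id = obj['address'].to_i(16)
    else
      next
    end
    (obj['references'] || []).each { |ref| reverse[ref.to_i(16)] << id }
  end
  [reverse, roots]
end

# Breadth-first walk over back-references. Returns the chain from +start+
# to the first root found, or nil when no root is reachable.
def chain_to_root(start, reverse, roots)
  queue = [[start]]
  seen = Set[start]
  until queue.empty?
    path = queue.shift
    return path if roots.include?(path.last)

    reverse.fetch(path.last, []).each do |referrer|
      next unless seen.add?(referrer) # add? returns nil if already visited

      queue << path + [referrer]
    end
  end
  nil
end

# Hypothetical dump: a root references an array, which references a string.
sample_lines = [
  '{"type":"ROOT","root":"vm","references":["0x10"]}',
  '{"address":"0x10","type":"ARRAY","references":["0x20"]}',
  '{"address":"0x20","type":"STRING"}'
]
reverse, roots = build_reverse_index(sample_lines)
puts chain_to_root(0x20, reverse, roots).inspect
```

A breadth-first walk returns a shortest chain to a root and avoids deep recursion on large dumps; whether that trade-off suits this gem is a judgment call for the maintainers.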
Nit: I believe `%w[type file name class length imemo_type]` gets allocated at every iteration; consider extracting this into a constant.
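
One way the suggested extraction might look is sketched below. The constant name `SIMPLE_OBJECT_KEYS` and the helper `simplify` are hypothetical and do not appear in the PR; the key list is the one quoted in the comment above.

```ruby
# Hypothetical constant: built once when the file is loaded instead of
# once per parsed line. Freezing guards against accidental mutation.
SIMPLE_OBJECT_KEYS = %w[type file name class length imemo_type].freeze

# Hypothetical helper mirroring the select call in add_object.
def simplify(object)
  object.select { |key, _value| SIMPLE_OBJECT_KEYS.include?(key) }
end

puts simplify('type' => 'STRING', 'address' => '0x1', 'file' => 'a.rb').inspect
```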

On that note: `# frozen_string_literal: true` would prevent some more string allocations. Though I'm not sure what the coding style of this gem dictates.

Alternatively: @schneems how would you feel about dropping support for EOL rubies from `heapy`? All CI failures so far were caused by tests for the older versions, and this nit is the consequence of me dodging the use of `slice`. Also, considering that `heapy` is not typically a dependency of a production application but rather a tool running on a developer's machine, I believe it can afford to require recent Ruby versions.
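
For context on the `slice` trade-off: `Hash#slice` is available from Ruby 2.5 onward, while `select` with a key list gives the same result on older versions. The sketch below shows the two side by side; the helper `pick_keys` and the constant `WANTED_KEYS` are hypothetical names used only for illustration.

```ruby
WANTED_KEYS = %w[type file name class length imemo_type].freeze

# Uses Hash#slice where it exists (Ruby >= 2.5) and falls back to select
# on older interpreters. Both branches return a new hash limited to +keys+.
def pick_keys(hash, keys)
  if hash.respond_to?(:slice)
    hash.slice(*keys)
  else
    hash.select { |key, _value| keys.include?(key) }
  end
end

puts pick_keys({ 'name' => 'Foo', 'type' => 'CLASS', 'address' => '0x1' }, WANTED_KEYS).inspect
```

If support for EOL rubies were dropped, the fallback branch could go and the call site could use `slice` directly.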