Fix incorrect file offset calculation in memory mapping #220

Open · wants to merge 8 commits into main
5 changes: 5 additions & 0 deletions news/220.bugfix.rst
@@ -0,0 +1,5 @@
Fix incorrect file offset calculation when analyzing ELF files with
non-standard ELF layouts. Previously, pystack would fail to correctly analyze
Python binaries that had non-standard ELF layouts (for example when compiled
with certain linker options). The fix properly accounts for PT_LOAD segment
mappings when calculating file offsets.
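
For context, a minimal self-contained sketch of the translation this fix performs; the names here are illustrative, not pystack's actual API:

    #include <cstdint>
    #include <vector>

    // A PT_LOAD program header reduced to the three fields that matter here.
    struct LoadSegment {
        uint64_t vaddr;   // p_vaddr: where the segment is loaded in memory
        uint64_t offset;  // p_offset: where its bytes live in the file
        uint64_t filesz;  // p_filesz: how many bytes are file-backed
    };

    // Translate an unrelocated virtual address to a file offset by finding
    // the PT_LOAD segment that contains it.
    bool vaddrToFileOffset(const std::vector<LoadSegment>& segs, uint64_t vaddr, uint64_t* out)
    {
        for (const auto& seg : segs) {
            if (vaddr >= seg.vaddr && vaddr < seg.vaddr + seg.filesz) {
                *out = (vaddr - seg.vaddr) + seg.offset;
                return true;
            }
        }
        return false;  // not backed by any PT_LOAD segment
    }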
2 changes: 1 addition & 1 deletion requirements-test.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
 coverage[toml]
-pyinstaller<6.0
+pyinstaller
 pytest
 pytest-cov
 pytest-xdist
22 changes: 11 additions & 11 deletions src/pystack/_pystack.pyi
@@ -24,19 +24,19 @@ class CoreFileAnalyzer:
def missing_modules(self) -> List[str]: ...

class NativeReportingMode(enum.Enum):
-    ALL: int
-    OFF: int
-    PYTHON: int
+    ALL = 1
+    OFF = 2
+    PYTHON = 3

class StackMethod(enum.Enum):
-    ALL: int
-    ANONYMOUS_MAPS: int
-    AUTO: int
-    BSS: int
-    ELF_DATA: int
-    HEAP: int
-    SYMBOLS: int
-    DEBUG_OFFSETS: int
+    ALL = 1
+    ANONYMOUS_MAPS = 2
+    AUTO = 3
+    BSS = 4
+    ELF_DATA = 5
+    HEAP = 6
+    SYMBOLS = 7
+    DEBUG_OFFSETS = 8

class ProcessManager: ...

86 changes: 82 additions & 4 deletions src/pystack/_pystack/mem.cpp
@@ -398,17 +398,61 @@ CorefileRemoteMemoryManager::StatusCode
CorefileRemoteMemoryManager::getMemoryLocationFromCore(remote_addr_t addr, off_t* offset_in_file) const
{
auto corefile_it = std::find_if(d_vmaps.cbegin(), d_vmaps.cend(), [&](auto& map) {
-        return (map.Start() <= addr && addr < map.End()) && (map.FileSize() != 0 && map.Offset() != 0);
+        // When considering if the data is in the core file, we need to check if the address is
+        // within the chunk of the segment that is actually in the core file. map.End() corresponds
+        // to the end of the segment in memory so we need to use map.FileSize() to get the end of the
+        // segment in the core file.
+        uintptr_t fileEnd = map.Start() + map.FileSize();
pablogsal marked this conversation as resolved:

Contributor: Does this imply that fileEnd <= map.End()? If so, should we be checking that condition as well?

Member Author: Yep, do you want an assert for this?

Contributor: I was mostly checking my understanding, but I wouldn't be opposed to an assert. (A sketch of that assert follows this function.)

+        return (map.Start() <= addr && addr < fileEnd) && (map.FileSize() != 0 && map.Offset() != 0);
});
if (corefile_it == d_vmaps.cend()) {
return StatusCode::ERROR;
}

-    unsigned long base = corefile_it->Offset() - corefile_it->Start();
+    off_t base = corefile_it->Offset() - corefile_it->Start();
*offset_in_file = base + addr;
return StatusCode::SUCCESS;
}
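
A minimal sketch of the assert discussed in the thread above; this assumes a well-formed core file, where the file-backed portion of a segment never extends past its in-memory range:

    // Sketch only: with <cassert> included, the invariant from the review
    // thread could be checked right after computing fileEnd inside the lambda.
    uintptr_t fileEnd = map.Start() + map.FileSize();
    assert(fileEnd <= map.End());  // file-backed part must fit in the memory range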

CorefileRemoteMemoryManager::StatusCode
CorefileRemoteMemoryManager::initLoadSegments(const std::string& filename) const
{
int fd = open(filename.c_str(), O_RDONLY);
if (fd < 0) {
return StatusCode::ERROR;
}

Elf* elf = elf_begin(fd, ELF_C_READ, nullptr);
Contributor: Why ELF_C_READ? We use ELF_C_READ_MMAP everywhere else.
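// (Sketch of the reviewer's suggestion; ELF_C_READ_MMAP is the standard
// libelf command for an mmap-based reader:)
//     Elf* elf = elf_begin(fd, ELF_C_READ_MMAP, nullptr);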

if (!elf) {
close(fd);
return StatusCode::ERROR;
}

std::vector<ElfLoadSegment> segments;
size_t phnum;
if (elf_getphdrnum(elf, &phnum) == 0) {
for (size_t i = 0; i < phnum; i++) {
GElf_Phdr phdr_mem;
GElf_Phdr* phdr = gelf_getphdr(elf, i, &phdr_mem);
if (phdr == nullptr) continue;
Contributor: Isn't this just silently ignoring a legitimate error?

if (phdr->p_type == PT_LOAD) {
segments.push_back(
{.vaddr = phdr->p_vaddr, .offset = phdr->p_offset, .size = phdr->p_filesz});
}
}
}

elf_end(elf);
close(fd);
Contributor: Doesn't elf_end close the file descriptor? I thought a successful elf_begin takes over ownership of it, so this seems like a double close to me. Should be easy to confirm that by checking the return code...

Contributor: Ah, you're right. I misread some of our other code, which uses elf_unique_ptr to guarantee that elf_end gets called, and then doesn't call close after that point - but that's because it also uses file_unique_ptr, which always calls close, and I missed it because it was hidden in a destructor.

That said: maybe we should be reusing those two helper types, since we use them elsewhere. (A sketch of those helpers follows this function.)


if (!segments.empty()) {
d_elf_load_segments_cache[filename] = std::move(segments);
return StatusCode::SUCCESS;
}
return StatusCode::ERROR;
}
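
For reference, a minimal sketch of what reusing the elf_unique_ptr / file_unique_ptr helpers mentioned in the thread could look like; the deleters and the fd wrapper here are assumptions, not the codebase's actual definitions:

    #include <fcntl.h>
    #include <libelf.h>
    #include <memory>
    #include <unistd.h>

    // Assumed RAII shapes: elf_end() and close() run on every exit path,
    // so early returns cannot leak the handle or the descriptor.
    struct ElfDeleter { void operator()(Elf* e) const { if (e) elf_end(e); } };
    using elf_unique_ptr = std::unique_ptr<Elf, ElfDeleter>;

    struct ScopedFd {
        int fd;
        explicit ScopedFd(int f) : fd(f) {}
        ~ScopedFd() { if (fd >= 0) close(fd); }
        ScopedFd(const ScopedFd&) = delete;
        ScopedFd& operator=(const ScopedFd&) = delete;
    };

    // Usage sketch for initLoadSegments:
    //     ScopedFd fd(open(filename.c_str(), O_RDONLY));
    //     elf_unique_ptr elf(elf_begin(fd.fd, ELF_C_READ_MMAP, nullptr));
    //     ...any early return now cleans up both automatically...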

CorefileRemoteMemoryManager::StatusCode
CorefileRemoteMemoryManager::getMemoryLocationFromElf(
remote_addr_t addr,
@@ -418,12 +462,46 @@ CorefileRemoteMemoryManager::getMemoryLocationFromElf(
auto shared_libs_it = std::find_if(d_shared_libs.cbegin(), d_shared_libs.cend(), [&](auto& map) {
return map.start <= addr && addr <= map.end;
});

if (shared_libs_it == d_shared_libs.cend()) {
return StatusCode::ERROR;
}

*filename = &shared_libs_it->filename;
-    *offset_in_file = addr - shared_libs_it->start;
-    return StatusCode::SUCCESS;

// Check if we have cached segments for this file
auto cache_it = d_elf_load_segments_cache.find(**filename);
if (cache_it == d_elf_load_segments_cache.end()) {
// Initialize segments if not in cache
if (initLoadSegments(**filename) != StatusCode::SUCCESS) {
return StatusCode::ERROR;
}
cache_it = d_elf_load_segments_cache.find(**filename);
}

if (cache_it->second.empty()) {
return StatusCode::ERROR;
}
Comment on lines +483 to +485

Contributor: This case can never happen. Only initLoadSegments can add a key/value pair to the cache, and it returns an error if the value it would add is empty.


// Get the load address of the elf file
remote_addr_t elf_load_addr = cache_it->second[0].vaddr;

// Now relocate the address to the elf file
remote_addr_t symbol_vaddr = addr - (shared_libs_it->start - elf_load_addr);

// Find the correct segment
for (const auto& segment : cache_it->second) {
if (symbol_vaddr >= segment.vaddr && symbol_vaddr < segment.vaddr + segment.size) {
*offset_in_file = (symbol_vaddr - segment.vaddr) + segment.offset;
return StatusCode::SUCCESS;
}
}
Comment on lines +493 to +499

Contributor (@godlygeek, Jan 25, 2025): So we've got d_vmaps, which is a vector of files and their offsets in virtual memory, and we iterate over it to find the file that contains a given virtual address; then we look up the vector of segments contained in that file, and iterate over it to find the segment that contains the given virtual address.

Wouldn't everything be simpler if, instead of introducing a second level of vectors, we flattened things so that we keep a vector of segments as a member variable instead of a vector of ELF files?

Contributor (@godlygeek, Jan 25, 2025): The only advantage I can think of for how you've done it is that it's lazier and allows delaying reading the segments for an ELF file until the first time they're needed. But isn't reading the program headers quite fast? At least there's not much seeking involved, nor copying of strings, just reading a bunch of integers.

Member Author: I don't think I follow. These are different things:

  • d_vmaps is not a vector of files and their offsets: it is a vector of memory segments (some of which are backed by files) as seen by the core; this is what the core has mapped inside of it. Some data may be missing.
  • What we store in the cache via initLoadSegments, and what we iterate over here, are the segments in the ELF files that were mapped into the process while it was alive. Some of the data in these segments may be in the core and some may not. The addresses in these segments are not relocated to the process.

First, we look at d_vmaps for the file that contains a given absolute virtual address (an address in the process memory), and then we look up the ELF segments for that file; but what we search for is an address that is no longer absolute, because we have subtracted the load point of the library.

So I'm not sure I understand the proposal to flatten. There is nothing obvious that I can think of that we can flatten.

Contributor: We just talked offline, but for the sake of anyone who finds the PR later:

The thing that we could represent as a flat structure is a list of virtual address ranges, and, for each range, the file that should be read to find its contents and the offset to begin reading at to find the start of the range. This would let us stop treating from-core and from-executable as two separate cases, and use exactly the same code to handle them, at the cost of extra time in the constructor to build this mapping. (A sketch of this flattened structure follows this function.)

We argued back and forth a bit about the pros and cons of the two approaches, and agreed to try to pair program the other approach on Monday to see if it's as clean as I think it'll be or as slow and ugly as Pablo thinks it will be 😆


    LOG(ERROR) << "Failed to find the correct segment for address " << std::hex << std::showbase
               << addr << " (with vaddr offset " << symbol_vaddr << ")"
               << " in file " << **filename;

return StatusCode::ERROR;
}
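
For the record, a minimal sketch of the flattened structure discussed in the thread above: one vector of ranges, each knowing which file backs it and at what offset. The names are illustrative, not the codebase's:

    #include <cstdint>
    #include <string>
    #include <vector>

    // Illustrative flattened lookup table: from-core and from-executable
    // reads become the same case, resolved by a single range search.
    struct MappedRange {
        uintptr_t start;        // first virtual address covered
        uintptr_t end;          // one past the last covered address
        std::string file;       // core file or ELF file to read from
        uint64_t file_offset;   // offset of `start` within `file`
    };

    // Built once in the constructor; every read is then one search.
    bool resolve(const std::vector<MappedRange>& ranges, uintptr_t addr, std::string* file, uint64_t* off)
    {
        for (const auto& r : ranges) {
            if (r.start <= addr && addr < r.end) {
                *file = r.file;
                *off = r.file_offset + (addr - r.start);
                return true;
            }
        }
        return false;
    }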

bool
10 changes: 10 additions & 0 deletions src/pystack/_pystack/mem.h
@@ -207,6 +207,15 @@ class CorefileRemoteMemoryManager : public AbstractRemoteMemoryManager
ERROR,
};

struct ElfLoadSegment
{
GElf_Addr vaddr;
GElf_Off offset;
GElf_Xword size;
};
// Cache for PT_LOAD segments
mutable std::unordered_map<std::string, std::vector<ElfLoadSegment>> d_elf_load_segments_cache;
Contributor: I don't love this being mutable, but, fine... I guess you're delaying populating this cache instead of doing it when the memory manager is constructed to avoid paying the cost until the first time memory is actually read from a given shared library? (A sketch of the eager alternative follows this file's listing.)


// Data members
std::shared_ptr<CoreFileAnalyzer> d_analyzer;
std::vector<VirtualMap> d_vmaps;
@@ -220,5 +229,6 @@
remote_addr_t addr,
const std::string** filename,
off_t* offset_in_file) const;
StatusCode initLoadSegments(const std::string& filename) const;
};
} // namespace pystack
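
A short sketch of the eager alternative raised in the review comment on d_elf_load_segments_cache: populate the cache during construction, so lookups never mutate it and the mutable qualifier can be dropped. The constructor fragment below is an assumption; only the member and method names come from the diff:

    // Hypothetical constructor fragment: warm the PT_LOAD cache up front.
    // Failures are tolerated; affected lookups simply return ERROR later.
    for (const auto& lib : d_shared_libs) {
        (void)initLoadSegments(lib.filename);
    }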
11 changes: 11 additions & 0 deletions src/pystack/_pystack/process.cpp
@@ -750,6 +750,17 @@ AbstractProcessManager::findPythonVersion() const
}
int major = (version >> 24) & 0xFF;
int minor = (version >> 16) & 0xFF;

if (major == 0 && minor == 0) {
LOG(DEBUG) << "Failed to determine Python version from symbols: empty data copied";
return {-1, -1};
}

if (major != 2 && major != 3) {
LOG(DEBUG) << "Failed to determine Python version from symbols: invalid major version";
return {-1, -1};
}
Comment on lines +759 to +762

Contributor: Should we also check whether PY_RELEASE_LEVEL is in {0xA, 0xB, 0xC, 0xF} while we're at it? The valid values for that are almost as constrained as for the major version. (A sketch follows this function.)


LOG(DEBUG) << "Python version determined from symbols: " << major << "." << minor;
return {major, minor};
}
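
A sketch of the extra validation the reviewer suggests, assuming `version` follows CPython's PY_VERSION_HEX layout (the release level lives in bits 4-7); this is not part of the PR:

    // Hypothetical addition after the major-version check: reject impossible
    // release levels (0xA = alpha, 0xB = beta, 0xC = candidate, 0xF = final).
    int release_level = (version >> 4) & 0xF;
    if (release_level != 0xA && release_level != 0xB && release_level != 0xC
        && release_level != 0xF)
    {
        LOG(DEBUG) << "Failed to determine Python version from symbols: invalid release level";
        return {-1, -1};
    }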