forked from google/bloaty
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Added first draft of "How Bloaty Works" document.
- Loading branch information
Showing
1 changed file
with
135 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
|
||
# How Bloaty Works | ||
|
||
At a high level, Bloaty's goal is to create a map of the binary where every byte | ||
has a label attached to it. Every byte starts out as unknown (unattributed). | ||
As we scan data structures in the binary we assign labels to various ranges of | ||
the binary. For example, if the user selected the "symbols" data source we scan | ||
the symbol table. For each symbol, we find the corresponding section of the | ||
binary and "paint" it with the label of that symbol. | ||
|
||
The ELF/Mach-O/etc. data structures we are traversing were not designed to | ||
enable size profiling. They were designed to assist linkers, loaders, | ||
debuggers, stack unwinders, etc. to run and debug the binary. This means | ||
that Bloaty is inherently an unconventional use of ELF/Mach-O structures, | ||
and has to be clever about how to use the available information to | ||
achieve its goal. This can pose a challenge, but also makes Bloaty fun | ||
to work on. | ||
|
||
## Range Map | ||
|
||
RangeMap (as defined in [range_map.h](https://github.com/google/bloaty/blob/master/src/range_map.h)) | ||
is the core data structure of Bloaty. It is a sparse map of | ||
`[start, end) -> std::string` that associates regions of VM or file space to | ||
a label. | ||
|
||
Sometimes we know a region's start but not its end. For example, Mach-O | ||
symbols have an address but not a size (whereas ELF symbols have both). | ||
To support this case, `RangeMap` supports adding an address with `kUnknownSize`. | ||
A range with unknown size will automatically extend to the beginning of the | ||
next region, even if the next region is added later. | ||
|
||
If we try to add a label to a range of the binary that has already been assigned | ||
a label, the first label assigned takes precedence. This means that the order | ||
in which we scan data structures is significant. So our general strategy is to | ||
scan our most granular and detailed information first. We scan generic | ||
information as a last resort, to give at least some information for parts of the | ||
binary that we couldn't find any more specific information about. | ||
|
||
## VM Space and File Space | ||
|
||
Loadable binaries have two fundamental domains of space we are trying to map: | ||
*VM space* and *file space*. File space is the bytes of the input file. VM | ||
space is the bytes of memory when the executable is loaded at runtime. Some | ||
regions of the binary exist only in file space (like debug info) and some | ||
regions exist only in VM space (like `.bss`, zero-initialized data). Even | ||
entities that exist in both spaces can have different sizes in each. | ||
|
||
We create two separate `RangeMap` structures for these two domains. For | ||
convenience, we put them together into a single structure called `DualMap`: | ||
|
||
```cpp | ||
struct DualMap { | ||
RangeMap vm_map; | ||
RangeMap file_map; | ||
}; | ||
``` | ||
We populate these two maps simultaneously as we scan the file. We must do this | ||
even in the case where we only ultimately care about one of them, because almost | ||
all of the data structures we are scanning specify locations in either file | ||
space *or* VM space, but not both. So we must be able to translate one domain | ||
to the other to get the benefit of all the metadata in the binary. | ||
## Base Map | ||
To translate between VM and file space, we always begin by creating a "base | ||
map." The base map is just a `DualMap` like any other, but we give it special | ||
meaning: | ||
* It defines what ranges of file and VM space constitute "the entire binary" | ||
(ie. the "TOTALS" row of the final report). | ||
* We use it to translate between VM space and File space. | ||
This means that the base map must be exhaustive, and must also provide | ||
translation for any entity that exists in both VM and file space. For example, | ||
if we see in the symbol table that address `0x12345` corresponds to | ||
symbol `foo`, we will add that to VM map immediately, but we will | ||
also use the base map to translate address `0x12355` to a file | ||
offset so we can add that range to the file map. | ||
How does the base map store translation info? I left one thing out about | ||
`RangeMap` above. In addition to storing a label for every region, it can also | ||
(optionally) store a member called `other_start`. This is the offset in the | ||
other space where this range maps to. | ||
We build the base map by scanning either the segments (program headers) or | ||
sections of the binary. These give both VM address and file offset for regions | ||
of the binary that are loaded into memory. To make sure we cover the entire | ||
file space, we use `[Unmapped]` as a last ditch fallback for any regions of the | ||
on-disk binary that didn't have any segment/section data associated with them. | ||
This ensures that Bloaty always accounts for the entire physical binary, even if | ||
we can't find any information about it. | ||
## Scanning Data Sources | ||
Once we have built the base map, we can get on to the meat of Bloaty's work. | ||
We can now scan the binary according to whatever data source(s) the user has | ||
selected. | ||
We must do more than just scan a single table though. For example, if the user | ||
asked us to scan symbols we can't simply scan the symbol table, add all the | ||
symbol table entries to a `RangeMap`, and call it a day. The first prerelease | ||
version of Bloaty did just that and I got very understandable complaints that | ||
Bloaty's coverage was low, and very large parts of the binary would show up as | ||
`[None]`. | ||
The problem is that the symbol table only refers to the actual machine code of | ||
the function. But a function can emit far more artifacts into a binary than | ||
just the machine code. It can also emit: | ||
* unwind info into `.eh_frame` | ||
* relocations into `.rela.dyn`, etc. | ||
* debugging information into various `.debug_*` sections. | ||
* the symbol table entries themselves (yes, these do take up space!) | ||
If we want to achieve high coverage of the binary, we have to scan all of these | ||
different sections and attribute them as best we can to individual | ||
symbols/compileunits/etc. | ||
All of this means that we have to delve deep into ELF, DWARF, Mach-O, etc. to | ||
get good coverage. We have to venture into dark arts like `.eh_frame` which is | ||
underspecified and underdocumented, and that few programs ever have any reason | ||
to parse. For Bloaty to work well, we must go deeper. | ||
## ELF | ||
TODO | ||
## DWARF | ||
TODO | ||
## Mach-O | ||
TODO |