aheejin
Member

@aheejin aheejin commented Aug 25, 2025

This adds support for the names field in source maps, which contains function names. Source map mappings are updated correspondingly, and emsymbolizer can now provide function name information using source maps alone. It looks like the Dart toolchain's source map generation also supports the names field: https://github.com/dart-lang/sdk/blob/187c3cb004b5f6a0a1f1b242b7d1b8a6b33b9a7a/pkg/wasm_builder/lib/source_map.dart#L105-L118

This parses the name section for function name information.
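
For readers unfamiliar with the name section, here is a minimal sketch of how function names can be read from it. This is not the code in this PR; the helper names (`read_uleb`, `read_name`, `function_names`, `parse_name_section`) are made up for illustration, and it only assumes the standard wasm binary layout (custom section id 0, function-names subsection id 1).

```python
def read_uleb(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def read_name(buf, pos):
    """Read a length-prefixed UTF-8 string; return (string, next_pos)."""
    length, pos = read_uleb(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def parse_name_section(buf, pos, end):
    """Walk the subsections and collect entries of the function-names one."""
    names = {}
    while pos < end:
        sub_id = buf[pos]
        sub_size, pos = read_uleb(buf, pos + 1)
        sub_end = pos + sub_size
        if sub_id == 1:  # function names subsection
            count, pos = read_uleb(buf, pos)
            for _ in range(count):
                index, pos = read_uleb(buf, pos)
                names[index], pos = read_name(buf, pos)
        pos = sub_end  # skip anything we did not parse
    return names


def function_names(wasm):
    """Return {function index: name} from the 'name' custom section, or {}."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a wasm binary")
    pos = 8  # skip the 4-byte magic and 4-byte version
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb(wasm, pos + 1)
        end = pos + size
        if section_id == 0:  # custom section
            name, body = read_name(wasm, pos)
            if name == "name":
                return parse_name_section(wasm, body, end)
        pos = end
    return {}
```

The sketch returns names keyed by function index. Relating those indices to the code addresses used in source map mappings needs the code section offsets as well, which is left out here.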

I considered parsing DW_TAG_subprogram tags in llvm-dwarfdump output to get function names, but then we cannot use --recurse-depth=0 added in #9580 to reduce the amount of llvm-dwarfdump output. To avoid that problem, we could use a DWARF-parsing Python library like https://github.com/eliben/pyelftools, but that would add another third-party dependency.

To measure the source map size increase, I ran this on wasm-opt.wasm built by adding the -gsource-map link flag to the if (EMSCRIPTEN) setup here
(https://github.com/WebAssembly/binaryen/blob/969bf763a495b475e2a28163e7d70a5dd01f9dda/CMakeLists.txt#L299-L365). The source map file size increased from 352743 to 443373, about 25%, which I think is tolerable.

Fixes #20715.

This adds support for the `names` field in source maps, which contains
function names. Source map mappings are updated correspondingly, and
emsymbolizer can now provide function name information using source
maps alone.
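
As an illustration of what "mappings are updated correspondingly" means in the source map v3 format: a segment can carry an optional fifth field, an index into `names`, encoded as a base64 VLQ delta like the other fields. The sketch below is not the code in `wasm-sourcemap.py`; `encode_vlq` and `build_source_map` are hypothetical helpers, inputs are assumed to be 0-based, and every segment is given all five fields for simplicity.

```python
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_vlq(value):
    """Encode one signed integer as a source map base64 VLQ string."""
    # The sign goes into the lowest bit, the magnitude into the rest.
    vlq = ((-value) << 1) | 1 if value < 0 else value << 1
    out = ""
    while True:
        digit = vlq & 0x1F
        vlq >>= 5
        if vlq:
            digit |= 0x20  # continuation bit: more digits follow
        out += B64[digit]
        if not vlq:
            return out


def build_source_map(entries, sources, names):
    """entries: (address, source index, line, column, name index) tuples,
    sorted by address. Each field is stored as a delta from the previous
    segment, and wasm source maps keep all segments on one generated line."""
    prev = (0, 0, 0, 0, 0)
    segments = []
    for entry in entries:
        segments.append(
            "".join(encode_vlq(cur - old) for cur, old in zip(entry, prev)))
        prev = entry
    return {
        "version": 3,
        "sources": sources,
        "names": names,
        "mappings": ",".join(segments),
    }


source_map = build_source_map(
    [(0x10, 0, 4, 2, 0), (0x18, 0, 5, 2, 0), (0x30, 0, 9, 0, 1)],
    sources=["a.c"],
    names=["main", "foo"],
)
```

A consumer such as a symbolizer would decode the fifth field of the segment covering an address and look it up in `names` to report the function name.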

To do this, you can use `emcc -gsource-map=names`. This also adds
separate internal settings for this name generation and the existing
source embedding, which makes them more readable. Because they are
internal settings, they don't increase the number of external options.
When you run `wasm-sourcemap.py` standalone, you can use `--names`.

While we already have the name section and DWARF, I think this is
generally good to support, given that the field exists for that purpose
and JS source maps support it. It looks like the Dart toolchain also
supports it:
https://github.com/dart-lang/sdk/blob/187c3cb004b5f6a0a1f1b242b7d1b8a6b33b9a7a/pkg/wasm_builder/lib/source_map.dart#L105-L118

To measure source map size increase, I ran this on `wasm-opt.wasm` built
by the `if (EMSCRIPTEN)` setup here
(https://github.com/WebAssembly/binaryen/blob/969bf763a495b475e2a28163e7d70a5dd01f9dda/CMakeLists.txt#L299-L365)
with `-gsource-map` vs. `-gsource-map=names`. The source map file size
increased from 352743 to 443373, about 25%.
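
For reference, the arithmetic behind that figure, using only the two sizes quoted above:

```python
before = 352743  # bytes, built with -gsource-map
after = 443373   # bytes, built with -gsource-map=names

increase = after - before
percent = 100 * increase / before
print(f"{increase} bytes larger, {percent:.1f}% increase")
```

This works out to 90630 extra bytes, or roughly 25.7%.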

While I think a 25% increase in source map file size is tolerable,
this option is off by default, because with it we can't use the trick
from emscripten-core#9580. So far we only needed `DW_TAG_compile_unit`s
in `llvm-dwarfdump` results, and for that we could get away with
printing only the top-level tags using `--recurse-depth=0`. But to
gather function information, we need to parse all `DW_TAG_subprogram`s,
which can be at any depth (because functions can be within nested
namespaces or classes). So the trick in emscripten-core#9580 does not
work, and dumping the whole `.debug_info` section will be slow. To avoid
this problem, we could use a DWARF-parsing Python library like
https://github.com/eliben/pyelftools, but that would add another
third-party dependency, so I'm not sure it's worth it at this point.

Fixes emscripten-core#20715.
@aheejin aheejin requested review from kripken, sbc100 and dschuff August 25, 2025 17:36
@kripken
Member

kripken commented Aug 25, 2025

I'm curious if you considered an alternative where we use the Names section to get this information? That would avoid the slow dwarfdump operation, and instead read the Names section (same way we currently implement symbol maps maybe?) and add that info to source maps.

@aheejin
Member Author

aheejin commented Aug 25, 2025

Hmm, that's a good idea. It creates a dependency on the name section, but in Emscripten there is no scenario where we have DWARF but no name section, I guess?

@dschuff
Member

dschuff commented Aug 25, 2025

+1, I think this is a good idea too.

Currently, the -g flags that cause DWARF generation also always leave a name section in the binary. That's true for source maps too, but I'm proposing to change the latter (see the discussion in #20462).
But I think that actually doesn't matter; currently when the binary comes out of the linker it has a name section, and then we generate the source map from the DWARF, and then after that we add the sourceMappingURL section and then finally strip out the DWARF and producers sections. So if we were going to remove the name section it would presumably be at the end, and it would be available for source map generation.

@aheejin
Member Author

aheejin commented Aug 26, 2025

This now uses the name section for the function info, and it is turned on by default. (We don't have -gsource-map=names anymore.)

@aheejin
Member Author

aheejin commented Aug 26, 2025

Hmm, the latest change seems to increase source map size hugely. Have to check.

Development

Successfully merging this pull request may close these issues.

Should we add function names to source maps generated by emscripten?
3 participants