cmd/compile: redundant morestack spill space in nosplit functions #74413

Open
@mcy

Description

Consider the following program:

package x

var x func(a, b, c, d uint64)

//go:nosplit
//go:noinline
func y(a, b, c, d uint64) {
    x(a, b, c, d)
}

//go:nosplit
func z(a, b, c, d uint64) {
    y(a, b, c, d)
}

When assembled, it looks like this:

        TEXT    .y(SB), NOSPLIT|ABIInternal, $40-32
        PUSHQ   BP
        MOVQ    SP, BP
        SUBQ    $32, SP
        MOVQ    .x(SB), DX
        MOVQ    (DX), SI
        PCDATA  $1, $0
        CALL    SI
        ADDQ    $32, SP
        POPQ    BP
        RET
        TEXT    .z(SB), NOSPLIT|ABIInternal, $40-32
        CMPQ    SP, 16(R14)
        JLS    // morestack
        PUSHQ   BP
        MOVQ    SP, BP
        SUBQ    $32, SP
        CALL    .y(SB)
        ADDQ    $32, SP
        POPQ    BP
        RET

The primary thing to note here is that both .y and .z reserve 32 bytes of spill space for their callees to spill their arguments. However, .z's sole callee is a nosplit function, which therefore does not contain a morestack check. As far as I know, these 32 bytes are never written to in any code path.

This has a few unfortunate side effects, but the itch I'm trying to scratch is that I have a bunch of performance-critical nosplit functions whose arguments/returns fully saturate the argument and return registers, and are never spilled. On x86, I am limited to about 10 stack frames before I hit the nosplit limit in the linker.

This is all well and fine: 10 frames is a lot. Unfortunately, this assumes two things:

  1. No further stack variables are created. I have a custom build tag that turns on debug instrumentation, which blows up the size of the three nested nosplit frames I actually have just enough that my program fails to link.
  2. No instrumentation calls are inserted. Turning on fuzzing inserts nosplit instrumentation function calls, which blow the nosplit stack budget and cause my program to fail to link.

I have been working around this in a few different ways, because this is a very niche problem being suffered by a performance weirdo. However, I did notice that 72 bytes of each frame go unused: the spill space reserved for the morestack path.
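To make the frame budget concrete, here is a rough back-of-the-envelope sketch. It assumes a nosplit budget of 800 bytes (my understanding of the base limit on amd64, not something stated in this issue) and a frame consisting of the 72 bytes of spill space plus 8 bytes each for the saved BP and the return address. The real linker check accounts for more than this, so treat the numbers as an approximation only.

```go
package main

import "fmt"

const (
	// ASSUMPTION: nosplit stack budget in bytes. This value is not taken
	// from the issue; it is an illustrative figure for amd64.
	nosplitBudget = 800

	spillBytes = 72 // unused spill space per frame, as reported in the issue
	savedBP    = 8  // PUSHQ BP in the prologue
	returnPC   = 8  // return address pushed by CALL
)

// maxFrames returns how many frames of frameBytes fit into budget bytes.
func maxFrames(budget, frameBytes int) int {
	return budget / frameBytes
}

func main() {
	withSpill := spillBytes + savedBP + returnPC
	withoutSpill := savedBP + returnPC

	fmt.Println("frames with spill space:   ", maxFrames(nosplitBudget, withSpill))
	fmt.Println("frames without spill space:", maxFrames(nosplitBudget, withoutSpill))
}
```

Under these assumptions the spill space dominates the per-frame cost, which is roughly in line with the limit of about 10 frames mentioned above.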

As the ABI documentation observes, there are many options for improving this situation. I'd like to suggest an improvement that should be simple to implement, and will go some way to eliminating redundant stack growth: if a function only calls functions declared as nosplit, treat it like a leaf function for the purposes of the prologue.
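The proposed rule can be sketched as a predicate over a toy call graph. The `Func` type and `needsSpillSpace` function below are hypothetical and exist only for illustration; they are not the compiler's actual data structures. The sketch also assumes that any indirect call (through a func value or interface) must be treated conservatively, since the callee is not known to be nosplit.

```go
package main

import "fmt"

// Func is a hypothetical, simplified description of a compiled function.
type Func struct {
	Name         string
	NoSplit      bool    // declared with //go:nosplit
	Callees      []*Func // statically known direct callees
	IndirectCall bool    // calls through a func value or interface
}

// needsSpillSpace reports whether f must reserve argument spill space
// for its callees under the proposed rule: the space is only needed if
// some callee might run a morestack check, i.e. is not known to be nosplit.
func needsSpillSpace(f *Func) bool {
	if f.IndirectCall {
		// Unknown callee: assume it may be a splittable function.
		return true
	}
	for _, c := range f.Callees {
		if !c.NoSplit {
			return true
		}
	}
	// No callees, or only nosplit callees: treat like a leaf.
	return false
}

func main() {
	// Mirrors the example program: y calls the func value x, z calls only y.
	y := &Func{Name: "y", NoSplit: true, IndirectCall: true}
	z := &Func{Name: "z", NoSplit: true, Callees: []*Func{y}}

	for _, f := range []*Func{y, z} {
		fmt.Printf("%s needs spill space: %v\n", f.Name, needsSpillSpace(f))
	}
}
```

In this model, y keeps its spill space because it calls through a func value, while z, whose only callee is nosplit, would not need any.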

Of course, this isn't quite so simple. First, the argument registers no longer have a natural home, so spill slots for them will need to be allocated if they are in fact necessary. Second, there might be a place in reflect that expects this spill area to be here, but I'm not certain. Third, it messes up traceback printing, which will need to be aware of spill-space-less functions.

This also only benefits nosplit code, which isn't particularly common. I'm more-or-less hitting a pathological case. The real fix is to modify stack growth to allow callees to reserve their own space, as the ABI document details.

Metadata

Assignees

No one assigned

    Labels

    Implementation: Issues describing a semantics-preserving change to the Go implementation.
    NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
    Performance
    compiler/runtime: Issues related to the Go compiler and/or runtime.
