Go call stacks sizes and resizes

You can decide for yourself that an unholy alliance of the Go compiler and runtime can allocate variables on the call stack. But doing so forces you to notice the small initial size of a goroutine’s call stack: 0x1000, or 4096 bytes.

Here’s what my previous code said about the main goroutine’s initial stack:

stack top     00002d18eba43000
backing store 00002d18eba42e60
stack bottom  00002d18eba42000
Slice backing store on stack: true

0x2d18eba43000 - 0x2d18eba42000 = 0x1000 = 4096

That’s not much in my old C programmer estimation. A recursive function uses that much stack getting out of bed.

I wrote another program to see what happens when a recursive function uses up the 4096-byte initial call stack.

The function looks like this:

func probeStack(ply int, oldHi uintptr) {
    currentHi := myg.stack.hi // myg has global scope
    if currentHi != oldHi {
        fmt.Printf("ply % 3d top was   %016x\n", ply, oldHi)
        fmt.Printf("stack top         %016x has changed\n", currentHi)
        fmt.Printf("stack size        %016x\n", currentHi-myg.stack.lo)
        fmt.Printf("formal argument   %016x\n", uintptr(unsafe.Pointer(&ply)))
        fmt.Printf("local variable    %016x\n", uintptr(unsafe.Pointer(&currentHi))
        fmt.Printf("stack bottom      %016x\n\n", myg.stack.lo)
    }
    if ply > 0 {
        probeStack(ply-1, currentHi)
    }
}

The function keeps a copy of the value of the top-of-stack, myg.stack.hi. myg is a global variable, and due to some really ingenious linking in package github.com/timandy/routine my program sets it to the address of the struct g for the main goroutine of this program. Because the Go runtime isn’t protected by virtual memory as the Linux (or MacOS or Windows) kernel is, a program can read values from the runtime, like its stack high and low addresses.

The function above only prints out info about its stack if the highest stack address changes.

By running the program with different numeric arguments, we can see what size call stack allocations are.

Below, what my program printed out when given a “200” argument. The ply number starts at 200 and counts down. The function stops recursing and returns when ply contains a zero value.

 1	stack top         00002fa9e80d5000
 2	stack size        0000000000001000
 3	
 4	ply  184 top was  00002fa9e80d5000
 5	stack top         00002fa9e80e4000 has changed
 6	stack size        0000000000002000
 7	formal argument   00002fa9e80e3350
 8	local variable    00002fa9e80e32c8
 9	stack bottom      00002fa9e80e2000
10	
11	ply  163 top was  00002fa9e80e4000
12	stack top         00002fa9e8114000 has changed
13	stack size        0000000000004000
14	formal argument   00002fa9e8112390
15	local variable    00002fa9e8112308
16	stack bottom      00002fa9e8110000
17	
18	ply  120 top was  00002fa9e8114000
19	stack top         00002fa9e80ee000 has changed
20	stack size        0000000000008000
21	formal argument   00002fa9e80ea350
22	local variable    00002fa9e80ea2c8
23	stack bottom      00002fa9e80e6000
24	
25	ply   35 top was  00002fa9e80ee000
26	stack top         00002fa9e80fe000 has changed
27	stack size        0000000000010000
28	formal argument   00002fa9e80f6390
29	local variable    00002fa9e80f6308
30	stack bottom      00002fa9e80ee000

A goroutine’s call stack starts at 0x1000 (4096) bytes, and gets doubled every time the number of recursions use up all the space on the call stack.

Line 2 shows the initial stack size, 0x1000. Line 6 is the first doubling, 13 the second, 20 the third doubling, and 27 the fourth.

You can see that new, larger call stacks can be higher in memory (line 26), or lower than the previous stack (line 19). Each larger stack is not an extension of the previous stack, but an entirely new block of memory.

This investigation leaves me with a few more questions.

  1. Does the Go runtime copy the entire old stack contents into the new larger stack, or are these stacks linked together somehow?
  2. What’s in these call stacks? My “C programmer” understanding of call stacks is that they were at least somewhat determined by the CPU, like SPARC architecture stack frames needing room for register window spills.