@Chris Fallin @Trevor Elliott: will ra2 ever prefer to spill rather than start to use non-preferred registers when it has only been using preferred registers thus far?
I ask because I'm seeing the following distributions of the number of clobbered callee-save registers for our sightglass benchmarks. I find it very surprising that the count is always either zero or all of the callee-save registers: there is not a single case where we clobber just some of them.
# Number of samples = 757
# Min = 0
# Max = 5
#
# Mean = 1.9682959048877162
# Standard deviation = 2.4428038716280174
# Variance = 5.967290755240832
#
# Each ∎ is a count of 9
#
0 .. 1 [ 459 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1 .. 2 [ 0 ]:
2 .. 3 [ 0 ]:
3 .. 4 [ 0 ]:
4 .. 5 [ 0 ]:
5 .. 6 [ 298 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
6 .. 7 [ 0 ]:
7 .. 8 [ 0 ]:
8 .. 9 [ 0 ]:
9 .. 10 [ 0 ]:
# Number of samples = 18279
# Min = 0
# Max = 5
#
# Mean = 1.8119153126538674
# Standard deviation = 2.4034432514706436
# Variance = 5.77653946303978
#
# Each ∎ is a count of 233
#
0 .. 1 [ 11655 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1 .. 2 [ 0 ]:
2 .. 3 [ 0 ]:
3 .. 4 [ 0 ]:
4 .. 5 [ 0 ]:
5 .. 6 [ 6624 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
6 .. 7 [ 0 ]:
7 .. 8 [ 0 ]:
8 .. 9 [ 0 ]:
9 .. 10 [ 0 ]:
# Number of samples = 127
# Min = 0
# Max = 5
#
# Mean = 0.5511811023622047
# Standard deviation = 1.5659198268780583
# Variance = 2.452104904209808
#
# Each ∎ is a count of 2
#
0 .. 1 [ 113 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1 .. 2 [ 0 ]:
2 .. 3 [ 0 ]:
3 .. 4 [ 0 ]:
4 .. 5 [ 0 ]:
5 .. 6 [ 14 ]: ∎∎∎∎∎∎∎
6 .. 7 [ 0 ]:
7 .. 8 [ 0 ]:
8 .. 9 [ 0 ]:
9 .. 10 [ 0 ]:
RA2 will try to use non-preferred registers before spilling
I don't have time to page in enough context to grok the above right now but I'm happy to look at it with you later
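For readers following along, the ordering described above (preferred registers first, then non-preferred, then spill) can be sketched as a toy model. This is not regalloc2 code: `Placement` and `place` are hypothetical names, registers are plain `u8` indices, and the sketch ignores eviction, live-range splitting, and spill weights, all of which a real allocator has to weigh.

```rust
/// Where a virtual register ends up in this toy model.
#[derive(Debug, PartialEq, Eq)]
enum Placement {
    Reg(u8),
    Spill,
}

/// Pick the first free register, scanning preferred registers before
/// non-preferred ones, and spill only if both lists are exhausted.
/// `busy` lists the registers already occupied at this program point.
/// (Hypothetical helper; not part of regalloc2's API.)
fn place(preferred: &[u8], non_preferred: &[u8], busy: &[u8]) -> Placement {
    preferred
        .iter()
        .chain(non_preferred.iter())
        .copied()
        .find(|r| !busy.contains(r))
        .map(Placement::Reg)
        .unwrap_or(Placement::Spill)
}
```

With `preferred = [0, 1]` and `non_preferred = [19, 20]`, a request made while registers 0 and 1 are busy lands in register 19 rather than on the stack.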
Incidentally, something that would be interesting to try is to move pregs from the non-preferred set to the preferred set when they are used for the first time. The logic here is that once you've saved the register in the function prologue, it costs just as much as other callee-saved registers to use.
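A minimal sketch of that promotion idea, assuming a simplified pool with two sets of physical registers. `RegPool` and `note_first_use` are made-up names for illustration and do not correspond to regalloc2's actual data structures.

```rust
use std::collections::BTreeSet;

/// Toy register pool (hypothetical; not regalloc2's real types).
#[derive(Debug, Default)]
struct RegPool {
    preferred: BTreeSet<u8>,
    non_preferred: BTreeSet<u8>,
}

impl RegPool {
    fn new(preferred: &[u8], non_preferred: &[u8]) -> Self {
        RegPool {
            preferred: preferred.iter().copied().collect(),
            non_preferred: non_preferred.iter().copied().collect(),
        }
    }

    /// Record that `reg` has just been allocated. If it was non-preferred,
    /// promote it: the prologue save is now needed regardless, so further
    /// uses of this register add no extra save/restore cost.
    /// Returns true if a promotion happened.
    fn note_first_use(&mut self, reg: u8) -> bool {
        if self.non_preferred.remove(&reg) {
            self.preferred.insert(reg);
            true
        } else {
            false
        }
    }
}
```

Calling `note_first_use(19)` on a pool built with `RegPool::new(&[0, 1, 2], &[19, 20])` moves register 19 into the preferred set; a second call for the same register is a no-op.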
ah that's actually a great idea -- happy to review a PR for that if you want to make one
My use case doesn't have proper stack frames (returns are treated as a new entry point into the function), so it's not something I would use myself. I just use preferred/non-preferred to hint towards selecting registers with a compressed RISC-V encoding (x8-x15).
It would be nice to be able to save callee-saved registers only in the paths that actually clobber them. Interpreters may have a fast path where they don't clobber any callee-saved registers and a slow path which does a lot more work and as such has to clobber some callee-saved registers. Moving the saves from the prologue to the slow path would speed up the fast path.
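To make the fast-path/slow-path argument concrete, here is a rough counting model, under the assumption that each path through a function is described only by the callee-saved registers it clobbers. `saves_on_path` is a hypothetical helper, and the model ignores control-flow details such as loops and shared blocks.

```rust
use std::collections::BTreeSet;

/// Count the register saves executed when path `taken` runs.
/// Each entry of `paths` lists the callee-saved registers that path clobbers.
/// With prologue saves, every path pays for the union of all clobbers;
/// with per-path saves, a path pays only for its own clobbers.
/// (Hypothetical model; not how any particular compiler implements this.)
fn saves_on_path(paths: &[&[u8]], taken: usize, per_path_saves: bool) -> usize {
    if per_path_saves {
        // Deduplicate in case a path lists the same register twice.
        paths[taken].iter().copied().collect::<BTreeSet<u8>>().len()
    } else {
        paths
            .iter()
            .flat_map(|p| p.iter().copied())
            .collect::<BTreeSet<u8>>()
            .len()
    }
}
```

For an interpreter whose fast path clobbers nothing and whose slow path clobbers three registers, the fast path executes three saves with prologue placement and none with per-path placement, while the slow path pays for three saves either way.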
Last updated: Oct 23 2024 at 20:03 UTC