Skip to content

[X86] Better handling of impossibly large stack frames #124217

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 25, 2025

Conversation

mconst
Copy link
Contributor

@mconst mconst commented Jan 24, 2025

If you try to create a stack frame of 4 GiB or larger with a 32-bit stack pointer, we currently emit invalid instructions like mov eax, 5000000000 (unless you specify -fstack-clash-protection, in which case we emit a trap instead).

The trap seems nicer, so let's do that in all cases. This avoids emitting invalid instructions, and also fixes the "can't have 32-bit 16GB stack frame" assertion in X86FrameLowering::emitSPUpdate() (which used to be triggerable by user code, but is now correct).

This was originally part of #124041.

@phoebewang

If you try to create a stack frame of 4 GiB or larger with a 32-bit
stack pointer, we currently emit invalid instructions like `mov eax,
5000000000` (unless you specify `-fstack-clash-protection`, in which
case we emit a trap instead).

The trap seems nicer, so let's do that in all cases.  This avoids
emitting invalid instructions, and also fixes the "can't have 32-bit
16GB stack frame" assertion in `X86FrameLowering::emitSPUpdate()`
(which used to be triggerable by user code, but is now correct).

This was originally part of llvm#124041.
@llvmbot
Copy link
Member

llvmbot commented Jan 24, 2025

@llvm/pr-subscribers-backend-x86

Author: None (mconst)

Changes

If you try to create a stack frame of 4 GiB or larger with a 32-bit stack pointer, we currently emit invalid instructions like mov eax, 5000000000 (unless you specify -fstack-clash-protection, in which case we emit a trap instead).

The trap seems nicer, so let's do that in all cases. This avoids emitting invalid instructions, and also fixes the "can't have 32-bit 16GB stack frame" assertion in X86FrameLowering::emitSPUpdate() (which used to be triggerable by user code, but is now correct).

This was originally part of #124041.

@phoebewang


Full diff: https://github.com/llvm/llvm-project/pull/124217.diff

3 Files Affected:

  • (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+9-4)
  • (modified) llvm/test/CodeGen/X86/huge-stack-offset.ll (+3-5)
  • (modified) llvm/test/CodeGen/X86/stack-clash-extra-huge.ll (+2-24)
diff --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index 47cc6a18ef8433..a7b60afb7f5473 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -234,6 +234,14 @@ void X86FrameLowering::emitSPUpdate(MachineBasicBlock &MBB,
   MachineInstr::MIFlag Flag =
       isSub ? MachineInstr::FrameSetup : MachineInstr::FrameDestroy;
 
+  if (!Uses64BitFramePtr && !isUInt<32>(Offset)) {
+    // We're being asked to adjust a 32-bit stack pointer by 4 GiB or more.
+    // This might be unreachable code, so don't complain now; just trap if
+    // it's reached at runtime.
+    BuildMI(MBB, MBBI, DL, TII.get(X86::TRAP));
+    return;
+  }
+
   uint64_t Chunk = (1LL << 31) - 1;
 
   MachineFunction &MF = *MBB.getParent();
@@ -829,10 +837,7 @@ void X86FrameLowering::emitStackProbeInlineGenericLoop(
           .addReg(StackPtr)
           .setMIFlag(MachineInstr::FrameSetup);
     } else {
-      // We're being asked to probe a stack frame that's 4 GiB or larger,
-      // but our stack pointer is only 32 bits.  This might be unreachable
-      // code, so don't complain now; just trap if it's reached at runtime.
-      BuildMI(MBB, MBBI, DL, TII.get(X86::TRAP));
+      llvm_unreachable("Offset too large for 32-bit stack pointer");
     }
 
     // while in the loop, use loop-invariant reg for CFI,
diff --git a/llvm/test/CodeGen/X86/huge-stack-offset.ll b/llvm/test/CodeGen/X86/huge-stack-offset.ll
index e825328ccd89a2..6629811a59b23d 100644
--- a/llvm/test/CodeGen/X86/huge-stack-offset.ll
+++ b/llvm/test/CodeGen/X86/huge-stack-offset.ll
@@ -13,11 +13,9 @@ define void @foo() nounwind {
 ; CHECK-64-NEXT: addq    [[RAX]], %rsp
 
 ; CHECK-32-LABEL: foo:
-; CHECK-32:      movl    $50000000{{..}}, %eax
-; CHECK-32-NEXT: subl    %eax, %esp
+; CHECK-32:      ud2
 ; CHECK-32-NOT:  subl    $2147483647, %esp
-; CHECK-32:      movl    $50000000{{..}}, [[EAX:%e..]]
-; CHECK-32-NEXT: addl    [[EAX]], %esp
+; CHECK-32:      ud2
   %1 = alloca [5000000000 x i8], align 16
   call void @bar(ptr %1)
   ret void
@@ -46,7 +44,7 @@ define i32 @foo3(i32 inreg %x) nounwind {
 ; CHECK-64-NEXT: subq    %rax, %rsp
 
 ; CHECK-32-LABEL: foo3:
-; CHECK-32:      subl $2147483647, %esp
+; CHECK-32:      ud2
 ; CHECK-32-NOT:  movl ${{.*}}, %eax
   %1 = alloca [5000000000 x i8], align 16
   call void @bar(ptr %1)
diff --git a/llvm/test/CodeGen/X86/stack-clash-extra-huge.ll b/llvm/test/CodeGen/X86/stack-clash-extra-huge.ll
index b8031056fd6b0a..d9b20f50e9a884 100644
--- a/llvm/test/CodeGen/X86/stack-clash-extra-huge.ll
+++ b/llvm/test/CodeGen/X86/stack-clash-extra-huge.ll
@@ -30,44 +30,22 @@ define i32 @foo() local_unnamed_addr #0 {
 ; CHECK-X86-LABEL: foo:
 ; CHECK-X86:       # %bb.0:
 ; CHECK-X86-NEXT:    ud2
-; CHECK-X86-NEXT:    .cfi_def_cfa_register %eax
-; CHECK-X86-NEXT:    .cfi_adjust_cfa_offset 4800000000
-; CHECK-X86-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
-; CHECK-X86-NEXT:    subl $4096, %esp # imm = 0x1000
-; CHECK-X86-NEXT:    movl $0, (%esp)
-; CHECK-X86-NEXT:    cmpl %eax, %esp
-; CHECK-X86-NEXT:    jne .LBB0_1
-; CHECK-X86-NEXT:  # %bb.2:
-; CHECK-X86-NEXT:    subl $12, %esp
-; CHECK-X86-NEXT:    .cfi_def_cfa_register %esp
 ; CHECK-X86-NEXT:    .cfi_def_cfa_offset 4800000016
 ; CHECK-X86-NEXT:    movl $1, 392(%esp)
 ; CHECK-X86-NEXT:    movl $1, 28792(%esp)
 ; CHECK-X86-NEXT:    movl (%esp), %eax
-; CHECK-X86-NEXT:    movl $4800000012, %ecx # imm = 0x11E1A300C
-; CHECK-X86-NEXT:    addl %ecx, %esp
+; CHECK-X86-NEXT:    ud2
 ; CHECK-X86-NEXT:    .cfi_def_cfa_offset 4
 ; CHECK-X86-NEXT:    retl
 ;
 ; CHECK-X32-LABEL: foo:
 ; CHECK-X32:       # %bb.0:
 ; CHECK-X32-NEXT:    ud2
-; CHECK-X32-NEXT:    .cfi_def_cfa_register %r11
-; CHECK-X32-NEXT:    .cfi_adjust_cfa_offset 4799995904
-; CHECK-X32-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
-; CHECK-X32-NEXT:    subl $4096, %esp # imm = 0x1000
-; CHECK-X32-NEXT:    movq $0, (%esp)
-; CHECK-X32-NEXT:    cmpl %r11d, %esp
-; CHECK-X32-NEXT:    jne .LBB0_1
-; CHECK-X32-NEXT:  # %bb.2:
-; CHECK-X32-NEXT:    subl $3976, %esp # imm = 0xF88
-; CHECK-X32-NEXT:    .cfi_def_cfa_register %rsp
 ; CHECK-X32-NEXT:    .cfi_def_cfa_offset 4799999888
 ; CHECK-X32-NEXT:    movl $1, 264(%esp)
 ; CHECK-X32-NEXT:    movl $1, 28664(%esp)
 ; CHECK-X32-NEXT:    movl -128(%esp), %eax
-; CHECK-X32-NEXT:    movl $4799999880, %ecx # imm = 0x11E1A2F88
-; CHECK-X32-NEXT:    addl %ecx, %esp
+; CHECK-X32-NEXT:    ud2
 ; CHECK-X32-NEXT:    .cfi_def_cfa_offset 8
 ; CHECK-X32-NEXT:    retq
   %a = alloca i32, i64 1200000000, align 16

Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@mconst
Copy link
Contributor Author

mconst commented Jan 24, 2025

Thanks for reviewing this! I think it's ready to go, if someone wants to merge it.

@phoebewang phoebewang merged commit 1f26ac1 into llvm:main Jan 25, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants