-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[X86] Better handling of impossibly large stack frames #124217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
If you try to create a stack frame of 4 GiB or larger with a 32-bit stack pointer, we currently emit invalid instructions like `mov eax, 5000000000` (unless you specify `-fstack-clash-protection`, in which case we emit a trap instead). The trap seems nicer, so let's do that in all cases. This avoids emitting invalid instructions, and also fixes the "can't have 32-bit 16GB stack frame" assertion in `X86FrameLowering::emitSPUpdate()` (which used to be triggerable by user code, but is now correct). This was originally part of llvm#124041.
@llvm/pr-subscribers-backend-x86 Author: None (mconst) ChangesIf you try to create a stack frame of 4 GiB or larger with a 32-bit stack pointer, we currently emit invalid instructions like The trap seems nicer, so let's do that in all cases. This avoids emitting invalid instructions, and also fixes the "can't have 32-bit 16GB stack frame" assertion in This was originally part of #124041. @phoebewang Full diff: https://github.com/llvm/llvm-project/pull/124217.diff 3 Files Affected:
diff --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index 47cc6a18ef8433..a7b60afb7f5473 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -234,6 +234,14 @@ void X86FrameLowering::emitSPUpdate(MachineBasicBlock &MBB,
MachineInstr::MIFlag Flag =
isSub ? MachineInstr::FrameSetup : MachineInstr::FrameDestroy;
+ if (!Uses64BitFramePtr && !isUInt<32>(Offset)) {
+ // We're being asked to adjust a 32-bit stack pointer by 4 GiB or more.
+ // This might be unreachable code, so don't complain now; just trap if
+ // it's reached at runtime.
+ BuildMI(MBB, MBBI, DL, TII.get(X86::TRAP));
+ return;
+ }
+
uint64_t Chunk = (1LL << 31) - 1;
MachineFunction &MF = *MBB.getParent();
@@ -829,10 +837,7 @@ void X86FrameLowering::emitStackProbeInlineGenericLoop(
.addReg(StackPtr)
.setMIFlag(MachineInstr::FrameSetup);
} else {
- // We're being asked to probe a stack frame that's 4 GiB or larger,
- // but our stack pointer is only 32 bits. This might be unreachable
- // code, so don't complain now; just trap if it's reached at runtime.
- BuildMI(MBB, MBBI, DL, TII.get(X86::TRAP));
+ llvm_unreachable("Offset too large for 32-bit stack pointer");
}
// while in the loop, use loop-invariant reg for CFI,
diff --git a/llvm/test/CodeGen/X86/huge-stack-offset.ll b/llvm/test/CodeGen/X86/huge-stack-offset.ll
index e825328ccd89a2..6629811a59b23d 100644
--- a/llvm/test/CodeGen/X86/huge-stack-offset.ll
+++ b/llvm/test/CodeGen/X86/huge-stack-offset.ll
@@ -13,11 +13,9 @@ define void @foo() nounwind {
; CHECK-64-NEXT: addq [[RAX]], %rsp
; CHECK-32-LABEL: foo:
-; CHECK-32: movl $50000000{{..}}, %eax
-; CHECK-32-NEXT: subl %eax, %esp
+; CHECK-32: ud2
; CHECK-32-NOT: subl $2147483647, %esp
-; CHECK-32: movl $50000000{{..}}, [[EAX:%e..]]
-; CHECK-32-NEXT: addl [[EAX]], %esp
+; CHECK-32: ud2
%1 = alloca [5000000000 x i8], align 16
call void @bar(ptr %1)
ret void
@@ -46,7 +44,7 @@ define i32 @foo3(i32 inreg %x) nounwind {
; CHECK-64-NEXT: subq %rax, %rsp
; CHECK-32-LABEL: foo3:
-; CHECK-32: subl $2147483647, %esp
+; CHECK-32: ud2
; CHECK-32-NOT: movl ${{.*}}, %eax
%1 = alloca [5000000000 x i8], align 16
call void @bar(ptr %1)
diff --git a/llvm/test/CodeGen/X86/stack-clash-extra-huge.ll b/llvm/test/CodeGen/X86/stack-clash-extra-huge.ll
index b8031056fd6b0a..d9b20f50e9a884 100644
--- a/llvm/test/CodeGen/X86/stack-clash-extra-huge.ll
+++ b/llvm/test/CodeGen/X86/stack-clash-extra-huge.ll
@@ -30,44 +30,22 @@ define i32 @foo() local_unnamed_addr #0 {
; CHECK-X86-LABEL: foo:
; CHECK-X86: # %bb.0:
; CHECK-X86-NEXT: ud2
-; CHECK-X86-NEXT: .cfi_def_cfa_register %eax
-; CHECK-X86-NEXT: .cfi_adjust_cfa_offset 4800000000
-; CHECK-X86-NEXT: .LBB0_1: # =>This Inner Loop Header: Depth=1
-; CHECK-X86-NEXT: subl $4096, %esp # imm = 0x1000
-; CHECK-X86-NEXT: movl $0, (%esp)
-; CHECK-X86-NEXT: cmpl %eax, %esp
-; CHECK-X86-NEXT: jne .LBB0_1
-; CHECK-X86-NEXT: # %bb.2:
-; CHECK-X86-NEXT: subl $12, %esp
-; CHECK-X86-NEXT: .cfi_def_cfa_register %esp
; CHECK-X86-NEXT: .cfi_def_cfa_offset 4800000016
; CHECK-X86-NEXT: movl $1, 392(%esp)
; CHECK-X86-NEXT: movl $1, 28792(%esp)
; CHECK-X86-NEXT: movl (%esp), %eax
-; CHECK-X86-NEXT: movl $4800000012, %ecx # imm = 0x11E1A300C
-; CHECK-X86-NEXT: addl %ecx, %esp
+; CHECK-X86-NEXT: ud2
; CHECK-X86-NEXT: .cfi_def_cfa_offset 4
; CHECK-X86-NEXT: retl
;
; CHECK-X32-LABEL: foo:
; CHECK-X32: # %bb.0:
; CHECK-X32-NEXT: ud2
-; CHECK-X32-NEXT: .cfi_def_cfa_register %r11
-; CHECK-X32-NEXT: .cfi_adjust_cfa_offset 4799995904
-; CHECK-X32-NEXT: .LBB0_1: # =>This Inner Loop Header: Depth=1
-; CHECK-X32-NEXT: subl $4096, %esp # imm = 0x1000
-; CHECK-X32-NEXT: movq $0, (%esp)
-; CHECK-X32-NEXT: cmpl %r11d, %esp
-; CHECK-X32-NEXT: jne .LBB0_1
-; CHECK-X32-NEXT: # %bb.2:
-; CHECK-X32-NEXT: subl $3976, %esp # imm = 0xF88
-; CHECK-X32-NEXT: .cfi_def_cfa_register %rsp
; CHECK-X32-NEXT: .cfi_def_cfa_offset 4799999888
; CHECK-X32-NEXT: movl $1, 264(%esp)
; CHECK-X32-NEXT: movl $1, 28664(%esp)
; CHECK-X32-NEXT: movl -128(%esp), %eax
-; CHECK-X32-NEXT: movl $4799999880, %ecx # imm = 0x11E1A2F88
-; CHECK-X32-NEXT: addl %ecx, %esp
+; CHECK-X32-NEXT: ud2
; CHECK-X32-NEXT: .cfi_def_cfa_offset 8
; CHECK-X32-NEXT: retq
%a = alloca i32, i64 1200000000, align 16
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks for reviewing this! I think it's ready to go, if someone wants to merge it. |
If you try to create a stack frame of 4 GiB or larger with a 32-bit stack pointer, we currently emit invalid instructions like
mov eax, 5000000000
(unless you specify-fstack-clash-protection
, in which case we emit a trap instead).The trap seems nicer, so let's do that in all cases. This avoids emitting invalid instructions, and also fixes the "can't have 32-bit 16GB stack frame" assertion in
X86FrameLowering::emitSPUpdate()
(which used to be triggerable by user code, but is now correct).This was originally part of #124041.
@phoebewang