[X86] Fix ABI for passing after i128 #124134
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1555,30 +1555,30 @@ define <4 x double> @fmuladd_contract_v4f64(<4 x double> %a, <4 x double> %b, <4 | |
; SOFT-FLOAT-64-NEXT: .cfi_offset %r14, -32 | ||
; SOFT-FLOAT-64-NEXT: .cfi_offset %r15, -24 | ||
; SOFT-FLOAT-64-NEXT: .cfi_offset %rbp, -16 | ||
; SOFT-FLOAT-64-NEXT: movq %r9, %rbp | ||
There's no
Alright,
Right. The previous implementation also (unintentionally) affected cases like an illegal
Why it doesn't trigger
Because functionArgumentNeedsConsecutiveRegisters only returns true for i128, so <4 x double> gets the default behavior.
I see, thanks! |
||
; SOFT-FLOAT-64-NEXT: movq %rcx, %r14 | ||
; SOFT-FLOAT-64-NEXT: movq %rdx, %r15 | ||
; SOFT-FLOAT-64-NEXT: movq %rsi, %r12 | ||
; SOFT-FLOAT-64-NEXT: movq %rsi, %r13 | ||
; SOFT-FLOAT-64-NEXT: movq %rdi, %rbx | ||
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rbp | ||
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-NEXT: movq %r8, %rdi | ||
; SOFT-FLOAT-64-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-NEXT: movq %rax, %r13 | ||
; SOFT-FLOAT-64-NEXT: movq %rax, %r12 | ||
; SOFT-FLOAT-64-NEXT: movq %r14, %rdi | ||
; SOFT-FLOAT-64-NEXT: movq %rbp, %rsi | ||
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-NEXT: movq %rax, %r14 | ||
; SOFT-FLOAT-64-NEXT: movq %r15, %rdi | ||
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-NEXT: movq %rax, %r15 | ||
; SOFT-FLOAT-64-NEXT: movq %r12, %rdi | ||
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-NEXT: movq %r13, %rdi | ||
; SOFT-FLOAT-64-NEXT: movq %rbp, %rsi | ||
; SOFT-FLOAT-64-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-NEXT: movq %rax, %rdi | ||
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-NEXT: movq %rax, %r12 | ||
; SOFT-FLOAT-64-NEXT: movq %rax, %r13 | ||
; SOFT-FLOAT-64-NEXT: movq %r15, %rdi | ||
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-NEXT: callq __adddf3@PLT | ||
|
@@ -1587,13 +1587,13 @@ define <4 x double> @fmuladd_contract_v4f64(<4 x double> %a, <4 x double> %b, <4 | |
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-NEXT: movq %rax, %r14 | ||
; SOFT-FLOAT-64-NEXT: movq %r13, %rdi | ||
; SOFT-FLOAT-64-NEXT: movq %r12, %rdi | ||
; SOFT-FLOAT-64-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-NEXT: movq %rax, 24(%rbx) | ||
; SOFT-FLOAT-64-NEXT: movq %r14, 16(%rbx) | ||
; SOFT-FLOAT-64-NEXT: movq %r15, 8(%rbx) | ||
; SOFT-FLOAT-64-NEXT: movq %r12, (%rbx) | ||
; SOFT-FLOAT-64-NEXT: movq %r13, (%rbx) | ||
; SOFT-FLOAT-64-NEXT: movq %rbx, %rax | ||
; SOFT-FLOAT-64-NEXT: addq $8, %rsp | ||
; SOFT-FLOAT-64-NEXT: .cfi_def_cfa_offset 56 | ||
|
@@ -1633,30 +1633,30 @@ define <4 x double> @fmuladd_contract_v4f64(<4 x double> %a, <4 x double> %b, <4 | |
; SOFT-FLOAT-64-FMA-NEXT: .cfi_offset %r14, -32 | ||
; SOFT-FLOAT-64-FMA-NEXT: .cfi_offset %r15, -24 | ||
; SOFT-FLOAT-64-FMA-NEXT: .cfi_offset %rbp, -16 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r9, %rbp | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rcx, %r14 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rdx, %r15 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rsi, %r12 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rsi, %r13 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rdi, %rbx | ||
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rbp | ||
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r8, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, %r13 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, %r12 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r14, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rbp, %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, %r14 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r15, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, %r15 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r12, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r13, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rbp, %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, %r12 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, %r13 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r15, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: callq __adddf3@PLT | ||
|
@@ -1665,13 +1665,13 @@ define <4 x double> @fmuladd_contract_v4f64(<4 x double> %a, <4 x double> %b, <4 | |
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, %r14 | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r13, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r12, %rdi | ||
; SOFT-FLOAT-64-FMA-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rax, 24(%rbx) | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r14, 16(%rbx) | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r15, 8(%rbx) | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r12, (%rbx) | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %r13, (%rbx) | ||
; SOFT-FLOAT-64-FMA-NEXT: movq %rbx, %rax | ||
; SOFT-FLOAT-64-FMA-NEXT: addq $8, %rsp | ||
; SOFT-FLOAT-64-FMA-NEXT: .cfi_def_cfa_offset 56 | ||
|
@@ -1711,30 +1711,30 @@ define <4 x double> @fmuladd_contract_v4f64(<4 x double> %a, <4 x double> %b, <4 | |
; SOFT-FLOAT-64-FMA4-NEXT: .cfi_offset %r14, -32 | ||
; SOFT-FLOAT-64-FMA4-NEXT: .cfi_offset %r15, -24 | ||
; SOFT-FLOAT-64-FMA4-NEXT: .cfi_offset %rbp, -16 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r9, %rbp | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rcx, %r14 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rdx, %r15 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rsi, %r12 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rsi, %r13 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rdi, %rbx | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rbp | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r8, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, %r13 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, %r12 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r14, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rbp, %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, %r14 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r15, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, %r15 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r12, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r13, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rbp, %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: callq __muldf3@PLT | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, %r12 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, %r13 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r15, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: callq __adddf3@PLT | ||
|
@@ -1743,13 +1743,13 @@ define <4 x double> @fmuladd_contract_v4f64(<4 x double> %a, <4 x double> %b, <4 | |
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, %r14 | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r13, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r12, %rdi | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq {{[0-9]+}}(%rsp), %rsi | ||
; SOFT-FLOAT-64-FMA4-NEXT: callq __adddf3@PLT | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rax, 24(%rbx) | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r14, 16(%rbx) | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r15, 8(%rbx) | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r12, (%rbx) | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %r13, (%rbx) | ||
; SOFT-FLOAT-64-FMA4-NEXT: movq %rbx, %rax | ||
; SOFT-FLOAT-64-FMA4-NEXT: addq $8, %rsp | ||
; SOFT-FLOAT-64-FMA4-NEXT: .cfi_def_cfa_offset 56 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,7 +31,7 @@ define i128 @on_stack2(i64 %a0, i64 %a1, i64 %a2, i64 %a3, i64 %a4, i128 %a5, i1 | |
define i64 @trailing_arg_on_stack(i64 %a0, i64 %a1, i64 %a2, i64 %a3, i64 %a4, i128 %a5, i64 %a6) { | ||
; CHECK-LABEL: trailing_arg_on_stack: | ||
; CHECK: # %bb.0: | ||
; CHECK-NEXT: movq 24(%rsp), %rax | ||
; CHECK-NEXT: movq %r9, %rax | ||
This is the most relevant test diff. |
||
; CHECK-NEXT: retq | ||
ret i64 %a6 | ||
} | ||
|
@@ -78,20 +78,18 @@ define void @call_trailing_arg_on_stack(i128 %x, i64 %y) nounwind { | |
; CHECK-LABEL: call_trailing_arg_on_stack: | ||
; CHECK: # %bb.0: | ||
; CHECK-NEXT: pushq %rax | ||
; CHECK-NEXT: movq %rdx, %rax | ||
; CHECK-NEXT: movq %rsi, %r9 | ||
; CHECK-NEXT: movq %rdx, %r9 | ||
This is relevant too. The |
||
; CHECK-NEXT: movq %rsi, %rax | ||
; CHECK-NEXT: movq %rdi, %r10 | ||
; CHECK-NEXT: subq $8, %rsp | ||
; CHECK-NEXT: movl $1, %esi | ||
; CHECK-NEXT: movl $2, %edx | ||
; CHECK-NEXT: movl $3, %ecx | ||
; CHECK-NEXT: movl $4, %r8d | ||
; CHECK-NEXT: xorl %edi, %edi | ||
; CHECK-NEXT: pushq %rax | ||
; CHECK-NEXT: pushq %r9 | ||
; CHECK-NEXT: pushq %r10 | ||
; CHECK-NEXT: callq trailing_arg_on_stack@PLT | ||
; CHECK-NEXT: addq $32, %rsp | ||
; CHECK-NEXT: addq $16, %rsp | ||
; CHECK-NEXT: popq %rax | ||
; CHECK-NEXT: retq | ||
call i128 @trailing_arg_on_stack(i64 0, i64 1, i64 2, i64 3, i64 4, i128 %x, i64 %y) | ||
|
Is this correct? The fourth i128 seems to be split between R9 and the stack.

Ok, so you mean that neither this nor four single i128 arguments are legal, because even for the latter, the fourth i128 should be turned into memory by the FE. In that case, I think we may not need to handle the problem here.

Yeah, I'm assuming that the frontend will handle <2 x i128> by directly using byval (https://clang.godbolt.org/z/bznzTKohz -- interestingly, clang still directly returns the vector; I would have expected it to use sret rather than relying on sret demotion in the backend...).

I could extend this code to also handle vectors of i128 and require the whole vector argument to be in consecutive arguments. I'm just not sure it makes sense to handle this, as there is no defined psABI for <2 x i128> in the first place, and if there were, the frontend would handle that part.

Returning is another story; we sometimes emit warnings: https://clang.godbolt.org/z/fYo937x3K

I think we can leave it as is.
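
For readers following the trailing_arg_on_stack and call_trailing_arg_on_stack diffs, the register assignment those tests check can be sketched with a small model. This is an illustration only, not the LLVM implementation: `assign_args` and its location strings are hypothetical names, and the model assumes only i64 and i128 arguments, no floating-point arguments, no sret pointer, and 16-byte stack alignment for i128.

```python
# Simplified model of integer argument assignment on x86-64 SysV.
# Assumptions (not taken from the patch): only i64 and i128 arguments,
# no floating-point arguments, no hidden sret pointer.
GPRS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def assign_args(arg_types):
    """Return one location string per argument.

    An i64 takes the next free register, or a stack slot if none remain.
    An i128 takes the next two registers only if both are free; otherwise
    the whole value goes to the stack and no register is consumed, so a
    later i64 can still use the leftover register.
    """
    next_reg = 0
    stack_offset = 0  # bytes from the start of the outgoing argument area
    locations = []
    for ty in arg_types:
        if ty == "i64":
            if next_reg < len(GPRS):
                locations.append(GPRS[next_reg])
                next_reg += 1
            else:
                locations.append(f"stack+{stack_offset}")
                stack_offset += 8
        elif ty == "i128":
            if next_reg + 2 <= len(GPRS):
                locations.append(f"{GPRS[next_reg]}:{GPRS[next_reg + 1]}")
                next_reg += 2
            else:
                # Assumed 16-byte alignment for an i128 passed in memory.
                stack_offset = (stack_offset + 15) & ~15
                locations.append(f"stack+{stack_offset}")
                stack_offset += 16
        else:
            raise ValueError(f"unsupported type in this sketch: {ty}")
    return locations


# Signature of trailing_arg_on_stack: five i64, one i128, one trailing i64.
print(assign_args(["i64"] * 5 + ["i128", "i64"]))
# Caller signature of call_trailing_arg_on_stack: (i128 %x, i64 %y).
print(assign_args(["i128", "i64"]))
```

In this model the trailing i64 of trailing_arg_on_stack lands in r9 rather than on the stack, which corresponds to the `movq %r9, %rax` line in the test diff, and the caller's `%y` arrives in rdx, which corresponds to `movq %rdx, %r9`.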