[Clang][PowerPC] Add __dmr1024 type and DMF integer calculation builtins #142480

Merged (4 commits) on Jun 20, 2025

@maryammo maryammo commented Jun 2, 2025

Define the __dmr1024 type used to manipulate the new DMR registers introduced by the Dense Math Facility (DMF) on PowerPC, and add six Clang builtins that correspond to the integer outer-product accumulate to ACC PowerPC instructions:

  • __builtin_mma_dmxvi8gerx4
  • __builtin_mma_pmdmxvi8gerx4
  • __builtin_mma_dmxvi8gerx4pp
  • __builtin_mma_pmdmxvi8gerx4pp
  • __builtin_mma_dmxvi8gerx4spp
  • __builtin_mma_pmdmxvi8gerx4spp.
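
A minimal usage sketch, modelled on the test functions added in this PR (load a __dmr1024 and a __vector_pair, call the builtin, store the result). The helper names dmf_i8ger_available and dmf_i8ger_accumulate, the HAVE_DMF_I8GER macro, and the __has_builtin guard are assumptions made for illustration and are not part of the patch; on targets where the builtin is not available the function does nothing and returns 0.

```c
#include <assert.h>
#include <stddef.h>

/* Compilers without __has_builtin are treated as lacking every builtin. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if defined(__powerpc64__) && __has_builtin(__builtin_mma_dmxvi8gerx4pp)
#define HAVE_DMF_I8GER 1
#else
#define HAVE_DMF_I8GER 0
#endif

/* Hypothetical helper, not part of the PR: reports whether the DMF path
   below was compiled in. */
int dmf_i8ger_available(void) { return HAVE_DMF_I8GER; }

/* Hypothetical helper, not part of the PR. Accumulates one step into the
   128-byte DMR image at acc, using the 32-byte pair and the 16-byte vector.
   Assumes each pointer is suitably aligned for the type it is cast to.
   Returns 1 if the builtin ran, 0 if it was unavailable (acc untouched). */
int dmf_i8ger_accumulate(unsigned char *acc, const unsigned char *pair,
                         const unsigned char *vec) {
  assert(acc != NULL && pair != NULL && vec != NULL);
#if HAVE_DMF_I8GER
  __dmr1024 dmr = *(__dmr1024 *)acc;
  __vector_pair vp = *(const __vector_pair *)pair;
  __vector unsigned char vc = *(const __vector unsigned char *)vec;
  __builtin_mma_dmxvi8gerx4pp(&dmr, vp, vc);
  *(__dmr1024 *)acc = dmr;
  return 1;
#else
  (void)acc;
  (void)pair;
  (void)vec;
  return 0;
#endif
}
```

The pm-prefixed variants take three extra integer arguments, which the tests in this PR pass as 0, 0, 0; they must be compile-time constants.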

@maryammo maryammo self-assigned this Jun 2, 2025
@maryammo maryammo requested a review from JDevlieghere as a code owner June 2, 2025 20:28
@maryammo maryammo added clang Clang issues not falling into any other category backend:PowerPC labels Jun 2, 2025
@maryammo maryammo requested a review from RolandF77 June 2, 2025 20:29
@llvmbot llvmbot added lldb clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jun 2, 2025
@maryammo maryammo requested a review from lei137 June 2, 2025 20:29
llvmbot commented Jun 2, 2025

@llvm/pr-subscribers-lldb
@llvm/pr-subscribers-backend-powerpc

@llvm/pr-subscribers-clang

Author: Maryam Moghadas (maryammo)

Changes

Define the __dmr type used to manipulate the new DMR registers introduced by the Dense Math Facility (DMF) on PowerPC, and add six Clang builtins that correspond to the integer outer-product accumulate to ACC instructions: __builtin_mma_dmxvi8gerx4, __builtin_mma_pmdmxvi8gerx4, __builtin_mma_dmxvi8gerx4pp, __builtin_mma_pmdmxvi8gerx4pp, __builtin_mma_dmxvi8gerx4spp, and __builtin_mma_pmdmxvi8gerx4spp.


Patch is 26.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142480.diff

11 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsPPC.def (+12)
  • (modified) clang/include/clang/Basic/PPCTypes.def (+1)
  • (modified) clang/lib/AST/ASTContext.cpp (+1)
  • (modified) clang/test/AST/ast-dump-ppc-types.c (+10-3)
  • (added) clang/test/CodeGen/PowerPC/builtins-ppc-mmaplus.c (+94)
  • (added) clang/test/CodeGen/PowerPC/ppc-future-mma-builtin-err.c (+21)
  • (added) clang/test/CodeGen/PowerPC/ppc-future-paired-vec-memops-builtin-err.c (+20)
  • (added) clang/test/CodeGen/PowerPC/ppc-mmaplus-types.c (+184)
  • (modified) clang/test/CodeGenCXX/ppc-mangle-mma-types.cpp (+5)
  • (modified) clang/test/Sema/ppc-pair-mma-types.c (+98)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+1)
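
The BuiltinsPPC.def hunk declares each builtin with a compact prototype string such as "vW1024*W256Vi255i15i15". The sketch below is a small stand-alone decoder for the descriptors used in this patch, assuming the usual reading of the PPC MMA format: 'v' is void, 'V' is a 16-byte vector unsigned char, 'W' plus a bit width names one of the PPCTypes.def types (optionally followed by '*' or 'C'), and 'i' plus a number is an int constant restricted to the range 0 through that number. That reading, and the names describe_prototype, decode_one, wide_type_name and append, are assumptions for illustration; this is not Clang's own decoder.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append s to buf if it fits; otherwise leave buf unchanged. */
static void append(char *buf, size_t cap, const char *s) {
  size_t len = strlen(buf), n = strlen(s);
  if (len + n < cap)
    memcpy(buf + len, s, n + 1);
}

/* Map a 'W' bit width to the type name registered in PPCTypes.def. */
static const char *wide_type_name(unsigned long bits) {
  switch (bits) {
  case 1024: return "__dmr1024";
  case 512:  return "__vector_quad";
  case 256:  return "__vector_pair";
  default:   return NULL;
  }
}

/* Decode one descriptor at *p into out and advance *p. Returns 0 on error. */
static int decode_one(const char **p, char *out, size_t cap) {
  char *end;
  unsigned long n;
  const char *name;
  char c = *(*p)++;
  if (c == 'v') { snprintf(out, cap, "void"); return 1; }
  if (c == 'V') { snprintf(out, cap, "vector unsigned char"); return 1; }
  if (c != 'i' && c != 'W') return 0;
  n = strtoul(*p, &end, 10);
  if (end == *p) return 0;        /* a number must follow 'i' or 'W' */
  *p = end;
  if (c == 'i') {                 /* constant int limited to [0, n] */
    snprintf(out, cap, "int in [0, %lu]", n);
    return 1;
  }
  name = wide_type_name(n);
  if (name == NULL) return 0;
  snprintf(out, cap, "%s", name);
  for (; **p == '*' || **p == 'C'; ++*p)  /* trailing pointer/const marks */
    append(out, cap, **p == '*' ? " *" : " const");
  return 1;
}

/* Render a prototype string as "ret (arg, arg, ...)". Returns 0 on error. */
int describe_prototype(const char *proto, char *out, size_t cap) {
  char piece[64];
  int count = 0;                  /* 0 = return type, then parameters */
  assert(proto != NULL && out != NULL && cap > 0);
  out[0] = '\0';
  while (*proto != '\0') {
    if (!decode_one(&proto, piece, sizeof piece)) return 0;
    if (count == 1) append(out, cap, " (");
    else if (count > 1) append(out, cap, ", ");
    append(out, cap, piece);
    ++count;
  }
  if (count == 0) return 0;
  append(out, cap, count > 1 ? ")" : " (void)");
  return 1;
}
```

Read this way, the three non-prefixed builtins share one signature and the three pm-prefixed builtins add the three range-limited constants, which matches the calls in the new tests.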
diff --git a/clang/include/clang/Basic/BuiltinsPPC.def b/clang/include/clang/Basic/BuiltinsPPC.def
index bb7d54bbb793e..099500754a0e0 100644
--- a/clang/include/clang/Basic/BuiltinsPPC.def
+++ b/clang/include/clang/Basic/BuiltinsPPC.def
@@ -1134,6 +1134,18 @@ UNALIASED_CUSTOM_BUILTIN(mma_pmxvbf16ger2np, "vW512*VVi15i15i3", true,
                          "mma,paired-vector-memops")
 UNALIASED_CUSTOM_BUILTIN(mma_pmxvbf16ger2nn, "vW512*VVi15i15i3", true,
                          "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmxvi8gerx4, "vW1024*W256V", false,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_pmdmxvi8gerx4, "vW1024*W256Vi255i15i15", false,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmxvi8gerx4pp, "vW1024*W256V", true,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_pmdmxvi8gerx4pp, "vW1024*W256Vi255i15i15", true,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmxvi8gerx4spp,  "vW1024*W256V", true,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_pmdmxvi8gerx4spp, "vW1024*W256Vi255i15i15", true,
+                         "mma,paired-vector-memops")
 
 // FIXME: Obviously incomplete.
 
diff --git a/clang/include/clang/Basic/PPCTypes.def b/clang/include/clang/Basic/PPCTypes.def
index 9e2cb2aedc9fc..cfc9de3a473d4 100644
--- a/clang/include/clang/Basic/PPCTypes.def
+++ b/clang/include/clang/Basic/PPCTypes.def
@@ -30,6 +30,7 @@
 #endif
 
 
+PPC_VECTOR_MMA_TYPE(__dmr, VectorDmr, 1024)
 PPC_VECTOR_MMA_TYPE(__vector_quad, VectorQuad, 512)
 PPC_VECTOR_VSX_TYPE(__vector_pair, VectorPair, 256)
 
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 45f9602856840..ffb4ca61b00c4 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -3455,6 +3455,7 @@ static void encodeTypeForFunctionPointerAuth(const ASTContext &Ctx,
     case BuiltinType::BFloat16:
     case BuiltinType::VectorQuad:
     case BuiltinType::VectorPair:
+    case BuiltinType::VectorDmr:
       OS << "?";
       return;
 
diff --git a/clang/test/AST/ast-dump-ppc-types.c b/clang/test/AST/ast-dump-ppc-types.c
index 26ae5441f20d7..a430268284413 100644
--- a/clang/test/AST/ast-dump-ppc-types.c
+++ b/clang/test/AST/ast-dump-ppc-types.c
@@ -1,9 +1,11 @@
+// RUN: %clang_cc1 -triple powerpc64le-unknown-unknown -target-cpu future \
+// RUN:   -ast-dump  %s | FileCheck %s
 // RUN: %clang_cc1 -triple powerpc64le-unknown-unknown -target-cpu pwr10 \
-// RUN:   -ast-dump -ast-dump-filter __vector %s | FileCheck %s
+// RUN:   -ast-dump  %s | FileCheck %s
 // RUN: %clang_cc1 -triple powerpc64le-unknown-unknown -target-cpu pwr9 \
-// RUN:   -ast-dump -ast-dump-filter __vector %s | FileCheck %s
+// RUN:   -ast-dump  %s | FileCheck %s
 // RUN: %clang_cc1 -triple powerpc64le-unknown-unknown -target-cpu pwr8 \
-// RUN:   -ast-dump -ast-dump-filter __vector %s | FileCheck %s
+// RUN:   -ast-dump  %s | FileCheck %s
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -ast-dump %s | FileCheck %s \
 // RUN:   --check-prefix=CHECK-X86_64
 // RUN: %clang_cc1 -triple arm-unknown-unknown -ast-dump %s | FileCheck %s \
@@ -15,16 +17,21 @@
 // are correctly defined. We also added checks on a couple of other targets to
 // ensure the types are target-dependent.
 
+// CHECK: TypedefDecl {{.*}} implicit __dmr '__dmr'
+// CHECK: `-BuiltinType {{.*}} '__dmr'
 // CHECK: TypedefDecl {{.*}} implicit __vector_quad '__vector_quad'
 // CHECK-NEXT: -BuiltinType {{.*}} '__vector_quad'
 // CHECK: TypedefDecl {{.*}} implicit __vector_pair '__vector_pair'
 // CHECK-NEXT: -BuiltinType {{.*}} '__vector_pair'
 
+// CHECK-X86_64-NOT: __dmr
 // CHECK-X86_64-NOT: __vector_quad
 // CHECK-X86_64-NOT: __vector_pair
 
+// CHECK-ARM-NOT: __dmr
 // CHECK-ARM-NOT: __vector_quad
 // CHECK-ARM-NOT: __vector_pair
 
+// CHECK-RISCV64-NOT: __dmr
 // CHECK-RISCV64-NOT: __vector_quad
 // CHECK-RISCV64-NOT: __vector_pair
diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-mmaplus.c b/clang/test/CodeGen/PowerPC/builtins-ppc-mmaplus.c
new file mode 100644
index 0000000000000..2c335218ed32a
--- /dev/null
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-mmaplus.c
@@ -0,0 +1,94 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -O3 -triple powerpc64le-unknown-unknown -target-cpu future \
+// RUN:  -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -O3 -triple powerpc64-ibm-aix -target-cpu future \
+// RUN: -emit-llvm %s -o - | FileCheck %s
+
+
+// CHECK-LABEL: @test_dmxvi8gerx4(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <256 x i1>, ptr [[VPP:%.*]], align 32, !tbaa [[TBAA2:![0-9]+]]
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmxvi8gerx4(<256 x i1> [[TMP0]], <16 x i8> [[VC:%.*]])
+// CHECK-NEXT:    store <1024 x i1> [[TMP1]], ptr [[RESP:%.*]], align 128, !tbaa [[TBAA6:![0-9]+]]
+// CHECK-NEXT:    ret void
+//
+void test_dmxvi8gerx4(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc, unsigned char *resp) {
+  __dmr vdmr = *((__dmr *)vdmrp);
+  __vector_pair vp = *((__vector_pair *)vpp);
+  __builtin_mma_dmxvi8gerx4(&vdmr, vp, vc);
+  *((__dmr *)resp) = vdmr;
+}
+
+// CHECK-LABEL: @test_pmdmxvi8gerx4(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <256 x i1>, ptr [[VPP:%.*]], align 32, !tbaa [[TBAA2]]
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.pmdmxvi8gerx4(<256 x i1> [[TMP0]], <16 x i8> [[VC:%.*]], i32 0, i32 0, i32 0)
+// CHECK-NEXT:    store <1024 x i1> [[TMP1]], ptr [[RESP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    ret void
+//
+void test_pmdmxvi8gerx4(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc, unsigned char *resp) {
+  __dmr vdmr = *((__dmr *)vdmrp);
+  __vector_pair vp = *((__vector_pair *)vpp);
+  __builtin_mma_pmdmxvi8gerx4(&vdmr, vp, vc, 0, 0, 0);
+  *((__dmr *)resp) = vdmr;
+}
+
+// CHECK-LABEL: @test_dmxvi8gerx4pp(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <1024 x i1>, ptr [[VDMRP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load <256 x i1>, ptr [[VPP:%.*]], align 32, !tbaa [[TBAA2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmxvi8gerx4pp(<1024 x i1> [[TMP0]], <256 x i1> [[TMP1]], <16 x i8> [[VC:%.*]])
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[RESP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    ret void
+//
+void test_dmxvi8gerx4pp(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc, unsigned char *resp) {
+  __dmr vdmr = *((__dmr *)vdmrp);
+  __vector_pair vp = *((__vector_pair *)vpp);
+  __builtin_mma_dmxvi8gerx4pp(&vdmr, vp, vc);
+  *((__dmr *)resp) = vdmr;
+}
+
+// CHECK-LABEL: @test_pmdmxvi8gerx4pp(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <1024 x i1>, ptr [[VDMRP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load <256 x i1>, ptr [[VPP:%.*]], align 32, !tbaa [[TBAA2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.pmdmxvi8gerx4pp(<1024 x i1> [[TMP0]], <256 x i1> [[TMP1]], <16 x i8> [[VC:%.*]], i32 0, i32 0, i32 0)
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[RESP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    ret void
+//
+void test_pmdmxvi8gerx4pp(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc, unsigned char *resp) {
+  __dmr vdmr = *((__dmr *)vdmrp);
+  __vector_pair vp = *((__vector_pair *)vpp);
+  __builtin_mma_pmdmxvi8gerx4pp(&vdmr, vp, vc, 0, 0, 0);
+  *((__dmr *)resp) = vdmr;
+}
+
+// CHECK-LABEL: @test_dmxvi8gerx4spp(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <1024 x i1>, ptr [[VDMRP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load <256 x i1>, ptr [[VPP:%.*]], align 32, !tbaa [[TBAA2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmxvi8gerx4spp(<1024 x i1> [[TMP0]], <256 x i1> [[TMP1]], <16 x i8> [[VC:%.*]])
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[RESP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    ret void
+//
+void test_dmxvi8gerx4spp(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc, unsigned char *resp) {
+  __dmr vdmr = *((__dmr *)vdmrp);
+  __vector_pair vp = *((__vector_pair *)vpp);
+  __builtin_mma_dmxvi8gerx4spp(&vdmr, vp, vc);
+  *((__dmr *)resp) = vdmr;
+}
+
+// CHECK-LABEL: @test_pmdmxvi8gerx4spp(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = load <1024 x i1>, ptr [[VDMRP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load <256 x i1>, ptr [[VPP:%.*]], align 32, !tbaa [[TBAA2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.pmdmxvi8gerx4spp(<1024 x i1> [[TMP0]], <256 x i1> [[TMP1]], <16 x i8> [[VC:%.*]], i32 0, i32 0, i32 0)
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[RESP:%.*]], align 128, !tbaa [[TBAA6]]
+// CHECK-NEXT:    ret void
+//
+void test_pmdmxvi8gerx4spp(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc, unsigned char *resp) {
+  __dmr vdmr = *((__dmr *)vdmrp);
+  __vector_pair vp = *((__vector_pair *)vpp);
+  __builtin_mma_pmdmxvi8gerx4spp(&vdmr, vp, vc, 0, 0, 0);
+  *((__dmr *)resp) = vdmr;
+}
diff --git a/clang/test/CodeGen/PowerPC/ppc-future-mma-builtin-err.c b/clang/test/CodeGen/PowerPC/ppc-future-mma-builtin-err.c
new file mode 100644
index 0000000000000..c6029f9eb8352
--- /dev/null
+++ b/clang/test/CodeGen/PowerPC/ppc-future-mma-builtin-err.c
@@ -0,0 +1,21 @@
+// RUN: not %clang_cc1 -triple powerpc64le-unknown-linux-gnu -target-cpu future \
+// RUN:   %s -emit-llvm-only 2>&1 | FileCheck %s
+
+__attribute__((target("no-mma")))
+void test_mma(unsigned char *vdmrp, unsigned char *vpp, vector unsigned char vc) {
+  __dmr vdmr = *((__dmr *)vdmrp);
+  __vector_pair vp = *((__vector_pair *)vpp);
+  __builtin_mma_dmxvi8gerx4(&vdmr, vp, vc);
+  __builtin_mma_pmdmxvi8gerx4(&vdmr, vp, vc, 0, 0, 0);
+  __builtin_mma_dmxvi8gerx4pp(&vdmr, vp, vc);
+  __builtin_mma_pmdmxvi8gerx4pp(&vdmr, vp, vc, 0, 0, 0);
+  __builtin_mma_dmxvi8gerx4spp(&vdmr, vp, vc);
+  __builtin_mma_pmdmxvi8gerx4spp(&vdmr, vp, vc, 0, 0, 0);
+
+// CHECK: error: '__builtin_mma_dmxvi8gerx4' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_pmdmxvi8gerx4' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_dmxvi8gerx4pp' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_pmdmxvi8gerx4pp' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_dmxvi8gerx4spp' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_pmdmxvi8gerx4spp' needs target feature mma,paired-vector-memops
+}
diff --git a/clang/test/CodeGen/PowerPC/ppc-future-paired-vec-memops-builtin-err.c b/clang/test/CodeGen/PowerPC/ppc-future-paired-vec-memops-builtin-err.c
new file mode 100644
index 0000000000000..c31847e3ca4c4
--- /dev/null
+++ b/clang/test/CodeGen/PowerPC/ppc-future-paired-vec-memops-builtin-err.c
@@ -0,0 +1,20 @@
+// RUN: not %clang_cc1 -triple powerpc64le-unknown-linux-gnu -target-cpu future \
+// RUN:   %s -emit-llvm-only 2>&1 | FileCheck %s
+
+__attribute__((target("no-paired-vector-memops")))
+void test_pair(unsigned char *vdmr, unsigned char *vpp, vector unsigned char vc) {
+   __vector_pair vp = *((__vector_pair *)vpp);
+  __builtin_mma_dmxvi8gerx4((__dmr *)vdmr, vp, vc);
+  __builtin_mma_pmdmxvi8gerx4((__dmr *)vdmr, vp, vc, 0, 0, 0);
+  __builtin_mma_dmxvi8gerx4pp((__dmr *)vdmr, vp, vc);
+  __builtin_mma_pmdmxvi8gerx4pp((__dmr *)vdmr, vp, vc, 0, 0, 0);
+  __builtin_mma_dmxvi8gerx4spp((__dmr *)vdmr, vp, vc);
+  __builtin_mma_pmdmxvi8gerx4spp((__dmr *)vdmr, vp, vc, 0, 0, 0);
+
+// CHECK: error: '__builtin_mma_dmxvi8gerx4' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_pmdmxvi8gerx4' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_dmxvi8gerx4pp' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_pmdmxvi8gerx4pp' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_dmxvi8gerx4spp' needs target feature mma,paired-vector-memops
+// CHECK: error: '__builtin_mma_pmdmxvi8gerx4spp' needs target feature mma,paired-vector-memops
+}
diff --git a/clang/test/CodeGen/PowerPC/ppc-mmaplus-types.c b/clang/test/CodeGen/PowerPC/ppc-mmaplus-types.c
new file mode 100644
index 0000000000000..dbae2d0c0829a
--- /dev/null
+++ b/clang/test/CodeGen/PowerPC/ppc-mmaplus-types.c
@@ -0,0 +1,184 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple powerpc64le-linux-unknown -target-cpu future \
+// RUN:   -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple powerpc64le-linux-unknown -target-cpu pwr10 \
+// RUN:   -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple powerpc64le-linux-unknown -target-cpu pwr9 \
+// RUN:   -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple powerpc64le-linux-unknown -target-cpu pwr8 \
+// RUN:   -emit-llvm -o - %s | FileCheck %s
+
+typedef __vector_quad vq_t;
+
+// CHECK-LABEL: @test_dmr_copy(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[PTR1_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[PTR2_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[PTR1:%.*]], ptr [[PTR1_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[PTR2:%.*]], ptr [[PTR2_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR1_ADDR]], align 8
+// CHECK-NEXT:    [[ADD_PTR:%.*]] = getelementptr inbounds <1024 x i1>, ptr [[TMP0]], i64 2
+// CHECK-NEXT:    [[TMP1:%.*]] = load <1024 x i1>, ptr [[ADD_PTR]], align 128
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[PTR2_ADDR]], align 8
+// CHECK-NEXT:    [[ADD_PTR1:%.*]] = getelementptr inbounds <1024 x i1>, ptr [[TMP2]], i64 1
+// CHECK-NEXT:    store <1024 x i1> [[TMP1]], ptr [[ADD_PTR1]], align 128
+// CHECK-NEXT:    ret void
+//
+void test_dmr_copy(__dmr *ptr1, __dmr *ptr2) {
+  *(ptr2 + 1) = *(ptr1 + 2);
+}
+
+// CHECK-LABEL: @test_dmr_typedef(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[INP_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[OUTP_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRIN:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMROUT:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[INP:%.*]], ptr [[INP_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[OUTP:%.*]], ptr [[OUTP_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[INP_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRIN]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[OUTP_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP1]], ptr [[VDMROUT]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[VDMRIN]], align 8
+// CHECK-NEXT:    [[TMP3:%.*]] = load <1024 x i1>, ptr [[TMP2]], align 128
+// CHECK-NEXT:    [[TMP4:%.*]] = load ptr, ptr [[VDMROUT]], align 8
+// CHECK-NEXT:    store <1024 x i1> [[TMP3]], ptr [[TMP4]], align 128
+// CHECK-NEXT:    ret void
+//
+void test_dmr_typedef(int *inp, int *outp) {
+  __dmr *vdmrin = (__dmr *)inp;
+  __dmr *vdmrout = (__dmr *)outp;
+  *vdmrout = *vdmrin;
+}
+
+// CHECK-LABEL: @test_dmr_arg(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[VDMR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[VDMR:%.*]], ptr [[VDMR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load <1024 x i1>, ptr [[TMP1]], align 128
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr, ptr [[VDMRP]], align 8
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[TMP3]], align 128
+// CHECK-NEXT:    ret void
+//
+void test_dmr_arg(__dmr *vdmr, int *ptr) {
+  __dmr *vdmrp = (__dmr *)ptr;
+  *vdmrp = *vdmr;
+}
+
+// CHECK-LABEL: @test_dmr_const_arg(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[VDMR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[VDMR:%.*]], ptr [[VDMR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load <1024 x i1>, ptr [[TMP1]], align 128
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr, ptr [[VDMRP]], align 8
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[TMP3]], align 128
+// CHECK-NEXT:    ret void
+//
+void test_dmr_const_arg(const __dmr *const vdmr, int *ptr) {
+  __dmr *vdmrp = (__dmr *)ptr;
+  *vdmrp = *vdmr;
+}
+
+// CHECK-LABEL: @test_dmr_array_arg(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[VDMRA_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[VDMRA:%.*]], ptr [[VDMRA_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRA_ADDR]], align 8
+// CHECK-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds <1024 x i1>, ptr [[TMP1]], i64 0
+// CHECK-NEXT:    [[TMP2:%.*]] = load <1024 x i1>, ptr [[ARRAYIDX]], align 128
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr, ptr [[VDMRP]], align 8
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[TMP3]], align 128
+// CHECK-NEXT:    ret void
+//
+void test_dmr_array_arg(__dmr vdmra[], int *ptr) {
+  __dmr *vdmrp = (__dmr *)ptr;
+  *vdmrp = vdmra[0];
+}
+
+// CHECK-LABEL: @test_dmr_ret(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[ADD_PTR:%.*]] = getelementptr inbounds <1024 x i1>, ptr [[TMP1]], i64 2
+// CHECK-NEXT:    ret ptr [[ADD_PTR]]
+//
+__dmr *test_dmr_ret(int *ptr) {
+  __dmr *vdmrp = (__dmr *)ptr;
+  return vdmrp + 2;
+}
+
+// CHECK-LABEL: @test_dmr_ret_const(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[ADD_PTR:%.*]] = getelementptr inbounds <1024 x i1>, ptr [[TMP1]], i64 2
+// CHECK-NEXT:    ret ptr [[ADD_PTR]]
+//
+const __dmr *test_dmr_ret_const(int *ptr) {
+  __dmr *vdmrp = (__dmr *)ptr;
+  return vdmrp + 2;
+}
+
+// CHECK-LABEL: @test_dmr_sizeof_alignof(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[PTR_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMRP:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    [[VDMR:%.*]] = alloca <1024 x i1>, align 128
+// CHECK-NEXT:    [[SIZET:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[ALIGNT:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[SIZEV:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[ALIGNV:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    store ptr [[PTR:%.*]], ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[PTR_ADDR]], align 8
+// CHECK-NEXT:    store ptr [[TMP0]], ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr [[VDMRP]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load <1024 x i1>, ptr [[TMP1]], align 128
+// CHECK-NEXT:    store <1024 x i1> [[TMP2]], ptr [[VDMR]], align 128
+// CHECK-NEXT:    store i32 128, ptr [[SIZET]], align 4
+// CHECK-NEXT:    sto...
[truncated]

@maryammo maryammo requested review from mandlebug and removed request for JDevlieghere June 2, 2025 20:29
@@ -30,6 +30,7 @@
#endif


PPC_VECTOR_MMA_TYPE(__dmr1024, VectorDmr1024, 1024)
Contributor

Since this is a vector type, maybe we can just keep the same naming convention?

Suggested change
-PPC_VECTOR_MMA_TYPE(__dmr1024, VectorDmr1024, 1024)
+PPC_VECTOR_MMA_TYPE(__vector_dmr, VectorDmr, 1024)

Collaborator

The type names have to match GCC and are negotiated - __dmr1024 is the current choice.
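
For readers unfamiliar with PPCTypes.def: it follows the X-macro pattern, so the single PPC_VECTOR_MMA_TYPE line discussed above supplies the user-visible name, the internal ID, and the bit width in one place, and each consumer expands the list with its own macro definition. The sketch below is a simplified, self-contained stand-in for that pattern; PPC_TYPE_LIST, PPCTypeInfo, and lookup_ppc_type are hypothetical names and do not appear in Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PPCTypes.def: one row per type, tagged by the kind of
   feature that gates it (MMA or VSX). */
#define PPC_TYPE_LIST(MMA, VSX)       \
  MMA(__dmr1024, VectorDmr1024, 1024) \
  MMA(__vector_quad, VectorQuad, 512) \
  VSX(__vector_pair, VectorPair, 256)

/* Expansion 1: an enum with one ID per row. */
#define AS_ENUM(name, id, bits) id,
enum PPCTypeId { PPC_TYPE_LIST(AS_ENUM, AS_ENUM) PPCTypeCount };
#undef AS_ENUM

struct PPCTypeInfo {
  const char *name; /* spelling seen by users */
  unsigned bits;    /* width in bits */
  int needs_mma;    /* 1 for MMA rows, 0 for VSX rows */
};

/* Expansion 2: a table indexed by the enum above. */
#define MMA_ROW(name, id, bits) {#name, bits, 1},
#define VSX_ROW(name, id, bits) {#name, bits, 0},
static const struct PPCTypeInfo ppc_type_info[] = {
    PPC_TYPE_LIST(MMA_ROW, VSX_ROW)};
#undef MMA_ROW
#undef VSX_ROW

/* Find a row by its user-visible name; returns NULL if there is none. */
const struct PPCTypeInfo *lookup_ppc_type(const char *name) {
  size_t i;
  assert(name != NULL);
  for (i = 0; i < (size_t)PPCTypeCount; ++i)
    if (strcmp(ppc_type_info[i].name, name) == 0)
      return &ppc_type_info[i];
  return NULL;
}
```

Renaming the type, as negotiated here, only touches the first two columns of one row; the 1024-bit width (128 bytes, matching the sizeof value stored in the new CodeGen test) stays the same.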

@@ -0,0 +1,184 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
Contributor

nit: maybe this test can be renamed to dmf-types.c?

@@ -0,0 +1,20 @@
// RUN: not %clang_cc1 -triple powerpc64le-unknown-linux-gnu -target-cpu future \
Contributor

nit: better not to have "future" in the test case names as "future" is a sliding thing?

@@ -0,0 +1,94 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
Contributor

I might be missing something: what does mmaplus imply here? Since this tests MMA builtins that use the DMR registers, maybe builtins-ppc-mma-dmr.c?

@maryammo maryammo requested a review from lei137 June 17, 2025 14:17
@lei137 lei137 left a comment

In general this LGTM.
Just a few nits. Please also update your description and PR title, as they say you are adding the __dmr type, but you are actually adding the type __dmr1024.
Thx!

@@ -3455,6 +3455,7 @@ static void encodeTypeForFunctionPointerAuth(const ASTContext &Ctx,
case BuiltinType::BFloat16:
case BuiltinType::VectorQuad:
case BuiltinType::VectorPair:
case BuiltinType::VectorDmr1024:
Contributor

nit: Maybe we can do this since DMR is an acronym and to match the actual new type name defined?

Suggested change
-case BuiltinType::VectorDmr1024:
+case BuiltinType::DMR1024:

// RUN: %clang_cc1 -O3 -triple powerpc64le-unknown-unknown -target-cpu future \
// RUN: -emit-llvm %s -o - | FileCheck %s
// RUN: %clang_cc1 -O3 -triple powerpc64-ibm-aix -target-cpu future \
// RUN: -emit-llvm %s -o - | FileCheck %s
Contributor

Do we need testing for 32-bit AIX?

Contributor Author

PPC_VECTOR_MMA_TYPE is only defined for PPC64.

// RUN: %clang_cc1 -triple powerpc64le-linux-unknown -target-cpu pwr9 \
// RUN: -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -triple powerpc64le-linux-unknown -target-cpu pwr8 \
// RUN: -emit-llvm -o - %s | FileCheck %s
Contributor

Since DMR is specific to the future CPU, do we need RUN lines for CPU targets on which it won't be valid?

@@ -2,12 +2,110 @@
// RUN: -target-cpu pwr10 %s -verify
Contributor

maybe we should move these to a different test file -> ppc-dmr-types.c

@maryammo maryammo changed the title [Clang][PowerPC] Add __dmr type and DMF integer calculation builtins [Clang][PowerPC] Add __dmr1024 type and DMF integer calculation builtins Jun 18, 2025
@maryammo maryammo merged commit 65cb3bc into llvm:main Jun 20, 2025
7 checks passed