[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features #71545

simpal01 · 2023-11-07T14:58:24Z

The floating-point and MVE features together specify the MVE functionality that is supported on the Cortex-M85 processor. But the FPU extension for the underlying architecture(armv8.1-m.main) is FPV5 which does not include MVE-F. So Compiler's -S output and -save-temps=obj loses MVE feature which leads to assembler error. What happening here is .fpu directive overrides any previously set features by .cpu directive. Since the the corresponding .fpu generated (.fpu fpv5-d16) does not include MVE-F, it overrides those features even though it is supported and set by the .cpu directive. Looks like .fpu is supposed to do this.

In this case, there should be an .arch_extension directive re-enabling the relevant extensions after .fpu if the goal is to keep these extensions enabled. GCC also does the same.

So this patch enables the MVE features by emitting the below arch extension:
.fpu fpv5-d16
.arch_extension mve.fp

llvmbot · 2023-11-07T14:58:57Z

@llvm/pr-subscribers-mc
@llvm/pr-subscribers-backend-arm

@llvm/pr-subscribers-clang

Author: None (simpal01)

Changes

The floating-point and MVE features together specify the MVE functionality that is supported on the Cortex-M85 processor. But the FPU extension for the underlying architecture(armv8.1-m.main) is FPV5 which does not include MVE-F.

So either when we explicitly specify -mfpu=fpv5-d16 or Compiler's -S output and -save-temps=obj loses MVE feature which leads to assembler error. What happening here is .fpu directive overrides any previously set features by .cpu directive. Since the the corresponding .fpu generated (.fpu fpv5-d16) does not include MVE-F, it overrides those features even though it is supported and set by the .cpu directive. Looks like .fpu is supposed to do this.

In this case, there should be an .arch_extension directive re-enabling the relevant extensions after .fpu if the goal is to keep these extensions enabled. GCC also does the same.

So this patch enables the MVE features by emitting the below arch extension:
.fpu fpv5-d16
.arch_extension mve.fp

Full diff: https://github.com/llvm/llvm-project/pull/71545.diff

3 Files Affected:

(added) clang/test/CodeGen/arm-v8.1m-check-mve.ll (+56)
(modified) llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp (+3)
(modified) llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp (+10-6)

diff --git a/clang/test/CodeGen/arm-v8.1m-check-mve.ll b/clang/test/CodeGen/arm-v8.1m-check-mve.ll
new file mode 100644
index 000000000000000..cfcb0223961e31e
--- /dev/null
+++ b/clang/test/CodeGen/arm-v8.1m-check-mve.ll
@@ -0,0 +1,56 @@
+; REQUIRES: arm-registered-target
+; RUN: %clang --target=arm-none-eabi -mcpu=cortex-m85 -mfloat-abi=hard -save-temps=obj -S -o - %s | FileCheck %s
+; RUN: %clang --target=arm-none-eabi -mcpu=cortex-m55 -mfloat-abi=hard -save-temps=obj -S -o - %s | FileCheck %s
+; RUN: %clang --target=arm-none-eabi -mcpu=cortex-m85 -mfloat-abi=hard -O2 -c -mthumb -save-temps=obj %s
+; RUN: %clang --target=arm-none-eabi -mcpu=cortex-m55 -mfloat-abi=hard -O2 -c -mthumb -save-temps=obj %s
+; CHECK: .fpu   fpv5-d16
+; CHECK  .arch_extension mve.fp
+target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv8.1m.main-none-unknown-eabihf"
+
+%struct.dummy_t = type { float, float, float, float }
+
+define dso_local signext i8 @foo(ptr noundef %handle) #0 {
+entry:
+  %handle.addr = alloca ptr, align 4
+  store ptr %handle, ptr %handle.addr, align 4
+  %0 = load ptr, ptr %handle.addr, align 4
+  %a = getelementptr inbounds %struct.dummy_t, ptr %0, i32 0, i32 0
+  %1 = load float, ptr %a, align 4
+  %sub = fsub float 0x3F5439DE40000000, %1
+  %2 = load ptr, ptr %handle.addr, align 4
+  %a1 = getelementptr inbounds %struct.dummy_t, ptr %2, i32 0, i32 0
+  %3 = load float, ptr %a1, align 4
+  %4 = call float @llvm.fmuladd.f32(float 0x3F847AE140000000, float %sub, float %3)
+  store float %4, ptr %a1, align 4
+  %5 = load ptr, ptr %handle.addr, align 4
+  %b = getelementptr inbounds %struct.dummy_t, ptr %5, i32 0, i32 1
+  %6 = load float, ptr %b, align 4
+  %sub2 = fsub float 0x3F5439DE40000000, %6
+  %7 = load ptr, ptr %handle.addr, align 4
+  %b3 = getelementptr inbounds %struct.dummy_t, ptr %7, i32 0, i32 1
+  %8 = load float, ptr %b3, align 4
+  %9 = call float @llvm.fmuladd.f32(float 0x3F947AE140000000, float %sub2, float %8)
+  store float %9, ptr %b3, align 4
+  %10 = load ptr, ptr %handle.addr, align 4
+  %c = getelementptr inbounds %struct.dummy_t, ptr %10, i32 0, i32 2
+  %11 = load float, ptr %c, align 4
+  %sub4 = fsub float 0x3F5439DE40000000, %11
+  %12 = load ptr, ptr %handle.addr, align 4
+  %c5 = getelementptr inbounds %struct.dummy_t, ptr %12, i32 0, i32 2
+  %13 = load float, ptr %c5, align 4
+  %14 = call float @llvm.fmuladd.f32(float 0x3F9EB851E0000000, float %sub4, float %13)
+  store float %14, ptr %c5, align 4
+  %15 = load ptr, ptr %handle.addr, align 4
+  %d = getelementptr inbounds %struct.dummy_t, ptr %15, i32 0, i32 3
+  %16 = load float, ptr %d, align 4
+  %sub6 = fsub float 0x3F5439DE40000000, %16
+  %17 = load ptr, ptr %handle.addr, align 4
+  %d7 = getelementptr inbounds %struct.dummy_t, ptr %17, i32 0, i32 3
+  %18 = load float, ptr %d7, align 4
+  %19 = call float @llvm.fmuladd.f32(float 0x3FA47AE140000000, float %sub6, float %18)
+  store float %19, ptr %d7, align 4
+  ret i8 0
+}
+
+declare float @llvm.fmuladd.f32(float, float, float) #1
diff --git a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
index 373d5b59bca6640..20b52ebc544a1ed 100644
--- a/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
+++ b/llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
@@ -12648,6 +12648,9 @@ bool ARMAsmParser::enableArchExtFeature(StringRef Name, SMLoc &ExtLoc) {
       {ARM::AEK_CRYPTO,
        {Feature_HasV8Bit},
        {ARM::FeatureCrypto, ARM::FeatureNEON, ARM::FeatureFPARMv8}},
+      {(ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP),
+       {Feature_HasV8_1MMainlineBit},
+       {ARM::HasMVEFloatOps}},
       {ARM::AEK_FP,
        {Feature_HasV8Bit},
        {ARM::FeatureVFP2_SP, ARM::FeatureFPARMv8}},
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
index b65d1b24e63d39b..e84b597e4382edc 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMTargetStreamer.cpp
@@ -238,14 +238,18 @@ void ARMTargetStreamer::emitTargetAttributes(const MCSubtargetInfo &STI) {
                         ? ARMBuildAttrs::AllowNeonARMv8_1a
                         : ARMBuildAttrs::AllowNeonARMv8);
   } else {
-    if (STI.hasFeature(ARM::FeatureFPARMv8_D16_SP))
+    if (STI.hasFeature(ARM::FeatureFPARMv8_D16_SP)) {
       // FPv5 and FP-ARMv8 have the same instructions, so are modeled as one
       // FPU, but there are two different names for it depending on the CPU.
-      emitFPU(STI.hasFeature(ARM::FeatureD32)
-                  ? ARM::FK_FP_ARMV8
-                  : (STI.hasFeature(ARM::FeatureFP64) ? ARM::FK_FPV5_D16
-                                                      : ARM::FK_FPV5_SP_D16));
-    else if (STI.hasFeature(ARM::FeatureVFP4_D16_SP))
+      if (STI.hasFeature(ARM::FeatureD32))
+        emitFPU(ARM::FK_FP_ARMV8);
+      else {
+        emitFPU(STI.hasFeature(ARM::FeatureFP64) ? ARM::FK_FPV5_D16
+                                                 : ARM::FK_FPV5_SP_D16);
+        if (STI.hasFeature(ARM::HasMVEFloatOps))
+          emitArchExtension(ARM::AEK_SIMD | ARM::AEK_DSP | ARM::AEK_FP);
+      }
+    } else if (STI.hasFeature(ARM::FeatureVFP4_D16_SP))
       emitFPU(STI.hasFeature(ARM::FeatureD32)
                   ? ARM::FK_VFPV4
                   : (STI.hasFeature(ARM::FeatureFP64) ? ARM::FK_VFPV4_D16

DavidSpickett · 2023-11-07T16:55:58Z

I think the commit title would make more sense at a glance if it was saying what is being done instead of what's being fixed. How about:

[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features

Then the commit message is fine as is, and tells the reader why this was done and what it fixes.

clang/test/CodeGen/arm-v8.1m-check-mve.ll

simpal01 · 2023-11-09T15:53:24Z

I think the commit title would make more sense at a glance if it was saying what is being done instead of what's being fixed. How about:
[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features
Then the commit message is fine as is, and tells the reader why this was done and what it fixes.

Done

… unsupported MVE instructions for cortex M85/M55. The floating-point and MVE features together specify the MVE functionality that is supported on the Cortex-M85 processor. But the FPU extension for the underlying architecture(armv8.1-m.main) is FPV5 which does not include MVE-F. So either when we explictly specify -mfpu=fpv5-d16 or Compiler's -S output and `-save-temps=obj` loses MVE feature which leads to assembler error. What happening here is .fpu directive overrides any previously set features by .cpu directive. Since the the corresponding .fpu generated (.fpu fpv5-d16) does not include MVE-F, it overrides those features even though it is supported and set by the .cpu directive. Looks like .fpu is supposed to do this. In this case, there should be an .arch_extension directive re-enabling the relevant extensions after .fpu if the goal is to keep these extensions enabled. GCC also does the same. So this patch enables the MVE features by: .fpu fpv5-d16 .arch_extension mve.fp

…t does not include MVE features. 1. .arch_extension will always be on the very next line. CHECK-NEXT would be bit more robust.

…eads to unsupported MVE instructions for cortex M85/M55.

simpal01 · 2023-11-15T17:02:31Z

@DavidSpickett i have removed the file arm-v8.1m-check-mve.ll and added a c file(clang/test/CodeGen/arm-v8.1m-check-mve.c.) instead for testing. Mainly because to align with the structure of the current test files.

clang/test/CodeGen/arm-v8.1m-check-mve.c

DavidSpickett · 2023-11-16T09:12:48Z

Just small things on the test case from me. Nominate @davemgreen to give it a final check and approval, since I have not been around this area in a while.

…eads to unsupported MVE instructions for cortex M85/M55.

DavidSpickett · 2023-11-16T11:39:40Z

Tests now LGTM, thanks for the updates.

davemgreen

Hi. From what I can tell the logic looks OK. We add the archextension in places we expect to now?

It is generally considered best practice to not have clang test that run the entire pass pipeline. In this case it looks like it's trying to SLP vectorize the code to make sure MVE operations are produced? The problem can be that decision like that can change, and it is better if the tests are more narrowly focussed than replying on the whole pipeline. You do loose end-to-end testing but it may be better to have a clang test that checks the IR generated is what is expected, and an assembly test to make sure the .arch_extension mve.fp is recognized and turns on the MVE instructions it should do.

clang/test/CodeGen/arm-v8.1m-check-mve.c

…eads to unsupported MVE instructions for cortex M85/M55. The tests are now more narrowly focussed than running on the whole pipeline. Now added: 1. a llc test that checks the assembly generated contains the .arch_extension mve.fp if it supports the mve floating point operations. 2. an assembly test to make sure the .arch_extension mve.fp is recognized and turns on the MVE instructions it should do.

simpal01 · 2023-11-21T09:25:30Z

Hi. From what I can tell the logic looks OK. We add the archextension in places we expect to now?

It is generally considered best practice to not have clang test that run the entire pass pipeline. In this case it looks like it's trying to SLP vectorize the code to make sure MVE operations are produced? The problem can be that decision like that can change, and it is better if the tests are more narrowly focussed than replying on the whole pipeline. You do loose end-to-end testing but it may be better to have a clang test that checks the IR generated is what is expected, and an assembly test to make sure the .arch_extension mve.fp is recognized and turns on the MVE instructions it should do.

DONE

davemgreen

Thanks. LGTM

…lude MVE feature for V8.1m Cpus as it is upstreamed. There was a patch in BMT which fixes the issue of lossing MVE feature for Compiler's -S output and `-save-temps=obj` which leads to assembler error for Arm V8.1m Cpus. The patch has been upstreamed in llvm-project: llvm/llvm-project#71545

There is a patch in BMT which fixes the issue of losing MVE features and an assembler error for V8.1m Cpus because of that. This patch has been upstreamed in llvm-project now: llvm/llvm-project#71545

…not include MVE features (llvm#71545) The floating-point and MVE features together specify the MVE functionality that is supported on the Cortex-M85 processor. But the FPU extension for the underlying architecture(armv8.1-m.main) is FPV5 which does not include MVE-F. So Compiler's -S output and `-save-temps=obj` loses MVE feature which leads to assembler error. What happening here is .fpu directive overrides any previously set features by .cpu directive. Since the the corresponding .fpu generated (.fpu fpv5-d16) does not include MVE-F, it overrides those features even though it is supported and set by the .cpu directive. Looks like .fpu is supposed to do this. In this case, there should be an .arch_extension directive re-enabling the relevant extensions after .fpu if the goal is to keep these extensions enabled. GCC also does the same. So this patch enables the MVE features by emitting the below arch extension: .fpu fpv5-d16 .arch_extension mve.fp --------- Co-authored-by: Simi Pallipurath <simi.pallipurath.com>

llvmbot added clang Clang issues not falling into any other category backend:ARM labels Nov 7, 2023

simpal01 requested review from momchil-velikov, smithp35, pratlucas and tmatheson-arm November 7, 2023 15:18

DavidSpickett reviewed Nov 7, 2023

View reviewed changes

clang/test/CodeGen/arm-v8.1m-check-mve.ll Outdated Show resolved Hide resolved

DavidSpickett reviewed Nov 7, 2023

View reviewed changes

clang/test/CodeGen/arm-v8.1m-check-mve.ll Outdated Show resolved Hide resolved

DavidSpickett reviewed Nov 7, 2023

View reviewed changes

clang/test/CodeGen/arm-v8.1m-check-mve.ll Outdated Show resolved Hide resolved

simpal01 changed the title ~~[ARM] .fpu equals fpv5-d16 disables floating point MVE which leads to unsupported MVE instructions assembler error for cortex M85/M55.~~ [llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features Nov 8, 2023

simpal01 requested a review from davemgreen November 9, 2023 09:50

Simi Pallipurath added 2 commits November 9, 2023 18:04

fixup! [llvm][ARM] Emit MVE .arch_extension after .fpu directive if i…

951537d

…t does not include MVE features. 1. .arch_extension will always be on the very next line. CHECK-NEXT would be bit more robust.

simpal01 force-pushed the main branch from 6c38663 to 951537d Compare November 9, 2023 18:05

Simi Pallipurath added 2 commits November 15, 2023 16:45

fixup! [ARM] .fpu equals fpv5-d16 disables floating point MVE which l…

38c3ee0

…eads to unsupported MVE instructions for cortex M85/M55.

fixup! [ARM] .fpu equals fpv5-d16 disables floating point MVE which l…

e8bf45c

…eads to unsupported MVE instructions for cortex M85/M55.

DavidSpickett reviewed Nov 16, 2023

View reviewed changes

clang/test/CodeGen/arm-v8.1m-check-mve.c Outdated Show resolved Hide resolved

clang/test/CodeGen/arm-v8.1m-check-mve.c Outdated Show resolved Hide resolved

clang/test/CodeGen/arm-v8.1m-check-mve.c Outdated Show resolved Hide resolved

fixup! [ARM] .fpu equals fpv5-d16 disables floating point MVE which l…

891c77b

…eads to unsupported MVE instructions for cortex M85/M55.

davemgreen reviewed Nov 16, 2023

View reviewed changes

clang/test/CodeGen/arm-v8.1m-check-mve.c Outdated Show resolved Hide resolved

llvmbot added the mc Machine (object) code label Nov 21, 2023

simpal01 closed this Nov 21, 2023

simpal01 reopened this Nov 21, 2023

davemgreen approved these changes Nov 21, 2023

View reviewed changes

simpal01 mentioned this pull request Nov 21, 2023

Remove the patch as it is upstreamed. ARM-software/LLVM-embedded-toolchain-for-Arm#358

Closed

simpal01 mentioned this pull request Nov 21, 2023

Remove the patch as it is upstreamed. ARM-software/LLVM-embedded-toolchain-for-Arm#359

Merged

simpal01 merged commit 74cdb8e into llvm:main Nov 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features #71545

[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features #71545

Uh oh!

simpal01 commented Nov 7, 2023 •

edited

Loading

Uh oh!

llvmbot commented Nov 7, 2023 •

edited

Loading

Uh oh!

DavidSpickett commented Nov 7, 2023

Uh oh!

Uh oh!

Uh oh!

Uh oh!

simpal01 commented Nov 9, 2023

Uh oh!

simpal01 commented Nov 15, 2023 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DavidSpickett commented Nov 16, 2023

Uh oh!

DavidSpickett commented Nov 16, 2023

Uh oh!

davemgreen left a comment

Uh oh!

Uh oh!

simpal01 commented Nov 21, 2023

Uh oh!

davemgreen left a comment

Uh oh!

Uh oh!

[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features #71545

[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features #71545

Uh oh!

Conversation

simpal01 commented Nov 7, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Nov 7, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DavidSpickett commented Nov 7, 2023

Uh oh!

Uh oh!

Uh oh!

Uh oh!

simpal01 commented Nov 9, 2023

Uh oh!

simpal01 commented Nov 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DavidSpickett commented Nov 16, 2023

Uh oh!

DavidSpickett commented Nov 16, 2023

Uh oh!

davemgreen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

simpal01 commented Nov 21, 2023

Uh oh!

davemgreen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

simpal01 commented Nov 7, 2023 •

edited

Loading

llvmbot commented Nov 7, 2023 •

edited

Loading

simpal01 commented Nov 15, 2023 •

edited

Loading