Skip to content

[AMDGPU] Remove wavefrontsize feature from GFX10+ #98400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 16, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 0 additions & 3 deletions llvm/lib/Target/AMDGPU/AMDGPU.td
Original file line number Diff line number Diff line change
Expand Up @@ -1464,7 +1464,6 @@ def FeatureISAVersion10_Common : FeatureSet<
FeatureLDSBankCount32,
FeatureDLInsts,
FeatureNSAEncoding,
FeatureWavefrontSize32,
FeatureBackOffBarrier]>;

def FeatureISAVersion10_1_Common : FeatureSet<
Expand Down Expand Up @@ -1548,7 +1547,6 @@ def FeatureISAVersion11_Common : FeatureSet<
FeatureDot10Insts,
FeatureNSAEncoding,
FeaturePartialNSAEncoding,
FeatureWavefrontSize32,
FeatureShaderCyclesRegister,
FeatureArchitectedFlatScratch,
FeatureAtomicFaddRtnInsts,
Expand Down Expand Up @@ -1625,7 +1623,6 @@ def FeatureISAVersion12 : FeatureSet<
FeatureDot11Insts,
FeatureNSAEncoding,
FeaturePartialNSAEncoding,
FeatureWavefrontSize32,
FeatureShaderCyclesHiLoRegisters,
FeatureArchitectedFlatScratch,
FeatureArchitectedSGPRs,
Expand Down
8 changes: 8 additions & 0 deletions llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -105,6 +105,14 @@ GCNSubtarget::initializeSubtargetDependencies(const Triple &TT,
: AMDGPUSubtarget::SOUTHERN_ISLANDS;
}

if (!hasFeature(AMDGPU::FeatureWavefrontSize32) &&
!hasFeature(AMDGPU::FeatureWavefrontSize64)) {
if (getGeneration() >= AMDGPUSubtarget::GFX10)
ToggleFeature(AMDGPU::FeatureWavefrontSize32);
else
ToggleFeature(AMDGPU::FeatureWavefrontSize64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can avoid this by having a separate SupportsWave32 feature implied by FeatureWavefrontSize32

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to select default somewhere. So say a subtarget has FeatureSupportsWave32, then what?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. the line if (getGeneration() >= AMDGPUSubtarget::GFX10) can be changed to if (hasFeature(AMDGPU::FeatureSupportsWave32)), but that's it.

Also I shall mention, this if can be avoided altogether because subtargets without an option to select wave size just have a nailed wave64 feature in their definition. This if here is just to catch cases when tests use manual -mattr switching it off, but I can remove it and only set wave32 as a default.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did that, unconditionally add wave32 if none is set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if you use the implies feature, the feature parsing logic should flip the incompatible case for you instead of manually doing it here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see a way to tell it is incompatible in the first place. OK, wave32 implies it supports wave32. How does it turn wave32 automatically? How does it tell that wave64 is incompatible with wave32?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO to do what you want features must be:

  1. Organized in groups.
  2. We need to tell features in group are mutually exclusive.
  3. There shall be a way to tell this one is the default.

None of that exists.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if there's a SupportsWave32 feature, implied by FeatureWavefrontSize32, if you specify wavefrontsize32 to a wave64-only target, the incompatible feature check in the parsing logic will hit and it will assume you specified an invalid target and unset the target-cpu.

}

// We don't support FP64 for EG/NI atm.
assert(!hasFP64() || (getGeneration() >= AMDGPUSubtarget::SOUTHERN_ISLANDS));

Expand Down
11 changes: 10 additions & 1 deletion llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1408,9 +1408,18 @@ class AMDGPUAsmParser : public MCTargetAsmParser {
copySTI().ToggleFeature("southern-islands");
}

AMDGPU::IsaVersion ISA = AMDGPU::getIsaVersion(getSTI().getCPU());
FeatureBitset FB = getFeatureBits();
if (!FB[AMDGPU::FeatureWavefrontSize64] &&
!FB[AMDGPU::FeatureWavefrontSize32]) {
if (ISA.Major >= 10)
copySTI().ToggleFeature(AMDGPU::FeatureWavefrontSize32);
else
copySTI().ToggleFeature(AMDGPU::FeatureWavefrontSize64);
}

setAvailableFeatures(ComputeAvailableFeatures(getFeatureBits()));

AMDGPU::IsaVersion ISA = AMDGPU::getIsaVersion(getSTI().getCPU());
if (ISA.Major >= 6 && isHsaAbi(getSTI())) {
createConstantSymbol(".amdgcn.gfx_generation_number", ISA.Major);
createConstantSymbol(".amdgcn.gfx_generation_minor", ISA.Minor);
Expand Down
20 changes: 18 additions & 2 deletions llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,10 +45,26 @@ using namespace llvm;

using DecodeStatus = llvm::MCDisassembler::DecodeStatus;

static const MCSubtargetInfo &addDefaultWaveSize(const MCSubtargetInfo &STI,
MCContext &Ctx) {
if (!STI.hasFeature(AMDGPU::FeatureWavefrontSize64) &&
!STI.hasFeature(AMDGPU::FeatureWavefrontSize32)) {
MCSubtargetInfo &STICopy = Ctx.getSubtargetCopy(STI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't really understand this subtargetcopy business

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

STI is const, I cannot just flip the bit. The same is done in the AsmParser with copySTI().

if (AMDGPU::isGFX10Plus(STI))
STICopy.ToggleFeature(AMDGPU::FeatureWavefrontSize32);
else
STICopy.ToggleFeature(AMDGPU::FeatureWavefrontSize64);
return STICopy;
}

return STI;
}

AMDGPUDisassembler::AMDGPUDisassembler(const MCSubtargetInfo &STI,
MCContext &Ctx, MCInstrInfo const *MCII)
: MCDisassembler(STI, Ctx), MCII(MCII), MRI(*Ctx.getRegisterInfo()),
MAI(*Ctx.getAsmInfo()), TargetMaxInstBytes(MAI.getMaxInstLength(&STI)),
: MCDisassembler(addDefaultWaveSize(STI, Ctx), Ctx), MCII(MCII),
MRI(*Ctx.getRegisterInfo()), MAI(*Ctx.getAsmInfo()),
TargetMaxInstBytes(MAI.getMaxInstLength(&STI)),
CodeObjectVersion(AMDGPU::getDefaultAMDHSACodeObjectVersion()) {
// ToDo: AMDGPUDisassembler supports only VI ISA.
if (!STI.hasFeature(AMDGPU::FeatureGCN3Encoding) && !isGFX10Plus())
Expand Down
2 changes: 0 additions & 2 deletions llvm/test/CodeGen/AMDGPU/check-subtarget-features.ll
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
; RUN: not llc -global-isel=0 -mtriple=amdgcn -mcpu=gfx1100 -mattr=-wavefrontsize32,-wavefrontsize64 < %s 2>&1 | FileCheck %s -check-prefix=ERR -implicit-check-not=error:
; RUN: not llc -global-isel=1 -mtriple=amdgcn -mcpu=gfx1100 -mattr=-wavefrontsize32,-wavefrontsize64 < %s 2>&1 | FileCheck %s -check-prefix=ERR -implicit-check-not=error:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diagnostics is now missing, BE will just initialize wavesize to default. If we want to catch this case we would still need to scan the original string.

; RUN: not llc -global-isel=0 -mtriple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,+wavefrontsize64 < %s 2>&1 | FileCheck %s -check-prefix=ERR -implicit-check-not=error:
; RUN: not llc -global-isel=1 -mtriple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,+wavefrontsize64 < %s 2>&1 | FileCheck %s -check-prefix=ERR -implicit-check-not=error:

Expand Down
16 changes: 8 additions & 8 deletions llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wavefrontsize.ll
Original file line number Diff line number Diff line change
@@ -1,19 +1,19 @@
; RUN: llc -mtriple=amdgcn -mcpu=fiji -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,W64 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=+wavefrontsize32,-wavefrontsize64 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,W32 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=-wavefrontsize32,+wavefrontsize64 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,W64 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,-wavefrontsize64 -verify-machineinstrs -amdgpu-enable-vopd=0 < %s | FileCheck -check-prefixes=GCN,W32 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -mattr=-wavefrontsize32,+wavefrontsize64 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,W64 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=+wavefrontsize32 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,W32 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=+wavefrontsize64 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,W64 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32 -verify-machineinstrs -amdgpu-enable-vopd=0 < %s | FileCheck -check-prefixes=GCN,W32 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize64 -verify-machineinstrs < %s | FileCheck -check-prefixes=GCN,W64 %s

; RUN: opt -O3 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -O3 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -O3 -mattr=+wavefrontsize32 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -passes='default<O3>' -mattr=+wavefrontsize32 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -O3 -mattr=+wavefrontsize64 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=tonga -O3 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1010 -O3 -mattr=+wavefrontsize32,-wavefrontsize64 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1010 -O3 -mattr=-wavefrontsize32,+wavefrontsize64 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1100 -O3 -mattr=+wavefrontsize32,-wavefrontsize64 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1100 -O3 -mattr=-wavefrontsize32,+wavefrontsize64 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1010 -O3 -mattr=+wavefrontsize32 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1010 -O3 -mattr=+wavefrontsize64 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1100 -O3 -mattr=+wavefrontsize32 -S < %s | FileCheck -check-prefix=OPT %s
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1100 -O3 -mattr=+wavefrontsize64 -S < %s | FileCheck -check-prefix=OPT %s

; GCN-LABEL: {{^}}fold_wavefrontsize:
; OPT-LABEL: define amdgpu_kernel void @fold_wavefrontsize(
Expand Down
2 changes: 1 addition & 1 deletion llvm/test/CodeGen/AMDGPU/unknown-processor.ll
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
; RUN: not llc -mtriple=amdgcn-- -mcpu=unknown -verify-machineinstrs < %s 2>&1 | FileCheck -check-prefix=ERROR -check-prefix=GCN %s
; RUN: llc -mtriple=amdgcn-- -mcpu=unknown -verify-machineinstrs < %s 2>&1 | FileCheck -check-prefix=ERROR -check-prefix=GCN %s
; RUN: llc -mtriple=r600-- -mcpu=unknown -verify-machineinstrs < %s 2>&1 | FileCheck -check-prefix=ERROR -check-prefix=R600 %s
target datalayout = "A5"

Expand Down
Loading
Loading