
Commit 7a9a425

Authored by Alexander-Johnston, fwyzard, Ruyk, and steffenlarsen
[SYCL][CUDA] Initial CUDA backend support (#1091)
* [SYCL][LIBCLC] Additional libclc builtins to support SYCL work
  Adds builtins to libclc to support the CUDA backend for SYCL.
  Contributors: Alexander Johnston <alexander@codeplay.com>, David Wood <david.wood@codeplay.com>, Victor Lomuller <victor@codeplay.com>
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL] CMake and lit support for SYCL CUDA backend
  Adds CMake and lit variable definitions used for SYCL CUDA backend development and testing.
  Contributors: Alexander Johnston <alexander@codeplay.com>, Bjoern Knafla <bjoern@codeplay.com>, Ruyman Reyes <ruyman@codeplay.com>
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL] Local Accessor Support for CUDA
  Provides the LocalAccessorToSharedMemory compiler pass required for supporting SYCL local accessors in CUDA.
  Contributors: Alexander Johnston <alexander@codeplay.com>, David Wood <david.wood@codeplay.com>
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Change __spirv_BuiltIn.. to functions
  Changes the following builtins to functions: __spirv_BuiltInGlobalSize, __spirv_BuiltInWorkgroupSize, __spirv_BuiltInNumWorkgroups, __spirv_BuiltInLocalInvocationId, __spirv_BuiltInWorkgroupId, __spirv_BuiltInGlobalOffset.
  Contributors: David Wood <david.wood@codeplay.com>
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Add SYCL CUDA support to clang driver
  Adds CUDA support for SYCL compilation in the clang driver.
  Contributors: Alexander Johnston <alexander@codeplay.com>, David Wood <david.wood@codeplay.com>, Victor Lomuller <victor@codeplay.com>
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Initial Implementation of the CUDA backend
  Contributors: Alan Forbes <alan.forbes@codeplay.com>, Alexander Johnston <alexander@codeplay.com>, Bjoern Knafla <bjoern@codeplay.com>, Daniel Soutar <daniel.soutar@codeplay.com>, David Wood <david.wood@codeplay.com>, Kumudha Narasimhan <kumudha.narasimhan@codeplay.com>, Mehdi Goli <mehdi.goli@codeplay.com>, Przemek Malon <przemek.malon@codeplay.com>, Ruyman Reyes <ruyman@codeplay.com>, Stuart Adams <stuart.adams@codeplay.com>, Svetlozar Georgiev <svetlozar.georgiev@codeplay.com>, Steffen Larsen <steffen.larsen@codeplay.com>, Victor Lomuller <victor@codeplay.com>
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL] Update libclc install rules
  Have libclc install clc-* and libspirv-* to lib and share.
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Inline cl namespace to simplify SYCL API usage
  Synchronises the CUDA backend with the general SYCL changes from #974.
  Signed-off-by: Andrea Bocci <andrea.bocci@cern.ch>

* Added missing flags for device-side builtins
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Removing unnecessary tool from the tree
  Acked-by: Victor Lomuller <victor@codeplay.com>
  Signed-off-by: Ruyman <ruyman@codeplay.com>

* [SYCL][PI] Fix kernel group info parameter conversion
  Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com>

* [SYCL][CUDA] Refactor __SYCL_INLINE macro
  Synchronises the CUDA backend with the general SYCL changes from #1121.
  Signed-off-by: Andrea Bocci <andrea.bocci@cern.ch>

* [SYCL] Have default_selector consider SYCL_BE
  Have the default_selector consider the SYCL_BE environment variable when rating device scores, to make choosing a backend easier.
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL] Select GlobalPlugin based on SYCL_BE
  Rather than choosing the last plugin found as GlobalPlugin, select it depending on the SYCL_BE environment variable.
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL] Improve default device selection checks
  Better checks for CUDA and OpenCL devices to match against SYCL_BE in the default device selection, based on the platform version info.
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL] Formatting update for device_selector.cpp
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL] Changed CUDA unit tests to call through plugin
  Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com>

* [SYCL] Pass SYCL_BE=PI_OPENCL in check-sycl
  To ensure that the check-sycl target tests OpenCL devices, pass SYCL_BE=PI_OPENCL. This mirrors the check-sycl-cuda target, which passes SYCL_BE=PI_CUDA. Without this it is nondeterministic which device is tested by check-sycl.
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Remove PI_CUDA specific details from clang
  Removes PI_CUDA-specific code paths and tests from clang, opting to always enable them.
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Disable linear_id/opencl-interop.cpp for cuda
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Further fixes to CUDA device selection
  Fixes the platform string comparison for CUDA platform detection, and fixes the device info platform query so that it uses the device's plugin rather than the GlobalPlugin.
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Code style and cleanup to CUDA support
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL] Enable asserts in all buildbot builds
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

* [SYCL][CUDA] Minor test and build configuration
  Fixes minor test and build configuration issues introduced in the development of the CUDA backend.
  Signed-off-by: Alexander Johnston <alexander@codeplay.com>

Co-authored-by: Andrea Bocci <andrea.bocci@cern.ch>
Co-authored-by: Ruyman <ruyman@codeplay.com>
Co-authored-by: Steffen Larsen <56076654+steffenlarsen@users.noreply.github.com>
1 parent a0c0e33 commit 7a9a425
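For context, the sketch below is not part of the commit; it illustrates the kind of user code the new backend enables. It assumes a toolchain configured with --cuda (SYCL_BUILD_PI_CUDA=ON, see buildbot/configure.py below); per the device-selection changes in this commit, running the resulting binary with SYCL_BE=PI_CUDA biases the default_selector toward the CUDA plugin, while SYCL_BE=PI_OPENCL favours OpenCL. The kernel name vec_add and the problem size N are invented for illustration.

// Minimal SYCL vector addition (illustrative sketch, not from this commit).
// Run with SYCL_BE=PI_CUDA to favour the CUDA plugin in default_selector.
#include <CL/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
  constexpr size_t N = 1024;                           // arbitrary problem size
  std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

  cl::sycl::queue q{cl::sycl::default_selector{}};     // scores devices using SYCL_BE
  {
    cl::sycl::buffer<float, 1> ba(a.data(), cl::sycl::range<1>(N));
    cl::sycl::buffer<float, 1> bb(b.data(), cl::sycl::range<1>(N));
    cl::sycl::buffer<float, 1> bc(c.data(), cl::sycl::range<1>(N));

    q.submit([&](cl::sycl::handler &cgh) {
      auto A = ba.get_access<cl::sycl::access::mode::read>(cgh);
      auto B = bb.get_access<cl::sycl::access::mode::read>(cgh);
      auto C = bc.get_access<cl::sycl::access::mode::write>(cgh);
      cgh.parallel_for<class vec_add>(
          cl::sycl::range<1>(N),
          [=](cl::sycl::id<1> i) { C[i] = A[i] + B[i]; });
    });
  }                                                    // buffers write back on destruction
  std::cout << "c[0] = " << c[0] << "\n";              // expected: 3
  return 0;
}

With SYCL_BE unset, the highest-scoring device still wins; the commit's default_selector changes only bias the score toward the requested plugin.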

File tree: 820 files changed (+20902 / -3437 lines)


buildbot/configure.py

Lines changed: 39 additions & 19 deletions
@@ -11,30 +11,49 @@ def do_configure(args):
     sycl_dir = os.path.join(args.src_dir, "sycl")
     spirv_dir = os.path.join(args.src_dir, "llvm-spirv")
     ocl_header_dir = os.path.join(args.obj_dir, "OpenCL-Headers")
-    icd_loader_lib = ''
+    icd_loader_lib = os.path.join(args.obj_dir, "OpenCL-ICD-Loader", "build")
+    llvm_targets_to_build = 'X86'
+    llvm_enable_projects = 'clang;llvm-spirv;sycl;opencl-aot'
+    libclc_targets_to_build = ''
+    sycl_build_pi_cuda = 'OFF'
+    llvm_enable_assertions = 'ON'

     if platform.system() == 'Linux':
-        icd_loader_lib = os.path.join(args.obj_dir, "OpenCL-ICD-Loader", "build", "libOpenCL.so")
+        icd_loader_lib = os.path.join(icd_loader_lib, "libOpenCL.so")
     else:
-        icd_loader_lib = os.path.join(args.obj_dir, "OpenCL-ICD-Loader", "build", "OpenCL.lib")
+        icd_loader_lib = os.path.join(icd_loader_lib, "OpenCL.lib")
+
+    if args.cuda:
+        llvm_targets_to_build += ';NVPTX'
+        llvm_enable_projects += ';libclc'
+        libclc_targets_to_build = 'nvptx64--;nvptx64--nvidiacl'
+        sycl_build_pi_cuda = 'ON'
+
+    if args.assertions:
+        llvm_enable_assertions = 'ON'

     install_dir = os.path.join(args.obj_dir, "install")

-    cmake_cmd = ["cmake",
-                 "-G", "Ninja",
-                 "-DCMAKE_BUILD_TYPE={}".format(args.build_type),
-                 "-DLLVM_EXTERNAL_PROJECTS=sycl;llvm-spirv;opencl-aot",
-                 "-DLLVM_EXTERNAL_SYCL_SOURCE_DIR={}".format(sycl_dir),
-                 "-DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR={}".format(spirv_dir),
-                 "-DLLVM_ENABLE_PROJECTS=clang;sycl;llvm-spirv;opencl-aot",
-                 "-DOpenCL_INCLUDE_DIR={}".format(ocl_header_dir),
-                 "-DOpenCL_LIBRARY={}".format(icd_loader_lib),
-                 "-DLLVM_BUILD_TOOLS=ON",
-                 "-DSYCL_ENABLE_WERROR=ON",
-                 "-DLLVM_ENABLE_ASSERTIONS=ON",
-                 "-DCMAKE_INSTALL_PREFIX={}".format(install_dir),
-                 "-DSYCL_INCLUDE_TESTS=ON", # Explicitly include all kinds of SYCL tests.
-                 llvm_dir]
+    cmake_cmd = [
+        "cmake",
+        "-G", "Ninja",
+        "-DCMAKE_BUILD_TYPE={}".format(args.build_type),
+        "-DLLVM_ENABLE_ASSERTIONS={}".format(llvm_enable_assertions),
+        "-DLLVM_TARGETS_TO_BUILD={}".format(llvm_targets_to_build),
+        "-DLLVM_EXTERNAL_PROJECTS=sycl;llvm-spirv;opencl-aot",
+        "-DLLVM_EXTERNAL_SYCL_SOURCE_DIR={}".format(sycl_dir),
+        "-DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR={}".format(spirv_dir),
+        "-DLLVM_ENABLE_PROJECTS={}".format(llvm_enable_projects),
+        "-DLIBCLC_TARGETS_TO_BUILD={}".format(libclc_targets_to_build),
+        "-DOpenCL_INCLUDE_DIR={}".format(ocl_header_dir),
+        "-DOpenCL_LIBRARY={}".format(icd_loader_lib),
+        "-DSYCL_BUILD_PI_CUDA={}".format(sycl_build_pi_cuda),
+        "-DLLVM_BUILD_TOOLS=ON",
+        "-DSYCL_ENABLE_WERROR=ON",
+        "-DCMAKE_INSTALL_PREFIX={}".format(install_dir),
+        "-DSYCL_INCLUDE_TESTS=ON", # Explicitly include all kinds of SYCL tests.
+        llvm_dir
+    ]

     print(cmake_cmd)

@@ -63,6 +82,8 @@ def main():
     parser.add_argument("-o", "--obj-dir", metavar="OBJ_DIR", required=True, help="build directory")
     parser.add_argument("-t", "--build-type",
                         metavar="BUILD_TYPE", required=True, help="build type, debug or release")
+    parser.add_argument("--cuda", action='store_true', help="switch from OpenCL to CUDA")
+    parser.add_argument("--assertions", action='store_true', help="build with assertions")

     args = parser.parse_args()

@@ -74,4 +95,3 @@ def main():
     ret = main()
     exit_code = 0 if ret else 1
     sys.exit(exit_code)
-

clang/include/clang/Basic/DiagnosticDriverKinds.td

Lines changed: 3 additions & 0 deletions
@@ -64,6 +64,9 @@ def warn_drv_unknown_cuda_version: Warning<
   "Unknown CUDA version %0. Assuming the latest supported version %1">,
   InGroup<CudaUnknownVersion>;
 def err_drv_cuda_host_arch : Error<"unsupported architecture '%0' for host compilation.">;
+def err_drv_no_sycl_libspirv : Error<
+  "cannot find `libspirv-nvptx64--nvidiacl.bc`. Provide path to libspirv library via "
+  "-fsycl-libspirv-path, or pass -fno-sycl-libspirv to build without linking with libspirv.">;
 def err_drv_mix_cuda_hip : Error<"Mixed Cuda and HIP compilation is not supported.">;
 def err_drv_invalid_thread_model_for_target : Error<
   "invalid thread model '%0' in '%1' for this target">;

clang/include/clang/Basic/DiagnosticIDs.h

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ namespace clang {
     // Size of each of the diagnostic categories.
     enum {
       DIAG_SIZE_COMMON = 300,
-      DIAG_SIZE_DRIVER = 250, // 200 -> 250 for SYCL related diagnostics
+      DIAG_SIZE_DRIVER = 210,
       DIAG_SIZE_FRONTEND = 150,
       DIAG_SIZE_SERIALIZATION = 120,
       DIAG_SIZE_LEX = 400,

clang/include/clang/Driver/Options.td

Lines changed: 3 additions & 0 deletions
@@ -1872,6 +1872,9 @@ def fsycl_help_EQ : Joined<["-"], "fsycl-help=">,
 def fsycl_help : Flag<["-"], "fsycl-help">, Alias<fsycl_help_EQ>,
   Flags<[DriverOption, CoreOption]>, AliasArgs<["all"]>, HelpText<"Emit help information "
   "from all of the offline compilation tools">;
+def fsycl_libspirv_path_EQ : Joined<["-"], "fsycl-libspirv-path=">,
+  Flags<[CC1Option, CoreOption]>, HelpText<"Path to libspirv library">;
+def fno_sycl_libspirv : Flag<["-"], "fno-sycl-libspirv">, HelpText<"Disable check for libspirv">;
 def fsyntax_only : Flag<["-"], "fsyntax-only">,
   Flags<[DriverOption,CoreOption,CC1Option]>, Group<Action_Group>;
 def ftabstop_EQ : Joined<["-"], "ftabstop=">, Group<f_Group>;

clang/lib/Basic/Targets/NVPTX.cpp

Lines changed: 2 additions & 1 deletion
@@ -57,7 +57,8 @@ NVPTXTargetInfo::NVPTXTargetInfo(const llvm::Triple &Triple,
         .Default(32);
   }

-  TLSSupported = false;
+  // FIXME: Needed for compiling SYCL to PTX.
+  TLSSupported = Triple.getEnvironment() == llvm::Triple::SYCLDevice;
   VLASupported = false;
   AddrSpaceMap = &NVPTXAddrSpaceMap;
   UseAddrSpaceMapMangling = true;

clang/lib/Basic/Targets/NVPTX.h

Lines changed: 6 additions & 0 deletions
@@ -141,6 +141,12 @@ class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public TargetInfo {
     Opts.support("cl_khr_global_int32_extended_atomics");
     Opts.support("cl_khr_local_int32_base_atomics");
     Opts.support("cl_khr_local_int32_extended_atomics");
+    // PTX actually supports 64 bits operations even if the Nvidia OpenCL
+    // runtime does not report support for it.
+    // This is required for libclc to compile 64 bits atomic functions.
+    // FIXME: maybe we should have a way to control this ?
+    Opts.support("cl_khr_int64_base_atomics");
+    Opts.support("cl_khr_int64_extended_atomics");
   }

   /// \returns If a target requires an address within a target specific address
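The two extra extension flags matter because the libclc targets added by this commit (nvptx64--, nvptx64--nvidiacl) guard their 64-bit atomic builtins behind cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics. As a hedged, user-level illustration only (not taken from this commit, and assuming the device actually supports 64-bit atomics), code along the following lines eventually bottoms out in those builtins:

// Sketch: 64-bit atomic accumulation; assumes cl_khr_int64_base_atomics is
// usable on the target device. Kernel name and work size are invented.
#include <CL/sycl.hpp>
#include <iostream>

int main() {
  unsigned long long sum = 0;
  {
    cl::sycl::buffer<unsigned long long, 1> bsum(&sum, cl::sycl::range<1>(1));
    cl::sycl::queue q;
    q.submit([&](cl::sycl::handler &cgh) {
      // access::mode::atomic yields cl::sycl::atomic<> objects on dereference.
      auto acc = bsum.get_access<cl::sycl::access::mode::atomic>(cgh);
      cgh.parallel_for<class count64>(
          cl::sycl::range<1>(256),
          [=](cl::sycl::id<1>) { acc[0].fetch_add(1ULL); });
    });
  }
  std::cout << sum << "\n";  // expected: 256
  return 0;
}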

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 0 additions & 3 deletions
@@ -842,9 +842,6 @@ void EmitAssemblyHelper::EmitAssembly(BackendAction Action,
   PerFunctionPasses.add(
       createTargetTransformInfoWrapperPass(getTargetIRAnalysis()));

-  if (LangOpts.SYCLIsDevice)
-    PerFunctionPasses.add(createSYCLLowerWGScopePass());
-
   CreatePasses(PerModulePasses, PerFunctionPasses);

   legacy::PassManager CodeGenPasses;

clang/lib/CodeGen/CGCall.cpp

Lines changed: 6 additions & 0 deletions
@@ -755,6 +755,12 @@ CodeGenTypes::arrangeLLVMFunctionInfo(CanQualType resultType,
     return *FI;

   unsigned CC = ClangCallConvToLLVMCallConv(info.getCC());
+  // This is required so SYCL kernels are successfully processed by tools from CUDA. Kernels
+  // with a `spir_kernel` calling convention are ignored otherwise.
+  if (CC == llvm::CallingConv::SPIR_KERNEL && CGM.getTriple().isNVPTX() &&
+      getContext().getLangOpts().SYCLIsDevice) {
+    CC = llvm::CallingConv::C;
+  }

   // Construct the function info. We co-allocate the ArgInfos.
   FI = CGFunctionInfo::create(CC, instanceMethod, chainCall, info,

clang/lib/CodeGen/CodeGenAction.cpp

Lines changed: 13 additions & 0 deletions
@@ -10,6 +10,7 @@
 #include "CodeGenModule.h"
 #include "CoverageMappingGen.h"
 #include "MacroPPCallbacks.h"
+#include "SYCLLowerIR/LowerWGScope.h"
 #include "clang/AST/ASTConsumer.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/DeclCXX.h"
@@ -33,6 +34,7 @@
 #include "llvm/IR/GlobalValue.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/LLVMRemarkStreamer.h"
+#include "llvm/IR/LegacyPassManager.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IRReader/IRReader.h"
 #include "llvm/Linker/Linker.h"
@@ -326,6 +328,17 @@ namespace clang {
           CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone)
         Ctx.setDiagnosticsHotnessRequested(true);

+      // The parallel_for_work_group legalization pass can emit calls to
+      // builtins function. Definitions of those builtins can be provided in
+      // LinkModule. We force the pass to legalize the code before the link
+      // happens.
+      if (LangOpts.SYCLIsDevice) {
+        PrettyStackTraceString CrashInfo("Pre-linking SYCL passes");
+        legacy::PassManager PreLinkingSyclPasses;
+        PreLinkingSyclPasses.add(createSYCLLowerWGScopePass());
+        PreLinkingSyclPasses.run(*getModule());
+      }
+
       // Link each LinkModule into our module.
       if (LinkInModules())
         return;
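The comment in the hunk above refers to SYCL hierarchical parallelism: parallel_for_work_group bodies are rewritten by createSYCLLowerWGScopePass, and the rewritten code may call builtins whose definitions only arrive through a LinkModule (presumably the libspirv bitcode referenced elsewhere in this commit), hence running the pass before LinkInModules(). As a hedged sketch, not taken from the commit, this is the kind of source that exercises that path (kernel name and work-group sizes are invented):

// Sketch of SYCL hierarchical parallelism; parallel_for_work_group is the
// construct legalized by the SYCLLowerWGScope pass mentioned above.
#include <CL/sycl.hpp>

int main() {
  cl::sycl::queue q;
  q.submit([&](cl::sycl::handler &cgh) {
    cgh.parallel_for_work_group<class hier>(
        cl::sycl::range<1>(8),   // 8 work-groups (arbitrary)
        cl::sycl::range<1>(64),  // 64 work-items per group (arbitrary)
        [=](cl::sycl::group<1> g) {
          int wg_scope_value = 42;  // work-group scope code handled by the pass
          g.parallel_for_work_item([&](cl::sycl::h_item<1> item) {
            // per-work-item scope; sees the work-group scope value
            (void)item;
            (void)wg_scope_value;
          });
        });
  });
  q.wait();
  return 0;
}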

clang/lib/CodeGen/CodeGenModule.cpp

Lines changed: 2 additions & 0 deletions
@@ -240,6 +240,8 @@ void CodeGenModule::createSYCLRuntime() {
   switch (getTriple().getArch()) {
   case llvm::Triple::spir:
   case llvm::Triple::spir64:
+  case llvm::Triple::nvptx:
+  case llvm::Triple::nvptx64:
     SYCLRuntime.reset(new CGSYCLRuntime(*this));
     break;
   default:
