Skip to content

[SYCL] Make device ids unique per backend #4247

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 64 commits into from
Sep 24, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
64 commits
Select commit Hold shift + click to select a range
647b8ca
[SYCL] Make device ids unique per backend
bso-intel Aug 3, 2021
3054c5c
revert the order of backend
bso-intel Aug 3, 2021
d0a9675
fix cuda errors
bso-intel Aug 5, 2021
4d3e6be
fix bugs
bso-intel Aug 6, 2021
843c63d
fix device_num test
bso-intel Aug 6, 2021
0b6524d
update doc
bso-intel Aug 6, 2021
a7e25dc
clang-format
bso-intel Aug 6, 2021
64620b4
revert
bso-intel Aug 7, 2021
7aabe18
Merge remote-tracking branch 'upstream/sycl' into unique-device-id-pe…
bso-intel Aug 17, 2021
846e2d0
update
bso-intel Aug 17, 2021
bfa2f70
Merge branch 'unique-device-id-per-backend' of https://github.com/bso…
bso-intel Aug 17, 2021
9fd4242
Update sycl/source/detail/platform_impl.cpp
bso-intel Aug 19, 2021
0b32652
Update sycl/source/detail/platform_impl.cpp
bso-intel Aug 19, 2021
cc0310b
Update sycl/source/detail/plugin.hpp
bso-intel Aug 19, 2021
cf2403d
Update sycl/source/detail/plugin.hpp
bso-intel Aug 19, 2021
258db05
add missed env var
bso-intel Aug 19, 2021
8e12004
Merge remote-tracking branch 'upstream/sycl' into unique-device-id-pe…
bso-intel Aug 19, 2021
4581603
Update sycl/source/detail/device_filter.cpp
bso-intel Aug 23, 2021
c8d214c
Update sycl/source/detail/pi.cpp
bso-intel Aug 23, 2021
d0497c5
Update sycl/source/detail/device_filter.cpp
bso-intel Aug 23, 2021
17fd9cd
Update sycl/source/detail/device_filter.cpp
bso-intel Aug 23, 2021
8ba9a74
address feedback
bso-intel Aug 24, 2021
b02c8bf
make thread-safe
bso-intel Aug 31, 2021
8ed6fae
Merge remote-tracking branch 'upstream/sycl' into unique-device-id-pe…
bso-intel Aug 31, 2021
3f51214
fix cuda issue
bso-intel Aug 31, 2021
917ab91
handle -1
bso-intel Sep 1, 2021
11ac050
Merge remote-tracking branch 'upstream/sycl' into unique-device-id-pe…
bso-intel Sep 1, 2021
f7ebfce
address feedback
bso-intel Sep 7, 2021
2a0b146
fix deadlock
bso-intel Sep 8, 2021
b0bfd31
first platform
bso-intel Sep 8, 2021
b946482
fix race
bso-intel Sep 8, 2021
262d0ac
reset device id
bso-intel Sep 8, 2021
6c44156
try not resetting PiPlatforms
bso-intel Sep 9, 2021
8b59615
change PiPlatform as vector
bso-intel Sep 9, 2021
efd6b1c
add locks
bso-intel Sep 10, 2021
2e0a05d
Merge remote-tracking branch 'upstream/sycl' into unique-device-id-pe…
bso-intel Sep 15, 2021
f423471
Update sycl/source/detail/pi.cpp
bso-intel Sep 16, 2021
9d19187
Update sycl/source/detail/pi.cpp
bso-intel Sep 16, 2021
bc5373d
Update sycl/source/detail/platform_impl.cpp
bso-intel Sep 16, 2021
6e5e7b7
Update sycl/source/detail/platform_impl.cpp
bso-intel Sep 16, 2021
51a6a71
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
0412791
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
b73cae4
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
58aaf31
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
a5b50f1
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
23609bf
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
fbaade4
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
5d00fcc
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
117b886
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
3d7c652
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
6dbcd19
Update sycl/source/detail/plugin.hpp
bso-intel Sep 16, 2021
e65d394
clang-format
bso-intel Sep 16, 2021
316a9dc
cleanup residue
bso-intel Sep 16, 2021
54e0c8a
revert to fix deadlock
bso-intel Sep 16, 2021
cef727f
missed init mutexes
bso-intel Sep 16, 2021
b8af7ea
add comments
bso-intel Sep 16, 2021
f55bb5a
address feedback
bso-intel Sep 17, 2021
7e02fb5
print platform name
bso-intel Sep 18, 2021
239b483
Update sycl/source/detail/platform_impl.cpp
bso-intel Sep 21, 2021
972c789
fix plugin index
bso-intel Sep 21, 2021
4b2d9bd
fix deadlock
bso-intel Sep 21, 2021
16d98cb
Update sycl/source/detail/platform_impl.cpp
bso-intel Sep 22, 2021
63de170
Update sycl/source/detail/plugin.hpp
bso-intel Sep 22, 2021
814e1e9
address feedback
bso-intel Sep 22, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 14 additions & 14 deletions sycl/doc/EnvironmentVariables.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,25 +55,25 @@ subject to change. Do not rely on these variables in production code.

This environment variable limits the SYCL RT to use only a subset of the system's devices. Setting this environment variable affects all of the device query functions (`platform::get_devices()` and `platform::get_platforms()`) and all of the device selectors.

The value of this environment variable is a comma separated list of filters, where each filter is a triple of the form "`backend:device_type:device_num`" (without the quotes). Each element of the triple is optional, but each filter must have at least one value. Possible values of "backend" are:
- host
The value of this environment variable is a comma separated list of filters, where each filter is a triple of the form "`backend`:`device_type`:`device_num`" (without the quotes). Each element of the triple is optional, but each filter must have at least one value. Possible values of `backend` are:
- `host`
- `level_zero`
- opencl
- cuda
- \*
- `opencl`
- `cuda`
- `*`

Possible values of "`device_type`" are:
- host
- cpu
- gpu
- acc
- \*
Possible values of `device_type` are:
- `host`
- `cpu`
- `gpu`
- `acc`
- `*`

`Device_num` is an integer that indexes the enumeration of devices from the sycl-ls utility tool, where the first device in that enumeration has index zero in each backend. For example, `SYCL_DEVICE_FILTER`=2 will return all devices with index '2' from all different backends. If multiple devices satisfy this device number (e.g., GPU and CPU devices can be assigned device number '2'), then default_selector will choose the device with the highest heuristic point.
`device_num` is an integer that indexes the enumeration of devices from the sycl-ls utility tool, where the first device in that enumeration has index zero in each backend. For example, `SYCL_DEVICE_FILTER=2` will return all devices with index '2' from all different backends. If multiple devices satisfy this device number (e.g., GPU and CPU devices can be assigned device number '2'), then default_selector will choose the device with the highest heuristic point. When `SYCL_DEVICE_ALLOWLIST` is set, it is applied before enumerating devices and affects `device_num` values.

Assuming a filter has all three elements of the triple, it selects only those devices that come from the given backend, have the specified device type, AND have the given device index. If more than one filter is specified, the RT is restricted to the union of devices selected by all filters. The RT does not include the "host" backend and the host device automatically unless one of the filters explicitly specifies the "host" device type. Therefore, `SYCL_DEVICE_FILTER`=host should be set to enforce SYCL to use the host device only.
Assuming a filter has all three elements of the triple, it selects only those devices that come from the given backend, have the specified device type, AND have the given device index. If more than one filter is specified, the RT is restricted to the union of devices selected by all filters. The RT does not include the `host` backend and the `host` device automatically unless one of the filters explicitly specifies the `host` device type. Therefore, `SYCL_DEVICE_FILTER=host` should be set to enforce SYCL to use the `host` device only.

Note that all device selectors will throw an exception if the filtered list of devices does not include a device that satisfies the selector. For instance, `SYCL_DEVICE_FILTER`=cpu,level_zero will cause host_selector() to throw an exception. `SYCL_DEVICE_FILTER` also limits loading only specified plugins into the SYCL RT. In particular, `SYCL_DEVICE_FILTER`=level_zero will cause the cpu_selector to throw an exception since SYCL RT will only load the level_zero backend which does not support any CPU devices at this time. When multiple devices satisfy the filter (e..g, `SYCL_DEVICE_FILTER`=gpu), only one of them will be selected.
Note that all device selectors will throw an exception if the filtered list of devices does not include a device that satisfies the selector. For instance, `SYCL_DEVICE_FILTER=cpu,level_zero` will cause `host_selector()` to throw an exception. `SYCL_DEVICE_FILTER` also limits loading only specified plugins into the SYCL RT. In particular, `SYCL_DEVICE_FILTER=level_zero` will cause the `cpu_selector` to throw an exception since SYCL RT will only load the `level_zero` backend which does not support any CPU devices at this time. When multiple devices satisfy the filter (e..g, `SYCL_DEVICE_FILTER=gpu`), only one of them will be selected.

### `SYCL_PRINT_EXECUTION_GRAPH` Options

Expand Down
2 changes: 1 addition & 1 deletion sycl/include/CL/sycl/detail/pi.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@ template <class To, class From> To cast(From value);
extern std::shared_ptr<plugin> GlobalPlugin;

// Performs PI one-time initialization.
const std::vector<plugin> &initialize();
std::vector<plugin> &initialize();

// Get the plugin serving given backend.
template <backend BE> __SYCL_EXPORT const plugin &getPlugin();
Expand Down
61 changes: 38 additions & 23 deletions sycl/source/detail/device_filter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,65 +12,80 @@
#include <detail/device_impl.hpp>

#include <cstring>
#include <string_view>

__SYCL_INLINE_NAMESPACE(cl) {
namespace sycl {
namespace detail {

std::vector<std::string_view> tokenize(const std::string &Filter,
const std::string &Delim) {
std::vector<std::string_view> Tokens;
size_t Pos = 0;
size_t LastPos = 0;

while ((Pos = Filter.find(Delim, LastPos)) != std::string::npos) {
std::string_view Tok(Filter.data() + LastPos, (Pos - LastPos));

if (!Tok.empty()) {
Tokens.push_back(Tok);
}
// move the search starting index
LastPos = Pos + 1;
}

// Add remainder if any
if (LastPos < Filter.size()) {
std::string_view Tok(Filter.data() + LastPos, Filter.size() - LastPos);
Tokens.push_back(Tok);
}
return Tokens;
}

device_filter::device_filter(const std::string &FilterString) {
size_t Cursor = 0;
size_t ColonPos = 0;
auto findElement = [&](auto Element) {
size_t Found = FilterString.find(Element.first, Cursor);
if (Found == std::string::npos)
return false;
Cursor = Found;
return true;
std::vector<std::string_view> Tokens = tokenize(FilterString, ":");
size_t TripleValueID = 0;

auto FindElement = [&](auto Element) {
return std::string::npos != Tokens[TripleValueID].find(Element.first);
};

// Handle the optional 1st field of the filter, backend
// Check if the first entry matches with a known backend type
auto It = std::find_if(std::begin(getSyclBeMap()), std::end(getSyclBeMap()),
findElement);
FindElement);
// If no match is found, set the backend type backend::all
// which actually means 'any backend' will be a match.
if (It == getSyclBeMap().end())
Backend = backend::all;
else {
Backend = It->second;
ColonPos = FilterString.find(":", Cursor);
if (ColonPos != std::string::npos)
Cursor = ColonPos + 1;
else
Cursor = Cursor + It->first.size();
TripleValueID++;
}

// Handle the optional 2nd field of the filter - device type.
// Check if the 2nd entry matches with any known device type.
if (Cursor >= FilterString.size()) {
if (TripleValueID >= Tokens.size()) {
DeviceType = info::device_type::all;
} else {
auto Iter = std::find_if(std::begin(getSyclDeviceTypeMap()),
std::end(getSyclDeviceTypeMap()), findElement);
std::end(getSyclDeviceTypeMap()), FindElement);
// If no match is found, set device_type 'all',
// which actually means 'any device_type' will be a match.
if (Iter == getSyclDeviceTypeMap().end())
DeviceType = info::device_type::all;
else {
DeviceType = Iter->second;
ColonPos = FilterString.find(":", Cursor);
if (ColonPos != std::string::npos)
Cursor = ColonPos + 1;
else
Cursor = Cursor + Iter->first.size();
TripleValueID++;
}
}

// Handle the optional 3rd field of the filter, device number
// Try to convert the remaining string to an integer.
// If succeessful, the converted integer is the desired device num.
if (Cursor < FilterString.size()) {
if (TripleValueID < Tokens.size()) {
try {
DeviceNum = stoi(FilterString.substr(Cursor));
DeviceNum = std::stoi(Tokens[TripleValueID].data());
HasDeviceNum = true;
} catch (...) {
std::string Message =
Expand Down
16 changes: 8 additions & 8 deletions sycl/source/detail/pi.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ getPluginOpaqueData<cl::sycl::backend::esimd_cpu>(void *);

namespace pi {

static void initializePlugins(std::vector<plugin> *Plugins);
static void initializePlugins(std::vector<plugin> &Plugins);

bool XPTIInitDone = false;

Expand Down Expand Up @@ -369,17 +369,17 @@ bool trace(TraceLevel Level) {
}

// Initializes all available Plugins.
const std::vector<plugin> &initialize() {
std::vector<plugin> &initialize() {
static std::once_flag PluginsInitDone;

std::call_once(PluginsInitDone, []() {
initializePlugins(&GlobalHandler::instance().getPlugins());
// std::call_once is blocking all other threads if a thread is already
// creating a vector of plugins. So, no additional lock is needed.
std::call_once(PluginsInitDone, [&]() {
initializePlugins(GlobalHandler::instance().getPlugins());
});

return GlobalHandler::instance().getPlugins();
}

static void initializePlugins(std::vector<plugin> *Plugins) {
static void initializePlugins(std::vector<plugin> &Plugins) {
std::vector<std::pair<std::string, backend>> PluginNames = findPlugins();

if (PluginNames.empty() && trace(PI_TRACE_ALL))
Expand Down Expand Up @@ -438,7 +438,7 @@ static void initializePlugins(std::vector<plugin> *Plugins) {
GlobalPlugin = std::make_shared<plugin>(PluginInformation,
backend::level_zero, Library);
}
Plugins->emplace_back(
Plugins.emplace_back(
plugin(PluginInformation, PluginNames[I].second, Library));
if (trace(TraceLevel::PI_TRACE_BASIC))
std::cerr << "SYCL_PI_TRACE[basic]: "
Expand Down
43 changes: 31 additions & 12 deletions sycl/source/detail/platform_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -95,27 +95,30 @@ static bool IsBannedPlatform(platform Platform) {

std::vector<platform> platform_impl::get_platforms() {
std::vector<platform> Platforms;
const std::vector<plugin> &Plugins = RT::initialize();

std::vector<plugin> &Plugins = RT::initialize();
info::device_type ForcedType = detail::get_forced_type();
for (unsigned int i = 0; i < Plugins.size(); i++) {

for (plugin &Plugin : Plugins) {
pi_uint32 NumPlatforms = 0;
// Move to the next plugin if the plugin fails to initialize.
// This way platforms from other plugins get a chance to be discovered.
if (Plugins[i].call_nocheck<PiApiKind::piPlatformsGet>(
if (Plugin.call_nocheck<PiApiKind::piPlatformsGet>(
0, nullptr, &NumPlatforms) != PI_SUCCESS)
continue;

if (NumPlatforms) {
std::vector<RT::PiPlatform> PiPlatforms(NumPlatforms);
if (Plugins[i].call_nocheck<PiApiKind::piPlatformsGet>(
if (Plugin.call_nocheck<PiApiKind::piPlatformsGet>(
NumPlatforms, PiPlatforms.data(), nullptr) != PI_SUCCESS)
return Platforms;

for (const auto &PiPlatform : PiPlatforms) {
platform Platform = detail::createSyclObjFromImpl<platform>(
getOrMakePlatformImpl(PiPlatform, Plugins[i]));
getOrMakePlatformImpl(PiPlatform, Plugin));
{
std::lock_guard<std::mutex> Guard(*Plugin.getPluginMutex());
// insert PiPlatform into the Plugin
Plugin.getPlatformId(PiPlatform);
}
// Skip platforms which do not contain requested device types
if (!Platform.get_devices(ForcedType).empty() &&
!IsBannedPlatform(Platform))
Expand All @@ -141,14 +144,26 @@ std::vector<platform> platform_impl::get_platforms() {
// This function matches devices in the order of backend, device_type, and
// device_num.
static void filterDeviceFilter(std::vector<RT::PiDevice> &PiDevices,
const plugin &Plugin) {
RT::PiPlatform Platform) {
device_filter_list *FilterList = SYCLConfig<SYCL_DEVICE_FILTER>::get();
if (!FilterList)
return;

std::vector<plugin> &Plugins = RT::initialize();
auto It =
std::find_if(Plugins.begin(), Plugins.end(), [Platform](plugin &Plugin) {
return Plugin.containsPiPlatform(Platform);
});
if (It == Plugins.end())
return;

plugin &Plugin = *It;
backend Backend = Plugin.getBackend();
int InsertIDx = 0;
int DeviceNum = 0;
// DeviceIds should be given consecutive numbers across platforms in the same
// backend
std::lock_guard<std::mutex> Guard(*Plugin.getPluginMutex());
int DeviceNum = Plugin.getStartingDeviceId(Platform);
for (RT::PiDevice Device : PiDevices) {
RT::PiDeviceType PiDevType;
Plugin.call<PiApiKind::piDeviceGetInfo>(Device, PI_DEVICE_INFO_TYPE,
Expand Down Expand Up @@ -181,6 +196,10 @@ static void filterDeviceFilter(std::vector<RT::PiDevice> &PiDevices,
DeviceNum++;
}
PiDevices.resize(InsertIDx);
// remember the last backend that has gone through this filter function
// to assign a unique device id number across platforms that belong to
// the same backend. For example, opencl:cpu:0, opencl:acc:1, opencl:gpu:2
Plugin.setLastDeviceId(Platform, DeviceNum);
}

std::shared_ptr<device_impl> platform_impl::getOrMakeDeviceImpl(
Expand Down Expand Up @@ -237,12 +256,12 @@ platform_impl::get_devices(info::device_type DeviceType) const {

// Filter out devices that are not present in the SYCL_DEVICE_ALLOWLIST
if (SYCLConfig<SYCL_DEVICE_ALLOWLIST>::get())
applyAllowList(PiDevices, MPlatform, this->getPlugin());
applyAllowList(PiDevices, MPlatform, Plugin);

// Filter out devices that are not compatible with SYCL_DEVICE_FILTER
filterDeviceFilter(PiDevices, Plugin);
filterDeviceFilter(PiDevices, MPlatform);

PlatformImplPtr PlatformImpl = getOrMakePlatformImpl(MPlatform, *MPlugin);
PlatformImplPtr PlatformImpl = getOrMakePlatformImpl(MPlatform, Plugin);
std::transform(
PiDevices.begin(), PiDevices.end(), std::back_inserter(Res),
[PlatformImpl](const RT::PiDevice &PiDevice) -> device {
Expand Down
52 changes: 50 additions & 2 deletions sycl/source/detail/plugin.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -89,10 +89,10 @@ auto packCallArguments(ArgsT &&... Args) {
class plugin {
public:
plugin() = delete;

plugin(RT::PiPlugin Plugin, backend UseBackend, void *LibraryHandle)
: MPlugin(Plugin), MBackend(UseBackend), MLibraryHandle(LibraryHandle),
TracingMutex(std::make_shared<std::mutex>()) {}
TracingMutex(std::make_shared<std::mutex>()),
MPluginMutex(std::make_shared<std::mutex>()) {}

plugin &operator=(const plugin &) = default;
plugin(const plugin &) = default;
Expand Down Expand Up @@ -184,11 +184,59 @@ class plugin {
void *getLibraryHandle() { return MLibraryHandle; }
int unload() { return RT::unloadPlugin(MLibraryHandle); }

// return the index of PiPlatforms.
// If not found, add it and return its index.
// The function is expected to be called in a thread safe manner.
int getPlatformId(RT::PiPlatform Platform) {
auto It = std::find(PiPlatforms.begin(), PiPlatforms.end(), Platform);
if (It != PiPlatforms.end())
return It - PiPlatforms.begin();

PiPlatforms.push_back(Platform);
LastDeviceIds.push_back(0);
return PiPlatforms.size() - 1;
}

// Device ids are consecutive across platforms within a plugin.
// We need to return the same starting index for the given platform.
// So, instead of returing the last device id of the given platform,
// return the last device id of the predecessor platform.
// The function is expected to be called in a thread safe manner.
int getStartingDeviceId(RT::PiPlatform Platform) {
int PlatformId = getPlatformId(Platform);
if (PlatformId == 0)
return 0;
return LastDeviceIds[PlatformId - 1];
}

// set the id of the last device for the given platform
// The function is expected to be called in a thread safe manner.
void setLastDeviceId(RT::PiPlatform Platform, int Id) {
int PlatformId = getPlatformId(Platform);
LastDeviceIds[PlatformId] = Id;
}

bool containsPiPlatform(RT::PiPlatform Platform) {
auto It = std::find(PiPlatforms.begin(), PiPlatforms.end(), Platform);
return It != PiPlatforms.end();
}

std::shared_ptr<std::mutex> getPluginMutex() { return MPluginMutex; }

private:
RT::PiPlugin MPlugin;
backend MBackend;
void *MLibraryHandle; // the handle returned from dlopen
std::shared_ptr<std::mutex> TracingMutex;
// Mutex to guard PiPlatforms and LastDeviceIds.
// Note that this is a temporary solution until we implement the global
// Device/Platform cache later.
std::shared_ptr<std::mutex> MPluginMutex;
// vector of PiPlatforms that belong to this plugin
std::vector<RT::PiPlatform> PiPlatforms;
// represents the unique ids of the last device of each platform
// index of this vector corresponds to the index in PiPlatforms vector.
std::vector<int> LastDeviceIds;
}; // class plugin
} // namespace detail
} // namespace sycl
Expand Down
Loading