Description
Describe the bug
The SYCL specification in section 3.9.11 states that calls to submit on the queue are non-blocking:
These calls will be non-blocking on the host, but enqueue operations to the queue that the command group is submitted to.
Therefore I believe it's incorrect for the runtime to wait on other commands when submitting a command, meaning that it may need to hold off on submitting the command to PI until a later point.
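To illustrate what "holding off" could look like, here is a minimal sketch in plain standard C++, not DPC++ runtime code. The names Event, on_complete, signal and submit_deferred are hypothetical and exist only for this example; they are not SYCL or PI APIs. The idea is that submit registers the command as a continuation on its dependency and returns immediately, rather than waiting on the dependency inside the submit call.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a runtime event; not a SYCL or PI type.
struct Event {
    std::mutex m;
    bool done = false;
    std::vector<std::function<void()>> continuations;

    // Run fn once this event completes. Never blocks the caller:
    // if the event is still pending, fn is stored for later.
    void on_complete(std::function<void()> fn) {
        {
            std::lock_guard<std::mutex> g(m);
            if (!done) {
                continuations.push_back(std::move(fn));
                return;
            }
        }
        fn();  // already complete, dispatch right away
    }

    // Mark the event complete and dispatch everything that was deferred.
    void signal() {
        std::vector<std::function<void()>> ready;
        {
            std::lock_guard<std::mutex> g(m);
            done = true;
            ready.swap(continuations);
        }
        for (auto &fn : ready) fn();
    }
};

// Hypothetical non-blocking submit: `command` stands for handing the work
// to the backend. It is dispatched only once `dep` has completed, and the
// call itself returns without waiting on `dep`.
std::shared_ptr<Event> submit_deferred(std::shared_ptr<Event> dep,
                                       std::function<void()> command) {
    auto result = std::make_shared<Event>();
    auto run = [command = std::move(command), result] {
        command();
        result->signal();
    };
    if (dep) {
        dep->on_complete(std::move(run));
    } else {
        run();
    }
    return result;
}
```

With this shape, a command that depends on a still-running host task is simply parked until the host task's event is signalled, so the submitting thread is free to continue (and, in the reproducer below, to release the mutex).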
To Reproduce
The following code snippet using host_task hangs with DPC++, but I believe it is valid SYCL code:
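#include <sycl/sycl.hpp>
#include <cstdio>
#include <mutex>
#include <vector>

int main(int argc, char *argv[]) {
  sycl::queue q;

  std::vector<int> a(32, 2);
  sycl::buffer<int, 1> Ba{a.data(), a.size()};

  // Hold the mutex on the main thread so the host task below cannot finish yet.
  std::mutex l;
  l.lock();

  // Host task that blocks until the main thread releases the mutex.
  sycl::event e = q.submit([&](sycl::handler &h) {
    h.host_task([&]() {
      l.lock();
      l.unlock();
    });
  });

  // Kernel that depends on the host task; this submit is expected to return immediately.
  q.submit([&](sycl::handler &h) {
    h.depends_on(e);
    auto acca = Ba.get_access<sycl::access::mode::read_write>(h);
    h.parallel_for<class kernel1>(sycl::range<1>{1}, [=](sycl::id<1> idx) {
      acca[0] = 1;
    });
  });

  // Only reached if the submit above did not block; lets the host task complete.
  l.unlock();

  q.wait();

  auto acca = Ba.get_access<sycl::access::mode::read>();
  printf("val: %d\n", acca[0]);

  return 0;
}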
I'm getting deadlocks with this on Linux with the CUDA plugin.
I believe it hangs in the runtime, which suggests that the kernel enqueue is actually blocking while it waits on the event e from the host_task, which is invalid.
cc @Pennycook