[SYCL] Add -fsycl-fp32-prec-sqrt flag #5309

npmiller · 2022-01-14T10:58:00Z

This flag enables correctly rounded sycl::sqrt (the default precision
requirement is 3 ULP).

It also enables the flag for CUDA and HIP targets.

This is a follow-up to #5141, providing a proper fix for #4041.
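To make the "3 ULP versus correctly rounded" distinction concrete, here is a minimal host-only C++ sketch that measures ULP error for a single-precision square root. It is not the code from this patch and does not use SYCL. The helper names (`ulp_distance`, `reference_sqrt`, `approx_sqrt`, `host_sqrt`, `max_ulp_error`) are hypothetical, and `approx_sqrt` is only an assumed stand-in for a relaxed-precision device implementation.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Distance in units in the last place (ULP) between two finite, positive
// floats, computed from their IEEE-754 bit patterns.
std::int64_t ulp_distance(float a, float b) {
  std::uint32_t ia, ib;
  std::memcpy(&ia, &a, sizeof ia);
  std::memcpy(&ib, &b, sizeof ib);
  std::int64_t d = static_cast<std::int64_t>(ia) - static_cast<std::int64_t>(ib);
  return d < 0 ? -d : d;
}

// Correctly rounded reference: double carries enough precision that
// rounding its sqrt to float gives the correctly rounded float result.
float reference_sqrt(float x) {
  return static_cast<float>(std::sqrt(static_cast<double>(x)));
}

// Hypothetical relaxed-precision sqrt (an assumption for illustration):
// x * (1 / sqrt(x)) rounds more than once, so it can miss the correctly
// rounded result by a small number of ULP.
float approx_sqrt(float x) { return x * (1.0f / std::sqrt(x)); }

// Host single-precision sqrt, expected to be correctly rounded on
// IEEE-754 platforms when fast-math is not enabled.
float host_sqrt(float x) { return std::sqrt(x); }

// Largest ULP error of `f` against the reference over a sample of inputs
// in the range (0.5, 100.5].
template <typename F>
std::int64_t max_ulp_error(F f) {
  std::int64_t worst = 0;
  for (int i = 1; i <= 100000; ++i) {
    float x = 0.5f + 0.001f * static_cast<float>(i);
    std::int64_t d = ulp_distance(f(x), reference_sqrt(x));
    if (d > worst) worst = d;
  }
  return worst;
}
```

Under these assumptions, `max_ulp_error(host_sqrt)` is the kind of check that a correctly rounded `sycl::sqrt` should pass with a result of zero. A result of up to 3 is still within the default precision requirement mentioned above.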

clang/include/clang/Driver/Options.td

sycl/doc/UsersManual.md

clang/include/clang/Driver/Options.td

elizabethandrews

FE changes LGTM. Thanks!

smanna12

FE changes look good to me.

premanandrao

Just a couple minor nits.

clang/lib/Driver/ToolChain.cpp

clang/lib/Driver/ToolChains/HIPSPV.h

Co-authored-by: premanandrao <premanand.m.rao@intel.com>

clang/lib/Driver/ToolChains/AMDGPU.cpp

clang/test/Driver/sycl-nvptx-sqrt.cpp

Co-authored-by: Artem Gindinson <artem.gindinson@intel.com>

AGindinson

The implementation/tests LGTM, thanks!
I would prefer to leave the actual @intel/dpcpp-clang-driver-reviewers approval to @mdtoguchi or @hchilama in view of the new option being added.

bader

Regression/kernel_name_class.cpp from llvm-test-suite fails with:

Memory access fault by GPU node-1 (Agent handle: 0x828770) on address 0x7fe65b61e000. Reason: Page not present or supervisor privilege.

I don't think it's directly related to this patch, but it might be an issue in the runtime library or hip plug-in. Any ideas what's going on here?

smanna12

FE changes LGTM

elizabethandrews

FE changes LGTM!

npmiller · 2022-01-18T15:15:50Z

> Regression/kernel_name_class.cpp from llvm-test-suite fails with:
>
> Memory access fault by GPU node-1 (Agent handle: 0x828770) on address 0x7fe65b61e000. Reason: Page not present or supervisor privilege.
>
> I don't think it's directly related to this patch, but it might be an issue in the runtime library or hip plug-in. Any ideas what's going on here?

This is strange: I'm unable to reproduce these failures with this branch and the latest llvm-test-suite (on gfx908), so I'm not sure what's going on. I'm tempted to just rebase on the latest sycl branch, and then we can try re-running the CI.

bader · 2022-01-18T15:23:11Z

Based on the history of pre-commit checks in this pull request, the issue seems to be sporadic, but the log still suggests there is a bug somewhere, as the test program accesses a wrong memory location.

bader · 2022-01-18T15:40:38Z

@npmiller, I'll restart GitHub Actions jobs.

xtian-github

What about fp64? Is there no need for it?

zjin-lcf · 2022-01-18T22:47:53Z

> What about fp64? Is there no need for it?

Thanks for the suggestion. I will update my example to include a double-precision data type.

xtian-github · 2022-01-19T01:03:30Z

> > What about fp64? Is there no need for it?
>
> Thanks for the suggestion. I will update my example to include a double-precision data type.

Thanks for adding fp64! If we do support fp64, do we need to rename the option a bit, e.g. remove "fp32", given that the CUDA compiler option does not have a type in its name?

npmiller · 2022-01-19T10:35:19Z (edited by xtian-github)

> What about fp64? Is there no need for it?

The SYCL spec already requires double precision sqrt to be correctly rounded so I don't believe this flag would make sense for fp64.

> > > What about fp64? Is there no need for it?
> >
> > The SYCL spec already requires double precision sqrt to be correctly rounded so I don't believe this flag would make sense for fp64.
>
> With the verification result of the updated example, I found that nvcc calls the fast sqrt for double precision. The optimization option is just "-O3". However, the sycl compiler calls the correctly rounded sqrt. Thanks.

So, under -O3, does the SYCL compiler have (or need to have) the same behavior as NVCC?

zjin-lcf · 2022-01-19T11:53:48Z

> > What about fp64? Is there no need for it?
>
> The SYCL spec already requires double precision sqrt to be correctly rounded so I don't believe this flag would make sense for fp64.

With the verification result of the updated example, I found that nvcc calls the fast sqrt for double precision. The optimization option is just "-O3". However, the SYCL compiler calls the correctly rounded sqrt. Thanks.

npmiller · 2022-01-31T10:46:12Z

@bader looks like all of the testing passed on this after re-running the CI

bader · 2022-01-31T11:13:06Z

Great. Let's check that @xtian-github's concerns are resolved.

bader · 2022-02-04T12:32:05Z

@xtian-github, ping.

xtian-github

LGTM

* upstream/sycl: (3571 commits)
  [ESIMD] Doxygen update part III - core APIs. (intel#5472)
  [SYCL][DOC] Move proposed FPGA extensions (intel#5453)
  [SYCL] Add -fsycl-fp32-prec-sqrt flag (intel#5309)
  [SYCL] Emit program build logs for warning levels >= 2 (intel#5319)
  [SYCL] Add clang support for code_location in KernelInfo (intel#5335)
  [SYCL][Doc] Move FPGA extensions (intel#5470)
  [ESIMD] Fix public simd and simd_view APIs. (intel#5465)
  [SYCL] Deprecate sycl::atomics in SYCL 2020 mode (intel#5440)
  [SYCL] Add unit test for PR 5414 (intel#5450)
  [XPTI] Allow arbitrary data types in metadata (intel#4998)
  [SYCL][DOC] Move discard queue events to supported (intel#5452)
  [Driver][SYCL] Initial support for allowing fat static -lname processing (intel#5413)
  [SYCL] Fix dead pointer usage if leaf buffer overflows (intel#5417)
  [SYCL][L0] Fix memory leak in USM prefetch (intel#5461)
  [SYCL][Doc] Add new free function queries proposal (intel#5106)
  [SYCL][ESIMD] Update vc-intrinsics deps to the top of the trunk (intel#5460)
  [SYCL][DOC] Move old spec constant extension spec (intel#5456)
  [SYCL][DOC] Move deprecated extensions (intel#5458)
  [SYCL][DOC] Fix links to old SubGroupMask doc (intel#5459)
  [ESIMD] Doxygen update part II - memory APIs. (intel#5443)
  ...

It follows the approach from intel#5141 and intel#5309 adding intermediate fcuda-prec-div flag.

Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>

[SYCL] Add -fsycl-fp32-prec-sqrt flag

57dcd40

npmiller requested review from a team as code owners January 14, 2022 10:58

npmiller requested a review from andykaylor January 14, 2022 10:58

npmiller mentioned this pull request Jan 14, 2022

[SYCL] Add test for correctly rounded sqrt intel/llvm-test-suite#741

Open

smanna12 previously approved these changes Jan 14, 2022

elizabethandrews reviewed Jan 14, 2022

clang/include/clang/Driver/Options.td (outdated)

sycl/doc/UsersManual.md

[SYCL] Update command line help text

eb27453

npmiller dismissed smanna12’s stale review via eb27453 January 14, 2022 15:51

smanna12 reviewed Jan 14, 2022

clang/include/clang/Driver/Options.td

[SYCL] Add test for unused argument warning

f90cb8d

elizabethandrews previously approved these changes Jan 14, 2022

smanna12 previously approved these changes Jan 14, 2022

premanandrao reviewed Jan 14, 2022

clang/lib/Driver/ToolChain.cpp (outdated)

clang/lib/Driver/ToolChains/HIPSPV.h (outdated)

Update clang/lib/Driver/ToolChains/HIPSPV.h

5cebae7

Co-authored-by: premanandrao <premanand.m.rao@intel.com>

npmiller dismissed stale reviews from smanna12 and elizabethandrews via 5cebae7 January 14, 2022 16:58

npmiller and others added 2 commits January 14, 2022 17:01

Update clang/lib/Driver/ToolChain.cpp

71e578a

Co-authored-by: premanandrao <premanand.m.rao@intel.com>

[SYCL] Fix formatting

b701652

AGindinson reviewed Jan 14, 2022

clang/lib/Driver/ToolChains/AMDGPU.cpp (outdated)

clang/test/Driver/sycl-nvptx-sqrt.cpp (outdated)

npmiller and others added 3 commits January 17, 2022 10:31

[SYCL] sycl-nvptx-sqrt.cpp test works on Windows

eb0abf2

Update clang/lib/Driver/ToolChains/AMDGPU.cpp

486eef3

Co-authored-by: Artem Gindinson <artem.gindinson@intel.com>

[SYCL] Fix formatting

db77ba8

AGindinson reviewed Jan 17, 2022

[SYCL] Check conflicting flag on AMD

0ff1e21

bader approved these changes Jan 18, 2022

bader requested review from mdtoguchi and hchilama January 18, 2022 14:47

smanna12 approved these changes Jan 18, 2022

elizabethandrews approved these changes Jan 18, 2022

bader requested a review from a team January 18, 2022 15:00

mdtoguchi approved these changes Jan 18, 2022

pvchupin requested a review from xtian-github January 18, 2022 19:35

xtian-github reviewed Jan 18, 2022

npmiller requested a review from xtian-github January 31, 2022 10:45

xtian-github approved these changes Feb 4, 2022

bader merged commit 5c8b7e7 into intel:sycl Feb 4, 2022

npmiller mentioned this pull request Feb 4, 2022

[CUDA] sycl::sqrt leads to IEEE754 incompatible results on NVidia cards #4041

Closed

npmiller mentioned this pull request Mar 13, 2024

[HIP] DPC++ does not use correctly rounded sqrt, even when using -fno-fast-math #12961

Open

MrSidims mentioned this pull request Feb 18, 2025

[SYCL][CUDA][HIP] Propagate -foffload-fp32-prec-sqrt #17044

Merged
