Pull requests: NVIDIA/Fuser
Pull requests list
[WIP] Always do CGA split in persistent Hopper matmul
#4610
opened Jun 10, 2025 by jacobhinkle
Draft
CUB-based block-parallel topk implementation as a device func
#4607
opened Jun 10, 2025 by naoyam
Generate ldstmatrix shared memory address using IdModel
Matmuls
#4579
opened Jun 5, 2025 by rdspring1
Create alternate loop domain for ldstmatrix shared memory indexing
Matmuls
#4578
opened Jun 5, 2025 by rdspring1
Modify matmul scheduler to schedule alternate_loop_domain for ldstmatrix
Matmuls
#4551
opened May 30, 2025 by rdspring1
Improve Hopper matmul heuristic to enable larger CGAs
#4547
opened May 30, 2025 by jacobhinkle
Respect min and max of inputs to create more precise repro scripts
#4535
opened May 28, 2025 by crcrpar
Recreate python_frontend test_basic for nvfuser_direct
Direct Bindings
Python extension with direct mapping to NvFuser CPP objects.
Python API
Issues related to the Python API