NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

Labels 44 Milestones 1

282 Open 2,166 Closed

Pull requests list

Fix: Double build time limit since #5027 halfs NUM_JOBS

#5212 opened Jun 14, 2025 by yuantailing

Overlap: Skip last iter on length

#5211 opened Jun 13, 2025 by IzzyPutterman

[fix] Fix Llama4 min-latency import error

#5209 opened Jun 13, 2025 by nv-yilinf

[TRTLLM-5770] feat: Integrate TRT-LLM Gen FP8 block scale MoE with Pytorch workflow kernel autotuner

#5207 opened Jun 13, 2025 by DomBrown

[feat] Add EAGLE3 support for Qwen3

#5206 opened Jun 13, 2025 by nv-yilinf

feat: Introduce UserProvidedConfig for speculative decoding

#5204 opened Jun 13, 2025 by Funatiq • Draft

feat: Enable EPLB to existing MoE models

#5203 opened Jun 13, 2025 by syuoni

[fix][test] Speedup Nemotron NAS unittests

#5202 opened Jun 13, 2025 by omera-nv

[draft][fix] rewrite completion API to avoid repetitive tokens

#5201 opened Jun 13, 2025 by LinPoly

Test

#5199 opened Jun 13, 2025 by ZhanruiSunCh • Draft

Merge current waive list with the ToT waive list

#5198 opened Jun 13, 2025 by yiqingy0

tests: add ds r1 tp4 test

#5197 opened Jun 13, 2025 by xinhe-nv • Draft

tests: add multi nodes tests

#5196 opened Jun 13, 2025 by xinhe-nv • Draft

test: add deepseek rcca cases

#5195 opened Jun 13, 2025 by ruodil

refactor: dummy request creation

#5192 opened Jun 13, 2025 by lfr-0531

[chore] Linking fixes to NVRTC wrapper (label: Community want to contribute)

#5189 opened Jun 13, 2025 by AlessioNetti

test: add llama4 models for perf test

#5187 opened Jun 13, 2025 by ruodil

add dgx b200 8gpu test case in post merge

#5185 opened Jun 13, 2025 by yuanjingx87

[TRTLLM-5653][infra] Run docs build only if PR contains only doc changes

#5184 opened Jun 13, 2025 by zhanga5

feat: MoE trtllm backend kernel update

#5183 opened Jun 13, 2025 by rosenrodt

Add debug hook to support dump tensor data and add new debug functions easily

#5182 opened Jun 13, 2025 by HuiGao-NV

Removed <think> on head of reasoning_content for DeepSeek-R1 model

#5181 opened Jun 13, 2025 by k-l-lambda

test: Add json_mode_eval for guided decoding evaluation

#5179 opened Jun 13, 2025 by syuoni

enh: Add script to map tests <-> jenkins stages & vice-versa

#5177 opened Jun 13, 2025 by venkywonka

[doc] Update Perf-Overview.MD with V0.20 Release Data

#5176 opened Jun 13, 2025 by zbpatel
