Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪
[NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts.
First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and safeguards)
Implementation of paper 'Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing'