Maksym Andriushchenko max-andr

PhD student @ EPFL🇨🇭. Interested in robustness and generalization in LLMs.

205 followers · 332 following

Pinned

tml-epfl/llm-adaptive-attacks (Public)

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024]

Shell · 108 stars · 8 forks

JailbreakBench/jailbreakbench (Public)

An Open Robustness Benchmark for Jailbreaking Language Models [arXiv 2024]

Python · 88 stars · 11 forks

RobustBench/robustbench (Public)

RobustBench: a standardized adversarial robustness benchmark [NeurIPS'21 Benchmarks and Datasets Track]

Python · 608 stars · 95 forks

tml-epfl/understanding-fast-adv-training (Public)

Understanding and Improving Fast Adversarial Training [NeurIPS 2020]

Python · 91 stars · 12 forks

square-attack (Public)

Square Attack: a query-efficient black-box adversarial attack via random search [ECCV 2020]

Python · 144 stars · 26 forks

relu_networks_overconfident (Public)

Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem [CVPR 2019, oral]

Python · 182 stars · 21 forks