Lipeng (Tony) He

ttttonyhe

https://lipeng.ac

AI & ML interests

Trustworthy Machine Learning

Recent Activity

authored a paper 2 days ago

Safety at One Shot: Patching Fine-Tuned LLMs with A Single Instance

submitted a paper 2 days ago

Safety at One Shot: Patching Fine-Tuned LLMs with A Single Instance

updated a collection 3 days ago

Red-Teaming Models & Datasets

commented on a paper 3 months ago

Locket: Robust Feature-Locking Technique for Language Models

Paper • 2510.12117 • Published Oct 14, 2025

commented on a paper 11 months ago

Activation Approximations Can Incur Safety Vulnerabilities Even in Aligned LLMs: Comprehensive Analysis and Defense

Paper • 2502.00840 • Published Feb 2, 2025