Wai Man (Raymond) Si

CISPA – Helmholtz Center for Information Security
Saarbrücken, 66123 Saarland
I am a Ph.D. student at CISPA Helmholtz Center for Information Security, advised by Prof. Michael Backes and Dr. Yang Zhang. Prior to that, I received my B.S. (2018) and M.S. (2021) degrees from Georgia Institute of Technology, where I was fortunate to work with Prof. Alexander Lerch and Prof. Mark Riedl.
My research focuses on attacks targeting NLP models, including adversarial and poisoning attacks. I am also interested in developing safer models through post-training techniques. Currently, my work explores LLM behavior using mechanistic interpretability, as well as lightweight methods for steering or modifying model behavior.
Honors and Awards
- 2025: Education Scholarship, Education and Youth Development Bureau of Macau
- 2023: Best Paper Finalist, CSAW Europe
- 2022: CCS Best Paper Award Honorable Mention, ACM
News
Date | Update |
---|---|
Apr 2025 | Our paper “SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation” was accepted to ICLR 2025! |
Apr 2023 | Our paper “Two-in-One: A Model Hijacking Attack Against Text Generation Models” was accepted to USENIX Security 2023! |
Nov 2022 | Our paper “Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots” received a Best Paper Award Honorable Mention at CCS 2022! |
Aug 2022 | Our paper “Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots” was accepted to CCS 2022! |