Wai Man (Raymond) Si


CISPA – Helmholtz Center for Information Security

Saarbrücken, 66123 Saarland

I am a Ph.D. student at CISPA Helmholtz Center for Information Security, advised by Prof. Michael Backes and Dr. Yang Zhang. Prior to that, I received my B.S. (2018) and M.S. (2021) degrees from Georgia Institute of Technology, where I was fortunate to work with Prof. Alexander Lerch and Prof. Mark Riedl.

My research focuses on attacks targeting NLP models, including adversarial and poisoning attacks. I am also interested in developing safer models through post-training techniques. Currently, my work explores LLM behavior using mechanistic interpretability, as well as lightweight methods for steering or modifying model behavior.


Honors and Awards

  • 2025
    Education Scholarship, Education and Youth Development Bureau of Macau
  • 2023
    Best Paper Finalist, CSAW Europe
  • 2022
    CCS Best Paper Award Honorable Mention, ACM

News

Apr 2025 Our paper titled “SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation” was accepted to ICLR 2025!
Apr 2023 Our paper titled “Two-in-One: A Model Hijacking Attack Against Text Generation Models” was accepted to USENIX Security 2023!
Nov 2022 Our paper titled “Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots” received a Best Paper Award Honorable Mention at CCS 2022!
Aug 2022 Our paper titled “Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots” was accepted to CCS 2022!

Selected publications

  1. ICLR
    SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
    Mingjie Li, Wai Man Si, Michael Backes, and 2 more authors
    In ICLR, 2025
  2. USENIX
    Two-in-One: A Model Hijacking Attack Against Text Generation Models
    Wai Man Si, Michael Backes, Yang Zhang, and 1 more author
    In USENIX Security, 2023
  3. CCS
    Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
    Wai Man Si, Michael Backes, Jeremy Blackburn, and 4 more authors
    In CCS, 2022