publications

Please see my full publication list on Google Scholar.

2025

  1. ICLR
    SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
    Mingjie Li, Wai Man Si, Michael Backes, and 2 more authors
    In ICLR, 2025

2024

  1. arXiv
    ICLGuard: Controlling In-Context Learning Behavior for Applicability Authorization
    Wai Man Si, Michael Backes, and Yang Zhang
    CoRR, 2024

2023

  1. USENIX
    Two-in-One: A Model Hijacking Attack Against Text Generation Models
    Wai Man Si, Michael Backes, Yang Zhang, and 1 more author
    In USENIX Security, 2023
  2. arXiv
    Mondrian: Prompt Abstraction Attack Against Large Language Models for Cheaper API Pricing
    Wai Man Si, Michael Backes, and Yang Zhang
    CoRR, 2023
  3. arXiv
    Comprehensive Assessment of Toxicity in ChatGPT
    Boyang Zhang, Xinyue Shen, Wai Man Si, and 6 more authors
    CoRR, 2023

2022

  1. CCS
    Why So Toxic?: Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
    Wai Man Si, Michael Backes, Jeremy Blackburn, and 4 more authors
    In CCS, 2022

2021

  1. SIGDIAL
    Telling Stories through Multi-User Dialogue by Modeling Character Relations
    Wai Man Si, Prithviraj Ammanabrolu, and Mark O. Riedl
    In SIGDIAL, 2021

2019

  1. PRICAI
    Boosting Variational Generative Model via Condition Enhancing and Lexical-Editing
    Zhengwei Tao, Wai Si, Juntao Li, and 2 more authors
    In PRICAI, 2019