Record number
9751
Author
معصومه چلنگر
Title
Attacks on LLMs
Degree level
Bachelor's
Field of study
Computer Science
Graduation year
1404
Supervisor
Professor Larijani
Advisor
Professor Larijani
Data entered by (student)
معصومه چلنگريامچي
Date of data entry
1404/06/21
Faculty
Computer Science and Mathematics
Title in English
Privacy Auditing and Attacks in LLMs
Abstract
Introduction
Large Language Models (LLMs) have emerged as powerful tools capable of generating
human-like text across a wide variety of applications. These models, such as GPT-4,
LLaMA, and Mistral, are typically trained on vast datasets that include public, private,
and proprietary textual data. However, their strength in learning intricate language
patterns also makes them susceptible to privacy risks, particularly in scenarios involving
data memorization and unintended information leakage.
As LLMs are deployed more widely in real-world systems, from healthcare and legal
analysis to content generation and customer support, concern grows about the
confidentiality of the data they are trained on. One particularly troubling scenario
is the unintentional regurgitation of training data in response to adversarial
prompts, which can expose sensitive or private information.
This report presents a unified examination of three key investigations into privacy
risks in LLMs:
• The first study focuses on membership inference attacks (MIAs) conducted on
synthetic data generated by LLMs, showing that even artificial outputs can reveal
information about the original training set.
• The second study expands on MIA techniques by introducing a context-aware
framework (CAMIA) tailored for generative models. It highlights how memorization
is influenced by the context and complexity of textual prefixes.
• The third study investigates the vulnerabilities of watermarking schemes used
to trace AI-generated text, and how novel smoothing attacks can remove such traces
without compromising content quality.
Together, these studies illustrate the multifaceted nature of privacy risks in modern
language models and underscore the urgent need for robust auditing mechanisms and
protective techniques to safeguard user data.
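
To make the idea of a membership inference attack concrete, the following is a minimal sketch of the generic loss-threshold baseline: a text is guessed to be a training member when the model assigns it an unusually low loss. It is not the method of any of the three studies summarized above. The unigram model is only a stand-in for an LLM, and all names (`train_unigram`, `avg_loss`, `infer_membership`), the toy sentences, and the threshold value are assumptions made for illustration.

```python
import math
from collections import Counter


def train_unigram(corpus):
    """Fit an add-one-smoothed unigram model (a toy stand-in for an LLM)."""
    counts = Counter(tok for text in corpus for tok in text.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves probability mass for unseen tokens
    return lambda tok: (counts[tok] + 1) / (total + vocab)


def avg_loss(prob, text):
    """Mean negative log-likelihood per token of `text` under the model."""
    toks = text.split()
    return -sum(math.log(prob(t)) for t in toks) / len(toks)


def infer_membership(prob, text, threshold):
    """Guess 'member' when the model's loss on `text` is below `threshold`."""
    return avg_loss(prob, text) < threshold


# Hypothetical training data containing private-looking records.
train_set = ["alice lives at 12 oak street", "bob has account number 4471"]
model = train_unigram(train_set)

member = "alice lives at 12 oak street"          # was in the training set
non_member = "carol drives a red bicycle daily"  # was not

threshold = 2.8  # hand-picked for this toy data (assumption)

print(infer_membership(model, member, threshold))
print(infer_membership(model, non_member, threshold))
```

In a realistic audit the threshold would be calibrated on held-out data rather than picked by hand, and the loss would come from the target language model itself.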