
MathyAIwithMike
Delve into the unsettling reality of AI models leaking sensitive training data through semantic memorization. Discover how "Semantic Spies" exploit chat templates to extract the semantic content of training data, bypassing traditional string-matching detection. Learn about the innovative "embedding-first" approach, which compares model outputs to training data in embedding space rather than as literal strings, uncovering hidden data leakage even in RL-trained models. Understand the implications for intellectual property and AI safety, and why a new approach to model security is crucial for developers and users of open models. This is a game-changer in AI security!
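
To make the contrast concrete, here is a minimal, illustrative sketch of an embedding-first leakage check in Python. It flags a model generation as a potential leak when its embedding is highly similar to a training candidate, even when exact string matching finds nothing. The embedding model, helper name, and similarity threshold are assumptions for illustration, not the actual method discussed in the video.

```python
# Minimal sketch of an "embedding-first" leakage check (illustrative only).
# Instead of exact string matching, compare a model's output to training
# candidates in embedding space and flag high semantic similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed embedding model; the video's actual setup may differ.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def flags_semantic_leak(generation: str, training_candidates: list[str],
                        threshold: float = 0.85) -> bool:
    """Return True if a generation is semantically close to any training
    candidate, even when no exact substring matches."""
    if not training_candidates:
        return False
    # Traditional test: exact string matching catches only verbatim leaks.
    if any(c in generation for c in training_candidates):
        return True
    # Embedding-first test: with normalized vectors, the dot product
    # equals cosine similarity, so the threshold is scale-independent.
    vecs = embedder.encode([generation] + training_candidates,
                           normalize_embeddings=True)
    gen_vec, cand_vecs = vecs[0], vecs[1:]
    return bool(np.max(cand_vecs @ gen_vec) >= threshold)

# Example: a paraphrased leak that string matching would miss.
train = ["The launch code for Project Aurora is 7-4-1-9."]
output = "Project Aurora's launch sequence uses the digits 7, 4, 1, 9."
print(flags_semantic_leak(output, train))
```

The 0.85 threshold is an arbitrary placeholder; in practice it would be tuned against a baseline of non-memorized generations to control false positives.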