MS Final Oral Exam: Gautham Suresh
PARROT: Privacy-Focused Agentic Retrieval Augmented Generation for Personal Social Media Analytics
Social media platforms store large amounts of user data, but users currently cannot ask questions about their own history in natural language. Assistants such as Meta AI focus on generating personalized responses and targeting ads, and do not let users explore or query their personal data. Third-party tools require users to upload personal data to cloud services, which compromises user privacy. This creative component presents PARROT, a local-first, privacy-preserving personal AI analytics agent. Built specifically for the private social platform AsMoment, PARROT lets users ask questions about their own social history without exposing raw personal data to remote services. It ingests data into a local property graph database and generates semantic embeddings locally during ingestion. The query interface is built on a ReAct agent that dynamically chooses between structured graph query generation and vector similarity search, creating a hybrid agentic Retrieval Augmented Generation (RAG) pipeline. Traditional RAG pulls text from a vector database and passes it to a Large Language Model (LLM) in a fixed pipeline. PARROT instead reasons about which retrieval method to use, can call tools multiple times, and then forms a final answer. Users can stay fully local by using a local LLM; if they instead connect a remote LLM to the agent, a Personally Identifiable Information (PII) redaction layer is applied. PARROT also has cross-session memory, persisting the user’s context across sessions. The architecture was first validated on Mastodon, then ported to AsMoment. Our results show that privacy-preserving natural language analytics over personal social data is effective, regardless of whether a local or cloud-based LLM is used.
Committee: Simanta Mitra (major professor)