This empirical study investigates systemic security vulnerabilities in GPTs, the customized AI agents built on OpenAI's large language models. The authors analyze the attack surface of four core components (expert prompt, chat history, tools, and knowledge) from a platform-user perspective.
Using a multidimensional attack suite, the research demonstrates significant security flaws, including leakage of proprietary instructions and misuse of integrated tools, via both direct and indirect prompt injection, the latter including knowledge poisoning. Finally, the study proposes and evaluates low-overhead, prompt-level defense mechanisms to mitigate these risks, offering guidance for more secure development of the GPT ecosystem.
Source: https://arxiv.org/pdf/2512.00136