The Intersection of Embodied AI and LLMs: Unveiling New Security Threats

As artificial intelligence continues to advance, embodied AI—systems like autonomous vehicles and household robots that operate in the physical world—are increasingly adopting large language models (LLMs) to improve decision-making and reasoning capabilities. However, the integration of LLMs into these real-world systems brings with it significant security risks.

A new framework, BALD (Backdoor Attacks against LLM-based Decision-making systems), provides a comprehensive evaluation of potential attack vectors within these systems.

The BALD framework on LLM-based decision-making systems.

LLMs, known for their ability to generalize across various tasks through vast amounts of common knowledge, are often fine-tuned for specific applications in embodied AI. While this customization enhances performance, it also introduces new attack surfaces.

The BALD framework identifies three primary attack mechanisms:

word injection, where attackers embed trigger words in input prompts;

scenario manipulation, which involves altering the physical environment to influence decisions; and

knowledge injection, where malicious information is subtly introduced into the AI’s knowledge base.

Experiments using models like GPT-3.5, LLaMA2, and PaLM2, in tasks such as autonomous driving and home robotics, revealed the high success rates of these attacks. Some, like word injection and knowledge injection, achieved nearly 100% success, while scenario manipulation demonstrated effectiveness up to 90%.

These attacks can lead to dangerous outcomes, such as autonomous vehicles accelerating toward obstacles or robots performing hazardous actions like placing a knife on a bed.

One of the key insights from this research is that LLMs are highly vulnerable during the fine-tuning stage. Attacks can be stealthy, with minimal impact on benign performance, making them difficult to detect.

Additionally, current defense mechanisms—such as outlier word detection—struggle to counter scenario-based attacks, highlighting the urgent need for improved security measures in LLM-based embodied AI systems.

As embodied AI continues to evolve, ensuring the security of these systems is paramount. The BALD framework's findings underscore the risks posed by backdoor attacks, urging researchers and developers to adopt more robust defenses to protect against these vulnerabilities.

article

Jiao, R., Xie, S., Yue, J., Sato, T., Wang, L., Wang, Y., ...Zhu, Q. (2024). Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-based Decision-Making Systems. arXiv, 2405.20774. Retrieved from https://arxiv.org/abs/2405.20774v2

Subscribe to my Newsletter