Published December 28, 2023
| Version v1
Publication
Incremental AI Risks from Proxy-Simulations
Description
Numerical simulations are versatile predictive tools that permit exploration of complex systems. The ability of LLM agents to simulate real-world scenarios will expand the AI risk landscape. In the proxy-simulation threat model, a user (or a deceptively aligned AI) can obfuscate the goal behind simulation-based predictions by leveraging the generalizability of simulation tools. Three highly idealized proxy-simulation examples are presented that illustrate how damage, casualties, and concealment of illegal activities can be planned under obfuscation. This approach bypasses existing alignment and safety filters (GPT-4, Claude 2, and Llama 2). AI-enabled simulations facilitate access to prediction-based planning that is not otherwise readily available. To the extent that goal obfuscation is possible, this increases AI risk.
Additional details
Identifiers
- URL: https://hal.science/hal-04365629
- URN: urn:oai:HAL:hal-04365629v1
Origin repository
- UNICA