Published December 28, 2023 | Version v1
Publication

Incremental AI Risks from Proxy-Simulations

Description

Numerical simulations are versatile predictive tools that permit exploration of complex systems. The ability of LLM agents to run simulations of real-world scenarios will expand the AI risk landscape. In the proxy-simulation threat model, a user (or a deceptively aligned AI) can obfuscate the goal behind simulation-based predictions by exploiting the generalizability of simulation tools. Three highly idealized proxy-simulation examples are presented that illustrate how damage, casualties, and the concealment of illegal activities can be planned for under obfuscation. This approach bypasses existing alignment and safety filters (GPT-4, Claude 2, and Llama 2). AI-enabled simulations facilitate access to prediction-based planning that is not otherwise readily available. To the extent that goal obfuscation is possible, this increases AI risk.

Additional details

Identifiers

URL
https://hal.science/hal-04365629
URN
urn:oai:HAL:hal-04365629v1

Origin repository

UNICA