If left to their own devices, some large language models (LLMs) have a tendency to become bellicose in wargaming scenarios and, in rare cases, to seek annihilation: in several instances, the LLMs suddenly unleashed nuclear weapons upon their virtual adversaries.
That’s the summary of a research paper titled “Escalation Risks from Language Models in Military and Diplomatic Decision-Making.” The report, co-authored by five computer science scholars and AI researchers, delves into the hypothetical, the what-ifs of LLM decision-making when presented with a set of conditions to guide foreign diplomacy and defense strategy. The paper is currently awaiting peer review.
Jacquelyn Schneider, a Hoover Fellow at the Hoover Institution on War, Revolution and Peace, and the director of the Hoover Wargaming and Crisis Simulation Initiative, brought together the team that evaluated five LLMs in a game of international diplomacy.
The Institution, a public policy think tank, is based at Stanford University and was founded by Herbert Hoover in 1919, a decade before he became the 31st U.S. President. His 1959 statement to the University’s Board of Trustees has defined the Institution’s mission: “The overall mission of this Institution is, from its records, to recall the voice of experience against the making of war, and by the study of these records and their publication, to recall man’s endeavors to make and preserve peace, and to sustain for America the safeguards of the American way of life.”
That team included Juan-Pablo Rivera, who is working towards a master’s degree in computational analytics at the Georgia Institute of Technology, and Chandler Smith, a research scholar with ML Alignment & Theory Scholars, a scientific and educational seminar and independent research program based in Berkeley, CA. In addition to Ms. Schneider, there are three other team members linked to Stanford University: Gabriel Mukobi, working towards a master’s degree in AI alignment; Anka Reuel, a computer science Ph.D. candidate; and Max Lamparth, a postdoctoral fellow researching emergent capabilities of LLMs.
In the simulations, eight “autonomous nation agents” with different military capabilities and histories engaged in international relations with each other. Each nation was embedded with the same LLM to serve as its leader. After each run, a different LLM was used for all eight agents, until all five LLMs (GPT-4, GPT-3.5, Claude 2.0, Llama-2-Chat, and GPT-4-Base) had been used.
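The paper’s own simulation harness is not reproduced here, but the setup described above (eight agents per run, a single shared model, and a fresh run for each of the five LLMs) can be sketched roughly in Python. Everything below is illustrative rather than drawn from the study: the nation names, the turn count, and the query_llm placeholder, which stands in for a real model API call, are assumptions.

```python
# Illustrative sketch (not the authors' code): one simulation "run" as described
# in the article -- eight nation agents, all driven by the same LLM, with the
# whole experiment repeated once per model.

MODELS = ["GPT-4", "GPT-3.5", "Claude 2.0", "Llama-2-Chat", "GPT-4-Base"]
NATIONS = [f"Nation {chr(ord('A') + i)}" for i in range(8)]  # hypothetical agent names

def query_llm(model: str, nation: str, world_state: list[str]) -> str:
    """Placeholder for a call to the chosen model's API; returns the nation's next action."""
    return f"{nation} holds position"  # stub so the sketch runs without API access

def run_simulation(model: str, turns: int = 10) -> list[str]:
    """Play out one run: every agent acts each turn, using the same underlying LLM."""
    world_state: list[str] = []          # shared history of actions visible to every agent
    for _ in range(turns):
        for nation in NATIONS:
            action = query_llm(model, nation, world_state)
            world_state.append(action)
    return world_state

if __name__ == "__main__":
    for model in MODELS:                 # a separate run for each of the five models
        history = run_simulation(model)
        print(model, "->", len(history), "actions logged")
```

In the actual study, the agents chose among predefined actions rather than free text, and the authors also logged the models’ chain-of-thought reasoning behind those choices, as quoted below.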
“All models show signs of sudden and hard-to-predict escalations,” the authors said. Sometimes those escalations occurred even when there was no apparent conflict between the agents involved.
“These findings are in line with previous work on non-LLM-based, computer-assisted wargaming, where Emery (2021) finds that computer models did escalate more than human actors. We further observe that models tend to develop arms-race dynamics between each other, leading to increasing military and nuclear armament, and in rare cases, to the choice to deploy nuclear weapons. Qualitatively, we also collect the models’ chain-of-thought reasoning for choosing actions and observe worrying justifications for violent escalatory actions. We assert that much more analysis is needed to better understand when and why LLMs may escalate conflicts before deploying these models in high-stakes real-world settings to avoid unintended consequences, security risks, or even catastrophic failures.”
The authors added that even if humans are the final decision-makers and have oversight of AI functions when supplied with LLM strategic advice, human decision-makers could become “increasingly reliant on the counsel offered by autonomous agents.”
Another research project that evaluated the possible effect of AI on defense strategy and weaponry, “An Overview of Catastrophic AI Risks,” stated that we are at a pivotal moment. “As AIs gain influence over traditional military weaponry and increasingly take on command and control functions, humanity faces a paradigm shift in warfare,” wrote co-authors Dan Hendrycks, a machine learning researcher and the director of the Center for AI Safety, Mantas Mazeika, a computer science Ph.D. student at the University of Illinois Urbana-Champaign, and Thomas Woodside, a junior fellow at the Center for Security and Emerging Technology at Yale University.
Lethal Autonomous Weapons, or LAWs, according to their report, “offer potential improvements in decision-making speed and precision. Warfare, however, is a high-stakes, safety-critical domain for AIs with significant moral and practical concerns. Though their existence is not necessarily a catastrophe in itself, LAWs may serve as an on-ramp to catastrophes stemming from malicious use, accidents, loss of control, or an increased likelihood of war.”
The U.S. Department of Defense has released numerous statements over the past two years regarding the integration of AI into its operations, and has spent billions of dollars on that integration process.
Last summer, Deputy Secretary of Defense Kathleen H. Hicks sent out a memorandum to senior Pentagon leadership, announcing the establishment of the “Chief Digital and Artificial Intelligence Officer Generative Artificial Intelligence and Large Language Models Task Force, Task Force Lima.”
“Generative artificial intelligence capabilities, such as large language models, are growing in popularity, capability and impact around the globe,” she said. “These capabilities are trained on massive datasets in order to generate content at a level of detail and apparent coherence that would have previously required human authorship. These capabilities unlock new opportunities, just as they pose significant risks. The DoD faces an imperative to explore the use of this technology and the potential of these models’ scale, speed, and interactive capabilities to improve the Department’s mission effectiveness while simultaneously identifying proper protection measures and mitigating a variety of related risks.”
In December, Air Force Secretary Frank Kendall said the Air Force and Space Force are “fully committed – and pushing hard – to develop and deploy artificial intelligence as a key element in meeting security challenges posed by China and other adversaries.”
He added, “The critical parameter on the battlefield is time. The AI will be able to do much more complicated things much more accurately and much faster than human beings can. If the human is in the loop, you will lose. You can have human supervision and watch over what the AI is doing, but if you try to intervene you are going to lose. The difference in how long it takes a person to do something and how long it takes the AI to do something is the key difference.”
On January 9, 2024, a senior Pentagon policy official announced that the U.S. Department of Defense is “increasing AI capacity through strategy, alignment.”
“If you imagine, essentially, a continuum of activities from science and technology investments all the way to fielding capabilities, this administration within the Department of Defense has launched new initiatives at each place, essentially, in the continuum,” said Michael C. Horowitz, Deputy Assistant Secretary of Defense for Force Development and Emerging Capabilities. He said that the creation of a Chief Digital and Artificial Intelligence Office, and recent strategy updates aimed at aligning AI adoption with broader defense strategy, are part of the increase.
The U.S. is among 51 “Endorsing States” (as of Jan. 12, 2024) that have pledged to abide by the “Political Declaration on Responsible Military Use of Artificial Intelligence and Autonomy,” a set of non-binding guidelines addressing the use of AI and autonomy capabilities in the military domain. Other endorsers include France, Germany, Spain, Japan, Turkey and the United Kingdom.
The document contains 16 measures, one of which states: “States should implement appropriate safeguards to mitigate risks of failures in military AI capabilities, such as the ability to detect and avoid unintended consequences and the ability to respond, for example by disengaging or deactivating deployed systems, when such systems demonstrate unintended behavior.”