Losing public law values
National security AI systems make legal compliance and accountability harder to enforce
National security AI further complicates the roles of our secrecy surrogates. It doubles down on the same things that we already find problematic in national security generally: the lack of transparency; the difficulty that courts and Congress have in providing effective oversight; the breadth of delegations from Congress to the Executive and the Executive’s broad flexibility in implementing those delegations; and limited requirements for the government to give reasons for its decisions. When national security agencies use AI tools, legal compliance becomes harder because those tools shift power away from agency lawyers and toward engineers. Accountability becomes harder because AI systems create significant ambiguity about who is responsible for algorithmic errors. And it is harder to force the Executive to justify its decisions and give reasons for its choices when the systems it is using do not easily reveal their decisional pathways. Further, at least at first blush, characteristics of AI reduce the potency of some of the checks on the national security Executive, including leaks and interagency tensions.
1. Legality
One of the most salient aspects of a democracy is that government officials, like the citizens they serve, must comply with the law. Executive branch lawyers play a key role in ensuring that this tenet is embedded in government policy and practice. Agency clients—at least those who understand the value of legal advice—consult with their general counsel’s offices. Agencies and the White House seek the views of the Justice Department’s Office of Legal Counsel to obtain definitive interpretations of the U.S. Constitution and statutes. Congress may call agency general counsels to testify about legal policy decisions made by those agencies. And nongovernmental actors often sue the government to enforce its obligation to comply with the law. Executive lawyering plays a particularly important role when it comes to national security operations, where courts and outside litigation play less of a role in shaping government compliance with the law.
(...)
Adding AI to the mix may further weaken the public’s confidence that the Executive is acting consistent with the Constitution and statutes, as well as international law. From an executive branch lawyer’s perspective, it will be hard to trace what data the government’s computer scientists are training national security algorithms on, and even harder to trace what data a defense contractor used to train algorithms it sold to the government. This means it will be difficult to assess whether any of that data was wrongfully obtained (because, for example, the government collected it without a warrant though a warrant was required) or contains information that the government undertook not to collect as a policy matter. Compared to a lawyer’s engagement with a human policymaker, it is harder for the lawyers to know what factors an algorithm is weighing as it formulates predictions, or whether the system has learned to make predictions based on hidden hallucinations. Imagine if the lawyers assessing the legality of the bin Laden raid also had to take into account the fact that DOD and the CIA had used AI tools to detect and target him. A difficult job becomes even harder.
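To make the tracing problem concrete, consider what an auditable training pipeline would have to record before a lawyer could sign off on it. The following is a minimal, hypothetical sketch in Python; the field names and legal categories ("collection_authority", "warrant_id") are invented for illustration and do not reflect any actual government or contractor schema. The point is that this kind of provenance metadata rarely travels with the data, which is precisely why the lawyer's assessment is so difficult.

    # Hypothetical sketch: provenance-aware filtering of training data.
    # Field names and categories are invented for illustration only.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class TrainingRecord:
        content: bytes
        collection_authority: Optional[str]  # documented legal basis for collection, if any
        warrant_id: Optional[str]            # present if a warrant was required and obtained

    def lawfully_usable(record: TrainingRecord) -> bool:
        """A reviewer can only approve a record that documents its legal basis."""
        if record.collection_authority is None:
            return False  # no documented basis: legality cannot be assessed
        if record.collection_authority == "warrant_required" and record.warrant_id is None:
            return False  # collected without the warrant the law required
        return True

    def build_training_set(records: list) -> list:
        # Exclude anything whose provenance cannot be verified.
        return [r for r in records if lawfully_usable(r)]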
In addition, it may be hard for a lawyer to know whether it is even permissible for the government to use an AI/ML system in a given setting, as where a treaty or statute is written in a way that envisions a human making a particular judgment. The breadth of many national security laws (for example, Congress’s delegation to the President to use “necessary and appropriate force” in its recent authorizations for use of military force) will make it challenging to build algorithms whose recommendations accurately capture congressional and presidential intent. And making representations to actors in the other branches of government about whether an operation complies with international and domestic law will be thorny. In a non-AI/ML case, the Office of the Director of National Intelligence misunderstood how National Security Agency surveillance tools operated and misrepresented the process to the Foreign Intelligence Surveillance Court. Add neural nets to the mix and it is easy to see how that kind of legal error will become common.
Finally, as AI/ML becomes pervasive in the mine run of U.S. national security operations, the power to shape the operations will shift toward computer scientists and engineers and away from lawyers and policymakers, even if those engineers are not trained in law or policy. It is the engineers who—expressly or implicitly—embed values into the algorithm in deciding what initial weights to use and what outcomes to prioritize. The relative lack of sophistication about technology among certain U.S. government agencies and officials (including lawyers), as well as foreign officials, will decrease their ability to test the values, including legal rules, that undergird these AI/ML applications. As Chapter 4 argues, military and intelligence operators, ethicists, and lawyers should demand to work closely with the engineers who are crafting and testing these systems.
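To see how a value judgment can be embedded without anyone writing it down as policy, consider a single tuning choice in a hypothetical threat-classification model. The sketch below uses scikit-learn's class weighting on toy data; the numbers are invented for illustration. An engineer who decides that missing a true threat is ten times worse than wrongly flagging an innocent person has made a legal and ethical judgment, yet in the code it appears only as a parameter.

    # Hypothetical sketch: tuning choices that quietly encode value judgments.
    # The weights, threshold, and data below are invented for illustration only.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Toy training data: features and labels (1 = "threat", 0 = "not a threat").
    X = np.random.rand(200, 5)
    y = np.random.randint(0, 2, size=200)

    # The engineer's choice: treat a missed threat as 10x worse than a false alarm.
    # Nothing in a statute or rule of engagement dictates this number.
    model = LogisticRegression(class_weight={0: 1, 1: 10})
    model.fit(X, y)

    # A second embedded judgment: how confident must the model be before a
    # person is flagged for further action?
    FLAG_THRESHOLD = 0.3  # invented value; lowering it trades more false flags for fewer misses
    flagged = model.predict_proba(X)[:, 1] >= FLAG_THRESHOLD

Neither choice is visible to a reviewer who sees only the system's outputs, which is why the lawyers and ethicists need to be in the room when such parameters are set.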
2. Competence
In addition to ensuring that U.S. national security operations—military targeting, electronic and human surveillance, counterintelligence, covert action—comply with the relevant legal frameworks, we want those operations to be developed and executed competently and efficiently. We do not want our officials to misuse the public fisc, behave self-servingly or recklessly, or double down on fruitless tools or policies. Because secrecy reduces the number of people involved in developing a policy, it can decrease the quality of the policy ex ante and help officials conceal missteps ex post. To pick just one example, if a broader set of interagency policymakers and members of Congress had known the truth about the Gulf of Tonkin incident (in which the White House falsely claimed that North Vietnam had attacked two U.S. destroyers), Congress likely would not have given President Johnson such a capacious and open-ended authorization to use force in Vietnam.
Introducing AI into this secrecy ecosystem has the potential to further compromise competent and efficient execution of U.S. national security policies. As I wrote previously about the use of AI by the Defense Department:
The military use of predictive algorithms and machine learning tools seems certain to replicate and even exacerbate, at least for the casual observer, many of the critiques that the military has faced over the past fifteen years: a lack of transparency, a willingness to adopt aggressive interpretations of the law, a concern that the military makes detention and targeting decisions based on flawed data, and a perceived dehumanization of lethal action (in the form of drone strikes and increasingly automated decision-making).
One of the most important roles for Congress—the most traditional overseer—is to help ensure that the Executive is performing competently and efficiently. And yet, at least before mid-2023, Congress as a body was woefully undereducated about AI. Chapter 1 discussed why Congress often has difficulty gaining access to secret programs. Even when the relevant congressional committees do gain access to highly classified programs, Congress has had persistent challenges in understanding technologically complex programs and overseeing the Executive’s use of them. This leaves the Executive undersupervised regarding the quality and efficacy of its NSAI tools.
(…)
Like members of Congress, executive officials themselves may have a harder time understanding when a given NSAI tool is fit for purpose. Consider the traditional national security activities that officials may generally encounter, such as covert action, foreign electronic surveillance, and military operations. Depending on their specific portfolios, a wide range of officials at different levels of seniority are likely to have encountered those operations repeatedly and will readily understand how they are developed, approved, and executed. Now consider national security operations involving AI tools. The staff of the National Security Commission on AI, a congressionally created group of experts that produced several reports, “interviewed numerous government officials from different departments and at different levels of seniority who will freely admit they do not understand basic AI concepts.” The Commission continued, “[F]or the government to successfully adopt AI, many if not most end users will need to gain a baseline understanding of AI’s limitations, data management requirements, and ethical use.” Absent that understanding, officials may not understand the strategic risks of using certain AI/ML tools, even as they face pressure to approve them to win the AI race.
This means that it may be harder for senior national security officials to lead, rather than follow, the technology. Richard Danzig notes that “senior officials are called to make decisions about, and on the basis of, technologies that did not exist at the time of their education and earlier experience. . . . Very few have the time, talent and taste to update their understandings, [and as a result] most do not.” As a result, we may face a bottom-up approach to adopting these tools, with the attendant risk that the more junior officials and computer scientists developing them and proposing their use lack a full perspective on foreign policy risks. Further, the fact that an even smaller group of national security officials than usual is capable of assessing the quality of the AI systems could lead to “groupthink” about the wisdom of deploying the tools. Insiders may misjudge public sentiment as they weigh what policies the public will support. Finally, because the systems are so complicated, government officials may come to understand their flaws only after the fact.
3. Accountability
Because we have not yet determined how to assign responsibility for errors made by AI systems, we will soon face a “double unaccountability” problem in the national security realm. We already have limited access to information about executive acts, which makes it difficult to trace when errors have happened and which person, office, or agency is responsible for that error. (Take a U.S. drone strike that produces civilian deaths: high levels of classification make it hard for observers to know whether the Defense Department or the CIA conducted a particular strike.) Court cases arise so infrequently in this setting that judges are rarely positioned to make definitive judgments that assign accountability. And even when they are in that position, their lack of familiarity with technology and their deference to the Executive means they rarely will. The difficulty of detecting who is responsible for significant algorithmic errors compounds this challenge. The chain of AI development, training, and learning creates a range of possible actors to hold responsible. The approach that Volvo has adopted for its self-driving cars is to accept responsibility up front for any accidents that result from the use of those cars—but that company is one of only a few actors taking that approach, at least to date. There is significant disagreement in the literature about how to manage the AI accountability dilemma.
Further complicating how we think about accountability is the fact that the private sector will be responsible for some AI development. Although not a problem unique to AI, the use of private companies and contractors adds another layer of actors to the product development chain. There have been reports that some of the Defense Department’s contractors, which have worked on AI/ML programs such as Project Maven, are building technology that is not “revolutionary or innovative” but have successfully put their technology “in the language of military-speak” to persuade the Pentagon to buy it. One article notes, “[E]ven the seemingly underdeveloped products pose ethical concerns and could lead to unproven technologies in the hands of government officials with major potential for misuse.” Many officials in those companies are less attuned to compliance with the laws of armed conflict than the U.S. military is. Further, various U.S. laws such as FOIA do not apply to contractors. Nor does a range of treaties. And the public generally lacks information about the contents of the government’s contracts. All of this makes it harder for the public—and perhaps even the military—to identify and hold responsible those who produce poor products.
Excerpted from The Double Black Box published by Oxford University Press ©2025