NATIONAL HARBOR, Md.—Now that the Air Force is starting to deploy artificial intelligence operationally, service leaders are grappling with AI’s limitations—not just what it can and cannot do, but the extensive data and technical and human infrastructure it needs to work.
That was the takeaway from industry experts and senior officers at AFA’s Air, Space & Cyber Conference on Sept. 23.
At a recent experiment staged by the Advanced Battle Management System Cross-Functional Team, for example, vendors and Air Force coders used AI to do the work of “match effectors”—deciding which platforms and weapons systems should be used against a particular target and generating Courses of Action, or COAs, to achieve a military objective.
An AI algorithm was able to generate a COA in 10 seconds, compared to 16 minutes for a human, but “they weren’t necessarily completely viable COAs,” said Maj. Gen. Robert Claude, Space Force representative to the ABMS Cross-Functional Team.
“While [the AI] was much more timely and there were more COAs generated,” some did not take all the necessary factors into account, Claude told reporters; one, for instance, proposed using infrared-guided weapons when the weather was cloudy.
“We’re getting faster results and we’re getting more results, but there’s still going to have to be a human in the loop for the foreseeable future to make sure that, yes, it’s a viable COA or no, we need just a little bit more of this to make the COA viable,” he explained.
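To make the idea concrete, here is a minimal sketch of that kind of human-in-the-loop triage, not the ABMS experiment’s actual software: machine-generated courses of action are screened against simple viability rules (such as the weather-versus-seeker mismatch Claude described) before an operator reviews them. The class, field names, and weapon pairings are illustrative assumptions.

```python
# Illustrative sketch only: screen machine-generated COAs against basic
# viability rules before a human operator reviews them.
from dataclasses import dataclass

@dataclass
class CourseOfAction:
    target: str
    weapon: str
    seeker: str  # e.g. "infrared", "radar", "gps"

def viability_issues(coa: CourseOfAction, weather: str) -> list:
    """Return reasons the COA may not be viable (empty list if none found)."""
    issues = []
    # Rule drawn from the article: infrared seekers degrade under cloud cover.
    if coa.seeker == "infrared" and weather in ("cloudy", "overcast"):
        issues.append("infrared seeker selected despite cloud cover")
    return issues

def triage(coas, weather):
    """Split generated COAs into flagged and clean; a human reviews both,
    but flagged COAs carry the reasons the checker found."""
    flagged, clean = [], []
    for coa in coas:
        issues = viability_issues(coa, weather)
        (flagged if issues else clean).append((coa, issues))
    return flagged, clean

if __name__ == "__main__":
    candidates = [
        CourseOfAction("TGT-001", "AGM-65", seeker="infrared"),
        CourseOfAction("TGT-001", "GBU-31", seeker="gps"),
    ]
    flagged, clean = triage(candidates, weather="cloudy")
    for coa, issues in flagged:
        print(f"NEEDS REVIEW: {coa.weapon} -> {coa.target}: {issues}")
    for coa, _ in clean:
        print(f"OK (pending operator approval): {coa.weapon} -> {coa.target}")
```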
The tendency of generative AI to “hallucinate,” or invent answers, is well understood, but other forms of AI have problems too, explained David Ware, a partner at consulting firm McKinsey, who moderated a panel on data and AI for decision superiority.
“Generative AI uniquely has a hallucination problem, but all AI models have problems with accuracy and bias,” he told Air & Space Forces Magazine in a brief interview following the panel.
For instance, if a model meant to do targeting is trained on historic information, its decision-making could be flawed. “It’ll be biased towards that prior information. It’ll still have an accuracy issue that I have to overcome,” Ware said.
Such limitations make it essential to carefully fence off AI systems during deployment, added Ryan Tseng, president, cofounder, and chief strategy officer for Shield AI, a startup that develops AI and autonomous technology for the U.S. military and its allies.
In an aircraft, for example, flight and other critical systems “that don’t necessarily need the creativity and very wide scope thinking of [a Large Language Model], should be segmented off, with the LLMs and very complex autonomy in another part of the system,” Tseng said.
Such a segmented technical architecture “helps keep the AI in a box, so to speak. It’s … bounding the downsides of what can happen,” he said.
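A rough sketch of that segmentation, assuming a design Tseng did not detail: the LLM-based planner lives in its own partition, and its suggestions are treated as advisory, crossing into the flight-critical partition only through a narrow, deterministic gate that enforces hard limits. The command names and limits below are hypothetical.

```python
# Illustrative sketch: a deterministic boundary between a "creative" LLM
# partition and a flight-critical partition. Limits are invented for the example.
ALLOWED_COMMANDS = {"set_heading", "set_altitude"}
LIMITS = {"set_heading": (0.0, 360.0), "set_altitude": (500.0, 45000.0)}

def gate(command: str, value: float) -> bool:
    """Reject anything outside the allowed command set and envelope."""
    if command not in ALLOWED_COMMANDS:
        return False
    lo, hi = LIMITS[command]
    return lo <= value <= hi

def forward_to_flight_systems(command: str, value: float) -> None:
    # Stand-in for the critical partition's real interface.
    print(f"critical partition accepts: {command}={value}")

def handle_llm_suggestion(command: str, value: float) -> None:
    """Everything the LLM proposes is advisory; only gated commands cross over."""
    if gate(command, value):
        forward_to_flight_systems(command, value)
    else:
        print(f"rejected at boundary: {command}={value}")

if __name__ == "__main__":
    handle_llm_suggestion("set_altitude", 25000.0)   # inside the envelope
    handle_llm_suggestion("set_altitude", -100.0)    # outside, rejected
    handle_llm_suggestion("open_weapons_bay", 1.0)   # not an allowed command
```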
Limiting the scope of what is expected from AI, at least initially, is another strategy for dealing with its limitations, said Maj. Gen. Luke Cropsey, the program executive officer for command, control, communications, and battle management (C3BM).
C3BM launched in 2022 to bring roughly 50 programs comprising a future air and space “battle network” under one roof for clearer direction and stronger oversight.
Restricted uses of AI will allow human operators to second-guess the systems, Cropsey said, forming a kind of safety net.
With such narrow use cases, “we can still have a fairly good intuition at an operational level what the results should look like,” and check the work of the AI, Cropsey explained.
“There is, I think, a sweet point between the complexity of the space that you’re operating in, the human mind’s ability to correlate the inputs to the outputs, and the trust that we have, or don’t have, towards the model that’s generating those outputs.”
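One way to read that safety net in practice, again as an assumption rather than anything Cropsey described: in a narrow use case, the model’s output can be compared against a simple baseline an operator can reason about, with large disagreements routed to a human. The task, numbers, and threshold here are hypothetical.

```python
# Illustrative sketch of a "narrow use case" safety net: accept the model's
# answer only when it lands near an operator-intuitive rule of thumb.
def baseline_fuel_estimate(distance_nm: float) -> float:
    """Rule of thumb an operator can sanity-check: ~12 lb of fuel per nm."""
    return 12.0 * distance_nm

def review_model_output(distance_nm: float, model_estimate_lb: float,
                        tolerance: float = 0.25) -> str:
    """Accept the model's estimate if it is within 25% of the baseline;
    otherwise flag it for human review."""
    expected = baseline_fuel_estimate(distance_nm)
    if abs(model_estimate_lb - expected) <= tolerance * expected:
        return "accept"
    return "flag for human review"

if __name__ == "__main__":
    print(review_model_output(500, 6100))   # close to 12 * 500 = 6000 -> accept
    print(review_model_output(500, 14000))  # far off the baseline -> flag
```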
The flaws and failings of AI models aren’t the only limitations Air Force leaders are wrestling with, Cropsey said: “One of my biggest challenges is the underlying infrastructure that actually makes it all work.”
“I have literally independently owned and operated [technology] stacks all over the place,” Cropsey added, explaining that incompatible data formats and other interoperability issues were among his toughest problems. His most important work “is not glamorous at all,” Cropsey told the panel. “It’s just the hard work of figuring out, how do you get the right infrastructure where you need it?”
“Fundamental to the actual execution is the scalability problem,” agreed Maj. Gen. Michele Edmondson, the Air Force’s new Deputy Chief of Staff for warfighter communications and cyber systems.
Infrastructure doesn’t just mean hardware or fiber, explained Ware. “It’s all of the above: hardware and software and everything else that have to be layered on top of each other. … It’s less lacking the bare metal, although that is an issue in certain cases, particularly in those edge AI use cases,” where AI, which requires the highest-powered computer chips, is deployed in front-line environments.
Most of the time, he added, “it’s that they need to have systems that can support all the data and all the tools being in the right place to do the development.” Because of classification issues and interoperability problems between different vendors’ equipment or proprietary data formats, “that’s hard in the classified and vendor environment that exists in the DOD,” he said.
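Much of that unglamorous interoperability work amounts to translating the same kind of record out of incompatible vendor formats into one common schema. The sketch below assumes two invented vendor formats and field names; it is not any actual DOD or vendor interface.

```python
# Illustrative sketch: normalize the same track report from two incompatible
# vendor formats into a single common schema. All field names are invented.
from datetime import datetime, timezone

COMMON_SCHEMA = ("track_id", "lat_deg", "lon_deg", "time_utc")

def from_vendor_a(rec: dict) -> dict:
    # Vendor A: decimal degrees and epoch seconds.
    return {
        "track_id": rec["id"],
        "lat_deg": rec["lat"],
        "lon_deg": rec["lon"],
        "time_utc": datetime.fromtimestamp(rec["epoch_s"], tz=timezone.utc),
    }

def from_vendor_b(rec: dict) -> dict:
    # Vendor B: same data, different names, ISO-8601 timestamp.
    return {
        "track_id": rec["trackNumber"],
        "lat_deg": rec["position"]["latitude"],
        "lon_deg": rec["position"]["longitude"],
        "time_utc": datetime.fromisoformat(rec["timestamp"]),
    }

if __name__ == "__main__":
    a = from_vendor_a({"id": "T-17", "lat": 38.78, "lon": -76.98,
                       "epoch_s": 1695456000})
    b = from_vendor_b({"trackNumber": "T-17",
                       "position": {"latitude": 38.78, "longitude": -76.98},
                       "timestamp": "2023-09-23T08:00:00+00:00"})
    assert set(a) == set(b) == set(COMMON_SCHEMA)
    print(a, b, sep="\n")
```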
Personnel numbers are also an issue, said Edmondson. “We need more people with the right skill sets … We’ve stood up a new data analytics career field, which I think is great, but that is a very small core subset of human beings. We need more people that can help us.”
The Air Force, she argued, needs to do more to take advantage of the native skills of young Airmen now joining up who are “digitally literate from the time they were born. We have got to capitalize on that, and we have got to upskill them throughout their career, so that we continue to build on the skills they will bring with them.”
Data is another chokepoint, especially at the edge, said Edmondson. “We really struggle, I think, with data integrity and being able to integrate more data. We’ve got to be able to share and aggregate data.”