Background: Atomic Canyon’s first collaboration with the Department of Energy came in March 2024, when the company partnered with Oak Ridge National Laboratory to develop AI tools for the nuclear sector. Then last November, Pacific Gas & Electric deployed Atomic Canyon’s AI at Diablo Canyon, an industry first.
Since then, the ties between the nuclear and AI industries have only grown stronger, as evidenced by the AI x Nuclear Executive Summit, cohosted this July by INL and ORNL. There, Atomic Canyon and ORNL signed a memorandum of understanding to expand the scope of their existing collaboration to streamline the licensing process for nuclear power plants.
Atomic Canyon isn’t the only developer in this space. Earlier this week, AI start-up Nuclearn raised $10.5 million in Series A funding to advance the development of its AI platform for nuclear plant operations, further highlighting the need for benchmarking in this emerging niche.
More details: Atomic Canyon explained that this new benchmark suite will specifically allow for the evaluation of AI systems that access public nuclear documentation through retrieval-augmented generation (RAG) techniques. The company also said that, more broadly, it will “establish critical standards for AI adoption in nuclear facility operations, addressing an industry need for objective evaluation methods for generative AI systems in nuclear environments.”
Existing AI benchmarks are not sufficient for the nuclear space because of the sector’s unique requirements in safety, regulation, and security, Atomic Canyon said. AI systems in the nuclear sector will also work with a combination of public documentation and a wide range of varying, facility-specific data. Considering that Atomic Canyon is currently in talks with utilities that represent “40% to 50% of the U.S. nuclear fleet,” it will be critical to ensure that AI remains reliable from plant to plant.
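As a rough illustration of what evaluating a RAG system against document-grounded tasks involves, the sketch below retrieves a document for each benchmark question and checks whether the expected answer appears in it. Everything here is invented for illustration: the keyword-overlap retriever, the scoring rule, and the sample data are stand-ins, not Atomic Canyon's actual benchmark; a real suite would use embedding-based retrieval and far more nuanced grading.

```python
# Toy sketch of a RAG benchmark harness (illustrative only, not an
# actual nuclear-industry benchmark): retrieve the most relevant
# document for each question, then score whether the expected answer
# text appears in the retrieved document.

def tokenize(text: str) -> set[str]:
    """Lowercase, whitespace-split token set (toy tokenizer)."""
    return set(text.lower().split())

def retrieve(question: str, documents: list[str]) -> str:
    """Return the document with the greatest keyword overlap (toy retriever)."""
    return max(documents, key=lambda d: len(tokenize(question) & tokenize(d)))

def evaluate(tasks: list[dict], documents: list[str]) -> float:
    """Fraction of tasks whose expected answer appears in the retrieved doc."""
    hits = 0
    for task in tasks:
        doc = retrieve(task["question"], documents)
        if task["answer"].lower() in doc.lower():
            hits += 1
    return hits / len(tasks)

# Invented sample data standing in for public nuclear documentation.
docs = [
    "The reactor coolant system transfers heat from the core to steam generators.",
    "License renewal applications are reviewed under 10 CFR Part 54.",
]
tasks = [
    {"question": "Which rule governs license renewal applications?",
     "answer": "10 CFR Part 54"},
]
print(evaluate(tasks, docs))  # 1.0
```

Releasing the datasets, task definitions, and scoring code together, as Atomic Canyon plans to, is what lets different vendors' systems be compared on equal footing.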
Open source: In an interview with Latitude Media, Atomic Canyon CEO Trey Lauderdale said, “We believe it’s our responsibility as the lead in generative AI and nuclear to create the first benchmark for the industry in how these large language models fare when they’re used.”
To fulfill that perceived responsibility, Atomic Canyon plans to release all data, task definitions, evaluation criteria, and evaluation software under permissive open-source licenses. The company set a six-month timeline to curate datasets, define benchmark tasks and evaluation metrics, and produce a technical summary.