May 27, 2025

Federal agencies urge organizations to fortify AI data security across system lifecycles

Editor's Note

AI systems are only as secure and reliable as the data that powers them. That’s the central message of a guidance sheet jointly issued May 22 by the NSA, CISA, FBI, and cybersecurity agencies from Australia, New Zealand, and the UK. The document outlines best practices for securing data used to train and operate AI systems—emphasizing that protecting data at every stage of the AI lifecycle is essential for maintaining the accuracy, trustworthiness, and integrity of AI outcomes.

According to the guidance, the entire AI system lifecycle—from planning and design through deployment and monitoring—faces risks such as data poisoning, supply chain compromise, and data drift. If attackers manipulate training data or inject false information, they can alter an AI model’s behavior or degrade its accuracy. Data drift poses a subtler threat: as real-world inputs gradually diverge from the data a model was trained on, the system’s performance erodes unless the model is regularly monitored and updated.
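
Drift monitoring of this kind is typically automated. As an illustrative sketch only (the guidance does not prescribe a particular metric), the Python snippet below uses the population stability index, a common statistic for flagging when a feature's live distribution has shifted away from its training-time baseline:

    import numpy as np

    def population_stability_index(reference, current, bins=10):
        """Compare a feature's current distribution to its training-time
        reference. PSI above roughly 0.25 is commonly read as significant drift."""
        # Derive bin edges from the reference sample so both samples share a
        # grid, then widen the outer edges to catch out-of-range values.
        edges = np.histogram_bin_edges(reference, bins=bins)
        edges[0], edges[-1] = -np.inf, np.inf
        ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
        cur_pct = np.histogram(current, bins=edges)[0] / len(current)
        # Clip to avoid division by zero and log(0) in sparsely populated bins.
        ref_pct = np.clip(ref_pct, 1e-6, None)
        cur_pct = np.clip(cur_pct, 1e-6, None)
        return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

    rng = np.random.default_rng(0)
    train_feature = rng.normal(0.0, 1.0, 10_000)  # distribution at training time
    live_feature = rng.normal(0.6, 1.2, 10_000)   # shifted distribution in production
    psi = population_stability_index(train_feature, live_feature)
    if psi > 0.25:
        print(f"PSI={psi:.3f}: significant drift; consider retraining")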

To counter these risks, the agencies recommend measures like verifying data provenance, using encryption and digital signatures, establishing secure access controls, employing trusted infrastructure, and implementing data-quality monitoring. Organizations should also audit and sanitize training data, apply anomaly detection, and leverage techniques like federated learning, differential privacy, and data masking to protect sensitive content.
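
As a concrete illustration of the provenance and signing recommendations, here is a minimal Python sketch, assuming the third-party cryptography package and a hypothetical shard file name, in which a data producer signs a dataset's SHA-256 digest and a consumer verifies it before training:

    import hashlib
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def sha256_file(path, chunk=1 << 20):
        # Stream the file so arbitrarily large shards fit in constant memory.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            while block := f.read(chunk):
                h.update(block)
        return h.digest()

    # Producer side: sign the dataset digest with a private key kept offline.
    private_key = Ed25519PrivateKey.generate()
    digest = sha256_file("train_shard_0001.parquet")  # hypothetical shard name
    signature = private_key.sign(digest)

    # Consumer side: recompute the digest and verify it against the producer's
    # published public key; reject the shard if either check fails.
    public_key = private_key.public_key()
    try:
        public_key.verify(signature, sha256_file("train_shard_0001.parquet"))
        print("dataset digest verified; provenance intact")
    except InvalidSignature:
        print("reject dataset: signature check failed")

In practice the public key would be distributed out of band, so that an attacker who swaps the dataset cannot also swap in a matching signature.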

The document highlights that datasets sourced from the open web, whether curated collections or raw crawls, are particularly vulnerable to manipulation. Tactics such as split-view poisoning (changing the content at an indexed URL after it has been catalogued) and frontrunning poisoning (timing malicious edits to land just before a scheduled dataset snapshot) can introduce malicious content at minimal effort and cost. Data consumers are therefore advised to verify raw data against published hashes, require model certification, and adopt content credentials that attest to the origin and edit history of AI training materials.
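
A hash manifest is the usual countermeasure to split-view poisoning: the curator publishes a digest for each indexed document, and the downloader discards anything that no longer matches. The Python sketch below illustrates the idea; the manifest URL and digest are hypothetical placeholders:

    import hashlib
    import urllib.request

    # Hypothetical manifest mapping each indexed URL to the SHA-256 digest the
    # curator recorded at indexing time. Content changed after indexing (the
    # split-view attack) fails the comparison and is excluded from training.
    MANIFEST = {
        "https://example.org/corpus/doc-001.txt":
            "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
    }

    def fetch_if_unchanged(url, expected_hex, timeout=30):
        data = urllib.request.urlopen(url, timeout=timeout).read()
        if hashlib.sha256(data).hexdigest() != expected_hex:
            raise ValueError(f"hash mismatch for {url}; excluding from training set")
        return data

    for url, expected in MANIFEST.items():
        try:
            doc = fetch_if_unchanged(url, expected)
        except (ValueError, OSError) as err:  # URLError subclasses OSError
            print(err)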

Throughout, the guidance stresses the need for continuous data risk assessments and metadata validation. It also urges proactive defense against statistical bias and data duplication, which can distort AI model outputs. Ultimately, the agencies call for a “best-effort” commitment from model developers and users alike to implement strong safeguards and routinely validate AI data security, particularly in high-stakes applications such as healthcare.
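
Duplicate records are among the easier distortions to screen for mechanically. As an illustrative sketch (not a procedure specified in the guidance), the following Python normalizes each record before hashing, so trivially re-encoded copies of the same text collapse to a single entry:

    import hashlib
    import re

    def normalized_fingerprint(text):
        # Case-fold and collapse whitespace so cosmetic variants of the
        # same record produce the same fingerprint.
        canon = re.sub(r"\s+", " ", text.strip().lower())
        return hashlib.sha256(canon.encode("utf-8")).hexdigest()

    def deduplicate(records):
        seen, unique = set(), []
        for rec in records:
            fp = normalized_fingerprint(rec)
            if fp not in seen:
                seen.add(fp)
                unique.append(rec)
        return unique

    corpus = [
        "Model cards describe intended use.",
        "model cards  describe intended use.",  # duplicate after normalization
        "Data sheets document provenance.",
    ]
    print(deduplicate(corpus))  # the re-encoded copy is dropped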

