Debevoise & Plimpton Discusses How to Protect AI Models and Data

One of the most difficult challenges for cybersecurity professionals is the increasing complexity of corporate systems. Mergers, vendor integrations, new software tools and remote work all expand the footprint of companies’ information systems, creating a larger attack surface for hackers. The adoption of artificial intelligence presents additional and, in some ways, unique cybersecurity challenges for protecting the AI models themselves as well as the sensitive data that is used to train and operate the AI systems.

On August 31, 2022, in recognition of these growing challenges, the UK National Cyber Security Centre (“NCSC”) released its Principles for the Security of Machine Learning, which are designed to help companies protect AI systems from exploitation and include the following recommendations (a brief sketch illustrating the data-provenance principles follows the list):

  • Limit access to the models.
  • Improve logging and audit capability.
  • Make sure data comes from trusted sources.
  • Secure the supply chain.
  • Secure the infrastructure.
  • Track assets.
  • Balance transparency and security.
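To make the “trusted sources” and “track assets” principles concrete, the following is a minimal sketch of a provenance check that verifies training files against a manifest of approved SHA-256 digests before they are used. The file names, paths, and manifest format are hypothetical, and this is an illustration of the idea rather than a complete control.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical manifest mapping each approved training file to the SHA-256
# digest recorded when the data set was originally vetted.
MANIFEST_PATH = Path("approved_training_data_manifest.json")
DATA_DIR = Path("training_data")


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_training_data() -> list[str]:
    """Return a list of files that are missing, altered, or untracked."""
    manifest = json.loads(MANIFEST_PATH.read_text())
    problems = []
    for name, expected in manifest.items():
        path = DATA_DIR / name
        if not path.exists():
            problems.append(f"missing: {name}")
        elif sha256_of(path) != expected:
            problems.append(f"altered: {name}")
    for path in DATA_DIR.iterdir():
        if path.name not in manifest:
            problems.append(f"untracked: {path.name}")
    return problems


if __name__ == "__main__":
    issues = verify_training_data()
    if issues:
        # Halt the training pipeline and alert the security team.
        raise SystemExit("Training data failed provenance check:\n" + "\n".join(issues))
```

In practice, the manifest itself would be stored, and ideally signed, separately from the data, so that an attacker who can alter the files cannot also alter the expected digests.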

These principles recognize that while traditional cyberattacks generally focus on stealing data or rendering it unavailable, attacks on AI systems are often attempts to interfere with how models function and therefore require additional cybersecurity defenses. In this Debevoise Data Blog post, we examine the growing cybersecurity threats to AI systems and how companies can prepare for and respond to these attacks.

Threats to AI Systems

Some threats to AI systems are familiar. For example, AI models often use large volumes of sensitive personal information for training and operations, and this data must be protected from theft or from encryption through ransomware, which are not new threats. But AI programs also present new challenges because sensitive company data that is normally stored in secure areas of the network is now being copied into less secure data lakes for use by AI developers. In addition, AI vulnerabilities are often harder to detect, and, once found, they can be more difficult to patch than traditional software or systems. Moreover, some AI security threats are entirely new, such as data poisoning, model manipulation and extraction attacks.

            Data Poisoning

This occurs when an attacker corrupts a set of AI training data to cause a model to behave unexpectedly. Examples include the following (a brief screening sketch follows the list):

  • Flooding a chatbot with inappropriate language in an effort to train it to use words that are offensive to customers.
  • Feeding a stock trading algorithm with false news reports about a takeover or corporate scandal.
  • Bombarding a lending model with fake loan applications that present an inaccurate view of the current state of the credit markets.
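One common line of defense against this kind of poisoning is to screen incoming training records against a vetted historical baseline before they reach the training set. Below is a minimal sketch of such a screen for the lending example, assuming numeric application fields; the field names and threshold are hypothetical, and a real pipeline would pair this with source authentication and human review.

```python
import statistics

# Hypothetical numeric fields on an incoming loan application; the fields and
# the z-score limit are illustrative only.
NUMERIC_FIELDS = ["loan_amount", "stated_income", "debt_to_income"]
Z_SCORE_LIMIT = 4.0  # records far outside historic norms are quarantined


def build_baseline(historic_records: list[dict]) -> dict:
    """Compute per-field mean and standard deviation from vetted historic data."""
    baseline = {}
    for field in NUMERIC_FIELDS:
        values = [r[field] for r in historic_records]
        baseline[field] = (statistics.mean(values), statistics.pstdev(values) or 1.0)
    return baseline


def screen_batch(new_records: list[dict], baseline: dict) -> tuple[list[dict], list[dict]]:
    """Split an incoming batch into records accepted for training and records held for review."""
    accepted, quarantined = [], []
    for record in new_records:
        suspicious = any(
            abs(record[field] - mean) / std > Z_SCORE_LIMIT
            for field, (mean, std) in baseline.items()
        )
        (quarantined if suspicious else accepted).append(record)
    return accepted, quarantined
```

A simple statistical screen will not catch every poisoning attempt, but it gives the company a quarantine point, and an audit trail, between untrusted inputs and the model.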

            Model Manipulation

This occurs when the model itself is altered to change its behavior and achieve malicious goals. Examples include the following (a brief integrity-check sketch follows the list):

  • Breaking into internet-connected medical devices to deliver dangerous doses of medication.
  • Manipulating fraud-detection models to allow attackers to engage in fraud without detection, which is also referred to as an evasion attack.
  • Causing a major traffic jam by hacking a taxi app model and sending dozens of cars to converge at the same location simultaneously.
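Because model manipulation changes the deployed artifact itself, one basic safeguard is to verify that the model file being served still matches the version approved for release. Here is a minimal sketch of such an integrity check; the paths and the expected digest are hypothetical placeholders.

```python
import hashlib
from pathlib import Path

# Hypothetical paths; in practice the expected digest would come from a
# protected release record, not from the same host as the model file.
MODEL_PATH = Path("models/fraud_detector_v3.bin")
EXPECTED_SHA256 = "<digest recorded when the model was approved for release>"


def model_is_unmodified(model_path: Path, expected: str) -> bool:
    """Compare the deployed model file's SHA-256 digest to the approved value."""
    digest = hashlib.sha256(model_path.read_bytes()).hexdigest()
    return digest == expected


if not model_is_unmodified(MODEL_PATH, EXPECTED_SHA256):
    # Refuse to serve predictions from a model whose bytes no longer match
    # the approved release, and alert the incident response team.
    raise RuntimeError("Model artifact hash mismatch: possible tampering")
```

The reference digest should be stored, and ideally cryptographically signed, somewhere an attacker who can overwrite the model file cannot also overwrite the expected value.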

            Confidentiality Attacks

These occur when an attacker obtains sensitive training data, or information about the model itself, through queries to the model. This can be done through model extraction or model inversion, in which an attacker probes a model to reconstruct its key nonpublic elements or to recover some of its sensitive training data. Examples include the following (a brief membership-inference sketch follows the list):

  • Targeted inquiries made to an investment firm’s proprietary stock-trading model to extract either training data or key elements of the model itself.
  • Targeted inquiries made to a healthcare model that could reveal which patients suffer from a certain disease or are participating in a certain drug trial (also known as a Membership Inference Attack).
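To illustrate why query access alone can leak sensitive information, here is a minimal sketch of the simplest confidence-threshold form of a membership inference test. It assumes query access to a model that exposes predicted class probabilities through a scikit-learn style predict_proba interface; the threshold is illustrative, and real attacks typically calibrate it using “shadow” models trained on similar data.

```python
# Minimal confidence-threshold membership inference sketch (assumptions noted above).
CONFIDENCE_THRESHOLD = 0.95  # hypothetical cut-off


def likely_in_training_set(model, record, true_label: int) -> bool:
    """Guess whether a record was part of the model's training data.

    Models that overfit tend to be unusually confident on examples they were
    trained on, which is the signal this test exploits. `true_label` is the
    record's integer class index.
    """
    probabilities = model.predict_proba([record])[0]
    return probabilities[true_label] >= CONFIDENCE_THRESHOLD
```

This is one reason the defensive steps below emphasize limiting and logging queries to high-risk models, rather than treating a model as safe simply because its training data is stored securely.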

            Evasion (or AI Input) Attacks

In an evasion (or AI input) attack, the attacker knows enough about the model to craft inputs that circumvent its decision process, manipulating data so that it slips past the model’s classifiers. Examples include the following (a brief perturbation sketch follows the list):

  • Altering traffic signs or road markings to trick semiautonomous vehicles into driving in a dangerous manner by ignoring stop signs or swerving into oncoming traffic.
  • Image-based spam that evades the detection of spam filters because the spam content is embedded within an attached image.
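For readers who want to see how little manipulation can be involved, here is a minimal sketch of the fast gradient sign method (FGSM), a standard technique for crafting evasion inputs against an image classifier. It assumes white-box access to a PyTorch model and a correctly labeled input image with pixel values in the 0-to-1 range; the epsilon value is illustrative.

```python
import torch
import torch.nn.functional as F


def fgsm_example(model, image, label, epsilon=0.03):
    """Return a slightly perturbed copy of `image` intended to change the model's prediction."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Nudge every pixel a small step in the direction that increases the loss.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()
```

The perturbation is typically small enough that the altered image looks unchanged to a person, which is why defenses such as adversarial testing and input monitoring are needed in addition to ordinary input validation.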

Steps That Companies Can Take to Protect AI Systems

How significant these risks are is largely unknown, partly because companies are not required to report these attacks and partly because many of these models lack the access controls and audit trails needed to detect them. But experts generally agree that these risks are growing. Drawing on the UK NCSC principles and emerging best practices, here are some steps that companies with substantial AI programs should consider to better protect their models and big data projects (a brief access-control and logging sketch follows the list):

  • Develop a method for identifying high-risk AI models and data sets.
  • Create an inventory of high-risk AI models and data sets.
  • Limit access to high-risk AI models and data sets, including employing multifactor authentication or additional passwords for certain levels of access.
  • Enhance logging and audit capabilities for high-risk AI models and data sets.
  • Conduct penetration testing for high-risk AI models and data sets.
  • Extend data loss prevention technologies to high-risk AI models and data sets.
  • Explore options for increasing the capability to detect model intrusions and adversarial data.
  • Maintain secure backups of high-risk models and data sets that can be restored in case of compromise.
  • Employ additional encryption or security controls over high-risk AI models and data sets.
  • Update Cybersecurity Incident Response and Business Continuity Plans to specifically address AI incidents, including by:
    • Collecting and reviewing logs from the AI system.
    • Determining who should be informed internally about a potential AI cyber incident.
    • Assessing whether the AI system is currently in use and, if so, the extent of its use, and what manual and automated alternatives exist for carrying out those tasks.
    • Estimating the financial impact of stopping the use of the AI system.
    • Determining whether to discontinue the use of an AI system that may have been compromised.
    • Assessing any applicable contractual breach notification obligations.
    • Scoping the nature and volume of potentially tainted decisions that have been made by the AI system.
    • Deciding which external resources may be of assistance.
    • Assessing risks from potential civil litigation and regulatory action.
  • Testing the AI portion of the updated incident response plan with a tabletop exercise involving a mock AI-related cyberattack.
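As a concrete illustration of the access-limitation and logging steps above, here is a minimal sketch of a wrapper that gates queries to a high-risk model behind an allow-list and writes a structured audit entry for every request. The caller names, the scikit-learn style predict interface, and the logging setup are hypothetical; a production system would use the company’s existing identity and monitoring tooling.

```python
import json
import logging
import time
import uuid

# Hypothetical allow-list of service accounts permitted to query a high-risk model.
AUTHORIZED_CALLERS = {"underwriting-service", "fraud-review-ui"}

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("model_audit")


def audited_predict(model, caller_id: str, features):
    """Check the caller against the allow-list and log every query for later review."""
    request_id = str(uuid.uuid4())
    if caller_id not in AUTHORIZED_CALLERS:
        audit_log.warning(json.dumps(
            {"request_id": request_id, "caller": caller_id, "event": "denied"}))
        raise PermissionError(f"{caller_id} is not authorized to query this model")
    prediction = model.predict([features])[0]
    # Recording each query supports the incident-response steps above, for
    # example scoping potentially tainted decisions after a suspected compromise.
    audit_log.info(json.dumps({
        "request_id": request_id,
        "caller": caller_id,
        "timestamp": time.time(),
        "event": "query",
    }))
    return prediction
```

Comparable controls on the training pipeline and data lake, combined with the inventory and backup steps above, give responders the records they need to scope an incident and restore a known-good model.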

This post comes to us from Debevoise & Plimpton LLP. It is based on the firm’s Data Blog post, “Protecting AI Models and Data – The Latest Cybersecurity Challenge,” dated September 22, 2022, and available here.