AI contracts often have gaps in the personal data definition. Here is what those gaps look like and what to do about them.

PART 1: Video Lesson Identifying 2 Critical Problems in Personal Data Provisions

In a recent webinar, Arohi Kashyap and Shannon Yavorsky reviewed personal data and privacy issues in AI contracts, including the definition of personal data. To learn more about critical shortcomings in some of these provisions, watch this 12-minute video and download our one-page PDF handout summary.

The personal data definition sets the privacy boundaries for the vendor’s obligations and the customer’s rights. In any contract, it determines the scope of the deletion obligations, audit rights, training restrictions and liability. But there are more nuances when the vendor’s AI system generates data from your inputs, combines it with other sources, or processes it through multiple model and infrastructure layers. These are issues that we didn’t face in traditional SaaS contracts.

While we can’t cover every issue on this topic in 12 minutes, Shannon and Arohi hit on some of the key concerns while reviewing a personal data definition drafted by AI.

Here is the definition they reviewed:

"Personal data means any information that identifies or is reasonably capable of identifying a natural person, including directly identifying information and information that, when combined with other data reasonably available to provider, could identify a natural person, in each case to the extent such information is submitted by or on behalf of customer to the services as part of customer content... provided, however, that personal data shall exclude any information that has been processed by provider to remove all reasonably available means of identification in accordance with provider's de-identification standards as set forth in Exhibit B."

Problem 1: The definition does not cover derived data.

The phrase "submitted by or on behalf of customer to the services" covers what the customer puts in. It does not cover what the AI system produces from those inputs, such as inferences, model outputs, service logs, and derived insights. GDPR, CCPA, and other regulations may still apply to that derived data regardless of what the contract says. If the contract does not govern it, the vendor has no contractual guardrails and the customer has no contractual protections over data the AI system created.

The fix: push to expand the definition to include data generated by or derived from the processing of customer content. The liability follows the data, not the contract's silence on it.

Problem 2: "Reasonably available to provider" is undefined.

This phrase is meant to qualify identifiability: information that is not directly identifying counts as personal data only if it could identify a person when combined with other data reasonably available to the vendor. But the phrase is not defined. Reasonably available from where? From the customer's inputs only? From the model? From publicly available sources? From anywhere in the vendor's systems?

The fix: replace the blanket phrase with a specific list. Push for language that identifies exactly which sources the vendor may combine data with and closes off everything else. A vague reasonableness standard benefits whoever interprets it later.

Final Advice: Look hard at the de-identification carve-out.

The definition excludes data the vendor has processed "in accordance with provider's de-identification standards as set forth in Exhibit B." The vendor's de-identification standard (which you may never have reviewed) determines whether your data escapes the definition entirely. AI vendors with access to large quantities of customer data have strong incentives to de-identify and leverage that data for model training, product development, or other purposes. Once data falls outside the personal data definition, your DPA protections may not reach it. Shannon’s response is to strike the carve-out entirely rather than accept a standard you have not seen or approved. Arohi, as the vendor, agreed.

Get a new contract lesson like this every week.

How to Contract publishes a practical, no-fluff lesson like this every Monday on contract issues lawyers and contract teams actually negotiate. Subscribe with your email and we'll send each one straight to your inbox, free. Forward this to a colleague who negotiates AI deals, and tell them to sign up too.

PART 2: PDF Download Identifying Critical Concepts to Know About Personal Data Definitions in AI Contracts

This lesson also includes a downloadable two-page handout covering the broader topic of personal data definitions.

HTC Handout: How to Evaluate Personal Data Definitions in AI Contracts (2026)

The handout gives you the full picture of how personal data definitions work in AI contracts: what the regulatory standards actually require, how vendor forms may fall short of those standards, and what a well-drafted definition looks like when all the right categories of data are addressed. It is written as a reference you can use at your desk, not something you read once and set aside. Pull it out when you are reviewing a vendor agreement and work through it alongside the contract.

Download it and keep it handy. The next time you have to review a personal data definition in an AI vendor contract, you will know exactly what to look for.

The full webinar is available only to How to Contract members.