California AB 2013, signed by Governor Newsom in September 2024 and effective January 1, 2026, requires providers of generative AI systems to publicly document their training data. It’s one of the first US laws to directly regulate AI training data transparency.
Who It Covers
AB 2013 applies to any person who offers a “generative artificial intelligence system” to California residents, defined as AI that generates text, images, audio, video, or other content based on user input.
In scope:
- Generative AI chatbots and assistants
- AI image generators
- AI video and audio generation tools
- AI writing assistants
Exempted:
- Systems that use generative AI incidentally (e.g., autocorrect, spell check)
- Systems not offered to consumers in California
- Non-commercial research and development
The law applies to providers regardless of where they’re based. If California residents use your generative AI product, you’re in scope.
What Must Be Disclosed
Providers must post training data documentation on their website (or in the application). The disclosure must include:
1. High-level summary of training datasets
A general description of the data types used to train the system. This can be at a category level — you don’t need to enumerate every dataset.
2. Intended purpose
What the system is intended to do and what outputs it’s designed to produce.
3. Known limitations
Acknowledged limitations of the system, including potential for bias, errors, or hallucination.
4. Whether the training data included personal information
Not the personal information itself, but whether the training corpus included data subject to California privacy law (CCPA/CPRA).
5. Dates of data cutoffs
When the training data collection ended (i.e., the knowledge cutoff).
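For teams compiling this documentation, it can help to treat the five required elements as a structured record. The sketch below is purely illustrative (the class name, field names, and example values are my own, not anything defined by AB 2013); it models the elements listed above and flags any text field that was left empty before the disclosure goes out for legal review.

```python
from dataclasses import dataclass, fields

@dataclass
class TrainingDataDisclosure:
    """Illustrative model of the five disclosure elements above.
    Field names are hypothetical, not statutory terms."""
    dataset_summary: str          # 1. high-level summary of data categories
    intended_purpose: str         # 2. what the system is designed to produce
    known_limitations: str        # 3. bias, errors, hallucination risks
    includes_personal_info: bool  # 4. whether CCPA/CPRA-covered data was in the corpus
    data_cutoff: str              # 5. knowledge cutoff, e.g. "2025-06"

    def missing_fields(self) -> list[str]:
        """Return the names of required text fields left empty."""
        return [
            f.name for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]

disclosure = TrainingDataDisclosure(
    dataset_summary="Publicly available web text, licensed news archives, "
                    "and synthetic data.",
    intended_purpose="General-purpose text generation and summarization.",
    known_limitations="May produce factual errors (hallucinations) and "
                      "reflect biases present in source data.",
    includes_personal_info=True,
    data_cutoff="2025-06",
)
assert disclosure.missing_fields() == []
```

This is a drafting aid, not a legal template: the statute specifies what must be disclosed, not how you structure the internal record.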
What It Doesn’t Require
AB 2013 is notably narrower than it might seem:
- No specific dataset disclosure. You don’t have to list which specific datasets you used — a description of categories is sufficient.
- No third-party verification. There’s no audit requirement or independent verification mechanism.
- No opt-out for training data. The law creates disclosure obligations, not opt-out rights (those are handled separately under CPPA rulemaking).
Effective Date and Enforcement
The law took effect January 1, 2026. Enforcement authority rests with the California Attorney General and the California Privacy Protection Agency.
AB 2013 does not set its own penalty structure; it relies on general California law. The Attorney General can seek injunctive relief, and penalties under existing consumer protection statutes can be significant.
What to Do
Step 1: Determine if you’re covered. Do you offer a generative AI system (text, image, audio, video) to California users? If yes, you’re in scope.
Step 2: Audit your training data documentation. Work with your ML team to compile what can be disclosed. You need: general data categories used, data cutoff dates, known limitations, and whether personal data was in the training set.
Step 3: Draft the disclosure. Write a plain-language disclosure that covers all required elements. Have legal review it.
Step 4: Publish it. Post it on your website, in your app, or wherever users engage with the system. Link to it from your AI documentation or product page.
Step 5: Update it. When you release new model versions with different training data, update the disclosure.
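Steps 2 through 5 lend themselves to a simple release gate: before a model version ships to California users, check that a disclosure entry exists and covers every required element. The version names, key names, and `ready_to_publish` helper below are hypothetical, a sketch of the idea rather than a real compliance tool.

```python
# Hypothetical pre-release gate: each released model version must have a
# complete disclosure entry; new training data means a new entry (step 5).

REQUIRED_KEYS = {
    "dataset_summary", "intended_purpose", "known_limitations",
    "includes_personal_info", "data_cutoff",
}

disclosures = {
    "assistant-v1": {
        "dataset_summary": "Web text and licensed corpora.",
        "intended_purpose": "Conversational assistance.",
        "known_limitations": "Possible factual errors and bias.",
        "includes_personal_info": False,
        "data_cutoff": "2024-12",
    },
    "assistant-v2": {  # retrained on new data, so the disclosure was updated
        "dataset_summary": "Web text, licensed corpora, synthetic dialogues.",
        "intended_purpose": "Conversational assistance.",
        "known_limitations": "Possible factual errors and bias.",
        "includes_personal_info": False,
        "data_cutoff": "2025-09",
    },
}

def ready_to_publish(version: str) -> bool:
    """True only if a disclosure exists and contains every required key."""
    entry = disclosures.get(version)
    return entry is not None and REQUIRED_KEYS <= entry.keys()

assert ready_to_publish("assistant-v2")
assert not ready_to_publish("assistant-v3")  # no disclosure drafted yet
```

Wiring a check like this into a release pipeline makes step 5 automatic: a retrained model without an updated disclosure simply fails the gate.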
The Bigger Picture
AB 2013 is part of a broader California push on AI transparency. The CPPA is separately working on AI rulemaking under the CPRA that will go further — including opt-out rights for training data and automated decision-making regulations. AB 2013 is the first layer; more is coming.
Companies that build good training data documentation practices now will be better positioned when the CPPA’s fuller AI rules arrive.
This article is for informational purposes only and does not constitute legal advice. Always consult qualified counsel before making compliance decisions.
