Overview
California AB 2013 (Artificial Intelligence Training Data Transparency Act), signed by Governor Gavin Newsom on September 28, 2024, is California's entry into the growing movement for AI training data transparency. The law requires developers of generative AI systems — those trained on large amounts of data at significant computational scale — to publicly disclose information about their training data.
The legislation responds to mounting concerns from creators, publishers, and researchers about the lack of visibility into what AI systems are trained on. By mandating disclosure, California aims to:
- Enable scrutiny of potential copyright infringement in training datasets
- Surface known biases and limitations in AI training data
- Support regulators, researchers, and the public in evaluating AI systems
- Create accountability incentives for AI developers to carefully curate and document their data
Unlike content moderation laws or bias audit requirements, AB 2013 focuses entirely on the supply side of AI — the data that goes into training systems — rather than the outputs or deployment context.
Who It Applies To
Covered Developers
AB 2013 applies to developers of generative AI systems that:
- Are made publicly available to California consumers, or are offered as a service to businesses operating in California, and
- Were trained using computational power at or above the threshold referenced in the California Privacy Protection Agency's implementing guidance (initially keyed to systems trained with more than 10^23 FLOPs)
What Is a "Generative AI System"?
Under AB 2013, a generative AI system is any AI model capable of generating text, images, audio, video, code, or other outputs in response to prompts or other inputs — including large language models, image generation models, audio synthesis models, and multimodal systems.
Territorial Scope
The law applies based on where the system is made available, not where the developer is headquartered. A developer in New York, London, or Tokyo that offers a generative AI product to California consumers must comply if the compute threshold is met.
Who Is NOT Covered
The law exempts:
- Non-publicly available systems: Internal R&D tools, proprietary enterprise systems not offered to external users
- Systems below the compute threshold: Smaller models trained with substantially less compute
- Open-source models: Not fully exempt; models released under approved open-source licenses follow a modified compliance path (see Exemptions)
Disclosure Requirements
Covered developers must publish training data transparency documentation on their website or in documentation accompanying the AI system. The documentation must address:
1. Data Categories and Sources
Developers must disclose the general categories of data used to train the system, organized by:
- Source type: Web-scraped data, licensed datasets, user-generated content, synthetic data, curated datasets, proprietary data
- Domain: News articles, books, code repositories, social media, scientific papers, etc.
- Temporal range: Approximate dates or ranges when data was collected or published
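The Act does not mandate a machine-readable schema for these categories, but structuring them as data records makes the disclosure easier to maintain and publish. The sketch below is hypothetical; the field names are illustrative, not mandated by AB 2013.

```python
from dataclasses import dataclass, asdict

@dataclass
class DataCategory:
    """One training-data category as it might appear in a disclosure.

    Field names are illustrative, not prescribed by the statute.
    """
    source_type: str   # e.g. "web-scraped", "licensed", "synthetic"
    domain: str        # e.g. "news articles", "code repositories"
    date_range: str    # approximate collection/publication range

def to_disclosure(categories):
    """Render categories as plain dicts, ready for JSON or HTML output."""
    return [asdict(c) for c in categories]

cats = [
    DataCategory("web-scraped", "news articles", "2010-2023"),
    DataCategory("licensed", "books", "1990-2022"),
]
```

Keeping the records in code (rather than a hand-edited page) lets the published disclosure be regenerated whenever the dataset inventory changes.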
2. Data Licensing and Consent
For each major data category, developers must disclose:
- Whether the data was licensed from rightsholders or scraped without a specific licensing agreement
- Whether data subjects whose information appeared in training data were provided an opt-out mechanism
- Whether any data was subject to a robots.txt exclusion that was not honored
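Checking whether scraped URLs were subject to a robots.txt exclusion can be automated with Python's standard-library parser. A minimal sketch; the bot name and rules below are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

def is_excluded(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if the given robots.txt disallows user_agent from path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)

# Hypothetical robots.txt content for a crawled site
rules = """\
User-agent: ExampleAIBot
Disallow: /articles/
"""
```

Running this check over the crawl log for each major source produces the "robots.txt exclusions not honored" portion of the disclosure.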
3. Known Limitations and Biases
Documentation must identify known limitations of the training data that could affect model outputs, including:
- Demographic underrepresentation — categories of people, languages, or geographic regions that are underrepresented in training data
- Temporal limitations — the knowledge cutoff date and how this affects model outputs
- Domain-specific gaps — areas where the training data is sparse or unreliable
- Known bias characteristics — documented tendencies to produce outputs that disadvantage particular groups
4. Synthetic Data Disclosure
If the training dataset included synthetically generated data (data generated by another AI system), developers must disclose:
- The proportion of training data that was synthetic
- The method used to generate synthetic data
- Any known quality or accuracy limitations of the synthetic data
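The disclosed proportion can be computed directly from per-source token counts. A minimal sketch, where the "synthetic" key is a naming convention of this example rather than a statutory term:

```python
def synthetic_share(token_counts: dict) -> float:
    """Fraction of training tokens that are synthetic.

    token_counts maps source type -> token count; "synthetic" is the
    key used here by convention, not a term defined in the statute.
    """
    total = sum(token_counts.values())
    return token_counts.get("synthetic", 0) / total if total else 0.0

counts = {"web-scraped": 700_000, "licensed": 200_000, "synthetic": 100_000}
```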
5. Data Governance Practices
Developers must describe the general data governance processes applied during dataset construction, including:
- Methods used to filter harmful, illegal, or low-quality content
- Deduplication and quality control approaches
- How the training corpus was validated
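Exact-duplicate filtering, one common deduplication step, can be sketched with a content hash. Real pipelines typically add near-duplicate methods such as MinHash on top of this:

```python
import hashlib

def deduplicate(documents):
    """Drop exact duplicates, keyed on a SHA-256 of normalized text."""
    seen, unique = set(), []
    for doc in documents:
        # Normalize whitespace and case so trivially re-encoded copies match
        digest = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique
```

Recording which filters ran (and their match counts) doubles as the evidence base for this section of the disclosure.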
Exemptions
Open-Source Models
AB 2013 provides a modified compliance path for developers who release their models under an approved open-source license. Open-source model developers may satisfy the disclosure requirement by including training data documentation in the model repository (e.g., a Model Card or dataset documentation in a GitHub repository or Hugging Face model page) rather than on a separate website.
Internal Enterprise Models
Models not offered to external users or the public are exempt. A company training a proprietary model solely for its own internal use does not trigger AB 2013.
Research and Academic Models
Models developed solely for academic research, not intended for commercial deployment or public availability, may qualify for an exemption under implementing rules established by the California Privacy Protection Agency.
Below-Threshold Models
AI systems trained with compute below the specified threshold are not covered. The threshold is designed to target only the largest, most commercially significant AI systems.
Compliance Timeline
| Date | Milestone |
|------|-----------|
| September 28, 2024 | AB 2013 signed into law by Governor Newsom |
| January 1, 2026 | Act takes effect; disclosure obligations begin |
| Ongoing | Documentation must be updated when new versions of the model are released |
Penalties & Enforcement
Private Right of Action
Individuals may bring a civil action against developers who fail to comply with AB 2013's disclosure requirements. Available remedies include:
- Injunctive relief: Court orders requiring the developer to publish compliant documentation
- Actual damages: Compensation for harm resulting from the non-disclosure
California Attorney General
The California Attorney General may also bring enforcement actions for violations of AB 2013, including seeking civil penalties and injunctive relief.
Relationship to Other California AI Laws
AB 2013 is one of several AI measures to emerge from California's 2024 legislative session. It sits alongside:
- SB 1047 (vetoed), which would have imposed safety testing requirements on large models
- AB 2602, which addresses AI use of a performer's likeness in digital performances
- AB 1836, which restricts use of deceased individuals' digital likeness
Compliance Steps
1. Determine whether your model crosses the compute threshold. Review the California Privacy Protection Agency's implementing guidance on the compute threshold. Models trained with more than 10^23 FLOPs of compute are presumed to be in scope.
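The statute does not prescribe how to estimate training compute, but a common rule of thumb from the scaling-law literature is roughly 6 FLOPs per parameter per training token. The sketch below is an order-of-magnitude screening aid, not a legal determination; the threshold constant reflects the 10^23 figure cited in this article:

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training-compute estimate: ~6 FLOPs per parameter per token.

    The 6*N*D heuristic ignores attention overhead and repeated epochs,
    so treat the result as an order-of-magnitude figure only.
    """
    return 6.0 * n_params * n_tokens

THRESHOLD_FLOPS = 1e23  # figure cited in this article's reading of AB 2013

def presumed_in_scope(n_params: float, n_tokens: float) -> bool:
    return training_flops(n_params, n_tokens) >= THRESHOLD_FLOPS
```

For example, a 7B-parameter model trained on 2T tokens lands around 8.4e22 FLOPs, just under this threshold, while a 70B model on the same data is well over it.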
2. Audit your training datasets. Conduct a systematic inventory of all data sources used to train the model, organized by source type (web-scraped, licensed, user-generated, synthetic).
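A first-pass inventory can be built by tallying dataset records by source type. The record shape below is illustrative, not prescribed by the Act:

```python
from collections import Counter

def inventory_by_source(datasets):
    """Tally dataset records by source type for the disclosure inventory.

    Each record is a dict with at least a "source_type" key; the names
    here are hypothetical examples.
    """
    return Counter(d["source_type"] for d in datasets)

records = [
    {"name": "common-crawl-slice", "source_type": "web-scraped"},
    {"name": "news-archive", "source_type": "licensed"},
    {"name": "self-instruct-set", "source_type": "synthetic"},
    {"name": "forum-dump", "source_type": "web-scraped"},
]
```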
3. Document data licensing. Review each major dataset for:
- Licensing agreements with rightsholders
- Presence of robots.txt exclusions
- Whether opt-out mechanisms were honored
4. Prepare a known limitations and bias report. Work with your evaluation team to document:
- Demographic underrepresentation in training data
- Knowledge cutoff dates
- Domain-specific data gaps
- Documented output biases
5. Document synthetic data usage. If any synthetic data was used, record the generation method, proportion, and known quality characteristics.
6. Create a public disclosure page. Publish the required documentation on your website in a format accessible to consumers. Consider adopting a standardized format (e.g., a Model Card or Datasheet for Datasets) to satisfy multiple disclosure frameworks simultaneously.
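Tooling can keep such a page in sync with the dataset inventory. The sketch below renders a minimal Model-Card-style page as Markdown; the section names and layout are suggestions, not a statutory format:

```python
def render_disclosure(sections: dict) -> str:
    """Render a minimal Model-Card-style disclosure page as Markdown.

    Section headings follow this article's disclosure areas; the exact
    layout is illustrative only.
    """
    lines = ["# Training Data Transparency Disclosure", ""]
    for heading, body in sections.items():
        lines += [f"## {heading}", body, ""]
    return "\n".join(lines)

page = render_disclosure({
    "Data Categories and Sources": "Web-scraped news (2010-2023); licensed books.",
    "Synthetic Data": "About 10% of tokens, generated by an earlier model version.",
})
```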
7. Establish an update process. Implement a process to update the disclosure documentation whenever a new version of the model is released with materially changed training data.
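One way to make updates reliable is to fingerprint the training-data manifest and regenerate the disclosure whenever the fingerprint changes. A minimal sketch, assuming the manifest is a JSON-serializable dict:

```python
import hashlib
import json

def manifest_fingerprint(manifest: dict) -> str:
    """Stable fingerprint of a training-data manifest.

    Uses canonical JSON (sorted keys) so the hash changes only when the
    manifest content changes, not when key order does.
    """
    canonical = json.dumps(manifest, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def needs_disclosure_update(old: dict, new: dict) -> bool:
    """True when the training data changed and the public docs should too."""
    return manifest_fingerprint(old) != manifest_fingerprint(new)

# Hypothetical manifests for two model versions
v1 = {"datasets": ["common-crawl-slice", "news-archive"]}
v2 = {"datasets": ["common-crawl-slice", "news-archive", "self-instruct-set"]}
```

Whether a change is "material" under the Act remains a judgment call; the fingerprint simply guarantees no change goes unnoticed.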
8. Review the open-source compliance path. If you are releasing under an open-source license, confirm that your model repository documentation satisfies the Act's disclosure requirements.
Frequently Asked Questions
Who does AB 2013 apply to? Developers of generative AI systems made publicly available to California consumers that were trained at or above the specified compute threshold (initially keyed to ~10^23 FLOPs).
What must be disclosed? Training data categories and sources, data licensing status, whether opt-outs were honored, known biases and limitations, synthetic data usage, and data governance practices.
When did the law take effect? January 1, 2026.
Does it require listing every website scraped? No — disclosure of categories and source types is required, not a complete list of every URL or document in the training corpus.
Is there a private right of action? Yes. Individuals can sue for injunctive relief and actual damages. The California AG can also enforce the law.
Does it apply to open-source models? Yes, but open-source developers may publish their disclosures in the model repository (e.g., a Model Card) rather than a separate website.
What about models updated frequently? Documentation must be updated when new model versions are released with materially changed training data. Developers should build a versioned documentation update process.