August 1, 2025

EU AI Act News: Rules on General-Purpose AI Start Applying, Guidelines and Template for Summary of Training Data Finalized


Obligations relating to general-purpose artificial intelligence (“GPAI”) models under the EU AI Act enter into force on 2 August 2025. Ahead of this deadline, the European Commission (the “Commission”) has published a suite of documents that will be highly relevant for stakeholders that either develop AI models using high volumes of data or fine-tune such models:

  1. Guidelines on the Scope of Obligations for GPAI models (the “Guidelines,” published on 18 July 2025),
  2. GPAI Code of Practice (published on 10 July 2025),
  3. Template for the Public Summary of Training Content for GPAI models (the “Template,” published on 24 July 2025).

We highlight the key aspects of these developments below.

Guidelines on the Scope of Obligations for GPAI Models

The Guidelines provide guidance as to when and how the EU AI Act’s provisions on GPAI models will apply. They are non-binding, but the Commission states that it will base its enforcement of the EU AI Act on the interpretation set out in the Guidelines.

Some of the main topics covered by the Guidelines are: (1) what types of models will be considered GPAI models; (2) when a GPAI model will be considered “placed on the market” and which entity is the “provider” in that context; and (3) expectations on enforcement of the EU AI Act. We outline some of the key points below.

What is a GPAI model?

The Guidelines establish an indicative criterion that a model will be considered a GPAI model when its training compute exceeds 10^23 floating point operations (“FLOP”) and it can generate language, text-to-image or text-to-video output. The training compute indicator is higher than in the preliminary version of the Guidelines, which specified 10^22 FLOP. This criterion must be read in conjunction with the EU AI Act definition, under which an AI model that “displays significant generality” and is able to “competently perform a wide range of distinct tasks” will be a GPAI model. In simple terms, the Guidelines establish these threshold requirements to ensure that the GPAI provisions under Chapter V of the EU AI Act apply to foundation models that are capable of being integrated into AI systems to perform different functions. The Guidelines consider non-language modalities and AI models trained with lower training compute to be less effective at displaying generality.
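
As a rough illustration, the indicative criterion can be sketched as a simple check. The threshold values are those stated in the Guidelines and the EU AI Act; the function and field names are our own, and a real classification would also require the qualitative “significant generality” assessment:

```python
# Indicative sketch only: the Guidelines' compute threshold plus a generative
# modality. This does not replace the qualitative assessment of "significant
# generality" under the EU AI Act definition.

GPAI_COMPUTE_THRESHOLD_FLOP = 1e23       # indicative GPAI threshold (Guidelines)
SYSTEMIC_RISK_THRESHOLD_FLOP = 1e25      # presumption of systemic risk (EU AI Act)
GENERATIVE_MODALITIES = {"language", "text-to-image", "text-to-video"}

def indicative_gpai_check(training_compute_flop, modalities):
    """Return indicative (non-binding) flags under the Guidelines' criterion."""
    is_gpai = (training_compute_flop > GPAI_COMPUTE_THRESHOLD_FLOP
               and bool(set(modalities) & GENERATIVE_MODALITIES))
    return {
        "indicative_gpai": is_gpai,
        "presumed_systemic_risk": is_gpai
            and training_compute_flop > SYSTEMIC_RISK_THRESHOLD_FLOP,
    }

# A 2e23-FLOP language model meets the indicative GPAI criterion but falls
# below the systemic-risk presumption.
result = indicative_gpai_check(2e23, ["language"])
```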

In addition, the concept of a model’s lifecycle is relevant for GPAI model providers’ obligations under the EU AI Act. According to the Guidelines, this lifecycle begins at the start of the model’s large pre-training run (meaning the foundational training run), and any further development of the model by or on behalf of the original provider will be a continuance of that model’s lifecycle, rather than resulting in a new model. The analysis may differ where a downstream entity modifies the model, as explained below.

Who is the provider, and when is a GPAI model “placed on the market?”

Providers established or located in countries outside of the European Union can be subject to obligations under the EU AI Act when a GPAI model is “placed on the market” in the European Union. The Guidelines emphasize that a GPAI model is also “placed on the market” where it is integrated into other apps or AI systems—for example, where it forms part of a chatbot or an app made available for the first time on the EU market.

In some cases, downstream entities that modify a GPAI model will be considered “providers” of the modified GPAI model. The Commission clarifies that this will occur only if the modification results in a “significant change” in the “generality, capabilities, or systemic risk” of the model. This condition is expected to be met where the training compute used to modify the GPAI model exceeds one-third of the training compute of the original GPAI model. If the downstream modifier does not know the original model’s training compute, the Guidelines state that the threshold is one-third of 10^23 FLOP for GPAI models and one-third of 10^25 FLOP for GPAI models with systemic risk.

The Guidelines include an Annex with additional information on how to calculate training compute. The Annex states that any method can be used to estimate training compute, as long as it is accurate within an overall error margin of 30% of the reported estimate. It describes two potential methods: the “hardware-based approach,” which tracks graphics processing unit (“GPU”) usage, and the “architecture-based approach,” which estimates operations directly from the model’s architecture. If a downstream modifier becomes a GPAI provider, it is expected to comply only with the obligations relevant to its modifications (e.g., providing information and data relevant to the modification).
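
The compute arithmetic above can be illustrated with a short sketch. The one-third threshold and the fallback values (10^23 and 10^25 FLOP) are taken from the Guidelines; the 6 × parameters × tokens rule of thumb used for the architecture-based approach is a common transformer approximation rather than a figure from the Annex, and all function names here are hypothetical:

```python
def hardware_based_estimate(gpu_count, hours, peak_flop_per_second, utilization):
    """Hardware-based approach: total training FLOP inferred from GPU usage."""
    return gpu_count * hours * 3600 * peak_flop_per_second * utilization

def architecture_based_estimate(parameters, training_tokens):
    """Architecture-based approach: operations estimated from the model itself,
    here via the common ~6 FLOP per parameter per training token approximation."""
    return 6 * parameters * training_tokens

def modification_triggers_provider_status(modification_flop, original_flop=None,
                                          systemic_risk=False):
    """One-third threshold for downstream modifiers. If the original model's
    training compute is unknown, fall back to the Guidelines' default values."""
    if original_flop is None:
        original_flop = 1e25 if systemic_risk else 1e23
    return modification_flop > original_flop / 3

# Fine-tuning with ~4e22 FLOP on a model of unknown compute crosses the
# one-third-of-1e23 fallback threshold.
crosses = modification_triggers_provider_status(4e22)
```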

Insights on enforcement

Although GPAI model provider obligations enter into force on 2 August 2025, GPAI models placed on the market before that date have until 2 August 2027 to comply. The Guidelines highlight that those models are not required to undergo “retraining” or “unlearning” where this is not possible, where training data information is unavailable, or where retrieving that data would cause a disproportionate burden. Providers must disclose and justify those cases in their mandatory copyright policy and public summary of training content.

The Commission can only take enforcement action from 2 August 2026, but this does not affect the applicability of obligations from 2 August 2025.

GPAI Code of Practice

GPAI model providers can sign up to the GPAI Code of Practice on a voluntary basis. In the Guidelines, the Commission explains the impact of voluntarily agreeing—or not—to the GPAI Code of Practice:

  • GPAI model providers who adhere to the GPAI Code of Practice will benefit from “increased trust” by the Commission, and compliance monitoring will focus on checking adherence to the GPAI Code of Practice. If a provider makes commitments under the GPAI Code of Practice, this could also be a mitigating factor when determining fines issued for non-compliance.
  • Providers who do not join the GPAI Code of Practice will likely be subject to more information and access requests from the Commission, and the Commission expects them to demonstrate their EU AI Act compliance through other means, which they should report to the AI Office. This includes, for example, “carrying out a gap analysis that compares the measures they have implemented with the measures set out by a code of practice that is assessed as adequate.” The Commission also states that the AI Office will have less of an understanding of a provider’s compliance with Chapter V of the EU AI Act if it does not sign the Code of Practice.

The GPAI Code of Practice comprises three chapters: Transparency, Copyright, and Safety and Security. The Transparency and Copyright Chapters are relevant for all GPAI models, whereas the Safety and Security Chapter applies only to those GPAI models with systemic risk.

  • The Transparency Chapter addresses GPAI model providers’ obligations to produce technical documentation for the AI Office and national competent authorities, as well as information and documentation for downstream providers. The “Model Documentation Form” sets out in a single document what information in-scope providers should disclose, and to which stakeholders.
  • The Copyright Chapter focuses on providers’ obligation to establish a policy for compliance with EU law on copyright and related rights, particularly rightsholders’ reservations of rights. It sets out specific commitments for signatories to the GPAI Code of Practice, for example, in relation to the use of web-crawlers. Providers will also commit to implementing technical safeguards to prevent their models from generating outputs that infringe EU copyright laws by reproducing training content.
  • The Safety and Security Chapter is the most detailed and technical of the three, dealing with the obligation of providers of GPAI models with systemic risk to perform model evaluation, assess and mitigate systemic risks, report relevant information about serious incidents to the competent authorities, and ensure cybersecurity protection for the model and its physical infrastructure. This Chapter will likely require significant compliance efforts from in-scope providers, first to create risk management frameworks and systems, and then to implement, update and monitor these systems on an ongoing basis.

Template for the Public Summary of Training Content for GPAI Models

In contrast to the non-binding nature of the Guidelines and the GPAI Code of Practice, GPAI model providers are required to use the Template when drawing up the mandatory summary about content used for training the GPAI model. The Commission notes that the Template provides a “minimal baseline” for the information which should be made publicly available, so providers are, as a general rule, bound to provide all the information set out in the Template.

The Template groups training data sources into the following different categories:

  • Publicly available datasets;
  • Private non-publicly available datasets obtained from third parties (licensed datasets);
  • Data crawled and scraped from online sources;
  • User data;
  • Synthetic data; and
  • Other data.

In its Explanatory Notice to the Template, the Commission explains that the level of detail required to be provided is tailored to the type of data category. For example, for licensed datasets, providers only need to confirm whether licensing agreements have been concluded with rightsholders, and the data modalities concerned (e.g., text or audio). In contrast, where datasets have been crawled or scraped from online sources, providers will need to provide significantly more information, such as a list of the top 10% most relevant internet domains by size of content crawled or scraped (5% for SMEs).
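
As a toy illustration of the domain-listing requirement, a provider could rank scraped domains by content size and take the top fraction (10%, or 5% for SMEs). The helper and the domain data below are entirely hypothetical:

```python
def top_domains_by_size(domain_sizes, fraction=0.10):
    """Return the top `fraction` of domains, ranked by scraped content size.
    The 10% default (5% for SMEs) mirrors the Template's disclosure rule."""
    ranked = sorted(domain_sizes, key=domain_sizes.get, reverse=True)
    count = max(1, round(len(ranked) * fraction))
    return ranked[:count]

# 20 invented domains with increasing content sizes (bytes).
crawled = {f"site{i}.example": i * 1_000 for i in range(1, 21)}
top_10_percent = top_domains_by_size(crawled)        # the 2 largest domains
top_5_percent = top_domains_by_size(crawled, 0.05)   # the 1 largest (SME rate)
```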

Regardless of the sources of training data used, providers will need to describe the measures implemented to respect rightsholders’ reservation of rights, and the general measures taken to remove illegal content from the training data. Providers are also encouraged to provide any other relevant information about the data processing, though this is on a voluntary basis.

Takeaway

Because these documents were published only weeks before the 2 August 2025 deadline, stakeholders may face timing challenges in assessing how they apply to their operations. While the Guidelines, GPAI Code of Practice and the Template offer valuable guidance in achieving compliance with the EU AI Act, the increased level of detail in these documents—as compared to the statutory provisions—may, in some instances, add to the overall regulatory burden. If you would like to discuss how these developments could affect your business, please reach out to the authors or your usual Mayer Brown contact.
