A guide for OS developers on GPAI in the EU AI Act
The EU AI Act became law on August 1, 2024, introducing risk-based rules that determine which AI systems and GPAI models can be marketed and deployed in the EU and how. It is being implemented in phases until August 2027. Starting on August 2, 2025, providers of GPAI models must comply with a set of obligations when placing models on the EU market, irrespective of whether they are established in the EU or elsewhere. Providers of GPAI models placed on the EU market before August 2, 2025 have until August 2, 2027 to comply.
The good news for the open-source community is that the AI Act is designed to facilitate or automate compliance for researchers and open-source developers. Many researchers working on developing GPAI models for scientific purposes fall entirely outside of the scope of the Act, as does development outside of commercial activity. For models that do fall within the scope of the EU AI Act, release under a free and open source license also exempts developers from some of the requirements, particularly those that would be redundant or impractical for openly shared models. These exemptions are designed to reflect a recognition of the value and potential of open development, while still ensuring accountability. However, it can be difficult to know when and to what extent they apply.
The primary goal of this guide is to shed some light on these questions and to offer an accessible entry point to researchers and developers working on and with open GPAI models. We walk you through the key definitions, the obligations, the open-source exemptions, and how open-source providers may comply using official guidance from the European Commission, such as the GPAI Code of Practice, the GPAI guidelines, and the template for the public summary of training data. And if you're pressed for time, we even made an interactive application to give you a high-level view. Try it out!
You can find the Spaces version of this blog here. Need further AI Act guidance? We've also published general guides for open-source developers at Hugging Face and the Linux Foundation.
This app helps open-source developers assess whether their GPAI model project qualifies them as a "GPAI model provider" under the AI Act and, if so, the relevant obligations. You can see a larger, stand-alone version here as a Hugging Face Space.
TL;DR: The term GPAI model used in the AI Act is roughly akin to what is often called a "foundation model". Generally, if a model performs well on a wide range of tasks, can generate text or other media, and the cumulative training compute is 10^23 FLOPs or higher, there is a good chance it will be considered a GPAI model under the AI Act.
The AI Act divides GPAI models into two categories: GPAI models and GPAI models with systemic risk (GPAISR; see next section). A GPAI model is defined in Article 3(63) as an:
The GPAI guidelines specify that "an indicative criterion for a model to be considered a GPAI model is that its training compute is greater than 10^23 FLOPs and it can generate language (whether in the form of text or audio), text-to-image or text-to-video." This threshold corresponds to the approximate amount of compute typically used to train a model with one billion parameters on a large amount of data, according to the guidelines. The guidelines provide examples of models that are in and out of scope (see Table 1).
| Examples of GPAI models | Examples of non-GPAI models |
| A model is trained on a broad range of natural language data (i.e. text) curated and scraped from the internet and other sources (as is currently typical for language models) using 10^24 FLOPs. | A model is trained specifically for the task of transcribing speech to text, using 10^24 FLOPs. A model is trained specifically for playing chess or video games, using 10^24 FLOPs. A model is trained specifically for modelling weather patterns or physical systems, using 10^24 FLOPs. |
Table 1: Examples of models that do or don't qualify as GPAI models (source: European Commission, GPAI guidelines)
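To get a rough sense of where a model sits relative to the indicative compute criterion, training compute can be estimated from parameter and token counts. The sketch below assumes the commonly used approximation of about 6 FLOPs per parameter per training token for dense transformer models. That heuristic, the function names, and the example figures are illustrative assumptions rather than anything prescribed here, and compute is only one part of the indicative criterion.

```python
# Indicative GPAI compute criterion quoted from the GPAI guidelines above.
GPAI_INDICATIVE_FLOPS = 1e23


def estimate_training_flops(n_parameters: float, n_tokens: float) -> float:
    """Rough estimate: about 6 FLOPs per parameter per training token.

    This heuristic is an assumption for illustration (dense transformers).
    """
    return 6.0 * n_parameters * n_tokens


def meets_indicative_compute(n_parameters: float, n_tokens: float) -> bool:
    """True if the estimate is greater than the indicative threshold."""
    return estimate_training_flops(n_parameters, n_tokens) > GPAI_INDICATIVE_FLOPS


# Hypothetical model: 1 billion parameters trained on 20 trillion tokens.
flops = estimate_training_flops(1e9, 20e12)
print(f"estimated compute: {flops:.2e} FLOPs")
print("above indicative threshold:", meets_indicative_compute(1e9, 20e12))
```

A result above the threshold only suggests that the model may be in scope; whether it can generate language, images, or video, and how general it is, still matter.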
Note that GPAI models are different from "AI systems", which are defined in Article 3(1) of the AI Act. As per Recital 97, while GPAI models are essential building blocks of AI systems, they are not AI systems on their own. To become an AI system, a model must be combined with additional components, such as a user interface or other functional modules, that enable interaction and deployment. Different legal obligations may apply depending on whether you provide a GPAI model, an AI system, or both (for example, by integrating your GPAI model into your own user interface). If a provider offers both the GPAI model and the AI system, both sets of obligations apply simultaneously, and the obligations applicable to the AI system will depend on the intensity and scope of the risks that the AI system can generate. These further obligations are not in the scope of this guide.
TL;DR: GPAI models with systemic risk (GPAISR) are broadly the same as so-called "frontier models"; that is, the most advanced GPAI models currently on the market. The AI Act considers a model to be a GPAISR if it meets its definition of "high-impact capabilities" or crosses a training compute threshold of 10^25 FLOPs.
As per Article 51(1), a GPAI model is classified as posing systemic risk if it meets either of two conditions:
A GPAI model is presumed to have high-impact capabilities when the cumulative compute used for training exceeds 10^25 FLOPs. For the moment, this will only capture models at or near the frontier of AI development, like GPT-4o, Grok 4, or Mistral 2 Large. The GPAI guidelines explain that this threshold is relevant for identifying such high-impact capabilities. The European Commission may adjust the performance and compute thresholds over time to ensure the AI Act keeps up with the state-of-the-art.
While all models that meet the threshold must notify the European Commission, developers may also submit evidence "to demonstrate that, because of its specific characteristics, a general-purpose AI model exceptionally does not present systemic risks" according to Recital 112; for example, if capabilities listed in Appendix 1.3.1 of the Safety and Security chapter of the Code of Practice are below those of other non-GPAISR models. This may be a useful option for, e.g., very large models developed primarily as research artifacts.
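The two compute figures discussed so far can be read as a simple tiering. The sketch below is a minimal illustration under that reading; the enum and function names are hypothetical, and a compute-based presumption is not a final classification, since capabilities, a Commission decision, or evidence submitted by the developer can change the outcome.

```python
from enum import Enum

GPAI_INDICATIVE_FLOPS = 1e23  # indicative GPAI criterion (GPAI guidelines)
SYSTEMIC_RISK_FLOPS = 1e25    # presumption of high-impact capabilities


class ComputeTier(Enum):
    BELOW_GPAI_INDICATOR = "below the indicative GPAI threshold"
    LIKELY_GPAI = "likely a GPAI model"
    PRESUMED_GPAISR = "presumed GPAI model with systemic risk"


def compute_tier(training_flops: float) -> ComputeTier:
    """Map cumulative training compute to a presumptive tier."""
    if training_flops > SYSTEMIC_RISK_FLOPS:
        return ComputeTier.PRESUMED_GPAISR
    if training_flops > GPAI_INDICATIVE_FLOPS:
        return ComputeTier.LIKELY_GPAI
    return ComputeTier.BELOW_GPAI_INDICATOR


# Hypothetical compute budgets, for illustration only.
for budget in (5e21, 3e24, 2e25):
    print(f"{budget:.0e} FLOPs -> {compute_tier(budget).value}")
```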
TL;DR: You will be considered a provider of a GPAI model, regardless of whether you are established in the EU or elsewhere, if you meet both of the following conditions: 1) you develop a GPAI model or have one developed for you; and 2) you place it on the EU market, meaning you or the organization you work for supply it for distribution or use in the EU as part of a commercial activity, whether in return for payment or free of charge. As of writing this guide, the exact bounds of what constitutes commercial activity in this context remain a somewhat open question. While related EU regulation indicates that it is unlikely to cover the work of individual "hobbyist" developers and does not automatically include artifacts shared on platforms such as GitHub or Hugging Face under a FOSS license without monetization by the developer, the determination will likely be made on a case-by-case basis.
The AI Act defines a provider of a GPAI model in Article 3(3) as:
Article 3 defines "placing on the market" as "the first making available of an AI system or a GPAI model on the Union market" (Art. 3(9)) and "making available on the market" as "the supply of an AI system or a GPAI model for distribution or use on the Union market in the course of a commercial activity, whether in return for payment or free of charge" (Art. 3(10)). To simplify, under EU law, a product is placed on the market when it is made available in the EU market for the first time. After that, any later supply (like from one distributor to another, or to a customer) is referred to as making available. Recital 97 (note: in EU law, recitals provide non-binding explanations of provisions in a legal text) clarifies that "GPAI models may be placed on the market in various ways, including through libraries, application programming interfaces (APIs), as direct download, or as physical copy."
The notion of "commercial activity" is essential in understanding what qualifies as placing a model or system on the EU market, which is a more specific notion than simply making it available to EU citizens. While specific determinations about AI models have not yet been made in the context of the applicability of the EU AI Act, the "Blue Guide" on the implementation of EU product rules was designed to serve as general guidelines within the legislative framework. According to the Blue Guide, "Commercial activity is understood as providing goods in a business related context. Non-profit organisations may be considered as carrying out commercial activities if they operate in such a context. This can only be appreciated on a case by case basis taking into account the regularity of the supplies, the characteristics of the product, the intentions of the supplier, etc. In principle, occasional supplies by charities or hobbyists should not be considered as taking place in a business related context."
As another point of reference, consider also the EU's Cyber Resilience Act (CRA). The CRA includes language addressing whether or not a person or organization who produces free and open-source software should be considered a "manufacturer" under the CRA. As part of this, the CRA's Recital 18 states in part that "*...the provision of products with digital elements qualifying as free and open-source software that are not monetised by their manufacturers should not be considered to be a commercial activity.*" While the CRA's language is probably not binding on how the AI Act would be interpreted, this and other language in the CRA does point towards an understanding that making FOSS-licensed software available may not always, inherently be considered a "commercial activity" under the CRA, particularly where it is not being "monetised" by the producer. This may suggest a similar approach for the AI Act's purposes.
It is also crucial to note that the AI Act has extraterritorial reach, which means that it applies to providers that place GPAI models on the EU market, irrespective of whether they are established in the EU or a third country. Providers established or located in a third country must appoint an authorised representative established in the EU before placing a GPAI model on the EU market. However, as discussed further below, this obligation does not apply to providers of GPAI models under free and open-source licenses, unless they have systemic risk.
TL;DR: GPAI models that are developed only for scientific research and development are exempt from the AI Act.
If you develop a GPAI model solely for scientific research and development, you will not be considered a provider under the AI Act and are exempt from its obligations. This means that when GPAI development is primarily aimed at releasing the model and associated data as a scientific artifact, including notably in academic and not-for-profit settings, the EU AI Act does not introduce any additional obligations. Article 2(6) states:
Testing and development activities in the course of product-oriented research also fall outside of the scope of the AI Act according to Recital 25, although this exemption ends if the model is placed on the market or put into service in the course of testing. Recital 109 elaborates that while developers of GPAI models for scientific research purposes are exempted, they should be encouraged to voluntarily comply with these obligations of providers.
TL;DR: If you finetune a GPAI model in a way that significantly changes the model, you might have to comply with providers' obligations to the extent that you can. As a rule of thumb, this is the case if the compute used for finetuning is higher than one-third of the compute used to train the base model.
You will be considered a provider of a GPAI model only if your modification leads to a significant change in the model's generality, capabilities, or systemic risk. The threshold for "a significant change" is whether the training compute required for the modification exceeds one-third of the original model's training compute.
If you cannot determine this value (for example, because the training compute wasn't disclosed by the original provider), the GPAI guidelines explain that you should use alternative thresholds: for GPAI models, it's one-third of the 10^23 FLOPs threshold; and for GPAISR models, it's one-third of the 10^25 FLOPs threshold.
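The one-third rule and its fallback thresholds can be written down as a small check. This is a minimal sketch of the rule as described above; the function and parameter names are hypothetical, and the compute figures in the usage lines are made up for illustration.

```python
from typing import Optional

GPAI_INDICATIVE_FLOPS = 1e23  # fallback basis when the base model is a GPAI model
SYSTEMIC_RISK_FLOPS = 1e25    # fallback basis when the base model is a GPAISR model


def modification_threshold(base_training_flops: Optional[float],
                           base_is_gpaisr: bool = False) -> float:
    """Compute above which a modification counts as a significant change."""
    if base_training_flops is not None:
        # Known base compute: one-third of the original training compute.
        return base_training_flops / 3
    # Unknown base compute: one-third of the relevant fallback threshold.
    fallback = SYSTEMIC_RISK_FLOPS if base_is_gpaisr else GPAI_INDICATIVE_FLOPS
    return fallback / 3


def is_significant_modification(modification_flops: float,
                                base_training_flops: Optional[float] = None,
                                base_is_gpaisr: bool = False) -> bool:
    """True if the modification compute exceeds the applicable threshold."""
    threshold = modification_threshold(base_training_flops, base_is_gpaisr)
    return modification_flops > threshold


# Hypothetical fine-tune using 1e21 FLOPs on a base model trained with 3e24 FLOPs.
print(is_significant_modification(1e21, base_training_flops=3e24))
# Same fine-tune when the base model's training compute was not disclosed.
print(is_significant_modification(1e21, base_training_flops=None))
```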
If you make a modification that qualifies you as a provider, your obligations under Article 53 are limited to the modifications you made, meaning you only need to document the fine-tuning process, new training data, and changes. In addition, the obligation for providers of GPAI models that are established in a third country to appoint by written mandate an authorised representative in the EU, prior to placing it on the EU market, will also apply, unless the finetuned or modified GPAI model qualifies for the open-source exemption.
Qualifying as a "provider" of a GPAI or GPAISR model means that the model is covered by the AI Act, and subject to several obligations outlined in Articles 51 through 55. However, GPAI models that are put on the EU market while released under a free and open-source license are exempt from some of the requirements outlined in those articles. The next step in understanding one's requirements under the AI Act consists in understanding the scope of these exemptions.
TL;DR: If you publish a GPAI model under a free and open-source license along with sufficient documentation and without monetizing the model, you will be partly exempt from the obligations for GPAI developers.
To qualify for the open-source exemptions for GPAI models, Article 53(2) and the GPAI guidelines specify that you have to meet three conditions:
This definition of a free and open-source licence is likely to include widely used permissive software licenses like Apache 2.0 and MIT as well as permissive model licenses like OpenMDW. The GPAI guidelines explain that all four rights (i.e. access, usage, modification, and distribution) must be upheld to qualify as a free and open-source license (paragraph 78), and as such licenses with usage restrictions (e.g., research-only, acceptable use restrictions, commercial terms) do not qualify as a free and open-source licence (paragraph 83). However, the guidelines later qualify this requirement by stating that specific, proportionate, and safety-oriented usage restrictions may be permissible in domains where the licensor believes there could be a significant risk to public safety, security, or fundamental rights (paragraph 84).
If a GPAI model is provided against a price or otherwise monetized, it will not benefit from the open-source exemptions. As per the GPAI guidelines, monetization includes making the model available contingent on payment of any sort, procuring another product or service (e.g., technical support or training services) from the provider, viewing advertisements on a developer-hosted platform, or on the provider receiving and/or processing personal data. Recital 103 clarifies that "making AI components available through open repositories should not, in itself, constitute a monetisation," but the boundaries depend on whether additional monetization strategies are employed around the model's distribution or use.
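As a self-check, the criteria discussed in this section (the four license rights, public documentation, no monetization, and no systemic risk) can be collected into a short list of blockers. The sketch below is a simplified reading with hypothetical field names; it cannot capture case-by-case questions such as whether a specific usage restriction is proportionate and safety-oriented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Release:
    """Hypothetical description of a model release, for illustration only."""
    allows_access: bool
    allows_use: bool
    allows_modification: bool
    allows_distribution: bool
    documentation_public: bool  # e.g. model architecture information is published
    monetized: bool             # payment, bundled services, ads, personal data
    is_gpaisr: bool = False


def exemption_blockers(release: Release) -> List[str]:
    """Return the reasons a release would not qualify; empty means none found."""
    blockers = []
    rights = {
        "access": release.allows_access,
        "use": release.allows_use,
        "modification": release.allows_modification,
        "distribution": release.allows_distribution,
    }
    for right, granted in rights.items():
        if not granted:
            blockers.append(f"license does not grant free {right}")
    if not release.documentation_public:
        blockers.append("required information is not publicly available")
    if release.monetized:
        blockers.append("model is monetized")
    if release.is_gpaisr:
        blockers.append("model is classified as GPAISR")
    return blockers


# Hypothetical research-only release: usage is restricted by the license.
research_only = Release(True, False, True, True, True, False)
print(exemption_blockers(research_only))
```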
TL;DR: If you qualify for the open-source exemptions, you still need to provide detailed documentation of your training data and demonstrate how you're complying with EU copyright law. You don't have to meet the obligations to compile more detailed documentation for the European Commission or downstream users, or appoint an authorized representative in the EU.
The AI Act takes a tiered approach to the obligations for providers of GPAI models (see Table 3). There are certain baseline obligations prescribed in Articles 53 and 54 that apply to all GPAI models, with the exception of GPAI models released under a free and open-source license, which are exempt from some of these obligations (see upper-left quadrant in Table 3). In addition to these obligations, stricter obligations (prescribed in Article 55) apply to the providers of GPAISR models and none of the open-source exemptions apply to them.
| | Uses Free and Open-Source License | Doesn't Use Free and Open-Source License |
| General-Purpose AI (GPAI) | Partially exempt. Need to comply with Art. 53(1)(c)-(d) (e.g., OLMo 2) | Not exempt. Need to comply with Art. 53(1) and 54 (e.g., Llama 3-8B) |
| General-Purpose AI with Systemic Risk (GPAISR) | Not exempt. Need to comply with Art. 53(1), 54 and 55 (currently no examples) | Not exempt. Need to comply with Art. 53(1), 54 and 55 (e.g., GPT-4.5) |
Table 3: Overview of obligations and exemptions for different categories of GPAI models
We summarize each of the obligations for providers of GPAI and GPAISR models and whether open-source exemptions apply in Table 4.
| Obligations | Open Source GPAI Model | Open Source GPAISR Model | Official Guidance |
| Article 53(1a): Draw up and keep up-to-date model documentation | Exempt | Not exempt | Code of Practice Transparency Chapter, Model Documentation Form |
| Article 53(1b): Draw up, keep up-to-date, and make available documentation to providers of AI systems who intend to integrate the GPAI model in their AI systems | Exempt | Not exempt | Code of Practice Transparency Chapter, Model Documentation Form |
| Article 53(1c): Put in place a policy to comply with EU law on copyright and related rights. | Not exempt | Not exempt | Code of Practice Copyright Chapter |
| Article 53(1d): Draw up and make publicly available a sufficiently detailed summary of training data. | Not exempt | Not exempt | Template for the public summary of training data |
| Article 54: Providers in third countries must appoint by written mandate an authorised representative in the EU, prior to placing it on the EU market. | Exempt | Not exempt | N/A |
| Article 55(1a-d): GPAISR-specific obligations including model evaluations, systemic risk assessment and mitigation, incident reporting to authorities, and cybersecurity protection. | N/A | Not exempt | Code of Practice Safety and Security Chapter |
Table 4: Obligations for GPAI model providers, open-source exemptions, and official guidance
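Table 4 can also be expressed as a small lookup, which may be handy in internal compliance tooling. The sketch below encodes the table's rows as data; the structure and names are hypothetical, and the labels are shortened paraphrases of the obligations rather than legal text.

```python
from typing import List

# (obligation, still applies under the open-source exemption, GPAISR only)
OBLIGATIONS = [
    ("Art. 53(1)(a) model documentation", False, False),
    ("Art. 53(1)(b) documentation for downstream providers", False, False),
    ("Art. 53(1)(c) copyright policy", True, False),
    ("Art. 53(1)(d) public summary of training data", True, False),
    ("Art. 54 authorised representative (third-country providers)", False, False),
    ("Art. 55 systemic risk obligations", False, True),
]


def applicable_obligations(open_source_exempt: bool, is_gpaisr: bool) -> List[str]:
    """List the obligations that apply to a given kind of provider."""
    result = []
    for name, kept_under_exemption, gpaisr_only in OBLIGATIONS:
        if gpaisr_only and not is_gpaisr:
            continue
        # The open-source exemptions never apply to GPAISR models.
        if open_source_exempt and not is_gpaisr and not kept_under_exemption:
            continue
        result.append(name)
    return result


# Open-source GPAI model without systemic risk.
for obligation in applicable_obligations(open_source_exempt=True, is_gpaisr=False):
    print(obligation)
```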
TL;DR: Open-source GPAI model providers must comply with EU copyright law and publish a training data summary using the AI Office's template, while being exempt from transparency and documentation obligations. Providers of open-source GPAISR models must comply with all obligations in Articles 53-55. The Code of Practice provides voluntary guidance for compliance with most obligations, including measures for transparency and documentation, copyright compliance, and safety and security requirements for managing systemic risks.
We give a brief overview of the compliance requirements and measures for open-source GPAI developers based on the AI Act text itself and official guidance: the Code of Practice (CoP), the GPAI guidelines, and the template for the public summary of training data. As a reminder, this is not legal advice but gives insight into which provisions might apply to you if you qualify as a GPAI model provider and what you may do to comply. As mentioned above, GPAI models that are developed and distributed only for research purposes are fully exempt.
Most of this guidance comes from the GPAI CoP, which is a voluntary framework designed to facilitate compliance with the obligations for providers of GPAI and GPAISR models. Once it is endorsed by EU Member States and the European Commission, providers who voluntarily sign it may adhere to it as a means to demonstrate their compliance. That means the CoP is one way of complying with the AI Act's rules for GPAI models, but providers who choose not to follow it are still obliged to comply with the obligations in another way they deem fit for purpose. In any case, compliance with the AI Act's rules will be assessed by relevant authorities.
For your convenience, here is a checklist of measures that providers of open-source GPAI models must take to comply with their obligations:
You must implement a policy to comply with EU law on copyright. While the AI Act does not specify the format of such a policy, the CoP provides one possible approach to compliance by doing the following:
You must publish a summary of training data using the AI Office's template:
N.B.: If your model is classified as GPAISR, all obligations in Articles 53, 54, and 55 apply.
TL;DR: Open-source GPAI model providers are exempt from transparency obligations if they publicly share model architecture information and use a license that qualifies as free and open-source, while open-source GPAISR model providers are not exempt and may follow the guidance in the transparency chapter of the Code of Practice. Providers of finetuned or modified open-source GPAISR models only become subject to these obligations if their modifications require more than one-third of the original model's training compute, in which case their responsibilities are limited to documenting their specific changes.
Providers of open-source GPAI models are exempt from the transparency obligations, so adhering to the measures in the transparency chapter or filling out the form is not obligatory. Providers of open-source GPAISR models are not exempt and may adhere to the transparency chapter of the Code of Practice, which outlines three measures for documenting and sharing essential information about a model's development, capabilities, and limitations. The measures include publicly disclosing contact information for requesting access to documentation; making relevant documentation available and accessible to the AI Office, market surveillance authorities, and downstream users upon request; and keeping documentation up-to-date, secure, and retained for 10 years after placing the model on the EU market.
To streamline compliance, the transparency chapter includes a Model Documentation Form for collecting all required information about a model's properties, methods of distribution, licenses, use, training process, training data, computational resources, and energy consumption. It makes it easier for providers to compile the required documentation, and ensures that both regulatory authorities and downstream AI system providers have access to the information they need to understand model capabilities and fulfill their own regulatory obligations.
What do I need to do if I finetune an existing GPAI or GPAISR model? As mentioned above, you will be considered a provider only if your modification leads to a significant change in the model's generality, capabilities, or systemic risk. If you qualify as a provider by this calculation, the transparency chapter clarifies that your documentation and transparency commitments should be proportionally limited to the modifications or fine-tuning performed, recognizing that you may not have access to or control over the base model's development process.
TL;DR: The copyright chapter provides guidance, including five measures complete with requirements and encouraged actions, for how providers of open-source GPAI or GPAISR models can put in place a policy to comply with EU law on copyright and related rights.
Providers of GPAI and GPAISR models are not exempt from putting in place a policy to comply with EU law on copyright and related rights. The copyright chapter of the CoP outlines 5 measures that providers may implement to comply with their obligation. In Table 5, we distill the requirements and encouraged actions mentioned under each measure in the chapter.
| Measure | Requirements | Encouraged actions |
| Measure 1.1 requires providers to establish and maintain a copyright policy document that incorporates all 5 measures. | | |
| Measure 1.2 provides guidance on reproducing and extracting only lawfully accessible, copyright-protected content when crawling the web. | | |
| Measure 1.3 mandates the identification and compliance with rights reservations, including following robots.txt protocols and other machine-readable standards. | | |
| Measure 1.4 requires implementing technical safeguards to prevent copyright-infringing outputs and prohibiting such uses in acceptable use policies or model documentation. | | |
| Measure 1.5 establishes communication requirements by designating contact points for rightsholders and implementing complaint mechanisms for copyright-related issues. | | |
Table 5: Measures, requirements, and encouraged actions in the CoP Copyright Chapter (source: European Commission, CoP for General-Purpose AI Models Copyright Chapter)
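For Measure 1.3, a crawler can consult a site's robots.txt before collecting a page. The sketch below uses Python's standard `urllib.robotparser` on an inline example file; the user agent name, the rules, and the URLs are hypothetical. It covers robots.txt only, which is one of several machine-readable ways that rights may be reserved.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI crawler is excluded from the whole site,
# and every other crawler is excluded from /private/ only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: example-ai-crawler
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt content allows this agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "https://example.org/articles/1"
print(may_fetch(EXAMPLE_ROBOTS_TXT, "example-ai-crawler", page))
print(may_fetch(EXAMPLE_ROBOTS_TXT, "some-other-bot", page))
```

In a real crawler, the file would be fetched per host (for example with `RobotFileParser.set_url` and `read`) and cached, and disallowed URLs would be skipped before any download.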
TL;DR: Providers of GPAI and GPAISR models must provide a public summary of their training data using the AI Office's template, including general model information, datasets used, and data processing aspects. The summary must be written in simple narrative form and published on official websites and distribution channels when placing models on the EU market.
The AI Office published a template for GPAI and GPAISR providers to make a sufficiently detailed summary of training data publicly available, as required by Article 53(1d). This summary must be made publicly available on a provider's official website and all distribution channels (e.g., open repositories) when placing the model on the EU market.
The objective of this summary is to increase transparency about the data that is used throughout all stages of the training of GPAI models (from pre-training to post-training, including model alignment and finetuning), including text and data protected by EU law on copyright and related rights, while protecting trade secrets and confidential business information.
The template contains 3 sections (general model information, main datasets used, and relevant data processing aspects) with clear and short instructions to allow providers to report the required information in an easy and uniform manner. For reference, check out this public summary of training data for SmolLM3.
The Explanatory Notice provides the following clarifications to help you fill in the template:
TL;DR: The safety and security chapter of the Code of Practice outlines 10 commitments that providers of open-source GPAISR models may follow to comply with their obligations prescribed in Article 55. The requirements are designed with proportionality principles that scale with the systemic risks and provider capacity, with simplified compliance pathways for small and medium-sized enterprises (SMEs) and small mid-cap enterprises (SMCs), including startups.
While there are currently no open-source GPAISR models, providers of open-source GPAISR models will be subject to additional safety and security obligations prescribed in Article 55(1a-d). If you qualify as a provider of such a model, the safety and security chapter of the CoP outlines 10 commitments that you can meet to manage systemic risks throughout the entire model lifecycle and comply with these obligations. These include but are not limited to:
The commitments are designed around two proportionality principles:
Developers can leverage a number of open-source tools to adhere to many of these measures. For example, for the risk assessment and model evaluations, open-source frameworks like LM Evaluation Harness, lighteval, and Inspect enable standardized LLM evaluations, while platforms like Weights & Biases offer experiment tracking tools for continuous monitoring of models. For safety mitigations, developers can leverage data curation tools or red teaming frameworks, while the NIST AI Risk Management Framework provides best practices for responsible model development and deployment. For the documentation requirements, developers can continue to use already familiar model cards and dataset cards.
Given that the obligations for providers of GPAI models apply starting August 2, 2025, it is urgent that we raise the community's readiness for these obligations. By telling others about these obligations and sharing this guide, you can help get the community ready!
Join the conversation! We're building follow-up resources on compliance tools and best practices, but we need your input to make them truly useful. Whether you have questions about this guide, tools and workflows to share, or want to help identify what's still missing, reach out! Let's work together to get the community ready for AI Act compliance.
This guide was written as a collaboration between researchers at Hugging Face, the Mozilla Foundation, and the Linux Foundation by Cailean Osborne, Maximilian Gahntz, Lucie-Aimée Kaffee, Bruna Trevelin, Brigitte Toussignant, and Yacine Jernite. We additionally thank Steve Winslow for helpful review and advice. The views expressed are those of the individual authors and do not necessarily reflect the positions of their respective organizations. Please cite as:
@techreport{osborne2025euaiact,
title={What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI models},
author={Osborne, Cailean and Gahntz, Maximilian and Kaffee, Lucie-Aim{\'e}e and Trevelin, Bruna and Toussignant, Brigitte and Jernite, Yacine},
institution={Hugging Face, Mozilla Foundation, and Linux Foundation},
year={2025},
url={https://hf.co/spaces/hfmlsoc/eu-ai-act-os-guide-gpai},
type={Guide}
}