Major AI companies are facing scrutiny from the European Commission as they struggle to comply with new transparency requirements under the EU AI Act. While the regulation demands detailed disclosures about training data sources, many leading commercial developers have provided only vague descriptions instead of specific information about the datasets powering their models.
The compliance gap has emerged as a critical issue ahead of formal enforcement, which the Commission can begin in August 2026. Under the AI Act, developers of large foundation models must reveal where their training data comes from, allowing content creators to determine whether their copyrighted material has been used without permission. These transparency obligations took effect for general-purpose AI models on August 2, 2025, marking a significant shift in how AI development is regulated in Europe.
Open-Source Providers Lead While Commercial Giants Lag
The contrast between different types of AI developers is stark. Open-source providers such as Hugging Face have already published detailed disclosures of their data sources, demonstrating that compliance is technically achievable. Leading commercial developers, by contrast, have so far offered only broad descriptions of data usage rather than the specific sources required by law.
The European Commission published a template in July 2025 requiring providers to give an overview of the data used to train their models. This includes the sources from which data was obtained, such as large datasets and top domain names, along with information about data processing methods. The template aims to enable parties with legitimate interests to exercise their rights under EU law.
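The template itself is a structured document rather than code, but its shape is easy to picture. The Python dictionary below is a hypothetical sketch of the kind of summary a provider might publish under it; every field name here is an illustrative assumption, not the Commission's official schema.

```python
# Hypothetical sketch of a public training-data summary in the spirit of
# the Commission's July 2025 template. Field names are illustrative only.
training_data_summary = {
    "model_name": "example-model-1",      # hypothetical model
    "provider": "Example AI Ltd.",        # hypothetical provider
    "modalities": ["text", "image"],
    "data_sources": {
        "large_public_datasets": ["Common Crawl"],
        "top_scraped_domains": ["example.org", "example.com"],
        "licensed_data": "news archives under commercial licence",
        "user_data": "opt-in feedback from product users",
        "synthetic_data": "model-generated instruction pairs",
    },
    "processing": {
        "deduplication": True,
        "illegal_content_filtering": True,
        "opt_outs_respected": ["robots.txt", "TDM reservations"],
    },
}

# A rights holder with a legitimate interest could scan the summary
# for a domain they control:
if "example.org" in training_data_summary["data_sources"]["top_scraped_domains"]:
    print("Domain appears in the provider's disclosed scraping sources.")
```

Even a coarse structure like this would let regulators and creators query disclosures programmatically rather than reading prose descriptions.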
Grace Period Extends as Enforcement Looms
Companies that released models after August 2025 are already subject to the transparency obligations, but they benefit from a de facto grace period because the Commission cannot impose fines until formal enforcement begins in August 2026. The Commission has nonetheless indicated willingness to impose fines once that power takes effect, and it continues to assess whether newer models fall under immediate obligations.
The stakes are substantial. Violations of transparency and high-risk obligations can trigger administrative fines of up to 15 million euros or three percent of total worldwide annual turnover, whichever is higher. For global enterprises, this makes compliance a first-order financial risk rather than a secondary legal consideration.
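The "whichever is higher" rule means exposure scales with revenue. The minimal Python sketch below makes the arithmetic concrete: for any company with more than 500 million euros in worldwide annual turnover, the three percent cap exceeds the fixed 15 million euro cap.

```python
def max_fine_eur(worldwide_annual_turnover_eur: float) -> float:
    """Upper bound on an AI Act transparency/high-risk fine:
    EUR 15 million or 3% of worldwide annual turnover, whichever is higher."""
    return max(15_000_000, 0.03 * worldwide_annual_turnover_eur)

# Example: a firm with EUR 2 billion turnover faces up to EUR 60 million.
print(f"{max_fine_eur(2_000_000_000):,.0f}")  # 60,000,000
```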
Broader Transparency Rules Take Effect August 2026
The pressure on AI developers will intensify as broader transparency obligations under Article 50 of the AI Act become fully enforceable on August 2, 2026. These requirements apply to specific use cases including interactive AI, deepfakes, and synthetic content generation, regardless of whether the system is classified as high-risk.
The AI Act also introduces mandatory labelling for AI-generated content. Any platform or service that publishes text, audio, images, or video generated by AI must clearly mark it as artificial. The goal is to help users distinguish between human and synthetic content, reducing the risk of misinformation, deepfakes, and manipulated media.
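The Act mandates marking but does not prescribe a single technical format, so the following is only an assumption about what machine-readable labelling could look like: a hypothetical Python helper that pairs generated content with provenance metadata. Real deployments might instead rely on established content-provenance standards or visible on-screen disclosures.

```python
from datetime import datetime, timezone

def label_ai_content(payload: bytes, media_type: str, generator: str) -> dict:
    """Hypothetical wrapper attaching a machine-readable 'AI-generated'
    disclosure to a piece of content. The schema is illustrative; the
    AI Act requires marking but does not mandate this format."""
    return {
        "content": payload,
        "media_type": media_type,
        "provenance": {
            "ai_generated": True,
            "generator": generator,  # e.g. a model or service name
            "labelled_at": datetime.now(timezone.utc).isoformat(),
        },
    }

item = label_ai_content(b"...", "text/plain", "example-model-1")
assert item["provenance"]["ai_generated"] is True
```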
Copyright Protection Creates New Obligations
Under the EU Copyright Directive, creators can now reserve their rights and prevent their work from being used in AI training. From 2026, AI developers must check whether a data source or website has a copyright reservation, exclude or license that content before using it in training, and keep evidence showing compliance. This represents a major shift toward content ownership and ethical AI development.
The training-content disclosures are intended to offer a minimal baseline of transparency, covering the use of public datasets, licensed material, and scraped websites. The requirements give regulators and users visibility into how AI models learn and ensure that creators’ rights are not ignored in the training process.
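Neither the AI Act nor the Copyright Directive fixes one technical protocol for rights reservations, but machine-readable opt-outs such as robots.txt rules aimed at AI crawlers are one widely used signal. The sketch below, using Python's standard urllib.robotparser, is a first-pass filter under that assumption; it is not a complete legal compliance check, and the crawler user-agent name is illustrative.

```python
from urllib import robotparser

def may_crawl_for_training(site: str, user_agent: str = "ExampleAIBot") -> bool:
    """First-pass check of a site's robots.txt for an AI-crawler opt-out.
    robots.txt is only one reservation signal; a real pipeline would also
    check licence terms and other machine-readable reservations, and keep
    evidence of each decision as the new duties require."""
    rp = robotparser.RobotFileParser()
    rp.set_url(f"https://{site}/robots.txt")
    rp.read()  # network fetch; may raise if the host is unreachable
    return rp.can_fetch(user_agent, f"https://{site}/")

# Log the decision so there is evidence of the check.
decision = may_crawl_for_training("example.org")
print(f"example.org crawlable for training: {decision}")
```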
International Tensions Over Digital Regulation
The transparency issue is likely to become politically sensitive, as stricter enforcement could affect US-based technology firms and intensify transatlantic tensions over digital regulation. The European Commission must also determine whether training AI models on scraped web data pursues a legitimate interest under the General Data Protection Regulation, whether the data collection is necessary for that purpose, and whether the interests of data subjects override those of developers.
Companies must now treat AI models like any other piece of critical enterprise software. Governance professionals are finding that black-box models are no longer defensible under the new standard: the legal requirement is moving from demonstrating a general understanding of a model to documenting the specific process of its creation.
Path Forward for AI Companies
To ease the transition to the new regulatory framework, the Commission launched the AI Pact, a voluntary initiative that supports early implementation by inviting AI providers and deployers from Europe and beyond to meet key obligations of the AI Act ahead of schedule. In parallel, the AI Act Service Desk provides information and support for smooth and effective implementation across the EU.
The Commission published three key instruments in July 2025 to support responsible development and deployment of general-purpose AI models. These include guidelines on the scope of obligations for providers, a voluntary Code of Practice offering practical guidance on transparency, copyright, and safety, and the template for public summaries of training content.
Transparency under the AI Act may therefore test both regulatory resolve and international relations as implementation moves closer. Companies that prepare early will earn greater trust from users and enterprise clients, retain access to EU markets without legal barriers, and build a stronger compliance reputation with investors and regulators. Those that delay risk last-minute compliance costs and possible restrictions from the European market.
