Artificial Intelligence and copyright

The rapid development of artificial intelligence (AI) is confronting traditional copyright law with unprecedented challenges.

As a specialized law firm, we are noticing an increasing demand for legal advice on the interplay between AI technologies and intellectual property rights. Below we discuss the main legal issues arising at the intersection of AI and copyright, and offer insights into how this new reality is transforming the copyright landscape in Belgium and Europe.

The legal framework

European and Belgian legislation

The copyright framework in Belgium (the Code of Economic Law) has undergone a number of changes as of Aug. 1, 2022, due to the implementation of the European Directive on Copyright in the Digital Single Market (DSM Directive). This transposition has important implications for the training of AI models, as it introduces an exception for text and data mining (TDM).

At the same time, the EU AI Act introduces a new regulatory framework that directly impacts AI systems interacting with copyrighted works. Although primarily focused on risk management, this regulation includes provisions relevant to the use of copyrighted content in AI models. These provisions are applicable as of Aug. 1, 2025.

Case law in development

The case law surrounding AI and copyright is still evolving. Proceedings relating to AI and copyright are being initiated both inside and outside the EU; no Belgian case law is known to date.

For Belgium, court proceedings in other EU member states are particularly important because they provide further interpretation of the relevant provisions of the DSM Directive and the AI Act (such as the Kneschke/LAION case in Germany or the DPG Media/HowardsHome case in the Netherlands).

Training AI models

Training AI models and copyright

Belgian copyright law grants protection to original works, protecting the concrete form of expression rather than the underlying ideas. This protection includes exclusive rights for the author or rights holders to reproduce the work or make it available to the public.

A fundamental question in the current legal debate is whether the training process of AI models involves copyright-relevant acts. Indeed, training these models involves the use of extensive data sets, which often include copyrighted works.

From a legal perspective, two contrasting views exist:

  • On the one hand, some argue that during the training process, temporary reproductions of protected works are made at different stages: when the data are collected and stored, during their processing, and when the works are loaded into computer memory. These actions would fall under the exclusive reproduction right of the author and thus in principle require authorization.
  • On the other hand, others argue that AI training mainly extracts non-protected elements from works. The process would distill abstract patterns and concepts (so-called "embeddings") from the works, while the copyrighted expressions are not preserved in the final model. According to this view, there is no infringement as long as the AI model does not reproduce literal or substantially similar passages from the original works.

This legal debate remains undecided for now, and interpretation by courts in future litigation will likely determine the copyright status of AI training processes.
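
The second view, that training distills abstract representations rather than copies, can be made concrete with a toy sketch. This is a deliberate simplification (real embeddings are dense vectors learned by neural networks, not word counts); it shows only that such a representation need not preserve the concrete expression of a work:

```python
from collections import Counter

def embed(text: str, vocab: list[str]) -> list[int]:
    """Toy 'embedding': count how often each vocabulary word occurs.
    Only aggregate statistics survive; word order and phrasing are lost."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

vocab = ["copyright", "ai", "model"]
v1 = embed("AI model training and copyright", vocab)
v2 = embed("copyright questions about training an AI model", vocab)
# Two differently phrased sentences map to the same vector [1, 1, 1]:
# the original expression cannot be reconstructed from the representation.
```

Whether the representations inside an actual generative model are equally "lossy" is precisely what the legal debate turns on: where memorization occurs, expression may survive in the model after all.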

The TDM exception applied to training AI models

Since Aug. 1, 2022, Belgian copyright law contains two specific exceptions for text and data mining (TDM) that are directly relevant to training AI models. These exceptions are of particular interest if we assume that reproductions of copyrighted works are made during the training process.

The training process of AI models bears strong similarities to what the legislature defines as TDM: "A computerized analysis technique aimed at decomposing text and data in digital form to generate information such as, but not limited to, patterns, trends and interrelationships." (Article I.13, 10° WER)

The law provides two categories of exceptions to the author's exclusive right of reproduction:

  1. TDM for scientific research (Article XI.191/1, §1, 7° WER)
    This exception applies specifically to research organizations and heritage institutions that perform TDM for scientific research. What is important here is that rights holders cannot prohibit this form of use. Research organizations and heritage institutions therefore enjoy an absolute exception.
  2. TDM for other purposes (Article XI.190, 20° WER)
    This broader exception covers all other forms of TDM, including commercial applications such as corporate training of generative AI models. However, this category is subject to an important limitation: rights holders retain the right to prohibit the use of their works for these purposes through an "opt-out" system. This opt-out must be appropriately communicated, specifically through machine-readable means.

The AI Act has confirmed that the TDM exception from the DSM Directive applies to the training of AI models. Recital 105 of the AI Act explicitly explains this relationship: "The development and training of [general purpose AI models] requires access to vast amounts of text, images, videos and other data. In this context, text and data mining techniques can be widely used for the collection and analysis of such content, which may be protected by copyright and related rights. [...] Directive (EU) 2019/790 introduced exceptions and limitations that allow, under certain conditions, the reproduction or extraction of works or other materials for text and data mining purposes." At the same time, the preamble recalls the limits of these exceptions, in particular by emphasizing: "Under these rules, rights holders may choose to reserve their rights to their works or other materials in order to prevent text and data mining, unless it is for purposes of scientific research. Where opt-out rights are expressly and appropriately reserved, a provider of a general purpose AI model that wishes to use the works for text and data mining must seek permission from the rights holders."

By contrast, under U.S. copyright law a fair use defense for AI training does not always appear to hold.

Transparency requirements by the AI Act

In addition, the AI Act adds an important layer of transparency requirements to the use of copyrighted works for AI training. These obligations are set forth in Article 53 of the AI Act, which deals with the obligations of providers of general-purpose AI models (GPAI).

Key transparency obligations include:

  1. Documentation requirement on training data: In accordance with Article 53(1)(a), AI model providers must prepare and maintain up-to-date technical documentation of the model, including its training and testing process and the results of its evaluation, which must include the information listed in Annex XI of the AI Act.
  2. Summary of content used: Article 53(1)(d) requires providers to prepare and make publicly available a sufficiently detailed summary of the content used for training the general-purpose AI model, according to a template provided by the AI Office.
  3. Compliance with opt-out mechanisms: Article 53(1)(c) requires providers to put in place a policy to comply with copyright law and, in particular, to identify and comply with, including through state-of-the-art technologies, a reservation of rights expressed pursuant to Article XI.190, 20° WER. This provision refers directly to the opt-out possibility for rights holders. Research organizations and heritage institutions performing TDM for scientific research under Article XI.191/1, §1, 7° WER are not subject to this reservation.
  4. Disclosure for downstream providers: Article 53(1)(b) requires the preparation and availability of information and documentation for providers wishing to integrate the AI model into their AI systems, while respecting intellectual property rights and confidential business information.

Copyright protection for AI outputs?

Human origin as a condition

Belgian copyright law protects "works of literature and art" that are original. In a series of rulings (Infopaq, Painer, Brompton Bicycle), the Court of Justice of the European Union has clarified the criterion of originality as "an author's own intellectual creation" that reflects the "personality of the author." This interpretation implicitly presupposes a human origin of the work.

In other words, copyright is designed to protect human creativity. This premise is supported by Article XI.170 of the Code of Economic Law, which states that the author is the natural person who created the work. The term "natural person" refers in a legal context to physical persons, not to an AI system.

Different scenarios of AI engagement

When assessing the copyright protectability of AI output, it is essential to distinguish different degrees of AI involvement:

  1. AI as a mere tool: When AI is used as a technical tool under direct human control (similar to a photo camera or word processor), where the creative choices are predominantly made by the human user, the resulting work may enjoy copyright protection. The human user is then considered the author.
  2. Human-AI co-creation: In scenarios where both humans and the AI system make substantial creative contributions, a more complex situation arises. Copyright could be assigned to the human share in the creation, but only for the elements that actually result from human creativity.
  3. Autonomous AI creation: For works generated almost entirely autonomously by an AI system, with minimal or trivial human input (e.g., by entering a simple prompt), copyright protection is most problematic. Without a clear "own intellectual creation" by a human author, such works may not satisfy the originality requirement as interpreted by the CJEU.

Can AI output constitute copyright infringement?

Under Belgian copyright law, copyright infringement occurs when a protected work is reproduced or communicated to the public without authorization. Likewise, a work may not be adapted or modified without the author's consent (Art. XI.165, §1 WER).

Specific challenges in AI systems

AI systems, especially large language models (LLMs) and generative image models, are trained on huge datasets that often contain copyrighted works. This leads to a number of specific challenges:

  1. Memorization of training materials: AI models may "memorize" training material and then reproduce or paraphrase it in their output. This risk is particularly high for unique, distinctive works or for works that are over-represented in the training data.

    When an AI system literally reproduces substantial parts of a protected work (such as entire passages from a book, recognizable melodies from a piece of music, or visually identical elements from a work of art), the risk of infringement is most obvious.

    With AI-generated works, this assessment can be complex because (i) the output is often a mix of elements from many different sources, (ii) the path from training data to output is difficult to trace (the "black box" problem), (iii) the similarity may stem from common cultural elements or genres that are not protected.
  2. Style imitation: AI systems can "learn" and mimic the style, aesthetics or other characteristics of specific authors, artists or works, sometimes at the user's request ("generate a text in the style of author X").

    However, copyright does not protect style, genre, or general concepts, but only their concrete expression.

    This means that pure style imitation without adoption of concrete expressive elements is unlikely to constitute infringement. However, a combination of style imitation with adoption of specific, characteristic elements could potentially constitute infringement. The line between the two situations is often difficult to define, especially in the case of AI output that can seamlessly blend style and concrete elements.
  3. Derivative works: AI output may combine elements from different protected works or build on existing works, potentially leading to unauthorized derivative works.

    There is generally no infringement, however, when the AI is trained on a work but produces a fundamentally different output, when it combines elements from different works into something substantially new, or when it extracts concepts from works and applies them in a completely new way.
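
Whether output actually reproduces protected expression is ultimately a factual question. As a purely illustrative sketch (not a legal test; courts weigh much more than word counts), verbatim overlap between a source text and an AI output can be approximated with word n-grams:

```python
def ngram_overlap(source: str, output: str, n: int = 5) -> float:
    """Fraction of the source's word n-grams that reappear verbatim
    in the output: a rough proxy for literal reproduction."""
    def ngrams(text: str) -> set[tuple[str, ...]]:
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    src = ngrams(source)
    if not src:
        return 0.0
    return len(src & ngrams(output)) / len(src)
```

An overlap near 1.0 signals literal copying of the source's phrasing; a low score says little by itself, since paraphrase and non-literal similarity escape such a check.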

How a specialized lawyer can assist you

When training AI models

An attorney specializing in copyright law and technology law can provide crucial support with:

  • Legal risk analysis: Evaluating the legality of your training data and identifying potential copyright risks before the training process begins.
  • Licensing Strategies: Developing a strategy for obtaining necessary licenses for training materials, including negotiating with rights holders and collecting societies.
  • Compliance with TDM exceptions: Ensuring that your training activities fully comply with the terms of the text and data mining exceptions in copyright law, including compliance with opt-out mechanisms.
  • Documentation and evidence: Establishing robust documentation processes that can demonstrate that training was conducted legitimately, which can be crucial in the event of litigation.
  • Contractual protection: Establishing contracts with data providers, cloud providers and other partners involved in the training process, with adequate safeguards and indemnities.

When protecting AI-generated output

For questions surrounding the protectability of AI output, a copyright lawyer can assist with:

  • Developing protection strategy: Develop a customized strategy for protecting your AI-generated works, taking into account legal uncertainties and jurisdictional differences.
  • Maximizing protective capabilities: Advise on how to optimize human creative input into the AI generation process to increase the likelihood of copyright protection.
  • Alternative forms of protection: Explore other legal instruments such as trade secrets, database right, or contractual mechanisms when copyright protection is uncertain.
  • Registration and documentation: Assist in registering works with copyright authorities whenever possible, and in establishing proof systems that document the origin and ownership of works.
  • International protection strategies: Advise on jurisdiction-specific approaches, taking into account the different ways countries treat AI-generated works.

In managing infringement risks

A copyright lawyer can provide indispensable support in managing risks of copyright infringement from AI output:

  • Due diligence of AI systems: Conducting legal audits of AI systems to identify infringement risks before implementation.
  • Prevention Strategies: Developing procedures and control mechanisms to minimize the risk of infringing AI output.
  • Response Protocols: Establish clear protocols for when a claim of infringement is received or when infringing output is identified.
  • Defense against claims: Representing your interests in infringement cases, using specialized defenses specific to AI context.
  • Negotiations and settlements: Negotiating with rights holders to resolve disputes out of court, often with more favorable terms than through court proceedings.

In compliance with the AI Act and other regulations

As AI becomes increasingly regulated, a lawyer can provide essential support for:

  • Transparency obligations: Complying with transparency obligations under the AI Act regarding training materials and copyright issues.
  • Documentation and reporting: Establish documentation and reporting systems that meet the requirements of the AI Act and other relevant regulations.
  • Policy Development: Develop internal policies and procedures to consistently meet the various legal requirements around AI and copyright.
  • Monitoring legislation: Monitoring rapidly evolving regulations and case law, and advising on their impact on your operations.

Contact

Questions? Need advice?
Contact Attorney Joris Deene.

Phone: 09/280.20.68
E-mail: joris.deene@everest-law.be
