Generative AI systems are "trained" by analyzing vast amounts of data. This data often includes copyright-protected material, from which the AI learns patterns and styles in order to generate new content of its own. Under European legislation, notably the DSM Directive, this is permitted under certain conditions through an exception for Text and Data Mining (TDM). Belgium implemented this possibility in the Code of Economic Law (CEL) by means of two exceptions: one for research organizations conducting scientific research (Art. XI.191/1, 7° CEL) and one open to everyone for all other purposes (Art. XI.190, 20° CEL). Under this second exception, TDM is allowed for commercial purposes, such as training commercial AI models. However, authors and other rights holders have an opt-out under this second exception and can thus oppose such use. Separately, the question arises whether relying on the two TDM exceptions for AI training is consistent with the three-step test.
The three-step test
Any copyright exception, including those for TDM, must satisfy the so-called three-step test. This test is a cornerstone of international copyright law, found in the Berne Convention, the TRIPS Agreement and various EU directives, and codified in Belgium in Article XI.192/3 CEL. Under the test, an exception is valid only if three cumulative conditions are met:
- It should only concern certain, special cases.
- It should not conflict with a normal exploitation of the work.
- It must not unreasonably prejudice the legitimate interests of the author.
The discussion of AI training focuses mainly on the second step. According to an influential interpretation by a WTO Panel in 2000, "all forms of exploiting a work, which have, or are likely to acquire, considerable economic or practical importance" must remain reserved to authors. The question, then, is whether TDM for AI training is such a form of exploitation.
Legal analysis and interpretation: a two-pronged approach
To assess the impact of AI training, we must make an essential distinction between the input (training the AI with existing works) and the output (the content newly generated by the AI).
The input
At first glance, the argument seems straightforward: if AI developers use works, they should have to pay for a license. A TDM exception deprives authors of this revenue, which would conflict with normal exploitation.
Yet this argument is legally weak for two reasons:
- The test focuses on the micro-level: The three-step test requires an analysis of the exploitation of "a work," that is, at the level of a single individual creation. In practice, the potential licensing income for the use of a single book or a single photograph for TDM will never be of "considerable economic importance." Substantial revenue arises only when licensing entire portfolios or catalogs, but the test does not operate at this macro level.
- The opt-out neutralizes the conflict: The ability for rights holders to reserve (opt out) their rights against AI training for commercial purposes is legally decisive here. If an author shields his work, the AI developer may not use it without permission. As a result, there can be no conflict with normal exploitation at the input stage: the author retains full control over this potential licensing market. If an author does not exercise the opt-out, it can be argued that he or she accepts the risk of insufficient demand for a TDM license.
The output
The discussion then shifts to output. What if an AI, trained on the work of authors A, B and C, generates content so good that no one wants to buy author A's new work? This undermines the market and thus normal exploitation.
The EU Court of Justice has held in earlier cases (ACI Adam, Filmspeler) that an exception harms normal exploitation if it encourages the distribution of illegal copies, leading to fewer sales of legal works.
However, there is a crucial difference with GenAI:
- The output of an AI is rarely an exact copy of a specific work. Only in exceptional cases will an AI reproduce protected, original elements of a work. When that happens, the author can act directly on the basis of his exclusive rights without having to invoke the three-step test.
- Usually, the AI imitates unprotected elements such as ideas, concepts and styles. The competition is therefore indirect and diffuse. It is not an illegal copy replacing a legal sale, but a new, alternative work. The legal link to one specific work from the training data is broken, making proof of a conflict with the normal exploitation of that work almost impossible.
The way out: the third step and the call for fair compensation
Because the second step of the test ("normal exploitation") proves to be too high a hurdle, rights holders must turn to the third step: demonstrating unreasonable harm to their legitimate interests. This criterion is more flexible and does allow for a macro-level analysis. Here one can take into account the overall, market-distorting impact of GenAI on certain creative sectors.
The solution, however, is not to ban the TDM exception entirely. It has been argued for decades that equitable compensation can be a way to reduce an "unreasonable" harm to an acceptable, "reasonable" level. Thus, the struggle of rights holders should focus on establishing compensation mechanisms for the commercial use of their works for AI training to offset the economic impact. It should be noted that TDM for non-commercial purposes, such as scientific research or investigative journalism, is likely to be justified under the three-step test because of societal interest and thus does not cause unreasonable harm.
What this means specifically: strategic advice
- For authors, artists and publishers:
- Activate your opt-out: This is your strongest weapon. Make sure you explicitly state in a machine-readable way (e.g., in your terms and conditions or robots.txt file) that you reserve the right to TDM. This creates a clear legal boundary.
- Focus on the third step: In legal discussions and lobbying, it makes more strategic sense to argue that there are "unreasonable damages" that need to be compensated, rather than trying to prove that "normal exploitation" is being violated.
- Document the market impact: Collect data on how AI-generated content affects demand and prices in your industry. This can be crucial to demonstrate unreasonable harm.
- For AI developers:
- Respect the opt-out scrupulously: Ignoring an explicit reservation turns commercial AI training into a direct copyright infringement. Invest in robust systems to detect and comply with these reservations.
- Be prepared for compensation models: The legal and political winds are blowing in the direction of compensation schemes. Anticipate this by exploring licensing models and budgeting for the remuneration of rights holders.
- Analyze the output: Although the TDM exception covers inputs, you remain fully liable for outputs. Ensure that your systems do not generate content that directly infringes on existing works.
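The machine-readable opt-out that both sides must handle can be illustrated concretely. The sketch below uses Python's standard `urllib.robotparser` to show both perspectives: a site publishing a robots.txt that reserves its content against known AI crawlers, and the developer-side check that should run before any page is fetched for training. The user-agent names `GPTBot` and `CCBot` are real crawler identifiers, but the robots.txt content and URLs are illustrative assumptions only, not legal advice on what constitutes a valid reservation under Art. XI.190, 20° CEL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a publisher opting out of commercial AI crawling.
# The blanket "Disallow: /" for the named AI bots expresses the reservation;
# ordinary visitors and search crawlers remain unaffected by the final rule.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

# Developer-side compliance check: parse the reservation and test each
# user agent against a page before crawling it for training data.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("GPTBot", "https://example.com/article"))        # AI crawler: blocked
print(parser.can_fetch("Mozilla/5.0 (X11)", "https://example.com/article"))  # regular browser: allowed
```

In practice a crawler would fetch the live `/robots.txt` (e.g. via `parser.set_url(...)` and `parser.read()`) rather than a hard-coded string, and may also need to honor other reservation mechanisms, such as terms of use or dedicated TDM reservation protocols.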
FAQ (Frequently Asked Questions)
As an author, how can I prevent my work from being used for AI training for commercial purposes?
You can invoke the opt-out in Article XI.190, 20° CEL by indicating in a machine-readable way that you reserve your rights for TDM. This can be done, for example, in the terms of use of your website or via protocols such as robots.txt. Note that this does not allow you to block AI training for scientific research.
Is the output of an AI trained on my work automatically a violation?
Not necessarily. There is infringement only if the AI output reproduces a substantial and original part of your specific work. If the output merely imitates your style, ideas or concepts (elements not protected by copyright), there is in principle no infringement.
What if an AI produces an almost exact copy of my protected work?
In that case, there is clear copyright infringement. The exception for TDM applies only to the input phase (training) and does not give license to infringe on the output phase (generating content). You can then take action against the dissemination of this particular output.
Conclusion
The TDM exception provides a framework within which AI innovation is possible in Belgium, while the opt-out gives authors a powerful control mechanism in the context of AI training for commercial purposes. However, the traditional arguments around the "normal exploitation" of a work seem insufficient to address the diffuse market damage caused by AI competition. The future of the debate lies in the third step of the test: proving unreasonable harm and enforcing fair compensation schemes for creators, a battle that will dominate the legal agenda for years to come.