The rise of generative AI, such as ChatGPT, has created a legal minefield. These models are trained on massive data sets, including billions of texts and images that are copyrighted. The central question lawyers worldwide are grappling with is: is this allowed?
A judgment of the Munich Regional Court (Landgericht München) of November 11, 2025 in the case GEMA v. OpenAI has sent shockwaves through the AI industry. The court held that OpenAI infringes copyright and, crucially, that the AI model itself constitutes an unlawful reproduction. This ruling contrasts sharply, however, with the recent view of the British High Court in the case of Getty Images v. Stability AI.
The facts and legal context
The German collective management organization GEMA (the counterpart of SABAM in Belgium) found that OpenAI's chatbot could reproduce the lyrics of nine well-known German songs (including “Atemlos durch die Nacht” by Helene Fischer) almost verbatim. It did so in response to very simple prompts, such as “What is the text of [song title]?”.
GEMA sued OpenAI for double copyright infringement:
- An infringement of reproduction rights by storing the texts in the language model itself.
- An infringement of the right of public communication (and again the reproduction right) by displaying the texts (or parts thereof) to the user (the output).
OpenAI defended itself by arguing that an AI model does not store texts the way a hard drive does. The model would only ‘learn’ statistical patterns and relationships between words. Moreover, OpenAI argued, the training process in any event falls under the Text and Data Mining (TDM) exception, which permits the analysis of large amounts of data.
The court's decision
The Munich court largely accepted GEMA's arguments and found against OpenAI. Its reasoning is technically detailed and legally significant for the entire European Union.
The AI model as infringing reproduction
The court dismissed OpenAI's technical defense. It ruled that the AI model itself constitutes a reproduction (a ‘copy’) of the song lyrics within the meaning of Article 2 of the European InfoSoc Directive, as transposed into § 16 of the German Copyright Act (UrhG) (in Belgium, Article XI.165 §1 of the Code of Economic Law (CEL)).
The German judges argued that the phenomenon of ‘memorization’ (the model's ‘remembering’ of training data) proves that the works are “reproducibly present” in the model. That the data is technically broken down into complex parameters or vectors is irrelevant. What matters is that the work can be made “indirectly observable” by technical means (in this case, the chatbot interface). Thus, the model is a form of “physical fixation” of the work.
The output as public communication
The court also found that the chatbot's output violates the right of public communication (or ‘making available’). ChatGPT's users constitute a “new public”, accessing the works in a new way. Moreover, OpenAI is directly and primarily liable for this. The court did not assign responsibility to the user who entered the prompt, given OpenAI's central and guiding role in the entire process.
Legal analysis and interpretation
A controversial decision: the focus on ‘memorization’
The German court adopts a functional approach: if the result (being able to reproduce a work) is the same as with traditional storage, it should legally also be treated as storage (and thus reproduction). This view is at odds with other recent case law.
The core of the German reasoning rests on ‘memorization’. The question is whether this is a fundamentally flawed premise, since ‘memorization’ is technically a ‘bug’ or unwanted side effect, also known as ‘overfitting’. Overfitting occurs when a model is trained on the same data too many times, causing it to ‘memorize’ the data rather than ‘learn’ from it.
It is legally far more relevant to look at the initial training phase: the act of copying protected works (e.g., song lyrics or photographs) to ‘feed’ the model. That is the primary act of reproduction that should be legally assessed, quite apart from the question of whether the model can later reproduce the work perfectly.
The German ruling is in stark contrast to the British High Court's recent decision in the case Getty Images v. Stability AI. In that case, which revolved around training an AI image generator on billions of photos, the judge explicitly ruled that the AI model (Stable Diffusion) “did not contain copies” of Getty's works and “does not store copyrighted material”.
The British court follows a much more technical, and arguably more convincing, line of reasoning. According to the British court, an AI model is not a ‘storage medium’ in the copyright sense. It does not store data, works or pixels. It exclusively stores complex mathematical relationships between data points (tokens or pixels) in its parameters.
According to this view, the German court's reasoning - that the ability to reconstruct a work is sufficient to label the model itself a ‘reproduction’ - is legally difficult to sustain.
The legal landscape in Europe is thus deeply divided on this point. While the German court focuses on functionality (can the model reproduce the work?), the British court looks much more strictly at architecture (how is the model actually constructed?).
The TDM exception: no release for memorization
OpenAI invoked the Text and Data Mining exception enshrined in Article 4 of the DSM Directive and, in Germany, in § 44b UrhG (in Belgium, Art. XI.190, 20° CEL). This exception allows reproductions for the purpose of automated analysis, provided the rights holder has not objected (an ‘opt-out’).
The Munich court recognized that AI training could in principle fall under the TDM exception. However, the judges followed split reasoning:
- Phase 1 (The Analysis): Training the model, analyzing data to extract patterns and information. The temporary copies required for this are potentially covered by the TDM exception.
- Phase 2 (The Model): The result of training. If the model not only analyzes the works but permanently stores them (memorization) for later reproduction, this is a new reproduction.
The court held that this second stage - the permanent, reproducible storage in the model - is not covered by the TDM exception. The exception allows copies “for the purpose of TDM” (i.e., analysis). However, persistent storage no longer serves the purpose of analysis, but the purpose of output generation. This impairs the normal exploitation of the work and falls outside the exception.
The court concluded sharply: if the current state of the art makes it impossible to prevent memorization, then training a model with protected data is not covered by the TDM exception.
What this specifically means
This German ruling, although appealable, has important implications for the AI industry across Europe.
- For AI developers: This is a legal minefield. The German ruling puts a heavy burden on developers. They can no longer comfortably hide behind the TDM exception in the EU if their model is capable of ‘memorizing’ and reproducing protected works. They will either have to technically modify their models to prevent memorization or proactively negotiate licenses with rights holders. In contrast, the British ruling gives them much more breathing room by holding that the model itself is not a copy. This legal uncertainty is problematic for innovation.
- For authors and publishers: The GEMA ruling significantly strengthens their negotiating position. With a court ruling in hand, they can now demand license fees for the use of their portfolios in AI models that can reproduce them. The British ruling tempers these expectations.
- For Belgian companies: Although this is a German judgment, it interprets European directives (Infosoc and DSM) that have also been transposed into the CEL in Belgium. It is unclear what line the Belgian courts will take. Will they take the functional (German) or the technical (British) approach? Companies deploying or developing generative AI tools should be very aware of the risk that these models may contain and generate infringing material.
Frequently asked questions (FAQ)
Does training an AI model not fall under the TDM exception?
The Munich court ruled that the TDM exception (Art. XI.190, 20° CEL) covers data analysis, but not the permanent storage (memorization) of the works in the model itself, with the aim of being able to reproduce them later.
Is an AI model itself now a ‘copy’ according to the court?
This is divisive in Europe. The German court ruled that it did. The court stated that the model, in which the song lyrics are “reproducibly present,” constitutes a “physical fixation” and thus a reproduction (copy) within the meaning of copyright law (Art. XI.165 §1 CEL). The fact that technical devices (such as the chatbot interface) are required to make the work perceptible does not change this. However, the British High Court ruled otherwise: an AI model does not store works, but only mathematical relationships, and thus is not a copy.
Who is liable for the breach: the user or OpenAI?
The German court finds OpenAI directly liable, both for the reproduction in the model and for the reproduction and public communication in the output. Given OpenAI's central and guiding role in the entire process and the fact that simple prompts were sufficient, responsibility cannot be shifted to the user.
Conclusion
The Munich court ruling is a turning point, but it is not the end of the discussion. It exposes a fundamental fault line in European jurisprudence: the question of whether an AI model is a ‘storage medium’ or a ‘mathematical tool’ lies at the heart of the debate. We must await further rulings, and possibly clarification from the Court of Justice of the European Union, to resolve this fundamental divergence.



