Mr. Latte
Can AI Launder Open Source Licenses? The Legal Paradox of Code Rewrites
TL;DR Maintainers of a popular Python library used an AI tool to rewrite their LGPL-licensed codebase, attempting to relicense it under MIT. This sparked a massive legal debate: prompting an AI with the original code breaks traditional “clean room” engineering principles. Furthermore, recent rulings denying copyright to AI-generated works mean this code might legally be in the public domain, potentially threatening the future of all copyleft licenses.
Relicensing legacy open-source projects has historically been a nightmare, requiring unanimous consent from every past contributor. Recently, the maintainers of the chardet library tried a modern shortcut: using Claude Code to completely rewrite the LGPL-licensed project so they could release it under the more permissive MIT license. This seemingly clever hack has ignited a firestorm in the open-source community. It forces us to confront whether AI can be used as a legal loophole to bypass strict software licenses.
Key Points
The core conflict centers on the legal definition of a “clean room” implementation. Traditionally, rewriting licensed code requires two isolated teams: one studies the original and writes a functional specification, and another implements that spec without ever seeing the original source. By prompting an AI with the original LGPL codebase, the maintainers destroyed this separation, making the output arguably a derivative work bound by the original license. Complicating matters are recent U.S. Copyright Office guidance and federal court rulings holding that purely AI-generated material cannot be copyrighted for lack of human authorship. This creates a bizarre paradox: the new code is either an infringing derivative work, or it belongs in the public domain, where the maintainers have no standing to apply an MIT license anyway.
Technical Insights
From an engineering perspective, using LLMs for large-scale codebase refactoring or translation feels like a superpower, drastically reducing the friction of paying down technical debt. This case, however, highlights a critical blind spot: an LLM’s output is not independent of its inputs; it carries their legal “taint.” Unlike a human engineer who can abstract a problem into a pure architectural spec, an AI transforms the protected expression directly into new syntax, functioning more like a sophisticated obfuscator than a clean-room developer. Teams therefore cannot use AI as a “license laundering” tool to escape restrictive dependencies. Engineers must now treat AI prompts containing proprietary or copyleft code as legally hazardous, recognizing that the generated output carries the legal baggage of the input.
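One practical guardrail teams could adopt is filtering copyleft-licensed files out of an AI assistant’s prompt context before anything is sent. The sketch below is purely illustrative (the function names and regex patterns are my own assumptions, not an established tool); a production policy would rely on proper SPDX scanning rather than a header grep.

```python
import re
from pathlib import Path

# Hypothetical guardrail: refuse to include copyleft-licensed files in an
# AI assistant's prompt context. Detection here is a naive header scan;
# real tooling would use SPDX scanners instead of hand-rolled regexes.
COPYLEFT_PATTERNS = [
    r"GNU (Lesser |Affero )?General Public License",
    r"SPDX-License-Identifier:\s*(L?GPL|AGPL)",
]

def is_copyleft(source: str) -> bool:
    """Return True if the file header mentions a copyleft license."""
    header = source[:2000]  # license notices conventionally sit at the top
    return any(re.search(pattern, header) for pattern in COPYLEFT_PATTERNS)

def safe_context(paths):
    """Yield (path, text) only for files with no copyleft markers."""
    for path in paths:
        text = Path(path).read_text(errors="ignore")
        if not is_copyleft(text):
            yield path, text
```

This is a coarse filter, not legal protection: it only blocks the obvious case of pasting clearly marked copyleft files into a prompt, and says nothing about code whose provenance is undocumented.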
Implications
If the courts eventually accept AI-assisted rewrites as a valid relicensing mechanism, it could effectively spell the end of copyleft licenses like the GPL, allowing corporations to strip protections from open-source projects at will. For developers and companies, the immediate takeaway is governance: strict rules about what code may be fed into AI assistants. Assume that any AI-generated rewrite of GPL/LGPL code remains legally “infectious,” and avoid using AI to bypass licensing restrictions unless you are prepared for significant legal liability.
We are entering an era where code generation outpaces copyright law, leaving developers to navigate a minefield of untested legal theories. Will open-source maintainers need to invent new “AI-proof” licenses to protect their work? As this real-world test case unfolds, the software industry must watch closely—the outcome will fundamentally reshape how we share and protect code.