Class action lawsuit on scraping of open source code (and images) by MS, GitHub, OpenAI

This interview with the plaintiff in the class action lawsuit may be of interest to some readers:

https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data

“The DMCA applies equally to all forms of copyrightable material, and images often include attribution; artists, when they post their work online, typically include a copyright notice or a creative commons license, and those are also being ignored by [companies creating] image generators.”

“Copilot, which was [unveiled by Microsoft-owned GitHub] in June 2021, is trained on public repositories of code scraped from the web, many of which are published with licenses that require anyone reusing the code to credit its creators. Copilot has been found to regurgitate long sections of licensed code without providing credit — prompting this lawsuit that accuses the companies of violating copyright law on a massive scale.”

I wish the class action good luck. The article doesn’t go into what license OpenAI and/or Microsoft are applying to the generated code and images, nor whether they are claiming copyright on those items. The following gets a little bit closer to those issues:
