GitHub Copilot receives criticism from copyright enthusiasts
Last month, Microsoft’s GitHub announced Copilot, a new AI assistance service for software development. GitHub Copilot supports a variety of languages and frameworks and can suggest whole lines or entire functions right inside an IDE. It is powered by OpenAI Codex, which is trained on billions of lines of open source code. Since the announcement, some copyright enthusiasts have criticized GitHub. Some even claim that Copilot scrapes open source code to deliver a paid AI service for developers.
1)
Hi. I know you’re excited about copilot.
GitHub scraped your code. And they plan to charge you for copilot after you help train it further.
It’s truly disappointing to watch people cheer at having their work and time exploited by a company worth billions.
— Brian P. Hogan (@bphogan) July 2, 2021
2)
Some thoughts about Github Copilot and copyright (I am not a lawyer).
Copilot uses a version of GPT3 trained on GPL licensed code. GPL give everyone the right to copy and make derivatives. Derivatives inherit GPL.
Copilot can sometimes memorize and repeat snippets of code. pic.twitter.com/1JLwfQI65l
— Mark O. Riedl (@mark_riedl) June 30, 2021
3)
“I’m leaving GitHub because copilot uses my OpenSource code for training” is such an odd move. Anyone can fork it to there and GitHub can feed OpenSource code from anywhere to it and US copyright law permits this. I’m also pretty certain we should not strengthen copyright laws …
— Armin Ronacher (@mitsuhiko) July 3, 2021
I don’t get the argument that GitHub Copilot violates the copyright of GPL-licensed code. First, machine-generated code should not be considered a derivative work. If AI output were considered a derivative work, you couldn’t build a music recognition app, since its AI model would be trained on copyrighted music. Second, even if Copilot reproduces exact short snippets of code from its training dataset, that should not be considered a copyright violation. For example, consider the code below.
if (i <= 0)
    i = i + 1;
You can’t claim copyright on the above code, since a snippet this short and generic is not original work. GitHub Copilot should be able to suggest such short snippets to developers without violating any copyright laws. It will be interesting to see how Microsoft and GitHub respond to this criticism in the coming days.