GitHub Copilot dubs itself as an “AI pair programmer” for software developers, automatically suggesting code in real time. According to GitHub, Copilot is “powered by Codex, a generative pretrained AI model created by OpenAI” and has been trained on “natural language text and source code from publicly available sources, including code in public repositories on GitHub.”
However, a class-action lawsuit filed against GitHub Copilot, its parent company Microsoft, and OpenAI claims open-source software piracy and violations of open-source licenses. Specifically, the lawsuit states that code generated by Copilot does not include any attribution of the original author of the code, copyright notices, and a copy of the license, which most open-source licenses require.
“The spirit of open source is not just a space where people want to keep it open,” says Sal Kimmich, an open-source developer advocate at Sonatype, machine learning engineer, and open source contributor and maintainer. “We have developed processes in order to keep open source secure, and that requires traceability, observability, and verification. Copilot is obscuring the original provenance of those [code] snippets.”