TechRadar
Craig Hale

Bad news, skeptics: GitHub says it will use user data to train its AI after all

GitHub Copilot X.

  • GitHub rolls out on-by-default AI training on user data, with the option to opt out
  • Business, Enterprise and some other account types are excluded from the change
  • The company explains that users' live interaction data is crucial for good training

GitHub Chief Product Officer Mario Rodriguez has announced that the platform will use user data to train its AI models, operating on an opt-out basis that automatically enrolls users in the data collection system.

The change won't just affect Free users, but also Pro and Pro+ subscribers. Copilot Business, Enterprise, student and teacher accounts will be exempt from it.

The company blog post adds that AI-generated content, as well as user feedback and interactions, will all go into training the AI models.

GitHub will use your data to train its AI models, it confirms

Some of the elements that will go into training GitHub's AI include: inputs, like prompts and snippets of code; outputs, including accepted content and edited suggestions; code context; comments and documentation; file names and repo structures; Copilot interactions and even feedback like thumbs up/down.

As well as the account types mentioned above and those who opt out, there is a third and final exemption, which covers a category of content rather than a type of user: "Content from your issues, discussions, or private repositories at rest," Rodriguez writes, carefully pointing out that even private repos can be used if a user is actively using Copilot.

The company is keen to point out that real-world interaction data vastly improves model training, thanking users who choose to share their data.

"We believe the future of AI-assisted development depends on real-world interaction data from developers like you," the CPO added.

GitHub publicly stating its position on user data training is an important step, but while users are given the option to opt out, many are still unhappy about the on-by-default setting.

