In a recent study, GitHub set out to determine whether its AI coding assistant, GitHub Copilot, helps developers write objectively better code than they would without it. GitHub recruited 202 Python developers, 104 coding with Copilot and 98 without, and tasked each with building a restaurant review web server whose functionality was checked against 10 unit tests.
Each completed submission was then reviewed line by line by at least ten developers, none of whom knew whether it had been written with the help of AI. The resulting 1,293 reviews rated the readability, reliability, maintainability, and conciseness of the code samples, and reviewers also judged whether the code should ultimately be approved.
The study results bode well for the use of AI in coding, with GitHub highlighting four key findings.
- Developers using Copilot passed all 10 unit tests 56% more often than developers without AI, producing more functional code.
- Copilot-assisted code was more readable: developers wrote an average of 13.6% more lines of code without running into readability issues.
- Scores for readability, reliability, maintainability, and conciseness improved by an average of 3.29%, with conciseness showing the largest gain (4.16%).
- Copilot-assisted code was approved 5% more often than code written without AI, suggesting it takes less time to get code ready for production.
For more information on the study’s methodology and results, see the official GitHub blog post linked below.