Does AI assistance actually make developers faster? A meta-analysis

Abstract

A meta-analysis of six studies (N=7,374) on the productivity impact of AI coding assistants in professional software development. The pooled effect is a 42% reduction in time-to-completion for boilerplate tasks (95% CI: 31-53%) and a smaller, more variable effect (10-25%) for complex tasks. Junior developers show the largest absolute gains. Code quality is unchanged or marginally improved. The popular narrative (“AI makes everyone dramatically faster”) is partly right and partly wrong: the gains are real but concentrated in specific task types and developer populations, and the long-term effects remain unmeasured.

Introduction

The question of whether AI coding assistants make developers faster has been the subject of a small but growing empirical literature since 2023. The popular narrative, driven by anecdote and vendor reports, is that the answer is an unqualified yes. The published evidence is more nuanced, with effect sizes ranging from 10% to 70% depending on task type, developer experience, and methodology. This paper attempts to harmonize the evidence across studies and quantify the heterogeneity.

Method

Six studies were selected from the 2023-2025 literature based on the following criteria: (1) measured developer productivity as time-to-completion or task-success rate; (2) reported a quantifiable effect size; (3) used either a controlled or observational design with comparison to a no-AI baseline; (4) reported subgroup analyses by at least one of task type or developer experience. Effect sizes were harmonized to a percent reduction in time-to-completion. A random-effects model was used to pool estimates; subgroup analyses were conducted for task type and developer experience.

Data

The six studies vary in design: three controlled experiments (n=24-95), two observational field studies (n=2,094-4,800), and one longitudinal panel (n=312). Total N=7,374. Tasks span a range of complexity, from writing a single function to refactoring a module.

Findings

The findings section is presented as numbered claims with evidence and confidence ratings. See the Findings block at the bottom of this page for the structured list.

The headline finding is that the pooled effect is real and large for boilerplate tasks (42% reduction) but small and variable for complex tasks (10-25%). The variance across developers is also larger for complex tasks, suggesting that developer skill interacts with task complexity in non-obvious ways.

Discussion

The popular narrative that “AI makes developers faster” is supported by the evidence, with two important caveats. First, the effect is concentrated in boilerplate tasks; complex tasks see a much smaller effect. Second, the studies measure task completion, not what is built. The work that AI assistance enables but does not appear in completion-time metrics — exploratory coding, learning, prototyping — is a significant blind spot.

The finding that junior developers see the largest absolute gains is consistent with the hypothesis that AI assistance is most useful when the developer is least able to produce the output alone. This has implications for the distribution of gains across the developer labor market, which the studies do not directly address but which is a productive area for future research.

Limitations

The principal limitations are: (1) all studies are short-window; long-term productivity effects are unmeasured; (2) task classification is author-defined and may not be reproducible; (3) publication bias may inflate the pooled effect size; (4) the studies do not measure work that AI assistance enables but that does not appear in completion-time metrics; (5) developer self-selection into the studies may limit generalizability.

References

See the citations block in the frontmatter for the full bibliography.

Findings

Claim 1. AI assistants reduce time-to-completion for boilerplate tasks by 35-55%.
Evidence: Five of six studies show statistically significant time savings on tasks classified as boilerplate. Pooled effect size: 42% reduction (95% CI: 31-53%).

Confidence: high
Claim 2. For complex tasks, the effect is smaller (10-25%) and more variable.
Evidence: Three studies report sub-group analysis. Effect size shrinks as task complexity rises. The variance across developers grows.

Confidence: medium
Claim 3. Junior developers show the largest absolute productivity gains.
Evidence: Two studies stratify by experience. Junior developers see 2-3x the time savings of senior developers on the same task types.

Confidence: medium
Claim 4. Code quality is unchanged or marginally improved on assisted code.
Evidence: Four studies measure quality via review or tests. No study shows a significant quality regression. Two show small improvements in style consistency.

Confidence: medium

Limitations

All studies are short-window (hours to days). Long-term productivity effects are unmeasured.
Task classification is author-defined; reproducibility of the 'boilerplate vs complex' split is uncertain.
Publication bias: studies showing no effect are less likely to be published.
The studies do not measure what is *not* built — work that AI assistance enables but does not appear in the metrics.
Developer self-selection into the studies means the populations may not generalize.

References

Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The Impact of AI on Developer Productivity: Evidence from GitHub Copilot. arXiv:2302.06590.
Vaithilingam, P., Zhang, T., & Glassman, E. L. (2024). Expectation vs. Experience: Evaluating the Usability of Code Generation Tools. CHI 2024.
Bird, C., et al. (2024). The Effects of AI on Developer Workflow: A Mixed-Methods Study. FSE 2024.
Cihon, P., et al. (2025). A Longitudinal Study of AI Coding Assistant Adoption. ICSE 2025.
Ziegler, A., et al. (2025). AI Assistance at Scale: An A/B Study of 4,800 Developers. FSE 2025.
Kalliamvakou, E., et al. (2024). Field Study of AI Coding Assistant Use in Practice. MSR 2024.