[This is a long post, but it is not LLM generated or even LLM reviewed. My thoughts, my words - possibly wrong]
I am under the impression that some of the concerns raised in LLM-related conversations stem from a superficial understanding of what proper LLM-assisted development looks like.
I would like to clarify a bit how this works in real life, and why, I argue, code produced with the help of an LLM can actually be better than code produced by a developer programming solo.
The AI Slop problem
Thanks to (or rather, because of) LLMs, it is now possible for someone (who may or may not be able to code) to write a prompt and then create a PR (or even let the LLM create the PR) with whatever the model spits out. This can go further still: one can ask the LLM to decide what to implement. The LLM can be unleashed on a codebase to create any number of PRs that fix (what it thinks are) bugs, implement new features, and so on.
The result of this process is what we generally refer to as AI slop.
This is what FOSS projects are concerned about. They are afraid of being flooded with thousands of automatically generated PRs, which will be a burden for devs. Even if these sloppy changes can be identified quickly (which is not always the case), it still takes time to separate the wheat from the chaff, and at large enough volume this can completely paralyze a project.
The concern is not that LLMs generate bad code (even though they can certainly also do that). The concern is that LLMs make it possible to spam FOSS repositories in a way that was not possible before and create an unbearable amount of overhead for maintainers.
[Edit: @Phemisters pointed out that there are also other concerns, such as the ethicality of how models are trained. Sure, it’s a complex topic and there are many angles. In this post, I am focusing on aspects pertaining to the quality of code contributions]
LLM-assisted development is not AI Slop
AI slop is NOT what a developer who wants to do something meaningful uses an LLM for. Instead, you co-code (or pair-program, or whatever you want to call it) with an LLM.
Proper co-coding with an LLM is a very intense, hands-on task, which gives a developer the power to implement stuff much faster, and, if done correctly, better than they would do without the help of the LLM.
This is a summary of how you implement something with an LLM:
1. Someone (the developer) has an idea, writes a design sketch (more or less detailed), and asks an LLM to prepare a plan. If any part of the initial request is badly under-specified, the LLM will ask clarification questions.
2. The model reads all the relevant parts of the codebase, does external research if needed (e.g., color science documents), and produces a plan: a full-fledged design doc which outlines what the model is planning to do, why, and how.
3. The developer reads the plan and comments on it, asking questions (e.g., “why is this done like this?”) or correcting the model’s proposal (e.g., “this should be done like this, not like that”).
4. Steps 2-3 repeat several times, until the developer is happy with the plan.
5. The developer asks the model to implement the plan.
6. While the LLM is working, the developer can read the thinking process of the model (I don’t like the word, but it’s what it is called). The model explains in great detail what it is doing and how: the sequence of actions it will take, the starting point, the intermediate steps, and the final goal. If it has any doubts, it starts thinking very hard about them.
7. At any point, if the developer sees something wrong in the thinking trace, they can interrupt the model, interject, and steer it in the right direction with guidance or clarification.
8. When the code is ready, the LLM builds the project.
9. From here on, it is an iterative process of the developer (1) testing out the feature, and (2) making edits and fixes in the code, either directly or with the help of the model.
As I mentioned before, it is a very intense experience. The model thinks and edits fast, iterations are relatively quick and there is a huge amount of information that the developer needs to assimilate.
The whole process is very transparent, and the model is only in control to the extent that the developer lets it be. While the model will certainly make its own decisions and propose plans and implementation details, in the end it is the developer who has the last word concerning how things are done, what goes in and what does not.
Potential advantages of co-coding
And why LLM-assisted code changes can actually be of very high quality.
Codebase awareness
An LLM can keep all the relevant bits of a codebase in memory (the context). Thanks to this bird’s eye view of the code, it has a good understanding of where the best place to implement a certain functionality is. It may not know this better than the maintainers, who - thanks to years of experience - know the codebase almost by heart. But it certainly understands the overall code structure much better than a new developer who is not familiar with the code, and can make better, more informed decisions.
LLMs can easily find snippets of code similar or related to the one that the developer wants to implement, and reuse that instead of reinventing the wheel. This leads to less, and more maintainable, code. A human who does not know the whole codebase cannot do this as efficiently or as accurately.
Code style and guidelines
An LLM analyzes the codebase and adapts its style to conform to the rest of the codebase. A developer strives to do that, but a developer coming from a different codebase will find it hard to start doing things differently.
More robust code
Humans are notoriously sloppy when it comes to identifying corner cases. Any code of non-trivial complexity has a huge space of possible states, and keeping track of all of them is not something our brains are wired for. Coding LLMs are trained on huge amounts of existing code, are specialized in identifying the patterns that can cause issues, and know how to build effective defenses around them. They are a great help in finding subtle bugs that would be very difficult for a human to spot.
Cut through the boilerplate, focus on the substance
Writing code involves a lot of boilerplate and repetitive tasks. This is especially true for languages like C that do not offer a lot of high level abstractions. When you co-code, the developer can let the model take care of the “boring” stuff while they focus on the “meat” of a change (e.g., UI design, which transformations to apply, how to effectively decompose the problem, etc.).
Faster iterations => More iterations => Better results
Being able to iterate quickly, without spending too much time typing boilerplate, enables a more explorative, less dogmatic approach to coding.
The developer can try out different ways of doing the same thing, because trying an option is relatively cheap. A developer is not as likely to do that if any refactoring is going to require hours if not days of work to implement and then debug the result.
The developer is no longer affected by the sunk cost fallacy (they started doing something in a suboptimal way, but the effort already invested makes it undesirable to revisit early decisions). Using an LLM frees the developer and allows them to recover from wrong early decisions (or to accommodate requirement changes) quickly and with comparatively little effort.
Documentation and learning
Coding LLMs are excellent at documenting the code that they produce. This is something that developers really do not like to do, and lack of documentation is one of the main hurdles for maintenance and a common cause of regressions (i.e., someone modifies a bit of code they do not fully understand, and things that used to work break).
When you co-code you learn a lot about the codebase. Reading the LLM plans and its thinking process gives the developer (especially if not very familiar with the codebase) a much better understanding of how the different parts fit together, and it’s an extremely valuable learning tool.
Conclusions
To summarize, these are my claims, based on my quite extensive experience with co-coding:
- AI Slop and co-coding are two very different things, and they should not be confused
- AI Slop is a threat for FOSS development, co-coding is not
- Co-coding done well can actually produce better code, and contribute positively to the health of a codebase. This is especially true when the contribution comes from developers who are not too familiar with the codebase
- Co-coding frees developers from having to worry about the less interesting stuff (code styles, boilerplate, mechanical tasks), which is what in real life takes most of the time, and allows them to focus on the core business logic
And here are some of the things that I do not claim (before someone implies that I do). The space of things that I did not claim is very large, and it extends to everything that I did not explicitly - well - claim, but some of these non-claims probably need to be spelled out clearly:
- LLM-generated code is always good
- Coding LLMs are perfect and make no mistakes
- Anybody can effectively co-code with an LLM, regardless of their skillset
- Everybody should contribute code, as co-coding makes it technically possible
- LLMs write better code and understand codebases better than any human possibly could
- Software projects should accept any LLM-generated submission because they are oh so good
- Coding LLMs solve every possible problem
- …