“No Duh,” say senior developers everywhere.
The article explains that vibe code often is close, but not quite, functional, requiring developers to go in and find where the problems are - resulting in a net slowdown of development rather than productivity gains.
I have never seen an AI generated code which is correct. Not once. I’ve certainly seen it broadly correct and used it for the gist of something. But normally it fucks something up - imports, dependencies, logic, API calls, or a combination of all them.
I sure as hell wouldn’t trust to use it without reviewing it thoroughly. And anyone stupid enough to use it blindly through “vibe” programming deserves everything they get. And most likely that will be a massive bill and code which is horribly broken in some serious and subtle way.
I’ve used Claude code to fix some bugs and add some new features to some of my old, small programs and websites. Not things I can’t do myself, but things I can’t be arsed to sit down and actually do.
It’s actually gone really well, with clean and solid code. easily readable, correct, with error handling and even comments explaining things. It even took a gui stream processing program I had and wrote a server / webapp with the same functionality, and was able to extend it with a few new features I’ve been thinking to add.
These are not complex things, but a few of them were 20+ files big, and it manage to not only navigate the code, but understand it well enough to add features with the changes touching multiple files (model, logic, view layer for example, or refactor a too big class and update all references to use the new classes).
So it’s absolutely useful and capable of writing good code.
would you deploy this server?
This is the truth. It has tremendous value but it isn’t a solution – it’s a tool. And if you don’t know how to code or what good code looks like, then it is a tool you can’t use!
I’ve seen and used AI for snippets of code and it’s pretty decent at that.
With my colleagues I always compare it to a battery powered drill. It’s very powerful and can make shit a lot easier. But you’d not try to build furniture from scratch with only a battery powered drill.
You need the knowledge to use it - and also saws, screws, the proper bits for those screws and so on and so forth.
What bothers me the most is the amount of tech debt it adds by using outdated approaches.
For example, recently I used AI to create some python scripts that use polars and altair to parse some data and draw charts. It kept insisting to bring in pandas so it could convert the polars dataframes to pandas dataframes just for passing them to altair. When I told if that altair can use polars dataframes directly, that helped, but two or three prompts later it would try to solve problems by adding the conversion again.
This makes sense too, because the training material, on average, is probably older than the change that enabled altair to use polars dataframes directly. And a lot of code out there just only uses pandas in the first place.
The result is that in all these cases, someone who doesn’t know this would probably be impressed that the scripts worked, and just not notice the extra tech debt from that unnecessary dependency on pandas.
It sounds like it’s not a big deal, but these things add up and eventually, our AI enhanced code bases will be full of additional dependencies, deprecated APIs, unnecessarily verbose or complicated code, etc.
I feel like this is one aspect that gets overlooked a bit when we talk about productivity gains. We don’t necessarily immediately realize how much of that extra LoC/time goes into outdated code and old fashioned verbosity. But it will eventually come back to bite us.
Eh I had it write a program that finds my PCs ip and sends it to the Unifi gateway to change a rule. Worked fine but I guess technically it is mostly using the go libraries written by someone else.
How is it not correct if the code successfully does the very thing that was prompted?
F.ex. in my company we don’t have any real programmers but have built handful of useful tools (approx. 400-1600 LOC, mainly Python) to do some data analysis, regex stuff to cleanup some output files, index some files and analyze/check their contents for certain mistakes, dashboards to display certain data, etc.
Of course the apps may not have been perfect after the very first prompt, or even compiled, but after iterating an error or two, and explaining an edge case or two, they’ve started to perform flawlessly, saving tons of work hours per week. So how is this not useful? If the code creates results that are correct, doesn’t that make the app itself technically ”correct” too, albeit likely not nearly as optimized as equivalent human code would be.
To add on to what others have said, vibe coding is ushering in a new golden age for black hat hackers. If someone is rely entirely on AI to generate code they likely don’t understand what the code they have is actually doing. This tends to lead to an app that works correctly for what the prompted specified but behaves badly the instant it has to handle anything outside of the prompt, like a malformed request or data outside the prompted parameters. As a result these apps tend to be easy to exploit by malicious actors, often in ways the original prompter never thought of.
If the code doesn’t compile, or is badly mangled, or uses the wrong APIs / imports or forgets something really important then it’s broken. I can use AI to inform my opinion and sometimes makes use of what it outputs but critically I know how to program and I know how to spot good and bad code.
I can’t speak for how you use it, but if you don’t have any real programmers and you’re iterating until something works then you could be producing junk and not know it. Maybe it doesn’t matter in your case if its a bunch for throwaway scripts and helpers but if you have actual code in production where money, lives, reputation, safety or security are at risk then it absolutely does.
It’s not bad for your use case but going beyond that without issues and actual developpers to fix the vibe code is not yet possible for llms