> they have encyclopedic knowledge at a superficial level, the approximate judgement and maturity of a teenager, and the short-term memory of a parakeet. If I ask for something, I get the statistical average opinion of a bunch of goons, unconstrained by context or common sense or taste.
Love this paragraph; it's exactly how I feel about LLMs. Unless you really know what you are doing, they will produce very sub-optimal code, architecturally speaking. I feel like a strong acumen for proper software architecture is one of the main things that defines the most competent engineers, along with naming things properly. LLMs are a long, long way from having architectural taste.
I’ve tried that. I’ve experimented with a whole council of 13 personas including many famous developers. It’s definitely different. But it hasn’t performed significantly better in my tests.
If you do spot checks, that is woefully inadequate. I have lost count of the number of times when, poring over code a SOTA LLM has produced, I notice a lot of subtle but major issues (and many glaring ones as well), issues a cursory look is unlikely to pick up on. And if you are spending more time going over the code, how is that a massive speed improvement like you make it seem?
And what do you even mean by 10x the amount of work? I keep saying that anybody who starts to spout this sort of anecdote absolutely does NOT understand real-world, production-level, serious software engineering.
Is the model doing 10x the amount of simplification, refactoring, and code pruning that an effective senior-level software engineer and architect would do? Is it doing 10x the detailed and agonizing architectural (re)work that a strong developer with honed architectural instincts would do?
And if you tell me it's all about accepting the LLM being in the driver's seat and embracing vibe coding, it absolutely does NOT work for anything exceeding a moderate level of complexity. I have tried that several times. Up to now, no model has been able to write a simple markdown viewer with certain specific features I have wanted for a long time. I really doubt the stories people tell about creating whole compilers with vibe coding.
If all you see and appreciate is that it is pumping out 10x the features and 10x more code, you are missing the whole point. In my experience you are actually producing a ton of sh*t, sorry.
Honestly, this is more of a question about the scope of the application and the potential threat vectors.
If the GP is creating software that will never leave their machine(s) and is for personal usage only, I'd argue the code quality likely doesn't matter. If it's some enterprise production software that hundreds to millions of users depend on, software that manages sensitive data, etc., then I would argue code quality should asymptotically approach perfection.
However, I have many moons of programming under my belt. I would honestly say that I am not sure what good code even is. Good to who? Good for what? Good how?
I truly believe that most competent developers (however one defines competent) would be utterly appalled at the quality of the human-written code on some of the services they frequently use.
I apply the Herbie Hancock philosophy when defining good code. When once asked what is Jazz music, Herbie responded with, "I can't describe it in words, but I know it when I hear it."
> I apply the Herbie Hancock philosophy when defining good code. When once asked what is Jazz music, Herbie responded with, "I can't describe it in words, but I know it when I hear it."
That’s the problem. If we had an objective measure of good code, we could just use that instead of code reviews, style guides, and all the other things we do to maintain code quality.
> I truly believe that most competent developers (however one defines competent) would be utterly appalled at the quality of the human-written code on some of the services they frequently use.
Not if you have more than a few years of experience.
But what your point is missing is the reason that software keeps working in the first place, or stays in a good enough state that development doesn’t grind to a halt.
There are people working on those code bases who are constantly at war with the crappy code. At every place I’ve worked over my career, there have been people quietly and not so quietly chipping away at the horrors. My concern is that with AI those people will be overwhelmed.
They can use AI too, but in my experience, the tactical tornadoes get more of a speed boost than the people who care about maintainability.
I had a long reply to your comment, then decided it was not truly worth reading. However, I do have one question remaining:
> the tactical tornadoes get more of a speed boost than the people who care about maintainability.
Why are these not the same people? In my job, I am handed a shovel. Whatever grave I dig, I must lie in. Is that not common? Seriously, I am not being facetious. I've had the same job for almost a decade.
That’s because you’ve been there a decade. It’s very common for people to skip jobs every 2 years so that they never end up seeing the long term consequences of their actions.
The other common pattern I’ve seen goes something like this.
Product asks Tactical Tornado if he can build something. TT says sure, it will take 6 weeks. TT doesn’t push back or ask questions; he builds exactly what product asks for in an enormous feature branch.
At the end of 6 weeks he tries to merge it and he gets pushback from one or more of the maintainability people.
Then he tells management that he’s being blocked. The feature is already done and it works. Also the concerns other engineers have can’t be addressed because “those are product requirements”. He’ll revisit it later to improve on it. He never does because he’s onto the next feature.
Here’s the thing. A good engineer would have worked with product to tweak the feature up front so that it’s maintainable, performant etc…
This guy uses product requirements (many that aren’t actually requirements) and deadlines to shove his slop through.
At some companies management will catch on and he’ll get pushed out. At other companies he’ll be praised as a high performer for years.
Way better than the random India dev output. I seriously don't know what everyone around here is doing. All I see are complaints while I produce the output of ten devs. Clean code, solid design.
Spend a few hours writing context files. Spend the rest of the week sipping bourbon.
A better example might be why we build stairs with a standard riser height and tread run. If you've ever accidentally tripped on an unusual or non-standard stair, you already know this.
Users don't need to think about how to use them; they are ubiquitous and familiar, and therefore intuitive and automatic.
If every set of stairs (or, worse, if every stair in a set) was radically different, every time you approached some stairs you would have to think carefully about how to use them so you don't fall.
Your point is true, but the one I was replying to was focusing on the aesthetic aspect. For them, the sameness of UIs, while functional, makes for a drab experience.
My point is that I don't find this to be the case. Rather, consistent UIs, while functional, are also beautiful to me. The constituents of the UI can be designed with aesthetic taste, but the way it is all put together consistently and functionally has a beauty all its own.
It seems just fine to me. This is what Anthropic needs to do if they want to survive. I'm always looking out for someone to integrate an actually good harness to a good model. Once that happens, I'm jumping ship if Anthropic keeps playing these tricks.
It's almost unusable for me now. A simple prompt to merge 3 sub-100-line files with simple node code, on Sonnet 4.6, uses up 20% of my 5 hour quota, on a new/fresh session.
To be fair, my comment was a bit harsher before the update. The way they handle the development, communication and how they treat customers isn't fine. I've seen some angry people post and comment in manners which truly deserved the label hostile.
The whole product with the infrastructure and Claude Code's code appear to be vibe coded.
I'm not sure about other devs, or even their number, but AI can most definitely NOT produce better code than I can.
I use it after I have done the hard architectural work: defining complex types and interfaces, figuring out code organization, solving thorny issues. When these are done, it's time to hand over to the agent to apply stuff everywhere following my patterns. And even there, SOTA models like Opus make silly mistakes; you need to watch them carefully. Sometimes they lose track of the big picture.
I also use them to check my code and to write bash scripts. They are useful for all these.
What you're describing is using it to do something you can already do at an expert level, where you already know exactly what you want the result to look like and won't accept anything that deviates from what's already in your head. So like a code autocomplete. You don't really want the "intelligence" part, you want a mule.
That's fine, and useful, but you're really putting a ceiling on its potential. Try using it for something that you aren't already an expert in. That's where most devs live.
Even expert coder antirez says "writing the code yourself is no longer sensible".
AFAIU antirez is mostly writing in C, a verbose language where "create a hashtable of x->y" turns into a wall of boilerplate. In high-level languages the length difference between a precise specification and the actual code is much smaller.
He also mentions using it for Python which is minimal boilerplate.
And he didn't limit his take to just C code. He said: state of the art LLMs are able to complete large subtasks or medium size projects alone, almost unassisted, given a good set of hints about what the end result should be.
But if using them as mules still produces silly mistakes, how will I have the confidence to defer to their intelligence for much more complex stuff?
These things bullshit their way through all the time. I've lost track of how many times they seem to produce something great, only for me, upon deeper inspection, to see what a subtle mess they have made. And when the work is a bit complex, I cannot verify on sight; I'd have to take time to do it.
Also, they absolutely cannot even produce some levels of code. Do you think I can just give them a prompt to produce a haskell-like language, allow them to crank for some hours, and have a language ready made?
Want an example? Here is something Sonnet gave me just today:
I get this as the type of xx: Promise<Result<Pick<Cabinet, "name">[]>>
That is obviously wrong: I should be getting the full type, i.e., all columns picked. The problem is that the Column generic parameter is not being inferred properly, (probably) because of the sort by name. The sort column is constrained to be one of the query's field names, so when the fields are not provided, TypeScript infers them from the sort column name.
Neither ChatGPT nor Claude Opus was able to solve this after an hour; they suggested all kinds of things that don't work. But I have solved it myself, with:
    export type QueryArgs<
      Rec extends StdRecord = StdRecord,
      Fld extends StrKeyOf<Rec> = StrKeyOf<Rec>,
      FltrOp extends FilterOpsAll = FilterOpsAll,
      Srt extends Fld = Fld
    > = {
      /** Fields to include in results (defaults to all) */
      fields?: Fld[],
      /** Filters to apply */
      filter?: RecordFilter<Rec, FltrOp>,
      /** Sorting to apply */
      sort?: {
        field: Srt // StrKeyOf<Rec>
        order: SortOrder
      },
      /** Pagination to apply */
      page?: {
        maxCount?: number | undefined
        startFrom?: { sortFieldKey: any, idKey: ID } | undefined
      }
    }
And:
    queryX: <Ent extends EntityNamePlural, Col extends StrKeyOf<Dto<Ent>>, Srt extends Col = Col>(
      args: {
        entity: Ent,
        query: QueryArgs<Dto<Ent>, Col, fOperators, Srt>,
        auditInfo?: AuditSpec
      }
    ) => Promise<Result<Pick<Dto<Ent>, Col | Srt>[]>>
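For anyone who wants to reproduce the inference problem, here is a minimal, self-contained sketch. It is not the code from the post: `Cabinet`, `cabinets`, `run`, `queryShared`, and `querySplit` are hypothetical names, the data lives in memory, and the real `QueryArgs`/`queryX` types are only approximated. It assumes the cause is what the comment suspects, namely that one type parameter is shared by `fields` and `sort.field`, and it shows how a second parameter for the sort column avoids the narrowing.

```typescript
// Hypothetical stand-in for the Cabinet DTO mentioned above.
type Cabinet = { id: number; name: string; shelves: number };
type Key = keyof Cabinet;
type SortOrder = "asc" | "desc";
type Sort<K extends Key> = { field: K; order: SortOrder };

const cabinets: Cabinet[] = [
  { id: 1, name: "west", shelves: 4 },
  { id: 2, name: "east", shelves: 6 },
];

// Untyped runtime shared by both variants: sort, then project the columns.
function run(fields?: readonly Key[], sort?: Sort<Key>): Record<string, unknown>[] {
  const rows = cabinets.slice();
  if (sort) {
    const { field, order } = sort;
    const dir = order === "asc" ? 1 : -1;
    rows.sort((a, b) => (a[field] < b[field] ? -dir : a[field] > b[field] ? dir : 0));
  }
  // When no fields are requested, return every column.
  const keys: readonly Key[] = fields ?? ["id", "name", "shelves"];
  return rows.map((row) => {
    const out: Record<string, unknown> = {};
    for (const k of keys) out[k] = row[k];
    return out;
  });
}

// Variant A: one parameter for both `fields` and `sort.field`.
// If `fields` is omitted, `sort.field` is the only inference site for Fld.
function queryShared<Fld extends Key = Key>(args: {
  fields?: Fld[];
  sort?: Sort<Fld>;
}): Pick<Cabinet, Fld>[] {
  return run(args.fields, args.sort) as unknown as Pick<Cabinet, Fld>[];
}

// Variant B: the sort column gets its own parameter, so Fld falls back to
// its default (all keys) when `fields` is omitted.
function querySplit<Fld extends Key = Key, Srt extends Fld = Fld>(args: {
  fields?: Fld[];
  sort?: Sort<Srt>;
}): Pick<Cabinet, Fld | Srt>[] {
  return run(args.fields, args.sort) as unknown as Pick<Cabinet, Fld | Srt>[];
}

const shared = queryShared({ sort: { field: "name", order: "asc" } });
// @ts-expect-error: the row type is inferred as Pick<Cabinet, "name">, so `id` is rejected
const sharedIds = shared.map((r) => r.id);

const split = querySplit({ sort: { field: "name", order: "asc" } });
// Here the row type keeps every column, so `id` and `shelves` type-check.
const splitIds = split.map((r) => r.id);
console.log(sharedIds, splitIds, split.map((r) => r.shelves));
```

Both variants return the same rows at runtime; only the inferred row type differs. The trade-off in variant B is an extra type parameter that callers never write by hand.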
- Good code is what enables you to be able to build very complex software without an unreasonable number of bugs.
- Good code is what enables you to be responsive to changing customer needs and times. Whether you view that as valuable is another matter, though. I guess it is a business decision. Plenty of businesses have gone bust by neglecting that, though.
Good code is for your own sanity, the machine does not care.
I'm going to ask the question I ask everyone who claims that they wrote like that for years: can you show us a link from prior to 2022 where you wrote like that?
Sure, but, look, we have seen these claims so many times, that if it were true by now someone would have linked at least one archived blog post to show that it is, indeed, how humans used to write.
And apparently they killed more during the mission to retrieve this guy
> striking Iranian military-aged males believed to be a threat who got within three kilometer” according to a correspondent with the US Air & Space Forces Magazine, who said he had been briefed on the operation.
I'm starting to believe that China isn't going to make the move. It's winning the hearts and minds of the rest of the world and will be able to leverage its growing soft power well beyond what Taiwan would provide. I just don't see them giving up the position the US has abandoned.
I'm starting to think so as well. The Chinese are typically cautious geopolitically, and very strategic. They may well have made the calculus that for the foreseeable future, they have more to gain from keeping the status quo re Taiwan while their rivals score own goals, waiting for a possible rapprochement with Taiwan on favorable terms.
That's something the factions in the Middle East miss: sometimes great change comes from patiently applying pressure and infiltrating from within, rather than a frontal attack.
China doesn't think in that way. It doesn't make permanent alliances. It is always open to reach limited, scoped deals in fields where it benefits them.
Yeah that sounds like a pretty good deal. Drop the bankrupt Russians and do a deal with us Europeans, a much richer market, to brace against US economic warfare.
I suspect that China might be Russia's Ukraine offramp. If Russia decides to pull out, China can come in and work as a negotiator and win brownie points with the EU. I could see them being able to continue working with both Russia and the EU in that future.
I suspect that they are willing to wait a few more years until they have built up their own chip making capacity so that disrupting Formosa won’t strongly affect their own economy, while it will hinder other developed countries.
I'm not so sure about that. Taiwan's pro-reunification party is still growing, and its economy is hyper-specialized (not surprising, neocolonialism etc). If China's chip production capacity reaches an acceptable level (which it will), enough to put downward pressure on lesser chips, Taiwan's economy might suffer enough that they vote for reunification, probably as an autonomous region (like Guangxi or Ningxia). That would be China's ultimate win.
Not just yet, they should wait for a little bit. The US isn't done depleting its inventory yet, the US might get itself in a lot deeper yet, and the US population will only detest the war even more given time. All of those things will help China take Taiwan. If Iran gets ugly enough the US population will just have that much less willingness to get involved in another major conflict. 3-18 months for Taiwan (9-18 more likely; China still needs some prep). There's no scenario where China isn't going to successfully take the island after this. They now know the US isn't at all prepared to stand off with them in coastal Asia. It would take years of surge production to get ready, the US doesn't have years re Taiwan.
If China is going in, we'll start to see large signs of that. They'll begin a number of prominent campaigns, including sabotage, propaganda, extremely large supply movements, and so on.
Yes, reading articles like this one, I suspect it's going to be the lack of firepower that causes this administration to finally back out of the conflict. And with these numbers it sounds like it might be sooner rather than later.
If China were to learn anything important from Russia's and the USA's "swift" wars, it's: don't do it. They'll have the upper hand, but a determined government and population will bog down their efforts for years and potentially destabilize politics at home.