He does have a point about fees. It's not really surprising that a fee structure designed for chatbots doesn't make sense when applied to long-running tasks and agents. But that's a problem an increase in prices can solve.
Doubtless some people will reduce usage as a result. But Ed seems to find the idea that a 10-person developer team might spend 80K a year on tokens ridiculous. I don't understand this. Has he seen how much developers are paid? If you get a 20% productivity boost from coding agents, then that's two developers' worth of output for 80K - very good value.
Where things could go wrong is in comparison to cheaper models. If it's 5K a year for Qwen, and it's 2/3 as good, will you pay 75K extra for Opus? Perhaps not.
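To make the arithmetic concrete, here's a rough back-of-envelope sketch in Python. The $150K fully loaded developer cost is my own assumption, not a figure from the article:

    # Back-of-envelope ROI check. The developer cost is an assumed,
    # illustrative figure; adjust it for your own market.
    team_size = 10
    dev_cost = 150_000      # assumed fully loaded cost per developer per year
    boost = 0.20            # the claimed productivity boost
    token_bill = 80_000     # the figure from the article

    value_of_boost = team_size * dev_cost * boost
    print(f"boost worth ~${value_of_boost:,.0f}/yr vs ${token_bill:,} in tokens")
    # boost worth ~$300,000/yr vs $80,000 in tokens -> comfortably positive ROI

    # The cheaper-model question: Qwen at $5K/yr delivering 2/3 of the boost.
    qwen_value = value_of_boost * 2 / 3
    extra_value = value_of_boost - qwen_value   # what Opus adds over Qwen
    extra_cost = token_bill - 5_000
    print(f"Opus adds ~${extra_value:,.0f} of value for ${extra_cost:,} extra")
    # ~$100,000 of extra value for $75,000 extra spend: a much closer call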
I think that team is better off with a junior developer. This alleged “20% productivity boost”, even if it exists, is individual. At the team level, it will be largely offset by people having to review 20% more code.
Obviously, in some cases a junior developer is a better investment, if it's a straight-up choice.
Actually, I think it'll be rare for a manager to be choosing between a junior developer and a coding assistant, since each is going to benefit the team in very different ways, and it'll often be obvious which you need.
What I mean is that at the price levels in the article, the coding agent still has a realistic chance of positive ROI. People will pay for things with positive ROI.
The problem is that LLM cost is more or less the same for generating a fixed amount of code, or it will converge to that soon. But developer costs vary wildly based on seniority and geographical location. Sure, some Silicon Valley architect will always be more expensive than any LLM bill he incurs. But a mid-tier dev at an outsourcing shop, or a cheap local shop overseas, using the same LLM for the same tasks at the same token costs? Eh, it can go either way, really.
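The flip point the parent is gesturing at is easy to compute. A minimal sketch, reusing the (assumed) figures from upthread:

    # Break-even: the token bill is roughly fixed, but the value of a 20%
    # boost scales with salary. All figures are illustrative assumptions.
    token_bill_per_dev = 8_000   # $80K spread across a 10-person team
    boost = 0.20

    break_even_cost = token_bill_per_dev / boost
    print(f"break-even fully loaded cost: ${break_even_cost:,.0f}/yr")
    # $40,000/yr: above that, the tokens pay for themselves; below it
    # (plausible for a cheap overseas shop), the same bill is a net loss.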
Well, we have cool projects like CollapseOS, but the problem is that there is so much undocumented silicon out there that can't be used without massive effort. I know several "gold scrappers", and it's such a shame that they trash great classic chips just to get back a bit of metal. So much effort went into making those chips, and it's a shame that many can't be reused. While a lack of cheap electricity prevents open designs from being reused, there is an even bigger world of undocumented chips being trashed as well.
He's producing semiconductors with a 1000 nm (one micron) feature size. This kind of tech was cutting-edge in the mid-80s. You might be able to produce a 32KB memory chip with it.
It would be difficult to break into the RAM business with that sort of product, as most of the demand these days is for higher capacities.
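For what it's worth, the 32KB figure roughly checks out under a common rule of thumb. Everything in this sketch (the cell-size constant, die area, and array fraction) is an assumption, not data about his process:

    # Rough capacity estimate for SRAM at a 1-micron node, using the rule
    # of thumb that a 6T SRAM cell occupies on the order of ~120 F^2.
    F = 1.0                   # feature size in microns (1000 nm)
    cell_area = 120 * F**2    # ~120 um^2 per bit (assumed constant)
    die_area = 50e6           # assumed ~50 mm^2 die, expressed in um^2
    array_fraction = 0.6      # assumed share of the die used by the cell array

    bits = die_area * array_fraction / cell_area
    print(f"~{bits / 8 / 1024:.0f} KB")   # ~31 KB, in line with the 32KB claim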
I don't see the OP implying that anyone should trust the government. He's simply stating that it's expected the NSA would ignore the supply-chain-risk designation, and that it's unexpected that we'd find out about it. If anything, the comment seems to imply a lack of trust in government.
Try setting up one laundry which charges by the hour and washes clothes really, really slowly, and another which washes clothes at normal speed, at cost plus some margin similar to your competitors'.
The one which maximizes ROI will not be the one you rigged to cost more and take longer.
Directionally, tokens are not equivalent to "time spent processing your query", but rather a measure of the effort/resources expended to process it.
So a more germane analogy would be:
What if you set up a laundry which charges you based on the amount of laundry detergent used to clean your clothes?
Sounds fair.
But then, what if the top engineers at the laundry offered an "auto-dispenser" that uses extremely advanced algorithms to apply just the right optimal amount of detergent for each wash?
Sounds like value-added for the customer.
... but now you end up with a system where the laundry's management team has strong incentives to influence how liberally the auto-dispenser will "spend" to give you "best results".
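The analogy maps directly onto per-token billing. A toy sketch (all numbers invented) of why revenue scales with a knob the seller controls:

    # Toy model: with per-token pricing, revenue per request scales linearly
    # with how "liberally" the provider-tuned system spends tokens.
    price_per_mtok = 15.0   # assumed $ per million output tokens
    tokens_terse = 2_000    # tokens a terse, optimal answer would take
    verbosity = 3.0         # the "auto-dispenser" knob, set by the seller

    revenue = price_per_mtok * tokens_terse * verbosity / 1e6
    print(f"revenue per request: ${revenue:.3f}")   # triples with the knob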
Wow, that is terrible. In my memory GPT-2 was more interesting than that. I remember thinking it could pass a Turing test, but that output is barely better than a Markov chain's.
The article is about two models, with 2B and 4B parameters respectively. Both are dense models. The 2B version will certainly use less power than qwen3-coder-next.
The models are quite good. They aren't just a tech demo.