This doesn’t explain the cover (seemingly not used in the final collection) with a hallucinated map on it. Maybe they only used generative art for mockups, but they did use it on a cover design.
…while not changing anything about our behavior, you mean. Because we were never ignorant of how to do better; we just couldn’t accept even the slightest inconvenience, any obstacle to our “growth”.
I'm well aware! Not only are we unable to change our behaviour, we in fact have the hubris to imagine that, if we could only use our technology to communicate with the whales, it would be enough to say:
> "Don't go to these places—even though you want to, even though your family has been breeding there for generations—because that's our special whale hunting area"
And that their behaviour would change for us, that their response would simply be:
> "No worries, thanks for the heads up! Sorry for getting in the way of your harpoons"
“In fairness, this pot of water was already uncomfortably hot before [latest development] raised the temperature another few degrees closer to boiling.”
…says a happy frog who will be as cooked as everyone else.
Hi Boris, random observer here. Would you consider apologizing to the community for mistakenly closing tickets related to this and then wrongly keeping them closed when, internally, you realized they were legitimate?
I think an apology for that incident would go a long way.
> My hypothesis is that some of this a perceived quality drop due to "luck of the draw" where it comes to the non-deterministic nature of [LLM] output.
I think you must have learned that they’re more nondeterministic than you had thought, but then wrongly connected your new understanding to the recent model degradation. Note: they’ve been nondeterministic the whole time, while the widely-reported degradation is recent.
Your argument seems to be that a statistically improbable number of people all experienced outputs that were ultimately just randomly poor, leading only to a misperception of model degradation… but this is not supported by reality, in which a different cause was found, so I was trying to connect your dots.
Not everyone is reporting, and the number of users is not constant. On the former, the noisiest will always be those who experience an issue, while on the latter, there are more people than ever using Claude Code regularly.
Combine these things in the strongest interpretation instead of an easy-to-attack one, and it's very reasonable to posit that a critical mass has been reached, where enough people report issues that others start their own investigations, while the negative outliers get the most online attention.
I'm not convinced this is the story (or at least the biggest part of it) myself, but I'm not ready to declare it illogical either.
No, that is not my argument; in fact, I don't have any argument whatsoever. It was just a plausible observation that I felt like sharing. There's nothing further to read into it. I don't have a horse in this race.
Not really, they said "some of this a perceived quality drop". That's almost certainly correct, that _some_ of it is that.
When everyone's talking about the real degradation, you'll also get everyone who experiences "random"[1] degradation thinking they're experiencing the same thing, and chiming in as well.
[1] I also don't think we're talking about the more technical type of nondeterminism here (temperature, etc.), but the nondeterminism where I can't really determine when I have a good context and when I don't, and in some cases can't tell why an LLM is capable of one thing but not another. So when I switch between tasks that I think are equally easy and it fails on the new one, or when my context has some meaningless-to-me (random-to-me) variation that causes it to fail instead of succeed, I can't determine the cause. And so I bucket myself with the crowd that's experiencing real degradation and chime in.
> You are using it to mean "maintaining full version history", I believe?
No, they are using it to mean “backed up”. Like, “if this data gets deleted or is in any way lost locally, it’s still backed up remotely (even years later, when finally needed)”.
I’m astonished so many people here don’t know what a backup is! No wonder it’s easy for Backblaze to play them for fools.
the definition of the term backup in most sources is along the lines of:
> a copy of information held on a computer that is stored separately from the computer
there is nothing about _any_ versioning, or duration requirements or similar
To use your own words, I fear it's you who doesn't know what a backup is and assumes a lot of additional (often preferable (1)) things are part of that term.
Which is a common problem, not just for the term backup.
There is a reason lawyers define technical terms in a precise, contract-specific way when drafting contracts.
Or just requirements engineering. Fail there and you might end up with a backup of all your company's important data that is susceptible to file-encrypting ransomware or similar.
---
(1): What is often preferable is also sometimes the thing you really don't want. Sometimes keeping data around too long is outright illegal. Sometimes that applies only to older versions. And sometimes just a few short-term backups are more than enough for your use case. The point here is that the term backup can't mean what you imply it does, because a lot of existing use cases are incompatible with that meaning.
> To use your own words, I fear its you who doesn't know what a backup is
Feel free to use my reputation, instead: when I say a system is backed up, data cannot be lost by that system being destroyed, because an independent copy always exists. This satisfies those whom it concerns, who put their money where their mouth is, whereas your more generous but insufficient definition would absolutely not be good enough.
When you assure a client that a system is backed up, which definition do they expect from you?
> When you assure a client that a system is backed up, which definition do they expect from you?
the one in the contract (and the various EU laws)
that is not a satisfying answer, I know
e.g. in some past projects the customers explicitly did _not_ want year-long backups and outright forbade them; redundant storage systems + daily backups kept for ~1-2 weeks (I don't remember) were pretty close to the legal limit of what we were allowed to have for that project (1)
the point I'm making was never that a good general-purpose backup solution shouldn't have versioning and years of backups
it's that
1. the word backup just doesn't mean much, so you have to be very explicit about what is needed, and sometimes that is the opposite of the "generic best solution"
2. If data is explicitly handled by another backup solution, even if it's a very bad one, it's understandable that the default is not to handle it yourself. (Though only as the default: you should always have an override option, be warned if defaults change, etc.)
Insisting a word means something it doesn't, when most non-tech people use it in exactly the sense you say isn't right, just isn't helpful at all. Telling them that this is a very bad form of backup which they probably shouldn't use is much more likely to be taken seriously.
---
(1): Side note: It's because all the data we had is backed up elsewhere, by a different solution, and can sometimes be a bit sensitive. So the customers preferred data loss (on our side, not on theirs) over any data being kept longer than needed (which would mean more data being exposed at any point in time if some hacker succeeds or similar). And from what I have heard, that project is still around, working the same way.
But ironically, that is similar to the case here: the data is owned/handled by a different system, and as such we should not handle the backup.
But isn't that exactly what Dropbox does? If I delete a file on my PC, I can go to Dropbox.com and restore it, to some period in the past (I think it depends on what you pay for). In fact, I can see every version that's changed during the retention period and choose which version to restore.
Maintaining version history out to a set retention period is a backup...no?