Hacker News | stri8ted's comments

I doubt this is representative of real-world usage. There's a difference between a few turns on a web chatbot and many-turn CLI usage on a real project.

Entire segments of the podcast sphere are making their money talking about these so-called unspeakable subjects. Why don't you share what you really think?

I mean, you're right, unless you actually work for a government or state, and then you cannot.

https://gov.texas.gov/news/post/anti-israel-policies-are-ant...


Maybe because there are laws against it?

Those are the same thing

They are not if there aren't customers who are willing to pay more. For instance, imagine a widget that lasts 1 year and is just under 1/2 the price of one that lasts 2 years. There may be high demand because it's the more economical option. If you raise the price so that it's 1/2 the price of the 2-year widget, then demand collapses without affecting supply.
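The widget example can be put in toy numbers (all figures below are made up for illustration):

```python
# Toy numbers for the widget example above: a 2-year widget at $100,
# and a 1-year widget priced just under half of that.
def cost_per_year(price, lifespan_years):
    return price / lifespan_years

one_year = cost_per_year(45.0, 1)    # $45 per year of service
two_year = cost_per_year(100.0, 2)   # $50 per year of service

# At $45 the short-lived widget is cheaper per year, so demand for it can
# be high. Raise it to $50 (exactly half the 2-year price) and that
# per-year advantage disappears: demand collapses, supply unchanged.
print(one_year, two_year)
```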

If customers weren't willing to pay more, then a higher price wouldn't solve anything. The price is said to be too low exactly because people are trying to buy more than there is available to sell. The whole point of higher prices is to try to scare people away. Not enough supply and a price too low are the same thing.

48 GB is not consumer hardware. But fundamentally, there are economies of scale due to batching, power distribution, better utilization, etc., that mean data-center tokens will be cheaper. Also, as the cost of training (frontier) models increases, it's not clear the Chinese companies will continue open-sourcing them. Notice, for example, that Qwen-Max is not open source.


Nothing obviously prevents using this approach, e.g. for 3B-active or 10B-active models, which do run on consumer hardware. I'd love to see how the 3B performs with this on the MacBook Neo, for example. More relevantly, data-center-scale tokens are only cheaper for the specific type of tokens data centers sell. If you're willing to wait long enough for your inferences (and your overall volume is low enough that you can afford this), you can use approaches like OP's (offloading read-only data to storage) to handle inference on low-performance, slow "edge" devices.
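The "offload read-only data to storage" idea can be sketched with a memory-mapped weight file. The file name, shapes, and the 2-of-8 routing below are invented for illustration; real MoE offloading is considerably more involved:

```python
import os
import tempfile
import numpy as np

# Keep all expert weights in an on-disk .npy file and memory-map it, so a
# forward pass only reads the few experts the router selects.
n_experts, d = 8, 4
path = os.path.join(tempfile.mkdtemp(), "experts.npy")
np.save(path, np.arange(n_experts * d, dtype=np.float32).reshape(n_experts, d))

experts = np.load(path, mmap_mode="r")  # nothing loaded into RAM yet
routed = [1, 5]                         # pretend the router picked these two
active = experts[routed]                # only these rows are read from disk
out = active.sum(axis=0)                # stand-in for running the expert MLPs
print(out.tolist())
```

The trade-off is exactly the one described above: storage reads are slow, so this only makes sense when you can tolerate high latency in exchange for fitting a model that wouldn't otherwise fit in RAM.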


It is consumer hardware in the sense that MacBook Pros come with this RAM size as a base option and that you can buy one as a consumer, without having to sign a special B2B contract, show that your company is big and reputable enough, or order a minimum of 10 or 100 units.


> 48 GB is not consumer hardware.

It’s a MacBook.


Technically that's correct (which as we all know is the best kind of correct), but really, how many consumers are buying a high-end MacBook Pro with 48GB or more of RAM? That's a very small percentage of the population. In these kinds of discussions, "consumer" is being used as a proxy for "something your average home laptop buyer might have". And a 48GB MBP is not that.

I know it's annoying, because a 48GB MBP is indeed technically "consumer hardware", but please understand the context and don't be pedantic. You know what the GP meant. (And if not, that's... kinda on you.)


> but please understand the context and don't be pedantic.

The context is this is something I can pick up at an Apple Store and not some rig I have to build with NVIDIA cards.

I led with:

> get closer and closer to consumer hardware

I think this demonstrates getting closer, whether you think a MacBook is consumer hardware or not. But I'm the one being pedantic.


Do you have any evidence to support this view?


Read about who founded it and how. It's not a secret at all.


It’s funny how I got immediately downvoted and flagged


Who else would MITM 30% of the internet?


Price:

Input: $2.50 / 1M tokens

Cached input: $0.25 / 1M tokens

Output: $15.00 / 1M tokens

https://openai.com/api/pricing/
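At those listed rates, a request's cost works out as follows (the token counts in the example are hypothetical):

```python
# Listed rates, in dollars per 1M tokens.
PRICE_INPUT, PRICE_CACHED, PRICE_OUTPUT = 2.50, 0.25, 15.00

def request_cost(input_tokens, cached_tokens, output_tokens):
    """Cached tokens bill at the cached rate, the rest at the full input rate."""
    fresh = input_tokens - cached_tokens
    return (fresh * PRICE_INPUT
            + cached_tokens * PRICE_CACHED
            + output_tokens * PRICE_OUTPUT) / 1_000_000

# e.g. a 20k-token prompt with 15k tokens cached, and a 2k-token reply:
print(request_cost(20_000, 15_000, 2_000))  # 0.04625
```

Note how output dominates: at these rates the 2k output tokens cost more than the entire 20k-token prompt.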


It's clearly true that there have been abuses as a result of this technology. And it's also clearly true that criminals have been caught as a result of the cams who otherwise would not have been.

If you believe the costs of the abuses, and potential abuses, exceed the benefit, then at least be honest about the trade-off, because there are real benefits.

Personally, I believe the costs, on net, are worth the benefits. And insofar as the costs can be further reduced without losing most of the benefits, then great. This is not right or wrong. It's just a question of values, and how you weigh the costs vs. the benefits.

Don't down-vote this all at once.


Please don't comment about the voting on comments. It never does any good, and it makes boring reading.[1]

[1] https://news.ycombinator.com/newsguidelines.html


My question to you is: how are you assessing the costs? Do you know how many crimes have been stopped as a result of these cams? Do you know the extent to which our privacy is being lost and our data is being used against us or others?


I take into account publicly available information (news articles), factor in personal anecdotes, and reason about human nature and incentives. I know the extent of reported abuses, and I do my best to extrapolate. It's not perfect, but such is life.

To be clear, even if we all agreed on the data, I still would not expect everyone to take the same position. There are subjective differences in values.


I get that, but at the very least one should demand evidence of their efficacy.


Flock has put out a report claiming 10% of crime in the US is solved using their technology. There are, of course, counter-arguments claiming this is not valid.

https://www.flocksafety.com/customers/how-many-crimes-do-aut...


I strongly agree with this take.


Can you show some WER comparisons against other ASR models? Especially for non-English.


I've been experimenting with Gemini 3.1 Flash Lite and the quality is very good.

I haven't found official benchmarks yet, but you can find Gemini 3 Flash word error rate benchmarks here: https://artificialanalysis.ai/speech-to-text/models/gemini — they are close to SOTA.

I speak daily in both English and Russian and have been using Gemini 3 Flash as my main transcription model for a few months. I haven't seen any model that provides better overall quality in terms of understanding, custom dictionary support, instruction following, and formatting. It's the best STT model in my experience. Gemini 3 Flash has somewhat uncomfortable latency though, and Flash Lite is much better in this regard.


How is this content related to HN? Are there any submission criteria?


Exactly. As far as I'm concerned, the benchmark is useless. It's way too easy and rewarding to train on it.


It's just an in-joke, he doesn't intend it as a serious benchmark anymore. I think it's funny.


Y'all are way too skeptical, no matter what cool thing AI does you'll make up an excuse for how they must somehow be cheating.


Jeff Dean literally featured it in a tweet announcing the model. Personally, it feels absurd to believe they've put absolutely no thought into optimizing this type of SVG output, given the disproportionate amount of attention devoted to a specific test for over a year.

I wouldn't really even call it "cheating" since it has improved models' ability to generate artistic SVG imagery more broadly but the days of this being an effective way to evaluate a model's "interdisciplinary" visual reasoning abilities have long since passed, IMO.

It's become yet another example in the ever growing list of benchmaxxed targets whose original purpose was defeated by teaching to the test.

https://x.com/jeffdean/status/2024525132266688757?s=46&t=ZjF...


Or maybe you’re too trusting of companies who have already proven to not be trustworthy?


I mean, if you want to make your own benchmark, simply don't make it public and don't run it often. If your salamander on skis or whatever gets better with time, it likely has nothing to do with being benchmaxxed.

