I don't care how rough the estimate is, LLMs are using insane amounts of power, and the message I'm getting here is that the newest incarnation uses even more.
BTW a lot of it seems to be just inefficient coding as Deepseek has shown.
BTW a lot of it seems to be just inefficient coding as Deepseek has shown.
Kind of? Inefficient coding is definitely a part of it. But a large part is also just the iterative nature of how these algorithms operate. We might be able to improve that via code optimization a little bit. But without radically changing how these engines operate, it won't make a big difference.
The scope of the data being used and trained on is probably a bigger issue. Which is why there's been a push by some to move from LLMs to SLMs. We don't need the model to be cluttered with information on geology, ancient history, cooking, software development, sports trivia, etc if it's only going to be used for looking up stuff on music and musicians.
But either way, there's a big 'diminishing returns' factor to this right now that isn't being appreciated. Typical human nature: give me that tiny boost in performance regardless of the cost, because I don't have to deal with it. It's the same short-sighted shit that got us into this looming environmental crisis.
Coordinated SLM governors that can redirect queries to the appropriate SLM seems like a good solution.
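For what it's worth, the "governor" idea can be sketched in a few lines. This is only a toy under stated assumptions: the topic names, keyword lists, and the `route` function are all made up for illustration, and a real router would more likely use a small classifier than keyword matching.

```python
import re

# Hypothetical specialist models and the keywords that hint at each one.
ROUTES = {
    "music": {"album", "band", "song", "musician", "guitar"},
    "cooking": {"recipe", "bake", "oven", "ingredient"},
}

def route(query: str, default: str = "general") -> str:
    """Pick the specialist whose keywords overlap the query the most."""
    words = set(re.findall(r"[a-z']+", query.lower()))
    best, best_hits = default, 0
    for name, keywords in ROUTES.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = name, hits
    return best  # falls back to `default` when nothing matches

print(route("Which album did this band release first?"))
```

Anything that matches no specialist goes to the fallback model, so only those queries pay for the big general-purpose one.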
And water usage, which will also increase as fires increase and people have trouble getting access to clean water.
https://techhq.com/news/ai-water-footprint-suggests-that-large-language-models-are-thirsty/
It would only take one regulation to fix that:
Datacenters that use liquid cooling must use closed loop systems.
The reason they don't, and why they set up in the desert, is because water is incredibly cheap and energy to cool a closed loop system is expensive. So they use evaporative open loop systems.
Unfortunately, I wonder whether it’s more expensive to set up a really expensive closed loop system or to buy lawmakers who will vote against bills saying you should. It’s a tale as old as time.
Politicians are cheap
Yeah sorry forgot my /s there
Also don't forget how people like wasting resources by asking questions like "what's the weather today".
I have an extreme dislike for OpenAI, Altman, and people like him, but the reasoning behind this article is just stuff some guy has pulled from his backside. There are no facts here, it's just "I believe XYZ" with nothing to back it up.
We don't need to make up nonsense about the LLM bubble. There's plenty of valid enough criticisms as is.
By circulating a dumb figure like this, all you're doing is granting OpenAI the power to come out and say "actually, it only uses X amount of power. We're so great!", where X is a figure that on its own would seem bad, but compared to this inflated figure sounds great. Don't hand these shitty companies a marketing win.
I think AI power usage has an upside. No amount of hype can pay the light bill.
AI is either going to be the most valuable tech in history, or it's going to be a giant pile of ash that used to be VC capital.
Bit of a clickbait. We can't really say it without more info.
But it's important to point out that the lab's test methodology is far from ideal.
The team measured GPT-5’s power consumption by combining two key factors: how long the model took to respond to a given request, and the estimated average power draw of the hardware running it.
What we do know is that the price went down. So this could be a strong indication the model is, in fact, more energy efficient. At least a stronger indicator than response time.
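To make it concrete, the method quoted above boils down to one multiplication. A minimal sketch, where the function name and both input numbers are made-up placeholders and not measurements of any real model:

```python
def estimate_energy_wh(response_seconds: float, avg_power_watts: float) -> float:
    """Energy in watt-hours = average power (W) x duration (h)."""
    return avg_power_watts * response_seconds / 3600.0

# Hypothetical inputs: a 30 second response on hardware assumed to draw 1200 W.
print(estimate_energy_wh(response_seconds=30.0, avg_power_watts=1200.0))
```

Both inputs are guesses from the outside, and a slow response scales the estimate up linearly even if the hardware was busy serving other requests at the same time.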
Fucking Doc Brown could power a goddamn time machine with this many jiggawatts, fuck I hate being stuck in this timeline.
There's such a huge gap between what I read about GPT-5 online, versus the overwhelmingly disappointing results I get from it for both coding and general questions.
I'm beginning to think we're in the end stages of Dead Internet, where basically nothing you see online has any connection to reality.
People who fawn over generative AI haven't tried to use it for more than 5 seconds. I wish it could run a ttrpg game for me or even just remember the details of its original prompt, but it's not even close.
And an LLM that you could run locally on a flash drive will do most of what it can do.
I mean, no, not at all, but local LLMs are a less energy-reckless way to use AI.
Probably not a flash drive, but you can get decent mileage out of 7B models that run on any old laptop for tasks like text generation, shortening or summarizing.
I don’t buy the research paper at all. Of course we have no idea what OpenAI does because they aren’t open at all, but Deepseek's published papers suggest it’s much more complex than 1 model per node… I think they recommended like a 576 GPU cluster, with a scheme to split experts.
That, and going by the really small active parameter count of gpt-oss, I bet the model is sparse as heck.
There’s no way the effective batch size is 8, it has to be waaay higher than that.
And perhaps even more importantly, the per-token cost of GPT-5's API is less than GPT-4's. That's why OpenAI was so eager to move everyone onto it, it means more profit for them.
I don’t believe API costs are tied all that closely to the actual cost to OpenAI. They seem to be selling at a loss, and they may be selling at an even greater loss to make it look like they are progressing. The second OpenAI seems like they have plateaued, their valuation will crash and it will be game over for them.
How the hell are they going to sustain the expense to power that? Setting aside the environmental catastrophe that this kind of "AI" entails, they're just not very profitable.
Look at all the layoffs they've been able to implement with the mere threat that AI has taken their jobs. It's very profitable, just not in a sustainable way. But sustainability isn't the goal. Feudal state mindset in the populace is.
Isn't this the back plot of the game, Rain World? With the slug cats and the depressed robots stuck on a decaying world when the sapient, organic species all left?
Spoilers dude.
Of course there are comments doubting the accuracy, which by itself is valid, but they are merely doing it to defend AI. IMHO, even at a fifth of the estimates, we’re talking humongous amounts of power, all for a so-so search engine, half arsed chatbots and dubious nsfw images mostly. And let’s not forget: the estimates may be inaccurate because they are TOO LOW. Now wouldn’t that be fun?
but they are merely doing it to defend AI.
No they're not, you can agree the research is garbage without defending AI. It literally assumes everything. GPT5 could be using eight times the power. It could be using half the power. It could be using a quadrillion times the power. Nobody knows, because they keep it secret.
This bubble needs to pop, the sooner the better.
That's alright. When they've got a generation of people who can't even hold a conversation without it, let alone do a job, that price increase will drop that energy use pretty rapidly.
40Wh or 18Wh which is it?
That's my old gaming PC running a game for 2min42sec-6minutes ... Roughly.
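Checking the arithmetic on that comparison: both figures line up with a PC drawing about 400 W. The 400 W is inferred from the two run times, not stated anywhere, so treat it as an assumption.

```python
def runtime_seconds(energy_wh: float, power_watts: float) -> int:
    """How long a device drawing power_watts runs on energy_wh, in whole seconds."""
    return round(energy_wh / power_watts * 3600)

ASSUMED_PC_WATTS = 400.0  # assumption inferred from the run times, not a measurement

for wh in (18.0, 40.0):
    minutes, seconds = divmod(runtime_seconds(wh, ASSUMED_PC_WATTS), 60)
    print(f"{wh:g} Wh -> {minutes} min {seconds} s")
```

That gives 2 min 42 s for 18 Wh and 6 min 0 s for 40 Wh.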
that's a lot. remember to add "-noai" to your google searches.
I'm just going to ignore the AI recommendations, let them burn money.
i don't judge you for that. honestly it matters fuck all at this point
The last 6 to 12 months of open models have pretty clearly shown you can get substantially better results with the same model size, or the same results with a smaller model size. E.g. Llama 3.1 405B being basically equal to Llama 3.3 70B, or R1-0528 being substantially better than R1. The little information available about GPT 5 suggests it uses mixture of experts and dynamic routing to different models, both of which can reduce computation cost dramatically. Additionally, simplifying the model catalogue from 9ish(?) to 3, when combined with their enormous traffic, will mean higher utilization of batch runs. Fuller batches run more efficiently on a per query basis.
Basically they can't know for sure.
Help me out here. What designates the “response” type? Someone asking it to make a picture? Write a 20 page paper? Code a small app?