vivendi

joined 1 month ago
[–] vivendi@programming.dev 4 points 1 week ago* (last edited 1 week ago)

The model ISN'T outputting the letters individually; binary (byte-level) models do that, as I mentioned, not standard transformers.

The model output is more like Strawberry

Tokens can be a letter, part of a word, any single lexeme, any word, or even multiple words ("let be")

Okay, I did a shit job demonstrating the time axis. The model doesn't know the underlying letters of the previous tokens, and this process only goes forward in time.
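To make the token point concrete, here's a toy sketch (NOT a real tokenizer, and the vocab/IDs are made up): the model sees opaque integer IDs, not the letters inside them.

```python
# Hypothetical vocab: pieces can be word fragments or even multi-word strings.
vocab = {"Straw": 101, "berry": 102, "let": 103, " be": 104}

def encode(text, vocab):
    """Greedy longest-match tokenization over the toy vocab."""
    ids = []
    while text:
        for piece in sorted(vocab, key=len, reverse=True):
            if text.startswith(piece):
                ids.append(vocab[piece])
                text = text[len(piece):]
                break
        else:
            raise ValueError("no matching token piece")
    return ids

print(encode("Strawberry", vocab))  # [101, 102] — no letter 'r' anywhere in what the model sees
```

From the model's side of the interface, "Strawberry" is just the sequence `[101, 102]`.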

[–] vivendi@programming.dev 0 points 1 week ago (2 children)

No, this literally is the explanation. The model understands the concept of "Strawberry". It can output that concept (and the output step itself is very complicated) in English as Strawberry, in Persian as توت فرنگی, and so on.

But the model does not understand how many Rs exist in Strawberry or how many ت exist in توت فرنگی
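Counting letters is trivial once you have character-level access, which is exactly what the model lacks. A minimal sketch (the token IDs are hypothetical, carried over from the toy example):

```python
word = "Strawberry"
# With character access, counting is one call:
print(word.lower().count("r"))  # 3

# But the model only ever sees something like [101, 102] for ["Straw", "berry"].
# Nothing in those integers encodes how many 'r's they contain; the mapping
# back to letters lives in the tokenizer, outside the model.
token_ids = [101, 102]
```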

[–] vivendi@programming.dev 3 points 1 week ago

Broadcom management deserve gulag

[–] vivendi@programming.dev 4 points 1 week ago (5 children)

For usage like that you'd wire an LLM into a tool use workflow with whatever accounting software you have. The LLM would make queries to the rigid, non-hallucinating accounting system.

I still don't think it would be anywhere close to a good idea, because you'd need a lot of safeguards; one hallucination slipping through would fuck up your accounting, and then you'll have some unpleasant meetings with the local equivalent of the IRS.
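The wiring described above looks roughly like this. A minimal sketch of the tool-use pattern, where the LLM never computes numbers itself and only emits tool calls that a deterministic backend answers (all names and the ledger data here are hypothetical):

```python
# Rigid, non-hallucinating side of the workflow: a plain data store.
LEDGER = {"2024-Q1": {"revenue": 120_000, "expenses": 87_500}}

def run_tool(name, args):
    """Execute a tool call against the deterministic accounting backend."""
    if name == "get_ledger_entry":
        return LEDGER[args["period"]][args["field"]]
    raise ValueError(f"unknown tool: {name}")

def handle_model_turn(turn):
    """If the model asked for a tool, execute it; otherwise pass its text through."""
    if turn.get("tool_call"):
        call = turn["tool_call"]
        return {"tool_result": run_tool(call["name"], call["args"])}
    return {"text": turn["text"]}

# A model turn that queries the backend instead of guessing a number:
turn = {"tool_call": {"name": "get_ledger_entry",
                      "args": {"period": "2024-Q1", "field": "revenue"}}}
print(handle_model_turn(turn))  # {'tool_result': 120000}
```

The safeguards I mentioned would sit in `run_tool`: validation, allow-lists, read-only access, audit logs.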

[–] vivendi@programming.dev 0 points 1 week ago (4 children)

This is because autoregressive LLMs work on high-level "tokens". There are LLM experiments that can access byte-level information and correctly answer such questions.
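Byte-level models sidestep the problem because the input really is one unit per byte, so letter counts are recoverable from the sequence itself. A quick sketch of what that input looks like:

```python
word = "strawberry"
byte_seq = list(word.encode("utf-8"))  # one integer per letter for ASCII text
print(byte_seq)
print(byte_seq.count(ord("r")))  # 3 — the count is visible in the sequence
```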

Also, they don't want to support you omegalul. Do you really think call centers are hired to give a fuck about you? This is intentional.

[–] vivendi@programming.dev 0 points 1 week ago* (last edited 1 week ago)

According to https://arxiv.org/abs/2405.21015

The absolute most monstrous, energy-guzzling model tested needed 10 MW of power to train.

Most models need less than that, and non-frontier models can even be trained on gaming hardware with comparatively little energy consumption.

That paper, by the way, says model training compute is growing 2.4x year over year, BUT it doesn't cover DeepSeek, which rocked the western AI world with its comparatively low training cost (2.7M GPU-hours in total).

Some companies offset their model-training environmental damage with renewables and whatever bullshit, so the actual daily usage cost matters more than the huge one-time cost at the start ("Drop by drop an ocean is formed" - Persian proverb).
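For scale, a back-of-envelope energy estimate for the 2.7M GPU-hour figure quoted above. The 0.35 kW per-GPU draw is my assumption (roughly an H800-class card under load), not a number from the paper:

```python
# Back-of-envelope: total training energy from GPU-hours.
gpu_hours = 2.7e6      # DeepSeek's reported total, from the comment above
kw_per_gpu = 0.35      # ASSUMED average draw per GPU under load
energy_mwh = gpu_hours * kw_per_gpu / 1000
print(f"~{energy_mwh:.0f} MWh")  # ~945 MWh under these assumptions
```

Under these assumptions that's on the order of 1 GWh, i.e. a one-time cost, which is why the recurring inference cost is the more interesting number.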

[–] vivendi@programming.dev 0 points 1 week ago* (last edited 1 week ago) (1 children)

This is actually misleading in the other direction: ChatGPT is a particularly intensive model. You can run a GPT-4o-class model on a consumer mid-to-high-end GPU, which then has an environmental impact in the same ballpark as gaming.

You can also run a cluster of 3090s or 4090s to train a model, which is actually what people do, in which case it's still in the same range as gaming. (And more productive than an 8-hour WoW grind while chugging a warmed-up Nutella glass as a drink.)

Models like Google's Gemma (NOT Gemini; these are two completely different things) are insanely power-efficient.
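The "same ballpark as gaming" claim is easy to sanity-check. A rough sketch, assuming the same ~0.3 kW GPU draw for both activities (an assumption, not a measurement):

```python
# Energy for a 2-hour session, local inference vs gaming, same GPU.
gpu_kw = 0.3     # ASSUMED draw for a consumer mid-to-high-end GPU under load
hours = 2
session_kwh = gpu_kw * hours
print(session_kwh)  # 0.6 kWh either way — identical hardware, identical draw
```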

[–] vivendi@programming.dev 0 points 1 week ago (1 children)

China has that massive rate because it manufactures for the US; the US itself is a huge polluter through military and luxury consumption, NOT manufacturing.

[–] vivendi@programming.dev 0 points 1 week ago* (last edited 1 week ago) (7 children)

I will cite the scientific article later when I find it, but essentially you're wrong.


[–] vivendi@programming.dev 11 points 1 week ago* (last edited 1 week ago)

Not directly, but the Democrats are a right-wing bourgeois liberal party, so they essentially serve the interests of capital.

The interests of capital eventually lead to imperialism, and when that starts to crack (<- you're here), fascism. Unless the actual socialists win.

And you know what? I don't think socialism has a chance in yankee land. They have spent decades upon decades brainwashing the population in a highly compliant fashion. The highest point of American radical action is just some adventurist terrorist shit that doesn't accomplish anything. Ask Americans to actually organize and do vanguard action and they'll screech at you, because bourgeois media and education have made you a slave inside your own brain.

So yeah, Democrats or whatever. Why does it matter whether fascism comes in 9 years or 4? You guys are straight-up fucked.

If you live in yankee land, emigrate your ass before you lose your ass

[–] vivendi@programming.dev 5 points 1 week ago

It's an S for Summation

It should be a D though, because it fucks students
