this post was submitted on 26 Mar 2025
Technology
Forgive my ignorance, but using just the frequency of words, how does it come up with an answer to a question like "are sweet potatoes good for you and how do you microwave them in a way that preserves their nutrients?"
Does it just look for words that people online said regarding the question or topic?
Basically, yes.
If I were an alien and you walked up to me and said "Good Morning", and I looked around and everyone else was answering "Good Morning", I would respond with "Good Morning" too. I don't know what "Good" or "Morning" means, but I can pretend I do by giving the correct response.
In this example, "Grok" has no context on what is going on in the background. Musk may have done nothing. Musk may have altered the data sets heavily. However, the most popular response, based on what everyone else is saying, is that he did modify the data. So now it looks like he did, because that's what everyone else said.
This is why these tools have issues with facts. If everyone in the data it learned from says that 1 + 1 = 3, then it assumes 1 + 1 = 3, whether or not that is true.
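To make the "most popular response wins" idea concrete, here is a toy sketch. It is not how a real LLM works internally (those are neural networks that weigh a lot of context, not raw counts); the corpus and the function names `train` and `most_likely_next` are made up for illustration. It only shows how a purely frequency-based picker repeats whatever the majority said.

```python
from collections import Counter, defaultdict

def train(sentences):
    """Count which word follows each word in the toy corpus."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical "internet": three people say the wrong thing, one says the right thing.
corpus = [
    "1 + 1 = 3",
    "1 + 1 = 3",
    "1 + 1 = 3",
    "1 + 1 = 2",
]

model = train(corpus)
answer = most_likely_next(model, "=")
print(answer)  # the majority answer, not the true one
```

The picker never checks the arithmetic. It only knows that "3" followed "=" more often than "2" did.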
@Viskio_Neta_Kafo I assume it's big data corpus linguistics: each word/phrase is assigned an identifier and then compared to the corpora the LLM holds to see what words are commonly grouped. Linguists have used corpora for decades to quantitatively analyse language; here are some open ones: https://www.english-corpora.org/. I assume the LLM identifies the likely language "type" to choose a good corpus, identifies question tags and words in key positions, finds common response structures and starts building.
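A rough sketch of the "assign an identifier, then see what is commonly grouped" idea, under heavy assumptions: the corpus is a made-up three-line example, the helper names `build_vocab`, `encode`, and `grouped_pairs` are hypothetical, and whole words are used as units. Real LLMs use subword tokenizers and keep learned weights rather than looking things up in a stored corpus, so treat this as an illustration of the counting step only.

```python
from collections import Counter

def build_vocab(texts):
    """Assign each distinct word an integer identifier, in first-seen order."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(text, vocab):
    """Turn a text into its list of word identifiers."""
    return [vocab[word] for word in text.lower().split()]

def grouped_pairs(texts, vocab):
    """Count how often each pair of identifiers appears side by side."""
    pairs = Counter()
    for text in texts:
        ids = encode(text, vocab)
        pairs.update(zip(ids, ids[1:]))
    return pairs

# Hypothetical toy corpus.
toy_corpus = [
    "good morning everyone",
    "good morning to you",
    "good night everyone",
]

vocab = build_vocab(toy_corpus)
pairs = grouped_pairs(toy_corpus, vocab)
greeting_ids = encode("good morning", vocab)
print(greeting_ids, pairs[tuple(greeting_ids)])
```

Here "good" and "morning" get identifiers 0 and 1, and the pair (0, 1) shows up twice, more than any other pair in the toy corpus.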