this post was submitted on 11 Mar 2025
335 points (98.8% liked)

Technology

A new study from the Columbia Journalism Review showed that AI search engines and chatbots, such as OpenAI's ChatGPT Search, Perplexity, DeepSeek Search, Microsoft Copilot, Grok, and Google's Gemini, are wrong far too often.

[–] EncryptKeeper@lemmy.world 3 points 17 hours ago* (last edited 17 hours ago) (2 children)

The problem is that the sales pitch for these AI answering services in search engines is that they save you the time of opening search results and reading them yourself. The trouble with 80-90% accuracy is that if the summaries are hallucinated even once, you can no longer trust them implicitly, so in every case you now have to verify what they say by opening the search results and reading them yourself anyway. It's a convenience feature that offers no actual convenience.

Sure, it's impressive that they're accurate 80-90% of the time, but AI used in this context is of no actual value.

[–] Patch@feddit.uk 4 points 13 hours ago (1 children)

It's a real issue. A strong use case for LLM search engines is producing summaries that combine many facts which would take some time to compile by searching the old-fashioned way. But if the output is only 90% accurate and 10% hallucinated bullshit, it becomes very difficult to pick the bullshit out from the truth.
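A rough back-of-the-envelope sketch shows why this compounds: assuming (as a simplification) that each fact in a summary is independently correct with probability p, the chance that a multi-fact summary is entirely hallucination-free falls off quickly as the number of facts grows.

```python
# Probability that an n-fact summary contains zero hallucinations,
# assuming each fact is independently correct with probability p.
# (Independence is a simplifying assumption, not a claim about any model.)
def p_fully_correct(p_per_fact: float, n_facts: int) -> float:
    return p_per_fact ** n_facts

for n in (1, 5, 10, 20):
    print(f"{n:2d} facts -> {p_fully_correct(0.9, n):.3f}")
```

At 90% per-fact accuracy, a 10-fact briefing is fully correct only about a third of the time, which is why every claim ends up needing manual verification.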

The other day I asked Copilot to provide an overview of a particular industrial sector in my area. What it produced was 90% a concise, accurate, readable and informative briefing, and 10% complete nonsense. It hallucinated an industrial estate that doesn't exist and a whole government programme that doesn't exist, and it talked about a scheme that went defunct 20 years ago as if it were still current. If I weren't already very familiar with the subject, I might not have caught it. Anyone actually relying on that for useful work is in serious danger of making a complete tit of themselves.

[–] OhVenus_Baby@lemmy.ml 1 points 12 hours ago

Copilot sucks, and I totally understand the POV. I stick with GPT and Mixtral. I don't think they're going anywhere anytime soon, but they need significant refinement.

[–] Flisty@mstdn.social 3 points 13 hours ago

@EncryptKeeper @OhVenus_Baby I have very much embraced the swearing method to get rid of 50% of my Google result screen being taken up by an untrustworthy statement. Just a waste of space and scrolling time.