this post was submitted on 06 Sep 2024
Technology
Here's an experiment for you to try at home. Ask an AI model a question, copy a sentence or two of what it gives back, and paste it into a search engine. The results may surprise you.
And stop comparing AI to humans but then giving AI models more freedom. If I wrote a paper I'd need to cite my sources. Where the fuck are your sources ChatGPT? Oh right, we're not allowed to see that but you can take whatever you want from us. Sounds fair.
Did the experiment.
Zero shock factor. It showed an empty Google search result. I have screenshots for the deniers. I don't know what you think will happen, but unless you're asking it some super vague question where the answer would be unanimous across the board, it's not going to spit out some shock-factor quote that you can google. What a waste of an 'experiment'.
Bro this was 6 months ago lol. Models have gotten way better since then. I made this comment when Google was still telling people to put glue on pizza. Which, if you did re-input the answer, would take you to a reddit post. Almost all of them would take you to a reddit post back then.
That's insane it used to do that. Never seen it myself.
Can you just give us the TL;DR?
AI chatbots copy/paste much of their "training data" verbatim.
Microsoft's Copilot funnily enough actually provides sources that it pulls from the internet if you ask it to.
Not to fully argue against your point, but I do want to push back on the citations bit. Given the way an LLM is trained, it's not really equivalent to me citing the papers I researched for a paper. It would be more akin to asking me to cite every piece of written or verbal media I've ever encountered, as they all contributed in some small way to the way the words were formulated here.
Now, if specific data were injected into the prompt, or maybe if it was fine-tuned on a small subset of highly specific data, I would agree those should be cited, as they are being accessed more verbatim. The whole "magic" of LLMs was that training needed to cross a threshold of data, combined with the attention mechanism, and then the network was pretty suddenly able to maintain coherent sentence structure. It was only with loads of varied data from many different sources that this really emerged.
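To make the "injected into the prompt" case concrete, here is a rough sketch of how citations could work there. It is only an illustration, not how any particular product does it: `Source`, `build_cited_prompt`, and `reference_list` are made-up names, and the prompt wording is an assumption. The point is just that when the text is pasted into the prompt, you know exactly which sources were used and can number them.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A document that gets pasted into the prompt (hypothetical structure)."""
    title: str
    url: str
    text: str


def build_cited_prompt(question: str, sources: list[Source]) -> str:
    """Number each injected source so the answer can refer back to it as [n]."""
    blocks = [
        f"[{i}] {s.title} ({s.url})\n{s.text}"
        for i, s in enumerate(sources, start=1)
    ]
    return (
        "Answer using only the sources below and cite them as [n].\n\n"
        + "\n\n".join(blocks)
        + f"\n\nQuestion: {question}"
    )


def reference_list(sources: list[Source]) -> str:
    """Build the reference list to show the user alongside the answer."""
    return "\n".join(
        f"[{i}] {s.title} - {s.url}" for i, s in enumerate(sources, start=1)
    )
```

Nothing like this is possible for the pretraining data itself, since there is no per-answer list of sources to number, which is the distinction I was getting at.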