this post was submitted on 24 May 2025
        
      
      Science Memes
Some details. One of the major players doing the tar pit strategy is Cloudflare. They're a giant in networking and infrastructure, and they use AI (more traditional, not LLMs) ubiquitously to detect bots. So it is an arms race, but one where both sides have massive incentives.
Generated nonsense is indeed detectable, but focusing on that misunderstands the purpose: economics. Scraping bots are used because they're a cheap way to get training data. If a nonzero portion of the scraped data is poisonous, the scraper has to spend more and more resources filtering it out. The better the nonsense, the harder it is to detect. Cloudflare is known to use small LLMs to generate the nonsense, so telling it apart from real content requires systems at least that complex.
So in short, the tar pit with garbage data decreases the average value of scraped data for bots that ignore do-not-scrape instructions.
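To put rough numbers on the economics argument, here's a back-of-the-envelope sketch in Python. Everything in it is an assumption made up for illustration: the function name, the unit values, the costs, and the simplification that a filter never throws away clean pages. It is not how Cloudflare or any scraper actually prices this.

```python
def net_value_per_page(poison_fraction, value_clean=1.0, harm_poison=1.0,
                       scrape_cost=0.1, filter_cost=0.0, filter_recall=0.0):
    """Expected net value of one scraped page, in arbitrary hypothetical units.

    poison_fraction: share of scraped pages that are tar pit garbage
    filter_recall:   share of garbage pages the scraper's filter catches
    filter_cost:     per-page cost of running that filter
    Assumes the filter never discards clean pages (a simplification).
    """
    clean = 1.0 - poison_fraction
    # Garbage that slips past the filter ends up in the training set
    surviving_poison = poison_fraction * (1.0 - filter_recall)
    return (clean * value_clean
            - surviving_poison * harm_poison
            - scrape_cost
            - filter_cost)


if __name__ == "__main__":
    print(net_value_per_page(0.0))   # no tar pit
    print(net_value_per_page(0.2))   # 20% garbage, no filtering
    print(net_value_per_page(0.2, filter_cost=0.15, filter_recall=0.9))
```

With these made-up numbers the scraper loses value either way: it eats the garbage, or it pays for a filter that has to get more expensive as the nonsense gets better.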
The fact the internet runs on lava lamps makes me so happy.