this post was submitted on 04 Jul 2025
104 points (96.4% liked)

Selfhosted


Every time I check my nginx logs, there are more scrapers than I can count, and I could not find any good open-source solutions.

[–] daddycool@lemmy.world 10 points 2 days ago (2 children)

Some of those AI crawlers are not actually crawlers, but real users who use AI instead of coming directly to the site. So blocking AI completely could also reduce your exposure.

Normally, websites want users to come to their site, instead of an AI search engine "stealing" the content and presenting it as its own. Yes, AI search engines are more convenient for the user, but in the end this will discourage website creators and thereby cut off the engines' own "food supply".

[–] Fedditor385@lemmy.world 1 points 2 days ago (2 children)

I understand, but the shift in user behaviour is significant and I think websites are not taking it into account. If users move more and more to AI (and since Google introduced AI mode, it's only a question of time until it becomes the default), we will see more and more of what we think are AI crawlers and fewer and fewer organic users.

AI seems to be the new middleman between you and the user, and if you block the middleman, you block the user. For hobby websites or established sites, blocking may make sense, because people either already know of them or more exposure is not a goal or requirement. For everyone else, it will be painful.

[–] lambalicious@lemmy.sdf.org 2 points 1 day ago

So, what I'm reading is, if your "users" are bad (or bots), just get better users.

Sounds like a net win.

[–] Noja@sopuli.xyz 2 points 2 days ago

I honestly don't think most people replace search with AI. It will also slowly solve itself when Google injects ads into the output.

[–] Zexks@lemmy.world 1 points 22 hours ago

We all understand that. But if those users keep insisting on giving everyone their life story and current opinion on world politics before giving us the bread recipe we came for, they can fade away.