Is there any meta analysis on these major outages?
They seem to be occurring more and more regularly.
It would be funny if it was because of some AI coding
Please don’t remind me. Had a colleague with a senior title who just vibecoded on our CI pipeline and it ended up blocking deployments for half a day.
Maybe Kevin Fang will make a video about all of them combined?
My prejudice tells me that it's vibe slopped code.
I saw people speculating about this, but it'll be interesting to see if that's really what it was (and if so, if they'd admit it).
Preliminary articles say this was a DDoS.
A DDoS that brings down the biggest(?) DDoS protection provider?
def possible, Cloudflare DDoSed their own dashboard a few months ago with some React code
https://blog.cloudflare.com/deep-dive-into-cloudflares-sept-12-dashboard-and-api-outage/
As far as major outages, I can only think of around 10 going back 5 years. Sure there have been minor hiccups and glitches, but gosh, that's life on the internet.
Yeah I have issues with my home setup at least that often. I can only think of two occasions where the solution took more effort than a reboot. Maybe we're all just old enough for "every 6 months" to feel like every 5 mins
To clarify, by "issues" I mean "system stoppages not precipitated by me fucking with something." I screw up my own system way more often than that lol
I don't know how old you are brother, but it feels like I'm in one of those Star Trek warps where all the stars are just whizzing by.
It does feel like the major outages are growing closer together.
I want to blame plugging AI where it doesn't belong, but I suppose that putting everything into the same couple of server closets could be the primary root cause.
So far this year it's been:
Those are the ones that affect stuff at work. Fun times.
Azure was down? Did anyone notice? 😆
If you're stuck with Windows at work, you might have. I couldn't log on to our domain remotely and was unable to open some documents and emails that were signed as "internal use only".
It’s a vibe.
Dang. I had to look it up, but CrowdStrike taking down a huge percentage of Windows PCs was so last year, I guess.
My services have a higher uptime than AWS, Azure and Cloudflare despite the fact that they are running on 10 year old hardware
I guess many Lemmy instances are not as decentralized as they thought.
Caching is a hard problem to solve. For MANY years Cloudflare was the cheapest and easiest way to solve it.
But the same people that started Cloudflare have moved on or passed away. Now it's just another corp. So yeah, we are going to get downtime. In the corporate world, it's the standard, so no one is going to move off it.
...unless it keeps going down consistently.
For those who are in the know, does memcache work with Lemmy/PieFed? What is a good cache for modern-day systems?
I don't use Cloudflare at all, so I didn't notice. I do see a lot of recommendations to host behind Cloudflare, so if they have a problem, you do too
I do see a lot of recommendations to host behind Cloudflare, so if they have a problem, you do too
Fallback to Tailscale, or any number of ways to skin the cat. At the very least, everything is accessible locally. I'm not running anything mission critical, so it was more of a minor inconvenience to me.
aussie.zone sh.itjust.works lemmy.world
all were down for me
I didn't notice and I've been on Lemmyverse the whole day, on and off. Different time zone though (Eastern Europe) so I'm not sure how that translates to "this morning".
Not all instances were burned. Lemmy.world, which I use, was though. I will make a backup account on an instance that is not affected by Cloudflare.
"Further details will be made available when we've worked out how the AI screwed us this time"
This really should read: further details will be made available when our AI api is back up and can self-report on how it fucked up
I am still using cloudflare tunnels for home assistant, so I am a little annoyed. On the other hand this was a much needed reminder that I need to move home assistant to pangolin/newt as well.
NAT may be insecure but it keeps chugging on
Eggs, baskets, and lessons being taught about how it's not a good idea to heavily skew their ratio.
My own DNS is still up and strong. Take that, Cloudflare.
took much of lemmy.world down. I wonder how come I'm seeing some posts from lemmy.worlders though
Cloudflare have a large international network. Perhaps some users could access sites when coming from another region
I don’t use Cloudflare, but the only reason I noticed is that a lot of websites I go on use Cloudflare's captcha because I'm on a VPN. Now I can’t even get on those websites because the captcha is down 😭
I was having DNS issues yesterday using cloudflare's DoH, so I'm curious if it is going to be related.
And it's taken down most of Lemmy's "selfhosted" images.
Took down Framework's website, which I was using.
I wish they'd go down more, at least once a month.
TIL that Xitter depends on Cloudflare (I don't have an account, I just wanted to check a specific tweet). They don't afford their own infrastructure...
sh.itjust.works is down, but lemmy.radio is up. I need to tip them.
Well, unless it came back in the last 25 minutes, it's working fine in Western Australia.
Still having problems, their status page confirms the issue is still ongoing. https://www.cloudflarestatus.com/
Downdetector isn’t working in the app.
Because downdetector is cloudflare