The IsUpMap lets you check the status of over 100 major sites at once (isupmap.com)
97 points by mikelgan 9 hours ago | 36 comments



Pretty cool visualization.

I've been building something like this for 12 years now.

One major difference is that mine doesn't rely only on the "official" status page; it also receives millions of reports from users about outages.

So your single pane of glass can show not just known outages but emerging ones that haven't been acknowledged yet by providers.

Also supports more than 8,000 services.



Something must be wrong, it's showing github as up!

GitHub does not report their outages. Being able to load GitHub.com does not mean GH Actions are working.

beautiful visualization of "complex systems run in degraded mode"

https://how.complexsystems.fail/#5


What a great capsule of wisdom!

There is still a tendency within some parts of aviation (safety auditing) to look for root causes and use tools like "fish bone diagrams" despite the more holistic approach used after an actual crash or incident.


A bunch of different services on a single status page doesn’t make it a complex system. Most of these have no relation to each other other than the high level services on the cloud providers.

They're all part of the internet, which is one of the most complex systems ever built.

> A bunch of different services on a single status page doesn’t make it a complex system.

you're right, it does not.

> Most of these have no relation to each other other than the high level services on the cloud providers.

so, some of them are related to each other? some of them even share underlying infrastructure? perhaps multiple of these are considered infrastructure for some teams?

what is the point you're trying to make?


Probably unfair to class Cloudflare as "degraded". They have over 300 PoPs, so there are always going to be some in maintenance mode and re-routed.

Auth0 and Slack appear degraded here, but not on their status pages

This app looks to be incorrectly parsing Slack's and Auth0's official status pages and showing incidents as ongoing when they are not.

And those are just the 2 that I checked.

To be fair, accurately scraping and normalizing data from status pages is really hard to do consistently (my company has a team of 5 engineers to do it and it's a lot of work).


Yea I was wondering where that data/info was coming from?

And what does it mean exactly?


Cloudflare as well

Services like Cloudflare and Twilio have so many POPs globally that one or more always have an outage going on. Then there's the question of whether it's a major outage or a minor outage. Even though major status page providers like Atlassian and Incident.io have public status APIs (Cloudflare uses Atlassian), it takes more than just parsing them to determine what is "down" and at what granularity.

I run an outage detection service - and some of these issues, like parsing hundreds of - sometimes undocumented - status APIs, make for an interesting engineering problem.
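
To make the granularity problem concrete, here is a rough Python sketch of one possible rollup. It assumes component entries shaped like those in an Atlassian Statuspage components response (a `name` plus a `status` such as `operational` or `partial_outage`). The `rollup` function and its two thresholds are hypothetical, not something IsUpMap or any provider is known to use.

```python
from collections import Counter

# Component status values used by Atlassian Statuspage.
IMPAIRED = {"degraded_performance", "partial_outage", "major_outage"}
HARD_DOWN = {"major_outage"}


def rollup(components, degraded_threshold=0.05, down_threshold=0.5):
    """Collapse per-component statuses into one label.

    Thresholds are illustrative assumptions: a service with hundreds of
    PoPs is only called "degraded" once more than 5% of its components
    are impaired, and "down" once more than half are in major outage.
    Components under maintenance are treated as healthy.
    """
    if not components:
        return "unknown"
    counts = Counter(c["status"] for c in components)
    total = len(components)
    impaired = sum(counts[s] for s in IMPAIRED)
    hard_down = sum(counts[s] for s in HARD_DOWN)
    if hard_down / total > down_threshold:
        return "down"
    if impaired / total > degraded_threshold:
        return "degraded"
    return "up"


# Hypothetical data: 300 PoPs, 4 rerouted for maintenance, 2 partially out.
pops = (
    [{"name": f"pop-{i}", "status": "operational"} for i in range(294)]
    + [{"name": f"maint-{i}", "status": "under_maintenance"} for i in range(4)]
    + [{"name": f"out-{i}", "status": "partial_outage"} for i in range(2)]
)
print(rollup(pops))
```

With these made-up numbers, 2 impaired PoPs out of 300 stay under the 5% threshold, so the service is reported as up instead of degraded. Where to set the thresholds is exactly the judgment call that plain parsing can't make for you.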


Ouch, Azure isn't even present

They said major sites

Would be interesting if sites could be grouped based on what services they rely on, or just grouped based on which have correlated downtime.
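
A minimal Python sketch of the "correlated downtime" grouping, assuming you already have, per service, the set of time buckets (say, minute indexes) in which it was down. The function names, the Jaccard measure, and the 0.5 threshold are all illustrative choices, and overlap alone doesn't prove a shared dependency.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sets of down-time buckets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_correlated_downtime(downtime, threshold=0.5):
    """Group services whose down-time buckets overlap heavily.

    `downtime` maps a service name to the set of time buckets in which
    it was down. Services are linked when their Jaccard overlap reaches
    `threshold`, and linked services are merged transitively with a
    small union-find.
    """
    parent = {name: name for name in downtime}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(downtime, 2):
        if jaccard(downtime[a], downtime[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in downtime:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Made-up observations, not real outage data.
observed = {
    "service-a": {10, 11, 12, 13},
    "service-b": {11, 12, 13, 14},
    "service-c": {40, 41},
    "service-d": set(),
}
print(group_by_correlated_downtime(observed))
```

Here `service-a` and `service-b` share 3 of 5 down minutes and end up in one group, while the other two stay on their own.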

Correlated downtime, and this is a place where I wouldn't actually mind a guess from AI on whether there is a common underlying cause between some of the outages. I say AI because I don't really think anyone is going to keep all of the possible common dependencies of different privately hosted systems up to date, but AI could at least take an initial guess and try to find whether anyone else is posting root-cause theories elsewhere at the time, then link to those (a guess is good enough).

Maybe try using <wbr> for example Cloud<wbr>flare or mongo<wbr>db for more natural break on small screens.

Facebook, Twitter (X), Instagram are no longer a thing?

They don't have straightforward status pages or APIs to detect outages - I think that's the reason they are not listed.

Suggestion: The area of each rectangle should be proportional to the UPTIME capitalization

Maybe this is the idea, but how come github uptime is 100%!?

No love for mindgeek assets?

Are those ever down?

Playstation is in the list but not Xbox? Weird

No Apple services listed? Where's iCloud?

Interesting.. Ms Teams blocks the entire url..

Yeah, highly inaccurate data. Shows Auth0 with an uptime of 0.6% over 24h. Smells like a slop project.

Well if you count every minor service outage which maybe 0.1% of the users are non-critically affected by, you quickly get to 0.6%. So, this doesn't really tell you anything.
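
The arithmetic behind that can be sketched in Python. The incident tuples below are made up, and both functions are hypothetical ways of computing "uptime", not how IsUpMap is known to do it: one treats any open incident as full downtime, the other discounts each minute by the fraction of users affected.

```python
def naive_uptime(incidents, window_minutes):
    """Uptime if any open incident, however small, counts as downtime."""
    down = set()
    for start, end, _affected in incidents:
        down.update(range(start, end))
    return 1 - len(down) / window_minutes


def weighted_uptime(incidents, window_minutes):
    """Uptime where each minute is discounted by the worst affected-user
    fraction among incidents open at that minute."""
    worst = {}
    for start, end, affected in incidents:
        for minute in range(start, end):
            worst[minute] = max(worst.get(minute, 0.0), affected)
    return 1 - sum(worst.values()) / window_minutes


# Hypothetical 24-hour window: (start_minute, end_minute, affected_fraction).
WINDOW = 24 * 60
incidents = [
    (0, 1000, 0.001),    # long-running minor issue hitting 0.1% of users
    (900, 1432, 0.001),  # overlapping minor issue
]
print(f"naive:    {naive_uptime(incidents, WINDOW):.1%}")
print(f"weighted: {weighted_uptime(incidents, WINDOW):.1%}")
```

With these numbers, some incident is open for 1432 of 1440 minutes, so the naive figure lands around 0.6%, while the weighted figure stays around 99.9%. Same incidents, very different headline number.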

But 55 of them are unknown (edit: fixed now)

And github has 100% uptime while cloudflare has 20%. Yeah, right.

What a godsend this is! Thanks a lot! I hope the data is accurate! Keep improving it.

I'm assuming there's an optimisation in the source of this:

```
if (github) return false
```


over half are unknown


