
I don’t think that’s true. There was an initial DynamoDB outage that was resolved in the wee hours, and it ultimately cascaded into the EC2 problem that lasted most of the day.


Was the DynamoDB outage separate? My take was that the NLB issue was the root cause and DynamoDB was a symptom, and that they flipped some internal switches to mitigate the impact on that dependency.


If their internal NLB monitoring can delete the A record for DynamoDB, that seems like a weird dependency (I can imagine the NLB going missing entirely could trigger a cleanup via some weird orchestration, but this didn't sound like that).


I was thinking more along the lines of the NLB being in front of DNS servers and dropping resolvers

Or an NLB could also be load balancing by managing DNS records--it's not really clear what an NLB means in this context

Or there was an overload condition because of the NLB malfunctioning that caused UDP traffic to get dropped

Obviously a lot of reading between the lines is required without a detailed RCA--hopefully they release more info



