At PagerDuty, we are building out a highly available service for managing incidents properly, but how does PagerDuty monitor PagerDuty? In this talk, we will cover the strategies that PagerDuty uses to make sure our service is always up and running.
Arup has been working in software operations since 2007. He started out as an Operations Engineer at Amazon, working with multiple teams to reduce customer defects for the Amazon Marketplace. Since then, he has built and managed operations teams at Amazon and Netflix to help improve availability and reliability. He currently works at PagerDuty, where he is part of the Infrastructure Engineering group.