DevOps Operations Performance Platform

PagerDuty Blog


Top Stories by PagerDuty Blog

Scaling Incident Management, by Patrick O'Fallon

Incident management is paramount to the success of any modern ITOps team. However, much like growing a business, scaling incident management can trigger growing pains. As the landscape of devices, applications, and systems grows, each requiring monitoring, so too do the alert noise and the management complexity facing on-call staff. With an increasing number of engineers on your team, it can be difficult to onboard people and to implement new notification policies and after-hours operations so that your team stays efficient and the load is fairly distributed. The push towards hybrid IT models and bimodal IT environments can also complicate incident management. Nevertheless, with a few tried-and-true techniques, you can scale incident management in a planned, deliberate, organized, and effective way. Don't fall victi...

Causes of Downtime

The Top Causes of Downtime, by Zachary Flower

According to a roundup by Gartner, the average cost of downtime for an enterprise is $5,600 per minute. While the data was collected from very large companies, the cost of downtime for even small startups is no laughing matter. Let's assume, for the sake of simplicity, that your core product is a web app that relies solely on organic sales, totaling $1 million in revenue a year. Spread evenly over the roughly 525,600 minutes in a year, that amounts to about $2 in lost revenue per minute of downtime. This doesn't sound like too much in the grand scheme of things, but revenue is only a small part of ...
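The per-minute figure in the excerpt above can be checked with a few lines of Python. This is only a rough sketch: it assumes revenue arrives evenly around the clock over a 365-day year, and the function names and the 60-minute outage are hypothetical illustrations, not anything from the original post.

```python
# Rough sketch: assumes revenue is spread evenly over a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def revenue_per_minute(annual_revenue: float) -> float:
    """Average revenue earned per minute, given yearly revenue."""
    return annual_revenue / MINUTES_PER_YEAR


def downtime_revenue_loss(annual_revenue: float, outage_minutes: float) -> float:
    """Estimated revenue lost during an outage of the given length."""
    return revenue_per_minute(annual_revenue) * outage_minutes


if __name__ == "__main__":
    per_minute = revenue_per_minute(1_000_000)
    print(f"Revenue per minute: ${per_minute:.2f}")  # about $1.90, i.e. "about $2"

    # Hypothetical one-hour outage, for illustration only.
    hour_loss = downtime_revenue_loss(1_000_000, 60)
    print(f"Estimated loss for a 60-minute outage: ${hour_loss:.2f}")
```

Real traffic is rarely uniform, so an outage at peak hours would cost more than this average suggests, and the estimate covers direct revenue only.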

From Big Data to Fast Data

Fast Data, Fast Monitoring, by Christopher Tozzi

Big data is old news. Today, the key to leveraging data effectively is fast data. In a similar fashion, traditional incident management, which entails collecting and analyzing large volumes of monitoring information, is no longer enough. Organizations must now also do "fast monitoring," which means not only collecting monitoring data but also making it actionable in real time. This post examines what fast monitoring means and explains how incident management teams can implement this approach to realize great benefits. Defining Fast...

Monitoring in the Microservices Age

Monitoring in the Microservices Age, by Chris Riley

Thanks to Docker and the DevOps revolution, microservices have emerged as the new way to build and deploy applications, and there are plenty of great reasons to embrace the microservices trend. If you are going to adopt microservices, you also have to understand that microservice architectures have many moving parts. When it comes to incident management, this is an important difference between microservices and monolithic architectures. More moving parts mean more complexity to monitor and manage in order to keep applicati...

My Guide to Surviving the High-Growth Stage Companies

Last week, I shared my best practices for fast-tracking a career. This week I’m sharing my top pieces of advice for companies on the high-growth fast track. As the term suggests, companies at this stage are characterized by a rapid increase in regional and international sales, global employee headcount, and the overall maturity of their product or service. While this is an exciting phase for up-and-coming startups, it also presents challenges that can either kill your company or greatly strengthen it. We’re all too familiar with the Cinderella stories of Silicon Valley – and the...