PagerDuty Blog

Scaling Incident Management By Patrick O'Fallon Incident management is paramount to the success of any modern ITOps team. However, much like growing a business, scaling incident management can also trigger growing pains. As the landscape of devices, applications, and systems gro... (more)
Last week, I shared my best practices for fast-tracking a career. This week I’m sharing my top pieces of advice for companies on the high-growth fast-track. As the term suggests, companies at this stage are characterized by a rapid increase in regional and international sales, g... (more)
Monitoring in the Microservices Age By Chris Riley Thanks to Docker and the DevOps revolution, microservices have emerged as the new way to build and deploy applications - and there are plenty of great reasons to embrace the microservices trend. If you are going to adopt microser... (more)
Fast Data, Fast Monitoring By Christopher Tozzi Big data is old news. Today, the key to leveraging data effectively is to do fast data. In a similar fashion, traditional incident management - which entails collecting and analyzing large volumes of monitoring information - is no longe... (more)
The Top Causes of Downtime By Zachary Flower According to a roundup by Gartner, the average cost of downtime for an enterprise is $5,600 per minute. While the data collected was from incredibly large companies, the cost of downtime for even small startups is no laughing matter. ... (more)
Measuring Technical Debt with Incident Management Data By Christopher Tozzi If technical debt were like monetary debt, it would be hard to keep track of it unless you checked in manually. The only way many people find out their checking account is running out of funds is by loggin... (more)
Incident response bottlenecks – you know they’re real and you know that your incident response system probably has a few, but they must be minimized as they hurt your on-call teams and your customers. Let’s take a look at some of the most critical bottlenecks and how to avoid the... (more)
Joe Sexton recently joined PagerDuty’s Executive Advisory Board. Because he is an experienced leader in scaling high-growth SaaS companies, we asked him to share his thoughts on how others can scale their careers in today’s workplace. These skills can apply to any field – technical or not.... (more)
It’s critical to have the right tools in place before a firefight happens. A lack of proper tooling makes it significantly more difficult to recognize, organize, fight, and resolve a major outage. This is especially true when teams are busy fighting rather than communicating to i... (more)
The Mainstreaming of DevOps By Eric Sigler Here at PagerDuty, we spend a lot of time thinking about how we can help the DevOps community and IT professionals succeed. We're particularly interested in the "hows and whys" of evolving DevOps practices, how to deliver value to our p... (more)
The on-call engineer has a critical role to play in incident management. They can mean the difference between an incident turning critical or being managed and resolved quickly. Startups may not have many choices around who should be on call, but as the organization grows and in... (more)
Avoiding Noise in Incident Management Suppression. According to the thesaurus, this word is synonymous with terms like deletion, elimination, and annihilation. Yet within the context of incident management, suppression means something quite different. It’s not about getting rid of... (more)
Smart Devices, Smarter Monitoring By Michael Churchman Smart devices require smart monitoring. That's not a platitude. It's an imperative. In fact, the smarter the device, the smarter you need to be about monitoring it. Why You Should Monitor As headlines have shown, unmonitored,... (more)
Break Down the Silos: Correlate Data Between Vendors By Chris Riley Thanks to the DevOps movement, we now understand why software delivery chains that consist of a series of silos are bad. They complicate communication between different teams, leading to delivery delays, backtrack... (more)
According to a roundup by Gartner, the average cost of downtime for an enterprise is $5,600 per minute. While the data collected was from incredibly large companies, the cost of downtime for even small startups is no laughing matter. Let’s assume, for the sake of simplicity, tha... (more)
International Women’s Day is a global day celebrating the social, economic, cultural and political achievements of women. It’s about unity, celebration, reflection, advocacy and action. In my career, I have had the opportunity to work with some inspiring women who have helped sha... (more)
A memo from our CEO, Jennifer Tejada. More than a Hallmark card holiday, we celebrate today as International Women’s Day. It’s a celebration across the US and the tech industry with a grassroots movement, “A Day Without Women,” which calls for women to go on strike to draw atte... (more)
Have you ever returned to the office to find out that a server was down the whole night, and there was no way you could have been informed? If so, you probably need mobile incident management. In a world where almost everyone’s pockets are filled with smart devices, it would be a... (more)
One year ago today, I embarked on my most exciting adventure yet — PagerDuty. Here in the valley, it’s truly amazing what can be achieved in a year for a software startup like PagerDuty. When I started last February, I was an Enterprise Business Representative and the company fe... (more)
Cloudflare and Google’s Project Zero published details of a security data leak. A vulnerability in Cloudflare’s code has led to a potentially unknown quantity of data leaking – including people’s private information such as passwords, personal information, messages, and cookies over ... (more)
© 2008 SYS-CON Media