DevOps Operations Performance Platform

PagerDuty Blog



Top Stories by PagerDuty Blog

How We Compute Today: What Modern Infrastructure Looks Like

By Michael Churchman Today's infrastructure is not your grandparents' IT infrastructure, nor is it the infrastructure from a generation ago. The days of punch cards, vacuum tubes, ferrite core memory, floppies, and dial-up Internet are over. Today's infrastructure is also not the IT infrastructure that it was five years ago, or even a year ago for that matter. Modern infrastructure is changing constantly, and all that we can do is provide a snapshot of infrastructure at the moment, along with a general picture of where it's going. If you are going to monitor infrastructure effectively, you need to understand what infrastructure looks like today, how it is changing, and what it will include tomorrow. Hardware: Less of Moore's Let's start by making a basic distinction: Hardware infrastructure is relatively stable ... (more)

Incident Response: Protecting Your Brand and Reputation From the Get Go

Customers are loyal to companies with which they feel they share a set of values. So when an unexpected event strikes a company, the resulting upheaval places the brand at risk. Whether it’s an airline’s blunder or a performance incident involving technology systems, these moments affect the customer experience by taking down your ability to serve customers well. With more digital channels giving customers access to brands, the reality today is that companies operate in a world that is watching their policies, their actions, and how they handle themselves when things go wrong. The eff... (more)

Scaling Incident Management | @DevOpsSummit #DevOps #APM #Monitoring

Scaling Incident Management By Patrick O'Fallon Incident management is paramount to the success of any modern ITOps team. However, much like growing a business, scaling incident management can trigger growing pains. As the landscape of devices, applications, and systems grows, each requiring monitoring, so too do the alert noise and the management complexity for on-call staff. With an increasing number of engineers on your team, it can be difficult to onboard and implement new notification policies and after-hours operations to ensure your team is efficient and loa... (more)

Owning Incident Response: It’s All About The Iterative Improvements

Recently, I was putting together training material for our upcoming track on “Owning Incident Response” at PagerDuty University, and I listened to the recordings of incident calls across many years of PagerDuty history. Several hours of hearing my coworkers at 2x speed prompted two observations: first, I should go find my copy of Christmas with the Chipmunks; and second, the evolution of our incident processes took time, effort, and focus. Any company, regardless of the size of their teams and infrastructure, can have a great incident response process, but it doesn’t happen by ac... (more)

Incident Management for IoT | @ThingsExpo @PagerDuty #AI #IoT #M2M #API

Incident Management for IoT Today By Twain Taylor The Internet of Things (IoT) is becoming popular both in people's daily lives and in enterprises globally. While it began as a novelty, more innovative and mission-critical use cases have been appearing lately. With the sheer variety of IoT devices available, the large amount of data that’s generated, and the various security vulnerabilities, companies that make IoT devices face a host of challenges that an incident resolution platform can help with. If you’re building an IoT system today, or have plans to build one in... (more)