DevOps Operations Performance Platform

PagerDuty Blog

Top Stories by PagerDuty Blog

In our always-on, IoT-enabled, cloud-connected, big data age, we face a major paradox: it's now easier than ever to collect large amounts of data, yet the more data we collect, the harder it becomes to monitor situations effectively. This problem is similar to what psychologists call "information overload": the phenomenon in which people fail to make decisions effectively because they have too much information to contend with. In some contexts information overload is unavoidable. If you get hundreds of emails each day, there may not be much you can do about feeling overwhelmed by them, as you don't necessarily have much control over who sends you email. Yet when it comes to data center infrastructure, information overload is not inevitable. It's entirely up to you to decide how much data, and what types of data, to collect. If you find that you have too much ... (more)

A Developer's Perspective

By Eric Sigler. "Walking over to the Ops room - I don't feel like I ever need to do that anymore." In the run-up to our latest release of capabilities for developers, I sat down with David Yang, a senior engineer here at PagerDuty who has seen our internal architecture evolve from a single monolithic codebase to dozens of microservices. He's the technical lead for our Incident Management - People team, which owns the services that deliver alert notifications to all 8,000+ PagerDuty customers. We sat down and talked about life after switching to teams owni... (more)

Causes of Downtime

The Top Causes of Downtime. By Zachary Flower. According to a roundup by Gartner, the average cost of downtime for an enterprise is $5,600 per minute. Although that data was collected from very large companies, the cost of downtime for even a small startup is no laughing matter. Let's assume, for the sake of simplicity, that your core product is a web app that relies solely on organic sales, totaling $1 million in revenue a year. That amounts to about $2 in lost revenue per minute of downtime. This doesn't sound like too much in the grand scheme of things, but revenue is only a small part of ... (more)
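The per-minute figure in the excerpt above can be checked with a few lines of arithmetic. The following is only a rough sketch: it assumes revenue is spread evenly across a 365-day year, and the function names are illustrative rather than taken from the original post.

```python
# Rough sketch: lost revenue per minute of downtime.
# Assumption (not from the post): revenue accrues evenly over a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def revenue_per_minute(annual_revenue: float) -> float:
    """Average revenue earned (and lost during downtime) per minute."""
    return annual_revenue / MINUTES_PER_YEAR


def downtime_revenue_loss(annual_revenue: float, minutes_down: float) -> float:
    """Estimated revenue lost during an outage of the given length."""
    return revenue_per_minute(annual_revenue) * minutes_down


if __name__ == "__main__":
    # $1 million a year works out to roughly $1.90 per minute,
    # which the post rounds to "about $2".
    print(round(revenue_per_minute(1_000_000), 2))
    # Estimated loss for a hypothetical one-hour outage.
    print(round(downtime_revenue_loss(1_000_000, 60), 2))
```

Under the even-revenue assumption, an hour of downtime costs this hypothetical business a little over $100 in direct sales, which is why the excerpt stresses that revenue is only one part of the cost.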

From Big Data to Fast Data

Fast Data, Fast Monitoring. By Christopher Tozzi. Big data is old news. Today, the key to leveraging data effectively is fast data. In a similar fashion, traditional incident management, which entails collecting and analyzing large volumes of monitoring information, is no longer enough. Organizations must now also do "fast monitoring," which means not only collecting monitoring data but also making it actionable in real time. This post examines what fast monitoring means and explains how incident management teams can implement this approach to realize great benefits. Defining Fast... (more)

Smart Devices, Smarter Monitoring

By Michael Churchman. Smart devices require smart monitoring. That's not a platitude. It's an imperative. In fact, the smarter the device, the smarter you need to be about monitoring it. Why You Should Monitor: As headlines have shown, unmonitored, unprotected smart devices may be a disaster (or a DDoS attack) just waiting to happen. Consider the following. Smart devices can be hacked: Last year's wave of DDoS attacks was a wake-up call. Many smart devices have little or no built-in security, and that combined with wireless communication and the soph... (more)