DevOps Operations Performance Platform

PagerDuty Blog


Latest Blogs from PagerDuty Blog
Have you ever come into a company and looked around and wanted to change everything so it’s new, modern, and follows best practices? Or have you gone to a conference and taken a ton of notes on how you could fix processes in your organization... The post Be an Agent For DevOps Change a...
On June 28th, 2017, we marked four years of performing “Failure Fridays” at PagerDuty. As a quick recap, Failure Fridays are a practice we conduct weekly at PagerDuty to inject faults into our production environment in a controlled way, and without customer impact. They’ve been... The...
A release is a set of customer-visible and operational features that together provide a completely new or improved product capability. It’s something that’s meaningful from the user’s perspective, often composed of user stories and service-delivery-related activities. Digg...
The threat landscape is expanding at a crazy pace. There are new vulnerabilities released every day, and the number of servers, applications, and endpoints for ITOps to manage is continually growing. These threats are also growing more potent and frequent, as a recent spate of... The p...
Today’s infrastructure is not your grandparents’ IT infrastructure, nor is it the infrastructure from a generation ago. The days of punch cards, vacuum tubes, ferrite core memory, floppies, and dial-up Internet are over. Today’s infrastructure is also not the IT infrastructure th...
It sometimes feels like engineering and business teams speak different languages and work in completely incompatible ways. Agile development teams work with sprints, user stories, scrums, and relative estimation, whereas project managers gravitate more towards Gantt charts, calls for p...
Becoming a new parent has been one of the most difficult challenges I’ve ever faced. That wasn’t a huge surprise; I’d been fully expecting it. But what I did find surprising was that my experiences with PagerDuty helped prepare me in ways I never anticipated.... The post BabyDuty or: H...
I’m proud and appreciative of the commitment PagerDuty makes to diversity and inclusion. I appreciate how we hire women and people of color into visible leadership roles, have safe spaces and channels for employees to discuss experiences and concerns, and have frank, open discussions a...
I recently joined the summer internship program at PagerDuty, and I have already had one of the most inspiring and thought-provoking experiences of my life. Last week, I attended the fifth annual Girls In Tech Catalyst Conference hosted by the Girls In Tech global non-profit,... The po...
In our always-on, IoT-enabled, cloud-connected, big data age, we face a major paradox: it’s now easier than ever to collect large amounts of data — yet the more data we collect, the harder it becomes to monitor situations effectively. This problem is similar to what... The post H...
“Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production.” — Principles of Chaos Engineering Netflix, Dropbox, and Twilio are all examples of companies that perf...
Alerts. It’s so easy for them to pile up. One moment, you’re looking at a handful of alerts. A few hours — or maybe even minutes — later, you’re looking at a mountain. How do you manage them and keep your responders from being completely... The post Determining Incident Priority appear...
I belong to a marginalized group. I won’t bore you with the details. It’s not that I think you’re indifferent, I just know that my story isn’t unique. There are millions with stories like mine. The size of our diverse family is unknown because we... The post Pride in Our Progress...
Today is a big day for us here at PagerDuty: We’re publicly announcing the Public Beta launch of our Community. We’ve actually been working on this project for quite some time, and now it’s finally ready for everyone to join. For us, the word “community”... The post Announcing PagerDut...
Question: What does a barbershop quartet have to do with digital transformation? In what has become a yearly ritual for the PagerDuty Product team, we attended the consistently wonderful Mind The Product conference in San Francisco this week. There were a few key and relevant... The po...
A few weeks ago, the PagerDuty security team sat down with Guy Podjarny from The Secure Developer for a discussion on our security philosophy and a look at some of the tools we use. One of our guiding philosophies on the PagerDuty security team is... The post The Secure Developer: Keep...
In two days, we’ll be sharing what we’ve been up to for the last few months in our What’s New With PagerDuty webinar. This quarter, we’ve released a ton of exciting new capabilities that empower developer and ITOps teams to deliver amazing customer experiences and... The post What’s Ne...
Organizations need many incident commanders to provide a high level of service to their customers while avoiding on-call load. Many shy away from becoming an incident commander because they assume only senior technical leads can be one. However, soft skills are actually more important,...
Prevention is the best medicine The best way to build a distributed system is to avoid doing it. The reason is simple — you can bypass the fallacies of distributed computing (most of which, contrary to some optimists, still hold) and work with the fast... The post Building Scalable Dis...
About a year ago, some technical difficulties at Citi temporarily shut off a few hundred thousand cards and a swath of ATMs at the same time. The result: Citi’s newly launched Costco Anywhere cards received a “flood of complaints.” The Internet phrase for something on... Th...
Teams that serve the business, such as Business Operations and Business Intelligence, are faced with a barrage of urgent requests and a never-ending list of business-critical projects on a daily basis. How can teams control this chaos and ensure they work on the right things? Traditional...
Monitoring applications and systems is one thing — knowing what to do with all the data being gathered is quite another. Most IT organizations today have deployed multiple types of monitoring systems. Much of the time, the alerts these systems generate represent minor deviations from.....
Zayna Shahzad is a Software Engineer at PagerDuty on the Mobile Team. She works on the Android and iOS PagerDuty apps offered through the App Store and Play Store. In this post, she shares her experience shadowing our Customer Support team. Find her on GitHub and... The post Shadowin...
Here at PagerDuty, our engineering teams are committed to Agile development principles that favor rapid iteration over lengthy periods of design, and favor direct communication between team members over reams of written specifications. There are countless articles that dictate how Agil...
In a recent blog post, Managing a Tier Zero Service Doesn’t Have to Be Scary, PagerDuty’s SVP of Product Development Tim Armandpour discussed several important best practices that minimize chaos during incident resolution. According to Tim, in today’s always-on world, guaranteeing reli...
Here at PagerDuty, we’re pretty focused on being involved in the DevOps community by providing perspectives on where we’ve been, where we are and where we’re headed as a community — and of course hearing from the community as well! And, if you follow this blog... The post Trends in Dev...
Do you have what it takes to win the PagerDuty Innovation and Transformation awards? PagerDuty is excited to recognize the achievements of the most successful organizations with these awards at PagerDuty Summit 2017. Presented each year at PagerDuty Summit, these awards honor both the ...
I recently had the privilege of spending a full day with a small group of our customers. The attendees were leaders in their development and IT operations organizations and spanned a wide variety of industries, including technology, media, finance, retail, healthcare, and more. Every s...
The PagerDuty and HipChat extension empowers responders to collaborate to resolve issues directly from their chat window Several weeks ago, we released our updated HipChat extension. Our team is excited to have built the v2 HipChat extension from scratch to support great new functional...
Here at PagerDuty, we’re committed to helping our customers get as much out of the platform as possible. We’ve long shared best practices and knowledge via resources such as our Support Knowledge Base. But over in Customer Support and Success, we’ve been hearing your frequently... The...
“Incident lifecycle management? If we manage to stay alive from one incident to the next, it’s a good day. On a bad day, it’s all panic mode.” Unfortunately, that’s the reality of incident lifecycle management for far too many software and IT companies — b...
While a major incident is ongoing, all of your focus is on restoring service: watch the smoke, figure out where the fire is, and put it out. But after service has been restored — the incident is resolved, the adrenaline has drained, and it’s peace-time... The post Better Incident Postm...
Today, we’re excited to announce a suite of new functionality to power even faster resolution and accelerate learning from major business-impacting incidents with the definitive Incident Resolution Lifecycle. With this release, we help you to differentiate major incidents from other da...
Your high school history teacher no doubt delivered to you some variation on George Santayana’s famous remark that “those who cannot remember the past are condemned to repeat it.” I’m pretty sure Santayana wasn’t thinking about incident management when he...
We are very pleased to announce that PagerDuty and Atlassian are continuing to collaborate and improve best practice around the incident resolution lifecycle and make the world of unexpected chaos a little less frantic. In April we announced the best-in-class PagerDuty HipChat Extensi...
Get face to face One of the great things about being a platform is that your users have the ability to take your product in a different direction than you might. We’ve had the ability to integrate your preferred conference call tool into the incident... The post Video Conferencin...
In today’s integrated digital economy, the IT infrastructures at most corporations can no longer exist in silos. The overwhelming benefit of integration is the rapid development of new ideas and solutions. The unfortunate downside is that increased integration and connectivity also pla...
Now that you’re well-equipped with how to fast track your career and survive the high growth startup stage, in this post I’ll share my advice on how to make time for professional development and lay the groundwork for reaching your career goals — something many... The post This Is Not ...
In the run up to our latest release of capabilities for developers, I sat down with David Yang, a senior engineer here at PagerDuty who’s seen our internal architecture evolve from a single monolithic codebase to dozens of microservices. He’s the technical lead for our Incident Managem...
Code reviews are an important part of the modern software lifecycle. Unfortunately, a lot of cycles are burned and morale is damaged because there are few guidelines given to reviewers (and reviewees) on constructive feedback and effective written communication. Below are some tips for...