
How to plan for outages and build team resilience during a pandemic

May 14, 2020

This is part three of our "Contingency Planning During a Pandemic" series. Check out part one / part two / part four.


This is the third post in our four-part series about Redox's contingency planning efforts during the COVID-19 pandemic.  As a brief recap, the series is aimed at sharing our experience from a remote-first perspective as we formed a rapid collaboration to plan for the unexpected effects and interruptions of the pandemic.  

In the first post of this series, Morgen Donovan described how we achieved our goal to “Implement a method for maintaining a current understanding of our capacity” with the first deliverable: Launch weekly capacity pulse checks that can be reported out by team and role.

Next, Becky Northrup walked us through the process for “addressing potential capacity imbalances by identifying who does or does not have capacity, and where we should direct help efforts” with the second deliverable: Create a virtual Help Desk that allows Redoxers to request and offer help.

In this post, I’ll discuss how we built resilience into teams by ensuring that backups are in place for all work functions, and that we document, store, and share our knowledge and processes in ways that really allow others to step in and carry work forward. 

Deliverable No. 3: Implement individual contingency plans that identify current work, backups, and how-to documentation.

The Purpose

As Morgen mentioned, we had a lot of questions weighing on us at the onset of the pandemic, and the most pressing were: “Who would take over for me if I was out unexpectedly for two weeks?” and “Would they know what to do?”

We had to make sure every Redoxer could answer these questions explicitly, in a format easily accessible to others.  Even without a pandemic, as a distributed company we couldn’t walk into a coworker’s office and grab a set of files to pick up where they left off if they were gone unexpectedly.  So we needed cloud-based information sharing in place that would let our teams be resilient in the face of mass outages.  We needed to find and address any gaps quickly and early, before interruptions happened.

The Work

People

I was fortunate to have a great team working with me on this effort.  

  • Talent Acquisition (TA Team).  This incredible group pivoted quickly and with enthusiasm into a role that required a lot of following up with teams and individuals to ask, “Hey, did you do your outage plan?”  They split up according to which team they interact with most in their hiring pipelines, making sure every individual in those teams completed their plans and providing support as needed.

  • Knowledge Champions.  Our fearless Knowledge Champions team is made up of representatives from each of our internal departments.  This group stepped in to support the TA Team to make sure everyone on their respective teams was working on filling in their outage plans. 

  • Team Leads.  It’s no surprise that the people who head up the departments here at Redox were willing to redirect their teams at a moment’s notice to focus on a company priority.  This helped immensely with buy-in and completion.

  • Morgen Donovan and Dietke Fowler.  These two contingency planning leads stepped in for me while I was out for a few days during this project.  They made sure things stayed on track, identifying the need for and implementing “acceptance criteria” for plans (see more in the Process section below).

Process

Once we secured our dream team, our next task was to divide up the work.  To keep track of individual tasks, I established a timeline that outlined each day’s work.  I used the Confluence “/task” macro to give each task a due date, and each person on the team checked off their tasks as they completed them, which kept our entire working group updated on progress.

Template Setup

  • We created a template in Confluence (pictured below, with colors highlighting the complexity of the work involved).  Each column holds a vital piece of information needed to complete the work in the absence of the task owner:

    • Description of the work

    • Links to necessary documentation

    • Plans for work continuity

    • Who’s responsible for the work when you’re gone 

    • Update/progress so you can jump back in when you return

  • The TA team first piloted the template themselves to make sure everything worked properly.  They then created “shells” on everyone’s existing personal pages in Confluence to make sure it was as easy as possible for everyone to find and update the templates. 

[Image: screenshot of the outage plan template in Confluence]
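If your team keeps outage plans somewhere other than Confluence, the template columns above can be sketched as a small data structure. This is only an illustration: the class name, field names, and the choice of which columns count as required are assumptions made for the example, not part of the template we used.

```python
from dataclasses import dataclass, field


@dataclass
class OutagePlanRow:
    """One row of an outage plan. All names here are hypothetical."""
    description: str = ""                      # description of the work
    documentation_links: list[str] = field(default_factory=list)
    continuity_plan: str = ""                  # plans for work continuity
    backup_owner: str = ""                     # who is responsible when you're gone
    progress_update: str = ""                  # filled in later, so not required


def missing_columns(row: OutagePlanRow) -> list[str]:
    """Return the names of required columns that are still empty."""
    # Assumption: every column except the progress update must be filled in.
    required = ("description", "documentation_links",
                "continuity_plan", "backup_owner")
    return [name for name in required if not getattr(row, name)]


row = OutagePlanRow(description="Run weekly payroll export",
                    backup_owner="A. Backup")
print(missing_columns(row))
```

A check like this makes it easy to see at a glance which rows still need links or a continuity plan before someone goes out.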

Midway Pivot

As mentioned above, Morgen and Dietke stepped in for me while I was out for a few days, and they discovered that a few teams already had outage plans ready to go.  This was great news, but we had to make sure those plans covered all the points necessary for a company-wide pandemic response.  Rather than redo the plans, we created an Acceptance Criteria checklist to make sure all required bases were covered.

Acceptance Criteria

  • Does the plan identify important processes/work steps/projects?

    • Are they known to the team at large?

  • Does the plan identify at-risk processes/work steps/projects (those with only 1 owner)?

    • Are they documented?

    • Are backups identified?

  • Does the plan identify processes/work steps/projects that CAN be deferred?

  • Does the plan identify processes/work steps/projects that WILL be deferred?
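For teams that track plans in a structured form, the checklist above could be sketched as an automated check. This is a minimal illustration under stated assumptions: the `WorkItem` fields and the `unmet_criteria` function are hypothetical names, and we reviewed our own plans by hand against the checklist rather than with code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkItem:
    """A process, work step, or project in a plan. Names are hypothetical."""
    name: str
    owners: list[str]
    known_to_team: bool = False
    documented: bool = False
    backups: list[str] = field(default_factory=list)
    can_defer: Optional[bool] = None    # None means "not yet decided"
    will_defer: Optional[bool] = None   # None means "not yet decided"


def unmet_criteria(items: list[WorkItem]) -> list[str]:
    """Return one message per acceptance criterion the plan does not meet."""
    if not items:
        return ["plan identifies no processes, work steps, or projects"]
    problems = []
    for item in items:
        if not item.known_to_team:
            problems.append(f"{item.name}: not known to the team at large")
        if len(item.owners) == 1:  # at-risk: only one owner
            if not item.documented:
                problems.append(f"{item.name}: at-risk item is not documented")
            if not item.backups:
                problems.append(f"{item.name}: at-risk item has no backup identified")
        if item.can_defer is None:
            problems.append(f"{item.name}: plan does not say whether it CAN be deferred")
        if item.will_defer is None:
            problems.append(f"{item.name}: plan does not say whether it WILL be deferred")
    return problems


plan = [WorkItem("Payroll export", ["Sam"], known_to_team=True,
                 documented=True, can_defer=False, will_defer=False)]
print(unmet_criteria(plan))
```

An empty result means the plan passes every criterion; anything else is a to-do list for the plan's owner.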

Communications

Communicating a time-sensitive, company-wide project has to be approached from more than one angle.  Luke Bonney, our CEO, played a crucial role as he asked everyone to complete this work during his series of COVID-19 response company addresses.  Morgen used our #important-only Slack channel to provide regular updates.  The TA Team consistently communicated to coaches and individuals via Slack as they followed up on progress.  The Knowledge Champions stepped in to serve a similar role and support the TA Team in getting the word out to their teams.

Tools

We used Confluence for task tracking, working group meeting minutes, and the outage plans themselves.  If you do not use Confluence, there are other tools you could use to set up a successful outage planning initiative.  For task tracking, try setting up calendar invitations with due dates or using tools like Asana or Trello.  For meeting minutes, a simple Google or Word document can work.  Finally, for the outage plans, each team could set up a Google or Excel spreadsheet. 

No matter how you choose to document the crucial elements of your plan, be sure to share them with all stakeholders who need them in the moment and store them in an easily accessible place after project completion to use for lessons learned.

The Result

We began this portion of the overall project on March 30th.  By April 7th, approximately 85% of Redoxers had completed their outage plans, and by April 14th the entire company had finished.

Now that each team has its major projects documented, we are able to see where there are gaps in documentation of implicit knowledge.  We are continuing our efforts to ensure knowledge and work processes are documented and shared.  We monitor the biweekly survey results, and if necessary, reach out to the teams most in need to offer support.

In the next and final post in this series, Dietke Fowler will share with you the lessons we learned throughout this project.  Dietke’s insights into the complexities of business operations are spot-on, and their application to this initiative is something you don’t want to miss!