Regularly releasing updates to the App Store and Play Store is more complex than you might expect, especially for teams at scale and even more so when there are multiple apps to ship. There are so many ways to navigate release complexities that no two teams do everything the same way.

It’s intriguing to see how other teams work. Discerning similarities and differences between teams can help reveal valuable new approaches. In that spirit, and drawing on six years of experience with the release processes for both the DoorDash and Caviar consumer apps, here is a high-level overview of DoorDash’s mobile release management.

Note: At DoorDash, we ship multiple Android and iOS apps and each team handles releases differently. This article focuses on DoorDash’s consumer iOS team and its release processes.

A general overview of the release cycle

At a high level, our release cycles are similar to those found at many other companies. Here’s a quick look at how we manage things: 

  1. We ship a new app version every week, which means we have a release cut weekly.
  2. Testing begins when a release is cut. At that point, we have up to a week to fix any critical regressions, which we do by cherry-picking fixes into the release branch (a sketch of the cut-and-cherry-pick flow follows this list).
  3. We submit for App Store review after testing and any fixes are complete, ideally mid-week to create a buffer for any potential rejections or unexpectedly long review times.
  4. Once approved, the build waits for release until the end of the week.
  5. On release day, we begin a phased rollout to 1% of users to ensure no other major issues surface that might have been missed during testing.
  6. After closely monitoring release health, we accelerate the rollout to all users on Day 3.
  7. At this point, the subsequent release is already in the works, so it’s a matter of rinse and repeat.
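
To make steps 1 and 2 concrete, here is a minimal sketch of what a scripted weekly cut and cherry-pick might look like in a trunk-based Git repo. The branch naming and helper functions are hypothetical illustrations, not our actual tooling.

```python
# A rough sketch of the weekly release cut and in-cycle cherry-pick flow,
# assuming a trunk-based repo whose default branch is `main`.
import subprocess

def run(*args: str) -> None:
    subprocess.run(args, check=True)

def cut_release(version: str) -> None:
    """Cut the weekly release branch from the tip of main."""
    branch = f"release/{version}"
    run("git", "fetch", "origin", "main")
    run("git", "checkout", "-b", branch, "origin/main")
    run("git", "push", "-u", "origin", branch)

def cherry_pick_fix(version: str, sha: str) -> None:
    """Fixes land on main first; approved ones are cherry-picked over."""
    run("git", "checkout", f"release/{version}")
    run("git", "cherry-pick", "-x", sha)  # -x records the original commit SHA
    run("git", "push", "origin", f"release/{version}")

if __name__ == "__main__":
    cut_release("5.0.0")
```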

The following sections drill down a bit deeper into how we work, particularly with regard to collaboration and the human side of our processes.

Release management responsibilities

A small team of volunteers rotates through the release manager role. Each on-duty release manager is in charge of making sure that the current release rolls out smoothly and that the subsequent release makes it through App Store approval before its scheduled release date.

We deliberately keep the number of release managers relatively low: enough to spread the load, but not so many that anyone gets out of practice. Keeping the numbers in check helps keep decisions consistent, especially for the inevitable judgment calls. Managers are not only kept up to date on recent process changes and improvements; they’re part of deciding how those processes should change in the first place.

Managing communication

To maintain clear communication and keep organized throughout releases and rollouts, we create a Slack channel specific to each release — for instance #ios-release-5-0-0. This centralizes release-specific status updates and conversations in one place, reducing noise in other shared channels and making it easy to look up details from a past release if needed.
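
As a concrete illustration of spinning up one of these channels, here is a minimal sketch using Slack’s official Python client, slack_sdk. The token environment variable and welcome message are assumptions for the example, not our exact tooling.

```python
# A minimal sketch of per-release Slack channel setup using slack_sdk
# (Slack's official Python client); SLACK_BOT_TOKEN is an assumed env var.
import os
from slack_sdk import WebClient

client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])

def create_release_channel(version: str) -> str:
    # Slack channel names allow hyphens but not dots, hence "5-0-0".
    name = f"ios-release-{version.replace('.', '-')}"
    channel_id = client.conversations_create(name=name)["channel"]["id"]
    client.chat_postMessage(
        channel=channel_id,
        text=f"Release {version} has been cut. All updates for this release go here.",
    )
    return channel_id
```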

The dedicated Slack channel is especially helpful when hotfixes enter the mix. Because we have a weekly release schedule, shipping a hotfix for an issue in production means that there are two separate releases in progress simultaneously. Isolating each release’s communication in its own channel prevents confusion. For instance, in the 5.0.0 example, anything related to the 5.0.1 hotfix would be discussed in #ios-release-5-0-0 while matters related to the upcoming 5.1.0 release would be addressed in #ios-release-5-1-0.

Testing release candidates

Our apps have grown so large that it’s impossible for any one person, or even a few people, to fully own release candidate testing. Our dedicated higher-level QA team can’t easily take on intensive weekly regression testing as well. And given the volume and pace of continuous changes and new feature development throughout the app, it’s difficult to ensure that the right things are being tested — and tested correctly. The people actually building features and making changes are in the best position to know what’s new, what’s different, and how to test it all properly.

That’s why our release candidate testing relies on a group of engineers we call “component owners.” Each of them is responsible for testing their own component — a feature or area of the product — in the release candidate and fixing or delegating fixes for any regressions detected during testing. Components usually map one-to-one with our product teams — for example, the login team is responsible for running testing related to the app’s login component. Each component owner has specific tests that they must run before approving the component. And every single component must be approved before the release can be submitted for review.
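
To sketch the sign-off model this implies (each component has an owner and required tests, and submission is gated on every approval), something like the types below could represent it. The names and fields are hypothetical illustrations, not Runway’s schema or our internal one.

```python
# A hypothetical model of component sign-off; every component must be
# approved before the release candidate can be submitted for review.
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REGRESSION_FOUND = "regression_found"

@dataclass
class Component:
    name: str                  # e.g. "login"
    owner: str                 # engineer responsible for testing this area
    required_tests: list[str]  # tests to run before approving
    status: Status = Status.PENDING

def ready_to_submit(components: list[Component]) -> bool:
    return all(c.status is Status.APPROVED for c in components)
```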

Of course, making sure all component owners have signed off before submission and figuring out who is still testing can get complicated. We use a mobile release management platform called Runway to make collaboration easier throughout the week. It captures the current status of component testing and can also automatically remind component owners through Slack to complete their tests.
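
Runway handles this tracking and nudging for us out of the box. Purely to illustrate the kind of automation involved, a home-grown reminder built on the Slack client and component model sketched above might look like this:

```python
# An illustrative reminder loop, reusing the WebClient, Component, and Status
# sketches above; owner values are assumed to be Slack user IDs.
from slack_sdk import WebClient

def remind_pending_owners(
    client: WebClient, channel: str, components: list[Component]
) -> None:
    for c in components:
        if c.status is Status.PENDING:
            client.chat_postMessage(
                channel=channel,
                text=f"<@{c.owner}> Reminder: `{c.name}` still needs release candidate sign-off.",
            )
```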

Handling regressions and hotfixes

Although we strive for seamless releases — test, approve, and release — that’s obviously not possible 100% of the time. Sometimes we catch regressions during testing that need to be fixed before we can release the app. Diagnosing a bug takes time, and fixing it means modifying code late in the cycle, which introduces risk, so it’s important to carry out release candidate testing as early in the week as possible. That way we have enough time to find the right solution and thoroughly test the fix to make sure our changes don’t break something else. Given the scale and nature of what we build at DoorDash, even small issues can have a big impact, so we approach regressions and hotfixes with rigor and care.

If a component owner identifies a regression during release candidate testing, they work with their team to triage the problem and then develop a fix on our main branch via a regular pull request. After the fix is merged, it may or may not be eligible for cherry-picking onto the release branch. Because late-arriving changes are risky, we can’t simply allow everything into the release; strict requirements must be met before any change can be added. For example, we allow fixes for regressions or new bugs that have a measurable effect on the user experience, but we don’t take fixes for minor bugs that don’t affect the user — and we certainly don’t use this process to squeeze in new features that missed the release cut. To escalate fixes for possible inclusion in a release, teams submit fix requests, including explanations and evidence, which the release manager then reviews.
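
To make that gate concrete, here is one way the baseline criteria could be encoded. The fields and checks are illustrative assumptions; in practice the release manager reviews each request and makes the final call.

```python
# A hypothetical encoding of the cherry-pick eligibility criteria described
# above; it captures the baseline rules, not the release manager's judgment.
from dataclasses import dataclass

@dataclass
class FixRequest:
    pr_url: str
    is_regression_or_new_bug: bool   # not a new feature that missed the cut
    user_impact_evidence: str        # measurable effect on the user experience
    verified_on_release_branch: bool

def eligible_for_cherry_pick(req: FixRequest) -> bool:
    return (
        req.is_regression_or_new_bug
        and bool(req.user_impact_evidence)
        and req.verified_on_release_branch
    )
```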

Post-release, the process for creating a hotfix is similar to the in-cycle cherry-pick request flow, except the criteria for allowing a hotfix are much stricter: spinning one up requires significantly more work and can impact the normal release hot on its heels. If we find a bug late in a release cycle — say, between when the app was submitted for review and its release — the decision on whether to fix it depends on the same strict criteria used for post-release hotfixes. Although the update is not yet public, it may be waiting for review or even approved already, so implementing a fix means rejecting the build and resubmitting the app. Because this could delay the release, we evaluate whether a fix is merited, and how to approach it, on a case-by-case basis. We can developer-reject the build, apply the fix, and resubmit; we can let the release ship on schedule and immediately spin up a hotfix afterward; or we can decide the bug isn’t impactful enough to warrant a hotfix and defer the fix to the next release cycle.

Monitoring rollouts post-release

Post-release, we rely on component owners and their teams to keep an eye out for any issues that may arise in their areas of responsibility. Each team has a unique set of key metrics they own and monitor, which makes them best-equipped to understand what may be going wrong with their components in the new release.

But release managers aren’t completely off the hook. They must watch higher-level measures of health, like the new release’s crash rate and any trending issues. We use Sentry to keep track of the apps’ crashes, and because we can integrate it with Runway, we have a single source of truth for observing app health closely. If a release manager sees something unusual, they can ask component owners to take a deeper look and make fixes as needed. But if no problems arise, we can automatically proceed to a full rollout.
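
As one example of that final step, accelerating a phased release to all users can be done programmatically through Apple’s App Store Connect API by setting the phased release state to COMPLETE. The sketch below assumes a pre-built API token (JWT generation elided) and a known phased-release resource ID, with error handling omitted.

```python
# A minimal sketch of accelerating an App Store phased release to 100%
# via the App Store Connect API; `token` is an assumed, pre-signed JWT.
import requests

API = "https://api.appstoreconnect.apple.com/v1"

def complete_phased_release(token: str, phased_release_id: str) -> None:
    # COMPLETE releases the build to all users instead of continuing
    # Apple's default seven-day phased ramp.
    requests.patch(
        f"{API}/appStoreVersionPhasedReleases/{phased_release_id}",
        headers={"Authorization": f"Bearer {token}"},
        json={
            "data": {
                "type": "appStoreVersionPhasedReleases",
                "id": phased_release_id,
                "attributes": {"phasedReleaseState": "COMPLETE"},
            }
        },
    ).raise_for_status()
```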

Conclusion

As described here, releasing mobile apps at scale takes quite a bit of work and coordination. Keeping releases on schedule requires stakeholders across many teams to test and safeguard quality, while centralizing control with release managers keeps the process consistent and efficient. This setup allows us to maintain a weekly release cadence across multiple apps while keeping quality high and team members happy.