Canary deployments are a progressive deployment strategy that lets us roll out new service versions incrementally. Instead of deploying a new release to all clients simultaneously, we route a small share of traffic (e.g., 10%) to the new version while the rest continues to use the stable version. If the new version performs well, we gradually increase its share of traffic until it serves everyone.

This approach reduces risk by allowing us to catch issues early and roll back quickly if needed.
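
As a rough illustration of the 10% split described above, here is a minimal sketch of a weighted router. The function name choose_version and the version labels are hypothetical; a real setup would usually do this in a load balancer or service mesh rather than in application code.

```python
import random


def choose_version(canary_weight: float, rng: random.Random) -> str:
    """Return "canary" with probability canary_weight, otherwise "stable"."""
    if not 0.0 <= canary_weight <= 1.0:
        raise ValueError("canary_weight must be between 0 and 1")
    # rng.random() is uniform in [0, 1), so this is true for roughly
    # canary_weight of all calls.
    return "canary" if rng.random() < canary_weight else "stable"


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs behave the same
    picks = [choose_version(0.10, rng) for _ in range(10_000)]
    # The printed share should be close to 0.10.
    print(picks.count("canary") / len(picks))
```

Raising canary_weight step by step toward 1.0 corresponds to the gradual increase until full deployment.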

Why Canary Deployments

Canary deployments are beneficial in the following ways:

  • Feature rollouts: Deploy new features to a subset of clients before full release.
  • Performance testing: Monitor how a new version handles live production traffic.
  • Bug fixes: Validate fixes with real traffic without exposing all clients to potential regressions.
  • Lower risk: Only a few clients are affected if something goes wrong.
  • Easy rollback: If the canary version has problems, we can quickly switch back to the stable version.
  • Better user experience: We avoid major outages by testing updates in production before the full rollout.
  • More confidence in releases: Real user feedback helps us make data-driven decisions.

How Canary Deployments Work

There are two ways of deploying the update: rolling and side-by-side deployments.

Rolling Deployments

A rolling deployment gradually replaces instances of the old version with the new version in place. It updates one or a few instances at a time, so during the rollout some machines run the new version while the others keep running the stable one, until the whole fleet runs the new version.

  • Pros:
    • No need for extra resources.
    • Simple in design and cost-effective.
  • Cons:
    • As it replaces the old version, it will be harder to roll back if issues arise.
    • No side-by-side comparison between old and new versions.
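
To make the batch-by-batch replacement concrete, here is a small simulation, assuming a fleet represented as a list of version labels. The function rolling_update and the batch size are illustrative, not part of any deployment tool.

```python
from typing import Iterator, List


def rolling_update(fleet: List[str], new_version: str, batch_size: int) -> Iterator[List[str]]:
    """Yield a snapshot of the fleet after each batch is replaced in place."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    fleet = list(fleet)  # work on a copy so the caller's list is untouched
    for start in range(0, len(fleet), batch_size):
        end = min(start + batch_size, len(fleet))
        for i in range(start, end):
            fleet[i] = new_version  # the old instance is overwritten
        yield list(fleet)


if __name__ == "__main__":
    for snapshot in rolling_update(["v1"] * 5, "v2", batch_size=2):
        print(snapshot)
```

Because each old instance is overwritten, going back means running the same process again with the old version, which is why rollback is slower here than with a side-by-side setup.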

Side-by-side deployments

A side-by-side deployment means running the old and new versions simultaneously in separate environments. When the new version is ready, traffic is switched from the old to the new version.

  • Pros:
    • Fast to roll back.
    • No impact on users until the switch.
  • Cons:
    • Doubles the resources.
    • More complex routing and deployment logic.

When to Use Canary Deployments

Use Canary Deployments When:

  • The changes are backward-compatible.
  • We want a safe way to roll out updates without downtime.
  • We have monitoring and observability in place to catch issues.
  • We need to validate the impact on a small set of clients before rolling the change out to everyone.

Avoid Canary Deployments When:

  • It’s a critical security fix that must be applied immediately.
  • The update contains significant breaking changes that can’t run alongside the old version.
  • We don’t have proper monitoring to detect failures.
  • The system is not designed for controlled traffic splitting.

Traffic Routing Strategies

All of the traffic routing strategies below apply regardless of the deployment approach you choose. Whether you’re doing a rolling deployment or a side-by-side deployment, routing rules let you control how traffic flows to each version.

Round Robin Routing (Load Balancing)

Distributes traffic evenly across all available instances.

  • Pros: Simple, fair distribution.
  • Cons: No guarantee that a user hits the same instance across requests.
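
A minimal round-robin sketch, assuming a fixed list of instance names (the names and the RoundRobinRouter class are hypothetical):

```python
from itertools import cycle
from typing import Sequence


class RoundRobinRouter:
    """Hand out instances in a fixed rotation, one per request."""

    def __init__(self, instances: Sequence[str]):
        if not instances:
            raise ValueError("at least one instance is required")
        self._cycle = cycle(list(instances))

    def route(self) -> str:
        return next(self._cycle)


if __name__ == "__main__":
    router = RoundRobinRouter(["stable-1", "stable-2", "stable-3", "canary-1"])
    print([router.route() for _ in range(8)])
```

With this approach the share of traffic reaching the new version is simply the share of instances running it, one in four in the example.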

Sticky Traffic

Routes traffic based on a specific parameter (user ID, session ID, client ID) and ensures a client always hits the same instance.

  • Pros: Better for session affinity and caching.
  • Cons: Can cause uneven load distribution if hash keys are poorly distributed.
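
One common way to get stickiness is to hash the routing key into a fixed number of buckets. The sketch below pins a client to a version rather than to a single instance, and the names bucket_for and version_for are assumptions made for illustration.

```python
import hashlib


def bucket_for(client_id: str, buckets: int = 100) -> int:
    """Map a client ID to a stable bucket in the range [0, buckets)."""
    # A cryptographic hash is used only for its even spread; unlike the
    # built-in hash(), it gives the same result in every process.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def version_for(client_id: str, canary_percent: int) -> str:
    """Send clients whose bucket falls below canary_percent to the canary."""
    return "canary" if bucket_for(client_id) < canary_percent else "stable"


if __name__ == "__main__":
    for client in ["client-a", "client-b", "client-c"]:
        print(client, version_for(client, canary_percent=10))
```

Because the bucket depends only on the client ID, a client that lands on the canary at 10% stays on it when the percentage is raised.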

Gradual Traffic Shifting

Shifts traffic to the new version in small steps, increasing its share only while the new version stays healthy.

  • Pros: Reduces risk and allows rollback if issues arise.
  • Cons: Requires monitoring to detect failures early.
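
The shifting loop can be sketched as follows. The step percentages and the is_healthy callback are hypothetical; in practice the health check would query whatever monitoring is in place.

```python
from typing import Callable, Iterable


def shift_traffic(steps: Iterable[int], is_healthy: Callable[[int], bool]) -> int:
    """Walk through canary percentages and return the final percentage.

    Returns 0 (full rollback) the first time is_healthy reports a problem.
    """
    current = 0
    for percent in steps:
        current = percent  # send this share of traffic to the new version
        if not is_healthy(current):
            return 0  # roll back: all traffic returns to the stable version
    return current


if __name__ == "__main__":
    # A stand-in health check that starts failing once the canary gets 50%.
    print(shift_traffic([10, 25, 50, 100], lambda percent: percent < 50))
```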

Blue-Green Deployment

Runs two environments in parallel. Once the new version is verified, all traffic is switched from the stable version to the new one.

  • Pros: Instant rollback, no partial deployments.
  • Cons: Requires double resources (both versions running).
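
A minimal sketch of the switch, assuming the two environments are identified by the labels "blue" and "green" (the BlueGreenSwitch class is illustrative):

```python
class BlueGreenSwitch:
    """Track which of two parallel environments receives all traffic."""

    def __init__(self, live: str = "blue", idle: str = "green"):
        self.live = live
        self.idle = idle

    def route(self) -> str:
        # Every request goes to the live environment; there is no split.
        return self.live

    def switch(self) -> None:
        """Swap live and idle. Calling it a second time rolls back."""
        self.live, self.idle = self.idle, self.live


if __name__ == "__main__":
    switch = BlueGreenSwitch()
    switch.switch()        # cut over to the new environment
    print(switch.route())
    switch.switch()        # roll back
    print(switch.route())
```

The old environment keeps running after the cutover, which is what makes the rollback instant and also what doubles the resource cost.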

Header-Based Routing

Routes traffic to different versions based on headers or cookies.

  • Pros: Controlled experiments without affecting all users.
  • Cons: Needs application-level support to identify user segments.
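
As a sketch, a router could opt a request into the new version when it carries a marker header. The header name x-canary and its value "true" are assumptions chosen for this example, not a standard.

```python
from typing import Mapping

CANARY_HEADER = "x-canary"  # hypothetical header name


def version_for_request(headers: Mapping[str, str]) -> str:
    """Route to the canary only when the request explicitly opts in."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    opted_in = normalized.get(CANARY_HEADER, "").strip().lower() == "true"
    return "canary" if opted_in else "stable"


if __name__ == "__main__":
    print(version_for_request({"X-Canary": "true"}))
    print(version_for_request({"Accept": "application/json"}))
```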

Feature Flags vs. Canary Deployments

We follow a simple rule when choosing between feature flags and canary deployments:

Feature Deployment & Cron Jobs → Feature Flags Only

  • Gives complete control over enabling/disabling features without redeploying.
  • Safer rollback without affecting existing jobs.

API Migrations → Canary Deployments + Feature Flags

  • The canary release controls the traffic split between the old and new APIs.
  • Feature Flags allow toggling specific behavior in the new API without rolling back deployments.

This combo ensures smooth rollouts while maintaining control over new API behavior.
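
The two layers can be sketched together. The flag name new_pricing, the flag store, and the response strings are all hypothetical.

```python
from typing import Mapping


def handle_request(version: str, flags: Mapping[str, bool]) -> str:
    """Pick behavior from the deployed version first, then from the flag."""
    if version == "stable":
        # The old API never sees the flagged code path.
        return "old API"
    # The request was routed to the canary (new API).
    if flags.get("new_pricing", False):
        return "new API, new behavior"
    return "new API, old behavior"


if __name__ == "__main__":
    flags = {"new_pricing": True}
    print(handle_request("canary", flags))
    flags["new_pricing"] = False  # turn the behavior off without redeploying
    print(handle_request("canary", flags))
```

The traffic split decides which version a request reaches, while the flag decides what the new version does with it, so a problem in the flagged behavior can be switched off without touching the rollout.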

Where Canary Deployments Come in Handy

  • Google Chrome: Google uses the canary release strategy for its Chrome browser. Chrome Canary is a nightly-updated version that allows developers and early adopters to test the latest features and changes before they reach beta and stable channels.

  • Facebook: Facebook often rolls out new features to a small percentage of users—especially on mobile—to monitor performance and engagement and uncover technical issues early.

  • Amazon Web Services (AWS): AWS relies on canary deployments to validate updates, configurations, and new features in a controlled environment, minimizing risk across its vast cloud infrastructure.

  • Microsoft Azure: Like AWS, Microsoft uses canary releases for Azure services to test updates safely, ensuring any issues are caught before a full-scale rollout.

Canary deployments are designed for traffic shifting. They work well when clients continuously interact with a service, because a steady stream of requests lets issues surface early. For a cron job, a canary release would mean that some instances run the old version and others run the new one. But if the job runs only once per hour, we are not shifting traffic; we are just running different versions at different times.

Canary deployments suit stateless applications best, but statelessness is not a strict requirement. They can also be used for stateful applications, although that adds complexity and risk.

What We Have Learned

Canary deployments give us a powerful way to release software with confidence. Instead of pushing changes to everyone at once and hoping for the best, we take a measured approach: start small, watch closely, and expand only when it’s safe. We use rolling deployments with a sticky routing strategy, which gives us control over how traffic flows. Combined with good monitoring and observability, this technique helps us catch issues early, roll back fast, and deliver better user experiences. In short, it’s a safer, smarter way to ship.

Published by...
Amjad Khader