Progressive Delivery with Argo Rollouts: Canary with Analysis

We hope you enjoyed the first two parts of our progressive delivery with Argo Rollouts series, where we implemented the blue-green and canary deployment strategies, respectively, by deploying a sample application with the Argo Rollouts controller in a Kubernetes cluster.

In Part 3 of this series, we take a step further and explore the canary deployment strategy with automated analysis by deploying a sample app using Argo Rollouts. This will show how we can either fully promote or roll back the next upgrade of our microservices with ease, without impacting end users and, more importantly, without any human intervention.

What is Analysis? Why is it needed?

When we upgrade or deploy new versions of our microservices, we also want to be sure that the changes do not break any functionality. To be sure of this, one needs to run certain functional and sanity tests after every upgrade. With Argo Rollouts Analysis, this kind of testing can be performed before, during, or after the upgrade in an automated way, and based on the results of the analysis we can either roll forward or roll back the new changes completely.

Figure: Argo Rollouts Analysis

Now, you must be wondering how to implement this so-called “Analysis”, right?

For that, one needs to create and apply an “AnalysisTemplate” object. When a Rollout references this template, the controller creates another Kubernetes object called an “AnalysisRun”. This “AnalysisRun” object runs the analysis of your choice to decide whether your upgrade is successful and should roll forward, or unsuccessful and should roll back.

For analysis, you can use metrics scraped from your canary services with the help of monitoring providers such as Prometheus, Datadog, or New Relic. You can also create your own Kubernetes Jobs to trigger a custom set of tests or, if needed, send an HTTP request to an external service and decide based on the response.

Sample AnalysisTemplate: The example below calculates the success rate of the new canary version from Istio-based Prometheus metrics, as the share of all HTTP requests that did not return an HTTP 5xx error. Based on the successCondition defined below, if at least 95% of requests are successful, the analysis is considered Successful and the rollout is promoted.

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
 name: success-rate
spec:
 args:
 - name: service-name
 - name: prometheus-port
   value: 9090
 metrics:
 - name: success-rate
   successCondition: result[0] >= 0.95
   provider:
     prometheus:
       address: "http://prometheus.example.com:{{args.prometheus-port}}"
       query: |
         sum(irate(
           istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}",response_code!~"5.*"}[5m]
         )) /
         sum(irate(
           istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}"}[5m]
         ))
Source: Argo Rollouts docs
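
To make the successCondition concrete, here is a small worked calculation in shell. It is only a sketch: the request rates are made-up numbers, not output from a real Prometheus server, and awk stands in for the division that the query performs.

```shell
# Hypothetical request rates (requests per second); not real measurements.
total=200
errors_5xx=6

# Same ratio the PromQL query computes: non-5xx requests / all requests.
rate=$(awk -v t="$total" -v e="$errors_5xx" 'BEGIN { printf "%.2f", (t - e) / t }')

# Mirror of successCondition: result[0] >= 0.95
verdict=$(awk -v r="$rate" 'BEGIN { if (r >= 0.95) print "Successful"; else print "Failed" }')

echo "success rate: $rate -> $verdict"
```

With these numbers the ratio is 0.97, so the condition holds. Raising errors_5xx to 11 or more would push the ratio below 0.95 and flip the verdict.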

There are different ways to perform this analysis as part of Rollouts as listed below.

Background Analysis

We can run our analysis in the background while our canary rollout is progressing through its rollout steps.

Below is sample code showing how to run analysis in the background with Rollouts. Look specifically at where the analysis section appears:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
...
  strategy:
    canary:
      analysis:
        templates:
        - templateName: success-rate
        startingStep: 2 # delay starting analysis run until setWeight: 40%
        args:
        - name: service-name
          value: guestbook-svc.default.svc.cluster.local
      steps:
      - setWeight: 20
      - pause: {duration: 10m}
      - setWeight: 40
      - pause: {duration: 10m}
      - setWeight: 60
Source: Argo Rollouts docs
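
The startingStep value is an index into the steps list that counts from zero, which is why 2 points at setWeight: 40 and not at the first pause. The loop below is only an illustration of that numbering; it does not talk to a cluster, and the variable names are made up for this sketch.

```shell
# Index of the step at which background analysis should start (from the Rollout above).
starting_step=2

i=0
for step in "setWeight: 20" "pause: 10m" "setWeight: 40" "pause: 10m" "setWeight: 60"; do
  # Remember which step the zero-based index refers to.
  if [ "$i" -eq "$starting_step" ]; then
    first_analysed="$step"
  fi
  echo "step $i: $step"
  i=$((i + 1))
done

echo "background analysis starts at: $first_analysed"
```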

Inline Analysis

We can also perform our analysis inline as one of the rollout steps. When analysis is declared inline, it is triggered only when that step is reached, and the rollout is held at that step until the analysis completes. The success or failure of the analysis run decides whether the rollout proceeds to the next step or is aborted completely.

Below is sample code showing how to run inline analysis with Rollouts. Look specifically at where the analysis section appears as part of, i.e. inlined into, the steps:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
...
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 5m}
      - analysis:
          templates:
          - templateName: success-rate
          args:
          - name: service-name
            value: guestbook-svc.default.svc.cluster.local
Source: Argo Rollouts docs

BlueGreen Pre Promotion Analysis

A Rollout using the BlueGreen strategy can launch an AnalysisRun before it switches traffic to the new version, using pre-promotion analysis. This can be used to block the Service selector switch until the AnalysisRun finishes successfully. The success or failure of the AnalysisRun decides whether the Rollout switches traffic or is aborted completely.

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
...
  strategy:
    blueGreen:
      activeService: active-svc
      previewService: preview-svc
      prePromotionAnalysis:
        templates:
        - templateName: smoke-tests
        args:
        - name: service-name
          value: preview-svc.default.svc.cluster.local
Source: Argo Rollouts docs

BlueGreen Post Promotion Analysis

A Rollout using the BlueGreen strategy can launch an AnalysisRun after the traffic switch to the new version, using post-promotion analysis. If the post-promotion analysis fails, the Rollout enters an aborted state and switches traffic back to the previous stable ReplicaSet. When the post-promotion analysis is successful, the Rollout is considered fully promoted, and the new ReplicaSet is marked as stable.

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
 name: guestbook
spec:
...
 strategy:
   blueGreen:
     activeService: active-svc
     previewService: preview-svc
     scaleDownDelaySeconds: 600 # 10 minutes
     postPromotionAnalysis:
       templates:
       - templateName: smoke-tests
       args:
       - name: service-name
         value: preview-svc.default.svc.cluster.local
Source: Argo Rollouts docs

By now, you should be familiar with the crux of how analysis plays its role in the overall rollout. So let's get our hands dirty with a lab, since doing is the best way of learning.

Lab of Argo Rollouts with Canary Deployment And Analysis

If you do not have a Kubernetes cluster readily available for the lab, we recommend the CloudYuga platform-based version of this blog post. Otherwise, you can set up your own local kind cluster with the NGINX ingress controller deployed and follow along by executing the commands below against it.

Clone the Argo Rollouts example GitHub repo or, preferably, fork it first:

git clone https://github.com/NiniiGit/argo-rollouts-example.git

Installation of Argo Rollouts controller

Create the namespace for the Argo Rollouts controller and install Argo Rollouts with the commands below. More about the installation can be found in the Argo Rollouts installation documentation.

kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

You will see that the controller and other components have been deployed. Wait for the pods to be in the Running state.

kubectl get all -n argo-rollouts

Install the Argo Rollouts kubectl plugin with curl for easy interaction with the Rollouts controller and resources. The commands below fetch the linux-amd64 binary.

curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x ./kubectl-argo-rollouts-linux-amd64
sudo mv ./kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts
kubectl argo rollouts version

Argo Rollouts also comes with its own GUI, which you can access with the command below:

kubectl argo rollouts dashboard

Now you can access the Argo Rollouts console by visiting http://localhost:3100 in your browser. You will be presented with the UI shown below (currently it won't show anything, since we have not yet deployed any Rollout).

Figure 1: Argo Rollouts Dashboard

Now, let's go ahead and deploy the sample app using the Canary Deployment strategy and analysis.

Canary Deployment And Analysis with Argo Rollouts

To experience how canary deployment with analysis works in Argo Rollouts, we will deploy a sample app that consists of a Rollout with the canary strategy, an AnalysisTemplate, a Service, and an Ingress as Kubernetes objects.

analysis.yaml content:

kind: AnalysisTemplate
apiVersion: argoproj.io/v1alpha1
metadata:
  name: canary-check
spec:
  metrics:
  - name: test
    provider:
      job:
        spec:
          backoffLimit: 1
          template:
            spec:
              containers:
              - name: busybox
                image: busybox
                #args: [test]  #--> for making analysis fail, uncomment
              restartPolicy: Never

rollout.yaml content:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 5
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 10}
      - analysis:
          templates:
          - templateName: canary-check
      - setWeight: 40
      - pause: {duration: 10}
      - setWeight: 60
      - pause: {duration: 10}
      - setWeight: 80
      - pause: {duration: 10}
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollouts-demo
  template:
    metadata:
      labels:
        app: rollouts-demo
    spec:
      containers:
      - name: rollouts-demo
        image: argoproj/rollouts-demo:blue
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: 32Mi
            cpu: 5m

service.yaml content:

apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo

ingress.yaml content:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rollout-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: rollouts-demo
            port:
              number: 80

Now, let's create all these objects in the default namespace by executing the following command:

kubectl apply -f argo-rollouts-example/canary-deployment-withanalysis-example/

You can see the objects created in the default namespace by running the command below:

kubectl get all

Now you can access the sample app by visiting http://localhost:80 in your browser. You will see the app as shown below:

Figure 2: Sample app with blue-version

If you visit the Argo Rollouts console again at localhost:3100 in your browser, this time you will see the sample app deployed, as below.

Figure 3: Canary Deployment on Argo Rollouts Dashboard

You can click on rollouts-demo in the console, and it will present you with its current status, as below.

Figure 4: Details of Canary Deployment on Argo Rollouts Dashboard

You can either use this GUI or (preferably) use the command below to see the current status of this rollout and continue with the demo:

kubectl argo rollouts get rollout rollouts-demo

When Analysis is successful

Now, let's deploy the Yellow version of the app using the canary strategy via the command line.

kubectl argo rollouts set image rollouts-demo rollouts-demo=argoproj/rollouts-demo:yellow

You would be able to see a new yellow version-based pod of our sample app, coming up.

kubectl get pods

Currently, only 20%, i.e. 1 out of 5 pods, will come online with the yellow version. The rollout will then pause for 10 seconds before it initiates the analysis, as defined in the steps above. See line 12 in the rollout.yaml spec file.
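
The pod counts for each setWeight step in rollout.yaml can be worked out with simple arithmetic. The sketch below assumes that, without a traffic router, the canary weight maps directly onto the share of the 5 replicas. With 5 replicas every step divides evenly, so it does not model how the controller rounds other replica counts or how many pods exist while a step is still scaling.

```shell
# Replica count and setWeight values taken from rollout.yaml above.
replicas=5
counts=""

for weight in 20 40 60 80; do
  # Share of the replicas that should run the new (canary) version.
  canary=$((replicas * weight / 100))
  stable=$((replicas - canary))
  echo "setWeight $weight -> $canary canary pod(s), $stable stable pod(s)"
  counts="$counts $canary"
done
```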

For a Rollout, if the AnalysisRun created from the AnalysisTemplate is successful, the rollout is promoted; otherwise, it is safely rolled back to the previous revision. To show how the Rollout is promoted when the analysis is successful, we run a busybox container that exits with status code 0. This makes the Rollout treat the analysis as successfully completed, so it proceeds with the remaining steps. Let's confirm that the AnalysisTemplate exists:

kubectl get analysistemplate

Let's confirm whether it has created an AnalysisRun:

kubectl get AnalysisRun

This AnalysisRun will create a Kubernetes Job and, based on the Job's exit status, the Rollout will either roll forward or roll back.

kubectl get job -o wide

Execute the command below, and you will see the analysis being executed as part of the rollout.

kubectl argo rollouts get rollout rollouts-demo

Also, on the Argo Rollouts console, you will see the new revision of the app running with the changed image version, as below.

Figure: Canary analysis on the Argo Rollouts console

If you visit the app at http://localhost:80 in your browser, you will still see mostly the blue version and only a small number of yellow initially, because the Rollout is waiting for the results of the analysis Job. Based on the results, it will either promote and proceed with the rest of the canary steps, or roll back.

Figure: Sample app showing mostly blue with some yellow

Once the Analysis is successful, you would see more of the yellow version app running as below:

Figure: Sample app showing mostly yellow

Now let's delete this setup before we test how Argo Rollouts behaves when the analysis fails.

kubectl delete -f argo-rollouts-example/canary-deployment-withanalysis-example/

When Analysis is unsuccessful

To verify how Argo Rollouts automatically rolls back the new revision when the analysis is not successful, we will deliberately fail the analysis. To do that, open the cloned repo in an editor such as VS Code and open the analysis.yaml file in the canary-deployment-withanalysis-example folder.

Now, let's uncomment args: [test] from the analysis.yaml and save it.

kind: AnalysisTemplate
apiVersion: argoproj.io/v1alpha1
metadata:
  name: canary-check
spec:
  metrics:
  - name: test
    provider:
      job:
        spec:
          backoffLimit: 1
          template:
            spec:
              containers:
              - name: busybox
                image: busybox
                args: [test]  # uncommented to make the analysis fail
              restartPolicy: Never

This change makes the analysis fail, which lets you see how the Rollout rolls back to the old revision by itself. Re-apply the manifests and repeat all the steps from the successful case above.
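
Why does a single args: [test] line make the Job fail? A plausible reading, assuming the busybox image's default command is a shell, is that the shell exits with status 0 when it has nothing to run, whereas test called without an expression evaluates to false and exits with status 1. The sketch below reproduces both exit codes with a local shell as a stand-in for the container; it does not start a pod.

```shell
# Local stand-ins for the two container behaviours; no cluster needed.

# Without args: a shell that receives no input simply exits with status 0.
ok_status=0
sh </dev/null || ok_status=$?

# With args: [test]: `test` with no expression evaluates to false (status 1).
fail_status=0
sh -c 'test' || fail_status=$?

echo "no args   -> exit $ok_status (Job succeeds, analysis passes)"
echo "args test -> exit $fail_status (Job fails, analysis fails)"
```

A non-zero exit marks the Job as failed, which in turn marks the AnalysisRun as failed and aborts the rollout.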

Conclusion

In this post, we experienced how to use the analysis feature provided by Argo Rollouts to achieve an automated canary deployment style of progressive delivery. Achieving canary deployment this way with Argo Rollouts is simple and, importantly, provides much better automated control over rolling out a new version of your application than the default rolling update strategy of Kubernetes.

I hope you found this post informative and engaging. I'd love to hear your thoughts on this post, so start a conversation on Twitter or LinkedIn :)

What Next?

We have now developed a better understanding of progressive delivery and created a canary deployment with analysis. Next, we will dive deeper with the last part of this series: canary deployment with traffic management using Argo Rollouts. Stay tuned for that post.
