We hope you enjoyed the first two parts of our progressive delivery with Argo Rollouts series, where we implemented the blue-green and canary deployment strategies by deploying a sample application with the Argo Rollouts controller in a Kubernetes cluster.
In Part 3 of this series, we take a step further and explore the canary deployment strategy with automated analysis, again by deploying a sample app with Argo Rollouts. This will show how the next upgrade of a microservice can be either fully promoted or rolled back with ease, without impacting end users and, more importantly, without any human intervention.
What is Analysis? Why is it needed?
When we upgrade or deploy new versions of our microservices, we also want to be sure that the changes are not breaking any functionality. To be sure of this, one needs to perform functional and sanity testing after every upgrade. With Argo Rollouts analysis, this kind of testing can be performed before, during, or after the upgrade in an automated way, and based on the results of the analysis we can either roll forward or roll back the new changes completely.
Now, you must be wondering how to implement this so-called "Analysis", right?
For that, one needs to create and apply an "AnalysisTemplate" object. A Rollout object references this template, and when the analysis is triggered, the controller creates another Kubernetes object called an "AnalysisRun". This "AnalysisRun" object runs the analysis of your choice to decide whether your upgrade is successful and should roll forward, or unsuccessful and should roll back.
For analysis, you can use metrics scraped from your canary services by monitoring providers such as Prometheus, Datadog, or New Relic. You can also create your own Kubernetes jobs to trigger a custom set of tests or, if needed, make an HTTP request against an external service and decide based on the response.
Sample AnalysisTemplate: The example below calculates the success rate of the new canary version using Istio-based Prometheus metrics, by checking what share of all HTTP requests did not return an HTTP 5xx error. Based on the successCondition defined below, if at least 95% of requests are successful, the analysis is considered Successful and the rollout is promoted.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  - name: prometheus-port
    value: 9090
  metrics:
  - name: success-rate
    successCondition: result[0] >= 0.95
    provider:
      prometheus:
        address: "http://prometheus.example.com:{{args.prometheus-port}}"
        query: |
          sum(irate(
            istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}",response_code!~"5.*"}[5m]
          )) /
          sum(irate(
            istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}"}[5m]
          ))
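To make the successCondition more concrete, here is a minimal shell sketch of the arithmetic that the query and the condition express: the share of non-5xx requests, compared against the 0.95 threshold. The function names and the request counts are hypothetical and chosen only for illustration. In a real rollout this evaluation is done by the Argo Rollouts controller against Prometheus, not by a script like this one.

```shell
# Hypothetical helper: ratio of non-5xx requests to all requests.
# $1 = total requests, $2 = requests that returned a 5xx code
# (assumes $1 is greater than zero)
success_rate() {
  awk -v total="$1" -v errors="$2" 'BEGIN { printf "%.4f\n", (total - errors) / total }'
}

# Hypothetical helper: mirrors the successCondition "result[0] >= 0.95".
# $1 = measured rate, $2 = threshold; exit status 0 means the condition holds
meets_condition() {
  awk -v rate="$1" -v min="$2" 'BEGIN { exit !(rate >= min) }'
}

# Made-up counts: 970 of 1000 requests did not return a 5xx code.
rate=$(success_rate 1000 30)
if meets_condition "$rate" 0.95; then
  echo "success-rate $rate: Successful"
else
  echo "success-rate $rate: Failed"
fi
```

Note that the comparison is inclusive, so a rate of exactly 0.95 still passes.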
There are different ways to perform this analysis as part of Rollouts as listed below.
Background Analysis
We can run our analysis in the background while our canary rollout is progressing through its rollout steps.
Here is a sample Rollout that runs analysis in the background. Look specifically at the analysis section under the canary strategy:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  ...
  strategy:
    canary:
      analysis:
        templates:
        - templateName: success-rate
        startingStep: 2 # delay starting analysis run until setWeight: 40%
        args:
        - name: service-name
          value: guestbook-svc.default.svc.cluster.local
      steps:
      - setWeight: 20
      - pause: {duration: 10m}
      - setWeight: 40
      - pause: {duration: 10m}
      - setWeight: 60
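The startingStep value is an index into the steps list, and the comment in the sample (step 2 is setWeight: 40) implies that counting starts at zero. The following shell sketch only illustrates that indexing; step_at is a hypothetical helper, and the step strings are copied from the sample above as plain text.

```shell
# Hypothetical helper: print the step found at a zero-based index.
# $1 = index, remaining arguments = the steps in order
step_at() {
  index=$1
  shift
  i=0
  for step in "$@"; do
    if [ "$i" -eq "$index" ]; then
      echo "$step"
      return 0
    fi
    i=$((i + 1))
  done
  return 1   # index is past the end of the list
}

# Which step does startingStep: 2 point at?
step_at 2 "setWeight: 20" "pause: 10m" "setWeight: 40" "pause: 10m" "setWeight: 60"
```

With the steps from the sample, index 2 resolves to the setWeight: 40 step, so the background analysis starts only after 40% of the weight has been shifted.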
Inline Analysis
We can perform our analysis inline as part of the rollout steps. When we declare that our analysis should be performed inline, then the analysis will be triggered only when that step is reached. It holds the further rollout until the analysis is complete. The success or failure of the analysis run decides if the rollout will proceed to the next step or abort the rollout completely.
Here is a sample Rollout that runs inline analysis. Look specifically at the analysis section, which is declared as one of the steps:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  ...
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 5m}
      - analysis:
          templates:
          - templateName: success-rate
          args:
          - name: service-name
            value: guestbook-svc.default.svc.cluster.local
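The blocking behaviour of an inline analysis step can be pictured with a small shell sketch. It is only a simulation under simplifying assumptions: run_steps is a hypothetical function, the analysis outcome is passed in by hand instead of being measured, and the real controller handles much more (pauses, retries, inconclusive results).

```shell
# Hypothetical simulation of a canary step list with one inline analysis.
# $1 = simulated analysis outcome ("pass" or "fail"), remaining arguments = steps
run_steps() {
  outcome=$1
  shift
  for step in "$@"; do
    case "$step" in
      analysis)
        # The rollout waits here until the analysis has finished.
        if [ "$outcome" = "pass" ]; then
          echo "analysis: Successful, continuing"
        else
          echo "analysis: Failed, aborting rollout"
          return 1
        fi
        ;;
      *)
        echo "step: $step"
        ;;
    esac
  done
  echo "rollout fully promoted"
}

run_steps pass "setWeight: 20" "pause: 5m" analysis
run_steps fail "setWeight: 20" "pause: 5m" analysis || echo "rollout aborted with status $?"
```

In the failing case, nothing after the analysis step is executed, which is the property that makes inline analysis useful as a gate.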
BlueGreen Pre Promotion Analysis
A Rollout using the BlueGreen strategy can launch an AnalysisRun before it switches traffic to the new version, using pre-promotion analysis. This can be used to block the Service selector switch until the AnalysisRun finishes successfully. The success or failure of the AnalysisRun decides whether the Rollout switches traffic or aborts completely.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  ...
  strategy:
    blueGreen:
      activeService: active-svc
      previewService: preview-svc
      prePromotionAnalysis:
        templates:
        - templateName: smoke-tests
        args:
        - name: service-name
          value: preview-svc.default.svc.cluster.local
BlueGreen Post Promotion Analysis
A Rollout using the BlueGreen strategy can launch an AnalysisRun after the traffic switch to the new version, using post-promotion analysis. If the post-promotion analysis fails, the Rollout enters an aborted state and switches traffic back to the previous stable ReplicaSet. When the post-promotion analysis is Successful, the Rollout is considered fully promoted, and the new ReplicaSet is marked as stable.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  ...
  strategy:
    blueGreen:
      activeService: active-svc
      previewService: preview-svc
      scaleDownDelaySeconds: 600 # 10 minutes
      postPromotionAnalysis:
        templates:
        - templateName: smoke-tests
        args:
        - name: service-name
          value: preview-svc.default.svc.cluster.local
By now, you should be familiar with the role analysis plays in the overall rollout. So let's get our hands dirty with a lab, because the best way to learn is by doing.
Lab of Argo Rollouts with Canary Deployment And Analysis
If you do not have a Kubernetes cluster readily available for the lab, we recommend the CloudYuga platform-based version of this blog post. Otherwise, you can set up your own local kind cluster with the Nginx ingress controller deployed, and follow along by executing the commands below against it.
Clone the Argo Rollouts example GitHub repo or, preferably, fork it first:
git clone https://github.com/NiniiGit/argo-rollouts-example.git
Installation of Argo Rollouts controller
Create the namespace for the Argo Rollouts controller and install Argo Rollouts with the commands below. More details can be found in the Argo Rollouts installation documentation.
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
You will see that the controller and other components have been deployed. Wait for the pods to be in the Running state.
kubectl get all -n argo-rollouts
Install the Argo Rollouts kubectl plugin with curl for easy interaction with the Rollouts controller and resources.
curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x ./kubectl-argo-rollouts-linux-amd64
sudo mv ./kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts
kubectl argo rollouts version
Argo Rollouts also comes with its own GUI, which you can start with the command below:
kubectl argo rollouts dashboard
Now you can access the Argo Rollouts console by visiting http://localhost:3100 in your browser. The console will not show anything yet, since we have not deployed any Argo Rollouts based application.
Now, let's go ahead and deploy the sample app using the Canary Deployment strategy and analysis.
Canary Deployment And Analysis with Argo Rollouts
To experience how canary deployment with analysis works in Argo Rollouts, we will deploy a sample app that consists of the following Kubernetes objects: a Rollout with the canary strategy, an AnalysisTemplate, a Service, and an Ingress.
analysis.yaml
content:
kind: AnalysisTemplate
apiVersion: argoproj.io/v1alpha1
metadata:
  name: canary-check
spec:
  metrics:
  - name: test
    provider:
      job:
        spec:
          backoffLimit: 1
          template:
            spec:
              containers:
              - name: busybox
                image: busybox
                #args: [test] #--> for making analysis fail, uncomment
              restartPolicy: Never
rollout.yaml
content:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 5
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 10}
      - analysis:
          templates:
          - templateName: canary-check
      - setWeight: 40
      - pause: {duration: 10}
      - setWeight: 60
      - pause: {duration: 10}
      - setWeight: 80
      - pause: {duration: 10}
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollouts-demo
  template:
    metadata:
      labels:
        app: rollouts-demo
    spec:
      containers:
      - name: rollouts-demo
        image: argoproj/rollouts-demo:blue
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: 32Mi
            cpu: 5m
service.yaml
content:
apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo
ingress.yaml
content:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rollout-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: rollouts-demo
            port:
              number: 80
Now, let's create all these objects in the default namespace by executing the following command:
kubectl apply -f argo-rollouts-example/canary-deployment-withanalysis-example/
You can see all the objects created in the default namespace by running the command below:
kubectl get all
Now you can access your sample app by opening http://localhost:80 in your browser, where you should see the blue version of the app running.
If you visit the Argo Rollouts console again at localhost:3100 in your browser, this time you will see the sample app listed there.
You can click on rollouts-demo in the console to see its current status.
From here on, you can either use this GUI or (preferably) use the command line to continue with this demo.
You can also see the current status of this rollout by running the command below:
kubectl argo rollouts get rollout rollouts-demo
When Analysis is successful
Now, let's deploy the Yellow version of the app using the canary strategy via the command line.
kubectl argo rollouts set image rollouts-demo rollouts-demo=argoproj/rollouts-demo:yellow
You will see a new pod of our sample app coming up, running the yellow version.
kubectl get pods
At this point, only 20% of the pods, i.e. 1 out of 5, will come online with the yellow version. The rollout then pauses for 10 seconds before it initiates the analysis, as defined in the steps above. See line 12 in the rollout.yaml spec file.
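The "1 out of 5" figure follows from the weight and the replica count. As a rough sketch, assuming no traffic routing is configured and that the controller rounds the weighted replica count to the nearest whole number, the canary pod count for each setWeight step in rollout.yaml can be estimated as follows. canary_pods is a hypothetical helper, and the real controller applies further rules (for example around surge and availability) that are ignored here.

```shell
# Hypothetical helper: approximate canary pod count for a setWeight step.
# $1 = spec.replicas, $2 = setWeight percentage
# Integer arithmetic, rounded to the nearest whole number (an assumption).
canary_pods() {
  echo $(( ($1 * $2 + 50) / 100 ))
}

# The four setWeight steps from rollout.yaml, with replicas: 5
for weight in 20 40 60 80; do
  echo "setWeight $weight -> $(canary_pods 5 "$weight") of 5 pods on the new version"
done
```

For the demo rollout every weight maps cleanly to a whole pod, so the rounding assumption does not affect the result.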
If the AnalysisRun created from the AnalysisTemplate is successful, the Rollout understands that it can promote the new version; otherwise, it safely rolls back to the previous revision. To show how the Rollout promotes when the analysis is successful, we run a busybox container that eventually exits with status code 0. This makes the Rollout treat the analysis as successfully completed, so it proceeds with the remaining steps. First, check that the AnalysisTemplate exists:
kubectl get analysistemplate
Let's confirm if it has created AnalysisRun or not
kubectl get AnalysisRun
This AnalysisRun creates a Kubernetes job, and based on the job's exit status the Rollout either rolls forward or rolls back.
kubectl get job -o wide
Execute the command below to see the analysis being executed as part of the rollout.
kubectl argo rollouts get rollout rollouts-demo
On the Argo Rollouts console, you will also see a new revision of the app running with the changed image version.
If you open the app at http://localhost:80 in your browser, you will initially still see mostly the blue version and only a small share of yellow, because the Rollout is waiting for the results of the analysis job. Based on those results, it decides whether to promote and proceed with the rest of the canary deployment steps or to roll back.
Once the analysis is successful, you will see more of the yellow version of the app running.
Now let's delete this setup before we test how Argo Rollouts behaves when the analysis fails.
kubectl delete -f argo-rollouts-example/canary-deployment-withanalysis-example/
When Analysis is unsuccessful
To verify that Argo Rollouts automatically rolls back the new revision when the analysis is not successful, we will deliberately make the analysis fail. To do that, open the cloned repo in an editor such as VS Code and open the analysis.yaml file in the canary-deployment-withanalysis-example folder.
Now, uncomment args: [test] in analysis.yaml and save the file.
kind: AnalysisTemplate
apiVersion: argoproj.io/v1alpha1
metadata:
  name: canary-check
spec:
  metrics:
  - name: test
    provider:
      job:
        spec:
          backoffLimit: 1
          template:
            spec:
              containers:
              - name: busybox
                image: busybox
                args: [test] # uncommented to make the analysis fail
              restartPolicy: Never
This change makes the analysis fail and lets you observe how the Rollout rolls back to the old revision on its own. Repeat all the steps we followed earlier for the successful case: apply the manifests, set the image to the yellow version, and watch the rollout status.
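Why does args: [test] make the analysis fail? A Kubernetes job is considered successful only when its container exits with status 0. The sketch below reproduces that mapping locally, under the assumption that the busybox image runs sh by default (which exits 0 when it has nothing to read) and that args: [test] replaces this with the test command, which exits with a non-zero status when it is given no expression. analysis_phase is a hypothetical helper, and your local sh and test stand in for the ones inside the container.

```shell
# Hypothetical helper: run a command the way the job's container would,
# then map its exit status to the phase the analysis would end in.
analysis_phase() {
  if "$@" </dev/null >/dev/null 2>&1; then
    echo "Successful"
  else
    echo "Failed"
  fi
}

echo "default command (sh):  $(analysis_phase sh)"
echo "with args: [test]:     $(analysis_phase test)"
```

Any command that exits non-zero would have the same effect, which is why a job-based analysis is a convenient way to plug in your own test suite.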
Conclusion
In this post, we saw how the analysis feature provided by Argo Rollouts can be used to achieve an automated, canary-style progressive delivery. Achieving canary deployment this way with Argo Rollouts is simple and, importantly, gives much better automated control over rolling out a new version of your application than the default rolling update strategy of Kubernetes.
I hope you found this post informative and engaging. I'd love to hear your thoughts on this post, so start a conversation on Twitter or LinkedIn :)
What Next?
We have now developed a deeper understanding of progressive delivery and created a canary deployment with analysis. The next step is the last part of this series: canary deployment with traffic management using Argo Rollouts. Stay tuned for that post.
You can find all the parts of this Argo Rollouts Series below:
- Part 1: Progressive Delivery with Argo Rollouts: Blue Green Deployment
- Part 2: Progressive Delivery with Argo Rollouts: Canary Deployment
- Part 3: Progressive Delivery with Argo Rollouts: Canary with Analysis