Canary Model Deployment with Seldon Core
Seldon Core supports the Canary deployment strategy out of the box. Recently I was looking for samples or references describing that feature to point a customer to, but everything I found was overcomplicated, IMHO. The whole point of this feature in Seldon Core is that deploying an ML model in a Canary fashion is actually extremely simple. So in this post I will give it a shot and describe the feature in my own way.
A Seldon Deployment K8s resource contains one or more predictors, as thoroughly described in this post. By default, there is a single predictor that handles all the traffic coming to the Seldon Deployment. This predictor is considered the “green”, or “current”, model deployment.
When we have a new (“blue”) scoring image that we want to deploy in a Canary fashion, routing traffic to the “green” and “blue” models according to configured weights, we just add a second predictor to the Seldon Deployment and specify the traffic weights:
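Here is a minimal sketch of such a manifest. The deployment name and container images are hypothetical, and in Seldon Core v1 the graph name must match the container name:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: food-classifier
spec:
  predictors:
  # "green" predictor: the current model, receives 70% of the traffic
  - name: green
    traffic: 70
    graph:
      name: classifier
      type: MODEL
    componentSpecs:
    - spec:
        containers:
        - name: classifier
          image: myregistry/food-classifier:1.0.0   # hypothetical image
  # "blue" predictor: the canary model, receives 30% of the traffic
  - name: blue
    traffic: 30
    graph:
      name: classifier
      type: MODEL
    componentSpecs:
    - spec:
        containers:
        - name: classifier
          image: myregistry/food-classifier:2.0.0   # hypothetical image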
Actually, that’s it. Having applied the manifest above, we can test it with these commands:
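For example, assuming the deployment above lives in the default namespace and is exposed through the Istio ingress gateway, something like this should work (the request payload shape depends on your model):

# resolve the external IP of the Istio ingress gateway
INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# send a batch of requests and watch the versions in the responses
for i in $(seq 1 10); do
  curl -s -X POST \
    "http://$INGRESS_HOST/seldon/default/food-classifier/api/v1.0/predictions" \
    -H "Content-Type: application/json" \
    -d '{"data":{"ndarray":[[1.0,2.0,3.0]]}}'
  echo
done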
The “green” model will respond 70% of the time with version 1.0.0:
{"prediction":"tacos","scores":"0.79620767","time":0.000128,"version":"1.0.0"}
The “blue” model will respond 30% of the time with version 2.0.0:
{"prediction":"tacos","scores":"0.79620767","time":0.000205,"version":"2.0.0"}
Behind the scenes, Seldon Core configures all the necessary K8s resources, such as services and Istio virtual services, for you:
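You can inspect them with kubectl. The names below follow from the hypothetical manifest above; the exact resource names Seldon Core generates may differ:

# one service per predictor, plus Istio virtual services for the routing
kubectl get services
kubectl get virtualservices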
Whenever you update the Seldon Deployment with new weights, Seldon Core updates the corresponding virtual service routing rules accordingly.
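If you are curious, you can see the weights reflected in the virtual service spec. The resource and destination host names below are illustrative, as the generated names depend on your deployment:

kubectl get virtualservice food-classifier-http -o yaml

# excerpt of the routing rules (host names are illustrative)
http:
- route:
  - destination:
      host: food-classifier-green
    weight: 70
  - destination:
      host: food-classifier-blue
    weight: 30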
That’s it!