Google and Netflix today announced the launch of Kayenta, a new open source project that aims to bring the canary analysis tools Netflix developed internally to a wider audience. Kayenta is integrated into the Netflix-incubated Spinnaker continuous delivery platform, which works across virtually every public and private cloud. While Spinnaker is the focus of this release, Kayenta can also be adapted to other environments.
The general idea behind canary analysis is pretty straightforward. As the name implies, this is an early warning system that is all about preventing major issues when you roll out an update to a service or your infrastructure. As you roll out an update to a subset of users (or servers, or parts of your network), the canary analysis service checks whether the new system behaves as it should, or at least as well as the old one. At every step, the system performs its checks and ensures that you don’t roll out an upgrade that may pass all of your regular tests but creates issues when thrown into a more complex production system.
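To make the idea concrete, here is a minimal sketch in Python of a staged rollout that checks the canary against the old version at every step. It is not Kayenta's algorithm or API. The function names (`canary_passes`, `staged_rollout`), the `collect_metrics` callback, the 10 percent tolerance and the assumption that lower metric values are better are all hypothetical choices made for illustration.

```python
from statistics import mean


def canary_passes(baseline, canary, tolerance=0.10):
    """Return True if no canary metric is worse than the baseline by more
    than `tolerance`. Assumes lower is better for every metric (error rate,
    latency), which is an illustrative simplification."""
    for name, baseline_samples in baseline.items():
        baseline_avg = mean(baseline_samples)
        canary_avg = mean(canary[name])
        if canary_avg > baseline_avg * (1 + tolerance):
            return False
    return True


def staged_rollout(stages, collect_metrics, tolerance=0.10):
    """Walk through the rollout stages (fractions of traffic) in order.

    `collect_metrics(fraction)` is a hypothetical callback that returns a
    (baseline, canary) pair of {metric_name: [samples]} dicts observed at
    that stage. Returns the last fraction that passed its check, or 0.0 if
    the very first stage failed and the update should be rolled back.
    """
    reached = 0.0
    for fraction in stages:
        baseline, canary = collect_metrics(fraction)
        if not canary_passes(baseline, canary, tolerance):
            return reached  # stop here instead of widening the rollout
        reached = fraction
    return reached
```

A caller would pass something like `staged_rollout([0.05, 0.25, 0.5, 1.0], collect_metrics)` and treat any result below `1.0` as a signal to halt or roll back. A real system would likely use a more robust statistical comparison than a difference of averages.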
As Google product manager Andrew Phillips told me, a lot of developers already do this, but it’s often a rather informal process. Teams often build their app, deploy it to a few servers, wait for a few minutes and then check their dashboards to look for obvious issues. That introduces the chance of human error and brings in the potential for bias. A canary analysis system, on the other hand, can evaluate the metrics and then (ideally) make an objective decision on whether the code is ready to ship or not. While most companies run automated tests to check their code for obvious errors, that kind of testing is often not enough when you want to put your code into production, especially if that production environment consists of a set of microservices that may end up interacting with each other in unexpected ways.
As is so often the case these days, with Kayenta, the Netflix team wants to open up its own system to bring the service to the wider community (and in return benefit from the community’s advances, too). To do this, Netflix and Google worked to rewrite the parts of Kayenta that were specific to Netflix, where the system grew rather organically. That doesn’t necessarily make for good code, so the two companies also spent some time cleaning it up. Indeed, as Netflix director of delivery engineering Andy Glover told me, the Google and Netflix teams spent about a year getting the code ready for today’s release, and one of the major areas of focus for both teams was making sure that the code was as modular as possible.
The fact that Google and Netflix already did some joint work on Spinnaker surely helped their efforts to get Kayenta off the ground, too. It also helps that canary analysis isn’t exactly a competitive advantage for either company. As Phillips stressed, there is really no need for every enterprise to reinvent the wheel and this kind of project is all about “giving space-age tech to the masses.”
Looking forward, the plan is to grow both the Kayenta and Spinnaker communities. “The goodwill of Netflix and Google together has attracted a certain crowd of developers that have embraced Spinnaker,” Glover noted. The Kayenta project will surely benefit from that. The Spinnaker Slack room already has over 4,000 participants, after all.
As Phillips also stressed, part of that interest is due to the simple fact that people need software delivery solutions, and while there are plenty of options, a project that has the backing of Google and Netflix attracts a lot of attention by default. Given the backing of these two companies, I wouldn’t be surprised if we saw some kind of commercial distribution of Spinnaker and Kayenta in the near future.