
MySQL on Kubernetes

Last week, I was proud to present my talk, “MySQL on Kubernetes,” on behalf of Oracle Dyn at KubeCon 2017 in Austin, TX.

I started working at Oracle Dyn in early September and have had a wonderful time coming up to speed, getting to know my colleagues, and being amazed by their experience and knowledge. I was notified by the Cloud Native Computing Foundation (CNCF) a few weeks after starting and was thrilled to be offered the opportunity to present, and to present on behalf of Oracle Dyn!

The conference was four times larger this year (around 4,000 attendees) than last (in Seattle, around 1,000 attendees) and larger than all previous conferences combined. What does this mean? It means the Kubernetes project, along with the wonderful ecosystem and community that has evolved around it and the innumerable projects that leverage it, is an incredible success.

My talk was primarily centered on running MySQL or, in a general sense, stateful applications such as databases on Kubernetes. Kubernetes is renowned for being excellent for applications that can be scaled up and out and that have a mortal, ephemeral quality to them, but there is still some apprehension about running databases on it.

Kubernetes is a moving target, and new features seem to appear every other day. Even when I submitted my talk in July, prior to my role in Platform Services at Oracle Dyn, my understanding of Kubernetes lagged behind the latest features. In putting the talk together, I had some functionality to come up to speed on, which I think made for a better talk because, in doing so, I gained a great appreciation for the many features that Kubernetes now offers.

I have worked with Linux for more than 20 years now and have been part of the MySQL community for almost as long. I have been involved in a number of open source projects and still maintain the Perl MySQL client driver. Having worked as a systems administrator, database administrator, and developer throughout the years gives me a perspective on the manual work that was once required and that Kubernetes now makes so simple and automated.

The hardest part of preparing this presentation was limiting the content it covered. I initially had almost 60 slides and several demos, because there is so much excellent technology that I really wanted to cover, but I only had 35 minutes to do so. In the future, it might make sense to submit a talk for a day-long workshop instead!

The features I focused on in my talk were:

StatefulSets

This feature makes it possible to run stateful applications: Kubernetes pods are started in order, and each pod keeps a consistent, unique name and its own resources, such as network identity and storage, across restarts. The talk explained how StatefulSets differ from ReplicaSets and why that makes them ideal for running databases on Kubernetes.
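To make the stable naming concrete, here is a minimal Python sketch that builds a StatefulSet manifest as a plain dictionary and derives the identities each replica keeps across restarts. It is an illustration, not the manifest used in the talk: the names `mysql` and `data`, the image tag, the storage size, and the default cluster domain `cluster.local` are all assumptions, and `apps/v1` is the current stable API version (clusters from late 2017 may have needed a beta version instead).

```python
def statefulset_manifest(name, replicas, image, storage="10Gi"):
    """Build a minimal StatefulSet manifest as a plain dict (illustrative values)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Headless Service that gives every pod its own DNS entry.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "mysql",
                        "image": image,
                        "ports": [{"containerPort": 3306}],
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/mysql"}
                        ],
                    }]
                },
            },
            # One PersistentVolumeClaim is created per pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


def stable_identities(manifest, namespace="default"):
    """List the pod, DNS, and PVC names each replica keeps across restarts."""
    name = manifest["metadata"]["name"]
    service = manifest["spec"]["serviceName"]
    claim = manifest["spec"]["volumeClaimTemplates"][0]["metadata"]["name"]
    return [
        {
            "pod": f"{name}-{i}",
            "dns": f"{name}-{i}.{service}.{namespace}.svc.cluster.local",
            "pvc": f"{claim}-{name}-{i}",
        }
        for i in range(manifest["spec"]["replicas"])
    ]


if __name__ == "__main__":
    manifest = statefulset_manifest("mysql", 3, "mysql:5.7")
    for identity in stable_identities(manifest):
        print(identity)
```

A ReplicaSet, by contrast, gives its pods random name suffixes and no per-pod volume claim, which is why a restarted database pod there would not find its old data or its old address.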

The showcase application for StatefulSets in the talk was Vitess (http://vitess.io/overview/), a Google project developed to be the database store of YouTube. It allows YouTube to scale out massively because it offers built-in sharding, efficient connection pooling, query de-duping, query cleansing, automatic backups, and several other features. I gave a demo of Vitess running, which covered configuring sharding, showing failover, and performing backups, as well as the intuitive UI that is built into Vitess.

Operators

This is functionality that CoreOS developed for etcd. It allows one to create Custom Resource Definitions and encode domain knowledge about an application into the resource, so that the formerly complex aspects of running something like a database (and databases are complex, hard applications to run) are abstracted away from the user. This makes it possible to run something like a database cluster, or a monitoring system such as Prometheus, without having to deal with as much of the complexity that would otherwise be required.
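The core idea behind an operator can be sketched as a reconcile step: compare the desired state declared in a custom resource with what is actually running, and work out the actions needed to close the gap. The Python below is only a sketch of that decision logic. The `MySQLClusterSpec` schema, the action names, and the "promote the lowest surviving ordinal" rule are hypothetical and are not taken from the Oracle MySQL operator; a real operator would also watch the Kubernetes API server rather than receive state as arguments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MySQLClusterSpec:
    """Desired state, as a user might declare it in a custom resource (hypothetical)."""
    name: str
    replicas: int


@dataclass
class ObservedState:
    """What the operator currently sees running in the cluster."""
    pods: List[str] = field(default_factory=list)
    primary: Optional[str] = None


def reconcile(spec, observed):
    """Return the actions that move the observed state toward the desired state."""
    actions = []
    wanted = [f"{spec.name}-{i}" for i in range(spec.replicas)]

    # Create any replica that should exist but does not.
    for pod in wanted:
        if pod not in observed.pods:
            actions.append(("create_pod", pod))

    # Remove any pod that is no longer part of the desired cluster.
    for pod in observed.pods:
        if pod not in wanted:
            actions.append(("delete_pod", pod))

    # Domain knowledge: a replicated cluster needs exactly one healthy primary.
    survivors = [pod for pod in wanted if pod in observed.pods]
    if observed.primary not in survivors:
        candidates = survivors or wanted
        if candidates:
            actions.append(("promote_primary", candidates[0]))

    return actions


if __name__ == "__main__":
    spec = MySQLClusterSpec(name="demo", replicas=3)
    # The old primary demo-0 has disappeared; two replicas are still running.
    state = ObservedState(pods=["demo-1", "demo-2"], primary="demo-0")
    for action in reconcile(spec, state):
        print(action)
```

The user only ever edits the spec (for example, changing `replicas`); the failover and repair steps that an administrator once performed by hand live in the reconcile logic.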

The showcase application for Operators in the talk was our very own Oracle project, the MySQL operator, which is being developed in Bristol, UK, and is set for release in early 2018. The MySQL operator makes it possible to run a replicated MySQL cluster and perform backups from a single command line. I gave a demo showing the creation of a 3-node MySQL cluster and then running a backup, with the greatest of ease!

Other Kubernetes features I covered that lend themselves to running databases were node and pod affinity, as well as the ability to use any number of storage classes to back the database and provide persistence.
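As a rough illustration of how these two features apply to a database, the Python sketch below builds a pod anti-affinity rule that keeps replicas on separate nodes and a volume claim template that requests a specific storage class. The field names follow the Kubernetes pod and PersistentVolumeClaim specs, but the label `mysql`, the storage class name `fast-ssd`, and the placement-checking helper are assumptions made up for this example.

```python
from collections import Counter


def anti_affinity_for(app_label):
    """Pod anti-affinity: never schedule two pods with this label on one node."""
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [{
                "labelSelector": {"matchLabels": {"app": app_label}},
                # Treat each node (hostname) as a separate failure domain.
                "topologyKey": "kubernetes.io/hostname",
            }]
        }
    }


def volume_claim_template(name, storage_class, size):
    """Volume claim that asks a named storage class to provision the disk."""
    return {
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,  # hypothetical class name
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


def nodes_violating_anti_affinity(placement):
    """Given pod -> node assignments, return nodes hosting more than one replica."""
    counts = Counter(placement.values())
    return sorted(node for node, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(anti_affinity_for("mysql"))
    print(volume_claim_template("data", "fast-ssd", "50Gi"))
    # Two replicas sharing node-a would be lost together if node-a failed.
    print(nodes_violating_anti_affinity(
        {"mysql-0": "node-a", "mysql-1": "node-a", "mysql-2": "node-b"}
    ))
```

Spreading replicas across nodes means that losing one machine takes out at most one copy of the data, while the storage class lets the same manifest run on different backing storage without changes to the database configuration.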

The presentation went really well and was well attended. I even had to take conversations out into the hallway afterward so the next speaker could present!

The conference overall was one of the best conferences I’ve attended. Highlights from the conference, amidst my being busy rehearsing for my talk, were:

  • Kubernetes being so stable now that it’s “Boring” and that running a Kubernetes cluster is far easier than it used to be
  • Heptio working with Microsoft to bring backup of Kubernetes clusters to Azure
  • HBO serving Game of Thrones on Kubernetes
  • Service mesh on Kubernetes (Linkerd), which provides service discovery, visibility, and handling of communication failures
  • Monitoring and metrics on Kubernetes (Prometheus, OpenTracing, Fluentd)
  • CoreDNS becoming (https://coredns.io/)
  • gRPC becoming part of the CNCF
  • Other excellent use-case stories from the likes of Netflix and GitHub.

I have to say, leaving this conference, I’m very excited about what the future of Kubernetes offers Oracle Dyn and how we can leverage so much of this wonderful platform and community (and potentially contribute to projects like the MySQL Operator)!

For those who weren’t at the conference, my slides (and links to video demos) can be found here.



Patrick Galbraith

Patrick Galbraith is a Principal Platform Engineer at Oracle Dyn Global Business Unit, a pioneer in managed DNS and a leader in cloud-based infrastructure that connects users with digital content and experiences across a global internet.