MongoDB on Kubernetes with GitOps by Sergey Pronin
GitOps quickly found its way into the hearts of DevOps engineers and developers, as it greatly simplifies the application delivery pipeline. GitOps can also be used to deploy and, more importantly, manage MongoDB on Kubernetes. In this video, Sergey Pronin explains how MongoDB users can benefit from combining Kubernetes and GitOps and shows a quick demo.
Hello, everyone. My name is Sergey, and I’m from Percona. And today I’m going to take you on a journey of running MongoDB on Kubernetes with GitOps.
My talk will consist of these parts. The first one is a brief intro to what we’re going to discuss: why it is important, what is interesting and fun. Then I’m going to go a bit deeper into the details of running MongoDB on Kubernetes. What kind of moving parts are there? Why is it not so simple? And why do you need an operator to do that? I will describe what Percona Distribution for MongoDB Operator is, what it can do, and why it is a good way to deploy MongoDB on Kubernetes. Then we’ll look into GitOps: what GitOps is, and why someone wants to use it. And then there will be a very brief demo of what you can do with GitOps and the Percona MongoDB Operator. That will be fun.
Okay, so a super quick intro. This is what running MongoDB on Kubernetes usually looks like. At the bottom, you have the infrastructure. This infrastructure can be Amazon, it can be Google Cloud, it can be Azure, it can be some private cloud or OpenStack, it can be bare metal, anything you want, right? A bit higher, in the middle, you have the Kubernetes layer. Kubernetes comes with hundreds of primitives: pods, containers, stateful sets, replica sets, role-based access control, you name it, there are lots of them. Keeping track of it all is quite hard. And then the top layer here is MongoDB, which you run on your Kubernetes cluster, and MongoDB comes with its own complexity. There are replica sets with multiple nodes for resiliency. Mongos is a MongoDB router, or proxy, that routes the requests to the needed replica set nodes in case you use sharding, where a replica set can be one shard. You have the Config Server, where you keep all the metadata for sharded clusters. And config here means that, obviously, you need to configure your MongoDB cluster so that all replica set nodes connect to each other, mongos knows where to route the queries, and so on and so forth.
Looking at that, it’s quite complicated. It’s not easy to manage, it’s not easy to deploy, it’s not easy to use, right? On the right here, you might simplify your life a bit with some tools like GitHub, Ansible, or Terraform, and you automate the work that you do with these tools. And then you kind of turn into a juggler who juggles all these Kubernetes primitives, all these MongoDB concepts, all these tools, and if one piece fails, the whole infrastructure goes down, or you lose the backups, or, what is worse, you can lose the data. So again, MongoDB on Kubernetes is not that easy. And this is what it usually looks like at the highest level.
If we look a bit deeper into that, this is the MongoDB cluster on Kubernetes. On the left, you have a service, a Kubernetes object to which you connect, and then your traffic is routed to mongos. Mongos is, again, the MongoDB router, or proxy, which routes the queries. Mongos talks to the Config Server replica set, and the Config Server replica set is not just a stateless application with a set of pods. It is stateful, because it keeps the metadata about all the real data that you keep in your MongoDB database and all the collections. So mongos talks to the Config Server replica set, and then mongos routes to the needed replica set node, or just to the replica set. The replica sets, in turn, consist of multiple nodes, or pods in the case of Kubernetes. These pods also have storage, so usually it is a stateful set. For mongos you can use a deployment, because you don’t have any storage there. So, in the Kubernetes world, a MongoDB cluster already has lots of components and lots of pods: stateful sets, deployments, a service. But that is not all.
On top of that, you also have to configure your MongoDB and manage it, because you want to perform some day-2 operations like backups, scaling, and recovery. If one node fails, you want it back. And doing it all manually is quite a task. Even outside of the Kubernetes world, it’s quite difficult. But in Kubernetes everything is super ephemeral, there are lots of moving parts, the pods are coming and going, and the nodes can fail at any time. That is normal in the Kubernetes world, because Kubernetes manages containers, and containers are supposed to be stateless. But when we have state, that kind of complicates things. Again, I’m saying this just to describe the whole complexity of managing MongoDB in Kubernetes.
And here the operator comes in quite handy. The operator concept, or rather the Operator SDK, was introduced by Red Hat, I think, in 2018, as an initial prototype release. The idea of the operator is simple. What if the user does not need to manage the Kubernetes primitives and the application in Kubernetes at all? What if the user just talks to the Kubernetes API and tells it something in a specific format, through a simple YAML file, and then Kubernetes and the operator take care of everything? So the operator is just a piece of code, and the real value of the operator is that the user gets a service instead of servers, instead of pods, instead of config files. You see here on the right that everything is greyed out. There is a database in there, there are multiple containers, some Kubernetes primitives, and the user doesn’t know what is happening there. The user talks to the Kubernetes API and says: Hey, Kubernetes, or, Hey, Mr. Operator, I want to have a database with three nodes, with three proxies, with this number of gigabytes of storage, with this bucket configured for my backups. And that’s it.
And then some magic happens in the background, and the user doesn’t know what’s going on. What the user gets is just the service object, which the user can use as an endpoint to connect to. In the case of the database, it’s the endpoint to connect to MongoDB. And even more value comes from how operators handle day-2 operations, because taking backups of a MongoDB cluster running on Kubernetes is not that simple, especially as, you know, in Kubernetes the nodes can come and go. Operators solve this as well. The user says: Hey, Mr. Operator, just take a backup for me and upload it to this bucket. That’s it. Or: I want point-in-time recovery logs, just do it for me, please. Or: I want monitoring working out of the box. And the operator can provide that. Again, everything is automated. The user does not need to talk to MongoDB itself and does not need to think about Kubernetes primitives. The user just talks to the Kubernetes API in a specific format, so that it can interact with the operator. An operator is basically just a piece of code that does some magic.
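The "piece of code that does some magic" is, at its core, a loop that compares what the user declared with what currently exists and closes the gap. The following is a minimal sketch of that idea in plain Python. It is not the Percona operator's code and makes no Kubernetes API calls; `ClusterSpec`, `ClusterState`, `scale`, and `reconcile` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSpec:
    """Desired state, as a user might declare it (hypothetical, simplified)."""
    replset_size: int = 3
    mongos_size: int = 3


@dataclass
class ClusterState:
    """Observed state: names of the pods that currently exist."""
    replset_pods: list = field(default_factory=list)
    mongos_pods: list = field(default_factory=list)


def scale(pods, prefix, wanted):
    """Return a new list with exactly `wanted` pods, trimming or adding at the end."""
    pods = list(pods[:wanted])
    for i in range(len(pods), wanted):
        pods.append(f"{prefix}-{i}")
    return pods


def reconcile(spec, state):
    """One pass of the loop: move the observed state toward the desired state."""
    return ClusterState(
        replset_pods=scale(state.replset_pods, "rs0", spec.replset_size),
        mongos_pods=scale(state.mongos_pods, "mongos", spec.mongos_size),
    )


# Starting from an empty cluster, one pass creates everything that was declared.
state = reconcile(ClusterSpec(), ClusterState())
print(state.replset_pods)
print(state.mongos_pods)
```

A real operator runs this kind of pass repeatedly and reacts to events, which is why a failed pod comes back without the user doing anything: the next pass sees the gap and fills it.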
At Percona, we provide 100% open source solutions for open source databases, including MongoDB. And we have our own operator for MongoDB, which is called Percona Distribution for MongoDB Operator. The reason is that we have our own Percona Distribution for MongoDB, which comes with a server and with backup tools, and we deploy this distribution on Kubernetes through our Operator. Our operator supports scaling: you can horizontally scale your MongoDB cluster by adding more and more shards. You can vertically scale by adding more resources through regular Kubernetes requests and limits. You can horizontally scale your replica sets by adding more nodes, which in Kubernetes is just adding more pods. You can take backups with our operator, as I mentioned before. We also support point-in-time recovery, which uploads operation logs to S3 buckets, so you can easily restore your data, minimizing your recovery time objectives and recovery point objectives. We have monitoring built in: it is an automated integration with Percona Monitoring and Management, an award-winning tool for monitoring databases. But in case you want, for some reason, to customize your MongoDB cluster, or you’re not happy with our monitoring solution, you can add a sidecar container. The sidecar container pattern is a concept in Kubernetes where you add a container into the pod, and this container has access to the storage, networking, and computational power of the pod. So they share all the resources. And you can, for example, add a MongoDB exporter if you don’t like PMM for some reason. This is what Percona Distribution for MongoDB is, and we deploy it with our operator in Kubernetes.
So let’s talk a bit about GitOps. I actually stole this picture from another talk by the guys behind GitOps, who developed and created this GitOps approach. What I really like about this picture is the immutability firewall concept, and it is really correct. On the left here, in blue, you have your continuous integration pipelines or flows, whatever you have there. You develop, you test, you build, whatever you do, and then you go to GitHub and create a pull request. This pull request is, for sure, carefully reviewed by the team lead, by the security team, by some automation tool, by code analysis, you name it. And once it is merged, it should somehow be shipped to staging, to production, to other environments. Usually there are a bunch of tools to do that. Like, again, Jenkins can do that with continuous deployment, or you can do it with Ansible, with Terraform. There are multiple tools to do that, but the beauty of GitOps is that you keep all the operational changes in GitHub, and they are automatically synced to the environment that you need, for example, staging. This is the real beauty, because in this case your GitHub repository really acts like an immutability firewall, which means the state which you have in GitHub is actually the same as you have in your real environment. And the GitOps approach is all about that. Every operation is a pull request. All the differences between the real environment and Git are automatically synced: whether you change it in GitHub or whether you change it in your real environment, these changes must be synced. And the source of truth is always the GitHub repository. And obviously, as long as you have a good review process in your organization, all the changes that are made to your application or to your infrastructure in Kubernetes can be easily reviewed, and they are fully auditable. This is the beauty of GitOps.
It sort of comes with that, right. And if we recall the first picture, there are lots of moving parts when you want to deploy a MongoDB cluster on Kubernetes: hundreds of Kubernetes primitives, MongoDB complexity, loads of tools to maintain it all. If you use the operator and the GitOps approach, you don’t have any of this complexity anymore. What you get is that all your changes are done through GitHub. Once they are merged, they are automatically synced to the Kubernetes cluster, and the MongoDB operator takes care of MongoDB configuration and Kubernetes primitives. So this is where simplicity comes from. Instead of being a juggler who thinks of all these moving parts, you now just need to think about GitHub, about how to enhance your development and review process, your testing, your build pipelines, instead of thinking about the operational burden, because this is all taken care of. This is the beauty of using GitOps with the operator framework and concept.
If we go a bit deeper into the details, for the end user it looks like this. The user goes to GitHub and changes the code there. It’s carefully reviewed by some other teams, the reviewers, and once it is merged, the GitOps operator, I have flux here as an example, detects these changes. And as step number three, these changes are automatically rolled out somewhere. In this case, we have MongoDB in a separate namespace in Kubernetes, and these changes are rolled out there. In the case of MongoDB, we roll out the MongoDB operator, and we roll out the custom resource, which is basically just the declaration that the user applies. It’s just: Hey, I want to have this number of nodes in my replica set, I want sharding enabled, I want this storage, I want to expose my MongoDB cluster through a load balancer. The CR, the custom resource, is just a YAML file with all the details about the MongoDB cluster.
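To make the idea of "just a declaration" concrete, here is a small sketch of such a document as a Python dictionary, printed as JSON. The field names and values are illustrative assumptions and do not reproduce the operator's actual custom resource schema, which should be taken from the operator's documentation; `expected_pods` is a hypothetical helper.

```python
import json

# Hypothetical, simplified model of a cluster declaration. The field names are
# illustrative assumptions, not the operator's actual schema.
cluster = {
    "name": "my-cluster",
    "replsets": [
        {"name": "rs0", "size": 3, "storage": "10Gi"},
    ],
    "sharding": {
        "enabled": True,
        "configServer": {"size": 3},
        "mongos": {"size": 3, "expose": "LoadBalancer"},
    },
}


def expected_pods(spec):
    """Count the database pods this declaration asks for."""
    total = sum(rs["size"] for rs in spec["replsets"])
    sharding = spec.get("sharding", {})
    if sharding.get("enabled"):
        total += sharding["configServer"]["size"] + sharding["mongos"]["size"]
    return total


print(json.dumps(cluster, indent=2))
print("expected database pods:", expected_pods(cluster))
```

The point of the sketch is that the document only states what should exist. How the pods, storage, and load balancer get created is left entirely to whatever reads the declaration.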
And then, as step number four, Percona Operator for MongoDB detects some changes, or just detects that there is a new cluster coming in, and does all the magic: it provisions the load balancer, provisions the service, provisions the storage, everything. And again, what the user gets in the end is just the service, which just works. So the user does not need to think about Kubernetes primitives, about how MongoDB is configured, and everything else. And then the user can also have the day-2 operations on top of that. Monitoring is already up and running. So all the magic has already happened.
So let’s go into the quick demo. I’ll give you an example of how this can be done. Let me go fullscreen, and let’s start. This is a prerecorded demo, so that it goes smoothly, and I will pause the video sometimes to explain some of the steps that I’m doing there and what’s happening in detail. So here I already have something in GitHub, and I cloned the repository locally. There you see two YAML files: one is called bundle.yml and the other one is cr.yml. Bundle.yml is the Percona operator. It is just the deployment with the Percona operator container, and it also creates some role-based access control for the operator itself, basically a service account. And cr.yml is the custom resource YAML. I will show you right now what’s in it.
So this is just the declaration of what I want to deploy and what kind of MongoDB cluster I want. You see there is a replica set named rs0, and it has three nodes. It has this amount of resources: a CPU request of 300 millicores and half a gig of RAM. And here I say: okay, I want sharding enabled, I want the Config Server replica set to also have three nodes, and I want mongos, the MongoDB router, to have three nodes as well. So everything is listed here. I have the full configuration of the MongoDB cluster in this one place. It’s just a simple cr.yml where I declaratively tell Kubernetes what it is I really want. Okay, so let’s move forward.
Here I’m showing that I have a regular cluster. No magic, I provisioned the cluster just now to play with it. As the next step, I’m going to create a separate namespace for GitOps, and in this namespace I’m going to deploy flux. Flux is an operator for GitOps which monitors GitHub changes and then automatically synchronizes them with Kubernetes. I have installed the fluxctl tool with snap, which is simpler. And now I’m going to install the flux operator into Kubernetes. Here you see I have specified my email, the GitHub repository, and the specific folder in this repository which I’m going to monitor through this operator. I have also specified that I’m going to monitor the master branch. So now the operator is created. The pods are not up yet, I guess. Let’s see. Yeah, they are just starting, and I’m showing you the whole process. Yeah, now they’re up and running. The only thing I need to do right now is to ensure that flux can connect to the GitHub repository. To do that, I’m going to ask for a key, which flux is going to use to connect to my GitHub repository. This is the key. I’m going to copy it and add this key to GitHub.
In GitHub, this is a deploy key. You go into the settings of your repository, and on the left you can see deploy keys, and you just create a new key. By the way, you need write access, because the GitOps operator needs to keep its state there. Okay, so I have already added the key in GitHub. And now I’m showing you that there are no pods in the default namespace.
And let’s see the logs of our flux operator to see if it’s doing anything, because there is a reconcile loop in the flux operator which checks for changes in the GitHub repo every five minutes. I can trigger it manually, or I can just wait for five minutes. But it looks like something already happened here. Yeah, flux already detected it: the reconcile loop just ran. And as you see, there are now some pods already in my default namespace. I’m going to watch them. So you remember my cr.yml that I showed you: there were three nodes for the replica set, three nodes for the Config Server replica set, and three nodes of mongos. What I expect to happen now is that all these pods, all these MongoDB and Kubernetes primitives, are going to be automatically created without me doing anything.
Here, I’m just watching the pods being created. In the end, there should be three pods for mongos, three pods for the replica set, and three pods for the Config Server replica set, plus one pod for Percona Distribution for MongoDB Operator, so that the operator can manage my MongoDB cluster. So let’s see, the process is ongoing. The mongos pod is running, and the config server is almost up and running. Okay, one config pod is in CrashLoopBackOff. That’s normal, because there are liveness probes, and sometimes there are race conditions if the pod is not ready yet and the MongoDB cluster is not up. Okay, let’s see get pods. It’s still not ready. Yeah, this is what I was talking about, and it should come up pretty soon. Okay, CrashLoopBackOff again. Right. Okay, now I have all the pods up. And also here, as you see, I ran kubectl get psmdb. The psmdb is the custom resource object which I created, and you see the status is ready, which means my cluster is up and running. And here is the endpoint, because in my custom resource I mentioned that I want to expose my mongos pods through a load balancer.
And in Google Cloud, the load balancer object was just created automatically, and now I have the endpoint. To demonstrate that this is a working service, I will connect to this endpoint with the Mongo client, and we’ll see if it works. But first, I need to get the password. I’m getting the cluster admin password from a secret, which was also created by the Percona operator. So let’s see. I will copy the password, and I will connect to this endpoint here, which is the load balancer IP. Okay, it’s up and running. Let’s see if I can list the shards. Yes, you see, I have one shard, rs0, with multiple nodes. So it’s all working, it’s all up and running.
Okay, let’s make some changes. Let’s say that I don’t want three mongos pods, I want two now. So I’m going to change my custom resource YAML. Now I’m going to commit the changes, with a nice commit message, nice and easy, and I’m going to push the changes to my GitHub repo, to the master branch. What should happen now is that flux detects the changes in the GitHub repo. But as I mentioned, flux has a five-minute reconcile loop, so I’m going to sync the changes manually so that they are reflected much quicker. Okay, the changes are synchronizing now. And in the flux logs you can see: okay, I detected the change, I’m going to apply it. Okay, nothing to apply anymore, so everything is applied. And you see I had three mongos pods, and now I have only two. And again, I didn’t make any changes manually. What I did was change the cr.yml, create a commit, and push it. Obviously, in a production environment, in a big organization, there should be a pull request with some review process. Once the review passes, the change is merged, and then the GitOps operator just catches this change and applies it. So I guess this is it for the demo.
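The sync step in the demo, where the GitOps operator notices that the repository and the cluster disagree and applies the difference, can be pictured with a short sketch. This is not how flux is implemented; it only illustrates the comparison, with plain dictionaries standing in for the repository and the cluster. `flatten`, `drift`, and `sync` are hypothetical helper names, and the nested keys are assumptions made for the example.

```python
import copy


def flatten(doc, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} so documents compare key by key."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def drift(in_git, in_cluster):
    """Return {path: (cluster_value, git_value)} for every setting that differs."""
    git_flat, live_flat = flatten(in_git), flatten(in_cluster)
    return {
        path: (live_flat.get(path), git_flat.get(path))
        for path in sorted(set(git_flat) | set(live_flat))
        if git_flat.get(path) != live_flat.get(path)
    }


def sync(in_git, in_cluster):
    """Git is the source of truth: report the drift, return the new cluster state."""
    changes = drift(in_git, in_cluster)
    for path, (old, new) in changes.items():
        print(f"{path}: {old} -> {new}")
    return copy.deepcopy(in_git), changes


# The change from the demo: three mongos pods in the cluster, two declared in Git.
in_git = {"sharding": {"mongos": {"size": 2}}}
in_cluster = {"sharding": {"mongos": {"size": 3}}}

in_cluster, changes = sync(in_git, in_cluster)
# A second pass finds no differences, so there is nothing left to apply.
in_cluster, changes = sync(in_git, in_cluster)
print("remaining differences:", len(changes))
```

Running the comparison on a timer is what gives the five-minute delay mentioned in the demo, and triggering it by hand is the equivalent of the manual sync.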
And again, I just want to get back to this picture. This is the end state of what the end user gets from GitOps and the MongoDB operator. The user makes the change in GitHub. The flux operator, which is the GitOps operator, catches this change and applies it as a regular Kubernetes primitive, as a regular Kubernetes custom resource. Then the Percona MongoDB operator detects the change and applies it, and the database is provisioned. The user does not know anything about what’s happening here. What the user cares about is that the change in GitHub is reviewed and merged, and that the user gets the service. So the user just connects to the database, which just works, with no complexity, with no visibility of what’s going on inside, and with all the day-2 operations and all the magic included. Okay, I hope it was useful for you. If you have any questions, please contact me on Twitter or by email. We’ll be glad to help you. And feel free to contribute your ideas and code to our operator. Thank you very much.