Open Source Database From Apache Cassandra, HBase to Yugabyte - Percona Podcast 17

by Karthik Ranganathan, Matt Yonkovit

Link to listen and subscribe: PodBean

Karthik Ranganathan, CTO at Yugabyte, knows databases inside and out, having been on the team that first built Apache Cassandra, helped optimize and scale HBase, and most recently built Yugabyte. What insights does he have from participating in these efforts? He sat down with Percona’s HOSS Matt Yonkovit to talk through what he learned, what he regretted, and how Yugabyte takes those lessons and implements them.



Karthik Ranganathan

Founder and CTO, Yugabyte

Karthik was one of the original database engineers at Facebook responsible for building distributed databases including Cassandra and HBase. He is an Apache HBase committer, and also an early contributor to Cassandra, before it was open-sourced by Facebook. He is currently the co-founder and CTO of the company behind YugabyteDB, a fully open-source distributed SQL database for building cloud-native and geo-distributed applications.


Matt Yonkovit

The HOSS, Percona

Matt is currently working as the Head of Open Source Strategy (HOSS) for Percona, a leader in open source database software and services. He has over 15 years of experience in the open source industry including over 10 years of executive-level experience leading open source teams. Matt’s experience merges the technical and business aspects of the open source database experience with both a passion for hands on development and management and the leadership of building strong teams. During his time he has created or managed business units responsible for service delivery ( consulting, support, and managed services ), customer success, product management, marketing, and operations. He currently leads efforts around Percona’s OSPO, community, and developer relations efforts. He hosts the HOSS talks FOSS podcast, writes regularly, and shares his MySQL and PostgreSQL knowledge as often as possible.



Matt Yonkovit: Hi, everybody, I’m here with Karthik Ranganathan from Yugabyte. Thank you, Karthik, for being here. You’re the founder and CTO of Yugabyte. So that’s an exciting role for you, isn’t it?

Karthik Ranganathan: Yes, yes, it is, Matt. Thank you for having me. Yeah, it’s been fun at least. Yeah, by way of quick background. I’ve been in FB most of the time and yeah, having the blast of my life.

Matt Yonkovit: Yeah, I looked at your background, and you go way back. So you’ve been at Microsoft, you’ve been at Facebook, you’ve been at Nutanix, you’ve been a couple of other places here and there. I mentioned to you earlier that I saw that at Microsoft you were part of the Windows Vista team, which is a whole exciting thing. And you did the wireless features, so for any Wi-Fi problems, we can call you right up if someone’s still using Vista somewhere?

Karthik Ranganathan: Well, you would hope so. But then tell them to go to somebody else.

Matt Yonkovit: That’s okay. So going from Microsoft to Facebook, that’s got to be a pretty big jump. That’s a big change. Drastic, wasn’t it?

Karthik Ranganathan: It was, actually. It was, I think, primarily because I’ve been in infrastructure throughout, and I was in infrastructure at Facebook. At Facebook, my first project ended up getting open sourced and becoming what we now know as Apache Cassandra. So we started building a distributed NoSQL database, and there were neither distributed nor NoSQL databases, I mean, there were no real distributed open source databases back then; this is 2007. But the primary difference was, I guess, this cloud native world and the world of exploding data. The world of networking is distributed, but in a slightly different sense, right? In a sense, you’re used to thinking about high performance and how you get stuff across the wire and delivered with guarantees, and so on. But in the world of distributed data, there’s a notion of persistence to it: it needs to always work, it needs to heal itself, it needs to just be easy to operate, and so on. So it’s just a slightly different book. But it has been a very exciting journey, right from the start through my entire stay at Facebook.

Matt Yonkovit: Well, so when you were at Facebook, so Cassandra has exploded, right? So did you even think that it could get as big as it did?

Karthik Ranganathan: Actually, at Facebook, the three of us working on this database just called it peer storage, because, you know, we didn’t want to be very inventive about naming it. It was just an inside project. And the origin of the project was Facebook messaging. This is pre-Messenger, the Messenger that we know now; it was almost email-like, kind of sending messages back and forth between people on Facebook, the site, and there was no way to search. And a lot of people were saying, hey, if only we could search our messages. And we didn’t want to put the search index into our RDBMS. We were using sharded MySQL back then, and it would get operationally complex, and the dataset sizes would be huge, because every message that everyone sends to everyone else would have to be reverse indexed. So we said, you know what, let’s build this database that works on commodity hardware, and we had a few decisions to make. And we said availability was more important than consistency when the CAP theorem was taken into account. And that’s because, you know, it’s messages, it’s fine. Not too many people search messages, right? And if you’re missing some messages, you could always complain, and we’d go back to the message store and reindex it, right? So we ended up building this database called peer storage, and the job of naming the database fell on me because no one else could be bothered enough. So I said, let me name this thing. And I thought, any database anybody’s building has got to be named on an oracle theme, because Oracle’s like the king of databases. So we said, okay, what’s the stuff around oracles that we can name it after? Delphi was already taken, so that was out. And so, in doing not-so-extensive research, the next famous oracle that anybody had heard of was Cassandra. 
Like, I mean, it turns out no one listened to what she said, but hey, what the hell, it still makes a good story. So we said, let’s name it Cassandra. So we named it Cassandra, and then we said, let’s open-source it, and it just took off. And I’m like, oh, man, we should have named it a little better, but now it’s too late.

Matt Yonkovit: One of those regrets: if you could ever go back, you’d name it something different, right? Well, there was a time, though, when Cassandra was out there and it was not as polished. It took a little while. I remember when it was open-sourced; I was at MySQL, or maybe just at Percona. It was like, ooh, what’s this? This is interesting. And then it really took a few years. And then you started to see some of the companies like DataStax start to form around, can we do a commercial offering? You also started to see, from what I’ve heard, and I’ve talked to some Facebook engineers, that there was some divergence, because what you were running internally was very different from what ended up externally as the Cassandra we know now. So it kind of took on a life of its own and moved in different directions. And so, other than the name, any big regrets, things you wish you had done differently, looking back and knowing what you know now? Like, oh, if I had only done this one thing, that would have been so much better?

Karthik Ranganathan: I think, well, there are many things we could have done differently. But honestly, even with the naming, I think the need of the hour is all you know, right? So no regrets as far as what we ended up doing with respect to the database. Obviously, nobody could have predicted the rise of the database, the level of popularity it would get, and the number of projects it would power; there was just no telling that in the beginning, right? But it was built for a specific set of purposes: easy to use, easy to deploy, easy to run, and highly available. And to this day, it does its job really well when it comes to those things. There’s no regret as such, I guess. But if you think about it many years later: at that time, we couldn’t deal with data across more than a single node, right? Not really in any way other than sharding your application, which is a huge undertaking. So it was extremely successful in getting people to go from, I can only deal with data in one node and beyond that I have a world of complexity, to, there are ways to deal with data with reasonable performance across multiple nodes, even if you have to give up a lot of access patterns. So it was very successful in doing that one thing. I think that was the aim of NoSQL, because back then there was a distinct Big Data aspect to data, where it wasn’t really transactions and it wasn’t always really user facing. And so I’d say the entire crop of databases that grew at that time, the NoSQL databases as we know them now, like Cassandra, and HBase, which I subsequently worked on (I’m an HBase committer as well), and Mongo, which came up around the same time, and Dynamo shortly after, they were all focused on handling large amounts of data. But they were the building blocks for what would happen in the future, which is people wanting to build transactional apps and dealing with transactional big data, right? That comes later. 
So no, no regrets. I think it did really well. I mean, one of the things I would change if I could is to not write it in Java. I didn’t know back then that SSDs would come up and really make GC a problem. I also didn’t know back then that the cloud would become a thing and large memory heaps would come up, and we’d have to deal with that. I mean, there was just no way of knowing that. And again, at that time I definitely didn’t see, didn’t think about, the transactional big data front, where consistency would be more important. Anyway, these are all good lessons learned. But I think it did its job really well.

Matt Yonkovit: Well, so just staying on this thing for one more second, and then I want to jump over to HBase, maybe a little RocksDB, and then we’ll get into what you’re working on now. I’m curious. You mentioned that, hey, it’s doing one thing really well. And as I’ve seen the growth of MongoDB, as I’ve seen the growth of Cassandra, as I’ve seen the growth of Redis, all of these projects start out doing some very focused, awesome things. They plug into environments really well, and they do the thing that they’re purpose-built for. But we see this trend where we’re now evolving and adding more and more to these databases, where they are no longer purpose-driven; they’re becoming more general use. What do you think about that? Because there are competing philosophies: you want more features, versus, look, we want to keep the bloat as small as possible and stay very focused on performance and what we’re good at. And so I’m curious where you fall on that.

Karthik Ranganathan: Yeah, having seen the spectrum on both sides, I now feel like the direction the world will go is to get a somewhat integrated database for a majority of the workloads, and for very specialized workloads, to start getting specialized in terms of databases. Having to build a general-purpose application with very specialized workloads is extremely hard on the application developer side. So I think the bottom line comes down to who you are asking the question. Are you asking the builder of a database? Or are you asking the builder of an application or service? And you’ll get two very different answers. Back then, back when we built NoSQL, the question was really from a database builder’s point of view, right? Somebody needs to store a lot of data. It doesn’t fit in a single node. What do you do? And the database delegates and the folks in the infrastructure team used to work really closely with the application folks at Facebook, trying to help them figure out how to shard data and how to store it. It was really a lot of pain, because the app team and the infrastructure team would have to work as one unit to make sure the data was distributed, replicated, highly available, and all of those things. So Facebook, being Facebook, had very large workloads, right? The footprint is just huge: billions of operations, petabytes or terabytes of data. Everything is huge. So the specialization made sense, because if you could do a narrow thing really well, you would save a lot of pain and you’d get a lot of return on your investment. And so databases like Cassandra and HBase all grew from that need, right? To make a simple workload that’s very specific, but happening all over the place at huge scale, become very efficient. That was the first stage of growth. 
Now, once this made its way into the enterprises and developers at large, with everybody trying to build applications on it, it quickly brought about: okay, I know how to handle the scale aspect, and I know how to handle queries and the transactional aspect, but why do I have two things to deal with? And that kind of builds up over time. Pretty soon, you see that even the transactional side is growing, and you need scale on the other side. And so you’re like, wait a minute, if I need scale on my transactional side, and I need scale on my scale side, then I should just have one database that does transactions and scale, while not giving up the scale portion. And so now you see databases getting more complex and handling more things, right? But the same architecture doesn’t hold. The architecture that we built for scale only, with watered-down transactions, fundamentally has to be very different. The complexity increases when building the database, but the value increases when using the database. And those are the two points of view.

Matt Yonkovit: Okay, yeah. And right now, it’s interesting, because we’ve talked with a lot of companies that use open source, and it’s very unusual to find one that has just one database, right? They’ve got like 10, 12, 13 different ones. They’ll have Cassandra, MySQL, Postgres, all these different RDBMSs and NoSQL systems sitting side by side. And it does create a bit of a logistical nightmare for some of their operations folks. So there is always this question: do you add more features or not? Do you use the databases that you’re comfortable with? And then you have the whole people factor as well, because as you bring in new engineers, as you bring in new infrastructure operations folks, sometimes they’re like, oh, I hate that database, I’m gonna move everything to this other one. And you see that especially on the executive level as well, where a new exec team comes in, and all of a sudden you go from, we’re gonna do everything on Kubernetes, to everything in the cloud, or, we were in the cloud, now we’re gonna do everything on Kubernetes. And you get this waffle effect quite a bit. It’s an interesting market space that we’re in right now. And so, speaking of other databases: at Facebook, you also worked on the HBase project. Now, I know you do have RocksDB in the core. Were you also working on Rocks?

Karthik Ranganathan: We were working closely with the team building Rocks. So, okay, the origin of Rocks is actually quite interesting. We tried to experiment with running HBase. So Facebook has user data, which was on this sharded MySQL tier, right? And MySQL used InnoDB as the storage engine, and InnoDB is based on a B-tree storage architecture. You want to minimize the number of seeks on disk, because traditional disks couldn’t offer too many random seeks, and a B-tree-based architecture really makes that a lot better. Most of the traditional RDBMSs are built around B-trees, like Postgres and MySQL and so on. Now, with SSDs coming up, what really happened was that an SSD can internally give you a lot of random I/Os, because how you translate your random I/Os to sequential I/Os is done by the SSD layer, right, below the line of what you see in the software. So that means you no longer needed to restrict or minimize the number of IOPS. You now needed to think about how many parallel IOPS you can do, and how sequentially you can lay out data so that you can utilize the device better. So that meant that a log-structured storage model is far more conducive to an SSD than a B-tree-based storage model. I mean, I’m going all the way down to the guts of storage.
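
For readers who want the log-structured idea in concrete form, here is a minimal Python sketch. It is a toy in-memory model under stated assumptions, not the design of RocksDB, HBase, or InnoDB: the class name `TinyLSM`, its methods, and the `memtable_limit` parameter are all hypothetical. Writes land in an in-memory table, and a full table is flushed as one sorted run, which stands in for a single sequential write instead of in-place page updates.

```python
import bisect


class TinyLSM:
    """Toy log-structured store (hypothetical, for illustration only)."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # most recent writes, held in memory
        self.runs = []       # immutable sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Stand-in for one sequential write of a sorted file.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search newest run first so the latest value for a key wins.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

In this sketch a read may have to check several runs, which is the usual trade-off of the log-structured approach: cheaper, more sequential writes in exchange for extra work on reads.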

Matt Yonkovit: But that’s okay because we love that too. Right.

Karthik Ranganathan: But popping up a level, right, what does this mean? This meant that for the memory-to-disk ratio of what you needed in order to serve your workload, you needed a lot more memory for every gigabyte or terabyte of storage, right? So you could do maybe 1:4 or 1:5, or maybe even 1:10 if you pushed it, but you couldn’t do much better than that, which means for every 10 units of storage, you need one or more units of memory, of RAM, and that made the bills very expensive. Whereas with HBase, what we noticed was that going with the log-structured storage format, we could do 1:50, 1:100, 1:1000, depending on the workload. So that clearly made a case for log-structured storage, LSM storage, being the way to store. Now, obviously, we just tried running HBase on the SSD tier. And HBase was also written in Java, so you can add that to the list of regrets. But anyway, we didn’t know back then. Again, I think Java can be tuned reasonably well to run on hard disks, but it’s very difficult to tune it to run well on flash devices, SSD devices, which are much faster. You start hitting your garbage collection problems a lot quicker, and you have larger memory heaps, which get harder to deal with. So the upshot is, we found that we couldn’t run HBase as it is inside that tier. So we needed a storage engine underneath MySQL that would behave like an LSM, right, with what we learned from HBase. HBase was in Java, so you couldn’t just pluck out the code, and MySQL was in C, so you needed something that works with C++ as a storage engine to replace InnoDB and work the way HBase did, with the rest of the replication and architecture already in place, because there was a huge operations team already expert at running MySQL at scale. So that kicked off the HBase, sorry, the RocksDB project, where LevelDB was taken as a starting point. 
And then we added a whole bunch of concepts from HBase and built that storage engine. So we were connected to the work; we didn’t work on it directly. But anyway, RocksDB was then converted into a storage engine for MySQL called MyRocks, which really made it very efficient to run MySQL on SSD devices, right? So
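
The memory-to-storage ratios in this answer translate directly into RAM per node. The short Python sketch below works that arithmetic through; the 10 TB dataset size is a made-up figure, and the ratios passed in are illustrative readings of the numbers mentioned in the conversation, not measurements.

```python
def ram_needed_gb(storage_gb, storage_per_ram_unit):
    """RAM needed when one unit of RAM can serve `storage_per_ram_unit` units of storage."""
    return storage_gb / storage_per_ram_unit


storage_gb = 10_000  # hypothetical 10 TB of data

# Ratios are assumptions taken from the discussion: roughly 1:10 at best
# for a B-tree engine, 1:100 or 1:1000 for a log-structured engine.
for label, ratio in [("B-tree, 1:10", 10), ("LSM, 1:100", 100), ("LSM, 1:1000", 1000)]:
    print(f"{label}: {ram_needed_gb(storage_gb, ratio):,.0f} GB of RAM")
```

Under these assumed figures, the same dataset needs 1,000 GB of RAM at 1:10 but only 100 GB at 1:100, which is the cost argument being made for log-structured storage.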

Matt Yonkovit: Right, and we actually use MyRocks in the Percona Server, so Oh, there you go. There you go.

Karthik Ranganathan: So that’s the origin of RocksDB. Now, coming full circle: it wasn’t one, we had two databases in open source, Cassandra and HBase, both written in Java, both of which we knew intimately, and both of which we could have used a lot of building blocks and code from when we started to build our database. And we knew exactly that we wanted to make it transactional. There were significant differences, but there were also large areas that would be similar. However, we said, you know what, for better or for worse, we’re going to stick to C++. We don’t want to go down the Java path; by and large, high performance is important. So we’re absolutely on the other side: let’s do it in C++. So that meant, hey, the RocksDB building block would be critical for us. And so we started with that.

Matt Yonkovit: Okay, great. Yeah, and I remember those times, because I grew up as a DBA in the pre-SSD days and went through the SSD changes, and it was almost overnight that you started to see these pop up. Actually, in my closet, I have one of the first SSDs that was produced. I think it was like 16 gigs, and it cost like two grand. I got MySQL to pay for it at the time, because they were like, oh, I’m gonna test this, you know, to benchmark. But it’s so interesting, because sometimes you go back and you regret design decisions. You’ve mentioned Java a couple of times. My favorite thing in InnoDB, when you go into the core, was back in the early days of InnoDB, in the 4.1, 5.0 series. Obviously, nothing was optimized for SSD back then, because it didn’t exist. And Heikki, who created InnoDB, actually wrote this comment. It’s my all-time favorite. I like to bring it up as much as I can because it’s just funny. He actually put a cap on storage. And he put: I’m capping I/Os at 100 I/Os per second because that’s all a modern disk can do. And he wrote it in like ’95, right? He was right.

Karthik Ranganathan: Or wasn’t?

Matt Yonkovit: Yeah, I mean, yeah. And that’s the thing: you start with single-core machines, and then you get multi-core, and then you get hyperthreading, and the advances continually make you jump and allow you to do new things that you didn’t plan for before. And I think that’s one of those really interesting movements in the market. I mean, the fact that storage is so cheap now, and even fast storage is so cheap, has really brought down the requirements for those beefy boxes that you mentioned, because it used to be: get as much RAM as possible, load it up, and you want all your hot data in memory.

Karthik Ranganathan: Like it doesn’t matter what and look at that, just do it.

Matt Yonkovit: Yeah. Well, but you look at the architectures of some of the NoSQL systems, some of the classic ones, and they still are not 100% optimized for SSD. And the more that, you

Karthik Ranganathan: They’re not. In fact, we built Cassandra and HBase pre-SSD or so, and it’s the same with MongoDB. None of them are optimized for SSD as such. They will adapt to use SSD, but they’re not built for SSD, right? So I can talk about the exact things in these databases that will... yeah, anyways, it’s just a detail. But yeah,

Matt Yonkovit: Well, no, but it’s super interesting, because you think with all this modern hardware, with all these modern advances, that there would be adjustments and changes, but a lot of it ends up being architectural decisions that were made back in the very, very early days. And not all of them are easily fixable, like Heikki’s 100, which is hardcoded, where you could just make it a variable or something, right? You know, most of them aren’t that easy.

Karthik Ranganathan: No, then I think they’re very fundamental. Yeah.

Matt Yonkovit: Yeah, and that’s where I think it’s really great to harness the power of the community for a lot of this work. I can tell you, for instance, that MySQL during that 4.1, 5.1 timeframe, when they were still an independent company, really had a hands-off approach to InnoDB, because InnoDB was owned by Oracle. And so it was the community that brought things up. Now, you mentioned RocksDB. For instance, Mark Callaghan was at Google at the time, and he was like, let me go ahead and fix those things that are broken in InnoDB. And Peter and Vadim over at Percona, where I am, were like, oh, let’s fix those things. And there was a lot of community-led effort to fix and overcome the bottlenecks. And that’s why it’s super important. Now, for you, was the decision to open source Cassandra easy? Was it just like, this is what we’re gonna do? Because coming from Microsoft, at the time, Microsoft was like, oh my god, open source, right? Back in the early to mid-2000s, when Steve Ballmer was there, he famously said Linux is cancer. So open source really wasn’t in the Microsoft lexicon, but going to Facebook, it was a lot different. So, I mean, did that strike you as odd?

Karthik Ranganathan: No. I think maybe it had to do a little more with my own personality, or maybe it had to do with the fact that I hadn’t been fully indoctrinated in the Microsoft ways. I was still an engineer at heart more than anything. And there were a number of us back then who were like, okay, fine, that’s what the company wants to do, but that’s not necessarily how we think about things. So it wasn’t too hard, actually. And at Facebook, I think, even at the time (I’m not sure about now, it’s been a few years), there was a distinct thought process towards open sourcing. The reason we didn’t open source more is that it was sometimes more work to open-source something, because it was so integrated with the Facebook pieces of technology. And Cassandra had been built without that level of integration. I mean, there was still some integration, we used Facebook’s version of Thrift and a whole bunch of stuff, but at least it was small enough that we could open source the dependencies with it. And, you know, no piece of infrastructure can be static, right? It has to evolve with changing libraries, changing hardware, changing everything. So it’s great that the community formed around it and kept pushing the limit. It is what it is today because of a lot of people. But it was easy for us to do, and there was a good amount of encouragement to do so. And I didn’t know what to expect. I mean, to be honest, I really didn’t expect what happened afterward, which is awesome; I mean that in a very positive way. But it was still exciting to see some interest from outside, to have some people talk about it, and to have a bigger group to work with on that. So that’s really been a good journey. It’s been very fruitful.

Matt Yonkovit: So you left Facebook, you went to Nutanix for a couple of years, and then you got an itch to create a new database, a new company. I mean, what inspired you then to say, okay, I’m going to jump in and create a new open-source database, something that ended up as Yugabyte? What was that problem? What did you see? What was the lightbulb moment?

Karthik Ranganathan: Yeah. So, okay, maybe I’ll give you a little bit of the inside story that isn’t known externally. But anyways, so myself and my co-founder... I love inside stories. It’ll just be between us. Okay, great. Okay, great. It’s good to be nice, but we’ll just do it on the podcast anyway. So, Dheeraj at Nutanix, he was one of the founders and the CEO for a long time at Nutanix. He and a couple of others from Lightspeed reached out to Kannan, my co-founder at Yugabyte. Kannan and I were working together on HBase and the next generation of HBase. We were going to build a database that we codenamed HydraBase internally, and we were going to build this for what looks closer to a modern cloud architecture, right, and in C++. We were just starting this work. And the longer-term ambition that the two of us had was to start a company around making databases easier, because we felt that the database world had diverged. Even back then, this is back in 2013, when we both joined Nutanix together, our thought process was that the world actually needed a new database, because there were enough things that were different. We talked about SSDs. We talked about the cloud. There’s a whole other thing where, back then inside Facebook, you had what used to be called nearby and faraway data centers, which are what are called zones and regions today; cloud terms weren’t as well known back then. But anyway, how do you architect a database for this type of architecture, where replication is built in and you can actually offer high availability and go after consistency? So these are some of the things we were starting to see as patterns. There were also a number of operational things that you could make easy. There’s also a whole journey that you have to keep building along on the database side in order to handle more data and more IOPS at scale. 
So a number of these things had come into the work that we were doing with, firstly, Cassandra, and then HBase, and this next database we were building, and we wanted to bring that to the enterprise at large. At Facebook, it would always be specialized access patterns, and rightfully so, because that was the need of the hour. But those types of access patterns are the same friction that you see between using an Apache Cassandra and what the enterprises truly want. There are some areas where it’s very powerful, but there are some areas where it’s deficient, right? And you constantly see people struggle with this: what kind of databases can I use to make the data layer whole? That’s really what they’re trying to do. So we wanted to go do that. But Dheeraj made a very compelling point. He was like, hey, you guys, it’s still in the distributed storage area, even if it’s not databases, and you guys know technology, but learn about building a company, learn about enterprises, and, you know, what a product company does. And yeah, it seemed like the right thing to do. We said, hey, we’ll probably stay for a year or two. We ended up staying three years. It was a blast; we actually had a lot of fun. But the calling was always to go back and build a database. That was the long-term plan anyway. So then, in 2016, we said, you know what, we’re going to go talk to people and go chase this dream of ours, a longtime dream of ours. So we said, okay, let’s go talk. And when we talked, it was actually an interesting landscape back then. The tech companies and the forward thinkers were all thinking about how they move to the cloud, and they were a small fraction. And they were thinking of only a small fraction of their workloads, because anything else was too much work, and no one wants to think about inventing work for themselves, right? 
And the enterprises were like... Analytics, like Snowflake, was just starting to get some traction; this was 2016. And analytics was clearly headed to the cloud, and enterprises were okay with that. But systems of record, critical data, and OLTP systems were not, right? When we talked to enterprises, most of the people told us they were thinking about it, but it was not in their roadmap to go to the cloud. And the journey has been interesting to see, because based on our experience inside Facebook, it felt like the breadcrumbs were leading to a place where the data tier has to become cloud-native. I mean, we’ve seen Kubernetes come up, we’ve seen microservices come up, we’ve seen zones and regions and public cloud architectures come up. And every one of those things had happened inside Facebook a few years before. So it felt like that was the time, that in a few years the data tier would pick up. So we said, this is the time, let’s build it. Let’s build it the way we would have built it at Facebook, with all of this hindsight-is-20/20 kind of learning, but also in a way that the enterprise and people at large can use. Because having built both Cassandra and HBase, we had a number of people reach out and say, hey, can you add a secondary index? Can you give me a simple transaction? And it’s like, yeah, but how about if it doesn’t work all the time? Does that work for you?

Matt Yonkovit: Yeah. So it’s not good to have eventual consistency, or inconsistency, there.

Karthik Ranganathan: The design point was just different. Adding something like this would just be too much of a lift. It fundamentally changes the system at its core.

Matt Yonkovit: Yeah. Well, you guys have built something that’s a little unique, because you have stayed true to the Apache 2.0 license. We’ve seen a lot of other companies start to shift, whether it’s SSPL, or a lot of people went AGPL. But you have stuck with Apache 2.0. You chose that, what, four or five years ago now? Is that something you’ve felt pressure to change?

Karthik Ranganathan: No, actually. I mean, there’s the theory and there’s the practice. There’s always some level of worry about whether you got the theory wrong, but when you see it work out in practice, it just makes it a lot easier. And the decision, by the way, was not taken lightly; a lot of thought went into it. In 2016, when we started the company, we didn’t know if we wanted to do open source or not, because there was a lot of confusion. We are proponents of open source, having built multiple open-source databases, worked in the community, been committers, and so on. But there was a lot of caution from the business side, from VCs and a whole bunch of other people, about really evaluating whether open source is a viable business model. And 2016 was a bad year for that; it was unclear, because none of the open-source companies had really broken out yet. So we said, you know what, this is a one-way street. Once you open source, you can’t undo it; going back on it is almost like a breach of trust. And we still wanted to start the company. So that put us in the only possible spot, which was: let’s keep working on the product and be ready to open source it. Let’s behave as if we are open source without actually open sourcing, and let’s evaluate and understand the market. If you take out the first six months of just setting things up and getting things ready, it took about a year for us to get our bearings on what we wanted to do and how we wanted to go about it. And one thing was clear from talking to our customers early, which is that people wanted the transparency of an open-source database, specifically in the OLTP segment.
And you know, databases like Oracle are amazing, but people don’t really know how it works, or what it does, or what they pay for, or what the alternative is. There are a lot of those types of issues. So we said, we are going to go open source. But the fear that Amazon would take it and run it existed back then. So we said, we’re going to keep about 80% of our core database features open, and 20% are going to be enterprise features, things that only advanced enterprises would need. And the other big piece of the business we had to think about was: what if Amazon builds the remaining 20%, runs it, and puts us out of business as we get traction? Amazon or anybody else; I’m just picking on Amazon, but anybody could do that. Our realization, or at least the theory, was that people pay not for the database features; they pay for the ability to consume it. This is what Red Hat had shown us. It’s not the features in the operating system, it’s about making it enterprise-grade, and they built a really successful business out of this. So our theory in 2016 was that we need to think not just about the database but about the consumption of the database. And it wasn’t a big secret that the world would eventually move to DBaaS, but it wasn’t clear when. So we said, you know what, we’re going to strike the balance between the two sides and build something completely radical and unique. We’re going to build a software-defined DBaaS: one that we can run when the time comes, but also one that enterprises can run themselves, and that sits outside the database. So we make the database mostly open source, and a small percentage of the database features will be held closed, primarily to guard against the Amazon threat. And we build a complete product on the side, which is the software-defined DBaaS.
So you could just get a turnkey deployment and compete with Amazon’s managed services like Aurora, if you wanted to. That was really the vision that we started executing on.

Karthik Ranganathan: And we launched the company at the end of 2017. And we said, for our license, we’re not picking AGPL, because, hey, Mongo was AGPL, and it still didn’t stop Amazon and Azure from having a competing Mongo service. And Elastic was another one that changed their license. So changing licenses doesn’t really stop people from competing, right? One way or the other, it’s going to happen. So you’re better off offering what customers want. The reason customers adopt Amazon is: I want a service that’s easy to consume, but the maker of the software doesn’t have it, so where do I go? I go get it wherever I can. Amazon’s offering it, so I’ll take it from them. So it’s more about the dramatic growth of the cloud, where everybody missed the boat on building a managed service on it. Amazon was simply trying to make their own cloud more compelling, because what’s the use of a cloud without any services? So they said, okay, let’s put the useful services there. That became a really big business, and everybody else missed out. And now it feels like it’s strip mining open source, and whatever. So we said, hey, let’s do the simple thing. Let’s have an offering in the cloud that can compete with the cloud vendors. And additionally, you can take this on-prem, or go to any cloud, or run it yourself. With that kind of value proposition, why would somebody go to Amazon specifically, when they’re able to get everything from the makers of the database? We felt that was a good enough, well-thought-out reason, so let’s go with that theory. And a year later, we saw that most people were stickier to our platform product, the software-defined DBaaS, than to the database.
And we said, you know what, this is the last stop: we’re going to remove the restriction and make Yugabyte as open source as Postgres. Because we were also hearing from the Postgres community: hey, you’re supporting Postgres, so how come you’re not as open source as Postgres? You’ve got to be as extensible and open. And we said, yeah, holding features back was never the intention. It was done just for business viability, and it’s not even affecting that, so why not just make it completely open? And that has really led to a dramatic inflection in adoption, and even the commercial results have been great.

Matt Yonkovit:
And you mentioned Postgres. That’s compatibility, right? You’re not building anything on Postgres itself; it’s just compatible because it uses the same SQL language?

Karthik Ranganathan:
It has grown to be more than that. I think Yugabyte is a combination of the upper half of Aurora and the lower half of Google Spanner, so to speak. To explain why: we reuse the Postgres code, feature for feature and bug for bug, for the upper portion of the database. So the query engine is actually vanilla Postgres code. When the application hits the database, it’s actually talking to the Postgres code, which receives the query, checks it for correctness, and then builds a query plan and an execution plan. The query and execution planning is where we inserted our code and started building things to make it work well on a distributed substrate. And the storage engine is completely changed, right?

Karthik Ranganathan: So that’s built on, well, not MyRocks; it uses RocksDB, but it also uses a lot of other things. Even the RocksDB is heavily modified, because RocksDB is a single-node database, and if you want consistency across nodes, you need Raft and other things. So we pulled out the write-ahead log and the MVCC, and we use RocksDB as the storage component on a node. So the per-node storage component is RocksDB, which is also modified to make it very optimal for the access patterns coming from above, from the replication and transaction manager. The replication is built using Raft, and the transaction manager is built on top of this Raft system and enforces consistency. And the upper half is actually just Postgres. The upper half is actually an extensible API layer, so we have SQL and NoSQL, to not fight the war but embrace it and let people come from either direction. But nevertheless, the SQL part of it is actually vanilla Postgres on top. So that means we support stored procedures, triggers, extensions, partial indexes, what have you, from Postgres.
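
The role Raft plays in the description above can be illustrated with a toy model in Python. This is only a sketch of the majority-acknowledgement rule that Raft uses to commit a log entry, not YugabyteDB code; the names `Replica` and `replicate` are hypothetical, and leader election, terms, and log repair are all left out.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one node holding a copy of the log."""
    name: str
    up: bool = True
    log: list = field(default_factory=list)

    def append(self, entry):
        """Store the entry and acknowledge it, unless the node is down."""
        if not self.up:
            return False
        self.log.append(entry)
        return True


def replicate(entry, replicas):
    """Return True if a majority of replicas (leader included) stored the entry.

    Entries stored by only a minority remain uncommitted in this toy model.
    """
    acks = sum(1 for replica in replicas if replica.append(entry))
    return acks >= len(replicas) // 2 + 1


# Three replicas, one of them down: 2 of 3 acknowledge, so the write commits.
one_down = [Replica("a"), Replica("b"), Replica("c", up=False)]
committed_with_one_down = replicate("row-1", one_down)

# Three replicas, two of them down: 1 of 3 acknowledges, so it does not commit.
two_down = [Replica("a"), Replica("b", up=False), Replica("c", up=False)]
committed_with_two_down = replicate("row-1", two_down)

print(committed_with_one_down, committed_with_two_down)
```

With three replicas the group keeps accepting writes when one node is lost, which is the kind of cross-node consistency a single-node storage engine cannot provide on its own.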

Matt Yonkovit:
Okay, and that makes it very easy to port over, then.

Karthik Ranganathan:
That’s right. For porting over, the mental map we give is that most applications built for a single node will not work well on a distributed substrate. That’s just the laws of physics, right? So you have to think about a few things. But you have two options here. You can rebuild your application inside out. Or, in Yugabyte’s case, you can get your application to run sub-optimally on a distributed system as step one, and then go look at what the slow queries are and what adjustments and tweaks you have to make, which is not a whole lot, to get it to work well on a distributed system. As an example, you may be using an auto-incrementing ID. It’s called the serial data type in Postgres, where you insert a row and it gives you an ID of one, you insert another row and it gives you two, right? It just works as-is on Yugabyte, except that by default, for every ID that’s issued, Postgres would go somewhere and fetch another ID. That means every insert now incurs an extra RPC, which means your insert performance is really going to suck. So the only change you need to make is this: Postgres actually allows you to modify the session or the table to assign these auto-increment IDs in batches to all of the nodes. So you just say, hey, alter it and set the cache size to 25 or 100. And now you’re suddenly making only one call for every 100 inserts on each node, and your application performance goes up dramatically. So that transformation is a lot easier. Whereas if you go to any of the other distributed SQL databases, the answer is, hey, you should use a UUID. And now you’re left reasoning about a lot more complexity.
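
The batching idea described above can be sketched as a small Python simulation. It is not YugabyteDB or PostgreSQL code: `SequenceServer`, `NodeSequenceCache`, and `rpcs_for_inserts` are hypothetical names, and the "RPC" is just a counter. In PostgreSQL itself, the corresponding setting is the sequence's `CACHE` option, for example `ALTER SEQUENCE orders_id_seq CACHE 100;`, where `orders_id_seq` is an assumed sequence name.

```python
class SequenceServer:
    """Hypothetical stand-in for the node that owns the sequence."""

    def __init__(self):
        self.next_value = 1
        self.rpc_count = 0

    def allocate(self, count):
        """Hand out `count` consecutive IDs in one simulated RPC."""
        self.rpc_count += 1
        start = self.next_value
        self.next_value += count
        return range(start, start + count)


class NodeSequenceCache:
    """Hypothetical per-node cache that fetches IDs in batches."""

    def __init__(self, server, cache_size):
        self.server = server
        self.cache_size = cache_size
        self._ids = iter(())  # empty until the first fetch

    def next_id(self):
        # Serve from the local batch if any IDs are left.
        for value in self._ids:
            return value
        # Batch exhausted: one round trip fetches the next `cache_size` IDs.
        self._ids = iter(self.server.allocate(self.cache_size))
        return next(self._ids)


def rpcs_for_inserts(num_inserts, cache_size):
    """Return (round trips, issued IDs) for a run of inserts on one node."""
    server = SequenceServer()
    node = NodeSequenceCache(server, cache_size)
    ids = [node.next_id() for _ in range(num_inserts)]
    return server.rpc_count, ids


uncached_rpcs, uncached_ids = rpcs_for_inserts(1000, cache_size=1)
cached_rpcs, cached_ids = rpcs_for_inserts(1000, cache_size=100)
print(uncached_rpcs, cached_rpcs)  # prints "1000 10"
```

One trade-off to keep in mind: when several nodes each hold their own batch, the IDs stay unique but are no longer handed out in strict insertion order across nodes, and unused IDs in a batch can be skipped.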

Matt Yonkovit: Oh, yeah. I mean, it’s great that you’re looking at that level, especially as things move over. Because one of the issues for a lot of people now is that, unfortunately, people want the easy button; they want it to be very easy to get started. And that’s pushed a lot of what had classically been database engineering skills, DBA skills, down to the developers, who are just like: give me a table, just give me the information.

Karthik Ranganathan:
I just need to write an app.

Matt Yonkovit: Yeah. And I think that’s where it’s super interesting to see that kind of movement in the market, because we are seeing the same common issues come up over and over again. It’s like, oh, well, everybody knew this five years ago. But now it’s a new set of developers who are just used to going to Amazon or Microsoft or Google, clicking a button, and there’s a database there. And then when it’s slow, they just swipe up: now we’re going to go to the next instance size, right? And it’s a super interesting thing, because as we try to get down to the nitty-gritty, there are so many inefficiencies that we’ve built into different database layers over the years, and also into our applications. It’s a good reminder that as you port things, as you move things, not everything is always going to just move over without changes. And I think when we’re used to easy, we want easy, and that’s not always the case.

Karthik Ranganathan: That’s right. Yeah, I think the trade-off we’re trying to strike is that we don’t try to be 100% on either side. We try to give you easy until it gets expensive or complex or inefficient, at which point you can start thinking. That’s really the approach, right? So it’s easy enough to move, but if something doesn’t work well, we try to make it easy enough for you to find out why it doesn’t work.

Matt Yonkovit: Yeah. So Karthik, let me finish with this question. What have you guys got going on? What’s on your roadmap? What are you really excited about here? What are you seeing that’s coming down the pipeline that you’re like, Ooh, this is exciting. I can’t wait for this to be out there.

Karthik Ranganathan: Yeah, no, totally, I think that’s a great question. There’s a lot of interesting stuff coming. The first thing is beefing up the set of core database features. As you know, Postgres is incredibly feature-rich, and we’re still catching up on some of these features. Some of the things coming from our side are, for example, support for GIN indexes, or generalized inverted indexes, on documents and text search, and GiST indexes and PostGIS geospatial. So there’s a whole bunch of stuff on that side that’s going to make it super interesting. We also support Postgres extensions, which is amazing, at least the query layer extensions. And there’s support for more extensions planned, including the ability to read from foreign tables, which is super interesting. So that’s just on the core database side. The other interesting thing we’re keeping a close eye on: it’s super exciting that Amazon is building this SQL Server interface to Postgres, called Babelfish, and they’re building it on vanilla Postgres. And guess what, Yugabyte uses vanilla Postgres as the upper half. So this could be a really interesting way for us to offer SQL Server as a distributed database API to people. That would be another really cool thing. Of course, that’s not just the core database; it’s more about making things really easy for people. And there’s always the cloud-native world, where we keep trying to add more. Oh, and one of the other things about Yugabyte that’s unique is that we’re the only ones to do both synchronous and asynchronous replication. So we’re a distributed database that can asynchronously replicate as well. We can easily work in both the traditional world, where people want async, and the new world, where the database can be stretched across regions.
So we are beefing up the async replication capability a lot more, and that’s pretty exciting on that front. Then on the cloud-native side, including our own cloud offering, there are a number of enhancements going on to make things better and faster. And finally, I’d say that beyond the path of adding more APIs or more functionality, in a distributed database, when you have a lot of data, a lot of simple OLTP queries start looking like OLAP, even starting from a simple count. So a lot of the parallel capabilities and all of that are also going in.

Matt Yonkovit: Okay, excellent. Well, Karthik, thank you for sitting down with me today. This was a great chat. I really enjoyed hearing about your journey, some of the things that happened at Facebook, and what you’ve worked on at Yugabyte. This is great. I’d love to have you back sometime. And I see you submitted for Percona Live, so we’re looking forward to hearing what you have to say there as well. It was great chatting with you today, and I hope you have a great rest of the day.

Karthik Ranganathan: Awesome. Thanks. Thank you. Thanks, man.

Matt Yonkovit: Wow, what a great episode that was! We really appreciate you coming and checking it out. We hope that you love open source as much as we do. If you liked this video, go ahead and subscribe to us on the YouTube channel. Follow us on Facebook, Twitter, Instagram, and LinkedIn. And of course, tune in to next week’s episode. We really appreciate you coming and talking open source with us.
