OnGres, PostgreSQL, Open Source, Kubernetes, Babelfish w/ Alvaro Hernández - Percona Podcast 48

by Alvaro Hernandez, Matt Yonkovit

Link to listen and subscribe: PodBean

Matt Yonkovit, The HOSS at Percona, sits down with Alvaro Hernández, Founder & CEO at OnGres. Find out the story behind the elephant wall decoration in Alvaro’s office. Then listen as we go deep into the Postgres contribution ecosystem, OnGres, running databases on Kubernetes, and Babelfish.

YouTube

Link: https://youtu.be/Eq_O6q9OsY4

Alvaro Hernandez

Founder & CEO at OnGres

Alvaro Hernandez is a passionate database and software developer. After several unsuccessful ventures, Alvaro founded and works as the CEO of OnGres, a Postgres startup set to disrupt the database market. He has been dedicated to PostgreSQL and database R&D for two decades. Alvaro is an open source advocate and developer and a well-known member of the PostgreSQL community.

See all talks by Alvaro Hernandez »

Matt Yonkovit

The HOSS, Percona

Matt is currently working as the Head of Open Source Strategy (HOSS) for Percona, a leader in open source database software and services. He has over 15 years of experience in the open source industry, including over 10 years of executive-level experience leading open source teams. Matt’s experience merges the technical and business aspects of open source databases, with a passion for hands-on development as well as for management and building strong teams. During his career he has created or managed business units responsible for service delivery (consulting, support, and managed services), customer success, product management, marketing, and operations. He currently leads Percona’s OSPO, community, and developer relations efforts. He hosts the HOSS Talks FOSS podcast, writes regularly, and shares his MySQL and PostgreSQL knowledge as often as possible.

See all talks by Matt Yonkovit »

Transcript

Matt Yonkovit:
Hi everybody. Welcome to another HOSS Talks FOSS. I’m the HOSS, Matt Yonkovit. I’m here with Alvaro Hernández. Alvaro, how are you today?

Alvaro Hernández:
I’m very good. Thank you for having me today.

Matt Yonkovit:
Yes. And, and quite honestly, based on your wall, I think people will understand what we’re going to be talking about. So, before we go anywhere you have to tell me about the elephant in the room.

Alvaro Hernández:
Right! Right. That’s a good one. Yes. The elephant is called Slonik; I guess most of the audience may be familiar with this. It means “little elephant” in Russian, and it’s the mascot of Postgres. Some people say that one out of every three words I say is probably “Postgres”. I don’t know if that’s true, but at least in the Data on Kubernetes community, people joke that if they say “Postgres” three times, I will pop up.

Matt Yonkovit:
Yeah, it’s better than saying Jesus three times. So tell me about the mascot. Where did you get the wall decoration?

Alvaro Hernández:
This is actually very interesting. I founded (let’s say; it’s not only about me) a nonprofit organization in Spain. One of the goals of this organization is to organize a Postgres conference with a different approach: interacting more with people, promoting the hallway track, and being inspiring about the future of Postgres. Don’t worry, I’m getting to the elephant. This conference is called Postgres Ibiza, and it happens on the fantastic island of Ibiza, in Spain, in the Mediterranean Sea, in summer. It ran before the pandemic and will come back next year, 2022. And in Ibiza there is a shop that is absolutely unbelievable. I love it. It’s called The Streets. It’s so weird, so imaginative, like the weirdest things you’ll ever see in a shop. Even the parking spots have signs: this one is for crazy people, this one is for drug abusers, any kind of thing, and you need to pick one. Everything’s great in that shop. And they had this elephant. It was just white, and our design team painted it blue, Postgres blue. That’s where we got it; we shipped it from Ibiza to our offices.

Matt Yonkovit:
Excellent. I mean, it’s a very unique piece. And it does set the stage for the rest of the conversation. Right? And knowing that you’ve been in the Postgres space for many years now, maybe enlighten us, how did you get started?

Alvaro Hernández:
Well, that’s a good one. I’m here a bit by chance. I was at university around 1997. In my second year, I joined the Students’ Union, where I helped out other students, and I was tasked with building some software. At some point I said: oh, I need to store some data. How can I do this? I asked a colleague: hey, I need to store data, where do I do this? Something like a text file, I was told, was not a good option. I had heard of something called a database, but I was really not knowledgeable at the time. And my friend said: oh, then you need Postgres. What is Postgres? Okay, so I started researching Postgres: oh, it’s open source, I can install it. I started using it and never looked back. Postgres, since 1997, has fulfilled all my personal, professional, and my customers’ needs. The only times I’ve looked at other databases were either because of my personal interest in learning about databases (and I’ve tried to learn about almost all of them) or because of migrations to Postgres for our customers. Other than that, I really don’t have significant experience with other databases, because Postgres has always fulfilled all my needs.

Matt Yonkovit:
Yeah, no, and that’s great. Starting off early: I remember that’s how my database journey started as well. I’ve worked on many different databases over the years. I started off closer to the evil empire, as an Oracle DBA, and then moved toward open source a few years after college. But I think it’s interesting, because open source has been so powerful in university and educational settings. It has enabled students, enabled people, to do things they could never do before. I’ve done several projects over the course of my life, even when I was one of those evil Oracle DBAs: websites, games, different things. And I always used either MySQL or Postgres or some open source components, PHP, Perl, Python, over the years, because there wasn’t this huge barrier to entry. Back when I started, back when you were in college, if you wanted to use an Oracle database: okay, that’s $50,000. And I wasn’t the type of student who had $50,000 lying around, and I’m guessing you probably fall into the same boat. So I started with the LAMP stack, with different open source components. And it’s a powerful thing, enabling so many different types of people, at so many different steps on their journey, to do really cool and interesting things.

Alvaro Hernández:
I also got involved early with the Postgres community itself. In a way, every user is already part of the Postgres community, but I mean more active roles. I remember attending what was by then the main Postgres conference, held in Ottawa (it’s still there), called PGCon. That was probably 2009. I was attending a talk and someone said: we’re looking for speakers for next year, and you don’t need to be an expert. I was like, I’m just a bare-bones user. They said: you don’t need to be an expert; as long as there’s some topic that you love and want to speak about, you’re really an expert in that topic, and you should apply. And I was thinking, I’m not really proficient in English, and this is an excellent opportunity to try to do both at the same time: to speak at a conference, which I had never done before, and to do it in English, which maybe would help me improve a little bit. So the next year, 2010, I gave my first talk. It was a disaster, a really bad talk; it’s still out there. But apparently some people liked it. A couple of years later I did another one that was more successful. Actually, a lot of people came to talk to me after that talk; it created some curiosity. Since then I’ve done more than 100 tech talks, probably around 120 today, and most of them are around Postgres.

Matt Yonkovit:
Well, and this is an interesting thing, because there are so many people who want to contribute to the community, but they don’t feel they can because they don’t have the code-level skills: “I don’t know how to program.” They might know the database side, they might run the database, they might know scripting, but they don’t feel that they can contribute. And I think that sharing your knowledge and being able to share your experiences is probably one of the best ways people can contribute.

Alvaro Hernández:
Absolutely agree. There are actually two sides to this, I would say. One is the type of contribution, and the other is that you may feel something is not interesting even though it is for others. That second talk I was just referring to was about a project we did just for fun, called the Billion Tables Project. The idea was to see if there’s a limit to how many tables you can create in a database. Empty tables, by the way: no data, just the table structure. This was a joke that started one day over some beers, and we decided to go for it. So we aimed to create one billion tables. And this is pretty lame, right? Who would ever want this, tables without data, and what for? But at the end of the day we submitted the talk, and a colleague of mine and I presented it at the conference. Before I started the talk, I saw major contributors joining the room, and I was like, oh my God: my second talk ever at a conference, and the main Postgres people are coming to listen to me? This is a joke. Then I finished the talk, and I remember one of them, Bruce Momjian, came to me and said: wow, this is pretty impressive, all the tuning that you did, all the workarounds and all the parallelization you had to go through just to be able to create the tables. And by the way, you found limits we didn’t even know about. This is amazing. And I was struck; I was expecting it to be a pretty lame talk that nobody would ever listen to, but it was very interesting for many, and still today some people refer to it as a good example of extreme cases. So a topic you may not consider interesting might definitely be interesting for others, so don’t hesitate to share it.
The other side, and this is probably a problem in the Postgres community, is that there’s a strong bias toward praising code contributions, which are obviously awesome, while not putting non-code contributions on the same pedestal, or at least in a similar category. And those are critical for our community: organizing conferences, writing documentation, promoting Postgres on your timeline in whatever social media. There are so many things you can do to contribute to a community that are non-code contributions. That is something we’re a little bit missing.
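As an aside on the Billion Tables Project itself: its core is conceptually simple, generating and executing enormous numbers of CREATE TABLE statements in batches. Here is a minimal sketch of that generation step; the table names and batch size are made up for illustration, and the real project relied on heavy tuning and parallelization to approach a billion tables:

```python
# Sketch: generate batched DDL for creating many empty tables,
# in the spirit of the Billion Tables Project. Names and batch
# size are illustrative, not the original project's code.

def ddl_batches(total_tables, batch_size):
    """Yield lists of CREATE TABLE statements, batch_size at a time."""
    batch = []
    for i in range(total_tables):
        batch.append(f"CREATE TABLE t_{i} ();")  # empty table: no columns, no data
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# Each batch could be sent as one transaction via psql or a driver;
# at extreme counts, catalog bloat and filesystem limits dominate.
batches = list(ddl_batches(10, 4))
print(len(batches))   # 3 batches: 4 + 4 + 2 statements
print(batches[0][0])  # CREATE TABLE t_0 ();
```

In PostgreSQL, `CREATE TABLE t ();` with zero columns is valid, which is what makes the experiment purely a stress test of the catalog and filesystem rather than of data storage.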

Matt Yonkovit:
Yeah, and I think this is where it’s interesting. I see in the Postgres space a great opportunity for everyone in the community to accelerate things. From an entry-level perspective, Postgres is growing; it’s becoming more and more popular every day. But it is so difficult for a lot of developers to adopt it as easily as some of the other databases on the market. And it’s not that there’s a lack of information. But a lot of times, people who have been in the community for a long time (I’ve done this myself, as I do talks on things I’m familiar with) tend to overlook or neglect the easy things, because we assume people know them. Even if I’m not making a conscious decision, subconsciously I assume everybody should know how to do an installation, or how to configure this out of the box, or how to fill in the blank. And now we’ve got more developers in the world than we’ve ever had before. So many developers. And a lot of them learned, from the start, the easy ways to deploy code or build infrastructure: the Amazon big button, right? Spin up an instance, it’s magically there, throw some data in an object store, you’re done. So their idea of easy is vastly different from mine.

Alvaro Hernández:
Totally, totally. I consider it a notable problem. Postgres is growing; this is undeniable, and this is fantastic, and I’m very proud of it. But is it growing as much as it could? Is the developer base also growing sustainably? You already mentioned it: there are many more developers than ever in the world, and there are also more open source projects available than at any other time. And we’re always bound by our human nature, right? When you’re trying to contribute in your free time to an open source project, there are so many to choose from. And friction is potentially a barrier to attracting new contributors. If you look at the Postgres contribution process, it really, really creates a lot of friction. The first thing you see is that Postgres is not on GitHub or GitLab or an equivalent. I mean, it is, but it’s a mirror, and it says: we don’t take pull requests, go to the contributing guide, which is a wiki. That already feels a bit odd for some people: what is this wiki thing? It’s like Wikipedia. And when you go there, it says: read the TODO list. And what is the TODO list? It’s a set of items collected over the last 30 years, without any order, that you need to review before submitting anything, because there’s no issue tracker. Then, when you finally decide to create a patch, you need to subscribe to a mailing list which gets 200 emails per day. Well, you can delete them, but you really need to be there in order to share and contrast with other developers whether your patch or your ideas are good. Then, if you are finally able to submit the patch, someone is going to take it and commit it on your behalf. There are no pull requests. Someone literally takes your patch, which, by the way, you had to attach to an email, downloads it, applies it locally, and commits it as his or hers.
You’ll get a mention, though, but it will not show up on your GitHub profile. So there’s a lot of friction in this process, and maybe it’s not allowing Postgres to grow even more, or to have a more sustainable developer base.

Matt Yonkovit:
Yeah, it’s so interesting, because even over the last few years, the number of tools, and the emphasis people put on products like GitHub, means your GitHub profile makes or breaks a lot of people’s ability to get jobs and to present themselves as competent. As they look for projects to work with and contribute to, that can be a barrier, especially for the current generation of developers. For my generation (I’m older now) it was a bit different. There wasn’t the reliance on GitHub. You still used version control, whether Git or Subversion or Bitbucket or whatever your favorite tool was, but most of that was controlled by a company; you did your code check-ins to a company repository, and no one ever saw them. There wasn’t any sort of public tracking or anything else. So it is very important to that ecosystem and to new developers. I read an interesting study a couple of years ago that asked developers who had graduated in the last four or five years, relatively new developers, what their primary reasons were for choosing a job. And the number one reason in that study was the tooling they could use at the job, which is interesting. You wouldn’t think of that, but people have preferences: they like certain technologies and certain workflows, and it’s interesting how those fit in. I think that for the contribution process, making it as easy as possible is a really good thing to strive for. I think it’s also important to strive for ease of use for end-users installing the product and for developers using it, as opposed to contributing. Both of those need to be satisfied in order to accelerate growth even more.

Alvaro Hernández:
Yeah, absolutely. This last topic is quite interesting to me personally, as part of my day job: the ease of use of Postgres. If you ask anyone whether it’s easy or difficult to use Postgres, you may find the answer seems trivial. You go to a Linux box, Ubuntu or Debian, and you say apt install postgresql, and boom, in a few seconds you’ve got a cluster running on your laptop. Now, the question is: is this the Postgres you want to run in production? The answer is probably no. And what is the distance between this very quick installation and what you run in production? It turns out to be huge. You need to install a lot of other components that don’t come with Postgres, you need to do a lot of tuning, and you need to understand all the tools in the ecosystem to pick the right ones: which ones work together, which ones are good for your use case, and how to put them all together. This is an increasingly difficult problem, which ends up requiring a lot of effort and a lot of Postgres expertise, which is not available to everyone. So in reality, installing Postgres may be considered super simple, but running Postgres with production quality, production guarantees, and in an enterprise-ready environment, let’s call it that, is quite hard. So that’s, again, another barrier to adoption, something we really need to work on, and something I’m personally trying to get a hold of.
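The distance Alvaro describes is concrete: a stock apt install ships conservative defaults (for example, a small shared_buffers) that few production systems keep. As a minimal sketch of the kind of tuning involved, here are some widely cited community rules of thumb for memory settings; the ratios and the helper function are illustrative, not values mandated by PostgreSQL, and real tuning depends on the workload:

```python
# Sketch: rule-of-thumb memory tuning for a dedicated Postgres host.
# The ratios (~25% / ~75% of RAM) are widespread community heuristics,
# not PostgreSQL defaults or requirements; always validate per workload.

def suggest_memory_settings(ram_gb):
    """Return postgresql.conf suggestions for a host with ram_gb of RAM."""
    return {
        "shared_buffers": f"{ram_gb // 4}GB",            # ~25% of RAM
        "effective_cache_size": f"{ram_gb * 3 // 4}GB",  # ~75% of RAM
        "maintenance_work_mem": f"{min(2048, ram_gb * 1024 // 16)}MB",  # capped at 2GB
    }

print(suggest_memory_settings(64))
# {'shared_buffers': '16GB', 'effective_cache_size': '48GB', 'maintenance_work_mem': '2048MB'}
```

Beyond tuning, a production stack typically adds separate tools for HA, backups, monitoring, and connection pooling, which is exactly the gap being discussed here.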

Matt Yonkovit:
Yeah, I mean, when you look specifically at the Postgres ecosystem: Postgres is a great core database, and as you said, it’s easy to get that out of the box. But how many extensions exist? How many projects exist? I once did a quick GitHub count; there are over 100,000 different Postgres-related projects. And there are over a thousand different ways to do high availability, and thousands of ways to do backups. So it’s hard to understand that ecosystem sometimes. And there’s a lot of debate over which tools you should use for which use cases. When you don’t use the right tools, often you can harm yourself more than help yourself. And I think that’s where your company, OnGres, has been doing quite a bit of work to help users in the community understand where those pieces fit, right?

Alvaro Hernández:
Yes, absolutely. This is what we see as the big problem of the Postgres ecosystem. The ecosystem is vast; you just put some numbers on it. There are so many options, and this is very good, because you have a lot to choose from and you’re not bound to the tool mandated or bundled by a vendor. But it’s also a problem, because unless you’re an expert, you don’t know which one is the right one. And recently we have even seen data corruption caused by tools that were maybe not correct for those use cases. So yes, we are trying to make Postgres easy to run in production, which sounds trivial, but in reality, as we’re discussing, it is a big problem. And it’s actually even more complicated when you think about all the potential environments in which Postgres may run. We actually call Postgres, I don’t know if you would agree with this, a kernel, like the Linux kernel. It’s a little bit bare-bones. I mean, it’s fantastic: it has all the core functionality, all the SQL compatibility, all the durability guarantees. But then you need the distribution around it, a set of components around Postgres that turn it into something like a distribution. We call this the stack problem: you need a stack of components next to Postgres to be able to run production workloads.

Matt Yonkovit:
Yeah, and I think that’s one of the critical things I’ve seen. A lot of companies start off with “we want to replace Oracle” or “we want to replace SQL Server, and we’re going to move to Postgres.” And coming from those environments, there’s an expectation that certain tools and certain functionality are there out of the box. You would never think, from a SQL Server or Oracle perspective, that hot backups or HA capabilities would require additional software or third-party components. So there is that kind of translation miss that sometimes occurs.

Alvaro Hernández:
Yeah, yeah, exactly. That’s exactly the case; that’s what we have found. And some people ask: what is the Oracle Enterprise Manager of Postgres? There actually isn’t one. There are some GUI tools, but they’re mostly for database object management, not for the infrastructure, the DevOps part of the database.

Matt Yonkovit:
Right. And I think this is where, when you look at this problem, it’s an interesting space, because the growth of Postgres over the last five, six, ten years has also piggybacked on the growth of the cloud. A lot of people choose cloud databases-as-a-service because they have a lot of those components already figured out, out of the box, where you don’t have to think about them. And I think the shift quite a few people are starting to explore is: how do I get that same level of compliance, that same level of structure, without necessarily committing to one cloud provider or one database service or another? And that’s where you start to see the Kubernetes play. Both your company and ours are part of the Data on Kubernetes community, so we both have a fervent passion for running databases on Kubernetes. I think that’s where the great equalizer starts to come in: you can provide end-users with those capabilities to easily start pulling those components together.

Alvaro Hernández:
Yeah, exactly. And actually, I can speak to how we came to Kubernetes. It’s not because we thought Kubernetes was the next big thing, the next fancy hipster term, with everybody jumping on the bandwagon so we also needed to be there. We actually came from this stack problem that I mentioned before. We were thinking one day: we need Postgres; we need, as you said, an HA solution, and we like Patroni; we need a backup tool; we need a monitoring tool; we need a connection pooler; we need tuning of the configuration; we even want a graphical user interface, because for us it’s important to add that to the mix. That’s a lot of components. And now, how do we deploy this on this cloud vendor, and this other cloud vendor, and these other cloud vendors? And, oh, by the way, what about on-prem? On-prem is like the Wild West: some use DNS as the entry point to a cluster, others a load balancer, others virtual IPs. And then the storage: some have SANs, NAS, local drives. When you start counting all the possibilities, you basically run into a combinatorial explosion of options. And we were thinking: how are we going to package this? We would like to package this into a single distributable unit and say: hey, this is Postgres with this stack, and you can go and use it in any environment you are in. And it was basically impossible; every environment is different. We were tired of writing Ansible playbooks and copying and pasting them, because you cannot 100% reuse them; you copy them, change a little bit here and there, and that’s not sustainable, and that’s not maintainable. So we realized that Kubernetes was the answer. Kubernetes is the API to distributed systems, hiding from you the details of how the compute is, how the networking (most importantly) is, and how the storage is.
You program against that API, and it is the responsibility of the cluster administrator, the Kubernetes administrator, to map that to physical or virtual infrastructure. So that’s how we came to Kubernetes: it’s the answer to creating a single package that you can distribute, containing Postgres and the stack of components that we believe are, in this case, the better ones for most use cases, and that runs in any environment, whether a cloud environment, on-prem, or any sort of hybrid, however you want to call it.

Matt Yonkovit:
Yeah, and I think that’s a powerful way to help users use the right tools and have the right setup. Most applications developed nowadays are cloud-native by default, so there are already people who are comfortable running Kubernetes. Adding another Kubernetes component, or another operator, to the mix is a very natural step, and it fits into how they’ve come to think about their application deployments anyway. But it also gives us the opportunity to bake in some best practices, some of the things people often miss when they start trying to manage infrastructure they don’t really understand: encryption, backups, high availability, sometimes performance tuning, or scaling as you scale your resources up, things you can bake into a lot of the operators. And I think that helps people run better applications out of the box, because they don’t have to worry about a misconfigured database right off the bat.

Alvaro Hernández:
Yeah, you have touched on a point that is key for us. StackGres is open source software that we have written for running Postgres on Kubernetes: a whole stack on top of Postgres. And one of the goals, as I mentioned before, is to make Postgres more accessible for production usage. Kubernetes users are already used to certain patterns for how to do things. They know how to write YAML, they know how to create CRDs, they know how to deploy services, attach labels, and add various elements. So one of our goals was to make that experience seamless with Postgres too. Postgres, again, requires a lot of expertise and a lot of tuning to run in production. So it’s not enough to just say: oh, this is a way of running Postgres on Kubernetes, but you still need to understand this configuration parameter, which goes into this file created under this path, in this other format that is unusual to you; or the interplay of this tuning; or that you need to pick between these two tools you don’t know anything about. One of our goals was to hide all these details from the user. You still have the option to tune them if you’re an expert, but by default they’re hidden, and everything is exposed in a very high-level interface. So we’re leveraging here, a lot, the CRDs, the custom resource definitions that Kubernetes provides. I consider them kind of an API; they’re part of the API surface at the end of the day. It’s like a specification, and we as the application creators, in this case the operator creators, can say: we believe this field is important, and it has this meaning.
And if you make them sufficiently high level and abstract away Postgres’s inner details, then any Kubernetes user who is used to working with YAML and CRDs will be able to deploy clusters with production quality, as if one of our company’s experts were doing it. So, for example, instead of asking the user to create a 50-line YAML cluster definition file, including references to files like postgresql.conf and pg_hba.conf, Patroni configuration, tuning, roles, sizing, and how replication works, you can just create a cluster with probably fewer than 10 lines of YAML, where the most advanced concepts are the number of instances you want, the Postgres version, or the size of the disk that is going to be attached to every pod. Those are concepts understandable by anyone. If you want, you can create a custom Postgres configuration with all the parameters, but even that is just another CRD, yet another YAML, because that’s how the ecosystem works on Kubernetes. But it’s not only that. We’ve also tried to use and reuse existing CNCF components and build them into the system, so that users are already familiar with them. For example, you talked about encryption; take SSL communication, which is important for Postgres. In StackGres, we don’t use Postgres’s SSL support. It’s simply not even compiled in. We actually worked together with the Envoy community, the Envoy proxy. Envoy is also extensible, like Postgres, and accepts plugins, called filters, and with their help we developed a filter for Envoy that understands the Postgres wire protocol and is even able to terminate SSL. So instead of terminating SSL in Postgres, we terminate it in Envoy, which runs as a sidecar. This is also transparent to the user, so users don’t even need to understand it.
And then, when you need to manage the SSL certificates, we are now connecting Envoy with cert-manager, so that Postgres SSL certificates are directly managed by cert-manager, which is what people already understand in the ecosystem.
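To give a sense of what a cluster definition in “fewer than 10 lines of YAML” looks like, here is a hedged sketch of a minimal manifest. The field layout follows StackGres’s published SGCluster examples, but the name, version, and sizes here are illustrative, and the exact schema should be checked against the current CRD reference:

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: demo-cluster          # illustrative name
spec:
  instances: 3                # one primary plus two replicas
  postgres:
    version: '14'             # desired Postgres major version
  pods:
    persistentVolume:
      size: '10Gi'            # disk attached to every pod
```

Applying roughly this with kubectl is all the user writes; HA, connection pooling, sidecars, and tuned configuration are decided by the operator behind the scenes.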

Matt Yonkovit:
It makes it easy because you’re meeting the users where they are, and the administrators where they are.

Alvaro Hernández:
Yeah, we call this reducing the cognitive burden of using new tools, because this new tool leverages tools, configuration styles, and patterns that you already know.

Matt Yonkovit:
That makes a lot of sense. Because the more tools you have, the more single points of failure you have, the more places where “oops, I forgot to do that” leaves you exposed to potentially expired certificates, or things that didn’t get encrypted that you thought were encrypted. There’s a lot there. So that does make quite a bit of sense. And I think that as the ecosystem expands and more people come on, the expectations get higher and higher for what the stack can do. It’s not just what we can do today, or what we did a year ago, because the user community is growing as other people move into the space. Just recently Babelfish, the SQL Server-to-Postgres translation layer from AWS, came out. That’s going to drive more people who are used to the SQL Server ecosystem to explore Postgres. They have a set of expectations, tooling, and processes that might be completely foreign to many in the Postgres space, or in the Linux space. So how do those match up? And how do you make it easy for them?

Alvaro Hernández:
Well, Babelfish is a very interesting topic in itself, right? It's something that we've always been, personally and as a company, very interested in, because one of our goals is also to make Postgres reach all the areas and tap all the markets, all the use cases, and all the users, as we were discussing at the beginning of this chat. Postgres is growing a lot, but I still think that Postgres can grow much more than it is. And one way to do this is making the user base bigger, right, addressing more users. So Babelfish sounds like a blessing, a fantastic piece of open source, open-sourced by Amazon, that I hope will get a lot of contributors, will grow as a large project, and will merge upstream. And then we'll have Postgres with both Postgres capabilities and SQL Server capabilities. Sure, at the beginning, compatibility is not, I mean, I know for sure it's not perfect, but it will improve over time, and it will handle most of the use cases. And then we're widening the user base, we're making Postgres achieve more market share and address more users. Now, how can we make this experience less foreign for SQL Server users? I don't have a complete answer for that. All I know is that there's one path we're trying to follow, and this is not exclusive to SQL Server users, it applies to Postgres users too. And it's something that also distinguishes this approach on Kubernetes from the managed database offerings, like Postgres managed services: it's what we call day-2 operations, and all the automation you can achieve with Kubernetes. So if you pick a managed solution, as you very well put it before, you have managed backups, you have HA, you have monitoring and some alerting built in, right, there's nothing you need to do. However, that doesn't mean that the database doesn't require any care, depending also on your scale and size.
There are routine operations that you need to perform on databases. In Postgres, you need to do vacuums, potentially repacks, reindexing, tuning, and many others, right. And, of course, also data migrations, which are non-trivial. And these operations are not automated, they're not provided by the managed services, and they're a little bit hard to implement that way. However, in Kubernetes, thanks to the API that Kubernetes provides, you can provide automation for some of these operations, actually I would say the majority of those operations, and automating these so-called day-2 operations into software like StackGres is something we're focusing on right now. Just to give you an example, we've been able to automate repack, vacuum, major and minor version upgrades, restart of the clusters in an orderly fashion, upgrade of the containers for security, and even benchmarks. So with just a few lines of YAML, or from our web console, point and click, you can run a benchmark or do a major version upgrade. You don't need to understand how these things work internally. And this is key, because if we're going to make Postgres more accessible to SQL Server users, they're definitely not going to be familiar with how to run pg_upgrade, and then after running pg_upgrade you need to run ANALYZE, and all these small details that are very specific to Postgres. They're not going to be familiar with this. So the capability of automating as much as possible, and again, hiding all these internal details, I think, is one way to approach this problem, how to make it easier for them.
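[Editor's note: as a sketch of what such a declarative day-2 operation looks like, here is an illustrative StackGres SGDbOps resource. The exact fields are drawn from the StackGres documentation and may vary across releases.]

```yaml
# A day-2 operation expressed as just another CRD: a major version
# upgrade (other supported ops include repack, vacuum, benchmark,
# minorVersionUpgrade, restart, securityUpgrade).
apiVersion: stackgres.io/v1
kind: SGDbOps
metadata:
  name: upgrade-my-cluster
spec:
  sgCluster: my-cluster           # the target cluster
  op: majorVersionUpgrade         # the operation to automate
  majorVersionUpgrade:
    postgresVersion: '14'         # pg_upgrade and post-upgrade steps
                                  # are handled internally
```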

Matt Yonkovit:
Yeah, no, I mean, I think the more automation, the more that you can take away some of the specific nuances and do them for the users, the better off you are. And you can always give them access to break out of that automation and do it themselves. Exactly. And give them the best of both worlds. So where do you see the Postgres community going over the next year or two? Where do you see this kind of escalating? Where do you see the growth over the next couple of years?

Alvaro Hernández:
It's an interesting topic for me, one I'm very interested in. I've been a little bit vocal about some of the things that I don't see as perfect in the Postgres community. There are governance issues, in my opinion. It's interesting that Postgres is not run by any single company. It's an effort of more or less coordinated developers around the world, and it has some small bodies for governance. And it's an open source community with this liberal license, very open, but these governance bodies are not as transparent and as democratic as some people believe, including me, obviously, that they should be. Now, it's okay how they have evolved, history is history, and possibly the way they were created and operated until today were the best choices at the moment. And I have no regrets about that; I believe that's actually taken us, as a Postgres community, very, very far. So hats off to that. But now that we're in 2021 and we see this, I believe there are some improvements to be implemented in terms of, for example, governance. As I mentioned before, also on the contributing side, you know, how to remove or reduce this friction to contributing to Postgres. Postgres in general, if you look at the developers, I don't have an exact number, but it's kind of an aged community, with all due respect to everybody's age. I'm not young myself either, whatever my ID says. But we need more fresh blood, fresh and young developers, young contributors coming to Postgres. We're struggling with this. So there's this problem. And actually, things like Babelfish represent, in my opinion, and I could be totally wrong here, both an opportunity and a challenge.
At the same time, okay, it's an opportunity, because we have the opportunity to do what I was explaining before: widening the user base, appealing to the SQL Server community, and significantly growing the outreach of Postgres. But for this to happen... this Babelfish project is considered, and was released by Amazon, as a development branch for future features that they would like to see included upstream. What happens if it doesn't merge upstream? When this was announced at the end of last year, there were already some voices leaning towards not including it, or at least raising some serious concerns about its inclusion, even about some core components that Babelfish needs, which I believe would be great for Postgres to have: that is, protocol hooks, which are a way of adding an extension point to Postgres, out of the many extension points that Postgres has, to be able to plug in new protocols. This is needed by Babelfish to implement the TDS protocol. But what if tomorrow someone wants to implement another database protocol, or a totally different one? Postgres right now has a wire protocol that is quite old and very hard to replace. With protocol hooks, maybe the old protocol could co-exist with a new protocol. So what happens if all these improvements that Babelfish has, which I'm sure need a lot of polishing and breaking up into smaller chunks so they can be addressed individually, what happens if, two years from today, they don't get merged upstream? They may get rejected. Then Babelfish, instead of being a feature branch, may end up being a fork. And then we would have two Postgreses: Postgres, and another Postgres version. And one of them is going to accept contributors on GitHub, it's going to accept pull requests, it's going to work the modern way. So is it going to become more successful than Postgres? That would be really worrying to me. I hope that's not going to be the case.
And I'm sure the community will figure out a way to polish and address Babelfish as a feature that will get merged upstream as soon as possible. But we need to contemplate the alternatives that may also happen.

Matt Yonkovit:
No, it's one of the benefits and drawbacks of open source: if you don't modernize a bit, if you don't collaborate, people can always fork. Yeah. And sometimes they do. And what's funny is there's a whole list of reasons. Sometimes there are valid reasons not to, but certain projects may find it more beneficial to go their own way. And so that's one of those things that is always difficult. It's a two-sided coin: some of it's good, some of it's bad.

Alvaro Hernández:
Yeah, yeah, absolutely.

Matt Yonkovit:
But Alvaro, I wanted to thank you for coming on today. This has been a great chat. I appreciate you stopping by and sharing with us a little bit about your background in Postgres, where you see Postgres growing, and some of the issues in the stack space around Postgres. It's great to catch up.

Alvaro Hernández:
Thank you. No, really, I really appreciated this conversation. Very insightful. And I hope it will be interesting for the audience, too.

Matt Yonkovit:
All right. Thank you. And for those who are watching, please subscribe to the channel. And if you like this, let us know. See you next time.
