How to Secure PostgreSQL - Percona Community PostgreSQL Live Stream & Chat - Oct, 6th

Percona Community PostgreSQL Live Stream & Chat - Oct 6th

Percona Community Live Stream for PostgreSQL will focus on How to Secure PostgreSQL. This is a part of our bi-weekly meetup to develop open-source database skills. Come up with your questions and get an answer straight forward.

VIDEO

TRANSCRIPT

Dave Stokes: Welcome back. This is Percona, this little biweekly Postgres talk. We’re part of this Percona Community Postgres live stream. Now the official title, my name is Dave Stokes, I’m a Technology Evangelist, and with me is Charly Batista. And today, we’re talking about how to secure Postgres. In the old days, it was very, very simple, because you just shut off access to the card reader, and no one can get to your database, which shows how old I am. So, Charly, what do we need to know to lock down our Postgres server?

Charly Batista: Nice to be here again. Yeah, that used to be pretty simple, right? I just kind of got up the cable. Yeah, your security problems, though. Yeah, but thinks things have evolved since then. People have found more and more ways to try to steal your data, not only steal data, but compromiser, your service. Security. It’s becoming a complicated complex beast, I would say nowadays.

Dave Stokes: Actually someone asked me last week or the week before about is SQL injection still a problem? Is that still a problem?

Charly Batista: Yeah, funny enough, it still happens. And that’s the thing when we talk about security.It’s a huge topic. It’s something really huge. But the things that always comes to our mind, the first is access to the database, right? So when it Okay, let’s talk about security, there’s the database secure policies. We want to prevent unintentional access to the database, some somewhere that doesn’t have privileged access to data. But it goes a lot beyond controls things. And we see nowadays, they’re very sophisticated attacks. Well, the SQL injections is not new. And it’s not so sophisticated anymore. But it used to be a sophisticated one when it came to write, especially when languages like PHP, those things were really trendy, like SQL injections, and those things are horrible. They’re not so popular, because most of the languages have frameworks that help you to build applications, and those frameworks, they work on something,they have some measures to prevent those, but they still happen. They still happen.

Dave Stokes: Well, as long as humans are involved, we’re going to do something creative or destructive, done intentionally or unintentionally, that will mean some poor person has to do a backup restore and explain.

Charly Batista: Well, you know, one thing that I’ve done I’ve seen recently, in the last few years that it’s been increasing, are people hijacking the whole data store, right, so I now I forgot the name the although surgeons, they do encrypt your whole data store, you DS cure or ads or something. And then they charge you some money to give you the encryption key.

Dave Stokes: It’s kind of funny four years ago when MySQL eight oh came out, by default SSL was turned on. And a lot of folks were saying, well, that’s overhead, that’s going to slow down my server. We don’t want, you know, that sort of overhead. And then we started pointing out that, you know, you really don’t want a man in the middle sniffing your credit card numbers, everything else and ruining your PCI compliance. And after a while, they kind of, well, the overhead doesn’t really bother us much anymore. How do we make this more secure?

Charly Batista: And that’s the thing. So when it comes to secure the database, you’ll have different levels that you need to go right. So for example, the one you’re talking about the natural security level, right? So it’s very important. Sometimes you have really is strong authentication system. Right. So you, your authentication system, informs your user to have a password with 100 characters with numbers and characters and letters, and everything is so sophisticated check against the dictionary those a lot of things. But then everything goes on text. We’ll include text on the network, right? So you failed miserably on the security on that aspect, right? So well that those, these things gotta fail miserably because it doesn’t matter how complex is your password, if it’s easy to get that complex password, people are still going to get. That’s the thing when it comes to security, we have many different layers. So we have the natural claim, we have the DVS, global security itself, like, as we explained, so you don’t want to expose your database, right in all crimes, as you mentioned, so you could just cut the wire, and then your database is secure. Nowadays, that we are going to the internet, we don’t want to expose the database, internet, I’ve seen a lot of security breaches for all the databases that the team went installed the database, the port was open for everybody. And the databases didn’t have a password. It seems like but it still happens. Like if we go there, most of the security breaches on the database world because of those things. Because of, you know, and the first thing that you need to do when you’re talking about security is to have a checklist, right? Okay, this is my checklist, and go first for the easy, stupid mistakes. Because, well, if you fail on the stupid stuff, it doesn’t matter how hard your password is, it doesn’t matter how complicated the algorithm for your SSL or your TSL is to expose a database with open ports with no user password. So it will be pretty easy. Right? So and those are the things that I’ve seen happening. But beyond those things, okay. What are what should we be concerned about? When you’re securing the database? Well, first thing is, is axis, right. So the first thing we need to be concerned is about the access, you should make your database as hidden as possible, right? Your database should not be exposed to whole internet, for example, your application should be exposed to the internet market leaders, right? So if your install your database, it’s a very good practice to have metric isolation. So you start your database in a private network, and then your application has access to the internet. So you can access your application to the master database. That’s first of all, that makes things a bit harder to get an attacker to access your database. So another thing that you should do is secure the communication to your database. One may say it’s, it’s sort of my private network, right? So what is the problem? Why do I need to use a TLS, or SSL connection to my data is, as you mentioned, so yeah, it makes my database slower, right? So I gotta have latency on my database, I’m going to use more CPU. And so while we never know who is working on our side, maybe the guy just next to you wants to get some private information that is not supposed to.

Dave Stokes: Well, I know. In MySQL, we always said, make sure you put your application to database traffic on its own network, your replication traffic on its own network, because there’s nothing more having an LDAP update and a DNS zone change. Other one, one pipeline fits all party line.

Charly Batista: Exactly. Exactly. And well, yeah. And even both problems. Right. And when you start building that segmentation, so you if you have, if you start building, segmenting your networks or connecting to the database, it’s a lot easier for you to improve security is a lot easier for you to troubleshoot performance problems, right? That’s, for example, if you have a replica that’s lagging from the primary, well, what is the problem? The problem might be that your network has been flooded by the application. And maybe your replicates working pretty well. Your product is working pretty well, but then you have a bottleneck on the network side. If you do not have any segments. Hm, that’s going to be complicated to troubleshoot and complicated to fix. How are we going to fix this such a problem? Alright, so having this segmentation on the network is very important. And it’s important for security important for performance is important for troubleshooting from many different angles and perspectives, right? Those are the things that we need to be concerned about. So if you enforce the axis, it makes more complicated an attacker gonna read have a problem. So another thing that you want the access point of you need to enforce are the authentication. And here there is one point you people usually confuse authentication and authorization. On the authentication, you grant someone access to something called don’t grant, you should check if that person is who he or she is saying they are. Right. Okay, look, I’m Charly, here is my ID, you can see. So this is my ID, my photo. So I tried authenticating that you are a child. Then comes the authorization part of the process. Well, now is your child labor, they don’t care, you have no answers here. So those are different things. And they were important. Because when you make sure that your authentication process is strong enough. So that’s going to be harder for someone to fake information or to impress in personate you. I’m Dave. Nice. I even have this nice bird here.

Dave Stokes: Also, the computer actually recognizes by a facial pose. Yeah.

Charly Batista: Yeah. So it was like, Okay, let’s say I can fake that if that user has really restricted access, the only access that it’s needed, that we are, it’s will be not a smaller problem. But the problem might be less problematic, right? Because another thing that I’ve seen over the years is, okay, we have this huge database, we have a few terabytes of data here. And we have 10 applications. All the 10 applications use reverse sign user. Even when they don’t need even if application a has a schema, application D has another schema, they do not share any information about them both, they are completely isolated. And the only thing they share is their results, the same database. So yeah, the same instance we have. That’s the results chart, that’s okay, that’s fine. And they have the same user, and the user has super privileges.

Dave Stokes: I used to recommend that people have a separate account for reading and writing, and different reading and writing accounts for each project. And someone said, Why do you do that? I said, well, the situation for your application is you have a lot of programmers who open up a port, do five or six different queries sleep for a while, do some computations come back. And your problem is you don’t have any sort of port concentration going on. So you know, the quicker they’re in and out, the better you are for MySQL. For Postgres, you have the opposite problem where every time you come in, you have to fork off a process, get a user running, and then attach it. Is that big an issue for security? You people try to circumvent that problem.

Charly Batista: Well, that that depends on. In the end of the day, all the process, run with the same user, right? So if you if you create one or a main process, they’re gonna run on the OS. They’re gonna have the same user that is running the Postgres so the user on the database side, they’re not really related to the user on the site. So on these are decent. It’s fine. So I don’t see really a problem. But then you shoot. A user that runs our database should have been pretty specific, the only reflash that’s really needed to run the database. Their folders need to be with proper access only to that user. So add on this sense. I think most of the database, they do a quick check on the folder if the folder has access like 777, so it won’t run for it’s really weak check, we need to put more constraints when we are creating the user and grant the OS user that runs the database. I guess we don’t want to have a super user, user that has sudo privileges, it’s not needed. Right? So we need to have the proper ACLs to go there to use the proper EC, Linux, everything. So we don’t disable the SC Linux. Right? Because the database runs entirely on the user space, there is nothing much that Postgres needs outside user space. So it really doesn’t need to have root privileges to do anything on Postgres. Right. That’s, that’s one thing, right? And then when it creates a new process, all those processes, they’re gonna, they’re gonna be espoused, we’re using the same user, let’s say we’re using the user Postgres, they’re gonna have the same user as the main process, right? It’s just a copy of a new processes a child process. In that sense, if we secure enough, the user that is, is running Postgres, that’s okay. But it’s, the kernel does a really good job to preventing those sprinklers to be leaked out of the user space. That’s, that’s, I think, from security point of view all using either thread or process. In the sense, I think most of the problem comes when if it’s inside of the database itself. When you create the user on the database, let’s say we have two different applications. We have one application for sales and another application. I don’t have a finance, right. So the finance one is the application that you organize all the shots, you organize the salaries and everything. So what prevents you and user that has the password, the access from this the sales application to go on the virus and increase his own salary?

Dave Stokes: No, ever do that? Wow.

Charly Batista: You know, yeah. You know, and this is it’s stupid example. Yeah. But if you have one user for all the applications above, or if you have an attacker, someone that gets access to your application, your point of sale application, going to have access to all the other applications, right, and even your finance application. So it’s really critical to also have this the separation of application of user, when you have multiple users even have multiple applications on the same database. So it’s a lot better to create a specific user for each specific application and grant only what those application needs, only those tables. And one thing when that is really nice when it comes to both screens. US because of the granularity, what we can do on both screens is impressive. We can go for row access, right? We can control at row level, like to tell which vol that user might be able to access or not. It’s huge granularity, right? And if we go to the documentation, for example, let me share my screen where is creature?

Dave Stokes: Well, while you’re sharing that, do you recommend having different schemas for each application? One for finance one for dev, one for Admin?

Charly Batista: Yeah, I do. Actually, it depends on how you organize your application. I would even recommend to have different databases on the same instance. If you do not share any information across applications, you can have different databases. So you’re able to see my screen here. There we are. Okay, let me try to increase here make it better. So this is the road security on passwords. So as I was mentioning, we can, for example, create a policy that we, in this example here, the manager gonna run, he’ll do a select on the table account, he want to see only the roles that the manager column inside of the rules is equal his username. So he won’t be able to select any roles. That’s not managed by hands on. Right. So he won’t be even in this case here, even able to insert or delete roles that doesn’t belong to him. If he wants to insert anything on this table account, the manager needs to be equal to his user login. So when it comes to secure the database, on the access to the database, the granularity that’s given by Postgres is really amazing. If you if we think about that, right? Because we can really go to the row level security on Postgres, it’s not okay, the user Charlie doesn’t have access to the table accounts. That’s fine. Every database that can do that grant a lot, because you kept Okay, yeah, the user Charly can access this table accounts, but he can only see or only manipulate or only access, the roles here that belongs to him. If the manager is not the user, Charly, he cannot see, select, in this case, he cannot update, he cannot delete, or he cannot even insert any role here and record here that does not belong to the user trial. Alright, so this is the level of granularity that we can we can get on our posters. And we should start taking advantage of those things when you go to the database, back to your question. So one thing that I used to decide if I gonna create a database or a schema is if the information is shared, on the example that we gave, about the point of sale system, the finance system? Well, we might have some information that be shared among those two systems, right? So like, for example, they stock the amount of products that we have in the stock, we might need to know them for the point of sale, because you don’t want to sell something that you don’t have any stock. And the finals might need an audit department might need that information, because, well, they want to buy more of those things that you sell the most. Right? So because we’re sharing information, in this case, the best approach is to use schemas inside the same database. Right? We have one database, and then we have one schema for the finance for Mr. Mrs. schema for the point of sales. And we might have a common schema with that shared information. It’s a good design that you can have different, we can have this segmentation by schemas, right? If the systems are independent. I usually recommend having independent databases, if you do not share anything if you never share anything. So then we have one database, on the same instance, for the finance another database for the same instance, for the point of sale. Right? So in this way, it’s even easier to make the segmentation of the user for the finance, you won’t be even granted if access to the audit. It’s our easy to do this. For the other hand, Postgres doesn’t support cross-database select. So if you need information, let’s say you connect to the database, point of sale, and you need to get information for the database finance, you need to disconnect for the database point of sales connect to the database files, and oh, that’s a lot. It’s not like MySQL that you can do a cross-database. You put the database name, the table name, and then do a salon. It doesn’t work. And that’s the main reason, why we would recommend using schemas when you need to share information, because this cross a lot doesn’t work on. So, yeah, as I was mentioning here, so when it comes to the database access, there are a lot of granularity that you can, you can specify, you can go deeper and deeper on Postgres. So not only, okay, I’m not going to grant you the access for the stable, lower, they’ve got to grant you access for this table, but only for the specific roles, but you are the owner, let’s go to write one down any related whatever, we can use the policy, we can, we can make the policy very complex, right. And those are really nice features that we have imposed, that we start enforcing the access. In this example, if this user, the user child here links, the password, someone gets my username and password and access the database will be limited to the number of actions that he can do on the database, because my user on the database has a lot more limitations than a superuser, for example. Right? So those are part of enforcing right on the sharp leaves. Okay, what users we need, yeah, we need venues, okay, what access do those users need? Do they need to access the whole database? Do they need to access only one schema or two schemas on that schemas? Do they need to access all the tables in that schema? So if we can, if we start doing that segmentation, even in the case, when we have a security breach, and the thing is not if we have a security breach, but when you have a security breach, right, because it’s proven that we are eventually going to have a security breach most of the like the major companies they all had, and they probably going to have again. So the thing here is that we’re trying to do is minimize the impact when we have that secured bridge. And this is when you can go over and more and more things on talking about securing the database by access control. So that a lot of things that you can go you can do with the data access control, grants and things.

Another important thing here that I forgot is first just like all the databases, when you go to the authentication, it you can grant access to the user without a password, right? So when you go to there is this file PGH. Pa, you can do a configuration to tell the database that user is trusted, you trust that user. And you can say, Okay, I trust this user, if it connects from this IP or this box, after this, this data, right, you can tell the post, well, I trust this user, the database will not require any password. I’ve seen a lot of databases with this configuration for the user Postgres connecting on the local host using. So if that person already has access, if you do not have an user password to the database, then the person already has access to everything. But one can say oh, it’s easy to to just go change the PG, HB five into a database removed. That’s true, but we can have monitoring on the database removes things, and you can have another if that very specific parameter has changed, right? So you can know in advance, so you can know a lot faster if someone is trying to get access to your database. And for the other hand, if it’s already there, the person just connects to your database and whatever that he or she wants to do. Right. So it’s never a good idea to leave users without passwords on the database, it doesn’t matter if it’s the connection is local connections connection from outside old, it’s never a good idea to leave and use it without a password. And it goes to the first shots we have on our checklist. It’s always good to reinforce that always give your users a password, a good one. And, if you will, about the passwords, there is long discussions on okay, what type of password should you use? So should we use another type of encryption to get a database for the passwords encrypted? Should we save the passwords? As plain tax on the database? How should we were? So I would say that leave it is in plain facts is never a good idea. You have your username and all the passwords there as plain text? Well, it’s never a good idea. Right. So most of the recommendations are to use at least one check sample, like not MD five and more of the five is proven to be pretty weak. The problem about having those those those shacks on our it is because you cannot decrypt back. So if the user loses its password, you cannot, there is no way you can get the password back. So you should have a solution to let the user password, some sort of proven authentication proven who you are, change the password. Another solution would be to use a for encryption algorithm, for example, with using a synchronous keys, so but then you need you to make sure that you have the key secure. A lot of people I’ve seen applications that they use this type of, Ρ‰f encryption, so but then they install the private and the public key inside of the same database is the same as you saw in plaintext. Right. So if you have the private and public key in public key inside of the database, to decrypt the, it’s the same as have plaintext. I’ve also seen some applications that they do a bit more complex, I’d say they put on the product, the public key or the database, and the private key, they put on an application file, the configuration application, the configuration file on applications, and it’s still bad. So if you have a security breach, and someone is granted access your application. So in the application configuration file, you have the user ID and password to access your database, you have the key to decrypt it all the passwords inside of your database, what else you needю So it’s, it’s granted there for free, you just need access to an application. And those are the things that you need to be careful. Are we going to use encryption, so let’s use a vault or some sort of solution, but that it makes things a lot more thankful for an attacker to get postings and you see a lot of those things that are outside of the database world itself and not inside of a database. And this is when things get complex and complicated because you need a diverse team to help you contribute those things? I know at Rest Encryption is a big subject for many folks. What are our options in the postgres world for that? Well, of course we doesn’t have it may table what I seen a couple of, of plugins working on this. One thing that I’ve seen, a lot of people doing, it’s not really advice encryption is to have the file system encrypted, right? So it’s not on the database, the file system encrypted. So if you have an attack on file system attack, and someone gets your file system or whatever it may copy. So your file system is encrypted. When it comes to encryption, and database, there are a lot of different challenges. The first challenge is on the data access on search. Right. So if you let’s say, if you have an interest increase on the database, and the database really encrypt the table files, when you do a select, for example, you need to decrypt the index, before you put it in memory, right, so and then the problem about that is you probably need to decrypt the whole file because well, you need to own a naive implementation, you need to decrypt the whole file, and then you go to define the rule we are searching for the search, it becomes a lot is lower, and then now we have a lot more CPU and memory usage, if you’re not there. So it’s a lot more complex to build at drast solutions when you have encryption inside of Amazon by the name of it itself. So it’s a lot faster and easier to have it done by the file system itself, right? Because, okay, on the database layer, you have where you know where the data file, so you get that those blocks decrypted. You just asked for the file system, I see I need the blocks 1235 And the default system block. I don’t know of any really good implementation of food at Rest Encryption for both schools. I know some implementations that they do as part of the solutions. And then you need to mix that, for example haven’t the file system encryption is the most common on one Postgres vote is the most common one that we encrypt the whole file system and everything else that you need to do you need to go for plugins or for extensions. And the ones that support more features are not open source and Postgres for entries encryption. And that would be really nice too, if anybody on the audience has been working on Nola bolts, and anyone on both rhythms and new project. And if they could share, I really would like to take a walk and, you know, see what how the process has been doing on the front. But from what I’ve been working there, no, no no much progress on course, because all of this, I’ve seen a lot of discussions, a lot of discussions from people that people don’t want real progress. I haven’t seen much on the one thing I learned the hard way. Many years ago, there was a company that had a new accounting clerk and they fear well, we’ll just take the head of accounting account, closed that over and give it to this novice. Well, the novice wasn’t good with SQL and didn’t do a where clause on a DELETE statement. And it was like, why did we give a novice delete privilege?

Dave Stokes: Oh, we just had a question from XYZ.So reading from memory doesn’t have encryption. I know a very few systems that actually encrypt things all the way through into memory and those keep it encrypted there. If you run into anything like that, Charly?

Charly Batista: Good point. In the end of the day, when you got up on your application, it’s either the final application or worse with the data in memory, the data got to be decrypted by the in memory data that the final operation with that data gotta be decrypted, it cannot be encrypted, because you need to, to get the data. That’s a huge part of that, for example, for passwords. If you have the best secure encryption system, and the huge password and everything, when it goes to memory, it will be decrypted. So everything that goes to memory at some point, it will be decrypted. Right? So you get the data, the data will be there, you have a memory, if you do a memory dump of the application, you can find that memory dump, you can get access to the key, you can get access to the encrypted and to the decrypted data that is in that, because there is no other way to work the rattling with the encrypted data, or at least now not yet. Maybe we will improve in the future as we resolve those things out. But at now, when we really work with the data, yet it’s decrypted in memory, it will be eventually be decrypted.

Dave Stokes: So thank you for that question. XYZ.

Charly Batista: Yeah, that’s a really, really good, interesting one.

Dave Stokes: So what else can we do to keep the Joe Average snooper from getting into our database?

Charly Batista: Yeah, so we’ve enforced the access right before so delta integration, the data access control, we’ve been enforcing them as well with the other roles and grants that we did encryption.We’ve been enforcing encryption, from the network perspective, right? We are using certificated, TLS, SSL and everything related to the connection. So our data gonna go secure to the network from those two endpoints, they even use encryption read. So we also need to apply the encryption for our data layer, more or less. And of course, as I mentioned that the most common one is to encrypt the file system, and there are many solutions in the market, open source ones.Another thing that we can definitely do is to check and start granting different privileges for for the topology on the natural for the host, right. So not all boxes in your natural need to be granted access to the database. Even when you’re in a private network, you probably only need to get access to the database for the application. Obviously, application needs to have access from a couple of of boxes of your DBAs and everything, and everybody else, not probably none of that need access to the database. So if you prevent them any post, which is pretty easy, we don’t need to use complex firewalls, rules and everything because we are inside of our armature. You can go to the configuration file on Postgres and grant access to those very few boxes. I think machines can be IDs names or whatever you use to access your database, right. So that person, if a person wants to reach in your database, it’s another thing that the person needs needs to figure out.

Dave Stokes:
We have a question. So column-level encryption with only update and insert access is best policy.

Charly Batista: I’m not sure that I fully understood the question.

Dave Stokes:
Well, what I understood the question is asking if a column level o column level encryption with only an update insert access. I understood something else when I read the graphs, and now I’m even more confused. Yeah, can you give us a little more clarification there?

Charly Batista: You can insert all the records you want. But every update is going to be an additional record. And there’s no such thing as a delete until someone with a Master policy grant can come through and get rid of it. Kind of like a primitive version of blockchain, where everything’s append only until someone with very, very high probably just comes through and calls out the old data. I’m not even sure if that will be possible for you to only encrypt the data, because it doesn’t make sense to only encrypt the data, the insert and update. There are two main problems for data security later bridge. One is the access of the data. For example, if you have your credit card, on the database, you don’t want anybody to access that information. Right? So it’s accessing information. It’s a problem. It’s a secured bridge. So when we use those, those policies to enforce or to make things hard and difficult for somebody to access is because well, accessing the information itself is already a problem. Right? That’s, that’s one thing. Another thing is manipulating the data, right? So we don’t want anybody else to manipulating and faking the data. The example that I gave the lame example of changing someone’s salary, right? So I could read access to the database or a banking system, I could get access to a bank database, and then send the $10,000 from the account to my account. So we don’t want those types of things to happen, right? So we don’t want anybody else, anybody that has access to the information to see the data. It’s a problem. And we don’t want people changing the data. So let’s say it’s possible to only give one, encrypt the data when it’s inserted. So if we only encrypt the data when it’s inserted, or updated. What happens if we need to access that data needs to be decrypted? So the point is when we read the data you get in memory, but that makes sense. We really only print the data. When we write, that’s when we encrypt the data because when we reach, you just discard the data. If you don’t need to update it with the data, we just discard the data. You don’t do anything with that data. So yes, when you do encryption, you only encrypt the data back to the file system to the data file when you’re writing back the data. But you still need to decrypt that data when you read that file. So we read that data and put back in memory, when it comes to memory still need to be decrypted. Even if it’s for a delete, we need to find where is the role of the record. And you still change that work. Because even though I deletion on both ways, or any most of the database, there are not physical deletion, you don’t remove, you just mark that logical deletion is to change in that block, you still have one one beat, at least to change it to change that beat from valid to invalid, right, you still need to after you change that block, you need to recalculate the encryption to write it back to the disk. Right? So every time we do manipulate the data, it’s either insertion update, and deletion. We need always to encrypt the data back. And when you’re only reading the data from the disk, we need to decrypt it put it back in memory, right? What do we control encryption cannot be done at the table level or tablespace level or schema level. So this is the address encryption that we are discussing before write on post with natively cannot. Postgres doesn’t by itself that doesn’t have at Rest Encryption. The easiest way that we do encryption on Postgres today is encrypting the file system. So that means it cannot be done at this level of control, you cannot just encrypt a tablespace or a table. Well, you can do a tablespace. Because you can have table spaces in different file systems. In this case, you can because it’s on file system, wow, well, you can have tables, table space a on one file system, and tablespace be in a different mind system. So you can control on the tablespace bases because of that, but it cannot be done on a schema basis on our table base lab. Because it’s file system level encryption, it’s not database level. There are plugins on on the market that can do those. But they’re not the ones that I know they’re not open source. Right. So that’s, that’s something the ones that can do for pose risk on, then it’s at rest encryption, it’s inside of the database itself. It can do a table out we can do at row level, actually, not only they will have some plugins that can do at low level. And but they are not open source. And it’s a really nice open source project. So someone out there that wants to have a good idea working what what project open source and what to do for closers. In a nice one.

Dave Stokes:
Yeah. And you can use an existing bulk manager like something from HashiCorp or something similar. So the major hard part, the encryption, key management and rotation part is already taken care of for you. It’s already Exactly.

Charly Batista: And we can use like examples. Or we can base our solutions in some solutions in the market with order database, for example, the Percona server for MySQL, we have at rest solution, right? So we can borrow some ideas for those audit projects and start one for Postgres. If you’ve never walked in to find out that one of your servers has rusted out a bearing from a leaky toilet to floors up on a Monday morning, when you work at a university, I can guarantee you it is quite a thrill. Someone put an explosive device in a urinal and a men’s bathroom and the water leaked down and took out a mini computer used by a lot of the students as well as things were very luckily, Friday afternoon. Backups have been made. So what do you recommend on backup. Backup is a huge topic. So my recommendation is to use a tool like G backrest for backups. The same thing, you can have a specific role to do the backups specific user to do the backups on the database. To be specific privileges, you don’t need to grant everything to that user to like PG back fast you can. It can love anything. Your full backups, you can do incremental backups, you can do differential backups. You can have a backup. We can paralyze that your backup can use more than one tried to backup and you can compress the backup. So a lot of things can be done using a tool like PG backrest or Postgres 15, the PG base backup tool will be able to compress the backups. When we’re doing we’re doing so like that was being done a long time for tools like PG base backup environment and all that we have in the environment. So my suggestion is use a tool to make your life easier. You don’t have to manually we had a stream here we’ve got

We went through the backwards, we showed the manual process. It’s not complicated this per se, but it’s, it has many steps that well, it’s easy to confuse those steps and make a mistake. And then you have files, but you may or may not be able to recover a database using those files, right? So no one wants to have files and when the water leak other your small computer happens, you’re not able to recover the backup of the database, right? So it’s not what we want. The suggestion, yes, use it to the one we recommend a recorder. And it’s in the micro long time it’s PG-based backup. pgBackRest. That’s what the one we recommended Percona. But there are others that PG bar borrow man and a couple of other tools there that if you use a tool, if you have knowledge, they’re good tools. They get the work done. That will be my recommendation for you.

Dave Stokes:
We have another question about can you use LSM trees?

Charly Batista: Postgres, when it comes to I suppose. The question is about indexes.I ndex enforcers they’re quite extensible. We can build our own implementation of index that we want, including LSM trees, V trees, any different types of trees? If the question is related to the file systems, some file systems they do use LSM trees to store the islands and things. Yes, it’s also possible. So as long as the interface API to the file system returns what’s hot?

Dave Stokes: Yeah, it’s definitely possible to use LSM trees in some ways working on RocksDB for Postgres, which I find impressive.

Charly Batista: Even though it’s not so awful to have other implementations of Datastore for Postgres, like MySQL, or MySQL.We have been dB, you have my rocks. So it’s possible to have those implementations for posters. Alright. So postman has its own API that interface the right we can have it not like it seems some, it’s easy to believe or think that you cannot extend or change the start arranging on Postgres database. But actually, that’s not true. It’s possible. And there are some limitations, as you mentioned, some people implementing my box for both groups. And on the index side, too, it’s extensible, for example, that they post products, they build a new whole ecosystem that works inside of Postgres, and it’s just an extension. Nothing changed from from from the regular clone. You have new types, you have new you have indexes, you have many other things. That’s that’s built use to track from what date that was Postgres ecosystem. It’s impressive how extensible the databases.

Dave Stokes: And another question from XYZ extra statement, backup encryption in the bank industry is very important for audits. In something we use pgBackRest to take encrypted backup directly to Cloud Storage. That’s a good thing to know. Before this, we use an account to squeeze account from person A privilege. Ah, that’s something else for me to go up and look at.

Charly Batista: Yeah, that’s true. I mentioned tools like pgBackRest, and it’s really good to you, you have a lot of options and including install the backup encrypted already in the storage subsystem.

Dave Stokes: Very cool. So we have managed to kill off pretty much a good hour on this, and we’ve kind of just barely scratched the surface. And I probably screwed up your entire can presentation that you probably have. And with that, I want to thank our folks who asked questions, especially a big shout out to Charly. I just found out Charlie will going to Berlin for Postgres Europe. So you folks, you’re gonna have to do without him. We have other folks going who are almost as impressive as Charlie. And if you do have questions or comments, or have something that you want to see extensively covered in one of these things, please let us know. And with that, I want to sincerely thank Charly for his effort and his knowledge, I am still very much a knowledge and novice in the Postgres area. And these things are always very enlightening for me. And when I’m doing these talks, you mentioned something, I’m over here on my other screen going, what the heck is that? So this is always a great show for me. So thank you. Any last comments or, or recommendations?

Charly Batista: Thanks, thanks for the nice words, the compliments, people are going to Berlin, we have a lot of knowledge, folks from their corner, some of them a lot more knowledge than I am. We always invite people the community and folks and, and friends to attend that event, right? It’s always good to see in person to talk to have great discussions and to make the community stronger and stronger. This is one way that we have to improve and enlarge in the Postgres community. And thanks again for this nice talk. It’s always good. It’s always great to be here and talk to you guys. It’s a pleasure.

Dave Stokes: Oh, here have anything I’m moving from extra data to Postgres. I don’t know anything exactly about that. I could see why you want to do that for financial reasons, call our consulting folks. They probably have other customers who are in debt. And can we cover performance tuning? Well, that might be an entire series of talks that we do coming up in the future. And with that, I want to thank everyone for participating and for the folks who watch the stream after it’s done recording. And thanks again. And if you have questions, go to percona.com and, and let us know. Take care, folks. Bye Charly. ∎

Speakers

Charly Batista

Percona, PostgreSQL Tech Lead

Charly Batista is currently PostgreSQL Tech Lead at Percona. Possesses over twelve (12) years of experience in various areas of IT including Database Administration, Data Analysis, Systems Analysis and Software Development. Strong analytical skills combined with experience in object oriented programming techniques. Technical Leader for more than four (4) years for the Developer Team. Born in Brazil and now living in Shanghai-China.

See all talks by Charly Batista »

Dave Stokes

Technology Evangelist at Percona

Dave joined Percona last February 2022. Dave is MySQL Community Manager and the author of MySQL & JSON - A Practical Programming Guide

See all talks by Dave Stokes »

✎ Edit this page on GitHub