Setting Up Backups and Scheduling Regular MySQL Backups - Percona Community MySQL Live Stream & Chat - June 10th

by Marcos Albe, Matt Yonkovit

Percona Community MySQL Live Stream June 10th

Learn more about Backup best practices to ensure the availability of your database. This is a part of the bi-weekly live-streamed meetup dedicated to MySQL.

Video

Transcript

Dave Stokes:
Yes, Miss Foreign minister that was singled out scintillating conversation ever had because I forgot to hit the wrong button for those who are expecting Matt Yonkovit, but he’s on his way to the Southeast Linux Fest today. So you’re stuck with me a very poor second choice, but you have the great Marcos Albe, who’s going to run through everything you need to do need to know about setting up backups and scheduling regular MySQL backups. So all the stumbles out of the way, we’re ready to go.

Marcos Albe:
So yeah, I’m gonna be showing you one of brokenness, flagship products. I think this is one of the things that has attracted many people to Percona. And it’s extra backup, which is being very hot backup solution with a host of theatres, right, parallel backups, encryption, streaming, compression. So you know, it’s a really beautiful tool. And you can build a complete solution. This is not a solution to take backups, yet, we are building a solution and I’m going to show you some of it. Sadly, I failed to properly set up a fake s3 server on my machine. So I won’t be able to showcase that solution fully. But again, extra backup is a tool around which you build your own solution. So I’m going to be showing you how to take incremental backups. And we’re also going to be seeing a few other things. Namely, well, how to bring a backup up to date. So you you produce the backup. And you know, two days later, you want to restore the backup, you don’t want to lose all those two days worth of data writes or updates. So you need to bring your backup up to date. The other one is testing backups, having a backup and not destiny, it is just having a large collection of bytes that, you know might mean nothing, they might be unusable. So actually testing backups is a critical step is something you cannot avoid. Like if you’re testing backups, you don’t have backups, you just have a large collection of ads. So to test your backups, and the other one is monitoring. Monitoring backups can tell us a lot about what’s going on in our system. And if we are about to run out of backups, are my backups working? And can I actually rely on them? That’s a question you actually want to be answered. So I, I want to start also, I will talk just very briefly, there are other types of backups, right? Like we have logical backups. We have snapshot backups, and we have delayed replica backups, and they all serve different purposes. I don’t see these as mutually exclusive. Rather many times they are complimentary, and we’re going to see that logical backup are my favorite way to test a physical backup. So this is, again, not an option like which one I have to do, rather how I want to combine them to, you know, have a successful backup strategy that will always allow me to achieve my recovery point objective, which is, you know, how much data can I afford to lose and always to achieve my recovery time objective, which is how fast can I recover from this? So I said that extra backup is physical backup, then there is a logical backup for which I would normally use MySQL dump or my dumper. But if you get to the time today, I’m going to be showcasing MySQL shell utils dumper, which is new, relatively new utility, it has parallel dump, and restore, which is super nice. And it’s integrated into a single shell. So it’s quite nice. And I believe developers are going to enjoy it fourthly because it does have a JavaScript and Python interface. So easy to script easy to integrate into your workflow.

So yeah, we’re going to be taking a look at that. And again, we’re also going to be Considering what a delayed replica is, delayed replica is just replication and a synchronous replication setup that you keep purposefully delayed. So that if you do a drop table, or if you do an update without the word condition, you will have a very fast way to restore because you just have to stop the replica, and bring it up to the point where you did the bad action, skip the bad action, and then apply the following events, and you’re running right, and you didn’t even have to restore a backup, the backup was already there running. So it’s a very nice method. We might also try to get that done. But let’s start by sharing my screen. And how to share a window here. And here we have it share my sharing. Yeah, I’m sharing my screen. Yay. Okay. Here we have in tables, Marcos dB. It has MySQL D running somewhere. There you go. And this MySQL has the movie JSON test that our dear math, let me. So extra backup. I installed I’m running MySQL eight, zero Percona Server, it’s zero. So with a eight-zero series, it’s very important that you match the minor version of your Percona Server or your MySQL, with a minor version of extra backup. Because Oracle is doing continuous integration in its zero, meaning that major theatres and backwards incompatible theaters are introduced in minor releases. And this means that for example, the redo log format has changed at least three times through the last couple of months, years. on eight, zero. And so because extra backup relies on the redo log for consistency, the change of format will actually make it incompatible in between versions. So very important to have a matching version. So in my rocky Linux, I will do Yum, search, show kids. Crap. And I’m gonna grab for pain. And there you go. Very easy. Turkish workers yum installed, I already have it installed. I’m just gonna repeat the process. But basically, again, the important bit is make sure your backup version matches your MySQL version. Okay, that is the critical part, I’m going to make this font a bit larger is I’m not sure if that’s okay. Dave works to me, it works. Okay, great. So Young style. Of course, it’s very fluid like ruled by cache, it’s already installed, of course. And then this will give you the extra backup theory. So you can actually see there, it’s already running. And we get the red version. Amazing. So with that, you will want to set up that there are a few options. Today, I’m going to just make it a simple demonstration with a local storage. But I’m also going to very quickly showcase how you can send it to a remote server, which is usually desirable because you know, you don’t want your backups to live in the same machine you’re trying to save because the whole machine could go to hell. So it rather keep the backups of off path, right? You can you can do that by our syncing the resulting backup in the local storage or you can directly stream the backup into the remote storage. So with that, I told you I was going to share what I do for organizing my fullness. So what I would do is date plus the year plus the week, which is you, I took notes because I have it in my scripts. I’m sorry, I never remember this stuff, top of my head. So those like this, G, and then the

that’s the year. And then v is the week. So we are now at week 23. And then we have the day of the week. So I would normally start my backups on Sunday, I wouldn’t say okay, Sunday, which is for so much people, a low traffic day, other people is crazy day. I know, for gaming companies, and perhaps some retail companies. It’s gonna be crazy day on Sunday. So this might not apply to everybody. But many, many people on Sundays, quiet day. So Sunday is the setup, it will be the zero and I will start day and I will go I will call that food. So what I will do is mkd let’s be late plus g.And I will do this on a Monday on a Sunday. And there you go. And then I will I believe I’m tested just to make sure everything was working not. Okay, that’s great. So what I will do now is take a full backup. So I will do time just to measure how long this is also a good idea when you’re taking backups. measure how long big take, you can do this with time, or you can sit counter for time. See, though, when you start and then take the time when you finish and calculate the delta, whatever. But it’s important to have timing about your pickups. That helps you estimate recovery times it helps you estimate operations in the future when you’re taking backups for providing replicas to provide developer with a testing environment or you know, whatever backups are very important operative assets, and it’s good to know how long they take. So I would try an extra backup. I’m gonna use defaults file that contains my user password so I don’t type them in clear. These are your home Turkey my eyes? Yes, sir. There. And I’m gonna say back up. I want to do equals four threads, you know, I’m just gonna want to see abuse. So I have eight CPUs. So yeah, four threads is quite good. And a simple press equals four as well. And good is going to be more ups and again, each week. And I’m gonna, you know, pretend we are on Sunday. We’re obviously not, but I’m just going to pretend. And that’s it. This is going to produce backup usually. Of course, you have to be rude. So, extra backup. Let me tell you a bit about how extra backup works because it’s interesting to know. Basically, Excel backup requires direct access to the file system. So you can not run external backup remotely. You can access the note remotely through SSH and trigger the backup remotely. But you cannot run extra backup on one machine to go to backup the files of another machine. So extra backup requires direct file system access that the user that I was trying to use did not have permissions by the operating system. than to access those files. So I got ballot, and the other part, Excel backup will start copying the funds from the file system. And as it copies the files, it’s also gonna keep a copy of all the events in the redo log. So, it’s gonna keep a huge redo log stream for all the duration of the backup. And so when you finish, you have crashes. That said, like, the data files you have, will look as a crash it to MySQL, you know, if you started that instance, MySQL will say, these data files are crushed, like your blog and you pull it out the blog, while this instance was running, that’s how they’re gonna look for MySQL. And then, with the redo log, we can do the crash recovery, and bring that copy of the files in sync and make them usable. And that’s how Excel backup works. Basically, you get a copy of the files that is not workable, and you get a copy of the redo log that will make them workable. So we’re gonna see how this works. So now Yeah, with sudo, I start my backup, it’s gonna take about a minute, I’m just gonna do my Google and stuff and doing some math. So Extra, extra backup is great. Because in newer versions, if you’re not doing any ALTER TABLE, drop tables, truncate table, you know, that kind of data definition operations, then it’s basically nonblocking, you know, it’s not gonna cause any locks. Just very, very briefly for well, in it zero is not going to cause any luck, in fact, seven that are still some my Eisah funds. And those will require you to, to actually lock the server to have a consistent copy of those. But otherwise, you know, here the guy did flush alright to be no locks, no recent look, look for backup somewhere, it must have done look for backup. I cannot see it. But you can see it does unlock tables, it’s because it actually look at for backup, which is just looking the metadata and not the table itself. So you can operate on them. You can just not do ALTER TABLE DROP TABLE, etc. All right. And if I look at my backups here, I have backup, it’s full of these QP files. That is compress. Because somewhere there must be a compress installed, although we have it built in actually now. So with Kubernetes This is the Kubernetes extension, I mean, and what we’re going to do now if we wanted to use that backup is we’ll do sudo slotback. Like a vehicle brands, target the pupils what you don’t like Cooper’s command not found. Okay, so it’s not good thing. This, sir, this is coming from our tools report, I guess or from a bill. But if you don’t find it in your, if you do yum install and it doesn’t work for you. It’s for sure in the EPL repo. So with that installed, we’re going to be able to run the compress. Again, it’s going to take some time, it hopes not a bad idea to also do this.

To do time, you know, so you can actually get a timing estimate for your operations. It’s good to learn about how long it takes on your site, like I can give you a time estimate. It’s 90% of the time is going to be inaccurate because of TAS CPU, RAM. You know, compress how much can the files get compressed. So everything comes in And it’s really hard to give time estimates. So it’s best to learn about your timings through a bit of experience experimentation. And just using time, it’s very simple. So I’m gonna restore this backup. I’m gonna show you a very nice trick. Let me find my snippets. Start instance. There you go. So okay, now it got a lot of QP files that you can remove, I forgot there is another switch to remove the files as you decompress. I completely forgot which one is this, I have all that scripted. So I don’t remember the manual to be honest. But it’s very easy.

The third there so that you have a backup, this backup is not usable. This backup as it is right now. It’s in a crashing state, like we said before, it’s waiting to be prepared. And I can tell that because I see this extra backup log file. Meaning that the redo log extra backup capture ID, which is started here, has not been yet been applied to this task. So I’m going to again, go the XtraBackup, prepare, the gonna do nothing else, this is basic. Again, this is gonna bring the backup into steady state, instead of being in two in a crashing state is going to be in a healthy state, it’s going to be as if you had clean we shut down the database before copying files. And we can see

here, you can see it’s still in this, oh, the database was not shut down normally. And you know, extra backup, it’s, it’s based on the same inner DB libraries we use inside my sequel. So you’re gonna see a lot of the same messages here. And does the crash recovery, it goes through it. And it applies the redo log, then applies the undo log. And with that, it completes the crash recovery. Now, these things thing is usable. And I can tell because while I see the log file here, I also see the IB log files here. And those are the redo logs that were producing after the crash recovery. So what I’m gonna do now is going to connect the, I’m gonna create an auxiliary directory. And I’m just gonna let me see. So I’m going to start MySQL with that data directory. And I’m just going to the rest is going to be all fake 433 And then the error log in the auxiliary the socket. So with this simple command, you can start an instance bring up these data set. You could also of course what can we add we can be B to G because that’s what I see here. And to be just to have some buffer for Okay, and if that weren’t good, I should be able to do this

It works. So you can see I’m connected to different ports. So I could also connect to three, three or six animals still. So why is this so normally you will do this in the same instance, in the same server host, where you’re running your production, you will have to stream this to a remote server. And then you can start the backup like this, to ensure the backup actually works. And more importantly, you can do this, my sequel should be I’m gonna, I’m gonna put my password on the screen, though it is today, reverse B. Please don’t copy that.

Dave Stokes:
Password URIs is on this repeat for this purpose. So there’s no correlation between this and anything like a checking account?

Marcos Albe:
It’s just my secondary bank account. But it’s fine. You can take note, I hope that works, actually. Yeah, I think MySQL shell didn’t know about the false flag. You move this be this is there you go. And what you will do is you do the taxes, very easy. Again, I am connected through to the backup and restore. If I’m able to complete a logical backup, that means every clustered key within InnoDB is in good shape. So I can rely on this backup, this backup is good and holy. And hold sorry, good and hold. And you can rely that this backup will be restorable next time. And, you know, like, you don’t have to run to prepare. Next time you’re going to use it. So that’s a very big plus. Because today, the Prepare was very short. Because I had no traffic on the instance. But on an instance with heavy traffic, you will actually suffer some weight when doing the the deeper step because it has to do crash recovery. And if you had a lot of writes, well taking the backup, crash recovery is gonna take quite a while. So having a prepared backup and testing the backup is gonna speed up recovery and it’s going to guaranteed recovery. So looks sounds like kind of important. So here that being says you just pass the directory you want to do these backups key person, the person, in this case, is for sure. I think that’s of course, there you go. And I want to send it to you instance of go to 43. Think that’s good and beautiful. And it even gives you progress and whatnot. So it’s nicer than MySQL dot php, right. And again, it’s doing multi-thread dumping. This case there’s a huge table so that one table is probably blocking everything and the other threads are just idle

I like how it changes the speed according to which table is taking the backup, right? If it’s if it’s a wide row, there’s not going to be able to do as many rows per second. And if it’s like a small rose, then it can do 600k per second. Wow. So not all tables are made the same, you can tell right? Like it will start being near 600k. Now it’s doing 10k. This probably has JSON columns. And that’s how it gets to be done a slope.

Dave Stokes:
Didn’t really JSON was slow down the backups.

Marcos Albe:
You have Jason, you should know this.

Dave Stokes:
I really have not really considered from the backup side before.

Marcos Albe:
You have something to add to that group.

Dave Stokes:
Well, what do you think about being binary blog? I could see why it slows it down. But I’ve never it never occurred to me before this. Now. Now, what’s interesting to me is the relative speed between the two of these between XtraBackup and util, dump instance.

Marcos Albe:
Yeah, I did not time it this. I gave you that advice. But I forgot to time it because I was sure this guy was going to give us the total time. I am not now not so sure. I recall that he gave us like the progress. So you know, I assume the timing will be there. But doesn’t look like it.

Dave Stokes:
Maybe it could be because I know extra backup is just amazingly fast compared to, you know, MySQL dump or enterprise, or Lascaux backup.

Marcos Albe:
Okay, here’s the deal. That is not necessarily always true? Well, yes, many times, it’s gonna be faster. And look, it’s good gave us a time, that’s great. So it’s about double the time to them. But imagine you have more Index System data, which is not uncommon, you know, many people insist on doing their analytic queries on their transactional database. And they over index things, they create those kinds of covering indexes, or just 1000s of indexes. So you could end up with more indexes than data. And in the logical dump, you don’t get the in the secondary indexes. So it’s gonna be smaller that it’s always for sure. But it might also be faster. Again, also, imagine you have very big IB data, one, because you know, you had some undo log growth for a long transaction or whatever. Or that you have a lot of fragmentation, like you have some tables that grew into the hundreds of gigabytes, you know, and then you do delete of health that table, that day, we still going to be 100 gigabytes. So MySQL dump is only going to dump whatever amount of gigabytes for the data that is left, extra backup is going to be the 100 gigabytes. So in that sense, it’s not always true that producing the backup is going to be faster with extra backup, what it’s always almost always going to be true is that restoring the backup with extra backup is faster, because then you don’t have to run the insert statements again, and you don’t have to create the secondary indexes again. And if you and you are a good guy, and you already prepare your backup, ahead of time, instead of waiting until the day of the disaster to test your backup. So because you’re a good guy, you don’t have to wait for the prepared time either. So restore that backup is going to be much faster. Yes. Very cool. Have you seen it? Not always true, but restoring always? Okay, so

Dave Stokes:
I have a completely correct backup. Well, that takes a little bit longer than to have one that kind of leaves something out if you don’t find out till after your, your restore the data.

Marcos Albe:
Yes. And, you know, Benjamin Franklin said there are three things certain in life, death, taxes and data corruption. So

Dave Stokes:
Ahead of his time in more than one way.

Marcos Albe:
And absolutely, absolutely. So yeah, it’s just unthinkable To not test your backups, really, it’s not a backup procedure, unless the procedure includes testing. This was all the testing that was necessary. This is all you need to do, right? It’s like just do a logical backup, restore your instance, somewhere else, you know, copy the instance to remote destination with a single command, you know, I just created one directory and with a single command, I brought up the backup, after preparing, and I was able to verify that it’s fully usable. Now I know this backup is good. Now I can tell the data and this backup is always going to be accessible. And you know, we can go to sleep. And if disaster strikes tomorrow, then we are okay. And we are okay because I also have binary logs enabled, which is the second piece of let’s bring things up to date. But before we go into binary logs, I’m going to show you incremental backups. So I took a full backup, I fully prepared it, so I’m gonna have to wipe out this one. Because for incremental so you can not have fully prepared backups. This is kind of a pain in the neck. But we’re gonna see, like, I guess I will tell you what I will do. Like, I will go this route, you know, to have certainty. And to me, after enough years, dealing with disasters from hundreds of customers, I can tell you certainty is invaluable. It’s like having absolute trust in your backups. It’s really something that is priceless is so beautiful, that the day things go wrong, you can be adamant on you know, we’re gonna get out of this because I have good backups. So it will take a bit more work, but it’s totally worth it. So what I’m gonna do is I’m gonna make the fake remote so fake remote is just I’m gonna pretend this is somewhere else okay. And I’m gonna make give fake remote backups. Thank you amount of disk space I’m going to propose using might be obscene for some people. Again, it all depends on how many guarantees you want for your backups. And if you have income incremented if you want to do incremental backups, things do get a bit more complicated. So let’s see how this goes. Basically, what I’m going to do is surprise.

I’m going to start the screen out and use the marks right.

Dave Stokes:
Now you’re basically doing a full backup on a Sunday. Do you do incrementals for the rest of the week? Or is it like that?

Marcos Albe:
That will be you know, I see people doing incremental twice a day. So it’s also okay, like, I don’t see anything that is, you know, there is this thing where they can incremental, like if you have a very busy database, like a very busy database and you have a huge dataset the other day, I got some game with that 10 terabytes or 33 terabytes. You know, like, massive. That’s it. He was taking a backup with extra backup. And he was never able to keep up with the while he was preparing. The being locks will run away. You know, he had perhaps too short binlog retention because the DML volume like the amount of insert delete updates going on there was huge, so he couldn’t have eternal retention for the bean logs. And in the end, what he did was took a backup, then prepare the backup. And then started taking incrementals. You know, elementals too can always take because they are just based on a lesson. And the lesson is recorded on the files. And so we can always go back to that innocent and you take incremental and incremental take shorter time to prepare. So, in the end, he was able to, you know, have a backup that was working. And that was close enough to the beam locks. So, it there’s nothing wrong, like with taking multiple incremental backups, of course, reading and scanning to look up change pages and everything, it does have some impact on performance, like there is nothing free when you’re doing operations in the database. They’re never free never. So it’s going to put some external, some load on the database, but it shouldn’t be like horrible. So yeah, I will do it once per day. But again, you can do it more than once per day. And that is the beauty of doing week day number right? That will give you like very easy to to find like oh I want to go to last week, you know, to Thursday and it’s easy to just locate the backups and to do calculations you know you just add one plus one plus minus one and you can find weeks back and days back so I don’t have a screen anyway I’ll just connect again to this let me duplicate this chip rocky let me find the

so I’m gonna do my directory for the day plus and I’m gonna say it’s full because in an imaginary Sunday what you don’t like oh percent no plus so I have the full I’m gonna go here to have NC so I use Netcat you can use socat or whatever else. I just don’t remember so cat top of my head I have it somewhere written down but I do remember in cat so I’m gonna say in cat listen on line 29 So please open a port 999 Only listen for the camera streaming there. And then whatever comes through that port just send it to right here so it’s XP stream let’s see. See I pretty sure just let me take a look at my notes. Just one seconds

yeah. This ex wants to see and it’s gonna be here. So, I set up the listener and and then let me make these folds somewhat larger. So Netcat listen on this port and whatever comes on this board by bit to XP stream, which is going to extract and is going to output on this directory so With that I can repeat my extra backup if you’re someone who cares extra there Yeah. So I’m just gonna say target beard slash TV. And that is target deal in this time is just going to store the temporary files, the files that cannot be streamed live because we’re writing to them while the backup is happening. Namely the file that we want that is going to be sorted here is the extra backup log file which is going to be copied Once the backup is finished. So and then I’m gonna say we’re gonna stream it string equals experience clean pipe I’m gonna say Netcat to 701 with never remember Yeah, it’s just like this. So back up sudo because otherwise it’s not gonna work alright, tables and in this it won’t see anything but if we actually there you go. So you can see it’s growing red, okay, there it. How it grows and goes up and down. But basically you can see that I wonder why it goes up and down so drastically. But again, it’s copying files. It’s just gonna take time apps already finish it there you go. So we got our first backup we are going to copy a file from this backup which is the checkpoints file so this is one file you’re gonna have to bring back from the remote destination and you’re gonna see it’s uncompressed and it’s good that is compressed becausecthat guy is telling us the lsn for the backup, we should do some let’s do something. Okay, they’re gonna have some rows. So, I added some rows because I want the logical sequence number to advance so we’re going to have something to increment the backup in the back. Oh, well, I forgot, you can actually, when you’re doing the backup, you could actually save it in a next like, keep it somewhere else so what do you have, okay? Because I’m gonna put cool, it’s gonna be that, okay? So what you do is extra backup you back up and then you do all the work compress and target the equals and then you do rather always take notes, this is what it wants. So you will have like that we’ll do the same and set up the listener. So we stream here because by the late just going to participants.

Again, this is the key part. When you have one, when you’re starting an incremental view, you want to point incremental basically to a directory that contains the LSM. So you know, actually and when you’re doing the backup, I forgot about this to be honest. You don’t have to copy back like you can just simply use this flag that is going to keep an extra copy of extra backup checkpoints locally. So you don’t have to copy back that it was made precisely for that. Originally that didn’t exist like when incremental backups came out that didn’t exist what you know like what they did wrong the faults and of course, when it went to start streaming my pipe was broken because the previous attempt of the backup failed. So to open the pipe again. You will see there’s nothing here. So you open the pipe again and I think again, this time is working. So again, this guy is going to To start at let’s see where it’s telling us where it’s going to tell us it started at Ellison x that’s weird unseat it will only give it us at the very end we’ll see in it Okay, so we can go here

Okay, there you go. It’s the other way. So, to listen here and see to Ellison is the same as from Ellison here right? The full backup has from Ellison zero. And the incremental backup has from Ellison matching the last tool a set of the full backup. So you know if I have been smart, and I would have kept my extra backup checkpoints on the other side, I wouldn’t have to go around login files back and forth but have corrected next time this is my local machine right around victory two three just gonna copy from I hope I didn’t deleted the wrong files, we’re gonna find that person there you go. Verify something very quickly. Okay, I didn’t delete the file status. So I’m gonna take another incremental backup, I’m just gonna add more rows. It will be this. Okay, then you have many more rows.

All right, so now we have a bunch of rows.

And I’m gonna do the right thing. This time. I’m gonna create locally, backups to teach you. So today is Tuesday. So today, it’s Tuesday. And what we’re gonna do is that silly thing I’ve been forgetting incremental basis is going to be one. Because, again, I want to point the incremental bass team to the directory that contains the extra backup checkpoints file for the last backup from where I want to take the Delta. If I want this again to full, I’m going to take a delta that is also going to contain the things from Monday. And not only the things between Monday and Tuesday, it’s going to paint the whole of Monday and Tuesday. So again, if I point again too full, it will start again from this Ellison. And actually, we want to start from these other innocent, because we already have an incremental that covers this range. So I want to point to this other checkpoint file. So it’s going to start at this point. Hope that makes more sense now. So I’m moving, I copied, again, the lsn, the extra backup checkpoints here. And I’m actually gonna stop doing the silly stuff. I created local directory, not in the remote, but in the local. You can say extra in equals law, and it’s going to keep a copy of the checkpoints for me in there. And this should all be erased. No, of course, because we need to create the listener on the side. Here, I’m on the fake remote. To know There you go. That works. Hopefully Oh yes, extra that’s what I never tried to remember variables. So it’s streaming. Again, it’s using and we’re gonna verify that in a second.

Dave Stokes:
Now for restoration purposes, would it be faster to restore to incremental? No, a separate Monday and a separate Tuesday, then a combined Monday Tuesday incremental?

Marcos Albe:
Well, it’s, that’s a good question. If so, if you do larger deltas, you’re going to have to copy more pages. So copying more pages takes more time, that’s for sure. And then if you have to scan more pages, you’re also going to use buffer pool and everything else. And you can see it’s taking less time. So let’s see how long it took on the previous one, which was smaller amount of rows, right?

Dave Stokes:
I was just going over the old Linux dump command where they’d have you do the level zero dump on Sunday, a level three on Monday, a two on Tuesday and a one on Wednesday to kind of keep the incremental sizes down and then go. So those are our heavyweight process to restore things in number of cakes, but no one uses tapes except me anymore.

Marcos Albe:
You can see look, our full backup was four gigs. The first one which only had a handful of rows, right? It’s less than one megabyte. And the second one where I added a bunch of rows, right, like I added a few 1000 rows, it, it became a few megabytes. So again, if I had combined it both like this is really taking only the deltas, you can see it’s effectively taking the deltas, right. So as you take those deltas, if you if I will produce a backup on Tuesday, that will encompass both Monday and Tuesday. It’s going to be a larger and every incremental is going to become larger. Like as we’re gonna see now, there is a Haiti situation. Where how do we test these backups? Now? We were super happy. You know, we’re taking incremental backups, everything was working. Okay. You know, it’s Tuesday, and we got backups until today. Excellent. Excellent. Excellent. Now We certainly have to test the backups, right? Like how to add to disk backups, we have to prepare. But you cannot prepare a backup, you cannot fully prepare a backup, if you want to actually be able to apply the incrementals. So if I go ahead and I do a prepare on this Saturday on this directory, I will not be able to later apply the deltas, I will be stuck with my full backup and this is going to be garbage. So that’s why I created a backup system directory where I will copy the backup basically, so I can actually prepare the backup and actually do the restore and test it. So it Why Is it painful, because when you do the incrementals you’re going to have like, let me show you the process and it’s gonna be more telling than me babbling. So, testing again, I don’t expect a single human to do this, right, I slept on a script to do this really, don’t don’t go there typing commands, like have someone sit down and write the script, spend some days of your week writing script is gonna save you time. It’s going to think make things more reliable. And it’s going to make these make sense, right? Because this doesn’t have any sense as we are now. Like if I have to do all this manually, of course, nobody’s going to do it. Bad habits. Um, so here comes the pain, like perhaps you don’t compress your backups, who cares? It’s just a step. But the point is, you’re gonna have to repeat the decompress, all almost prepare, you’re not going to fully prepare, you’re going to almost prepare and then apply the incremental and once you apply the incremental, do the full prepare, and then test for that day. The next day, you’re going to have to copy again uncompress all almost prepare, apply the incremental almost prepare again and then apply the new incremental and then fully prepare and finally test so you have to start from scratch actually to test the instrumentals. You have to start the applying from scratch because otherwise you have no way to test that’s what I’m trying to say.

I will find the switch to delete the QPS automatically forget about that. So what I’m gonna do now is said extra prepare apply only is it like that or is just applying not only just take a quick look yes I know recall if it takes to prepare or if it just takes the Apply preferably not only and that looks good computer. So I’m gonna go back and I’m going to do the same apply log only targeted. And now I’m going to say that it’s so if I wanted to test this backup, I will have to remove I would have to run again with without the Apply log only, what we do with this is we just apply the redo log but not the rollbacks. So we can keep going forward and be consistent with the lessons we have recorded. So once we if I remove this is also going to apply the rollbacks and we’re not going to be able to apply more incrementals so you don’t want to do that as long as you want to apply incremental that’s the whole thing like this actually you’ll have to copy nothing this the incremental it’s gonna be you have to decompress first watching I almost remembered the name to watch this There you go. And now back backups testing extra bulk apply only and then don’t get create another point to the full one target date is always full one and then into the equals sorry, like remote backups. So being connected is going to remain on touch, we’re always going to be working, and everything is going to be applied on the phone. What I already did the partial apply. So they already did apply only on this one. What I’m going to do now is apply only. And I’m going to apply this incremental basically, that’s what I mean here. There you go. And so, again, if I want to test and here is the painful part, if I want to test if they on Monday, you know when the incremental backup is done, if I wanted to test this incremental backup, I need to go ahead and do Exarcheia. Equals full, I’m on the backup test in red. So I’m going to be fully preparing the backup. And after I do this, I will not be able to apply more incrementals. That’s why I copied this to a backup testing folder. That’s why I don’t do it in the original copy. I make a copy. So I can over and over apply added incremental I hope that makes sense to you.

Dave Stokes:
Well, we’ve been on for an hour and 15 minutes almost. So we should probably

Marcos Albe:
All right, we can wrap up about here. And just let me show you one last very quick bit that I’m not going to be able to get to showcase utterly today. I’m gonna stop sharing the screen. I’m going to share another screen to the window. So while you could go ahead and build yourself all the scripts you want If you want something easy, and that can get you out of the bad spot of not having backups somewhat automated, you could set up PMM. Then you could go to Settings, go to Advanced Settings, and go to enable backup management. Once you enable backup management, you’re going to have this tool here, which I don’t know what that icon means, but I guess it means go back to your backup. And backup inventory you’re gonna have and here is the piece that failed for me today is the storage locations, it’s going to have allow you to right now it only support s3, right? So you will use your s3 endpoint here, I was trying to use this one. And okay, whatever.

But suddenly they the location disguises, specs to give a location and I’m not able to enter it here. And without the location is not working for whatever reason. I think my s3 server expects to have US West one or USGS two, whatever. So I couldn’t get it to work. But this is what I will recommend. Right? It’s like, learn to use extra backup because it’s what these videos in the backend, and learn to take manual logical backups, but try to automate it and try to rely on some professional grade product for your backups. Because you know, it’s what’s going to get you out of the bad spot if your data ever goes bad. So professional to especially if it’s free and open source like this one. It’s a very good idea to have around. And with that, I will stop sharing and wrap up. I hope that was useful very early. I’m not sure if we had any questions. No, there is no windows version of xtrabackup. I see Silverstein was also stolen. Again, there is no windows alternative. I guess if you were running on Docker, like if you had Docker instances, you might get it to run I never tested to be honest. We don’t truly have more than a handful of customers running Windows. And they all use file system level snapshots or logical backups. And that one’s fun.

Dave Stokes:
All right. Never be too paranoid about your the safety of your data. Well, and thanks for everyone who are logged in and the folks who will watch this in the future. And we will have more of these in the series. And thank you all for coming in. So have a great day.

Marcos Albe:
Have a good day everybody. Thanks for joining us, bye. ∎

Marcos Albe

Principal Support Engineer, Percona

After 12 years working as a developer for local and remote firms, Marcos decided to pursuit true love and become full time DBA, so he has been doing MySQL Support at Percona for the past 8+ years, providing lead web properties with advise on anything-MySQL and in-depth system performance analysis.

See all talks by Marcos Albe »

Matt Yonkovit

The HOSS, Percona

Matt is currently working as the Head of Open Source Strategy (HOSS) for Percona, a leader in open source database software and services. He has over 15 years of experience in the open source industry including over 10 years of executive-level experience leading open source teams. Matt’s experience merges the technical and business aspects of the open source database experience with both a passion for hands on development and management and the leadership of building strong teams. During his time he has created or managed business units responsible for service delivery ( consulting, support, and managed services ), customer success, product management, marketing, and operations. He currently leads efforts around Percona’s OSPO, community, and developer relations efforts. He hosts the HOSS talks FOSS podcast, writes regularly, and shares his MySQL and PostgreSQL knowledge as often as possible.

See all talks by Matt Yonkovit »

Comments

Percona Live 2023 - CFP
✎ Edit this page on GitHub