Many articles on database backups reference running database backups and maintenance processes overnight. In lots of situations this is a great idea, especially if you’re dealing with a transactional system that has users connecting to the database during the day and minimal activity outside of business hours. But for systems with different schedules, night time is not always the right time.
While meeting with a client, they mentioned to me that their database backups take around 7-8 hours to complete. Knowing that their hardware is very similar to mine and that I can back up a database that’s 40 times larger than theirs in a fraction of the time, I suggested they let me investigate the issue.
The first thing I noticed when I took a look at their existing backups was that the backup files were more than twice the size of the entire database. A full backup contains a copy of each page of the database as well as the active portion of the transaction log. (For more details on exactly how much of the log is included, see Paul Randal’s blog post on the subject.) It’s not uncommon for a full backup to be slightly larger than the size of the database due to the portion of the log, however what I saw got me thinking that there was an awful lot of active transaction log present in the backup.
I started looking around to see what time the backup was running and what else it was competing with. It turned out that backups were running overnight, and while end users weren’t connected to the database during that time, a rather large bulk insert process was. The backups and bulk inserts in this case were both significantly hindering each other in what I’ll lovingly call a “paradox of suckitude”:
The backup had to include all of the transaction log being generated by the ongoing load process, which made the backup file very large, and also made the backup process take longer since it had to copy all that extra data.
The bulk insert process was affected not only by all the extra IO generated by the backup, but also by the fact that minimal logging is disabled while a backup is taking place.
There were plenty of other steps that could be taken to increase the speed of their backups, such as striping the backup across multiple files and volumes, utilizing backup compression, and adding additional database files, but before playing with any of that I decided to tackle the timing issue first. I changed the backup schedule so that backups would run during the middle of the workday. Sure the backups would be competing with user activity, but testing showed their impact on database performance to be minimal. Simply changing the schedule caused the backup to finish 4x faster, and the bulk insert process sped up by a factor of 10.
It should come as no surprise that scheduling is very important, and even if a schedule makes sense when it’s initially created, things can change over time. Job schedules are definitely something that should be periodically reviewed to make sure that simple conflicts aren’t robbing you of performance.
July 2012 is the thirty-second installation of T-SQL Tuesday – wow that’s a lot! I can still remember reading posts that were part of the first one…
This month’s topic comes from Erin Stellato (blog | @erinstellato), the newest employee of SQLskills (congrats Erin!!) She’s asking us to track what we do for a day and then write about it – what a nifty idea! Here’s what my workday looked like on Tuesday, July 10:
8:15am Put on pants. I like to wear shorts when biking to the train or walking to the office, so I do work for the first bit of my day in comfort before changing into pants.
8:20am Explain options to improve overall database speed to one of our internal clients. They have lots of room for improvement, but a great first step would be for them to create a filegroup with multiple files – the wait stats on their machine seem to indicate this is an issue.
9:10am Fix an issue with an SSIS package. When external data changes and you don’t know about it, hilarity will ensue :)
9:30am Apply a patch for some vendor software we use.
10:00am Start a test restore of a backup from a randomly selected database. There’s no better way to know that your backups are good than to actually restore them. I try to perform restore tests on a few randomly-selected backups each week, and hope to find time to automate this sometime soon.
10:05am Respond to emails.
10:35am Twitter conversation on Microsoft Management Console.
10:45am Discussion on indexes and how the query optimizer will select the narrowest index that can satisfy a query. Showed devs how even though the optimizer’s choice of index was different than theirs, the performance is way better.
12:30pm Design Review with a developer from our group. This turned into a great discussion on how a lot of the issues they had were fundamental mistakes that could have been prevented had they reached out to myself and the data architects earlier in the process.
2:30pm Meet with developers from another team, explain transactional replication to them and why I think it’s the most appropriate method for us to pull data from their system.
3:15pm Catch up with more emails.
3:45pm Catch bus to train station.
4:20pm Catch train.
5:15pm Retrieve bike.
5:30pm Arrive home.
My favorite part of this job is the amount of time I get to spend helping people. There’s a lot of opportunities for that here, along with plenty of people who are very eager to learn. Today was a great day because I got to give out a bunch of help.
My least favorite part of this job is the killer commute. I love my co-workers, the location, and everything about this place, but it’s a real drag to know that I spend nearly 3.5 hours per day getting to and from work. I do my best to make good use of my travel time by doing things like reading, watching videos, writing blog posts, or anything else that doesn’t require an internet connection, but it’s still a ton of time that I could otherwise be working and fully productive. Hopefully sometime in the future I’ll find a position that will allow me to work remotely!
So that’s my day. It’s actually pretty interesting to look at it like this – thank you Erin for such a great topic!
To say that I was dissatisfied with my previous job is like saying that the surface of the sun is slightly warm. When I resigned, I resisted the urge to blog about it right away, figuring it would be much better to take some time and reflect on it first. A year sounded like a reasonable amount of time, and I left there in May of 2011. Over the past year, some of my feelings mellowed while others didn’t, but I’m still every bit as grateful to have gotten out of there. If you’re in management, lead a team, or are considering that path, here’s a few things I recommend you avoid doing:
Prevent candidates from interviewing with their potential team. I smelled something funny from the very beginning, but didn’t act on it as much as I should have. My interview there wasn’t technical at all. It was held with one person, a director who started off by saying they knew next to nothing about SQL Server, though they were an Access developer many years ago. They said my resume looked very impressive, and basically just asked me to elaborate on a few of the projects I had worked on. Not a single technical question. Even with all the questions I asked them on things like training opportunities, my community involvement, twitter, speaking, and my upcoming wedding, I distinctly remember being out of there in about 45 minutes. I was completely dumbfounded when HR called to offer me the position the next day. I reached out to the community asking for guidance as to if this was normal or not, and Jen McCown (blog | @JenniferMcCown) was kind enough to do an informal poll on her blog about it which yielded a wide range of results. After that I went back and asked if there was any way I could sit down with potential co-workers, but they said it wasn’t necessary. “<Director> is a great judge of who will fit in here, so you’ll be fine”. I still was torn, but accepted the job anyway. I don’t regret it because it led me to where I am now, but I don’t consider it to be one of the best decisions I’ve made either.
Don’t talk to your team.
My team was highly distributed, with my manager and a majority of the members in Eastern Europe. There was one fellow DBA who worked in the Chicago office with me, though. One might think this would mean I’d get some good one-on-one time and instruction from the only co-worker on my team in the office, but not the case. The best way to describe this guy’s personality is to compare him to a jar of mayonnaise. He would say “hi” when he walked in, I’d say “bye” when I left, and there was maybe an hour’s worth of conversation between us the entire year I worked there, no matter how hard I tried. Whenever I came to him with questions about a process or how something was set up, I’d usually get one-word answers, or the occasional four-word “go read the document”, and then be emailed a Word file full of screenshots with little to no descriptions. There are a few friends from there from different teams that I still speak with, but in general I never thought it possible to feel so alone in an office full of people.
Don’t listen to your team.
Part of the company’s “database standard” was that every query was required to use the NOLOCK hint. Every query. No matter what. The irony here is that some queries were returning bad data and nobody could figure out why. When the official error report came out with instructions for reproducing the issue, it was evident that dirty reads were the culprit. When I pointed out that NOLOCK was to blame here and that removing it would fix the problem, I was told that NOLOCK was part of the coding standard and must be used. Despite my attempting to explain how NOLOCK works and citing BOL and various blog posts showing that dirty reads can and will occur when using NOLOCK, they refused to believe it was to blame. To me, this was like ordering a cheeseburger and then complaining there’s cheese on it. I actually got an email from the lead DBA telling me that both Microsoft and I have no clue what we’re talking about when it comes to transaction isolation levels. When I asked if they could then please explain it for me, I was told it was “too complicated” and would be way over my head.
Give your team nothing to do.
I’m not sure they ever had a plan for what I would be doing. The whole time I was there I probably had 2-3 weeks of actual work they let me do. The rest of the time I would ask what I could do and I was told I could read, play games, mess around in the test environment, surf the internet, or do whatever I wanted to fill time. This may sound awesome, but it was really quite boring. I would have felt much better with a heavier workload, but without one, I decided to dedicate my time to professional development and the community. I got my twitter on and started this blog, which I consider to be a pretty wise investment of said time. Judging by the number of science fiction novels on people’s desks and the countless online poker games (for real money) I’d see on their screens, I’m guessing there were a few others would could have used some more work too.
Make your team reinvent the wheel. One of the few projects I was given was to develop a view or procedure that would show all the active processes on a server, their current wait types, and statistics such as the number of reads and writes they’ve performed. When I heard this I realized that Adam Machanic’s sp_whoisactive script would be a shoo-in, as it did everything they were asking and much much more. I forwarded information about it to my superiors, noting that it had been in development for years, is used by thousands of DBAs worldwide, and best of all is free. Their response was that all development must be done in-house, and we could never have code on our servers that was written by “some guy from the internet”.
Deny opportunities for training and professional development.
I blogged about this last year, but to summarize, at the interview I was told there was a budget for appropriate training events. When such a training opportunity appeared, they refused to help pay for training or even give me the time off to attend, because they felt I was “good enough” and didn’t need training. I ended up going on my own time and my own dime, and it was worth every penny.
Play games with vacation time for major life events.
When I interviewed for this job in April of 2010, one point I made sure to bring up was my wedding which was taking place in August of 2011. I was assured there would be no problem taking 2 weeks off to get married, provided my manager approved and I had accrued enough vacation time. I filled out the PTO request on my very first day there and was promptly denied. I was told it was too early to request it, and try again 1 year from the date. Ok fine. When August rolled around I submitted another request and was again denied. My manager told me to try again once I had accrued the 2 weeks of vacation time. When I had 2 weeks of vacation built up (this was probably December at this point) I tried again, and was told to wait until February, when it would be 6 months away. In February (which was 6 months out) I submitted another request, and was told “possibly”, because they had just approved another member of the team for vacation during that time and they didn’t want too many people to be gone at once. WTF?!? I was trying to decide if I would just quit my job before the wedding, or simply make sure nothing of value was left at my desk and not show up for two weeks. Fortunately I found a new position and didn’t have to worry.
Conduct exit interviews in full view of others. When I announced my resignation, my exit interview wasn’t held in a conference room or other private place. Instead the director opted for the lobby, with the receptionist right there and other people coming in and out. I later received a follow-up questionnaire from central HR which asked some standard questions as well as some very interesting ones about pay in the workplace. I’m all for written communication, but in lieu of a face-to-face HR meeting, a phone call with them would have worked wonders. At the very least, it would have showed they were trying to make it look like they care why someone’s leaving. But I’m pretty sure they didn’t.