Apr 29, 2011

This month marks the 18th installment of T-SQL Tuesday. T-SQL Tuesday was started in December 2009 by Adam Machanic (blog | @AdamMachanic) and is defined as “a recurring, revolving blog party”. Any blogger wishing to participate is invited to write about the given topic, chosen by whoever is hosting that particular month. I’m very pleased to be hosting this time around for the T-SQL Tuesday taking place on May 10, 2011!

The Topic

This month’s topic is CTEs, or Common Table Expressions. Had you asked me 10 years ago what CTE meant, I would have replied “coefficient of thermal expansion” but that was back in my semiconductor & electronic materials phase. I like the database version much better :)
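For anyone who'd like a quick refresher before writing their post, a minimal CTE looks like this (the table and column names here are made up purely for illustration):

```sql
-- A CTE is defined with WITH ... AS and exists only for the
-- duration of the single statement that follows it.
WITH RecentOrders AS (
    SELECT CustomerID, OrderDate
    FROM dbo.Orders                -- hypothetical table
    WHERE OrderDate >= '20110401'
)
SELECT CustomerID, COUNT(*) AS OrderCount
FROM RecentOrders
GROUP BY CustomerID;
```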

Have you ever solved or created a problem by using CTEs? Got a tip, trick, or something nifty to share? I’d love to see your posts about any of the above. Also don’t forget that T-SQL Tuesday is not limited to only T-SQL:

“Any post that is related to both SQL Server and the theme is fair game. So feel free to post about SSIS, SSRS, Java integration, or whatever other technologies you’re working with in conjunction with SQL Server. Even if your post includes no T-SQL we still want to see it.”

The Rules

– Your post must be published between 00:00 GMT Tuesday May 10, 2011, and 00:00 GMT Wednesday May 11, 2011.
– Your post must contain the T-SQL Tuesday logo (see above) and the image must link back to this blog post.
– Trackbacks should work, but if not please put a link to your post in the comments section so everyone can see your contribution!

Bonus Points

– Include “T-SQL Tuesday #18” in your blog post’s title.
– Mention “coefficient of thermal expansion” in your post.
– Tweet about your post and include the #tsql2sday hashtag.
– Contact Adam Machanic and tell him you’d like to host!

I’ll be reading through all the posts and publishing a summary shortly thereafter. I can’t wait to see what everyone has to say!

Edit: I originally posted this a little too early and also with incorrect dates. Everything has now been corrected, and my apologies for any confusion.

Apr 20, 2011

This weekend I took and passed Microsoft Exam 70-451 (MCITP: Database Developer 2008). I’m pleased to say my score on this exam was much higher than its prerequisite, 70-433, which I wrote about a few months ago. Much like last time, I am happy to offer my thoughts for those who might be taking this test in the future.

How I Studied

I won’t go as far as saying that I didn’t study this time around. I did do some studying, albeit not much. After reading through the list of topics covered I decided to direct my efforts towards a few areas I felt I could benefit the most from reviewing. For instance, I felt I could benefit much more from studying transaction and concurrency strategies than from studying tables and programming objects.

I did not buy any books, study guides, or practice exams, because I felt I could do fine by reading BOL and blog posts about specific topics. I have nothing against those types of study materials and have benefited from them in the past; I just didn’t feel the need for them this time around. I also had the added security of a second chance, as I purchased this exam as part of a Microsoft Certification Pack, which is a great way to go if you plan on taking multiple exams. Buying them in this manner gives you a discount on multiple exams as well as a free “Second Shot” retake for each one.

Subject Matter

For this exam I was rather pleased with the variety of subject matter covered. I thought the test matched up with the published list of topics quite nicely, and I did not feel there was a disproportionate balance of subject matter like I did with 70-433. One thing I did notice was that some topics in the list did not have specific questions dedicated to them; however, those topics were usually referenced in other ways, such as appearing as an answer choice for one or more questions.

It seems that the more of these tests I take, the greater my understanding of the intent of the questions. I think the test writers do an excellent job of coming up with scenarios that you’re likely to encounter in real life as opposed to the world of ideal database design. For instance, if a question were to start off with “You are designing a database to store multiple terabytes of video” my first thought would be that I probably wouldn’t design a system that way. True as that may be, it’s quite possible to start a job or consulting gig at a place that has such a system deployed and can’t easily be changed. In that case the best option may be to work with it (at least for the short term).

Test Taking Skills

Once again I can’t overstress the importance of test-taking skills. Unless you are a complete genius, there’s probably going to be at least one question on this (or any) test that you don’t know the answer to. At that point your options are:

A) Leave the question unanswered
B) Pick an answer at random
C) Use your testing skills to eliminate incorrect choices and select an answer from those remaining

Needless to say your best chance of answering correctly lies with option C. When I’ve needed to do this, I’ve always been able to eliminate at least two choices. Here’s an example:

Q: Rachel’s query is the bestest EVAR. What hint did she use when accessing the docHistory table?

A) NOLOCK
B) READPAST
C) RECOMPILE
D) READUNCOMMITTED

We know nothing about Rachel or her query other than it’s on docHistory, so it’s a guessing game at this point. One thing we do know is that NOLOCK and READUNCOMMITTED are equivalent, so A and D can’t be the answer. On top of that RECOMPILE is not a table hint but a query hint, so C is out as well, meaning the answer must be B. Even if you had no clue the READPAST table hint existed, you can arrive at that answer by elimination.
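To illustrate the equivalence that lets us throw out A and D, consider the following (the table name is just carried over from the made-up question; the behavior of the hints is the real point):

```sql
-- NOLOCK and READUNCOMMITTED are synonyms as table hints: both
-- read uncommitted ("dirty") data without taking shared locks,
-- so these two queries behave identically.
SELECT * FROM dbo.docHistory WITH (NOLOCK);
SELECT * FROM dbo.docHistory WITH (READUNCOMMITTED);

-- READPAST, by contrast, skips rows that are currently locked
-- by other transactions rather than waiting on them.
SELECT * FROM dbo.docHistory WITH (READPAST);
```

Since a multiple-choice question can only have one correct answer, two choices that mean exactly the same thing can both be eliminated.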

What’s Next

After passing this test I have now completed the prerequisites for the Microsoft Certified Master: SQL Server 2008 knowledge exam. I’ve already had a few people ask me if I’m going to take it. HECK YES! It’s definitely on my list of things to do, but I will be taking some time to prepare for it first. I have no doubt in my mind that I will get there and will work hard in doing so; the only question now is how long it will take me.

Apr 12, 2011

This month’s T-SQL Tuesday is the brainchild of Matt Velic (blog | twitter) and was inspired by a quote from Adam Machanic (blog | twitter) on the importance of understanding the APPLY operator. It’s a good trick to have up your sleeve, as it can do some nifty things and really save you some keystrokes when used properly.

My first experience with APPLY occurred many moons ago when I was working with a system that required a pipe-delimited list of values. Fortunately, whoever developed it long ago had the good sense to store the data properly (first normal form) in tables and only build the pipe-delimited lists at query time. That being said, the queries used cursors to build the lists iteratively. This all worked fine, but the performance left a lot to be desired.
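I no longer have the original code, but the cursor-based approach looked roughly like this sketch (the hard-coded SalesOrderID and variable names are illustrative; the real code looped over every order):

```sql
-- Iteratively append each ProductID for one order to a
-- pipe-delimited string, one fetch at a time.
DECLARE @list VARCHAR(MAX) = '';
DECLARE @id INT;

DECLARE product_cur CURSOR FAST_FORWARD FOR
    SELECT ProductID
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID = 43659      -- one order at a time
    ORDER BY ProductID;

OPEN product_cur;
FETCH NEXT FROM product_cur INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @list = @list + CONVERT(VARCHAR(10), @id) + '|';
    FETCH NEXT FROM product_cur INTO @id;
END;
CLOSE product_cur;
DEALLOCATE product_cur;

-- Trim the trailing pipe.
SELECT LEFT(@list, LEN(@list) - 1) AS ProductList;
```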

The solution I devised featured the CROSS APPLY operator as well as some trickery with XML to build the string. Here’s what I came up with. When run against the AdventureWorks sample DB this will return a sorted and pipe-delimited list of each ProductID for every SalesOrderID.

-- LEFT(..., LEN(...) - 1) trims the trailing pipe from each list.
SELECT DISTINCT d.SalesOrderID, LEFT(ProdID.list, LEN(ProdID.list) - 1) AS ProductList
FROM Sales.SalesOrderDetail d
CROSS APPLY (
	-- Build a pipe-delimited string of this order's ProductIDs;
	-- FOR XML PATH('') concatenates the rows into a single string.
	SELECT CONVERT(VARCHAR(10), ProductID) + '|' AS [text()]
	FROM Sales.SalesOrderDetail e
	WHERE d.SalesOrderID = e.SalesOrderID
	ORDER BY ProductID
	FOR XML PATH('')
) AS ProdID (list);

CROSS APPLY does a great job of running a subquery that builds the desired strings. Of course there are far more interesting applications of APPLY than this, but this is how I first discovered how helpful it could be. I don’t have the original cursor-laden code to compare against, but it shouldn’t be too much of a stretch to believe that the above query is far faster than iterating and looking for matches.

Happy Coding!

Apr 7, 2011

I’m not a fan of the BETWEEN operator. While its syntax is very easy to understand and requires less typing than my favorite alternative, it can lead to situations where unwanted results are returned. Because of this I’ve gotten into the habit of avoiding it altogether.

If you’d like a refresher on BETWEEN, its syntax works like this:

a BETWEEN b AND c

where a, b, and c are expressions of the same data type. The result will be all values of a falling in the inclusive range between b and c. This means the following statements are equivalent:

WHERE a BETWEEN b AND c

WHERE a >= b AND a <= c

This is pretty simple to understand when you’re querying an integer column: with values between 1 and 5, you would expect to find results corresponding to values in {1,2,3,4,5}. Similarly, if you’re querying a data type in the character family (CHAR, VARCHAR, etc.) you’ll get just what you expect there as well. The monkey wrench gets thrown in when you start working with fractional types such as decimals, or in the case I’d like to illustrate, DATETIME/SMALLDATETIME values.

The Problem
Let’s say you get a request that requires querying all your company’s orders from the month of March, 2011. They are conveniently stored in a table named “Orders”, which has a column of type DATETIME named “OrderDate”. You start writing your query like this:

SELECT *
FROM Orders
WHERE OrderDate BETWEEN '20110301' AND '20110331'

After some testing it’s apparent that many orders from March 31st aren’t appearing in the results as they should. In fact, only orders placed at the exact stroke of midnight (00:00:00.000) on March 31 will be included, because the BETWEEN operator only allows values less than or equal to midnight on March 31. The way comparisons with the DATETIME type work, any value even a fraction of a second past midnight compares greater than the bare date.
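You can see this comparison behavior directly:

```sql
-- Any time component after midnight makes a DATETIME value
-- compare greater than the bare date.
SELECT CASE
    WHEN CAST('20110331 00:00:00.003' AS DATETIME) > CAST('20110331' AS DATETIME)
    THEN 'greater' ELSE 'not greater'
END;   -- returns 'greater'
```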

“Back to the drawing board!” you say, and since you need all those orders from March 31 in your results, the next query you develop extends the date boundary by a day to ensure that all of the 31st is included:

SELECT *
FROM Orders
WHERE OrderDate BETWEEN '20110301' AND '20110401'

Another BETWEEN I don’t like

April fools! This doesn’t work either, because orders placed at the stroke of midnight on April 1 will be included in the result set even though they aren’t desired. The “equal to” component that exists on both sides of BETWEEN is the culprit here. Even though the probability of an order being placed at the exact stroke of midnight is rather slim, it’s still possible, which should be reason enough to ensure it doesn’t happen rather than just crossing one’s fingers and hoping. The chances of this causing an issue are even greater when using the SMALLDATETIME data type, as its rounding to the nearest minute would include any orders placed in the nearly 30-second window from 00:00:00 through 00:00:29.
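The SMALLDATETIME rounding is easy to demonstrate:

```sql
-- SMALLDATETIME has one-minute precision; seconds round to the
-- nearest minute, so 00:00:29 rounds down and 00:00:30 rounds up.
SELECT CAST('20110401 00:00:29' AS SMALLDATETIME);  -- 2011-04-01 00:00:00
SELECT CAST('20110401 00:00:30' AS SMALLDATETIME);  -- 2011-04-01 00:01:00
```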

The Solution
So what’s the easiest way to remedy this? In my mind, it’s dropping BETWEEN and using the good ol’ comparison operators we learned in elementary school:

SELECT *
FROM Orders
WHERE OrderDate >= '20110301'
   AND OrderDate < '20110401'

This leaves little doubt that the entire month of March is being included, as all dates greater than or equal to March 1 and all dates less than April 1 are included. An equivalent query utilizing BETWEEN could be written like this:

SELECT *
FROM Orders
WHERE OrderDate BETWEEN '20110301' AND '20110331 23:59:59.997'

but I’d rather not key in all those decimals if I don’t have to. Hard-coding the last day of the month is also dangerous with regard to leap years (think February 28 versus 29). Specifying dates less than the first day of the following month accommodates month lengths and leap years with no chance for error.

Other Thoughts
It should be noted that the above reasons to avoid BETWEEN do not apply to the DATE type that was introduced in SQL Server 2008. DATE stores only whole dates, just as INT stores only whole numbers. Any of the types that include a time component (DATETIME, SMALLDATETIME, DATETIME2) are affected.
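With DATE there is no time component to sneak past the upper bound, so BETWEEN behaves exactly as you’d expect:

```sql
-- DATE stores no time, so the inclusive upper bound is safe here.
DECLARE @d DATE = '20110331';
SELECT CASE
    WHEN @d BETWEEN '20110301' AND '20110331'
    THEN 'included' ELSE 'excluded'
END;   -- returns 'included'
```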