Apr 23 2018
 

I’ve found myself working with PowerShell more frequently as of late, in no small part due to the amazing dbatools module. This has led to me trying to figure out how else I can utilize it for some of my personal internal processes.

I like to use public datasets for experimentation and presentation demos, especially data that people can easily understand and relate to. For some of them, keeping things up to date was a manual process of downloading files, loading tables, and merging. There are of course many better ways to do this, some more automated than others. I could have simply used PowerShell to call bcp, or even just implemented an insert statement and some loops. Then I found dbatools, which has commands that let me do an even better job with far less work – just the way I like it! Here’s how I now keep my datasets current:

Getting The Data

I’ll be using data from the City of Chicago’s Data Portal. They have a tremendous online resource with lots of public datasets available. One that I really like is their listing of towed vehicles. Any time the city tows or impounds a vehicle, a record gets added here and remains for 90 days. It’s very manageable, with only 10 columns and a few thousand rows. (As an added bonus, you can search for license plates you know and then ask your friends about their experience at the impound lot!)

Chicago’s data portal uses Socrata, which is a very well-documented and easy-to-use tool for exposing data. It has a wonderful API for querying and accessing data, but to keep things simple for this post we’re just going to download a CSV file.

If you’re on the page for a dataset, you can download it by clicking on “Export” on the top right and then selecting “CSV”. To avoid all that, the direct link to download a CSV of this dataset is here. Download it and take a look at what we’ve got using your spreadsheet or text editor of choice (mine is Notepad++).
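If you’d rather not click through the site every time, PowerShell can fetch the file for you. Here’s a minimal sketch using Invoke-WebRequest – the export URL and local path below are placeholders, so substitute the actual CSV link from the Export button and whatever path you like:

# Placeholder values – swap in the real export URL from the portal's
# Export > CSV button, and pick your own local path
$datasetUrl   = "https://data.cityofchicago.org/api/views/<dataset-id>/rows.csv?accessType=DOWNLOAD"
$downloadFile = "C:\Temp\Towed_Vehicles.csv"

# Download the CSV to the local path
Invoke-WebRequest -Uri $datasetUrl -OutFile $downloadFile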

Loading The Data

We’ve got our data; now let’s load it. I like to load the entire downloaded dataset into a staging table, and then copy any rows I haven’t previously seen into the production table that I query from. Here’s the script to create these tables:

-- CREATE STAGE TABLE
CREATE TABLE [dbo].[TowedVehiclesSTG](
	[TowDate] [date] NOT NULL,
	[Make] [nchar](4) NULL,
	[Style] [nchar](2) NULL,
	[Model] [nchar](4) NULL,
	[Color] [nchar](3) NULL,
	[Plate] [nchar](8) NULL,
	[State] [nchar](2) NULL,
	[TowedToFacility] [nvarchar](75) NULL,
	[FacilityPhone] [nchar](14) NULL,
	[ID] [int] NOT NULL
);


-- CREATE FINAL TABLE
CREATE TABLE [dbo].[TowedVehicles](
	[ID] [int] NOT NULL,
	[TowDate] [date] NOT NULL,
	[Make] [nchar](4) NULL,
	[Style] [nchar](2) NULL,
	[Model] [nchar](4) NULL,
	[Color] [nchar](3) NULL,
	[Plate] [nchar](8) NULL,
	[State] [nchar](2) NULL,
	[TowedToFacility] [nvarchar](75) NULL,
	[FacilityPhone] [nchar](14) NULL,
	CONSTRAINT PK_TowedVehicles PRIMARY KEY CLUSTERED (ID)
); 

Now for the magic – let’s load some data! The dbatools command that does all the heavy lifting here is called Import-DbaCsvToSql. It loads CSV files into a SQL Server table quickly and easily. As an added bonus, the entire import happens inside a transaction, so if an error occurs everything gets rolled back. I like to specify my tables and datatypes ahead of time, but if you want to load into a table that doesn’t exist yet, the command will create one and do its best to guess the appropriate datatypes. To use it, simply point it at a CSV file and a SQL Server instance, database, and (optionally) a table. It will take care of the rest.

# Load from CSV into staging table
Import-DbaCsvToSql -Csv $downloadFile -SqlInstance InstanceName -Database TowedVehicles -Table TowedVehiclesSTG `
-Truncate -FirstRowColumns

The two parameters on the second line tell the command to truncate the table before loading, and that the first line of the CSV file contains column names.
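If you want a quick sanity check that the rows actually landed, a simple count against the staging table will do. This sketch uses Invoke-Sqlcmd2 (which also appears in the next step) with the same placeholder instance and database names:

# Optional sanity check: how many rows landed in the staging table?
Invoke-Sqlcmd2 -ServerInstance InstanceName -Database TowedVehicles `
-Query "SELECT COUNT(*) AS StagedRows FROM [dbo].[TowedVehiclesSTG];"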

Now the data has been staged, but since this dataset contains all cars towed over the past 90 days, chances are very good that I already have some of these tows in my production table from a previous download. A simple query to insert all rows from staging into production that aren’t already there will do the trick. This query is run using another dbatools command, Invoke-Sqlcmd2.

# Move new rows from staging into production table
Invoke-Sqlcmd2 -ServerInstance InstanceName -Database TowedVehicles `
-Query "INSERT INTO [dbo].[TowedVehicles]
SELECT
  [ID],
  [TowDate],
  [Make],
  [Style],
  [Model],
  [Color],
  [Plate],
  [State],
  [TowedToFacility],
  [FacilityPhone]
FROM (
  SELECT
    s.*,
    ROW_NUMBER() OVER (PARTITION BY s.ID ORDER BY s.ID) AS n
  FROM [dbo].[TowedVehiclesSTG] s
  LEFT JOIN [dbo].[TowedVehicles] v ON s.ID = v.ID
  WHERE v.ID IS NULL
) a
WHERE a.n = 1"

The ID column uniquely identifies each tow event, and the production table uses it as a primary key; however, I have found that the dataset occasionally contains duplicated rows. The ROW_NUMBER() window function addresses this by ensuring that only one row per ID is inserted.
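If you’re curious whether a particular download actually contains duplicates, a quick query against the staging table will show them. This is just an illustrative check, not part of the load process (same placeholder instance and database names as above):

# See whether the current staging data contains any duplicated IDs
Invoke-Sqlcmd2 -ServerInstance InstanceName -Database TowedVehicles `
-Query "SELECT ID, COUNT(*) AS Copies
FROM [dbo].[TowedVehiclesSTG]
GROUP BY ID
HAVING COUNT(*) > 1;"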

Putting it all together

I’ve shown you how simple dbatools makes it to load a CSV file into a table and then run a query to load from staging into production, but the beauty of PowerShell is that it’s easy to do far more than that. I actually scripted this entire process, including downloading the data! You can download the full PowerShell script, along with a T-SQL script for creating the tables, from my GitHub here.
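For a sense of the overall shape, here’s a condensed sketch of the flow – the URL, instance, database, and paths are all placeholders, and the full script on GitHub is the real reference:

# Condensed sketch of the whole process (placeholder names and paths)
$datasetUrl   = "https://data.cityofchicago.org/api/views/<dataset-id>/rows.csv?accessType=DOWNLOAD"
$downloadFile = "C:\Temp\Towed_Vehicles.csv"

# 1. Download the latest CSV from the data portal
Invoke-WebRequest -Uri $datasetUrl -OutFile $downloadFile

# 2. Load it into the staging table
Import-DbaCsvToSql -Csv $downloadFile -SqlInstance InstanceName -Database TowedVehicles `
-Table TowedVehiclesSTG -Truncate -FirstRowColumns

# 3. Copy new rows into production ($mergeQuery holds the INSERT statement shown earlier)
Invoke-Sqlcmd2 -ServerInstance InstanceName -Database TowedVehicles -Query $mergeQuery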

Happy Data Loading!

Apr 12 2018
 

Next week I am happy to be presenting for the Chicago Suburban SQL Server User Group, which meets in Downers Grove, IL.

I will be delivering my session Locks, Blocks, and Snapshots: Maximizing Database Concurrency. This takes a look at SQL Server’s available isolation levels, what they are, how they differ, and why the default setting of “read committed” may not be appropriate for your workload. It should not only be helpful, but lots of fun as well.

I hope you can join us! The meeting is on Tuesday, April 17, and you can register and find further details here!

Mar 26 2018
 

I’m very pleased to be returning to Wisconsin in April to speak at SQL Saturday Madison 2018! This has always been one of my favorite events. Not only is it close to home, but it’s in a wonderful location and full of great friends. Thank you so much to the organizing committee for all of the hard work you do!

This year I will be presenting Locks, Blocks, and Snapshots: Maximizing Database Concurrency. This light-hearted and informative session will take a look at SQL Server’s available isolation levels, what they are, how they differ, and why the default setting of “read committed” may not be appropriate for your workload. It should not only be helpful, but lots of fun as well. I hope you will join me!

As of the time of posting, SQL Saturday Madison registration is still open. If you’d like to see some awesome presentations on Microsoft data technologies, register today!

Mar 12 2018
 

I think one of the reasons I like computers is that they’re so much easier to deal with than people. They’re predictable, emotions never get in the way, and there’s always a logical explanation for everything if you dig deep enough. All that being said, I am grateful for my hobbies that don’t involve sitting in front of a computer all day. Thinking through the things I enjoy in my spare time, three central themes keep popping up: art, architecture, and railroads. Here are two examples of that:

Turner Family Christmas Cards

The University of Illinois will always be a special place for both me and my family. I graduated from there, was a proud member of the marching band, and got my start in public speaking there. It’s also the place where I met my wife, and to top things off we were married on campus. We have plenty of Illinois memorabilia of varying degrees of rarity, but the collection I’m most proud of is so obscure that most Illini fans won’t even know it exists.

Fred Turner spent nearly 50 years of his life at the University of Illinois, beginning as a student in 1918 and ending with his retirement in 1966 as the university’s first Dean of Students. Turner loved Illinois and its history, and in 1946 he and his wife, Betty, decided to highlight these topics in the Christmas cards they sent to their friends and family. Turner had recently picked up the Japanese art of woodblock printing, and decided to share it by creating a hand-carved woodblock print of a historical Illinois site each year as the cover of their Christmas card. For an extra tie to campus, the blocks for printmaking were carved from wood salvaged from one of the university’s earliest buildings. The Turners printed 260–275 cards by hand at their kitchen table each year until 1974.

The cards contain a main image on the front, a brief description and Christmas message from the Turners on the inside, and a fact sheet with more information on the historical building or site chosen. Here’s what they are like:

Turner Christmas Cards
I can only imagine that most of these cards have been lost to a combination of time and people just not knowing what they are. I know of only two complete sets of all 29 cards. As for my own collection, I’ve got a long way to go – I only have three. I found my first one five years ago, and have only come across two more since. Our trips to Central Illinois frequently involve trips to flea markets and antique shops, and now you know exactly what I look for.

If you’re curious to see what more of these cards look like, the university archives has a website showing cards from different years.

Railroad Photography

I’m guessing you’ve never heard of O. Winston Link. He was a civil engineer by degree and a photographer by trade, who in 1955 undertook a personal side project which would end up becoming his life’s work. The Norfolk and Western Railway was the last major railroad in the United States to still use steam locomotives at that point, and Link obtained full permission from their president to photograph the last days of the steam era on their railroad, primarily in Virginia and West Virginia coal country. He was even given a key to the railroad’s switch boxes, allowing him to phone dispatchers to obtain exact arrival and departure times for trains and, occasionally, the ability to request a train be held so a photo could be taken at a particular moment.

Hawksbill Creek Swimming Hole. Luray, VA. August 9, 1956.

Link’s photos were not ad-hoc snapshots in any way. The majority were taken at night (making him one of the first to popularize nighttime photography) and illuminated by enormous flash arrays he built himself. Link said that nighttime not only romanticized the trains further, but also made steam locomotive smoke appear white, and therefore not dirty, when lit by flash. Drawing on his background in advertising and marketing photography, he carefully composed his photos to tell a story, and they almost always incorporated people to add a humanizing element.

I wish I could tell you I collect his photos, but that’s not possible. I can only afford to be an enthusiast. Original prints typically sell for thousands of dollars at auction. I do own all of his books, which are absolutely beautiful. The tales he tells in black and white are richer than anything I could ever imagine in color.

Link lived in relative obscurity; his photos were popular with railfans but didn’t gain mainstream recognition until decades later. You won’t find many photos of him – he preferred to stay behind the camera – but he did have a cameo appearance in one of my all-time favorite movies, October Sky. If you’ve never seen that movie, add it to your list!

I like to say that if you watch October Sky, and you recognize the train engineer to be O. Winston Link, you’re really smart and observant. But if you’re a true rail nerd, you’ll also know that the locomotive he’s driving is a 2-8-2 “Mikado” #4501, which was actually owned by the Southern Railway and painted to be a Norfolk & Western engine just for the movie. The N&W never used 2-8-2 locomotives.

If you made it this far, thank you for reading about my hobbies. I promise to return to my regularly scheduled technobabble in short order.

This month we celebrate the 100th edition of T-SQL Tuesday, originally created by Adam Machanic back in November 2009. Adam asked us to write about the topics we might be covering 100 months in the future. I’m not good at predicting the future, but in the spirit of time travel, I am going one month into the past and writing about last month’s topic, which was hosted by Aaron Bertrand.

Mar 07 2018
 

If you’ve done any amount of work in Linux from Windows, chances are pretty good you’ve used PuTTY at some point. PuTTY is a free and open-source terminal emulator that supports a variety of protocols, including SSH. I’ve been using it since college and have always been very happy with it. It’s free, it’s tiny, and it just works.

With the advent of SQL Server 2017 on Linux I find myself using it rather frequently once again, but this time around I’m not just using PuTTY for myself. Now I’m taking screenshots of it for slide decks and using it for demos in presentations. In a world of flashy graphics and high-resolution screenshots, PuTTY’s simplicity can become a problem. Remember, it’s a terminal emulator. Its job is to display text. By default it’s 80 columns wide* and 24 rows high.

A PuTTY window doesn’t take up a whole lot of screen real estate at this size, especially on modern HD monitors. This is great for everyday use, and terrible for teaching others. Screenshots end up tiny, and if you scale the image up it tends to look horrible and pixelated.

A default size PuTTY window.

It might look great on your screen right now, but being projected on a wall this becomes a problem.

Fortunately PuTTY has an option to fix this, which has evaded me for the better part of the 18 years I’ve been using it. It’s not quite as flashy as “Presenter Mode” that SQL Server Management Studio released a few years back, but it is incredibly helpful. On the configuration screen, click the “Window” category and select the radio option to “Change the size of the font.” That’s all there is to it.

Select the “Window” category

 

Set the font size to change when the window is resized

 

Now connect to your server of choice and resize the window as you see fit. Instead of the window getting larger and the text staying small, the text will now grow with the window. Screenshots can be much clearer, and on-screen demos are delightful!


 

*Why do terminal windows tend to be 80 columns wide by default, anyway? Probably because the IBM punched card format, introduced in 1928, used cards that had 80 columns. When it came time to ditch the cards and edit them on a screen instead, 80 columns seemed like a good enough size to stick with.