Jul 29, 2010
 

This is the fourth post in my series on cloud backup solutions.  Previous posts were:

Today’s topic is Google’s Picasa Web Albums, henceforth known as PWA.  Unlike the other solutions I’ve covered, PWA is not explicitly intended to be a backup solution.  It’s designed as a platform for sharing your photos and videos with others; however, it also does an excellent job of storing them in the cloud and letting you retrieve them with ease.  I’ve been using it as a form of backup for about a year now, and after browsing the “help” section I see that Google has included notes on how to use PWA as a backup as well.

Client

The best way to upload photos to PWA is to use Picasa, Google’s desktop photo organizer.  I have tens of thousands of photos in my collection and have tried many different photo organization applications, and Picasa is the best I’ve found so far.  Its facial recognition features are particularly amazing.  I could probably write an entire post about Picasa and its features, but I’ll save that for another time – I’m here to talk about saving photos to the cloud today.

Pricing & Storage

PWA accounts come with 1GB of free storage.  After that, you can pay for additional storage at Google Paid Storage’s paltry rates.  Any additional space you purchase will be shared with other Google products, such as Gmail and Google Documents.

Within PWA, photos/videos are stored in albums.  At the moment, you can have up to 10,000 albums, each storing up to 1,000 photos.  This is regardless of how much storage you purchase.

Security

PWA is designed for public sharing, so you won’t find any kind of encryption included like you will for other backup tools.  PWA does have security settings for who is allowed to view your albums though.  You can adjust these settings at the album level, and choose to share with:

  • everybody
  • nobody
  • individuals with authentication (they must have a Google login)
  • the public through a link (anyone who has the URL can view the album)

Backups

With PWA, it’s not really a “backup”, more like an “upload”.  There are a whole bunch of ways to get your photos into PWA, including the web interface, email, and mobile device apps.  Most of these methods have limits on how much can be uploaded at a time.  The best way to upload large quantities of photos into PWA is to use the Picasa client.

I sort all of my photos by event, so I already have folders on my computer with names such as “Mom’s Birthday” or “Thanksgiving 2009”.  In Picasa, I create an album for each event and then add the appropriate photos to it.  Then you simply click “Sync to Web” and the album is uploaded to PWA.  Picasa has several options for what quality the photos are uploaded at.  I prefer to use full quality, and don’t mind the fact that it also takes up the most storage space since it’s rather cheap anyway.

Restores

To “restore” (download from PWA) you again have options.  From the PWA website you can download individual photos just as you would download any other file from the web.  If you have Picasa installed on your computer, you also have the option of downloading entire albums at a time through the web interface, or downloading multiple albums (or all your albums) through the Picasa client.  Simply go to File > “Import from Picasa Web Albums” and you can select which album(s) you want to download.  There are also third-party tools available for downloading albums without having Picasa installed.

What I Like

I’m a huge fan of Google’s amazing prices on storage space.  That being said, a low price is a horrible reason to choose a backup solution.  To me, PWA is primarily a great way to share photos with friends and family, and the fact that it can double as a cloud backup is an added bonus.

As I said before, I also really like the Picasa client as it has an amazing set of tools for organizing photos.  In addition to uploading images to PWA, you can do geotagging, use facial recognition to tag people, tag images with keywords, and search and sort by date and a variety of other fields.  It can take a little while to get things set up the way you like them, but once you do, maintaining your photo library is a snap! (pun intended)

What I don’t like

My biggest complaint about PWA is that albums can’t be nested, or created inside other albums.  My previous photo site (which I hosted out of my house) allowed this, and I had a wonderfully organized tree of all my photos.  Now all my albums are at the same level and simply sorted by date.  I know lots of people have requested the ability to nest albums, and hope the folks at Google get around to adding that feature soon.

Jul 27, 2010
 

Do you like it when your schemata are easy to understand and maintain?  If so, keep the following in mind when choosing names for tables and columns.  If you’re feeling evil and want to inflict some frustration on others, this might give you some good ideas too…

1.  Data types make horrible names

There’s no rule that says you can’t name a column after a datatype – it will just be awfully confusing.  The following code works perfectly:

create table INT (
   char datetime not null,
   foo int null,
   bit bit null,
   timestamp datetime null
);

The birth of the DATE datatype in SQL 2008 definitely throws a monkey wrench in the works as well.  How many columns do you know of that are named “Date”? (More on that in a bit.)  Again, it’s not going to break anything, I just find it rather confusing when a column of type “datetime” is named “timestamp”.

2.  So do reserved keywords

Interestingly, SQL Server datatypes are not found on the list of T-SQL Reserved Keywords.  The rules for T-SQL identifiers state that a regular identifier cannot be a T-SQL reserved keyword.  “Regular identifiers” are those which do not require brackets or double quotes around them.

-- This will fail
CREATE TABLE FOO (
  add CHAR(5),
  between INT,
  restore SMALLDATETIME
);

-- Add brackets and it works just fine!
CREATE TABLE FOO2 (
  [add] CHAR(5),
  [between] INT,
  [restore] SMALLDATETIME,
  dump VARCHAR(10) -- DUMP is a reserved keyword but doesn't need brackets
     -- probably because it is discontinued in SQL 2008.
     -- maybe it should be removed from the list.
);

3.  Pick a good name length

A good name should be long enough to be descriptive, but short enough that it’s not a pain to type.  I hate columns or tables with names like “Date” or “Name”.  Chances for confusion can easily be lowered by adding another word to make it more descriptive, such as “PurchaseDate” or “FamilyName”.

Table and column information is exposed through the sys.tables and sys.columns catalog views.  The name values are stored in columns of the data type sysname, which since SQL Server 7 has been equivalent to nvarchar(128). This is one of the cases where adding quotes or brackets can’t help you break the rules. Names cannot exceed 128 characters in any case, and temporary tables are a special case as their names can’t exceed 116.
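To see how close existing names come to those limits, a query along these lines should do the job.  It's only a sketch: it relies on the sys.tables and sys.columns catalog views mentioned above, and the 30-character threshold is an arbitrary number picked for illustration, not a recommendation from the rules.

```sql
-- List columns whose names are longer than an arbitrary threshold.
-- The value 30 is an assumption for illustration; adjust to taste.
SELECT t.name      AS TableName,
       c.name      AS ColumnName,
       LEN(c.name) AS NameLength
FROM sys.tables AS t
JOIN sys.columns AS c
  ON c.object_id = t.object_id
WHERE LEN(c.name) > 30
ORDER BY NameLength DESC;
```

Dropping the WHERE clause lists every column, which makes it easy to spot the short, vague names too.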

4.  Avoid spaces and special characters

They’re allowed, but I consider it a bad practice to use them.  Most special characters are not included in the rules for regular identifiers, meaning that you’ll need to enclose the name in double quotes or brackets.

-- To be really tricky, you can start a name with a leading space!
CREATE TABLE [test^one](
   foo INT NOT NULL,
   [ bar] varchar(10) NULL  -- This is evil
);

5.  They can be case-sensitive depending on collation

Case sensitivity in object names depends on the database’s collation settings.  If it is case-sensitive, then object names will be unique based on case sensitivity as well.

-- This will fail in a case-insensitive database
-- but runs fine in a case-sensitive one
create table testing1 (
   char datetime not null,
   foo int null,
   bit bit null,
   FOO int null,
   timestamp datetime null
);

create table Testing1 (
   char datetime not null,
   foo int null,
   bit bit null,
   FOO int null,
   timestamp datetime null
);
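If you're not sure which kind of database you're in, checking its collation is a reasonable first step.  The following is a minimal sketch using the built-in DATABASEPROPERTYEX function against the current database:

```sql
-- Show the collation of the current database.
-- A name containing "_CI_" is case-insensitive; "_CS_" is case-sensitive.
-- Binary collations (names containing "_BIN") also compare case-sensitively.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation;
```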

6.  Don’t pick names that will change meaning

The concept of a name that changes meaning might not make a whole lot of sense, so I’ll elaborate with a short story.  In a previous job, we had to maintain a table that stored historical information for the previous 10 years.  Said table had 11 columns:  Key, Year0, Year1, Year2,…,Year9.

Year0 was always the current year, Year1 was last year, etc, so each year the columns changed meaning as far as what calendar year they really referred to.  There was also a special job that had to be run once a year to shift all the data one column to the right.  This is more than just bad naming, it’s a horrible design to begin with!  We knew there was a much better way, but were stuck with this schema due to legacy application support.
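For illustration, one common alternative (not necessarily the one we had in mind back then) is to store one row per key per calendar year, so the names never change meaning.  The table name, column names and data types below are all hypothetical:

```sql
-- Hypothetical replacement for the Key, Year0..Year9 layout:
-- one row per key per calendar year instead of one column per relative year.
CREATE TABLE YearlyHistory (
   ItemKey      INT           NOT NULL,
   CalendarYear SMALLINT      NOT NULL,
   Amount       DECIMAL(18,2) NULL,  -- assumed type; the original isn't stated
   CONSTRAINT PK_YearlyHistory PRIMARY KEY (ItemKey, CalendarYear)
);

-- "The current year plus the previous nine" becomes a simple filter,
-- so no yearly job is needed to shift data between columns.
SELECT ItemKey, CalendarYear, Amount
FROM YearlyHistory
WHERE CalendarYear > YEAR(GETDATE()) - 10;
```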

In conclusion, a little thought when choosing table and column names can go a long way. I hope this is helpful!

Jul 22, 2010
 

This is the third post in my series on cloud backup solutions.  Previous posts were:

Today I’ll be talking about another cloud backup application called Jungle Disk.  I’ve been experimenting with it for a few months and am generally very happy with it.  Much like Mozy, Jungle Disk allows you to intelligently back up your files into the cloud.  Unlike Mozy, however, Jungle Disk only provides the client application for running backups; the actual storage of the backups is separate, which I’ll explain in more detail shortly.

Client Versions & Pricing

Jungle Disk has 4 different client versions depending on your needs.  Two of them are considered to be for personal use, the others are for businesses.  The personal editions are called “Simply Backup” and “Desktop Edition” and are respectively priced at $2 per month and $3 per month.  Both include 5GB of free backup space.  Simply Backup lives up to its name – it allows you to run either manual or scheduled backups of whatever files/folders you like on as many machines as you like.  Desktop Edition builds on that and allows you to access your Jungle Disk storage as a network drive.  It also has the ability to sync files between multiple machines.  Since I had no need for the features of Desktop Edition (and I’m also a tightwad) I have been using Simply Backup.

The business editions are “Workgroup Edition” and “Server Edition” and are priced at $4 and $5 per month respectively, both including 10GB of free backup space.  Workgroup Edition includes a multi-way sync feature so an entire group of people can keep files in sync between them.  Server Edition includes remote management features.

Storage

As I mentioned earlier, Jungle Disk only provides a client application for creating backups.  You have a choice as to where those backups are stored, as Jungle Disk supports both Amazon S3 and Rackspace Cloud Files.  Prices differ based on which service you choose.  (I’ve been using S3 for my backups.)

Since you’re paying the storage provider based on the amount of data stored, you have the option of how long to retain your backups before they’re deleted.  By default it’s set to 30 days, which seems way too short to me.  If you delete a file and don’t realize it until 31 days later, you would be out of luck because the last backup containing that file would have been deleted.  I currently have mine set to keep each backup for a year, but am considering disabling this option altogether and just keeping all backups forever.

Security

Jungle Disk uses AES-256 encryption for your data, the key for which is based on a password you choose.  Don’t lose your password, otherwise you’ll be out of luck!

Backups

Backups are pretty simple – just select which folders and/or individual files you want to back up.  There’s a built-in scheduler as well as a bandwidth throttle.  Like most systems, the client will only attempt to back up files that have changed since the last backup.  It also features de-duplication technology, so only the parts of files that have changed are backed up.  This can be particularly helpful if you have a large file with a small part of it that has changed.

Restores

Restores are a snap.  You simply select the file(s)/folder(s) you want to restore, the backup date you wish to restore from, and the location you wish to restore them to.  I found restores to be quick and painless.  In Simply Backup, restores can only be done from the client.  In Desktop Edition and beyond, you have the option of restoring files from the web as well.

What I Like

Since your backup files are stored by third parties, the availability of your backups is subject to their guarantees.  Both Rackspace and Amazon S3 have pretty good SLAs.

A nifty feature Jungle Disk provides is backup reports via RSS.  You are provided with a link to a private RSS feed that is updated each time a backup runs.  It’s especially convenient for me as I have my backups set to run during the day while I’m at work.  I can see that my backup jobs have completed, how long they took and how many files were backed up all from the comfort of my RSS reader.

What I Don’t Like

I believe Jungle Disk could be doing a better job of selling itself, or at least letting prospective buyers figure out what they want.  They offer 4 different products at different prices, but there’s no easy way to compare them to each other.  Each product has its own page with a few paragraphs about some of the features it offers, but what they really need is a chart showing all the differences between the versions.

Another thing I don’t like is that while Jungle Disk does de-duplication, it doesn’t do de-duplication across computers.  If you have the same file on 2 computers, you’ll be backing up and storing 2 identical copies of that file.  Jungle Disk’s de-duplication takes place within what they call a “Backup Vault”, but only a single computer can store its backups within a given backup vault.  If you’re only backing up 1 computer this shouldn’t be an issue, however if you have multiple machines with identical files on them, you’ll be paying for more storage than you really need.

Next Cloud Backup Product Review: Picasa Web Albums

Jul 20, 2010
 

I’ve seen many articles and blog posts concerning what to do when you have to switch between SQL Server Recovery Models.  A lot of these tips are very important, as they can mean the difference between being able to recover your data and the much less desirable opposite.

After a lot of web searches on the subject I came to realize there’s much material for individual cases, but no real “one stop shop” for what to do when switching between any given recovery model.  Thus, my recovery model chart was born.  Since a diagram is worth almost as many words as a picture, I’ll show it to you and explain a little bit afterwards:

I've never actually seen "The Matrix"

As you can see, you pick the recovery model you’re currently in along the left side and the one you wish to switch to along the top.  It will tell you what you should do both before and after the switch.

You’ll notice the asterisk in the entries that involve switching from the simple recovery model.  I made it a backup* because depending on the situation a full backup may not be necessary.  If a full backup already exists, a differential backup should suffice for bridging the LSN gap created by switching to the simple recovery model.  Paul Randal covered this in his “DBA Myths” series a few months ago.

Wanting to keep the chart as simple as possible, I opted not to include rationales in the chart itself.  If any of these don’t make sense, here they are:

Switching to the simple recovery model: Perform a log backup beforehand to allow recovery to that point.  Switching to the simple recovery model will break the transaction log backup chain (which may be desirable if you’re trying to truncate or shrink the log.)  After the switch is made, you’ll need to disable any log backup jobs, as log backups aren’t possible under the simple recovery model.  Continue backing up with full or differential backups as you (hopefully) were before.

Switching from the simple recovery model: After the switch, perform either a full (or differential) backup to start (or restore) the transaction log backup chain as described above.  Until this backup is performed, the transaction log will continue to be automatically truncated just like in the simple recovery model.  You will also need to create (or enable) transaction log backup jobs.

Switching from full to bulk-logged: Perform an extra log backup prior to the switch to ensure recoverability up to that point in time.  Regularly-scheduled log backups may continue while under the bulk-logged recovery model.

Switching from bulk-logged to full: Perform an extra log backup immediately following the switch to allow point in time recoverability from that point forward.
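To make the steps above concrete, here is a rough sketch of what each switch might look like in T-SQL.  The database name (MyDatabase) and the backup paths are hypothetical, and the script assumes the database starts in the full recovery model with at least one full backup already taken; treat it as an outline rather than a finished maintenance script.

```sql
-- Check which recovery model the database is currently using
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'MyDatabase';

-- Full -> Simple: back up the log first, then switch.
-- (Afterwards, disable any log backup jobs.)
BACKUP LOG MyDatabase TO DISK = N'D:\Backups\MyDatabase_log1.trn';
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;

-- Simple -> Full: switch, then take a full backup to start the log chain.
-- (Afterwards, create or enable log backup jobs.)
ALTER DATABASE MyDatabase SET RECOVERY FULL;
BACKUP DATABASE MyDatabase TO DISK = N'D:\Backups\MyDatabase_full.bak';
-- If a full backup already exists, a differential may be enough instead:
-- BACKUP DATABASE MyDatabase TO DISK = N'D:\Backups\MyDatabase_diff.bak'
--    WITH DIFFERENTIAL;

-- Full -> Bulk-logged: take an extra log backup before the switch
BACKUP LOG MyDatabase TO DISK = N'D:\Backups\MyDatabase_log2.trn';
ALTER DATABASE MyDatabase SET RECOVERY BULK_LOGGED;

-- Bulk-logged -> Full: switch, then take an extra log backup right away
ALTER DATABASE MyDatabase SET RECOVERY FULL;
BACKUP LOG MyDatabase TO DISK = N'D:\Backups\MyDatabase_log3.trn';
```

Each pair of statements corresponds to one cell of the chart, so in practice you would run only the pair that matches the switch you're making.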

Further Reading

I compiled all the info for this chart from the following sources:

I hope you find this helpful!

Jul 15, 2010
 

Images make just about everything more interesting, which is why I do my best to include at least one with each blog post even if for nothing more than comic relief.

I run a few different websites, and a while ago I decided to host the images for all of them in the cloud using Amazon S3 (Simple Storage Service).  I’ve been a very satisfied customer since then.  Not only is it incredibly easy to use, it’s also rather simple to “mask” it so your images look like they’re coming from somewhere else.  If you look at the properties of any image on this page, you’ll see that it’s coming from “img.bobpusateri.com”.  In reality, it’s coming from S3 thanks to the magic of DNS aliasing.  In this post I’ll show you how to do that.

Why’d I Do This?

My original website was hosted out of my house on a single Linux box residing under my desk.  This machine was a real powerhouse at 300 MHz and 128 MB of RAM (and I was using it until the end of 2009!), so I wanted to move the load of hosting images off that box and into the cloud.  After considering a few different options I decided to use S3 based on both their 99.9% uptime guarantee and what I consider to be very reasonable pricing.  The fact that I’m not paying for anything I don’t use was also rather attractive.

The downsides?  It’s not completely free, but I’ve always believed that you get what you pay for.  You’re charged based on how much data you’re storing and how many times it’s accessed.  Hopefully nobody decides to keep clicking “refresh” forever with the hopes of bankrupting me (don’t worry, it won’t take you too long!)   At the time I did the switch it also required some coding changes to my sites, but that was a one-time expense.

Accessing S3

Once you create an S3 account, you’ll want to get it set up so you can start uploading your awesome images.  Since S3 lives in the cloud, there are a wide variety of clients available for uploading & managing your data.  My personal favorite is Amazon’s recently-released web console.  Actually, the console has been around for a while, as it supports several Amazon cloud products, but it only recently started supporting S3.  Another client I like is S3Fox, an add-on for Mozilla Firefox.  There are many others out there as well depending on your needs.

The Bucket

The basis of everything in S3 is the “bucket”, and every object you store in S3 will be in a bucket.  Buckets store key/object pairs, the key being a string and the object being whatever type you like (in this case, an image file).  Keys are unique within a bucket.  Buckets cost nothing by themselves (you’re only billed for what’s stored in them) and each account can contain up to 100 buckets, though you probably won’t need anywhere near that many.  Of particular importance is the bucket’s name, as it determines how you access the objects contained inside.  Bucket names must be unique across all buckets stored in S3, so if you try to create a bucket named “images” it will probably fail because someone else has likely already thought of that.

The other choice you have when creating a bucket is the region its data is stored in.  Amazon currently has 4 different regions to choose from: two in the U.S., one in Ireland, and one in Singapore.  In general you’ll probably want to pick the region that’s closest to your target audience, but you may have reasons for storing it elsewhere (legal compliance, etc.)  The price you pay depends on the region you store the data in.

If you want to “mask” S3 so that it appears as another domain, you’ll need to give your bucket the name of whatever domain you want to mask it as.  This means that for my “img.bobpusateri.com” domain I have a bucket named “img.bobpusateri.com” in my S3 account.

Objects & Permissions

Once you’ve created your bucket, you’ll want to put stuff in there.  Uploading instructions will vary depending on your client, but most of them utilize a standard FTP-type manager allowing you to create folders and copy local files to S3.

By default a bucket’s contents are shared with nobody except its creator.  To allow the world to see an object, you’ll need to alter its permissions so that “everyone” can read it.  This again varies by client, but generally you can right-click on an object, select “Edit Permissions” or “Edit ACLs” (Access Control List) and grant read/view rights to everyone.  This is possible on a per-object basis through S3 (though some clients will recurse through buckets and/or folders) or a “canned access” policy may be applied to an entire bucket.

Accessing Objects

To access an object, you can formulate a URL from its key and the name of the bucket it’s stored in.  For a bucket named “bucket” containing an object with a key of “key”, the URL would be as follows:

http://bucket.s3.amazonaws.com/key
or
http://s3.amazonaws.com/bucket/key

In the case of the image in this post, the bucket is named “img.bobpusateri.com” and the key is “bc/2010/07/Buckets.jpg”, which means you can access it from URLs:

http://img.bobpusateri.com.s3.amazonaws.com/bc/2010/07/Buckets.jpg
or
http://s3.amazonaws.com/img.bobpusateri.com/bc/2010/07/Buckets.jpg

But neither of those is all that good looking, is it?  The purpose of this post is to be able to mask the Amazon part of the URL so that it will also work like this:

http://img.bobpusateri.com/bc/2010/07/Buckets.jpg

This is done with a DNS record type called CNAME, which more or less creates an alias for a subdomain.  This needs to be done at your DNS or hosting provider and will probably be under an advanced options menu somewhere.  You’ll want to set your desired subdomain (“img” in my case) to point to “s3.amazonaws.com”.  Once that’s set up you should be good to go.