Chapter 4. Authenticating S3 Requests

The section on authenticating S3 Requests in the S3 Developer’s Guide (http://docs.amazonwebservices.com/AmazonS3/2006-03-01/) is pretty intimidating. There are a lot of steps you have to go through in just the right order to get your authentication correct. Luckily there are sample implementations in a number of languages in the Getting Started Guide: http://docs.amazonwebservices.com/AmazonS3/2006-03-01/gsg/authenticating-to-s3.html.

The code supplied by Amazon works just fine, but you may want to build your own implementation. You could just read the S3 Developer’s Guide: it’s the canonical resource for this information, and obviously if there’s a conflict between what I’m saying and what the Developer’s Guide says, go with the Developer’s Guide. That being said, I’m going to go through the authentication process in detail to hopefully smooth over some of the spots where I got confused.

The Authentication Process

Every request you make to Amazon S3 must be signed. This is done by adding an Authorization header to the request. An authenticated request to a bucket named mybucket would look like this:

GET /mybucket
Host: s3.amazonaws.com
Date: Wed, 13 Feb 2008 12:00:00 GMT
Authorization: AWS your_aws_id:signature

The Authorization header consists of “AWS”, followed by a space, your AWS Access Key ID, a colon and then a signature:

Authorization = AWS <AWSAccessKeyId>:<signature>

So that looks pretty straightforward, except for the signature. To generate the signature, you generate a canonical string for your request, and then sign that string with your Secret Key. To do that, you take the UTF-8 encoding of the canonical string, compute its HMAC-SHA1 digest using your Secret Key as the key, and then Base64 encode the result. In pseudo-code:

Signature = Base64( HMAC-SHA1( SecretKey, UTF-8-Encoding-Of( canonical_string ) ) )
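To make the pseudo-code concrete, here is a minimal Ruby sketch of the signing step. The method names s3_signature and authorization_header are my own for illustration and are not part of any Amazon library. It assumes Ruby's standard openssl library is available and uses pack('m0') for the Base64 step.

```ruby
require 'openssl'

# Sign a canonical string: HMAC-SHA1 keyed with the secret key, then Base64.
# (s3_signature is a hypothetical helper name used only in this sketch.)
def s3_signature(secret_key, canonical_string)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'),
                                secret_key,
                                canonical_string.encode('UTF-8'))
  [digest].pack('m0') # Base64 without line breaks
end

# Build the value of the Authorization header described above.
def authorization_header(access_key_id, secret_key, canonical_string)
  "AWS #{access_key_id}:#{s3_signature(secret_key, canonical_string)}"
end

# The key and canonical string below are placeholders, not real credentials.
puts authorization_header('your_aws_id', 'your_secret_key', "GET\n\n\n\n/mybucket")
```

An HMAC-SHA1 digest is always 20 bytes, so the Base64 signature is always 28 characters long. That makes for a handy sanity check while you are debugging.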

Assuming that you have the libraries to do the various encodings, all you need to do now is create the canonical string. The canonical string is a concatenation of the HTTP verb used to make the request, the canonicalized headers and the canonicalized resource.

canonical_string = "<http_verb>\n
                    <canonicalized_headers>\n
                    <canonicalized_resource>"

The http_verb

This is the simplest of the canonical string sub-elements. It is either GET, PUT, DELETE, HEAD or POST. It must be all uppercase.

The canonicalized_headers

The canonicalized_headers element is constructed from two sub-elements, the canonicalized_positional_headers and the canonicalized_amazon_headers.

canonicalized_headers = <canonicalized_positional_headers>\n
                        <canonicalized_amazon_headers>

The canonicalized_positional_headers

The canonicalized_positional_headers are the values of the MD5 hash, content type and date headers, separated by newlines. In pseudo-code:

Content-MD5 + "\n" +
Content-Type + "\n" +
Date + "\n"

Here’s an example:

\"91ffa40f1a72a58f0d0b688032195088\"\n
text/plain\n
Thu, 27 Mar 2008 09:14:27 +0000

If one of the positional headers is not provided in the request, replace its value with an empty string and leave the newline (\n) in. For example, if a request had no MD5 hash or content type headers, it would look like this:

\n
\n
Thu, 27 Mar 2008 09:14:27 +0000

The canonicalized_amazon_headers

The Amazon headers are all headers that begin with x-amz-, ignoring case. You construct the canonicalized_amazon_headers with the following steps

  • Find all headers that have header names that begin with x-amz-, ignoring case. These are the Amazon headers
  • Convert each of the Amazon header’s header names to lower case. (Just the header names, not their values)
  • For each Amazon header, combine the header name and header value by joining them with a colon. Remove any leading whitespace on the header value as you do this
  • If you have multiple Amazon headers with the same name, then combine the values into one value by joining them with commas, without any white space between them.
  • If any of the headers span multiple lines, un-fold them by replacing the newlines with a single space
  • Sort the Amazon headers alphabetically by header name
  • Join the headers together with new-lines (\n)

Here’s an example:

Request:

GET /my_photos/vampire.jpg
Host: s3.amazonaws.com
X-Amz-Meta-Subject: Claire
X-Amz-Meta-Photographer: Nadine Inkster
content-type: image/png
content-length: 10817

canonicalized_amazon_headers:

x-amz-meta-photographer:Nadine Inkster\n
x-amz-meta-subject:Claire

Note that the header names have been lower cased, they are in alphabetical order and the spaces between the header names and the header values have been taken out. The Host, content-length and content-type headers are not included in the canonicalized_amazon_headers.

Here’s a more complicated example showing the combination of multiple Amazon headers with the same name and the un-folding of a long header value.

Request:

GET /my_photos/birthday.jpg
Host: s3.amazonaws.com
content-type: image/png
content-length: 12413
X-Amz-Meta-Subject:Claire
X-Amz-Meta-Subject:Mika
X-Amz-Meta-Subject:Amber
X-Amz-Meta-Subject:Callum
X-Amz-Meta-Description: Mika, Claire, Amber and Callum\n
   at Mika's birthday party\n

canonicalized_amazon_headers:

x-amz-meta-description:Mika, Claire, Amber and Callum at Mika's birthday party\n
x-amz-meta-subject:Claire,Mika,Amber,Callum

Note that the subjects have all been combined into a single header, and the newline in the description has been replaced with a space.
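If you'd like to see those steps as code before we build the real thing, here is a small standalone sketch. The method name canonicalize_amazon_headers is made up for this illustration. It takes the headers as an array of name/value pairs, which is my own assumption so that repeated header names can be represented. When un-folding, it also swallows the whitespace around the newline, which is slightly more aggressive than the rule as written.

```ruby
# Standalone sketch of the canonicalization steps listed above.
# header_pairs is an array of [name, value] pairs (an assumption of this
# example, so that the same header name can appear more than once).
def canonicalize_amazon_headers(header_pairs)
  grouped = Hash.new { |hash, key| hash[key] = [] }

  header_pairs.each do |name, value|
    key = name.downcase                       # lower-case the name only
    next unless key.start_with?('x-amz-')     # keep Amazon headers only
    unfolded = value.gsub(/\s*\n\s*/, ' ')    # un-fold multi-line values
    grouped[key] << unfolded.lstrip           # drop leading whitespace
  end

  grouped.sort_by { |name, _values| name }    # alphabetical by header name
         .map { |name, values| "#{name}:#{values.join(',')}" }
         .join("\n")
end

request_headers = [
  ['Host', 's3.amazonaws.com'],
  ['content-type', 'image/png'],
  ['X-Amz-Meta-Subject', 'Claire'],
  ['X-Amz-Meta-Subject', 'Mika'],
  ['X-Amz-Meta-Subject', 'Amber'],
  ['X-Amz-Meta-Subject', 'Callum'],
  ['X-Amz-Meta-Description', " Mika, Claire, Amber and Callum\n   at Mika's birthday party"]
]

puts canonicalize_amazon_headers(request_headers)
```

Run against the birthday photo request, this prints the description header followed by the combined subject header, one per line.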

Joining the canonicalized_headers

The canonicalized_headers are constructed by joining the canonicalized_positional_headers and the canonicalized_amazon_headers with a newline (\n).

canonicalized_headers = <canonicalized_positional_headers>\n<canonicalized_amazon_headers>

The canonicalized_resource

The canonicalized_resource is given by:

/<bucket><uri><sub-resource>

elements in the canonicalized_resource

bucket

The name of the bucket. This must be included even if you are setting the bucket in the Host header using virtual hosting of buckets. If you are requesting the list of buckets you own, just use the empty string.

uri

This is the HTTP request URI, not including the query string. The URI should be URI encoded.

sub-resource

If the request is for a sub-resource such as ?acl, ?torrent or ?logging, then append it to the uri, including the ?.

Some examples

Request:

GET /my_pictures/vampire.jpg
Host: s3.amazonaws.com

canonicalized_resource:

/my_pictures/vampire.jpg

The canonicalized_resource is the same as the URI in the request.

Request:

GET /vampire.jpg
Host: spattenpictures.s3.amazonaws.com

canonicalized_resource:

/spattenpictures/vampire.jpg

Note that the bucket name (spattenpictures) is extracted from the Host header even though it is not in the URI in the request.
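Here is a rough Ruby sketch of those rules, just to show how the pieces fit together. The method name canonicalized_resource_for and the SUB_RESOURCES constant are my own for this illustration. The sketch assumes the host always ends in s3.amazonaws.com and that the path has already been URI encoded.

```ruby
# Sub-resources mentioned above. They are the only parts of the query
# string that end up in the canonicalized resource.
SUB_RESOURCES = %w[acl torrent logging]

# Hypothetical helper: build the canonicalized resource from the Host
# header and the request URI (path plus optional query string).
def canonicalized_resource_for(host, request_uri)
  path, query = request_uri.split('?', 2)

  # With virtual hosting the bucket name is the prefix of the Host header.
  bucket = host.sub(/\.?s3\.amazonaws\.com\z/, '')
  resource = bucket.empty? ? path : "/#{bucket}#{path}"

  # Append the sub-resource, including the ?, if one was requested.
  sub_resource = SUB_RESOURCES.find { |name| query.to_s.split('&').include?(name) }
  sub_resource ? "#{resource}?#{sub_resource}" : resource
end

puts canonicalized_resource_for('s3.amazonaws.com', '/my_pictures/vampire.jpg')
puts canonicalized_resource_for('spattenpictures.s3.amazonaws.com', '/vampire.jpg')
puts canonicalized_resource_for('s3.amazonaws.com', '/my_pictures/vampire.jpg?acl')
```

The first two calls reproduce the two examples above. The third shows a sub-resource being carried over.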

Time Stamping Your Requests

All requests to Amazon S3 must be time-stamped. There are two ways of time-stamping your request: using the Date header (which will be in the canonicalized_positional_headers), or using the x-amz-date header (which will be in the canonicalized_amazon_headers). Only one of the two date headers should be present in your request.

The date provided must be within 15 minutes of the current Amazon S3 system time. If not, you will receive a RequestTimeTooSkewed error response to your request.
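As a quick illustration, here is one way you might generate a date stamp in Ruby and check it against the 15 minute window on your own machine. The helper names s3_date_header and within_allowed_skew? are hypothetical. The real check happens on Amazon's side, against Amazon's clock, so treat the second method as a local sanity check only.

```ruby
require 'time'

# S3 rejects requests whose time stamp is more than 15 minutes off.
MAX_CLOCK_SKEW_SECONDS = 15 * 60

# Format a Time as an HTTP date, suitable for a Date or x-amz-date header.
def s3_date_header(now = Time.now)
  now.httpdate
end

# Hypothetical local check: is the date string within 15 minutes of the
# reference time? (Amazon compares against its own system time.)
def within_allowed_skew?(date_string, reference_time = Time.now)
  (reference_time - Time.httpdate(date_string)).abs <= MAX_CLOCK_SKEW_SECONDS
end

stamp = s3_date_header(Time.utc(2008, 2, 13, 12, 0, 0))
puts stamp
puts within_allowed_skew?(stamp, Time.utc(2008, 2, 13, 12, 10, 0))
puts within_allowed_skew?(stamp, Time.utc(2008, 2, 13, 12, 20, 0))
```

The ten minute difference passes and the twenty minute difference does not.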

Writing an S3 Authentication Library

Now that you know how a request is authenticated (in exhaustive detail), let’s implement a library to actually do the authentication.

We need a function that will take an HTTP verb, URL and a hash of headers as inputs and make an authenticated request. It authenticates the request by adding a signature to it. The signature is created by using the inputs to create a canonical string and then signing that string with your Amazon Web Services Secret Key using HMAC-SHA1. Okay, so we’re looking for a function that looks like this:

module S3Lib

  class AuthenticatedRequest

    def make_authenticated_request(verb, request_path, headers = {})
      # ... some code here ...
    end

  end
end

To make a request, you would do something like this:

$> irb
>> require 's3_authenticator'
=> true
>> s3 = S3Lib::AuthenticatedRequest.new
=> #<S3Lib::AuthenticatedRequest:0x1742920>
>> s3.make_authenticated_request(:get, '/', {'host' => 's3.amazonaws.com'})

You know what, that’s too much code just to make a simple request. Let’s add a class method to the S3Lib module that instantiates an AuthenticatedRequest object and makes the call to AuthenticatedRequest#make_authenticated_request for us.

module S3Lib

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    def make_authenticated_request(verb, request_path, headers = {})
      # ... some code here ...
    end

  end
end

Now we can make a request like this:

$> irb
>> require 's3_authenticator'
=> true
>> s3 = S3Lib.request(:get, '/', {'host' => 's3.amazonaws.com'})
=> #<S3Lib::AuthenticatedRequest:0x1742920>

Okay, now that we have that out of the way, let’s get started.

Test Driven Development

When you first look at the requirements for authenticating an S3 request, it’s hard to know where to begin. It’s a complex set of requirements that becomes much simpler if you break it up into smaller steps. It is also something that must work correctly if the rest of our library is to function at all. Luckily for you, Amazon gives a set of example requests and the corresponding canonical strings and signatures in the S3 Developer’s Guide (http://docs.amazonwebservices.com/AmazonS3/2006-03-01/). This sounds like a perfect fit for Test Driven Development (TDD), and that’s what we’re going to do for the rest of this section.

In the previous section, I deliberately started from a top-down view of the specification to make a point: the best way to deal with a complex spec like this is to start from the bottom and work your way up. If you just start coding from the top down, you’ll find yourself just sitting there staring at the code not knowing where to start.

So, let’s do some TDD. The key here will be to make small steps, and use our tests to make sure that all of our steps work for all cases, even the edge cases.

The flow of development will look like this:

  • Read part of the authentication specification
  • Create tests for that portion of the specification
  • Make sure your tests fail
  • Write the simplest thing that will make your tests pass

The HTTP Verb

Let’s start with something easy: the HTTP Verb section of the canonical string. Remember from the specification that the HTTP verb “… is either GET, PUT, DELETE, HEAD or POST. It must be all uppercase.” Also, it has to be followed by a newline (\n). So, let’s test that. Our test will look something like this:

require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator')

class S3AuthenticatorTest < Test::Unit::TestCase

  def test_http_verb_is_uppercase
    @s3_test = S3Lib::AuthenticatedRequest.new
    @s3_test.make_authenticated_request(:get, '/',
                                        {'host' => 's3.amazonaws.com'})
    assert_match(/^GET\n/, @s3_test.canonical_string)
  end

end

Let’s run that and see what happens. We know this is going to fail, as we haven’t actually written the canonical_string method yet.

$> ruby test/first_test.rb
Loaded suite test/first_test
Started
E
Finished in 0.000457 seconds.

  1) Error:
test_http_verb_is_uppercase(S3AuthenticatorTest):
NoMethodError: undefined method `canonical_string' for #<S3Lib::AuthenticatedRequest:0x50d34 @verb=:get>
    test/first_test.rb:9:in `test_http_verb_is_uppercase'

1 tests, 0 assertions, 0 failures, 1 errors

Good, it fails. Now, let’s write the simplest thing that will make it work. Something like this:

module S3Lib

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
    end

    def canonical_string
      "#{@verb.to_s.upcase}\n"
    end

  end
end

Now, run the test again.

$> ruby test/first_test.rb
Loaded suite test/first_test
Started
.
Finished in 0.000358 seconds.

1 tests, 1 assertions, 0 failures, 0 errors

All right! We’re rolling now! Let’s go on to something a bit more complicated: the canonicalized headers.

The Canonicalized Positional Headers

The canonicalized headers consist of the canonicalized_positional_headers followed by the canonicalized_amazon_headers. To break this down to something simple, we don’t want to test both of them at once. Let’s start testing with the canonicalized_positional_headers. Once again, I’ll refresh your memory so that you don’t have to flip back to the last section.

The canonicalized_positional_headers are the values of the MD5 hash, content type and date headers, separated by newlines. So, let’s write some tests for that spec:

require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

class S3AuthenticatorTest < Test::Unit::TestCase

  def test_http_verb_is_uppercase
    @s3_test = S3Lib::AuthenticatedRequest.new
    @s3_test.make_authenticated_request(:get, '/',
                                        {'host' => 's3.amazonaws.com'})
    assert_match(/^GET\n/, @s3_test.canonical_string)
  end

  def test_canonical_string_contains_positional_headers
    @s3_test = S3Lib::AuthenticatedRequest.new
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_match(/^GET\n#{@s3_test.canonicalized_positional_headers}/,
                 @s3_test.canonical_string)
  end

  def test_positional_headers_with_all_headers
    @s3_test = S3Lib::AuthenticatedRequest.new
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_equal "whee\nsome content type\nDecember 25th, 2007\n",
                 @s3_test.canonicalized_positional_headers
  end

end

This will, of course, fail until we’ve written the canonicalized_positional_headers method.

Making sure your tests fail

I’ll spare you the details of the test failures from now on, but don’t take that to mean that you shouldn’t test that they fail. Trust me, knowing that your tests actually fail before you begin coding will save you hours of frustration at some point in your life.

Making sure that your tests fail ensures that your tests are actually running, and it helps to ensure that they’re testing what you think they are testing. Get into the habit of running them before you begin coding and giving yourself a pat on the back when they fail.

module S3Lib

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @headers = headers
    end

    def canonical_string
      "#{@verb.to_s.upcase}\n#{canonicalized_headers}"
    end

    def canonicalized_headers
      "#{canonicalized_positional_headers}"
    end

    def canonicalized_positional_headers
      POSITIONAL_HEADERS.collect do |header|
        @headers[header] + "\n"
      end.join
    end

  end
end

If you run the tests, you’ll see that the new tests pass, but we’ve broken the test_http_verb_is_uppercase test.

$> ruby test/first_test.rb
Loaded suite test/first_test
Started
.E.
Finished in 0.00068 seconds.

  1) Error:
test_http_verb_is_uppercase(S3AuthenticatorTest):
NoMethodError: undefined method `+' for nil:NilClass
    ./test/../s3_authenticator_dev.rb:26:in `canonicalized_positional_headers'
    ./test/../s3_authenticator_dev.rb:25:in `collect'
    ./test/../s3_authenticator_dev.rb:25:in `canonicalized_positional_headers'
    ./test/../s3_authenticator_dev.rb:21:in `canonicalized_headers'
    ./test/../s3_authenticator_dev.rb:17:in `canonical_string'
    test/first_test.rb:9:in `test_http_verb_is_uppercase'

3 tests, 2 assertions, 0 failures, 1 errors

That test doesn’t pass in all of the positional headers, so the canonicalized_positional_headers method is failing when it tries to add the non-existent header to a string. This is kind of fortuitous, as the specification for the canonicalized_positional_headers says that a positional header should be replaced by an empty string if it doesn’t exist. Let’s write a test for that spec and hopefully we’ll comply with that specification and fix the currently failing test all in one fell swoop. Here’s the new test:

def test_positional_headers_with_only_date_header
  @s3_test = S3Lib::AuthenticatedRequest.new
  @s3_test.make_authenticated_request(:get, '',
                                      {'date' => 'December 25th, 2007'})
  assert_equal "\n\nDecember 25th, 2007\n",
               @s3_test.canonicalized_positional_headers
end

To fix the problem, all we have to do is make sure that a positional header is replaced with an empty string if it doesn’t exist:

def canonicalized_positional_headers
  POSITIONAL_HEADERS.collect do |header|
    # Fall back to an empty string when the header was not supplied
    (@headers[header] || "") + "\n"
  end.join
end

Our tests all pass now. Phew.

DRYing up our tests

You might have noticed that all of our tests start with the line @s3_test = S3Lib::AuthenticatedRequest.new. Now, I’m a pretty lazy guy, so I want to avoid all of that repetitive typing. Luckily, Ruby’s Unit Testing library is a faithful interpretation of the XUnit test specification. This means that if you define a setup method, it will be run before every test. Similarly, the teardown method, if defined, will be run after every test. Let’s refactor our test library to take advantage of the setup method.

require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

class S3AuthenticatorTest < Test::Unit::TestCase

  def setup
    @s3_test = S3Lib::AuthenticatedRequest.new
  end

  def test_http_verb_is_uppercase
    @s3_test.make_authenticated_request(:get, '/',
                                        {'host' => 's3.amazonaws.com'})
    assert_match(/^GET\n/, @s3_test.canonical_string)
  end

  def test_canonical_string_contains_positional_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_match(/^GET\n#{@s3_test.canonicalized_positional_headers}/,
                 @s3_test.canonical_string)
  end

  def test_positional_headers_with_all_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_equal "whee\nsome content type\nDecember 25th, 2007\n",
                 @s3_test.canonicalized_positional_headers
  end

  def test_positional_headers_with_only_date_header
    @s3_test.make_authenticated_request(:get, '',
                                        {'date' => 'December 25th, 2007'})
    assert_equal "\n\nDecember 25th, 2007\n",
                 @s3_test.canonicalized_positional_headers
  end

end

Keeping your methods and variables private yet testable

You might have been gritting your teeth as I just blithely used attributes and methods that should be private in the tests above. In reality, you probably don’t want the canonical_string, @headers or any of the other methods I was testing against available in the public interface. I did this because I didn’t want to obscure what we’re really trying to do here: write an S3 authentication library. However, if you’re interested, here’s a nice method for keeping those methods and attributes private in production yet available for testing.

The key to this technique is that Ruby’s classes are always open and extendible. So, you can open up the class when you’re testing it and make everything publicly accessible while still keeping everything locked down for production use. Here’s an example of how I did it while developing this class.

Here’s the class refactored to keep things private. Notice that only the make_authenticated_request method is publicly available

module S3Lib
  require 'time'

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @headers = headers
    end

    private

    def canonical_string
      "#{@verb.to_s.upcase}\n#{canonicalized_headers}"
    end

    def canonicalized_headers
      "#{canonicalized_positional_headers}"
    end

    def canonicalized_positional_headers
      POSITIONAL_HEADERS.collect do |header|
        (@headers[header] || "") + "\n"
      end.join
    end

  end
end

Now, in the same file that your tests are in, open up the class and publicize all of the methods we want to test (canonical_string, canonicalized_headers and canonicalized_positional_headers) with public versions (public_canonical_string, public_canonicalized_headers and public_canonicalized_positional_headers). Next, make any instance variables you want access to readable (@headers). Finally, re-name the method calls in your tests by pre-pending public_ to them.

# Make private methods and attributes public so that you can test them
module S3Lib
  class AuthenticatedRequest

    attr_reader :headers

    def public_canonicalized_headers
      canonicalized_headers
    end

    def public_canonicalized_positional_headers
      canonicalized_positional_headers
    end

    def public_canonical_string
      canonical_string
    end

  end
end

require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev_private')

class S3AuthenticatorTest < Test::Unit::TestCase

  def setup
    @s3_test = S3Lib::AuthenticatedRequest.new
  end

  def test_http_verb_is_uppercase
    @s3_test.make_authenticated_request(:get, '/',
                                        {'host' => 's3.amazonaws.com'})
    assert_match(/^GET\n/, @s3_test.public_canonical_string)
  end

  def test_canonical_string_contains_positional_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_match(/^GET\n#{@s3_test.public_canonicalized_positional_headers}/,
                 @s3_test.public_canonical_string)
  end

  def test_positional_headers_with_all_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_equal "whee\nsome content type\nDecember 25th, 2007\n",
                 @s3_test.public_canonicalized_positional_headers
  end

  def test_positional_headers_with_only_date_header
    @s3_test.make_authenticated_request(:get, '',
                                        {'date' => 'December 25th, 2007'})
    assert_equal "\n\nDecember 25th, 2007\n",
                 @s3_test.public_canonicalized_positional_headers
  end

end

This technique of opening the class will also come in handy when we actually write the code to talk to Amazon S3, but want to be able to test without a live internet connection.
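If writing a public_ wrapper for every method feels like too much typing, there is a shorter variation on the same idea: re-open the class in your test file and change the visibility of the existing methods with Ruby's public keyword. This is not what I did above, just an alternative sketch. The cut-down AuthenticatedRequest below is a stand-in so that the example runs on its own.

```ruby
# A cut-down stand-in for the real class, with canonical_string kept private.
module S3Lib
  class AuthenticatedRequest
    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @headers = headers
    end

    private

    def canonical_string
      "#{@verb.to_s.upcase}\n"
    end
  end
end

# In the test file only: re-open the class and make the method public.
# The production file is untouched, so the method stays private there.
module S3Lib
  class AuthenticatedRequest
    public :canonical_string
  end
end

request = S3Lib::AuthenticatedRequest.new
request.make_authenticated_request(:get, '/')
puts request.canonical_string.inspect
```

The trade-off is that your tests now call the methods by their real names, so nothing in the test tells a reader that the method is private in production. Pick whichever you find clearer.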

Phew. That was a lot of reading and coding, but we have the positional headers working and well tested now. I’ll tone down the verbiage from now on so that we can finish up without killing too many extra forests.

The Canonicalized Amazon Headers

The Amazon headers are all headers that begin with x-amz-, ignoring case. You construct the canonicalized_amazon_headers with the following steps. I’m going to write out each specification along with some tests that express that specification.

  • Find all headers that have header names that begin with x-amz-, ignoring case. These are the Amazon headers
      def test_amazon_headers_should_remove_non_amazon_headers
        @s3_test.make_authenticated_request(:get, '',
                                            {'content-type' => 'content',
                                             'some-other-header' => 'other',
                                             'x-amz-meta-one' => 'one',
                                             'x-amz-meta-two' => 'two'})
        headers = @s3_test.public_canonicalized_amazon_headers
        assert_no_match(/other/, headers)
        assert_no_match(/content/, headers)
      end

      def test_amazon_headers_should_keep_amazon_headers
        @s3_test.make_authenticated_request(:get, '',
                                            {'content-type' => 'content',
                                             'some-other-header' => 'other',
                                             'x-amz-meta-one' => 'one',
                                             'x-amz-meta-two' => 'two'})
        headers = @s3_test.public_canonicalized_amazon_headers
        assert_match(/x-amz-meta-one/, headers)
        assert_match(/x-amz-meta-two/, headers)
      end
    
  • Convert each of the Amazon header’s header names to lower case. (Just the header names, not their values)
      def test_amazon_headers_should_be_lowercase
        @s3_test.make_authenticated_request(:get, '',
                                            {'content-type' => 'content',
                                             'some-other-header' => 'other',
                                             'X-amz-meta-one' => 'one',
                                             'x-Amz-meta-two' => 'two'})
        headers = @s3_test.public_canonicalized_amazon_headers
        assert_match(/x-amz-meta-one/, headers)
        assert_match(/x-amz-meta-two/, headers)
      end
    
  • For each Amazon header, combine the header name and header value by joining them with a colon. Remove any leading whitespace on the header value as you do this
      def test_leading_spaces_get_stripped_from_header_values
        @s3_test.make_authenticated_request(:get, '',
            {'x-amz-meta-one' => ' one with a leading space',
             'x-Amz-meta-two' => ' two with a leading and trailing space '})
        headers = @s3_test.public_canonicalized_amazon_headers
        assert_match(/x-amz-meta-one:one with a leading space/, headers)
        assert_match(/x-amz-meta-two:two with a leading and trailing space /,
                     headers)
      end
    
  • If you have multiple Amazon headers with the same name, then combine the values into one value by joining them with commas, without any white space between them.
      def test_values_as_arrays_should_be_joined_as_commas
        @s3_test.make_authenticated_request(:get, '',
                                            {'x-amz-mult' => ['a', 'b', 'c']})
        headers = @s3_test.canonicalized_amazon_headers
        assert_match(/a,b,c/, headers)
      end
    
  • If any of the headers span multiple lines, un-fold them by replacing the newlines with a single space
      def test_long_amazon_headers_should_get_unfolded
        @s3_test.make_authenticated_request(:get, '',
            {'x-amz-meta-one' => "A really long header\n" +
                                 "with multiple lines."})
        headers = @s3_test.canonicalized_amazon_headers
        assert_match(/x-amz-meta-one:A really long header with multiple lines./,
                     headers)
      end
    
  • Sort the Amazon headers alphabetically by header name
      def test_amazon_headers_should_be_alphabetized
        @s3_test.make_authenticated_request(:get, '',
                                            {'content-type' => 'content',
                                             'some-other-header' => 'other',
                                             'X-amz-meta-one' => 'one',
                                             'x-Amz-meta-two' => 'two',
                                             'x-amz-meta-zed' => 'zed',
                                             'x-amz-meta-alpha' => 'alpha'})
        headers = @s3_test.canonicalized_amazon_headers
        # /m on the reg-exp makes .* include newlines
        assert_match(/alpha.*one.*two.*zed/m, headers)
      end
    
  • Join the headers together with new-lines (\n)

Here is some code that passes those tests, which hopefully means it achieves the specifications:

class Hash

  def downcase_keys
    res = {}
    each do |key, value|
      key = key.downcase if key.respond_to?(:downcase)
      res[key] = value
    end
    res
  end

  def join_values(separator = ',')
    res = {}
    each do |key, value|
      res[key] = value.respond_to?(:join) ? value.join(separator) : value
    end
    res
  end

end

module S3Lib
  require 'time'

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    attr_reader :headers
    POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @headers = headers.downcase_keys.join_values
    end

    def canonical_string
      "#{@verb.to_s.upcase}\n#{canonicalized_headers}"
    end

    def canonicalized_headers
      "#{canonicalized_positional_headers}#{canonicalized_amazon_headers}"
    end

    def canonicalized_positional_headers
      POSITIONAL_HEADERS.collect do |header|
        (@headers[header] || "") + "\n"
      end.join
    end

    def canonicalized_amazon_headers

      # select all headers that start with x-amz-
      amazon_headers = @headers.select do |header, value|
        header =~ /^x-amz-/
      end

      # Sort them alphabetically by key
      amazon_headers = amazon_headers.sort do |a, b|
        a[0] <=> b[0]
      end

      # Collect all of the amazon headers like this:
      # {key}:{value}\n
      # The value has to have any whitespace on the left stripped from it
      # and any new-lines replaced by a single space.
      # Finally, return the headers joined together as a single string.
      amazon_headers.collect do |header, value|
        "#{header}:#{value.lstrip.gsub("\n"," ")}\n"
      end.join
    end

  end
end

What did I just do to Hash?

If you’re not used to Ruby, you might have gotten a little worried when you saw the additions I made to the Hash class. I opened up a base class and added a couple of methods to it. Yikes! You might think that this is crazy and will lead to all kinds of problems, but it is accepted practice in the Ruby world. Personally, I love it: it makes my code more readable and concise, and it has never caused me a problem.
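
To make the two Hash additions a little more concrete, here is a small sketch of what they do to a headers hash. The two methods are copied from the listing above; the header names and values are just sample data I made up for illustration.

```ruby
class Hash
  # Copy of the hash with String keys converted to lower case
  def downcase_keys
    res = {}
    each do |key, value|
      key = key.downcase if key.respond_to?(:downcase)
      res[key] = value
    end
    res
  end

  # Copy of the hash with Array values joined into a single String
  def join_values(separator = ',')
    res = {}
    each do |key, value|
      res[key] = value.respond_to?(:join) ? value.join(separator) : value
    end
    res
  end
end

# Sample data only: any mixed-case header hash would do
raw = { 'Content-Type' => 'image/jpeg',
        'X-Amz-Meta-ReviewedBy' => ['joe@johnsmith.net', 'jane@johnsmith.net'] }

cleaned = raw.downcase_keys.join_values
cleaned.each { |key, value| puts "#{key}: #{value}" }
# content-type: image/jpeg
# x-amz-meta-reviewedby: joe@johnsmith.net,jane@johnsmith.net
```

Both methods build and return a new hash, so the hash that the caller passed in is left untouched.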

Date Stamping Requests

If you were feeling especially alert, you might have noticed that I side-stepped one requirement in the canonicalized headers section: all requests to S3 must be time-stamped. A request can be time-stamped in two ways: through the positional date header, or through an Amazon header called x-amz-date. If both are present, the x-amz-date header overrides the date header. To make life easier for the users of your library, let’s also have the library supply a date header equal to the current time if none is passed in. Here’s a set of tests that express that spec:

    def test_date_should_be_added_if_not_passed_in
      @s3_test.make_authenticated_request(:get, '')
      assert @s3_test.headers.has_key?('date')
    end

    def test_positional_headers_with_no_headers_should_have_date_defined
      @s3_test.make_authenticated_request(:get, '')
      date = @s3_test.headers['date']
      assert_equal "\n\n#{date}\n", @s3_test.canonicalized_positional_headers
    end

    def test_xamzdate_should_override_date_header
      @s3_test.make_authenticated_request(:get, '',
        {'date' => 'December 15, 2005',
         'x-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000'})
      headers = @s3_test.canonicalized_headers
      assert_match /2007/, headers
      assert_no_match /2005/, headers
    end

    def test_xamzdate_should_override_capitalized_date_header
      @s3_test.make_authenticated_request(:get, '',
        {'Date' => 'December 15, 2005',
         'X-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000'})
      headers = @s3_test.canonicalized_headers
      assert_match /2007/, headers
      assert_no_match /2005/, headers
    end

We’ll use the fix_date method to add a date header if it doesn’t exist. Notice that the test accesses the @headers hash. The line that reads attr_reader :headers makes that accessible to our tests. Here’s the code:

    module S3Lib
      require 'time'

      def self.request(verb, request_path, headers = {})
        s3requester = AuthenticatedRequest.new()
        s3requester.make_authenticated_request(verb, request_path, headers)
      end

      class AuthenticatedRequest

        attr_reader :headers
        POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']

        def make_authenticated_request(verb, request_path, headers = {})
          @verb = verb
          # Downcase the keys first so that 'Date' and 'X-Amz-Date' are found
          @headers = headers.downcase_keys.join_values
          fix_date
        end

        # Supply a date if none was given; x-amz-date overrides date
        def fix_date
          @headers['date'] ||= Time.now.httpdate
          @headers.delete('date') if @headers.has_key?('x-amz-date')
        end

        def canonical_string
          "#{@verb.to_s.upcase}\n#{canonicalized_headers}"
        end

        def canonicalized_headers
          "#{canonicalized_positional_headers}#{canonicalized_amazon_headers}"
        end

        def canonicalized_positional_headers
          POSITIONAL_HEADERS.collect do |header|
            (@headers[header] || "") + "\n"
          end.join
        end

        # canonicalized_amazon_headers is unchanged from the previous listing

      end
    end
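
If you want to see the two date rules in isolation, here is a minimal sketch that applies them to a plain hash. The with_date_rules helper is hypothetical (it is not part of S3Lib), and it assumes the header names have already been converted to lower case.

```ruby
require 'time' # adds Time#httpdate

# Hypothetical helper, not part of S3Lib: applies the two date rules to a
# hash with lower-case keys and returns a new hash.
def with_date_rules(headers, now = Time.now)
  fixed = headers.dup
  # Rule 1: supply a date if the caller did not pass one in
  fixed['date'] ||= now.httpdate
  # Rule 2: x-amz-date overrides date, so drop date when both are present
  fixed.delete('date') if fixed.has_key?('x-amz-date')
  fixed
end

now = Time.utc(2007, 3, 27, 19, 36, 42)
no_date = with_date_rules({}, now)
puts no_date['date']               # Tue, 27 Mar 2007 19:36:42 GMT

overridden = with_date_rules({ 'date' => 'December 15, 2005',
                               'x-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000' })
puts overridden.has_key?('date')   # false
```

Because the date is deleted rather than blanked, the positional date slot in the canonical string comes out as an empty line whenever x-amz-date is used.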

The Canonicalized Resource

The canonicalized_resource is given by

    /<bucket><uri><sub-resource>

The canonicalized_resource must start with a forward slash (/), it must include the bucket name (even if the bucket is not in the URI), and then comes the URI and the sub-resource (if any). The bucket name must be lower case. Here are some tests that express this.

    require 'test/unit'
    require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

    class S3AuthenticatorCanonicalResourceTest < Test::Unit::TestCase

      def setup
        @s3_test = S3Lib::AuthenticatedRequest.new
      end

      def test_forward_slash_is_always_added
        @s3_test.make_authenticated_request(:get, '')
        assert_match /^\//, @s3_test.canonicalized_resource
      end

      def test_bucket_name_in_uri_should_get_passed_through
        @s3_test.make_authenticated_request(:get, 'my_bucket')
        assert_match /^\/my_bucket/, @s3_test.canonicalized_resource
      end

      def test_canonicalized_resource_should_include_uri
        @s3_test.make_authenticated_request(:get, 'my_bucket/vampire.jpg')
        assert_match /vampire.jpg$/, @s3_test.canonicalized_resource
      end

      def test_canonicalized_resource_should_include_sub_resource
        @s3_test.make_authenticated_request(:get, 'my_bucket/vampire.jpg?torrent')
        assert_match /vampire.jpg\?torrent$/, @s3_test.canonicalized_resource
      end

      def test_bucket_name_with_virtual_hosting
        @s3_test.make_authenticated_request(:get, '/',
          {'host' => 'some_bucket.s3.amazonaws.com'})
        assert_match /some_bucket\//, @s3_test.canonicalized_resource
        assert_no_match /s3.amazonaws.com/, @s3_test.canonicalized_resource
      end

      def test_bucket_name_with_cname_virtual_hosting
        @s3_test.make_authenticated_request(:get, '/',
          {'host' => 'some_bucket.example.com'})
        assert_match /^\/some_bucket.example.com/, @s3_test.canonicalized_resource
      end

    end

Here is the AuthenticatedRequest library that passes these tests. Note the changes that have been made:

  • The HOST constant has been added and set to 's3.amazonaws.com'
  • The get_bucket_name method has been created. It is called from the make_authenticated_request method. This method extracts the bucket from the host header and saves it to the @bucket instance variable.
  • The canonicalized_resource method creates the canonicalized resource string. It is called in the canonical_string method.

    class AuthenticatedRequest

      attr_reader :headers
      POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']
      HOST = 's3.amazonaws.com'

      def make_authenticated_request(verb, request_path, headers = {})
        @verb = verb
        @request_path = request_path.gsub(/^\//,'') # Strip off the leading '/'

        @headers = headers.downcase_keys.join_values
        fix_date
        get_bucket_name
      end

      def fix_date
        @headers['date'] ||= Time.now.httpdate
        @headers.delete('date') if @headers.has_key?('x-amz-date')
      end

      def canonical_string
        "#{@verb.to_s.upcase}\n#{canonicalized_headers}#{canonicalized_resource}"
      end

      def canonicalized_headers
        "#{canonicalized_positional_headers}#{canonicalized_amazon_headers}"
      end

      def canonicalized_positional_headers
        POSITIONAL_HEADERS.collect do |header|
          (@headers[header] || "") + "\n"
        end.join
      end

      def canonicalized_amazon_headers

        # select all headers that start with x-amz-
        amazon_headers = @headers.select do |header, value|
          header =~ /^x-amz-/
        end

        # Sort them alphabetically by key
        amazon_headers = amazon_headers.sort do |a, b|
          a[0] <=> b[0]
        end

        # Collect all of the amazon headers like this:
        # {key}:{value}\n
        # The value has to have any whitespace on the left stripped from it
        # and any new-lines replaced by a single space.
        # Finally, return the headers joined together as a single string.
        amazon_headers.collect do |header, value|
          "#{header}:#{value.lstrip.gsub("\n"," ")}\n"
        end.join
      end

      # @bucket is either empty or ends with a '/', so it can simply be
      # placed between the leading '/' and the request path.
      def canonicalized_resource
        "/#{@bucket}#{@request_path}"
      end

      def get_bucket_name
        @bucket = ""
        return unless @headers.has_key?('host')
        @headers['host'] = @headers['host'].downcase
        # A port number, if there is one, comes at the end of the host
        # name and is not part of the bucket name.
        host = @headers['host'].gsub(/:\d+$/, '')
        return if host == HOST
        if host =~ /^([^.]+)\.#{Regexp.escape(HOST)}$/ # Virtual hosting
          @bucket = $1 + '/'
        else
          # CNAME Virtual hosting
          @bucket = host + '/'
        end
      end

    end
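
The three ways of naming a bucket are easier to compare side by side. What follows is a stripped-down sketch of the same idea as two stand-alone functions. The names S3_HOST, bucket_from_host and canonicalized_resource_for are mine, chosen for illustration, and the host names come from the Developer’s Guide samples used later in this chapter.

```ruby
S3_HOST = 's3.amazonaws.com'

# Hypothetical stand-alone version of the bucket lookup
def bucket_from_host(host)
  return '' if host.nil?
  host = host.downcase.sub(/:\d+\z/, '')  # ignore any port number
  return '' if host == S3_HOST             # bucket (if any) is in the path
  if host =~ /\A([^.]+)\.#{Regexp.escape(S3_HOST)}\z/
    $1    # virtual hosting, e.g. johnsmith.s3.amazonaws.com
  else
    host  # CNAME virtual hosting: the host name is the bucket name
  end
end

def canonicalized_resource_for(host, request_path)
  bucket = bucket_from_host(host)
  path = request_path.sub(/\A\//, '')      # strip the leading '/'
  bucket.empty? ? "/#{path}" : "/#{bucket}/#{path}"
end

puts canonicalized_resource_for('johnsmith.s3.amazonaws.com', '/photos/puppy.jpg')
# /johnsmith/photos/puppy.jpg
puts canonicalized_resource_for('s3.amazonaws.com', '/johnsmith/photos/puppy.jpg')
# /johnsmith/photos/puppy.jpg
puts canonicalized_resource_for('static.johnsmith.net:8080', '/db-backup.dat.gz')
# /static.johnsmith.net/db-backup.dat.gz
```

The first two requests address the same object in two different ways and end up with the same canonicalized resource, which is the whole point of canonicalizing it.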

The Full Signature

Now that we have all of the parts of the signature coded up, we can use some samples provided by the S3 Developer’s Guide to test that it works when we bring it all together. The examples are in the section of the document on REST authentication at http://docs.amazonwebservices.com/AmazonS3/2006-03-01/RESTAuthentication.html.

Okay, so let’s take a look at the first example and code up a test to use it. Here’s the example:

Request:

    GET /photos/puppy.jpg HTTP/1.1
    Host: johnsmith.s3.amazonaws.com
    Date: Tue, 27 Mar 2007 19:36:42 +0000
    Authorization: AWS 0PN5J17HBGZHT7JJ3X82:xXjDGYUmKxnwqr5KXNPGldn5LbA=

canonical_string:

    GET\n
    \n
    \n
    Tue, 27 Mar 2007 19:36:42 +0000\n
    /johnsmith/photos/puppy.jpg

There are two interesting features to note:

  • The bucket is provided in the Host header, not in the URL, but it still shows up in the canonicalized_resource at the end of the canonical_string.
  • The actual signed Authorization header is provided for the sample request. I’ll talk about this more in the next section on signing the request.

Let’s translate that example to a unit test and see if all of our hard work comes together.

    require 'test/unit'
    require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

    class S3AuthenticatorTest < Test::Unit::TestCase

      def setup
        @s3_test = S3Lib::AuthenticatedRequest.new
      end

      # http://developer.amazonwebservices.com/connect/entry.jspa?externalID=123&categoryID=48
      def test_dg_sample_one
        @s3_test.make_authenticated_request(:get, '/photos/puppy.jpg',
          {'Host' => 'johnsmith.s3.amazonaws.com',
           'Date' => 'Tue, 27 Mar 2007 19:36:42 +0000'})
        expected_canonical_string = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n" +
                                    "/johnsmith/photos/puppy.jpg"
        assert_equal expected_canonical_string, @s3_test.canonical_string
      end

    end

Save that in test/canonical_string_tests.rb and run it.

    $> ruby test/canonical_string_tests.rb
    Loaded suite test/canonical_string_tests
    Started
    .
    Finished in 0.000447 seconds.

    1 tests, 1 assertions, 0 failures, 0 errors

Phew. Everything works as planned.

The next step is to take all of the examples from the Developer’s Guide, translate them to unit tests and make sure they pass.

    require 'test/unit'
    require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

    class S3AuthenticatorTest < Test::Unit::TestCase

      def setup
        @s3_test = S3Lib::AuthenticatedRequest.new
      end

      # http://developer.amazonwebservices.com/connect/entry.jspa?externalID=123&categoryID=48
      def test_dg_sample_one
        @s3_test.make_authenticated_request(:get, '/photos/puppy.jpg',
          {'Host' => 'johnsmith.s3.amazonaws.com',
           'Date' => 'Tue, 27 Mar 2007 19:36:42 +0000'})
        expected_canonical_string = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n" +
                                    "/johnsmith/photos/puppy.jpg"
        assert_equal expected_canonical_string, @s3_test.canonical_string
      end

      def test_dg_sample_two
        @s3_test.make_authenticated_request(:put, '/photos/puppy.jpg',
          {'Content-Type' => 'image/jpeg',
           'Content-Length' => '94328',
           'Host' => 'johnsmith.s3.amazonaws.com',
           'Date' => 'Tue, 27 Mar 2007 21:15:45 +0000'})
        expected_canonical_string = "PUT\n\nimage/jpeg\n" +
                                    "Tue, 27 Mar 2007 21:15:45 +0000\n" +
                                    "/johnsmith/photos/puppy.jpg"
        assert_equal expected_canonical_string, @s3_test.canonical_string
      end

      def test_dg_sample_three
        @s3_test.make_authenticated_request(:get, '',
          {'prefix' => 'photos',
           'max-keys' => '50',
           'marker' => 'puppy',
           'host' => 'johnsmith.s3.amazonaws.com',
           'date' => 'Tue, 27 Mar 2007 19:42:41 +0000'})
        assert_equal "GET\n\n\nTue, 27 Mar 2007 19:42:41 +0000\n/johnsmith/",
                     @s3_test.canonical_string
      end

      def test_dg_sample_four
        @s3_test.make_authenticated_request(:get, '?acl',
          {'host' => 'johnsmith.s3.amazonaws.com',
           'date' => 'Tue, 27 Mar 2007 19:44:46 +0000'})
        assert_equal "GET\n\n\nTue, 27 Mar 2007 19:44:46 +0000\n/johnsmith/?acl",
                     @s3_test.canonical_string
      end

      def test_dg_sample_five
        @s3_test.make_authenticated_request(:delete, '/johnsmith/photos/puppy.jpg',
          {'User-Agent' => 'dotnet',
           'host' => 's3.amazonaws.com',
           'date' => 'Tue, 27 Mar 2007 21:20:27 +0000',
           'x-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000'})
        assert_equal "DELETE\n\n\n\nx-amz-date:Tue, 27 Mar 2007 21:20:26 +0000\n" +
                     "/johnsmith/photos/puppy.jpg",
                     @s3_test.canonical_string
      end

      def test_dg_sample_six
        @s3_test.make_authenticated_request(:put, '/db-backup.dat.gz',
          {'User-Agent' => 'curl/7.15.5',
           'host' => 'static.johnsmith.net:8080',
           'date' => 'Tue, 27 Mar 2007 21:06:08 +0000',
           'x-amz-acl' => 'public-read',
           'content-type' => 'application/x-download',
           'Content-MD5' => '4gJE4saaMU4BqNR0kLY+lw==',
           'X-Amz-Meta-ReviewedBy' => ['joe@johnsmith.net', 'jane@johnsmith.net'],
           'X-Amz-Meta-FileChecksum' => '0x02661779',
           'X-Amz-Meta-ChecksumAlgorithm' => 'crc32',
           'Content-Disposition' => 'attachment; filename=database.dat',
           'Content-Encoding' => 'gzip',
           'Content-Length' => '5913339'})
        expected_canonical_string =
          "PUT\n4gJE4saaMU4BqNR0kLY+lw==\napplication/x-download\n" +
          "Tue, 27 Mar 2007 21:06:08 +0000\n" +
          "x-amz-acl:public-read\nx-amz-meta-checksumalgorithm:crc32\n" +
          "x-amz-meta-filechecksum:0x02661779\n" +
          "x-amz-meta-reviewedby:joe@johnsmith.net,jane@johnsmith.net\n" +
          "/static.johnsmith.net/db-backup.dat.gz"
        assert_equal expected_canonical_string, @s3_test.canonical_string
      end

    end

Now, let’s run them:

    $> ruby test/canonical_string_tests.rb
    Loaded suite test/canonical_string_tests
    Started
    ......
    Finished in 0.001213 seconds.

    6 tests, 6 assertions, 0 failures, 0 errors

SUCCESS! Ahh, that feels good. Only one step remains before we have a fully working library: we have to take that canonical_string and encode it.

Signing the Request

The whole point of this authentication procedure is to digitally sign your request by creating the canonical_string and using it to create an Authorization header. The Authorization header looks like this:

    Authorization = AWS <AWSAccessKeyId>:<signature>

and the signature like this:

    Signature = Base64( HMAC-SHA1( SecretAccessKey, UTF-8-Encoding-Of( canonical_string ) ) )

We spent a lot of time figuring out how to make the canonical_string. The next step is much easier: we need to take that canonical_string and feed it through the algorithms to encode it. The method of encoding the canonical_string is highly language dependent, so I’m going to be lazy and point you to the Amazon S3 Getting Started Guide (http://docs.amazonwebservices.com/AmazonS3/2006-03-01/gsg/). This page in the guide links to sample implementations in Java, C#, Perl, PHP, Ruby and Python: http://docs.amazonwebservices.com/AmazonS3/2006-03-01/gsg/PreparingTheSamples.html. Here is some Ruby code that does the trick:

    require 'base64'
    require 'openssl'

    module S3Lib

      class AuthenticatedRequest

        def make_authenticated_request(verb, request_path, headers = {})
          @verb = verb
          @request_path = request_path.gsub(/^\//,'') # Strip off the leading '/'

          # New: read the credentials from the environment
          @amazon_id = ENV['AMAZON_ACCESS_KEY_ID']
          @amazon_secret = ENV['AMAZON_SECRET_ACCESS_KEY']

          @headers = headers.downcase_keys.join_values
          fix_date
          get_bucket_name
        end

        # .....

        # New: sign the canonical_string with HMAC-SHA1 using the secret key,
        # then Base64 encode the result
        def authorization_string
          generator = OpenSSL::Digest.new('sha1')
          encoded_canonical =
            Base64.encode64(OpenSSL::HMAC.digest(generator, @amazon_secret,
                                                 canonical_string)).strip

          "AWS #{@amazon_id}:#{encoded_canonical}"
        end

      end
    end

I’ve added the @amazon_id and @amazon_secret instance variables in the make_authenticated_request method, and added the authorization_string method that does all of the heavy lifting. All of the required libraries are included in the base Ruby distribution, so you should be able to just run this.
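
If you’d like to try the signing step on its own before wiring it into the class, here is a minimal stand-alone sketch. It signs the canonical string of the first Developer’s Guide sample with the (non-working) sample credentials, and compares the result to the signature shown in that sample. I’ve used Base64.strict_encode64 here, which never adds newlines, in place of encode64 followed by strip.

```ruby
require 'base64'
require 'openssl'

# Non-working sample credentials from the S3 Developer's Guide
access_key = '0PN5J17HBGZHT7JJ3X82'
secret_key = 'uV3F3YluFJax1cknvbcGwgjvx4QpvB+leU8dUj2o'

# The canonical string for the first Developer's Guide sample
canonical_string = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n" +
                   "/johnsmith/photos/puppy.jpg"

# HMAC-SHA1 of the canonical string, keyed with the secret key...
digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'),
                              secret_key, canonical_string)
# ...then Base64 encoded
signature = Base64.strict_encode64(digest)

puts "Authorization: AWS #{access_key}:#{signature}"

# The Developer's Guide lists this signature for the sample request
guide_signature = 'xXjDGYUmKxnwqr5KXNPGldn5LbA='
puts "Matches the Developer's Guide: #{signature == guide_signature}"
```

Note that only the secret key goes into the HMAC. The access key ID just travels along in the header so that S3 knows whose secret key to check the signature against.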

Let’s write some unit tests to see if that really works. Luckily, we have the examples from the Amazon S3 Developer’s Guide to work with. The examples all use the same set of (fake) authentication credentials.

Table 4.1. S3 Authentication Credentials used in the examples

    Parameter            Value
    AWSAccessKeyId       0PN5J17HBGZHT7JJ3X82
    AWSSecretAccessKey   uV3F3YluFJax1cknvbcGwgjvx4QpvB+leU8dUj2o

We probably want to set these in the setup method of our tests. Remember that the S3Lib library gets its credentials from the AMAZON_ACCESS_KEY_ID and AMAZON_SECRET_ACCESS_KEY environment variables, so we can set them using Ruby’s ENV object:

    def setup
      # The id and secret key are non-working credentials
      # from the S3 Developer's Guide
      # http://developer.amazonwebservices.com/connect/entry.jspa
      #   ?externalID=123&categoryID=48
      ENV['AMAZON_ACCESS_KEY_ID'] = '0PN5J17HBGZHT7JJ3X82'
      ENV['AMAZON_SECRET_ACCESS_KEY'] = 'uV3F3YluFJax1cknvbcGwgjvx4QpvB+leU8dUj2o'
      @s3_test = S3Lib::AuthenticatedRequest.new
    end

We can then re-write the tests from the previous section to include a test for the Authorization header. Something like this:

    require 'test/unit'
    require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

    class S3AuthenticatorTest < Test::Unit::TestCase

      def setup
        # The id and secret key are non-working credentials from the
        # S3 Developer's Guide
        # See http://developer.amazonwebservices.com/connect/entry.jspa
        #           ?externalID=123&categoryID=48
        ENV['AMAZON_ACCESS_KEY_ID'] = '0PN5J17HBGZHT7JJ3X82'
        ENV['AMAZON_SECRET_ACCESS_KEY'] = 'uV3F3YluFJax1cknvbcGwgjvx4QpvB+leU8dUj2o'
        @s3_test = S3Lib::AuthenticatedRequest.new
      end

      # See http://developer.amazonwebservices.com/connect/entry.jspa
      #          ?externalID=123&categoryID=48
      def test_dg_sample_one
        @s3_test.make_authenticated_request(:get, '/photos/puppy.jpg',
          {'Host' => 'johnsmith.s3.amazonaws.com',
           'Date' => 'Tue, 27 Mar 2007 19:36:42 +0000'})
        expected_canonical_string = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n" +
                                    "/johnsmith/photos/puppy.jpg"
        assert_equal expected_canonical_string, @s3_test.canonical_string
        assert_equal "AWS 0PN5J17HBGZHT7JJ3X82:xXjDGYUmKxnwqr5KXNPGldn5LbA=",
                     @s3_test.authorization_string
      end

      def test_dg_sample_two
        @s3_test.make_authenticated_request(:put, '/photos/puppy.jpg',
          {'Content-Type' => 'image/jpeg',
           'Content-Length' => '94328',
           'Host' => 'johnsmith.s3.amazonaws.com',
           'Date' => 'Tue, 27 Mar 2007 21:15:45 +0000'})
        expected_canonical_string = "PUT\n\nimage/jpeg\n" +
                                    "Tue, 27 Mar 2007 21:15:45 +0000\n" +
                                    "/johnsmith/photos/puppy.jpg"
        assert_equal expected_canonical_string, @s3_test.canonical_string
        assert_equal "AWS 0PN5J17HBGZHT7JJ3X82:hcicpDDvL9SsO6AkvxqmIWkmOuQ=",
                     @s3_test.authorization_string
      end

      def test_dg_sample_three
        @s3_test.make_authenticated_request(:get, '',
          {'prefix' => 'photos',
           'max-keys' => '50',
           'marker' => 'puppy',
           'host' => 'johnsmith.s3.amazonaws.com',
           'date' => 'Tue, 27 Mar 2007 19:42:41 +0000'})
        assert_equal "GET\n\n\nTue, 27 Mar 2007 19:42:41 +0000\n/johnsmith/",
                     @s3_test.canonical_string
        assert_equal 'AWS 0PN5J17HBGZHT7JJ3X82:jsRt/rhG+Vtp88HrYL706QhE4w4=',
                     @s3_test.authorization_string
      end

      def test_dg_sample_four
        @s3_test.make_authenticated_request(:get, '?acl',
          {'host' => 'johnsmith.s3.amazonaws.com',
           'date' => 'Tue, 27 Mar 2007 19:44:46 +0000'})
        assert_equal "GET\n\n\nTue, 27 Mar 2007 19:44:46 +0000\n/johnsmith/?acl",
                     @s3_test.canonical_string
        assert_equal 'AWS 0PN5J17HBGZHT7JJ3X82:thdUi9VAkzhkniLj96JIrOPGi0g=',
                     @s3_test.authorization_string
      end

      def test_dg_sample_five
        @s3_test.make_authenticated_request(:delete, '/johnsmith/photos/puppy.jpg',
          {'User-Agent' => 'dotnet',
           'host' => 's3.amazonaws.com',
           'date' => 'Tue, 27 Mar 2007 21:20:27 +0000',
           'x-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000'})
        assert_equal "DELETE\n\n\n\nx-amz-date:Tue, 27 Mar 2007 21:20:26 +0000\n" +
                     "/johnsmith/photos/puppy.jpg",
                     @s3_test.canonical_string
        assert_equal 'AWS 0PN5J17HBGZHT7JJ3X82:k3nL7gH3+PadhTEVn5Ip83xlYzk=',
                     @s3_test.authorization_string
      end

      def test_dg_sample_six
        @s3_test.make_authenticated_request(:put, '/db-backup.dat.gz',
          {'User-Agent' => 'curl/7.15.5',
           'host' => 'static.johnsmith.net:8080',
           'date' => 'Tue, 27 Mar 2007 21:06:08 +0000',
           'x-amz-acl' => 'public-read',
           'content-type' => 'application/x-download',
           'Content-MD5' => '4gJE4saaMU4BqNR0kLY+lw==',
           'X-Amz-Meta-ReviewedBy' => ['joe@johnsmith.net', 'jane@johnsmith.net'],
           'X-Amz-Meta-FileChecksum' => '0x02661779',
           'X-Amz-Meta-ChecksumAlgorithm' => 'crc32',
           'Content-Disposition' => 'attachment; filename=database.dat',
           'Content-Encoding' => 'gzip',
           'Content-Length' => '5913339'})
        expected_canonical_string =
          "PUT\n4gJE4saaMU4BqNR0kLY+lw==\napplication/x-download\n" +
          "Tue, 27 Mar 2007 21:06:08 +0000\n" +
          "x-amz-acl:public-read\nx-amz-meta-checksumalgorithm:crc32\n" +
          "x-amz-meta-filechecksum:0x02661779\n" +
          "x-amz-meta-reviewedby:joe@johnsmith.net,jane@johnsmith.net\n" +
          "/static.johnsmith.net/db-backup.dat.gz"
        assert_equal expected_canonical_string, @s3_test.canonical_string
        assert_equal 'AWS 0PN5J17HBGZHT7JJ3X82:C0FlOtU8Ylb9KDTpZqYkZPX91iI=',
                     @s3_test.authorization_string
      end

    end

Making the Request

We’ve finally got to the point where we have a signature that we can use to sign the request. The last step will be to actually make the request. We’ll be using the rest-open-uri library to do this, so if you haven’t already installed it, do a

    $> sudo gem install rest-open-uri

from the command line (omit the sudo if you’re on Windows) to get it installed.

Once you have the rest-open-uri gem installed, making the request is simple. The gem builds on open-uri, which is in the Ruby Standard Library. open-uri extends the Kernel::open method so that any name that starts with “xxx://” is opened as a URL. So, to open a URL, you just use something like this:

    require 'open-uri'
    reddit = open('http://reddit.com')
    puts reddit.readlines

rest-open-uri (http://rubyforge.org/projects/rest-open-uri/) is a library by Leonard Richardson that extends open-uri, adding support for all of the RESTful verbs. You make a PUT request by adding :method => :put to the options hash when you make the call:

    require 'rubygems'
    require 'rest-open-uri'

    # PUT to http://example.com/some_resource
    open('http://example.com/some_resource', :method => :put)
    # DELETE http://example.com/deleteable_resource
    open('http://example.com/deleteable_resource', :method => :delete)

To make the request, then, we just need to require the rest-open-uri library and then add the following line to the make_authenticated_request method:

    req = open(uri, @headers.merge(:method => @verb,
      'Authorization' => authorization_string))

Okay, let’s try this sucker out! First, make sure that you have set your environment variables correctly so that you can authenticate to S3. AMAZON_ACCESS_KEY_ID should be set to your Amazon ID and AMAZON_SECRET_ACCESS_KEY to your Amazon Secret Key. On OS X or Unix, you can check the environment by typing env at the command line:

    $> env | grep AMAZON
    AMAZON_ACCESS_KEY_ID=your_amazon_access_key_which_is_a_bunch_of_numbers
    AMAZON_SECRET_ACCESS_KEY=your_secret_amazon_access_key

Okay, now that we’re sure about the authentication, let’s go do some testing

    $> irb
    >> require 's3_authenticator.rb'
    => true
    >> S3Lib.request(:get, '/spatten_presentations')
    => #<StringIO:0x1741b60>
    >> S3Lib.request(:get, '/spatten_presentations').read
    => "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n
    <ListBucketResult xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">
      <Name>spatten_presentations</Name>
      <Prefix></Prefix>
      <Marker></Marker>
      <MaxKeys>1000</MaxKeys>
      <IsTruncated>false</IsTruncated>
      <Contents>
        <Key>ploticus_dsl.pdf</Key>
        <LastModified>2007-09-12T16:09:24.000Z</LastModified>
        <ETag>&quot;94ca8590f028f8be0310bd5b2fabafdc&quot;</ETag>
        <Size>509594</Size>
        <Owner>
          <ID>9d92623ba6dd9d7cc06a7b8bcc46381e7c646f72d769214012f7e91b50c0de0f</ID>
          <DisplayName>scottpatten</DisplayName>
        </Owner>
        <StorageClass>STANDARD</StorageClass>
      </Contents>
      <Contents>
        <Key>s3-on-rails.pdf</Key>
        <LastModified>2007-12-05T19:38:32.000Z</LastModified>
        <ETag>&quot;891b100f53155b8570bc5e25b1e10f97&quot;</ETag>
        <Size>184748</Size>
        <Owner>
          <ID>9d92623ba6dd9d7cc06a7b8bcc46381e7c646f72d769214012f7e91b50c0de0f</ID>
          <DisplayName>scottpatten</DisplayName>
        </Owner>
        <StorageClass>STANDARD</StorageClass>
      </Contents>
    </ListBucketResult>"

Hey, it works! Notice that the first request we did just returned a StringIO object. That’s what the open command returns. To get at the body of the response, we call the read method on the StringIO object.

Note

If you want to read an IO object more than once, you need to rewind it between reads, like this:

    request.read
    request.rewind
    request.read
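
Here is a small sketch that shows why the rewind is needed. It uses a plain StringIO as a stand-in for the object that open returns, and the XML in it is made-up sample content.

```ruby
require 'stringio'

# Stand-in for a response; the contents are sample data only
response = StringIO.new("<ListBucketResult/>")

first  = response.read   # reads everything up to the end of the stream
second = response.read   # already at the end, so this is an empty string
response.rewind          # move back to the start of the stream
third  = response.read   # the full contents again

puts first == third      # true
puts second.empty?       # true
```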

Notice that the listing shows the bucket we requested (spatten_presentations), along with some information about that bucket and a listing of all of the objects in that bucket. We’ll be talking more about the XML and how to parse it in the S3 API Recipes coming up shortly.

Over-riding the open Method

We don’t want our tests to call S3 all of the time. It really slows things down and means we can’t test when we’re away from an internet connection. To fix this, we can just over-ride the open method in the S3Lib::AuthenticatedRequest class. I did it like this:

module S3Lib
  class AuthenticatedRequest

    # Over-ride RestOpenURI#open
    def open(uri, headers)
      {:uri => uri, :headers => headers}
    end

  end
end

As you can see, it just returns a hash showing the parameters passed in, and never makes a call to the internet.
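Here is a rough sketch of how a test might use that over-ride. So that it runs on its own, it starts with a cut-down stand-in for S3Lib::AuthenticatedRequest: the HOST constant, the way the URI is built and the :method entry in the headers are assumptions made for this example, and the real make_authenticated_request also signs the request.

```ruby
# Stand-in for the real library so that this sketch runs on its own.
# HOST, the URI layout and the :method entry are assumptions for illustration.
module S3Lib
  class AuthenticatedRequest
    HOST = 's3.amazonaws.com'

    def make_authenticated_request(verb, request_path, headers = {})
      uri = "http://#{HOST}/#{request_path}"
      open(uri, headers.merge(:method => verb))
    end
  end
end

# The over-ride from above: report what would have been sent
# instead of making a call to S3.
module S3Lib
  class AuthenticatedRequest
    def open(uri, headers)
      {:uri => uri, :headers => headers}
    end
  end
end

requester = S3Lib::AuthenticatedRequest.new
result = requester.make_authenticated_request(:get, 'spatten_presentations',
                                              'x-amz-acl' => 'public-read')

puts result[:uri]
puts result[:headers].inspect
```

A test can then make assertions about the URI and headers in the returned hash without ever touching the network.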

Error Handling

We now have a working authentication library, and we are almost ready to actually start talking to S3. There’s one more thing we should do, however, that will make our lives much easier and save us tons of time while we’re building the rest of the S3 library. We need to add in some error handling. To illustrate why, let’s try making a new object in a bucket using the current library.

$> irb
>> require 'code/s3_code/library/s3_authenticator'
=> true
>> S3Lib.request(:put, "spatten_sample_bucket")
=> #<StringIO:0x1741e94>
>> S3Lib.request(:put, "spatten_sample_bucket/sample_object", :body => "this is a test")
OpenURI::HTTPError: 403 Forbidden
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:320:in `open_http'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:659:in `buffer_open'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:194:in `open_loop'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:192:in `catch'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:192:in `open_loop'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:162:in `open_uri'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:561:in `open'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:35:in `open'
from ./code/s3_code/library/s3_authenticator.rb:70:in `make_authenticated_request'
from ./code/s3_code/library/s3_authenticator.rb:37:in `request'
from (irb):3
>> puts req.body
NoMethodError: undefined method `body' for nil:NilClass
        from (irb):5

What’s going on here? I can make a PUT request to create a bucket, but I can’t make a PUT request to create an object. Hmmm. There’s really no way to figure out what’s going on, either, as the request we’re making raises an error and leaves us with nothing to inspect. Luckily, Amazon returns some information about what the error was. What we need to do is trap the error as it occurs and grab the information from it. Looking at the error more closely, we see that the OpenURI library is raising an OpenURI::HTTPError. Let’s add some code to the S3Lib::request method to trap that error and see what information we can extract from it.

def self.request(verb, request_path, headers = {})
  begin
    s3requester = AuthenticatedRequest.new()
    req = s3requester.make_authenticated_request(verb, request_path, headers)
  rescue OpenURI::HTTPError => e
    # Print the HTTP status, Amazon's error XML and the string we signed
    puts "Status: #{e.io.status.join(",")}"
    puts "Error From Amazon:\n#{e.io.read}"
    puts "canonical string you signed:\n#{s3requester.canonical_string}"
  end
end

Trying to make the object again gives us a bit more diagnostic feedback (I reformatted the Amazon error response a bit to make it more readable):

$> irb -r 'code/s3_code/library/s3_authenticator'
>> S3Lib.request(:put, "spatten_sample_bucket/sample_object", 
                               :body => "this is a test")
Status: 403,Forbidden
Error From Amazon:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>SignatureDoesNotMatch</Code>
<Message>The request signature we calculated does not match the signature
	       you provided. Check your key and signing method.</Message>
<RequestId>7BD4FADF07973DEA</RequestId>
<SignatureProvided>redacted</SignatureProvided>
<StringToSignBytes>redacted</StringToSignBytes>
<AWSAccessKeyId>195MGYF7J3AC7ZPSHVR2</AWSAccessKeyId>
<HostId>Baq4uDiuK3jU7Xf3R35sOLYrdFZBASP/e0ncdUdvUX1BJ5HEh58ojC7/WRKXjc/c</HostId>
<StringToSign>
	PUT

	application/x-www-form-urlencoded
	Thu, 20 Mar 2008 18:17:40 GMT
	/spatten_sample_bucket/sample_object
</StringToSign>
</Error>

canonical string you signed:
PUT


Thu, 20 Mar 2008 18:17:40 GMT
/spatten_sample_bucket/sample_object

Ah-ha! Notice that the StringToSign that Amazon is returning has a content-type header of “application/x-www-form-urlencoded”. We didn’t provide a content-type header at all, so we didn’t include it in our canonical_string. It looks like one of the Ruby libraries we’re using was a little too clever and inserted the content-type for us. Let’s try adding our own content-type header. Hopefully that will work.

Warning

Having extra headers added by a library is a pretty common occurrence. If you are having trouble getting your authentication library working, make sure you check that there aren’t any unexpected headers in the string that Amazon is expecting you to sign.
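Comparing the two strings by eye gets tedious, so here is a rough sketch of one way to automate the check. The helper names string_to_sign and canonical_string_differences are made up for this example, the sample strings are trimmed down from the output above, and the sketch assumes the StringToSign element contains no escaped XML entities.

```ruby
# Pull the contents of the StringToSign element out of Amazon's error response.
# (Hypothetical helper; assumes the element holds no escaped XML entities.)
def string_to_sign(error_xml)
  error_xml[/<StringToSign>(.*?)<\/StringToSign>/m, 1]
end

# Return [line_number, expected, signed] for every line that differs.
# (Hypothetical helper.)
def canonical_string_differences(expected, signed)
  expected_lines = expected.split("\n", -1)
  signed_lines   = signed.split("\n", -1)
  count = [expected_lines.size, signed_lines.size].max

  differences = []
  count.times do |i|
    next if expected_lines[i] == signed_lines[i]
    differences << [i + 1, expected_lines[i], signed_lines[i]]
  end
  differences
end

# Sample data trimmed down from the error response shown earlier.
error_xml = "<Error><StringToSign>PUT\n\napplication/x-www-form-urlencoded\n" \
            "Thu, 20 Mar 2008 18:17:40 GMT\n" \
            "/spatten_sample_bucket/sample_object</StringToSign></Error>"
signed = "PUT\n\n\nThu, 20 Mar 2008 18:17:40 GMT\n" \
         "/spatten_sample_bucket/sample_object"

differences = canonical_string_differences(string_to_sign(error_xml), signed)
differences.each do |line_number, expected, ours|
  puts "line #{line_number}: Amazon expected #{expected.inspect}, " \
       "we signed #{ours.inspect}"
end
```

For the sample data, the only difference is on the third line, where Amazon expected the content-type and we signed an empty string.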

$> irb -r 'code/s3_code/library/s3_authenticator'
>> req = S3Lib.request(:put, "spatten_sample_bucket/sample_object", 
                                     "content-type" => "text/plain", 
                                     :body => "this is a test")
=> #<StringIO:0x173ac48>
>> puts req.status
200
OK

That looks promising. No errors raised, and a status of 200 OK. Let’s list the objects in the bucket and make sure everything is okay.

>> req = S3Lib.request(:get, "spatten_sample_bucket")
=> #<StringIO:0x1732ac0>
>> puts req.read
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
	<Name>spatten_sample_bucket</Name>
	<Prefix></Prefix>
	<Marker></Marker>
	<MaxKeys>1000</MaxKeys>
	<IsTruncated>false</IsTruncated>
	<Contents>
		<Key>sample_object</Key>
		<LastModified>2008-03-20T18:26:01.000Z</LastModified>
		<ETag>&quot;54b0c58c7ce9f2a8b551351102ee0938&quot;</ETag>
		<Size>14</Size>
		<Owner>
			<ID>9d92623ba6dd9d7cc06a7b8bcc46381e7c646f72d769214012f7e91b50c0de0f</ID>
			<DisplayName>scottpatten</DisplayName>
		</Owner>
		<StorageClass>STANDARD</StorageClass>
	</Contents>
</ListBucketResult>
=> nil
>> req = S3Lib.request(:get, "spatten_sample_bucket/sample_object")
=> #<StringIO:0x1728a0c>
>> puts req.read
this is a test
=> nil

That looks perfect: we have a new object in the bucket called “sample_object”, and getting that object gives us back the expected object contents.

We obviously don’t want to leave the error handling as is. Catching all errors and just printing out some information is decidedly sub-optimal. Let’s fix it up by creating an S3ResponseError class and initializing it with some information that will be useful for figuring out what went wrong. We’ll also make sure to add the error type given by Amazon (which was SignatureDoesNotMatch in our example above) so that we can use that to raise a more specific error type in our library.

module S3Lib

  def self.request(verb, request_path, headers = {})
    begin
      s3requester = AuthenticatedRequest.new()
      req = s3requester.make_authenticated_request(verb, request_path, headers)
    rescue OpenURI::HTTPError => e
      raise S3Lib::S3ResponseError.new(e.message, e.io, s3requester)
    end
  end

  class S3ResponseError < StandardError
    attr_reader :response, :amazon_error_type, :status, :s3requester, :io
    def initialize(message, io, s3requester)
      @io = io
      # Get the response and status from the IO object
      @io.rewind
      @response = @io.read
      @io.rewind
      @status = io.status

      # The Amazon Error type will always look like <Code>AmazonErrorType</Code>.
      # Find it with a RegExp.
      @response =~ /<Code>(.*)<\/Code>/
      @amazon_error_type = $1

      # Make the AuthenticatedRequest instance available as well
      @s3requester = s3requester

      # Call the standard Error initializer
      # if you put '%s' in the message it will be
      # replaced by the amazon_error_type
      super(message % @amazon_error_type)
    end
  end
end

Note that the S3Lib::request method rescues any OpenURI::HTTPError errors and re-raises them as S3Lib::S3ResponseError errors, passing in the IO object and the AuthenticatedRequest instance to the error. We can use this new error class to do something like this if we just want to output some info:

#!/usr/bin/env ruby

require File.join(File.dirname(__FILE__),'s3_authenticator')

begin
  req = S3Lib.request(:put, "spatten_sample_bucket/sample_object", 
                      :body => "Wheee")
rescue S3Lib::S3ResponseError => e
  puts "Amazon Error Type: #{e.amazon_error_type}"
  puts "HTTP Status: #{e.status.join(',')}"
  puts "Response from Amazon: #{e.response}"
  if e.amazon_error_type == 'SignatureDoesNotMatch'
    puts "canonical string: #{e.s3requester.canonical_string}" 
  end
end

In the recipes in the rest of this section, we will be creating new error types and raising them based on the amazon_error_type of the raised S3ResponseError.
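To give a feel for the idea, here is a minimal sketch of mapping an amazon_error_type to a more specific error class. The S3ResponseError below is a cut-down stand-in so that the example runs on its own, and SignatureDoesNotMatchError, NoSuchBucketError, SPECIFIC_ERRORS and specific_error_for are names made up for illustration; the recipes may well structure things differently.

```ruby
module S3Lib
  # Cut-down stand-in for the S3ResponseError class defined above,
  # so that this sketch runs without the rest of the library.
  class S3ResponseError < StandardError
    attr_reader :amazon_error_type

    def initialize(message, amazon_error_type)
      @amazon_error_type = amazon_error_type
      super(message)
    end
  end

  # Hypothetical specific error types, one per Amazon error type.
  class SignatureDoesNotMatchError < S3ResponseError; end
  class NoSuchBucketError < S3ResponseError; end

  # Hypothetical lookup table from Amazon's error type to our error class.
  SPECIFIC_ERRORS = {
    'SignatureDoesNotMatch' => SignatureDoesNotMatchError,
    'NoSuchBucket'          => NoSuchBucketError
  }

  # Return a more specific error if we know the type,
  # otherwise hand back the original error untouched.
  def self.specific_error_for(error)
    klass = SPECIFIC_ERRORS[error.amazon_error_type]
    return error unless klass
    klass.new(error.message, error.amazon_error_type)
  end
end

generic = S3Lib::S3ResponseError.new('403 Forbidden', 'SignatureDoesNotMatch')

begin
  raise S3Lib.specific_error_for(generic)
rescue S3Lib::SignatureDoesNotMatchError => e
  puts "Caught #{e.class}: #{e.message}"
end
```

Because the specific classes inherit from S3ResponseError, any code that rescues the general error keeps working, while code that cares about one particular failure can rescue just that class.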