Chapter 4. Authenticating S3 Requests
Authenticating S3 Requests
The section on authenticating S3 Requests in the S3 Developer’s Guide (http://docs.amazonwebservices.com/AmazonS3/2006-03-01/) is pretty intimidating. There are a lot of steps you have to go through in just the right order to get your authentication correct. Luckily there are sample implementations in a number of languages in the Getting Started Guide: http://docs.amazonwebservices.com/AmazonS3/2006-03-01/gsg/authenticating-to-s3.html.
The code supplied by Amazon works just fine, but you may want to build your own implementation. You could just read the S3 Developer’s Guide: it’s the canonical resource for this information, and obviously if there’s a conflict between what I’m saying and what the Developer’s Guide says, go with the Developer’s Guide. That being said, I’m going to go through the authentication process in detail to hopefully smooth over some of the spots where I got confused.
The Authentication Process
Every request you make to Amazon S3 must be signed. This is done by adding an Authorization header to the request. An authenticated request to a bucket named mybucket would look like this:
GET /mybucket
Host: s3.amazonaws.com
Date: Wed, 13 Feb 2008 12:00:00 GMT
Authorization: AWS your_aws_id:signature
The authorization header consists of “AWS”, followed by a space, your AWS ID, a colon and then a signature:
Authorization = AWS <AWSAccessKeyId>:<signature>
That looks pretty straightforward, except for the signature. To generate the signature, you build a canonical string for your request and then sign that string with your Secret Key: take the UTF-8 encoding of the canonical string, compute its HMAC-SHA1 digest using your Secret Key as the key, and then Base64 encode the result. In pseudo-code:
Signature = Base64( HMAC-SHA1( secret_key, UTF-8-Encoding-Of( canonical_string ) ) )
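The pseudo-code above can be sketched in Ruby with the standard OpenSSL library. This is only an illustration: the helper names s3_signature and authorization_header are made up here, and the canonical string is assumed to have been built already.

```ruby
require 'openssl'

# Hypothetical helper: signs an already-built canonical string with the
# Secret Key (HMAC-SHA1 over the UTF-8 bytes, then Base64).
def s3_signature(secret_key, canonical_string)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'),
                                secret_key,
                                canonical_string.encode('UTF-8'))
  # pack('m0') is Base64 without any line breaks
  [digest].pack('m0')
end

# Hypothetical helper: builds the value of the Authorization header.
def authorization_header(access_key_id, secret_key, canonical_string)
  "AWS #{access_key_id}:#{s3_signature(secret_key, canonical_string)}"
end
```

An HMAC-SHA1 digest is always 20 bytes long, so the Base64-encoded signature is always 28 characters.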
Assuming that you have the libraries to do the various encodings, all you need to do now is create the canonical string. The canonical string is a concatenation of the HTTP verb used to make the request, the canonicalized headers and the canonicalized resource.
canonical_string = "<http_verb>\n
                    <canonicalized_headers>\n
                    <canonicalized_resource>"
The http_verb
This is the simplest of the canonical string sub-elements. It is either GET, PUT, DELETE, HEAD or POST. It must be all uppercase.
The canonicalized_headers
The canonicalized_headers element is constructed from two sub-elements, the canonicalized_positional_headers and the canonicalized_amazon_headers.
canonicalized_headers = <canonicalized_positional_headers>\n
                        <canonicalized_amazon_headers>
The canonicalized_positional_headers
The canonicalized_positional_headers are the values of the md5 hash, content type and date headers, separated by newlines. In pseudo-code:
Content-MD5 + "\n" +
Content-Type + "\n" +
Date + "\n"
Here’s an example:
\"91ffa40f1a72a58f0d0b688032195088\"\n
text/plain\n
Wed, 27 Mar 2008 09:14:27 +0000
If one of the positional headers is not provided in the request, replace its value with an empty string and leave the new line (\n) in. For example, if a request had no MD5 hash or content type headers, it would look like this:
\n
\n
Wed, 27 Mar 2008 09:14:27 +0000
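Here is a minimal sketch of that rule in Ruby, assuming the headers arrive as a hash with lower-case names. The positional_headers helper and the constant name are made up for illustration. The output has no trailing newline, matching the examples above; the library built later in this chapter appends a newline after each value instead.

```ruby
# Assumed order of the positional headers, taken from the text above.
POSITIONAL_HEADER_NAMES = ['content-md5', 'content-type', 'date']

# Hypothetical helper: a missing header becomes an empty string, so its
# line is still present in the result.
def positional_headers(headers)
  POSITIONAL_HEADER_NAMES.map { |name| headers[name].to_s }.join("\n")
end
```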
The canonicalized_amazon_headers
The Amazon headers are all headers that begin with x-amz-, ignoring case. You construct the canonicalized_amazon_headers with the following steps:
- Find all headers that have header names that begin with x-amz-, ignoring case. These are the Amazon headers.
- Convert each of the Amazon headers’ names to lower case. (Just the header names, not their values.)
- For each Amazon header, combine the header name and header value by joining them with a colon. Remove any leading whitespace on the header value as you do this.
- If you have multiple Amazon headers with the same name, combine their values into one value by joining them with commas, without any white space between them.
- If any of the headers span multiple lines, un-fold them by replacing the newlines with a single space.
- Sort the Amazon headers alphabetically by header name.
- Join the headers together with newlines (\n).
Here’s an example:
Request
GET /my_photos/vampire.jpg
Host: s3.amazonaws.com
X-Amz-Meta-Subject: Claire
X-Amz-Meta-Photographer: Nadine Inkster
content-type: image/png
content-length: 10817

canonicalized_amazon_headers
x-amz-meta-photographer:Nadine Inkster\n
x-amz-meta-subject:Claire
Note that the header names have been lower cased, they are in alphabetical order and the spaces between the header names and the header values have been taken out. The Host, content-length and content-type headers are not included in the canonicalized_amazon_headers.
Here’s a more complicated example showing the combination of multiple Amazon headers with the same name and the un-folding of a long header value.
Request
GET /my_photos/birthday.jpg
Host: s3.amazonaws.com
content-type: image/png
content-length: 12413
X-Amz-Meta-Subject:Claire
X-Amz-Meta-Subject:Mika
X-Amz-Meta-Subject:Amber
X-Amz-Meta-Subject:Callum
X-Amz-Meta-Description: Mika, Claire, Amber and Callum \n
 at Mika's birthday party\n

canonicalized_amazon_headers
x-amz-meta-description:Mika, Claire, Amber and Callum at Mika's birthday party\n
x-amz-meta-subject:Claire,Mika,Amber,Callum
Note that the subjects have all been combined in to a single header, and the new-line in the description has been replaced with a space.
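The steps above can be sketched as a single Ruby function. This is an illustration only: canonicalize_amazon_headers is a made-up name, and it assumes the headers arrive as an array of [name, value] pairs so that repeated header names can be represented. It also collapses the whitespace around each fold into one space, which is an assumption about how folded values should be cleaned up.

```ruby
# Hypothetical helper. `headers` is an array of [name, value] pairs.
def canonicalize_amazon_headers(headers)
  grouped = Hash.new { |hash, key| hash[key] = [] }

  headers.each do |name, value|
    key = name.downcase
    next unless key.start_with?('x-amz-')
    # Un-fold multi-line values, then strip leading whitespace
    grouped[key] << value.gsub(/\s*\n\s*/, ' ').lstrip
  end

  # Sort by header name, join repeated values with commas and
  # join the resulting lines with newlines
  grouped.sort_by { |name, _values| name }
         .map { |name, values| "#{name}:#{values.join(',')}" }
         .join("\n")
end
```

Feeding it the headers from the birthday.jpg request produces the two lines shown in that example.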
Joining the canonicalized_headers
The canonicalized_headers are constructed by joining the canonicalized_positional_headers and the canonicalized_amazon_headers with a newline (\n).
canonicalized_headers = <canonicalized_positional_headers>\n<canonicalized_amazon_headers>
The canonicalized_resource
The canonicalized_resource is given by
/<bucket><uri><sub-resource>
elements in the canonicalized_resource
bucket
The name of the bucket. This must be included even if you are setting the bucket in the Host header using virtual hosting of buckets. If you are requesting the list of buckets you own, just use the empty string.
uri
This is the http request uri, not including the query string. The uri should be URI encoded.
sub-resource
If the request is for a sub-resource such as ?acl, ?torrent or ?logging, then append it to the uri, including the ?.
Some examples:
Request
GET /my_pictures/vampire.jpg
Host: s3.amazonaws.com

canonicalized_resource
/my_pictures/vampire.jpg
The canonicalized_resource is the same as the URI in the request.
Request
GET /vampire.jpg
Host: spattenpictures.s3.amazonaws.com

canonicalized_resource
/spattenpictures/vampire.jpg
Note that the bucket name (spattenpictures) is extracted from the Host header even though it is not in the URI in the request.
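A rough sketch of these rules in Ruby follows. The function name, the constants and the argument layout are all assumptions made for illustration. It only recognizes buckets given as a sub-domain of s3.amazonaws.com, and it only keeps a query string that consists of exactly one of the sub-resources named above.

```ruby
S3_HOST = 's3.amazonaws.com'
SUB_RESOURCES = ['acl', 'torrent', 'logging']

# Hypothetical helper: builds /<bucket><uri><sub-resource> from the Host
# header and the request URI.
def canonicalized_resource(host, request_uri)
  path, query = request_uri.split('?', 2)
  path = "/#{path}" unless path.to_s.start_with?('/')

  # With virtual hosting the bucket is in the Host header, not the URI
  suffix = ".#{S3_HOST}"
  bucket = host.end_with?(suffix) ? "/#{host.chomp(suffix)}" : ''

  # Ordinary query strings are dropped; sub-resources are kept
  sub_resource = SUB_RESOURCES.include?(query) ? "?#{query}" : ''

  "#{bucket}#{path}#{sub_resource}"
end
```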
Time Stamping Your Requests
All requests to Amazon S3 must be time-stamped. There are two ways of time-stamping your request: using the Date header (which will be in the canonicalized_positional_headers), or using the x-amz-date header (which will be in the canonicalized_amazon_headers). Only one of the two date headers should be present in your request.
The date provided must be within 15 minutes of the current Amazon S3 system time. If not, you will receive a RequestTimeTooSkewed error response to your request.
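As a small illustration, Ruby's standard time library can produce a suitable Date value. The helper names below are made up, and the skew check is only a client-side approximation of the 15 minute rule described above; the real check happens on Amazon's side against its own clock.

```ruby
require 'time'

MAX_SKEW_SECONDS = 15 * 60

# Hypothetical helper: a value for the Date header, e.g.
# "Wed, 13 Feb 2008 12:00:00 GMT"
def request_date(now = Time.now)
  now.httpdate
end

# Hypothetical helper: would this date be accepted if the server's clock
# read server_time?
def within_allowed_skew?(date_header, server_time = Time.now)
  (Time.parse(date_header) - server_time).abs <= MAX_SKEW_SECONDS
end
```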
Writing an S3 Authentication Library
Now that you know how a request is authenticated (in exhaustive detail), let’s implement a library to actually do the authentication.
We need a function that will take an HTTP verb, URL and a hash of headers as inputs and make an authenticated request. It authenticates the request by adding a signature to it. The signature is created by using the inputs to build a canonical string, computing the HMAC-SHA1 digest of that string with your Amazon Web Services Secret Key, and Base64 encoding the result. Okay, so we’re looking for a function that looks like this:
module S3Lib

  class AuthenticatedRequest

    def make_authenticated_request(verb, request_path, headers = {})
      # ... some code here ...
    end

  end
end
To make a request, you would do something like this:
$> irb
>> require 's3_authenticator'
=> true
>> s3 = S3Lib::AuthenticatedRequest.new
=> #<S3Lib::AuthenticatedRequest:0x1742920>
>> s3.make_authenticated_request(:get, '/',
                                 {'host' => 's3.amazonaws.com'})
You know what, that’s too much code just to make a simple request. Let’s add a class method to the S3Lib module that instantiates an AuthenticatedRequest object and makes the call to AuthenticatedRequest#make_authenticated_request for us.
module S3Lib
  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    def make_authenticated_request(verb, request_path, headers = {})
      # ... some code here ...
    end

  end
end
Now we can make a request like this:
$> irb
>> require 's3_authenticator'
=> true
>> s3 = S3Lib.request(:get, '/',
                      {'host' => 's3.amazonaws.com'})
=> #<S3Lib::AuthenticatedRequest:0x1742920>
Okay, now that we have that out of the way, let’s get started.
Test Driven Development
When you first look at the requirements for authenticating an S3 request, it’s hard to know where to begin. It’s a complex set of requirements that would be much simpler if you broke it up into smaller steps. It is also something that must work correctly if the rest of our library is to function at all. Luckily for you, Amazon gives a set of example requests and the corresponding canonical strings and signatures in the S3 Developer’s Guide (http://docs.amazonwebservices.com/AmazonS3/2006-03-01/). This sounds like a perfect fit for Test Driven Development (TDD), and that’s what we’re going to do for the rest of this section.
In the previous section, I deliberately started from a top-down view of the specification to make a point: the best way to deal with a complex spec like this is to start from the bottom and work your way up. If you just start coding from the top down, you’ll find yourself just sitting there staring at the code not knowing where to start.
So, let’s do some TDD. The key here will be to make small steps, and use our tests to make sure that all of our steps work for all cases, even the edge cases.
The flow of development will look like this:
- Read part of the authentication specification
- Create tests for that portion of the specification
- Make sure your tests fail
- Write the simplest thing that will make your tests pass
The HTTP Verb
Let’s start with something easy: the HTTP Verb section of the canonical string. Remember from the specification that the HTTP verb “… is either GET, PUT, DELETE, HEAD or POST. It must be all uppercase.” Also, it has to be followed by a newline (\n). So, let’s test that. Our test will look something like this:
require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator')

class S3AuthenticatorTest < Test::Unit::TestCase

  def test_http_verb_is_uppercase
    @s3_test = S3Lib::AuthenticatedRequest.new
    @s3_test.make_authenticated_request(:get, '/',
                                        {'host' => 's3.amazonaws.com'})
    assert_match /^GET\n/, @s3_test.canonical_string
  end

end
Let’s run that and see what happens. We know this is going to fail, as we haven’t actually written the canonical_string method yet.
$> ruby test/first_test.rb
Loaded suite test/first_test
Started
E
Finished in 0.000457 seconds.

  1) Error:
test_http_verb_is_uppercase(S3AuthenticatorTest):
NoMethodError: undefined method `canonical_string' for #<S3Lib::AuthenticatedRequest:0x50d34 @verb=:get>
    test/first_test.rb:9:in `test_http_verb_is_uppercase'

1 tests, 0 assertions, 0 failures, 1 errors
Good, it fails. Now, let’s write the simplest thing that will make it work. Something like this:
module S3Lib

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
    end

    def canonical_string
      "#{@verb.to_s.upcase}\n"
    end

  end
end
Now, run the test again.
$> ruby test/first_test.rb
Loaded suite test/first_test
Started
.
Finished in 0.000358 seconds.

1 tests, 1 assertions, 0 failures, 0 errors
All right! We’re rolling now! Let’s go on to something a bit more complicated: the canonicalized headers.
The Canonicalized Positional Headers
The canonicalized headers consist of the canonicalized_positional_headers followed by the canonicalized_amazon_headers. To break this down to something simple, we don’t want to test both of them at once. Let’s start testing with the canonicalized_positional_headers. Once again, I’ll refresh your memory so that you don’t have to flip back to the last section.
The canonicalized_positional_headers are the values of the MD5 hash, content type and date headers, separated by newlines. So, let’s write some tests for that spec:
require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

class S3AuthenticatorTest < Test::Unit::TestCase

  def test_http_verb_is_uppercase
    @s3_test = S3Lib::AuthenticatedRequest.new
    @s3_test.make_authenticated_request(:get, '/',
                                        {'host' => 's3.amazonaws.com'})
    assert_match /^GET\n/, @s3_test.canonical_string
  end

  def test_canonical_string_contains_positional_headers
    @s3_test = S3Lib::AuthenticatedRequest.new
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_match /^GET\n#{@s3_test.canonicalized_positional_headers}/,
                 @s3_test.canonical_string
  end

  def test_positional_headers_with_all_headers
    @s3_test = S3Lib::AuthenticatedRequest.new
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_equal "whee\nsome content type\nDecember 25th, 2007\n",
                 @s3_test.canonicalized_positional_headers
  end

end
This will, of course, fail until we’ve written the canonicalized_positional_headers method.
Making sure your tests fail
I’ll spare you the details of the test failures from now on, but don’t take that to mean that you shouldn’t test that they fail. Trust me, knowing that your tests actually fail before you begin coding will save you hours of frustration at some point in your life.
Making sure that your tests fail ensures that your tests are actually running, and it helps to ensure that they’re testing what you think you are testing. Get in to the habit of running them before you begin coding and giving yourself a pat on the back when they fail.
module S3Lib

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @headers = headers
    end

    def canonical_string
      "#{@verb.to_s.upcase}\n#{canonicalized_headers}"
    end

    def canonicalized_headers
      "#{canonicalized_positional_headers}"
    end

    def canonicalized_positional_headers
      POSITIONAL_HEADERS.collect do |header|
        @headers[header] + "\n"
      end.join
    end

  end
end
If you run the tests, you’ll see that the new tests pass, but we’ve broken the test_http_verb_is_uppercase test.
$> ruby test/first_test.rb
Loaded suite test/first_test
Started
.E.
Finished in 0.00068 seconds.

  1) Error:
test_http_verb_is_uppercase(S3AuthenticatorTest):
NoMethodError: undefined method `+' for nil:NilClass
    ./test/../s3_authenticator_dev.rb:26:in `canonicalized_positional_headers'
    ./test/../s3_authenticator_dev.rb:25:in `collect'
    ./test/../s3_authenticator_dev.rb:25:in `canonicalized_positional_headers'
    ./test/../s3_authenticator_dev.rb:21:in `canonicalized_headers'
    ./test/../s3_authenticator_dev.rb:17:in `canonical_string'
    test/first_test.rb:9:in `test_http_verb_is_uppercase'

3 tests, 2 assertions, 0 failures, 1 errors
That test doesn’t pass in all of the positional headers, so the canonicalized_positional_headers method is failing when it tries to add the non-existent header to a string. This is kind of fortuitous, as the specification for the canonicalized_positional_headers says that a positional header should be replaced by an empty string if it doesn’t exist. Let’s write a test for that spec and hopefully we’ll comply with that specification and fix the currently failing test all in one fell swoop. Here’s the new test:
def test_positional_headers_with_only_date_header
  @s3_test = S3Lib::AuthenticatedRequest.new
  @s3_test.make_authenticated_request(:get, '',
                                      {'date' => 'December 25th, 2007'})
  assert_equal "\n\nDecember 25th, 2007\n",
               @s3_test.canonicalized_positional_headers
end
To fix the problem, all we have to do is make sure that a positional header is replaced with an empty string if it doesn’t exist:
def canonicalized_positional_headers
  POSITIONAL_HEADERS.collect do |header|
    (@headers[header] || "") + "\n"
  end.join
end
Our tests all pass now. Phew.
DRYing up our tests
You might have noticed that all of our tests start with the line @s3_test = S3Lib::AuthenticatedRequest.new. Now, I’m a pretty lazy guy, so I want to avoid all of that repetitive typing. Luckily, Ruby’s Unit Testing library is a faithful interpretation of the XUnit test specification. This means that if you define a setup method, it will be run before every test. Similarly, the teardown method, if defined, will be run after every test. Let’s refactor our test library to take advantage of the setup method.
require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

class S3AuthenticatorTest < Test::Unit::TestCase

  def setup
    @s3_test = S3Lib::AuthenticatedRequest.new
  end

  def test_http_verb_is_uppercase
    @s3_test.make_authenticated_request(:get, '/',
                                        {'host' => 's3.amazonaws.com'})
    assert_match /^GET\n/, @s3_test.canonical_string
  end

  def test_canonical_string_contains_positional_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_match /^GET\n#{@s3_test.canonicalized_positional_headers}/,
                 @s3_test.canonical_string
  end

  def test_positional_headers_with_all_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_equal "whee\nsome content type\nDecember 25th, 2007\n",
                 @s3_test.canonicalized_positional_headers
  end

  def test_positional_headers_with_only_date_header
    @s3_test.make_authenticated_request(:get, '',
                                        {'date' => 'December 25th, 2007'})
    assert_equal "\n\nDecember 25th, 2007\n",
                 @s3_test.canonicalized_positional_headers
  end

end
Keeping your methods and variables private yet testable
You might have been gritting your teeth as I just blithely used attributes and methods that should be private in the tests above. In reality, you probably don’t want the canonical_string, @headers or any of the other methods I was testing against available in the public interface. I did this because I didn’t want to obscure what we’re really trying to do here: write an S3 authentication library. However, if you’re interested, here’s a nice method for keeping those methods and attributes private in production yet available for testing.
The key to this technique is that Ruby’s classes are always open and extendible. So, you can open up the class when you’re testing it and make everything publicly accessible while still keeping everything locked down for production use. Here’s an example of how I did it while developing this class.
Here’s the class refactored to keep things private. Notice that only the make_authenticated_request method is publicly available
module S3Lib
  require 'time'

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @headers = headers
    end

    private

    def canonical_string
      "#{@verb.to_s.upcase}\n#{canonicalized_headers}"
    end

    def canonicalized_headers
      "#{canonicalized_positional_headers}"
    end

    def canonicalized_positional_headers
      POSITIONAL_HEADERS.collect do |header|
        (@headers[header] || "") + "\n"
      end.join
    end

  end
end
Now, in the same file that your tests are in, open up the class and publicize all of the methods we want to test (canonical_string, canonicalized_headers and canonicalized_positional_headers) with public versions (public_canonical_string, public_canonicalized_headers and public_canonicalized_positional_headers). Next, make any instance variables you want access to readable (@headers). Finally, re-name the method calls in your tests by pre-pending public_ to them.
# Make private methods and attributes public so that you can test them
module S3Lib
  class AuthenticatedRequest

    attr_reader :headers

    def public_canonicalized_headers
      canonicalized_headers
    end

    def public_canonicalized_positional_headers
      canonicalized_positional_headers
    end

    def public_canonical_string
      canonical_string
    end

  end
end

require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev_private')

class S3AuthenticatorTest < Test::Unit::TestCase

  def setup
    @s3_test = S3Lib::AuthenticatedRequest.new
  end

  def test_http_verb_is_uppercase
    @s3_test.make_authenticated_request(:get, '/',
                                        {'host' => 's3.amazonaws.com'})
    assert_match /^GET\n/, @s3_test.public_canonical_string
  end

  def test_canonical_string_contains_positional_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_match /^GET\n#{@s3_test.public_canonicalized_positional_headers}/,
                 @s3_test.public_canonical_string
  end

  def test_positional_headers_with_all_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'some content type',
                                         'date' => 'December 25th, 2007',
                                         'content-md5' => 'whee'})
    assert_equal "whee\nsome content type\nDecember 25th, 2007\n",
                 @s3_test.public_canonicalized_positional_headers
  end

  def test_positional_headers_with_only_date_header
    @s3_test.make_authenticated_request(:get, '',
                                        {'date' => 'December 25th, 2007'})
    assert_equal "\n\nDecember 25th, 2007\n",
                 @s3_test.public_canonicalized_positional_headers
  end

end
This technique of opening the class will also come in handy when we actually write the code to talk to Amazon S3, but want to be able to test without a live internet connection.
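To give a feel for how that might look, here is a small sketch using a made-up RemoteFetcher class rather than our real library, which doesn't have any network code yet. The class, its methods and the canned response are all hypothetical; the point is only that a test file can reopen a class and replace the one method that would touch the network.

```ruby
# Hypothetical production class: fetch does some work on top of a
# private method that would normally talk to the network.
class RemoteFetcher
  def fetch(path)
    perform_request(path).upcase
  end

  private

  def perform_request(path)
    raise 'no network access in this sketch'
  end
end

# In the test file: reopen the class and replace the network call with
# a canned response. Everything else in the class is left untouched.
class RemoteFetcher
  private

  def perform_request(path)
    "canned response for #{path}"
  end
end
```

After the class has been reopened, RemoteFetcher.new.fetch('/') runs the real fetch logic against the canned response instead of raising.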
Phew. That was a lot of reading and coding, but we have the positional headers working and well tested now. I’ll tone down the verbiage from now on so that we can finish up without killing too many extra forests.
The Canonicalized Amazon Headers
The Amazon headers are all headers that begin with x-amz-, ignoring case. You construct the canonicalized_amazon_headers with the following steps. I’m going to write out each specification along with some tests that express that specification.
- Find all headers that have header names that begin with x-amz-, ignoring case. These are the Amazon headers.
  def test_amazon_headers_should_remove_non_amazon_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'content',
                                         'some-other-header' => 'other',
                                         'x-amz-meta-one' => 'one',
                                         'x-amz-meta-two' => 'two'})
    headers = @s3_test.public_canonicalized_amazon_headers
    assert_no_match /other/, headers
    assert_no_match /content/, headers
  end

  def test_amazon_headers_should_keep_amazon_headers
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'content',
                                         'some-other-header' => 'other',
                                         'x-amz-meta-one' => 'one',
                                         'x-amz-meta-two' => 'two'})
    headers = @s3_test.public_canonicalized_amazon_headers
    assert_match /x-amz-meta-one/, headers
    assert_match /x-amz-meta-two/, headers
  end
- Convert each of the Amazon headers’ names to lower case. (Just the header names, not their values.)
  def test_amazon_headers_should_be_lowercase
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'content',
                                         'some-other-header' => 'other',
                                         'X-amz-meta-one' => 'one',
                                         'x-Amz-meta-two' => 'two'})
    headers = @s3_test.public_canonicalized_amazon_headers
    assert_match /x-amz-meta-one/, headers
    assert_match /x-amz-meta-two/, headers
  end
- For each Amazon header, combine the header name and header value by joining them with a colon. Remove any leading whitespace on the header value as you do this.
  def test_leading_spaces_get_stripped_from_header_values
    @s3_test.make_authenticated_request(:get, '',
                                        {'x-amz-meta-one' => ' one with a leading space',
                                         'x-Amz-meta-two' => ' two with a leading and trailing space '})
    headers = @s3_test.public_canonicalized_amazon_headers
    assert_match /x-amz-meta-one:one with a leading space/, headers
    assert_match /x-amz-meta-two:two with a leading and trailing space /,
                 headers
  end
- If you have multiple Amazon headers with the same name, combine their values into one value by joining them with commas, without any white space between them.
  def test_values_as_arrays_should_be_joined_as_commas
    @s3_test.make_authenticated_request(:get, '',
                                        {'x-amz-mult' => ['a', 'b', 'c']})
    headers = @s3_test.canonicalized_amazon_headers
    assert_match /a,b,c/, headers
  end
- If any of the headers span multiple lines, un-fold them by replacing the newlines with a single space.
  def test_long_amazon_headers_should_get_unfolded
    @s3_test.make_authenticated_request(:get, '',
                                        {'x-amz-meta-one' => "A really long header\n" +
                                                             "with multiple lines."})
    headers = @s3_test.canonicalized_amazon_headers
    assert_match /x-amz-meta-one:A really long header with multiple lines./,
                 headers
  end
- Sort the Amazon headers alphabetically by header name.
  def test_amazon_headers_should_be_alphabetized
    @s3_test.make_authenticated_request(:get, '',
                                        {'content-type' => 'content',
                                         'some-other-header' => 'other',
                                         'X-amz-meta-one' => 'one',
                                         'x-Amz-meta-two' => 'two',
                                         'x-amz-meta-zed' => 'zed',
                                         'x-amz-meta-alpha' => 'alpha'})
    headers = @s3_test.canonicalized_amazon_headers
    assert_match /alpha.*one.*two.*zed/m, headers # /m on the reg-exp makes .* include newlines
  end
- Join the headers together with newlines (\n).
Here is some code that passes those tests, which hopefully means it achieves the specifications:
class Hash

  def downcase_keys
    res = {}
    each do |key, value|
      key = key.downcase if key.respond_to?(:downcase)
      res[key] = value
    end
    res
  end

  def join_values(separator = ',')
    res = {}
    each do |key, value|
      res[key] = value.respond_to?(:join) ? value.join(separator) : value
    end
    res
  end

end

module S3Lib
  require 'time'

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    attr_reader :headers
    POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @headers = headers.downcase_keys.join_values
    end

    def canonical_string
      "#{@verb.to_s.upcase}\n#{canonicalized_headers}"
    end

    def canonicalized_headers
      "#{canonicalized_positional_headers}#{canonicalized_amazon_headers}"
    end

    def canonicalized_positional_headers
      POSITIONAL_HEADERS.collect do |header|
        (@headers[header] || "") + "\n"
      end.join
    end

    def canonicalized_amazon_headers

      # select all headers that start with x-amz-
      amazon_headers = @headers.select do |header, value|
        header =~ /^x-amz-/
      end

      # Sort them alphabetically by key
      amazon_headers = amazon_headers.sort do |a, b|
        a[0] <=> b[0]
      end

      # Collect all of the amazon headers like this:
      # {key}:{value}\n
      # The value has to have any whitespace on the left stripped from it
      # and any new-lines replaced by a single space.
      # Finally, return the headers joined together as a single string.
      amazon_headers.collect do |header, value|
        "#{header}:#{value.lstrip.gsub("\n"," ")}\n"
      end.join
    end

  end
end
What did I just do to Hash?
If you’re not used to Ruby, you might have gotten a little worried when you saw the additions I made to the Hash class. I opened up a base class and added a couple of methods to it. Yikes! You might think that this is crazy and will lead to all kinds of problems, but it is accepted practice in the Ruby world. Personally, I love it: it makes my code more readable and concise, and it has never caused me a problem.
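If you want to see what those two additions do on their own, here is a small, self-contained sketch. It re-declares the helpers in a more compact form (an assumption: each_with_object needs Ruby 1.9 or later) and runs them on a made-up set of headers.

```ruby
class Hash
  # Returns a new hash with lower-cased keys; the receiver is untouched
  def downcase_keys
    each_with_object({}) do |(key, value), res|
      key = key.downcase if key.respond_to?(:downcase)
      res[key] = value
    end
  end

  # Returns a new hash in which array values are joined into one string
  def join_values(separator = ',')
    each_with_object({}) do |(key, value), res|
      res[key] = value.respond_to?(:join) ? value.join(separator) : value
    end
  end
end

headers = { 'X-Amz-Meta-Subject' => ['Claire', 'Mika'],
            'Content-Type'       => 'image/png' }
normalized = headers.downcase_keys.join_values
```

Here normalized maps 'x-amz-meta-subject' to 'Claire,Mika' and 'content-type' to 'image/png'. Both helpers build new hashes, so the headers hash passed in by the caller is not modified.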
Date Stamping Requests
If you were feeling especially alert, you might have noticed that I side-stepped a specification in the canonicalized headers sections. That spec is that all requests to S3 must be time-stamped. A request can be time stamped in two ways: through the positional date header, or through an amazon header called x-amz-date. The x-amz-date header over-rides the date header. Also, to make the life of the users of your library easier, let’s make the library provide a date header equal to the current time if none is passed in. Here’s a set of tests that express that spec:
def test_date_should_be_added_if_not_passed_in
  @s3_test.make_authenticated_request(:get, '')
  assert @s3_test.headers.has_key?('date')
end

def test_positional_headers_with_no_headers_should_have_date_defined
  @s3_test.make_authenticated_request(:get, '' )
  date = @s3_test.headers['date']
  assert_equal "\n\n#{date}\n", @s3_test.canonicalized_positional_headers
end

def test_xamzdate_should_override_date_header
  @s3_test.make_authenticated_request(:get, '',
                                      {'date' => 'December 15, 2005',
                                       'x-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000'})
  headers = @s3_test.public_canonicalized_headers
  assert_match /2007/, headers
  assert_no_match /2005/, headers
end

def test_xamzdate_should_override_capitalized_date_header
  @s3_test.make_authenticated_request(:get, '',
                                      {'Date' => 'December 15, 2005',
                                       'X-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000'})
  headers = @s3_test.public_canonicalized_headers
  assert_match /2007/, headers
  assert_no_match /2005/, headers
end
We’ll use the fix_date method to add a date header if it doesn’t exist. Notice that the test accesses the @headers hash. The line that reads attr_reader :headers makes that accessible to our tests. Here’s the code:
module S3Lib
  require 'time'

  def self.request(verb, request_path, headers = {})
    s3requester = AuthenticatedRequest.new()
    s3requester.make_authenticated_request(verb, request_path, headers)
  end

  class AuthenticatedRequest

    attr_reader :headers
    POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @headers = headers
      fix_date
    end

    def fix_date
      @headers['date'] ||= Time.now.httpdate
      @headers.delete('date') if @headers.has_key?('x-amz-date')
    end

    def canonical_string
      "#{@verb.to_s.upcase}\n#{canonicalized_headers}"
    end

    def canonicalized_headers
      "#{canonicalized_positional_headers}"
    end

    def canonicalized_positional_headers
      POSITIONAL_HEADERS.collect do |header|
        (@headers[header] || "") + "\n"
      end.join
    end

  end
end
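The time-stamp rule is small enough to look at in isolation. Here is a minimal sketch of it, not the library's actual code: resolve_timestamp is a hypothetical helper name, and I'm assuming header names can arrive in any capitalization.

```ruby
require 'time'

# Hypothetical helper (not part of S3Lib): returns a copy of the headers
# with lower-cased names and the time-stamp rules applied.
def resolve_timestamp(headers, now = Time.now)
  resolved = {}
  headers.each { |name, value| resolved[name.to_s.downcase] = value }

  if resolved.key?('x-amz-date')
    # x-amz-date over-rides date, so the positional date slot stays empty
    resolved.delete('date')
  else
    # No time stamp supplied: default to the current time in HTTP date format
    resolved['date'] ||= now.httpdate
  end
  resolved
end

p resolve_timestamp('Date' => 'December 15, 2005',
                    'X-Amz-Date' => 'Tue, 27 Mar 2007 21:20:26 +0000')
p resolve_timestamp({}).key?('date')
```

Returning a new hash instead of editing the caller's hash in place is a design choice of this sketch only; the library code above modifies @headers directly.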
The Canonicalized Resource
The canonicalized_resource is given by
/<bucket><uri><sub-resource>
The canonicalized_resource must start with a forward slash (/), it must include the bucket name (even if the bucket is not in the URI), and then comes the URI and the sub-resource (if any). The bucket name must be lower case. Here are some tests that express this.
require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

class S3AuthenticatorCanonicalResourceTest < Test::Unit::TestCase

  def setup
    @s3_test = S3Lib::AuthenticatedRequest.new
  end

  def test_forward_slash_is_always_added
    @s3_test.make_authenticated_request(:get, '')
    assert_match /^\//, @s3_test.canonicalized_resource
  end

  def test_bucket_name_in_uri_should_get_passed_through
    @s3_test.make_authenticated_request(:get, 'my_bucket')
    assert_match /^\/my_bucket/, @s3_test.canonicalized_resource
  end

  def test_canonicalized_resource_should_include_uri
    @s3_test.make_authenticated_request(:get, 'my_bucket/vampire.jpg')
    assert_match /vampire.jpg$/, @s3_test.canonicalized_resource
  end

  def test_canonicalized_resource_should_include_sub_resource
    @s3_test.make_authenticated_request(:get, 'my_bucket/vampire.jpg?torrent')
    assert_match /vampire.jpg\?torrent$/, @s3_test.canonicalized_resource
  end

  def test_bucket_name_with_virtual_hosting
    @s3_test.make_authenticated_request(:get, '/',
      {'host' => 'some_bucket.s3.amazonaws.com'})
    assert_match /some_bucket\//, @s3_test.canonicalized_resource
    assert_no_match /s3.amazonaws.com/, @s3_test.canonicalized_resource
  end

  def test_bucket_name_with_cname_virtual_hosting
    @s3_test.make_authenticated_request(:get, '/',
      {'host' => 'some_bucket.example.com'})
    assert_match /^\/some_bucket.example.com/, @s3_test.canonicalized_resource
  end

end
Here is the AuthenticatedRequest library that passes these tests. Note the changes that have been made:
- The HOST constant has been added and set to 's3.amazonaws.com'.
- The get_bucket_name method has been created. It is called from the make_authenticated_request method. This method extracts the bucket from the host header and saves it to the @bucket instance variable.
- The canonicalized_resource method creates the canonicalized resource string. It is called in the canonical_string method.
class AuthenticatedRequest

  attr_reader :headers
  POSITIONAL_HEADERS = ['content-md5', 'content-type', 'date']
  HOST = 's3.amazonaws.com'

  def make_authenticated_request(verb, request_path, headers = {})
    @verb = verb
    @request_path = request_path.gsub(/^\//,'') # Strip off the leading '/'

    @headers = headers.downcase_keys.join_values
    fix_date
    get_bucket_name
  end

  def fix_date
    @headers['date'] ||= Time.now.httpdate
    @headers.delete('date') if @headers.has_key?('x-amz-date')
  end

  def canonical_string
    "#{@verb.to_s.upcase}\n#{canonicalized_headers}#{canonicalized_resource}"
  end

  def canonicalized_headers
    "#{canonicalized_positional_headers}#{canonicalized_amazon_headers}"
  end

  def canonicalized_positional_headers
    POSITIONAL_HEADERS.collect do |header|
      (@headers[header] || "") + "\n"
    end.join
  end

  def canonicalized_amazon_headers

    # select all headers that start with x-amz-
    amazon_headers = @headers.select do |header, value|
      header =~ /^x-amz-/
    end

    # Sort them alphabetically by key
    amazon_headers = amazon_headers.sort do |a, b|
      a[0] <=> b[0]
    end

    # Collect all of the amazon headers like this:
    # {key}:{value}\n
    # The value has to have any whitespace on the left stripped from it
    # and any new-lines replaced by a single space.
    # Finally, return the headers joined together as a single string.
    amazon_headers.collect do |header, value|
      "#{header}:#{value.lstrip.gsub("\n"," ")}\n"
    end.join
  end

  def canonicalized_resource
    canonicalized_resource_string = "/"
    canonicalized_resource_string += @bucket
    canonicalized_resource_string += @request_path
    canonicalized_resource_string
  end

  def get_bucket_name
    @bucket = ""
    return unless @headers.has_key?('host')
    @headers['host'] = @headers['host'].downcase
    return if @headers['host'] == 's3.amazonaws.com'
    if @headers['host'] =~ /^([^.]+)(:\d\d\d\d)?\.#{HOST}$/ # Virtual hosting
      @bucket = $1.gsub(/\/$/,'') + '/'
    else
      # CNAME Virtual hosting
      @bucket = @headers['host'].gsub(/(:\d\d\d\d)$/, '').gsub(/\/$/,'') + '/'
    end
  end

end
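To make the three addressing styles easier to compare side by side, here is a stand-alone sketch of the same rule. It is not the library code above: sketch_canonicalized_resource and S3_HOST are names I made up for illustration, and unlike the library it strips a port of any length from the host.

```ruby
S3_HOST = 's3.amazonaws.com'

# Hypothetical helper: builds "/<bucket><uri><sub-resource>" from the
# Host header and the request path.
def sketch_canonicalized_resource(host, request_path)
  host = host.to_s.downcase.sub(/:\d+\z/, '')   # drop any port number
  path = request_path.sub(%r{\A/}, '')          # drop the leading '/'

  bucket =
    if host.empty? || host == S3_HOST
      ''                                        # bucket is already in the path
    elsif host.end_with?(".#{S3_HOST}")
      host[0...-(S3_HOST.length + 1)] + '/'     # virtual hosting
    else
      host + '/'                                # CNAME virtual hosting
    end

  "/#{bucket}#{path}"
end

puts sketch_canonicalized_resource('s3.amazonaws.com', '/johnsmith/photos/puppy.jpg')
puts sketch_canonicalized_resource('johnsmith.s3.amazonaws.com', '/photos/puppy.jpg')
puts sketch_canonicalized_resource('static.johnsmith.net:8080', '/db-backup.dat.gz')
```

The first two calls describe the same object, once with the bucket in the path and once with the bucket in the Host header, and both produce the same canonicalized resource.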
The Full Signature
Now that we have all of the parts of the signature coded up, we can use some samples provided by the S3 Developer’s Guide to test that it works when we bring it all together. The examples are in the section of the document on REST authentication at http://docs.amazonwebservices.com/AmazonS3/2006-03-01/RESTAuthentication.html.
Okay, so let’s take a look at the first example and code up a test to use it. Here’s the example:
Request:
GET /photos/puppy.jpg HTTP/1.1
Host: johnsmith.s3.amazonaws.com
Date: Tue, 27 Mar 2007 19:36:42 +0000
Authorization: AWS 0PN5J17HBGZHT7JJ3X82:xXjDGYUmKxnwqr5KXNPGldn5LbA=

canonical_string:
GET\n
\n
\n
Tue, 27 Mar 2007 19:36:42 +0000\n
/johnsmith/photos/puppy.jpg
There are two interesting features to note:
- The bucket is provided in the Host header, not in the URL, but it still shows up in the canonicalized_resource in the canonical_string.
- The actual signed Authorization header is provided for the sample request. I’ll talk about this more in the next section on signing the request.
Let’s translate that example to a unit test and see if all of our hard work comes together.
require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

class S3AuthenticatorTest < Test::Unit::TestCase

  def setup
    @s3_test = S3Lib::AuthenticatedRequest.new
  end

  # http://developer.amazonwebservices.com/connect/entry.jspa?externalID=123&categoryID=48
  def test_dg_sample_one
    @s3_test.make_authenticated_request(:get, '/photos/puppy.jpg',
      {'Host' => 'johnsmith.s3.amazonaws.com',
       'Date' => 'Tue, 27 Mar 2007 19:36:42 +0000'})
    expected_canonical_string = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n" +
                                "/johnsmith/photos/puppy.jpg"
    assert_equal expected_canonical_string, @s3_test.canonical_string
  end

end
Save that in canonical_string_tests.rb and run it.
$> ruby test/canonical_string_tests.rb
Loaded suite test/canonical_string_tests
Started
.
Finished in 0.000447 seconds.

1 tests, 1 assertions, 0 failures, 0 errors
Phew. Everything works as planned.
The next step is to take all of the examples from the Developer’s Guide, translate them to unit tests and make sure they pass.
require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

class S3AuthenticatorTest < Test::Unit::TestCase

  def setup
    @s3_test = S3Lib::AuthenticatedRequest.new
  end

  # http://developer.amazonwebservices.com/connect/entry.jspa?externalID=123&categoryID=48
  def test_dg_sample_one
    @s3_test.make_authenticated_request(:get, '/photos/puppy.jpg',
      {'Host' => 'johnsmith.s3.amazonaws.com',
       'Date' => 'Tue, 27 Mar 2007 19:36:42 +0000'})
    expected_canonical_string = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n" +
                                "/johnsmith/photos/puppy.jpg"
    assert_equal expected_canonical_string, @s3_test.canonical_string
  end

  def test_dg_sample_two
    @s3_test.make_authenticated_request(:put, '/photos/puppy.jpg',
      {'Content-Type' => 'image/jpeg',
       'Content-Length' => '94328',
       'Host' => 'johnsmith.s3.amazonaws.com',
       'Date' => 'Tue, 27 Mar 2007 21:15:45 +0000'})
    expected_canonical_string = "PUT\n\nimage/jpeg\nTue, 27 Mar 2007 21:15:45 +0000\n" +
                                "/johnsmith/photos/puppy.jpg"
    assert_equal expected_canonical_string, @s3_test.canonical_string
  end

  def test_dg_sample_three
    @s3_test.make_authenticated_request(:get, '',
      {'prefix' => 'photos',
       'max-keys' => '50',
       'marker' => 'puppy',
       'host' => 'johnsmith.s3.amazonaws.com',
       'date' => 'Tue, 27 Mar 2007 19:42:41 +0000'})
    assert_equal "GET\n\n\nTue, 27 Mar 2007 19:42:41 +0000\n/johnsmith/",
                 @s3_test.canonical_string
  end

  def test_dg_sample_four
    @s3_test.make_authenticated_request(:get, '?acl',
      {'host' => 'johnsmith.s3.amazonaws.com',
       'date' => 'Tue, 27 Mar 2007 19:44:46 +0000'})

    assert_equal "GET\n\n\nTue, 27 Mar 2007 19:44:46 +0000\n" +
                 "/johnsmith/?acl", @s3_test.canonical_string
  end

  def test_dg_sample_five
    @s3_test.make_authenticated_request(:delete, '/johnsmith/photos/puppy.jpg',
      {'User-Agent' => 'dotnet',
       'host' => 's3.amazonaws.com',
       'date' => 'Tue, 27 Mar 2007 21:20:27 +0000',
       'x-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000' })
    assert_equal "DELETE\n\n\n\nx-amz-date:Tue, 27 Mar 2007 21:20:26 +0000\n" +
                 "/johnsmith/photos/puppy.jpg", @s3_test.canonical_string
  end

  def test_dg_sample_six
    @s3_test.make_authenticated_request(:put,
      '/db-backup.dat.gz',
      {'User-Agent' => 'curl/7.15.5',
       'host' => 'static.johnsmith.net:8080',
       'date' => 'Tue, 27 Mar 2007 21:06:08 +0000',
       'x-amz-acl' => 'public-read',
       'content-type' => 'application/x-download',
       'Content-MD5' => '4gJE4saaMU4BqNR0kLY+lw==',
       'X-Amz-Meta-ReviewedBy' => ['joe@johnsmith.net', 'jane@johnsmith.net'],
       'X-Amz-Meta-FileChecksum' => '0x02661779',
       'X-Amz-Meta-ChecksumAlgorithm' => 'crc32',
       'Content-Disposition' => 'attachment; filename=database.dat',
       'Content-Encoding' => 'gzip',
       'Content-Length' => '5913339' })
    expected_canonical_string = "PUT\n4gJE4saaMU4BqNR0kLY+lw==\napplication/x-download\n" +
      "Tue, 27 Mar 2007 21:06:08 +0000\n" +
      "x-amz-acl:public-read\nx-amz-meta-checksumalgorithm:crc32\n" +
      "x-amz-meta-filechecksum:0x02661779\n" +
      "x-amz-meta-reviewedby:joe@johnsmith.net,jane@johnsmith.net\n" +
      "/static.johnsmith.net/db-backup.dat.gz"
    assert_equal expected_canonical_string, @s3_test.canonical_string
  end

end
Now, let’s run them
$> ruby test/canonical_string_tests.rb
Loaded suite test/canonical_string_tests
Started
......
Finished in 0.001213 seconds.

6 tests, 6 assertions, 0 failures, 0 errors
SUCCESS! Ahh, that feels good. Only one step remains before we have a fully working library: we have to take that canonical_string and encode it.
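As a recap, the whole canonical string can be squeezed into one stand-alone function. This is only a sketch under a few assumptions: sketch_canonical_string is a made-up name, the canonicalized resource is passed in already built, and the header clean-up that the library does with downcase_keys and join_values is done inline here.

```ruby
# Hypothetical recap helper, not the library's implementation.
def sketch_canonical_string(verb, canonicalized_resource, headers)
  # Lower-case the header names and join multi-valued headers with commas
  h = {}
  headers.each do |name, value|
    h[name.to_s.downcase] = value.is_a?(Array) ? value.join(',') : value.to_s
  end

  # x-amz-date over-rides the positional date header
  h.delete('date') if h.key?('x-amz-date')

  # Positional headers: one line each, empty when the header is missing
  positional = %w[content-md5 content-type date].map { |name| "#{h[name]}\n" }.join

  # Amazon headers: sorted by name, written as "name:value\n"
  amazon = h.select { |name, _| name.start_with?('x-amz-') }.sort.map do |name, value|
    "#{name}:#{value.lstrip.gsub("\n", ' ')}\n"
  end.join

  "#{verb.to_s.upcase}\n#{positional}#{amazon}#{canonicalized_resource}"
end

# The fifth Developer's Guide sample from the tests above
puts sketch_canonical_string(:delete, '/johnsmith/photos/puppy.jpg',
  'User-Agent' => 'dotnet',
  'host'       => 's3.amazonaws.com',
  'date'       => 'Tue, 27 Mar 2007 21:20:27 +0000',
  'x-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000').inspect
```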
Signing the Request
The whole point of this authentication procedure is to digitally sign your request by creating the canonical_string and using it to create an Authorization header. The Authorization header looks like this:
Authorization = AWS <AWSAccessKeyId>:<signature>
and the signature like this:
Signature = Base64( HMAC-SHA1( UTF-8-Encoding-Of( canonical_string ) ) )
We spent a lot of time figuring out how to make the canonical_string. The next step is much easier: we need to take that canonical_string and feed it through the algorithms to encode it. The method of encoding the canonical_string is highly language dependent, so I’m going to be lazy and point you to the Amazon S3 Getting Started Guide (http://docs.amazonwebservices.com/AmazonS3/2006-03-01/gsg/). This page in the guide links to sample implementations in Java, C#, Perl, PHP, Ruby and Python: http://docs.amazonwebservices.com/AmazonS3/2006-03-01/gsg/PreparingTheSamples.html. Here is some Ruby code that does the trick:
require 'base64'
require 'digest/sha1'
require 'openssl'


module S3Lib

  class AuthenticatedRequest

    def make_authenticated_request(verb, request_path, headers = {})
      @verb = verb
      @request_path = request_path.gsub(/^\//,'') # Strip off the leading '/'

      @amazon_id = ENV['AMAZON_ACCESS_KEY_ID']
      @amazon_secret = ENV['AMAZON_SECRET_ACCESS_KEY']

      @headers = headers.downcase_keys.join_values
      fix_date
      get_bucket_name
    end

    .....

    def authorization_string
      # Newer Ruby versions deprecate or drop OpenSSL::Digest::Digest;
      # there, use OpenSSL::Digest.new('sha1') instead.
      generator = OpenSSL::Digest::Digest.new('sha1')
      encoded_canonical =
        Base64.encode64(OpenSSL::HMAC.digest(generator, @amazon_secret, canonical_string)).strip

      "AWS #{@amazon_id}:#{encoded_canonical}"
    end

  end
end
I’ve added the @amazon_id and @amazon_secret instance variables in the make_authenticated_request method, and added the authorization_string method that does all of the heavy lifting. All of the required libraries are included in the base Ruby distribution, so you should be able to just run this.
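If you want to check the signing step on its own, without the rest of the class, something like the following sketch should do. The helper names (sketch_signature, sketch_authorization) are made up, and I'm assuming a Ruby recent enough to have OpenSSL::Digest.new and the 'm0' pack directive. The credentials, canonical string and expected signature are the fake ones from the first Developer's Guide example.

```ruby
require 'openssl'

# Hypothetical helper: Base64( HMAC-SHA1( canonical_string ) ), keyed
# with the Secret Key.
def sketch_signature(secret_key, canonical_string)
  hmac = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'),
                              secret_key, canonical_string)
  [hmac].pack('m0')   # Base64-encode without inserting line breaks
end

# Hypothetical helper: the full value of the Authorization header.
def sketch_authorization(access_key_id, secret_key, canonical_string)
  "AWS #{access_key_id}:#{sketch_signature(secret_key, canonical_string)}"
end

# Non-working sample credentials from the S3 Developer's Guide
access_key_id = '0PN5J17HBGZHT7JJ3X82'
secret_key    = 'uV3F3YluFJax1cknvbcGwgjvx4QpvB+leU8dUj2o'
canonical     = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n" +
                "/johnsmith/photos/puppy.jpg"

header = sketch_authorization(access_key_id, secret_key, canonical)
puts header
# The Developer's Guide gives xXjDGYUmKxnwqr5KXNPGldn5LbA= for this request
puts header.end_with?(':xXjDGYUmKxnwqr5KXNPGldn5LbA=')
```

Note that only the Secret Key goes into the HMAC. The Access Key ID is just prepended to the header so that Amazon knows whose secret to check the signature against.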
Let’s write some unit tests to see if that really works. Luckily, we have the examples from the Amazon S3 Developer’s Guide to work with. The examples all use the same set of (fake) authentication credentials.
Table 4.1. S3 Authentication Credentials used in the examples
Parameter Value
AWSAccessKeyId 0PN5J17HBGZHT7JJ3X82
AWSSecretAccessKey uV3F3YluFJax1cknvbcGwgjvx4QpvB+leU8dUj2o
We probably want to set these in the setup section of our tests. Remember that the S3Lib library is getting the parameters from the AMAZON_ACCESS_KEY_ID and AMAZON_SECRET_ACCESS_KEY environment variables, so we can set them through Ruby’s ENV object:
def setup
  # The id and secret key are non-working credentials
  # from the S3 Developer's Guide
  # http://developer.amazonwebservices.com/connect/entry.jspa
  # ?externalID=123&categoryID=48
  ENV['AMAZON_ACCESS_KEY_ID'] = '0PN5J17HBGZHT7JJ3X82'
  ENV['AMAZON_SECRET_ACCESS_KEY'] = 'uV3F3YluFJax1cknvbcGwgjvx4QpvB+leU8dUj2o'
  @s3_test = S3Lib::AuthenticatedRequest.new
end
We can then re-write the tests from the previous section to include a test for the Authorization header. Something like this:
require 'test/unit'
require File.join(File.dirname(__FILE__), '../s3_authenticator_dev')

class S3AuthenticatorTest < Test::Unit::TestCase

  def setup
    # The id and secret key are non-working credentials from the S3 Developer's Guide
    # See http://developer.amazonwebservices.com/connect/entry.jspa
    # ?externalID=123&categoryID=48
    ENV['AMAZON_ACCESS_KEY_ID'] = '0PN5J17HBGZHT7JJ3X82'
    ENV['AMAZON_SECRET_ACCESS_KEY'] = 'uV3F3YluFJax1cknvbcGwgjvx4QpvB+leU8dUj2o'
    @s3_test = S3Lib::AuthenticatedRequest.new
  end

  # See http://developer.amazonwebservices.com/connect/entry.jspa
  # ?externalID=123&categoryID=48
  def test_dg_sample_one
    @s3_test.make_authenticated_request(:get, '/photos/puppy.jpg',
      {'Host' => 'johnsmith.s3.amazonaws.com',
       'Date' => 'Tue, 27 Mar 2007 19:36:42 +0000'})
    expected_canonical_string = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n" +
                                "/johnsmith/photos/puppy.jpg"
    assert_equal expected_canonical_string, @s3_test.canonical_string
    assert_equal "AWS 0PN5J17HBGZHT7JJ3X82:xXjDGYUmKxnwqr5KXNPGldn5LbA=",
                 @s3_test.authorization_string
  end

  # See http://developer.amazonwebservices.com/connect/entry.jspa
  # ?externalID=123&categoryID=48
  def test_dg_sample_two
    @s3_test.make_authenticated_request(:put, '/photos/puppy.jpg',
      {'Content-Type' => 'image/jpeg',
       'Content-Length' => '94328',
       'Host' => 'johnsmith.s3.amazonaws.com',
       'Date' => 'Tue, 27 Mar 2007 21:15:45 +0000'})
    expected_canonical_string = "PUT\n\nimage/jpeg\nTue, 27 Mar 2007 21:15:45 +0000\n" +
                                "/johnsmith/photos/puppy.jpg"
    assert_equal expected_canonical_string, @s3_test.canonical_string
    assert_equal "AWS 0PN5J17HBGZHT7JJ3X82:hcicpDDvL9SsO6AkvxqmIWkmOuQ=",
                 @s3_test.authorization_string
  end

  def test_dg_sample_three
    @s3_test.make_authenticated_request(:get, '',
      {'prefix' => 'photos',
       'max-keys' => '50',
       'marker' => 'puppy',
       'host' => 'johnsmith.s3.amazonaws.com',
       'date' => 'Tue, 27 Mar 2007 19:42:41 +0000'})
    assert_equal "GET\n\n\nTue, 27 Mar 2007 19:42:41 +0000\n/johnsmith/",
                 @s3_test.canonical_string
    assert_equal 'AWS 0PN5J17HBGZHT7JJ3X82:jsRt/rhG+Vtp88HrYL706QhE4w4=',
                 @s3_test.authorization_string
  end

  def test_dg_sample_four
    @s3_test.make_authenticated_request(:get, '?acl',
      {'host' => 'johnsmith.s3.amazonaws.com',
       'date' => 'Tue, 27 Mar 2007 19:44:46 +0000'})

    assert_equal "GET\n\n\nTue, 27 Mar 2007 19:44:46 +0000\n/johnsmith/?acl",
                 @s3_test.canonical_string
    assert_equal 'AWS 0PN5J17HBGZHT7JJ3X82:thdUi9VAkzhkniLj96JIrOPGi0g=',
                 @s3_test.authorization_string

  end

  def test_dg_sample_five
    @s3_test.make_authenticated_request(:delete,
      '/johnsmith/photos/puppy.jpg',
      {'User-Agent' => 'dotnet',
       'host' => 's3.amazonaws.com',
       'date' => 'Tue, 27 Mar 2007 21:20:27 +0000',
       'x-amz-date' => 'Tue, 27 Mar 2007 21:20:26 +0000' })
    assert_equal "DELETE\n\n\n\nx-amz-date:Tue, 27 Mar 2007 21:20:26 +0000\n" +
                 "/johnsmith/photos/puppy.jpg",
                 @s3_test.canonical_string
    assert_equal 'AWS 0PN5J17HBGZHT7JJ3X82:k3nL7gH3+PadhTEVn5Ip83xlYzk=',
                 @s3_test.authorization_string
  end

  def test_dg_sample_six
    @s3_test.make_authenticated_request(:put,
      '/db-backup.dat.gz',
      'User-Agent' => 'curl/7.15.5',
      'host' => 'static.johnsmith.net:8080',
      'date' => 'Tue, 27 Mar 2007 21:06:08 +0000',
      'x-amz-acl' => 'public-read',
      'content-type' => 'application/x-download',
      'Content-MD5' => '4gJE4saaMU4BqNR0kLY+lw==',
      'X-Amz-Meta-ReviewedBy' =>
        ['joe@johnsmith.net', 'jane@johnsmith.net'],
      'X-Amz-Meta-FileChecksum' => '0x02661779',
      'X-Amz-Meta-ChecksumAlgorithm' => 'crc32',
      'Content-Disposition' => 'attachment; filename=database.dat',
      'Content-Encoding' => 'gzip',
      'Content-Length' => '5913339')
    expected_canonical_string = "PUT\n4gJE4saaMU4BqNR0kLY+lw==\napplication/x-download\n" +
      "Tue, 27 Mar 2007 21:06:08 +0000\n" +
      "x-amz-acl:public-read\nx-amz-meta-checksumalgorithm:crc32\n" +
      "x-amz-meta-filechecksum:0x02661779\n" +
      "x-amz-meta-reviewedby:joe@johnsmith.net,jane@johnsmith.net\n" +
      "/static.johnsmith.net/db-backup.dat.gz"
    assert_equal expected_canonical_string, @s3_test.canonical_string
    assert_equal 'AWS 0PN5J17HBGZHT7JJ3X82:C0FlOtU8Ylb9KDTpZqYkZPX91iI=',
                 @s3_test.authorization_string
  end

end
Making the Request
We’ve finally got to the point where we have a signature that we can use to sign the request. The last step will be to actually make the request. We’ll be using the rest-open-uri library to do this, so if you haven’t already installed it, do a
$> sudo gem install rest-open-uri
from the command line (omit the sudo if you’re on Windows) to get it installed.
Once you have the rest-open-uri gem installed, making the request is simple. open-uri is in the Ruby Standard Library. It extends the Kernel::open method so that any file that starts with “xxx://” is opened as a URL. So, to open a URL, you just use something like this:
require 'open-uri'
reddit = open('http://reddit.com')
puts reddit.readlines
rest-open-uri (http://rubyforge.org/projects/rest-open-uri/) is a library by Leonard Richardson that extends open-uri, adding support for all of the RESTful verbs. You make a PUT request by adding a :method => :put to the headers hash when you make a call.
require 'rubygems'
require 'rest-open-uri'

# PUT to http://example.com/some_resource
open('http://example.com/some_resource', :method => :put)
# DELETE http://example.com/deleteable_resource
open('http://example.com/deleteable_resource', :method => :delete)
To make the request, then, we just need to require the rest-open-uri library and then add the following line to the make_authenticated_request method
req = open(uri, @headers.merge(:method => @verb,
                               'Authorization' => authorization_string))
Okay, let’s try this sucker out! First, make sure that you have actually set your environment parameters correctly so that you can authenticate to S3. AMAZON_ACCESS_KEY_ID should be set to your Amazon ID and AMAZON_SECRET_ACCESS_KEY to your Amazon Secret Key. On OS X or Unix, you can see the environment by typing env at the command line
$> env | grep AMAZON
AMAZON_ACCESS_KEY_ID=your_amazon_access_key_which_is_a_bunch_of_numbers
AMAZON_SECRET_ACCESS_KEY=your_secret_amazon_access_key
Okay, now that we’re sure about the authentication, let’s go do some testing
$> irb
>> require 's3_authenticator.rb'
=> true
>> S3Lib.request(:get, '/spatten_presentations')
=> #<StringIO:0x1741b60>
>> S3Lib.request(:get, '/spatten_presentations').read
=> "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n
<ListBucketResult xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">
  <Name>spatten_presentations</Name>
  <Prefix></Prefix>
  <Marker></Marker>
  <MaxKeys>1000</MaxKeys>
  <IsTruncated>false</IsTruncated>
  <Contents>
    <Key>ploticus_dsl.pdf</Key>
    <LastModified>2007-09-12T16:09:24.000Z</LastModified>
    <ETag>"94ca8590f028f8be0310bd5b2fabafdc"</ETag>
    <Size>509594</Size>
    <Owner>
      <ID>9d92623ba6dd9d7cc06a7b8bcc46381e7c646f72d769214012f7e91b50c0de0f</ID>
      <DisplayName>scottpatten</DisplayName>
    </Owner>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
  <Contents>
    <Key>s3-on-rails.pdf</Key>
    <LastModified>2007-12-05T19:38:32.000Z</LastModified>
    <ETag>"891b100f53155b8570bc5e25b1e10f97"</ETag>
    <Size>184748</Size>
    <Owner>
      <ID>9d92623ba6dd9d7cc06a7b8bcc46381e7c646f72d769214012f7e91b50c0de0f</ID>
      <DisplayName>scottpatten</DisplayName>
    </Owner>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
</ListBucketResult>"
Hey, it works! Notice that the first request we did just returned a StringIO object. That’s what the open command returns. To get at the body of the response, we use the read method on the StringIO object.
Note
If you want to read an IO object more than once, you need to rewind it between reads. Like this:
request.read
request.rewind
request.read
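You can see why the rewind matters without going anywhere near S3. This little sketch uses a plain StringIO from Ruby's standard library as a stand-in for the object that open returns:

```ruby
require 'stringio'

# A stand-in for the response object; the body text is made up
response = StringIO.new('this is a test')

first  = response.read   # reads the whole body
second = response.read   # the read position is now at the end
response.rewind          # move the read position back to the start
third  = response.read   # the whole body again

p first    # prints "this is a test"
p second   # prints ""
p third    # prints "this is a test"
```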
Notice that the listing shows the bucket we requested (spatten_presentations), along with some information about that bucket and a listing of all of the objects in that bucket. We’ll be talking more about the XML and how to parse it in the S3 API Recipes coming up shortly.
Over-riding the open Method
We don’t want to have our tests call on S3 all of the time. It really slows things down and means we can’t test when we’re away from an internet connection. To fix it, we can just over-ride the open method in the S3Lib::AuthenticatedRequest class. I did it like this:
module S3Lib
  class AuthenticatedRequest

    # Over-ride RestOpenURI#open
    def open(uri, headers)
      {:uri => uri, :headers => headers}
    end

  end
end
As you can see, it just returns a hash showing the parameters passed in, and never makes a call to the internet.
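The trick works because a call to open with no explicit receiver is looked up on the object itself before Ruby falls back to Kernel. Here is a tiny stand-alone sketch of that lookup; SketchRequester and its fetch method are made-up names, not part of S3Lib.

```ruby
# Hypothetical class standing in for S3Lib::AuthenticatedRequest
class SketchRequester
  def fetch(uri, headers = {})
    # No explicit receiver: Ruby looks for SketchRequester#open first
    # and only falls back to Kernel#open if the class doesn't define one
    open(uri, headers)
  end
end

# In the test helper, re-open the class and stub out the network call
class SketchRequester
  def open(uri, headers)
    {:uri => uri, :headers => headers}
  end
end

result = SketchRequester.new.fetch('http://example.com/some_resource',
                                   'date' => 'Wed, 13 Feb 2008 12:00:00 GMT')
p result[:uri]       # the URI that would have been requested
p result[:headers]   # the headers that would have been sent
```

Because the stub returns its arguments, a test can make assertions about the URI and headers the library would have sent.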
Error Handling
We now have a working authentication library, and we are almost ready to actually start talking to S3. There’s one more thing we should do, however, that will make our lives much easier and save us tons of time while we’re building the rest of the S3 library. We need to add in some error handling. To illustrate why, let’s try making a new object in a bucket using the current library.
$> irb
>> require 'code/s3_code/library/s3_authenticator'
=> true
>> S3Lib.request(:put, "spatten_sample_bucket")
=> #<StringIO:0x1741e94>
>> S3Lib.request(:put, "spatten_sample_bucket/sample_object", :body => "this is a test")
OpenURI::HTTPError: 403 Forbidden
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:320:in `open_http'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:659:in `buffer_open'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:194:in `open_loop'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:192:in `catch'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:192:in `open_loop'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:162:in `open_uri'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:561:in `open'
from /opt/local/lib/ruby/gems/1.8/gems/rest-open-uri-1.0.0/lib/rest-open-uri.rb:35:in `open'
from ./code/s3_code/library/s3_authenticator.rb:70:in `make_authenticated_request'
from ./code/s3_code/library/s3_authenticator.rb:37:in `request'
from (irb):3
>> puts req.body
NoMethodError: undefined method `body' for nil:NilClass
from (irb):5
What’s going on here? I can make a PUT request to create a bucket, but I can’t make a PUT to create an object. Hmmm. There’s really no way to figure out what’s going on, either, as the failed request leaves us with nothing but nil to inspect. Luckily, Amazon returns some information about what the error was. What we need to do is trap the error as it occurs and grab the information from it. Looking at the error more closely, we see that the OpenURI library is raising an OpenURI::HTTPError. Let’s add some code to the S3Lib::request method to trap that error and see what information we can extract from it.
def self.request(verb, request_path, headers = {})
  begin
    s3requester = AuthenticatedRequest.new()
    req = s3requester.make_authenticated_request(verb, request_path, headers)
  rescue OpenURI::HTTPError => e
    puts "Status: #{e.io.status.join(",")}"
    puts "Error From Amazon:\n#{e.io.read}"
    puts "canonical string you signed:\n#{s3requester.canonical_string}"
  end
end
Trying to make the object again gives us a bit more diagnostic feedback (I reformatted the Amazon error response a bit to make it more readable)
$> irb -r 'code/s3_code/library/s3_authenticator'
>> S3Lib.request(:put, "spatten_sample_bucket/sample_object",
                 :body => "this is a test")
Status: 403,Forbidden
Error From Amazon:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>SignatureDoesNotMatch</Code>
  <Message>The request signature we calculated does not match the signature
  you provided. Check your key and signing method.</Message>
  <RequestId>7BD4FADF07973DEA</RequestId>
  <SignatureProvided>redacted</SignatureProvided>
  <StringToSignBytes>redacted</StringToSignBytes>
  <AWSAccessKeyId>195MGYF7J3AC7ZPSHVR2</AWSAccessKeyId>
  <HostId>Baq4uDiuK3jU7Xf3R35sOLYrdFZBASP/e0ncdUdvUX1BJ5HEh58ojC7/WRKXjc/c</HostId>
  <StringToSign>
PUT

application/x-www-form-urlencoded
Thu, 20 Mar 2008 18:17:40 GMT
/spatten_sample_bucket/sample_object
  </StringToSign>
</Error>

canonical string you signed:
PUT


Thu, 20 Mar 2008 18:17:40 GMT
/spatten_sample_bucket/sample_object
Ah-ha! Notice that the StringToSign that Amazon is returning has a content-type header of “application/x-www-form-urlencoded”. We didn’t provide a content-type header at all, so we didn’t include it in our canonical_string. It looks like one of the Ruby libraries we’re using was a little too clever and inserted the content-type for us. Let’s try adding our own content-type header. Hopefully that will work.
Warning
Having extra headers added by a library is a pretty common occurrence. If you are having troubles getting your authentication library working, make sure you check that there aren’t any unexpected headers in the string that Amazon is expecting you to sign.
$> irb -r 'code/s3_code/library/s3_authenticator'
>> req = S3Lib.request(:put, "spatten_sample_bucket/sample_object",
                       "content-type" => "text/plain",
                       :body => "this is a test")
=> #<StringIO:0x173ac48>
>> puts req.status
200
OK
That looks promising. No errors raised, and a status of 200 OK. Let’s list the objects in the bucket and make sure everything is okay.
>> req = S3Lib.request(:get, "spatten_sample_bucket")
=> #<StringIO:0x1732ac0>
>> puts req.read
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>spatten_sample_bucket</Name>
  <Prefix></Prefix>
  <Marker></Marker>
  <MaxKeys>1000</MaxKeys>
  <IsTruncated>false</IsTruncated>
  <Contents>
    <Key>sample_object</Key>
    <LastModified>2008-03-20T18:26:01.000Z</LastModified>
    <ETag>"54b0c58c7ce9f2a8b551351102ee0938"</ETag>
    <Size>14</Size>
    <Owner>
      <ID>9d92623ba6dd9d7cc06a7b8bcc46381e7c646f72d769214012f7e91b50c0de0f</ID>
      <DisplayName>scottpatten</DisplayName>
    </Owner>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
</ListBucketResult>
=> nil
>> req = S3Lib.request(:get, "spatten_sample_bucket/sample_object")
=> #<StringIO:0x1728a0c>
>> puts req.read
this is a test
=> nil
That looks perfect: we have a new object in the bucket called “sample_object”, and getting that object gives us back the expected object contents.
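Going back to the signature mismatch for a moment: spotting the difference between Amazon's StringToSign and your own canonical string by eye gets tedious quickly. Here is a small sketch of a helper that does the comparison line by line. The name first_canonical_difference is made up, and I'm assuming you have already pulled both strings out as plain Ruby strings.

```ruby
# Hypothetical debugging helper: returns [line_index, amazon_line, our_line]
# for the first line that differs, or nil if the two strings are identical.
def first_canonical_difference(string_to_sign, canonical_string)
  # A limit of -1 keeps trailing empty lines so blank header slots count
  theirs = string_to_sign.split("\n", -1)
  ours   = canonical_string.split("\n", -1)

  [theirs.length, ours.length].max.times do |i|
    return [i, theirs[i], ours[i]] unless theirs[i] == ours[i]
  end
  nil
end

# The two strings from the failed PUT above
amazon_expected = "PUT\n\napplication/x-www-form-urlencoded\n" +
                  "Thu, 20 Mar 2008 18:17:40 GMT\n" +
                  "/spatten_sample_bucket/sample_object"
we_signed       = "PUT\n\n\nThu, 20 Mar 2008 18:17:40 GMT\n" +
                  "/spatten_sample_bucket/sample_object"

p first_canonical_difference(amazon_expected, we_signed)
```

For the failed request, the first difference is at index 2, the content-type slot, which is exactly where the library slipped in its own header.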
We obviously don’t want to leave the error handling as is. Catching all errors and just printing out some information is decidedly sub-optimal. Let’s fix it up by creating an S3ResponseError class and initializing it with some information that will be useful for figuring out what went wrong. We’ll also make sure to add the error type given by Amazon (which was SignatureDoesNotMatch in our example above) so that we can use that to raise a more specific error type in our library.
module S3Lib

  def self.request(verb, request_path, headers = {})
    begin
      s3requester = AuthenticatedRequest.new()
      req = s3requester.make_authenticated_request(verb, request_path, headers)
    rescue OpenURI::HTTPError => e
      raise S3Lib::S3ResponseError.new(e.message, e.io, s3requester)
    end
  end

  class S3ResponseError < StandardError
    attr_reader :response, :amazon_error_type, :status, :s3requester, :io
    def initialize(message, io, s3requester)
      @io = io
      # Get the response and status from the IO object
      @io.rewind
      @response = @io.read
      @io.rewind
      @status = io.status

      # The Amazon Error type will always look like
      # <Code>AmazonErrorType</Code>. Find it with a RegExp.
      @response =~ /<Code>(.*)<\/Code>/
      @amazon_error_type = $1

      # Make the AuthenticatedRequest instance available as well
      @s3requester = s3requester

      # Call the standard Error initializer
      # if you put '%s' in the message it will be
      # replaced by the amazon_error_type
      super(message % @amazon_error_type)
    end
  end
end
Note that the S3Lib::request method rescues any OpenURI::HTTPError errors and re-raises them as S3Lib::S3ResponseError errors, passing in the IO object and the AuthenticatedRequest instance to the error. We can use this new error class to do something like this if we just want to output some info:
#!/usr/bin/env ruby

require File.join(File.dirname(__FILE__),'s3_authenticator')

begin
  req = S3Lib.request(:put, "spatten_sample_bucket/sample_object",
                      :body => "Wheee")
rescue S3Lib::S3ResponseError => e
  puts "Amazon Error Type: #{e.amazon_error_type}"
  puts "HTTP Status: #{e.status.join(',')}"
  puts "Response from Amazon: #{e.response}"
  if e.amazon_error_type == 'SignatureDoesNotMatch'
    puts "canonical string: #{e.s3requester.canonical_string}"
  end
end
In the recipes in the rest of this section, we will be creating new error types and raising them based on the amazon_error_type of the raised S3ResponseError.
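To give a rough idea of where this is heading, here is one way the mapping from an error type to a more specific error class could look. Treat it as a sketch only: the SketchS3 module, its class names and the lookup table are all made up for illustration, and the recipes may well do it differently.

```ruby
# Hypothetical error hierarchy, not the one built later in the book
module SketchS3
  class ResponseError < StandardError; end
  class SignatureDoesNotMatch < ResponseError; end
  class NoSuchBucket < ResponseError; end

  # Maps the error type string reported by Amazon to an error class
  ERROR_CLASSES = {
    'SignatureDoesNotMatch' => SignatureDoesNotMatch,
    'NoSuchBucket'          => NoSuchBucket
  }

  # Unknown (or missing) error types fall back to the generic class
  def self.error_class_for(amazon_error_type)
    ERROR_CLASSES.fetch(amazon_error_type, ResponseError)
  end
end

begin
  raise SketchS3.error_class_for('SignatureDoesNotMatch'),
        'check the canonical string'
rescue SketchS3::SignatureDoesNotMatch => e
  puts "Caught #{e.class}: #{e.message}"
end
```

Because every specific class inherits from the generic one, callers who don't care about the details can still rescue SketchS3::ResponseError and catch everything.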