Alex Bosworth's Weblog
Saturday - Dec 29, 2007
Amazon SimpleDB

I’ve recently been working on some top secret stuff that needs to scale to a large data set and handle a lot of simultaneous requests.

This problem has made me a convert to Amazon Web Services, which lets me run my applications on a big old server farm through a remote API, without having to talk to anyone – which is convenient if you live on the other side of the world.

I was just approved to use the Amazon SimpleDB beta, and I have to say it makes me rethink the word ‘database’. It doesn’t do anything I take for granted from even the simplest of databases:

  • No sorting: rows come back in whatever order
  • No text search
  • Values can be at most 1 KB (1,024 bytes) in size
  • No auto-incrementing ids
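  • No immediate consistency – right after a write, two reads can return 2 different values for the same attribute – ‘eventual consistency’
  • No types – no negatives, no integers, no dates, just strings, which are compared as text
  • No way to get multiple rows of data back in one call – a query returns only item names, and each item’s attributes must be fetched individually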
  • High failure rate: queries often fail due to ‘internal error’ at Amazon
  • Queries cannot exceed 5 seconds
  • No schema
  • No joins or functions
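
The ‘just strings’ limitation is usually worked around by encoding values so that plain text order matches the order you actually want. Here is a minimal sketch of that idea in Python (chosen for brevity, not because the service requires it). The helper names, the offset and the width are all assumptions made up for this illustration; nothing here comes from Amazon’s documentation.

```python
from datetime import datetime, timezone

# Hypothetical helpers: encode values as strings whose text order
# matches their natural order.
OFFSET = 10**9   # assumed bound: integers must lie in [-OFFSET, OFFSET)
WIDTH = 10       # digits in the largest shifted value (2 * OFFSET - 1)


def encode_int(n: int) -> str:
    """Shift by OFFSET so negatives become non-negative, then zero-pad."""
    if not -OFFSET <= n < OFFSET:
        raise ValueError("value outside the assumed range")
    return str(n + OFFSET).zfill(WIDTH)


def decode_int(s: str) -> int:
    """Reverse encode_int."""
    return int(s) - OFFSET


def encode_date(dt: datetime) -> str:
    """Fixed-width UTC timestamp; text order equals chronological order."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


values = [20, -5, 3]
encoded = sorted(encode_int(v) for v in values)
decoded = [decode_int(s) for s in encoded]  # same order as sorted(values)
```

The padding matters because, as text, ‘9’ sorts after ‘10’; the offset matters because a leading minus sign would break the ordering in the same way.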

Those are the downsides. The positives are: I don’t have to run and optimize my own database, there’s no need for a dedicated database box, scaling and indexing come for free, there’s no hardware to run, there’s no cost when it’s not active, and it’s fast and cheap (more testing needed on that though). You can also easily have multiple values for a single attribute, which is a bit trickier in a real database.

Really, I don’t think they should have called this ‘DB’ at all. Maybe AmazonDistributedHashTable++ would be better.

Anyway, I hope their infrastructure is run better than their documentation and developer support, because those are also a major negative. First, their API is appallingly bad – and completely different from their StorageAPI for some reason. The “REST” API tunnels all queries through a GET string with a billion parameters. It’s bad, bad practice.
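
To give a feel for what ‘everything in the GET string’ looks like, here is a rough Python sketch that packs a request into query parameters and appends an HMAC signature. Treat all of it as an assumption: the endpoint is a placeholder, the parameter names are illustrative, and the signing recipe (sort names case-insensitively, concatenate name and value, HMAC-SHA1, base64) should be checked against Amazon’s documentation before relying on it.

```python
import base64
import hashlib
import hmac
from urllib.parse import urlencode


def sign_query(params: dict, secret_key: str) -> str:
    """Assumed scheme: HMAC-SHA1 over name+value pairs, sorted by name."""
    canonical = "".join(
        name + params[name] for name in sorted(params, key=str.lower)
    )
    digest = hmac.new(
        secret_key.encode("utf-8"), canonical.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def build_url(endpoint: str, params: dict, secret_key: str) -> str:
    """Every part of the request, signature included, goes in the query string."""
    signed = dict(params)
    signed["Signature"] = sign_query(params, secret_key)
    return endpoint + "?" + urlencode(signed)


# Illustrative parameter names and placeholder values, not a verified request.
example_url = build_url(
    "https://sdb.example.invalid/",
    {
        "Action": "GetAttributes",
        "DomainName": "mydomain",
        "ItemName": "item1",
        "AWSAccessKeyId": "PLACEHOLDER_KEY_ID",
        "Timestamp": "2007-12-29T14:40:34Z",
    },
    "PLACEHOLDER_SECRET",
)
```

Even this toy request already carries half a dozen parameters, and the action itself (‘GetAttributes’) travels as just another parameter rather than as a resource path or HTTP verb, which is the complaint above.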

Second, they don’t have a simple PHP library to go along with their simple service. The PHP library they do have is way too complicated: the service only has 7 simple methods, yet the library requires 27 PHP files with thousands of lines of code to implement them.

Sure, they probably wanted to do a nice reference implementation that included support for all the features and did really good error handling or something – but it’s just scary and evil for something that’s supposed to be simple. At least it makes an attempt to be PHP5.

I wrote a replacement PHP library for playing around with the service. It’s missing some big optional components of the API like tokens and it doesn’t deal with retries, etc – but if you just want to try out SimpleDB and you use PHP, you can grab my really simple simpledb script – it’s 250 or so lines compared to their 27-file, 5000-or-so-line reference implementation. (You must have Crypt/HMAC and Curl, however.)
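
Given the ‘internal error’ failures mentioned earlier, retries are the first thing a caller would want to bolt on. Here is a minimal Python sketch of retrying with exponential backoff and jitter. It is not part of the PHP script, and `TransientError`, `with_retries` and the default numbers are hypothetical names and values invented for the example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for an 'internal error' response."""


def with_retries(call, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run call(); on TransientError wait and try again, doubling the delay."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the failure
            delay = base_delay * (2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay))


# Usage: a fake request that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("internal error")
    return "ok"


result = with_retries(flaky_request, sleep=lambda seconds: None)
```

Passing `sleep` in as a parameter is only there so the example can run without actually waiting; real code would leave the default.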


Comments (1)
Good feedback..
Posted By: Jeff Barr Dec 29, 2007 14:40:34
Hi Alex, this is good feedback and I will make sure that the SimpleDB team sees it ASAP!

Regarding the internal errors, perhaps you can send me (jbarr at amazon.com) some of the failing request ids so that we can track down the problem.