In a MySQL Conference presentation with the subheading “How to store a hundred billion BLOBs without buying a single disk,” Mark Atwood described a new open source storage engine that he built which allows you to store data in Amazons S3 service using MySQL.

It doesn’t (yet) allow you to create any generic table schema you desire. Instead, you create tables with a simple key/value pair structure. For the scenario where storing data in S3 might be most useful, however, this is actually the data structure that you’d most likely need. You can use your local mysql server to store your indexes and data relationships, then outsource large BLOB data such as images or videos to the slower remote S3 service, identifying each resource by a unique key.

It looks like it’s pretty simple to use. Here are a few example SQL statements that make use of the MySQL S3 engine (lifted from Mark’s slides).

Connecting to the AWS server
(USER 'aws id string',
PASSWORD 'aws secret string');

Creating a table
CREATE TABLE 'bierce' (
'defn' BLOB)
CONNECTION='awss3 DevilDictionary $server MyAWSAcct';

Insert and Select

INSERT INTO bierce (word, defn) VALUES
('AUTHOR', 'One noted for confusing bitterness with humor.');

SELECT defn FROM bierce WHERE word='WIT';

One note on selects: make sure you use a WHERE clause to select a single value by key or you’ll be transfering the whole table across the network. This is not only slow, but it will cost you in bandwidth (your own bandwidth costs plus the S3 bandwidth fee).

You can view Mark’s full presentation online. There’s a powerpoint slideshow, and a seperate text file containing the slide notes –Link.

The full source of the MySQL S3 plugin is available for download as well –Link.