MimicDB is a local data store of object metadata from S3. Consistency is maintained by "mimicking" every S3 API call locally. This lets tasks such as listing buckets, searching keys, and calculating storage usage be handled entirely locally, without the latency or cost of the S3 API.
On average, tasks like these are 2000x faster using MimicDB.
The Python implementation of MimicDB defaults to a Redis backend, although SQLite and in-memory backends are also available. The backend can easily be extended to other databases.
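To illustrate the mirroring idea, here is a minimal sketch, not MimicDB's actual API: a wrapper intercepts each S3-style call, applies it to the remote store, and records the same change in a local in-memory backend, so listing and size queries never touch the network. All class and method names here are hypothetical.

```python
class MimicConnection:
    """Hypothetical sketch: mirror every 'S3' call into a local metadata store."""

    def __init__(self, remote):
        self.remote = remote  # stand-in for the real S3 connection
        self.meta = {}        # local metadata: {bucket: {key: size}}

    def put(self, bucket, key, data):
        self.remote.put(bucket, key, data)                 # real API call
        self.meta.setdefault(bucket, {})[key] = len(data)  # mirrored locally

    def delete(self, bucket, key):
        self.remote.delete(bucket, key)
        self.meta.get(bucket, {}).pop(key, None)

    def list_keys(self, bucket):
        # Served entirely from local metadata: no S3 round trip.
        return sorted(self.meta.get(bucket, {}))

    def storage_used(self, bucket):
        return sum(self.meta.get(bucket, {}).values())


class FakeS3:
    """Stand-in remote so the sketch is runnable without network access."""

    def __init__(self):
        self.objects = {}

    def put(self, bucket, key, data):
        self.objects[(bucket, key)] = data

    def delete(self, bucket, key):
        self.objects.pop((bucket, key), None)


conn = MimicConnection(FakeS3())
conn.put("photos", "cat.jpg", b"x" * 1024)
conn.put("photos", "dog.jpg", b"x" * 2048)
conn.delete("photos", "dog.jpg")
print(conn.list_keys("photos"))     # ['cat.jpg']
print(conn.storage_used("photos"))  # 1024
```

Because every write passes through the wrapper, the local store can answer reads on its own, which is where the speedup comes from.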
Data is stored in a simple but powerful layout:
| Key | Value |
| --- | --- |
| `mimicdb` | A set of bucket names |
| `mimicdb:bucket` | A set of key names |
| `mimicdb:bucket:key` | A hash of key metadata (size and MD5) |
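The layout can be pictured with plain Python sets and dicts standing in for Redis sets and hashes (the bucket and key names below are made up):

```python
# In-memory picture of the Redis layout; a real backend would use
# SADD/SMEMBERS for the sets and HSET/HGETALL for the hashes.
store = {
    "mimicdb": {"photos", "backups"},          # set of bucket names
    "mimicdb:photos": {"cat.jpg", "dog.jpg"},  # set of key names in 'photos'
    "mimicdb:photos:cat.jpg": {"size": "1024", "md5": "0" * 32},  # placeholder MD5
}

def storage_used(bucket):
    """Sum object sizes for a bucket using only local metadata."""
    return sum(
        int(store[f"mimicdb:{bucket}:{key}"]["size"])
        for key in store[f"mimicdb:{bucket}"]
        if f"mimicdb:{bucket}:{key}" in store
    )

print(storage_used("photos"))  # 1024 (only cat.jpg has metadata recorded here)
```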
The `mimicdb` prefix can additionally carry an optional namespace string, which allows multiple S3 connections to share the same backend. In that case, the layout looks like this:

| Key | Value |
| --- | --- |
| `mimicdb:namespace` | A set of bucket names |
| `mimicdb:namespace:bucket` | A set of key names |
| `mimicdb:namespace:bucket:key` | A hash of key metadata (size and MD5) |
MimicDB transactions take place on the same level as S3 API calls, so concurrent access to the backend is identical to concurrent access to an S3 connection.
Multiple simultaneous writes to the same key on S3 are reflected identically in MimicDB.
The bucket-level key layout provides an easy way to partition Redis across multiple servers. With consistent hash partitioning on the bucket name, each bucket's key set and its key metadata can be stored on the same instance.
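A sketch of that routing idea, assuming a fixed list of Redis nodes (the node names are made up, and this uses simple hash-mod routing rather than a full consistent-hash ring): hashing only the bucket component means `mimicdb:bucket` and every `mimicdb:bucket:key` entry map to the same node.

```python
import hashlib

NODES = ["redis-a", "redis-b", "redis-c"]  # hypothetical Redis instances

def node_for(redis_key):
    """Route a MimicDB key to a node by hashing its bucket component only."""
    parts = redis_key.split(":")
    bucket = parts[1] if len(parts) > 1 else ""  # '' for the top-level bucket set
    digest = hashlib.md5(bucket.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# The bucket's key set and every key hash land on the same instance:
assert node_for("mimicdb:photos") == node_for("mimicdb:photos:cat.jpg")
```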
MimicDB is currently implemented in Python on top of Boto. If you're already using Boto, the MimicDB Python library works as a drop-in replacement.