AWS DynamoDB is a managed NoSQL document database service, proprietary to AWS. Amazon uses it on its own e-commerce website, so its performance and scalability are proven at serious scale.
I have used it in a high-volume data project that needed more than 7,000 writes per second and generated around 50 GB of data daily. Designing an application around it takes effort, but it scales really well.
In this article you will see a few good reasons to evaluate AWS DynamoDB when you are planning to use MongoDB, Cassandra, or similar databases.
AWS DynamoDB is a managed service
Keeping a database running is not a small job. If the size of your data is in terabytes and growing, you need a team of infrastructure engineers to carry out the following tasks:
- Architecture and design of a multi-region, multi-partition, redundant, high-performance database.
- 24x7 monitoring of database node health.
- Database engine upgrades.
- OS upgrades.
- Regular disk and memory capacity planning, monitoring, and implementation.
- Compute capacity planning, monitoring, and implementation.
- Security audits and audit trails.
- Occasional database node maintenance and replacement.
If you are running MongoDB or Cassandra with terabytes of data, you have to make sure all of the above tasks are looked after by an infrastructure team.
AWS DynamoDB, being a managed service, frees you from all of these tasks. You just create tables and start pouring in data.
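As a minimal sketch, this is what creating a table looks like with boto3, the AWS SDK for Python; the table name and attributes are made up for illustration:

```python
import boto3

# Assumes AWS credentials and a default region are already configured.
dynamodb = boto3.client("dynamodb")

# Create a table with a partition key; no servers to provision or patch.
dynamodb.create_table(
    TableName="orders",  # hypothetical table name
    AttributeDefinitions=[
        {"AttributeName": "order_id", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "order_id", "KeyType": "HASH"},
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)

# Wait until the table is ready, then start writing data.
dynamodb.get_waiter("table_exists").wait(TableName="orders")
```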
This brings database infrastructure management cost down to nearly zero, and it is one of DynamoDB's biggest selling points.
Even petabytes of data are fine
AWS DynamoDB does not impose any limit on table size, so even petabytes of data are handled with the same performance. All data is stored on solid-state drives (SSDs).
Easy read and write throughput management
AWS DynamoDB is a true cloud database. It provides the following options to manage read and write throughput elastically:
- Auto Scale – With the Auto Scale feature, you define minimum and maximum read and write capacity for a table; when utilization crosses a target percentage, AWS automatically increases or decreases the provisioned throughput (and the number of partitions required to serve it). This reduces cost by keeping capacity optimal for actual demand.
- Use a cron job that triggers a change in a table's read and write throughput via AWS CLI commands in a script (a boto3 equivalent is sketched after this list).
- Manually change the throughput from the AWS Management Console.
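As an illustration of the scripted option, here is a minimal boto3 sketch that updates a table's provisioned throughput; the table name and capacity numbers are made up:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Raise provisioned capacity ahead of an expected traffic spike.
# DynamoDB adjusts partitions behind the scenes with no downtime.
dynamodb.update_table(
    TableName="orders",  # hypothetical table name
    ProvisionedThroughput={
        "ReadCapacityUnits": 200,
        "WriteCapacityUnits": 100,
    },
)
```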
A change in table throughput may result in the creation or deletion of partitions, and AWS makes sure all of this happens without any downtime.
Automatic data and traffic management
DynamoDB automatically manages table replication and partitioning based on data size. It continuously monitors each table and, when required, spreads it across a sufficient number of servers, replicated to multiple Availability Zones within a region. All of this happens without downtime, and without you even noticing.
On-demand backup & recovery for tables
DynamoDB is designed never to lose your data, because it replicates it across multiple Availability Zones on fault-tolerant systems.
Still, keeping periodic backups of a table can save face when an application corrupts data, and in some companies it is a compliance requirement. DynamoDB provides a simple console- and API-based backup and recovery mechanism. Backup and recovery are very fast, completing in seconds regardless of the size of the table.
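A sketch of taking an on-demand backup with boto3; the table and backup names are made up:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Take an on-demand backup; this does not consume table throughput.
response = dynamodb.create_backup(
    TableName="orders",                    # hypothetical table name
    BackupName="orders-before-migration",  # hypothetical backup name
)
backup_arn = response["BackupDetails"]["BackupArn"]

# Recovery creates a new table from the backup.
dynamodb.restore_table_from_backup(
    TargetTableName="orders-restored",  # hypothetical new table name
    BackupArn=backup_arn,
)
```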
Point in time recovery
DynamoDB also provides point-in-time recovery, which lets you restore a table to any moment within the last 35 days (5 weeks). This works over and above the on-demand backup and recovery feature.
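A sketch of enabling point-in-time recovery and restoring a table's state from one hour ago into a new table; the names and timestamp are illustrative:

```python
from datetime import datetime, timedelta

import boto3

dynamodb = boto3.client("dynamodb")

# Enable point-in-time recovery for the table.
dynamodb.update_continuous_backups(
    TableName="orders",  # hypothetical table name
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)

# Restore the table as it was one hour ago into a new table.
dynamodb.restore_table_to_point_in_time(
    SourceTableName="orders",
    TargetTableName="orders-one-hour-ago",  # hypothetical target name
    RestoreDateTime=datetime.utcnow() - timedelta(hours=1),
)
```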
Multi-region global tables
AWS DynamoDB automatically syncs data between multiple regions for global tables. You just specify the regions in which you want a table to be available. Without global tables, you would have to do this yourself by writing code that copies data across regions.
This is really helpful if your application needs multi-region replication for performance reasons.
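A sketch using the original global tables API in boto3; it assumes identically named tables with streams enabled already exist in each region, and the table name and regions are made up:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Link existing, identically named tables (with streams enabled)
# in each region into a single global table.
dynamodb.create_global_table(
    GlobalTableName="orders",  # hypothetical table name
    ReplicationGroup=[
        {"RegionName": "us-east-1"},
        {"RegionName": "eu-west-1"},
    ],
)
```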
Built-in in-memory caching service DAX (DynamoDB Accelerator)
Caching improves performance dramatically and cuts the load on the database engine for read queries.
DynamoDB Accelerator (DAX) is an optional caching layer that you can set up with a few clicks. DAX is a cache purpose-built to work with DynamoDB, and its performance makes it worth choosing over ElastiCache or self-hosted Redis when you are already on DynamoDB.
DynamoDB typically returns read queries in under 100 milliseconds; with DAX this improves further, and queries return in under 10 milliseconds.
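As a rough sketch, assuming the amazon-dax-client Python package, the DAX client mirrors the plain DynamoDB client API; the cluster endpoint, table, and key below are made up:

```python
import boto3
from amazondax import AmazonDaxClient

session = boto3.Session()

# Reads are served from the in-memory cache when possible and fall
# through to DynamoDB on a cache miss; writes pass straight through.
dax = AmazonDaxClient(
    session,
    region_name="us-east-1",
    endpoints=["mycluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111"],
)

item = dax.get_item(
    TableName="orders",               # hypothetical table name
    Key={"order_id": {"S": "1001"}},
)
```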
Encryption at rest
DynamoDB's request/response protocol is HTTP-based, just like many other NoSQL databases. Encryption at rest adds an extra layer of security by preventing unauthorized access to the underlying storage, and it is sometimes required for compliance. It uses 256-bit AES encryption and encrypts table data as well as indexes, working seamlessly with AWS Key Management Service (KMS) for the encryption keys.
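A sketch of enabling encryption at rest when creating a table with boto3; the table name is made up, and a specific KMS key could be supplied as well:

```python
import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="orders-encrypted",  # hypothetical table name
    AttributeDefinitions=[{"AttributeName": "order_id", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "order_id", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
    # Encrypt table data and indexes with an AWS KMS managed key.
    SSESpecification={"Enabled": True, "SSEType": "KMS"},
)
```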
Document and key-value item storage
DynamoDB can store JSON documents or key-value items in a table.
Like other NoSQL document databases, DynamoDB is schema-less; the key attribute is the only mandatory attribute in an item.
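A sketch using the boto3 Table resource, which maps Python dicts to DynamoDB's document types automatically; all names and values are made up:

```python
import boto3

table = boto3.resource("dynamodb").Table("orders")  # hypothetical table

# Only the key attribute (order_id) is mandatory; the rest is free-form.
table.put_item(
    Item={
        "order_id": "1001",
        "status": "SHIPPED",
        "shipping_address": {       # nested JSON document
            "street": "221B Baker Street",
            "city": "London",
        },
        "items": ["book", "pen"],   # list attribute
    }
)
```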
Eventual and strong consistency
DynamoDB supports two read consistency modes, chosen per read request rather than per table:
- Eventually consistent reads – the cheaper (and default) option; a query may or may not return the latest version of an item.
- Strongly consistent reads – for applications whose query results must always reflect the latest writes.
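A sketch of choosing consistency per request with boto3; the table and key are made up:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Eventually consistent read (the default): cheaper, may lag slightly.
dynamodb.get_item(
    TableName="orders",
    Key={"order_id": {"S": "1001"}},
)

# Strongly consistent read: always reflects the latest committed write,
# at twice the read-capacity cost.
dynamodb.get_item(
    TableName="orders",
    Key={"order_id": {"S": "1001"}},
    ConsistentRead=True,
)
```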
Time-to-live items
This is one of the most powerful features of DynamoDB, enabling use cases that would otherwise require custom application code. You can have items deleted automatically by a background sweeper once their expiry time has passed.
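A sketch of enabling TTL with boto3 and writing an item that expires a day later; the table name and the expires_at attribute name are my own choices, not fixed names:

```python
import time

import boto3

dynamodb = boto3.client("dynamodb")

# Tell DynamoDB which numeric attribute holds the expiry timestamp.
dynamodb.update_time_to_live(
    TableName="orders",  # hypothetical table name
    TimeToLiveSpecification={
        "Enabled": True,
        "AttributeName": "expires_at",  # hypothetical attribute name
    },
)

# Items whose expires_at (epoch seconds) lies in the past are removed
# automatically by DynamoDB's background sweeper.
dynamodb.put_item(
    TableName="orders",
    Item={
        "order_id": {"S": "1001"},
        "expires_at": {"N": str(int(time.time()) + 24 * 3600)},  # +1 day
    },
)
```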
DynamoDB Streams is another powerful feature, enabling the execution of an AWS Lambda function whenever an item is created, updated, or deleted. Streams are similar to AWS Kinesis streams and support many use cases: for example, building your own data pipeline that maintains aggregated records (averages, sums, and so on), or sending an email whenever a new user record is inserted.
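As a minimal sketch, a Python Lambda function subscribed to a table's stream receives batches of records in the DynamoDB Streams event format; NewImage is present when the stream is configured to include new images, and the email step is left as a stub:

```python
def handler(event, context):
    """Invoked by AWS Lambda with a batch of DynamoDB stream records."""
    for record in event["Records"]:
        # eventName is one of INSERT, MODIFY, or REMOVE.
        if record["eventName"] == "INSERT":
            # NewImage holds the inserted item in DynamoDB's typed format.
            new_item = record["dynamodb"]["NewImage"]
            print(f"New item inserted: {new_item}")
            # e.g. send a welcome email or update an aggregate here.
```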
Local DynamoDB setup
For ease of development and integration testing, you can use the DynamoDB local distribution. It is a Java application and runs anywhere a Java Runtime Environment is installed.
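By default the local distribution listens on port 8000; a sketch of pointing boto3 at it (the credentials are placeholders, since local DynamoDB does not validate them):

```python
import boto3

# Point the SDK at the local distribution instead of the AWS endpoint.
dynamodb = boto3.client(
    "dynamodb",
    endpoint_url="http://localhost:8000",
    region_name="us-east-1",       # any region value works locally
    aws_access_key_id="local",     # placeholder credentials
    aws_secret_access_key="local",
)

print(dynamodb.list_tables())
```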
One last thing that I have not highlighted, but which is important: being part of the AWS cloud offering, DynamoDB can be integrated easily with AWS Athena for big data computation needs. You can also integrate it with Apache Spark or other big data computation engines.
I suggest you try DynamoDB for your NoSQL needs and see if it fits. AWS provides a generous free tier to start with.