AWS DynamoDB is a managed NoSQL key-value and document database service. It is a proprietary database created by AWS, and Amazon runs it behind its own e-commerce website, so its performance and scalability have been proven at scale.
I have used it in a high-volume data project that needed more than 7,000 writes per second and generated around 50 GB of data daily. Designing an application around it takes some effort, but it scales really well.
In this article, you will see a few good reasons to evaluate AWS DynamoDB when you are planning to use MongoDB, Cassandra or similar databases.
AWS DynamoDB is a managed service
Keeping a database running is not a small job. If the data is in terabytes and growing, you need a team of infrastructure engineers to carry out the following tasks:
- Architecture & design for a multi-region, multi-partition and redundant high-performance database.
- 24x7 monitoring of database node health.
- Database engine upgrade.
- OS upgrade.
- Regular disk and memory space planning, monitoring and implementation.
- Computational power planning, monitoring and implementation.
- Security audits and audit trails.
- Occasional database node maintenance and replacement.
If we use MongoDB or Cassandra and want to run the database with terabytes of data, we have to make sure an infrastructure team oversees all of the above tasks.
AWS DynamoDB, being a managed service, frees you from all of these tasks. You just create tables and start pouring in data.
This reduces the cost of database infrastructure management to nearly zero, which is one of its biggest selling points.
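To make "just create tables" concrete, here is a minimal sketch of the parameters a CreateTable call takes. The table name, key names and capacity values are made up for illustration, and the helper `build_create_table_request` is hypothetical; it only builds the request and does not contact AWS.

```python
def build_create_table_request(table_name, partition_key, sort_key=None,
                               read_units=5, write_units=5):
    """Build keyword arguments for the DynamoDB CreateTable operation.

    All key attributes are declared as strings ("S") to keep the sketch short.
    """
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical table for illustration only.
request = build_create_table_request("SensorReadings", "device_id", "recorded_at")

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("dynamodb").create_table(**request)
```

Only the key attributes are declared up front; every other attribute can vary from item to item.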
Even Petabytes of data is fine
AWS DynamoDB doesn't impose any limit on table size, so even petabytes of data can be handled with the same performance. All data is stored on solid-state drives (SSDs).
Easy read and write throughput management
AWS DynamoDB is a true cloud database. It provides the following options to manage read and write throughput elastically:
- Auto Scaling: you define the minimum and maximum read and write capacity of a table along with a target utilization percentage. When consumed throughput crosses that target, AWS automatically increases or decreases the provisioned capacity, and with it the number of partitions required to handle the new throughput. This reduces cost by keeping capacity in line with demand.
- Use a cron job to trigger the change in read and write throughput for the table, using AWS CLI commands in a script.
- Manually change the throughput from the management console.
A change in table throughput may change the number of partitions behind the table. AWS makes sure all of this happens without any downtime.
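As a rough sketch of the auto scaling option, the following builds the two requests that Application Auto Scaling expects for a table's write capacity: one registering the capacity range and one attaching a target-tracking policy. The helper name, the table name, the policy name and the 70% target are assumptions for illustration; nothing here calls AWS.

```python
def build_autoscaling_requests(table_name, min_units, max_units,
                               target_utilization=70.0):
    """Build requests for write-capacity auto scaling on one table."""
    common = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
    }
    # Step 1: the range within which capacity may move.
    scalable_target = dict(common, MinCapacity=min_units, MaxCapacity=max_units)
    # Step 2: keep consumed/provisioned write capacity near the target percentage.
    scaling_policy = dict(
        common,
        PolicyName=f"{table_name}-write-scaling",  # hypothetical policy name
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
            },
        },
    )
    return scalable_target, scaling_policy


target, policy = build_autoscaling_requests("Orders", min_units=5, max_units=100)

# With boto3 installed and credentials configured, these could be applied as:
#   import boto3
#   autoscaling = boto3.client("application-autoscaling")
#   autoscaling.register_scalable_target(**target)
#   autoscaling.put_scaling_policy(**policy)
```

Read capacity would need its own pair of requests with the corresponding read dimension and metric.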
Automatic data and traffic management
DynamoDB automatically manages the replication and partitioning of a table based on its data size. It continuously monitors the table size and, when required, spreads the table across a sufficient number of servers, which are replicated across multiple Availability Zones in a region. All of this happens without any downtime and without our involvement.
On-demand backup & recovery for the table
DynamoDB is designed for high durability: it replicates your data across multiple fault-tolerant Availability Zones within a region.
Keeping periodic backups of a table can still save face when the application corrupts data. Some companies also need backups for compliance. DynamoDB provides a simple backup and recovery mechanism through the admin console and the API. Backups are very fast and complete in seconds regardless of the size of the table, while the time a restore takes depends on the table.
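The API side of backup and recovery can be sketched as two small requests. The helpers below are hypothetical and only build the parameters; the timestamped naming scheme is an assumption, not a DynamoDB requirement.

```python
from datetime import datetime, timezone


def build_backup_request(table_name, now=None):
    """Build parameters for an on-demand backup with a timestamped name."""
    now = now or datetime.now(timezone.utc)
    backup_name = f"{table_name}-{now.strftime('%Y%m%dT%H%M%SZ')}"
    return {"TableName": table_name, "BackupName": backup_name}


def build_restore_request(target_table_name, backup_arn):
    """Build parameters for restoring a backup into a new table."""
    return {"TargetTableName": target_table_name, "BackupArn": backup_arn}


# With boto3 installed and credentials configured, these could be used as:
#   import boto3
#   client = boto3.client("dynamodb")
#   arn = client.create_backup(**build_backup_request("Orders"))["BackupDetails"]["BackupArn"]
#   client.restore_table_from_backup(**build_restore_request("Orders-restored", arn))
```

Note that a restore creates a new table rather than overwriting the existing one.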
Point in time recovery
DynamoDB provides point-in-time recovery, which lets you restore a table to any point in time within the last 5 weeks (35 days). It is available in addition to the on-demand backup and recovery feature.
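A small sketch of how an application might guard a point-in-time restore against the 35-day window before calling the API. The helper and its validation are illustrative assumptions and deliberately simplified; the service itself decides the exact earliest and latest restorable times.

```python
from datetime import datetime, timedelta, timezone

PITR_WINDOW = timedelta(days=35)  # recovery window mentioned in the text


def build_pitr_restore_request(source_table, target_table, restore_time, now=None):
    """Validate the restore time and build the restore parameters."""
    now = now or datetime.now(timezone.utc)
    if restore_time > now:
        raise ValueError("restore_time is in the future")
    if now - restore_time > PITR_WINDOW:
        raise ValueError("restore_time is older than the 35-day recovery window")
    return {
        "SourceTableName": source_table,
        "TargetTableName": target_table,
        "RestoreDateTime": restore_time,
    }


# With boto3 installed and credentials configured, usage could look like:
#   import boto3
#   client = boto3.client("dynamodb")
#   client.update_continuous_backups(
#       TableName="Orders",
#       PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
#   )
#   client.restore_table_to_point_in_time(**request)
```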
Multi-region global tables
AWS DynamoDB automatically syncs data between multiple regions for global tables. You just need to specify the regions in which you want the table to be available. Without global tables, you would have to do this on your own by running code that copies data to multiple regions.
It is really helpful if an application needs multi-region replication for performance reasons.
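Specifying the regions can be sketched as one UpdateTable request per replica region. This assumes the current version of global tables, where replicas are added through `ReplicaUpdates`, and an existing table that meets the prerequisites for replication; the helper name and region list are hypothetical.

```python
def build_add_replica_requests(table_name, regions):
    """Build one UpdateTable request per region to add as a replica."""
    return [
        {
            "TableName": table_name,
            "ReplicaUpdates": [{"Create": {"RegionName": region}}],
        }
        for region in regions
    ]


requests = build_add_replica_requests("Orders", ["eu-west-1", "ap-south-1"])

# With boto3 installed and credentials configured, each request could be sent
# in turn, waiting for the table to become ACTIVE between calls:
#   import boto3
#   client = boto3.client("dynamodb")
#   for request in requests:
#       client.update_table(**request)
```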
Inbuilt in-memory caching service DAX (DynamoDB Accelerator)
Caching improves performance dramatically and cuts the load that read queries put on the database engine.
DynamoDB Accelerator (DAX) is an optional caching layer that you can set up with a few clicks. DAX is a cache layer built specifically to work with DynamoDB. You can use it instead of ElastiCache or self-hosted Redis because of how well it performs alongside DynamoDB.
DynamoDB typically answers read queries in single-digit milliseconds; with DAX, cached reads improve further and can return in microseconds.
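To illustrate why a cache in front of the table cuts read load, here is a toy read-through cache with a time-to-live. It is not DAX's implementation or client API, only a sketch of the general pattern; the class name, the 300-second TTL and the fake loader are assumptions.

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory and fall back to a loader on a miss."""

    def __init__(self, load_item, ttl_seconds=300.0, clock=time.monotonic):
        self._load_item = load_item  # function that reads from the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, item)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database read
        item = self._load_item(key)  # cache miss or expired entry
        self._entries[key] = (now + self._ttl, item)
        return item


database_reads = []


def fake_database_read(key):
    # Stand-in for a real table read; records how often it is called.
    database_reads.append(key)
    return {"user_id": key}


cache = ReadThroughCache(fake_database_read)
cache.get("u1")
cache.get("u1")  # served from memory, so database_reads still has one entry
```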
Encryption at rest
DynamoDB requests and responses are HTTP based, like those of many other NoSQL databases. Encryption at rest adds an extra layer of security that protects data against unauthorised access to the underlying storage, and it is sometimes required for compliance. It uses 256-bit AES encryption and encrypts table data as well as indexes. It works seamlessly with AWS Key Management Service (KMS) for managing the encryption key.
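For reference, a KMS-backed encryption setting can be expressed as a small `SSESpecification` structure on the table. The helper and the key alias below are hypothetical; leaving the key out is assumed to fall back to the AWS managed key for DynamoDB.

```python
def build_sse_specification(kms_key_id=None):
    """Build the SSESpecification structure for KMS-backed encryption at rest."""
    specification = {"Enabled": True, "SSEType": "KMS"}
    if kms_key_id is not None:
        # Use a specific customer managed key instead of the AWS managed key.
        specification["KMSMasterKeyId"] = kms_key_id
    return specification


sse = build_sse_specification("alias/my-table-key")  # hypothetical key alias

# With boto3 installed and credentials configured, it could be applied as:
#   import boto3
#   boto3.client("dynamodb").update_table(TableName="Orders", SSESpecification=sse)
```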
Document and key-value item storage
DynamoDB can store JSON documents or key-value items in a table.
Like other NoSQL document databases, DynamoDB is schema-less. The primary key (a partition key and, optionally, a sort key) is the only mandatory part of an item.
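At the API level, a JSON document is sent as typed attribute values such as `{"S": ...}` for strings and `{"M": ...}` for nested maps. The converter below is a hand-written sketch of that mapping for common Python types; boto3 ships its own `TypeSerializer` for real use, and the sample item is made up.

```python
def to_attribute_value(value):
    """Convert a plain Python value into DynamoDB's typed attribute format."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # must come before the number check: bool is an int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings on the wire
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(element) for element in value]}
    if isinstance(value, dict):
        return {"M": {key: to_attribute_value(val) for key, val in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


document = {
    "user_id": "u1",                  # the key attribute
    "age": 30,
    "active": True,
    "address": {"city": "Pune"},      # nested document
    "tags": ["admin", "beta"],
}
item = to_attribute_value(document)["M"]  # top-level attributes of the item
```

Apart from `user_id`, none of these attributes has to exist on other items in the same table.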
Eventual and Immediate consistency
You can read data in two consistency modes in DynamoDB. The mode is chosen per read request rather than per table.
Eventual consistency: the cheaper, default option, where a query may or may not return the latest version of an item.
Immediate (strong) consistency: for applications that need query results to always reflect the latest items.
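The cost difference can be worked out from the read capacity unit rule: one unit covers one strongly consistent read per second, or two eventually consistent reads per second, for an item of up to 4 KB. The function below is a sketch of that arithmetic; its name is hypothetical.

```python
import math


def read_capacity_units(item_size_bytes, strongly_consistent):
    """Capacity units consumed by a single read of one item."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))  # 4 KB per unit
    # An eventually consistent read costs half of a strongly consistent one.
    return blocks if strongly_consistent else blocks / 2


strong_cost = read_capacity_units(6000, strongly_consistent=True)
eventual_cost = read_capacity_units(6000, strongly_consistent=False)

# In the API, consistency is requested per call, for example with boto3:
#   client.get_item(TableName="Orders", Key=key, ConsistentRead=True)
```

For a 6,000-byte item this gives 2 units for a strongly consistent read and 1 unit for an eventually consistent one.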
Time to live items
This is one of the power features of DynamoDB; it enables many use cases that would otherwise require custom application code. You can have items deleted automatically by a sweeper after a certain amount of time.
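In practice, the sweeper looks at a numeric attribute holding an expiry time in epoch seconds. A minimal sketch, assuming an attribute named `expires_at` (any name can be chosen) and hypothetical helper names:

```python
import time


def with_expiry(item, ttl_seconds, now=None, attribute="expires_at"):
    """Return a copy of the item stamped with an expiry time in epoch seconds."""
    now = time.time() if now is None else now
    expiring_item = dict(item)
    expiring_item[attribute] = int(now) + int(ttl_seconds)
    return expiring_item


def build_enable_ttl_request(table_name, attribute="expires_at"):
    """Build the request that tells the table which attribute holds the expiry."""
    return {
        "TableName": table_name,
        "TimeToLiveSpecification": {"Enabled": True, "AttributeName": attribute},
    }


session = with_expiry({"session_id": "abc"}, ttl_seconds=3600, now=1_700_000_000)

# With boto3 installed and credentials configured, TTL could be enabled with:
#   import boto3
#   boto3.client("dynamodb").update_time_to_live(**build_enable_ttl_request("Sessions"))
```

Deletion happens in the background some time after the expiry passes, so reads may still need to filter out expired items.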
DynamoDB Streams is another powerful feature, which enables execution of an AWS Lambda function when an item is created, updated or deleted. Streams are similar to AWS Kinesis streams, and you can use them for many use cases. For example, you can create your own data pipeline that builds aggregated records such as averages and sums, or send an email when a new user record is inserted.
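The new-user email case could start from a Lambda handler like the sketch below. It assumes the stream is configured to include the new item image and that user items carry a string attribute named `email`; both are assumptions, and actually sending the email is left out.

```python
def handler(event, context):
    """Collect the email addresses of newly inserted user items."""
    new_emails = []
    for record in event.get("Records", []):
        # eventName is INSERT, MODIFY or REMOVE; only new items matter here.
        if record.get("eventName") != "INSERT":
            continue
        new_image = record.get("dynamodb", {}).get("NewImage", {})
        email = new_image.get("email", {}).get("S")  # typed attribute value
        if email:
            new_emails.append(email)
    return new_emails


# A trimmed-down sample event for local testing.
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"email": {"S": "a@example.com"}}}},
        {"eventName": "MODIFY",
         "dynamodb": {"NewImage": {"email": {"S": "b@example.com"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"user_id": {"S": "u3"}}}},
    ]
}
emails = handler(sample_event, None)
```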
Local DynamoDB setup
For ease of development and integration testing, you can use the DynamoDB Local distribution. It is a Java application and can run wherever a Java Runtime Environment is installed.
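Pointing application code at the local instance mostly comes down to overriding the endpoint. A minimal sketch, assuming DynamoDB Local listens on its default port 8000 on the same machine; the region and the dummy credentials are placeholders, and the helper name is hypothetical.

```python
def local_client_kwargs(port=8000):
    """Keyword arguments for a client that talks to DynamoDB Local."""
    return {
        "endpoint_url": f"http://localhost:{port}",
        "region_name": "us-west-2",         # placeholder region
        "aws_access_key_id": "dummy",       # placeholder, not a real credential
        "aws_secret_access_key": "dummy",
    }


kwargs = local_client_kwargs()

# With boto3 installed and DynamoDB Local running, a client could be created as:
#   import boto3
#   client = boto3.client("dynamodb", **kwargs)
#   client.list_tables()
```

Keeping the endpoint in one helper makes it easy to switch integration tests between the local instance and a real AWS region.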
One last thing that I have not highlighted but is important: being part of the AWS cloud offering, DynamoDB can easily integrate with Amazon Athena for big data computation needs. You can also integrate it with Apache Spark or other big data computation engines.
I suggest you try DynamoDB for your NoSQL needs and see if it fits. AWS provides a generous free tier to start with.