Choosing the Right Database for AWS Workloads

By Jay Smith / on 01 Sep, 2023

When choosing the right database for your AWS workloads, the first step is to clearly define your business objectives and overall migration strategy. Will you re-architect, re-platform, or simply re-host? Your objectives will guide the degree of freedom in choosing a target AWS database.

For example, a “lift and shift” strategy keeps changes minimal by migrating an on-premises Oracle database to Oracle on EC2. In contrast, you may opt for more flexibility by refactoring to use purpose-built, cloud-optimized AWS databases like Amazon Aurora or DynamoDB. Outlining your goals and approach upfront will pave the way for an effective database decision.

Comparing Database Types

When choosing a database for AWS, a key decision is whether to use a relational or non-relational database.

Relational databases, such as the engines offered through Amazon RDS, are best suited for traditional, structured data organized into tables, rows, and columns. They support complex SQL queries and joins across multiple tables.

In contrast, non-relational databases are optimized for specific data models:

  • Key-value databases like Amazon DynamoDB excel at high read/write throughput for simple operations on large datasets. They are great for high-traffic web apps.
  • In-memory databases like Amazon ElastiCache provide microsecond latency and are ideal for caching layers.
  • Document databases like Amazon DocumentDB store semi-structured JSON data perfect for catalogs and user profiles.
  • Graph databases like Amazon Neptune efficiently store and navigate relationships, great for social networks.
  • Time series databases like Amazon Timestream optimize storage and analysis of time series data from IoT devices.

AWS offers purpose-built databases for each model above. Understanding the core capabilities of each database type will allow you to match them to your application’s specific data and use case.

Rather than shoehorning data into a one-size-fits-all relational database, evaluate both SQL and NoSQL options to find the purpose-built database that best fits your needs.
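
The pairings listed in this section can be captured as a small lookup table. This is only a sketch: the `DATA_MODEL_TO_SERVICE` table and the `suggest_service` helper are illustrative names, not part of any AWS SDK, and the table simply restates the examples given above.

```python
# Illustrative lookup: data model -> example AWS service named in this article.
DATA_MODEL_TO_SERVICE = {
    "relational": "Amazon RDS",
    "key-value": "Amazon DynamoDB",
    "in-memory": "Amazon ElastiCache",
    "document": "Amazon DocumentDB",
    "graph": "Amazon Neptune",
    "time-series": "Amazon Timestream",
}


def suggest_service(data_model: str) -> str:
    """Return the example service for a data model, or raise if it is unknown."""
    # Normalize input such as "Time Series" to the "time-series" key form.
    key = data_model.strip().lower().replace(" ", "-")
    try:
        return DATA_MODEL_TO_SERVICE[key]
    except KeyError:
        known = ", ".join(sorted(DATA_MODEL_TO_SERVICE))
        raise ValueError(
            f"Unknown data model {data_model!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(suggest_service("graph"))        # Amazon Neptune
    print(suggest_service("Time Series"))  # Amazon Timestream
```

A real selection involves more than the data model, so treat a table like this as a starting point for the evaluation in the sections that follow.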

Data Considerations

When selecting a database, you need to evaluate your data and application needs. Two key factors are consistency vs availability, and structured vs unstructured data.

Consistency vs Availability

Some applications need strong consistency - all users see the same up-to-date data immediately. Others can tolerate eventual consistency for better availability and partition tolerance.

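Relational databases typically favor strong consistency through ACID transactions, while many non-relational databases let you trade some consistency for availability or lower latency. Identify which reads and writes in your application truly need strong consistency.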
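
To make the trade-off concrete, here is a toy model of a store with one leader and one lagging follower. It is a teaching sketch under simplifying assumptions, not a description of how any AWS database is implemented; the `ToyReplicatedStore` class and its methods are hypothetical.

```python
class ToyReplicatedStore:
    """Toy model: one leader, plus one follower that lags until sync() runs."""

    def __init__(self):
        self._leader = {}
        self._follower = {}

    def write(self, key, value):
        # Writes land on the leader first; the follower is updated later.
        self._leader[key] = value

    def sync(self):
        # Simulates asynchronous replication catching up.
        self._follower = dict(self._leader)

    def read(self, key, consistent=False):
        # A strongly consistent read goes to the leader;
        # an eventually consistent read may hit a stale follower.
        source = self._leader if consistent else self._follower
        return source.get(key)


if __name__ == "__main__":
    store = ToyReplicatedStore()
    store.write("order-42", "PAID")
    print(store.read("order-42", consistent=True))  # PAID
    print(store.read("order-42"))                   # None (follower is stale)
    store.sync()
    print(store.read("order-42"))                   # PAID
```

An order-status page would likely want the consistent read, while a product-view counter could tolerate the stale one. DynamoDB, for instance, lets you choose between eventually consistent and strongly consistent reads per request.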

Structured vs Unstructured Data

Structured data, such as customer or product records, follows a predefined schema. Unstructured data, such as social posts, has no fixed format.

  • Relational databases excel at structured data with fixed rows and columns.
  • Document databases like Amazon DocumentDB work great for semi-structured JSON data.
  • Key-value databases like Amazon DynamoDB store schemaless items, so attributes can vary from item to item as long as each item has its key.

Also consider whether you need to join data across tables or collections. Relational databases handle complex SQL queries and JOINs far better than most non-relational options.

Analyzing your data characteristics will help narrow the choices. Structured data with complex querying needs may point to a relational database, while flexible schemas and simple access patterns work well with NoSQL.
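
The difference in access patterns can be sketched with Python's built-in `sqlite3` module standing in for a relational database and a plain dictionary standing in for a key-value store. The table names, keys, and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 75.0), (12, 2, 40.0);
    """
)

# Relational access: one query joins and aggregates across both tables.
totals = conn.execute(
    """
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

# Key-value access: each item is fetched by its key. There is no join,
# so related data is usually stored together inside the item.
profiles = {"customer#1": {"name": "Ada", "order_totals": [25.0, 75.0]}}
ada = profiles["customer#1"]

print(totals)                    # [('Ada', 100.0), ('Grace', 40.0)]
print(sum(ada["order_totals"]))  # 100.0
```

If most of your queries look like the first one, a relational database is the natural fit. If they look like the second, a key-value or document database may serve you with less overhead.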

Operational Considerations

When selecting a database, you need to determine the level of operational management vs customization you need.

Managed vs Self-Managed

AWS offers fully managed databases like Amazon Aurora and Amazon DynamoDB that handle provisioning, OS patching, backups, recovery, scaling, security, and more.

Alternatively, you can deploy open source or commercial databases on self-managed EC2 instances. This allows full OS and database access for customization, but requires you to handle all ops tasks.

Amazon RDS Custom provides a middle ground: a managed database service that still gives you access to the underlying EC2 instance and database environment.

Evaluate whether you need full control vs delegating database ops to AWS. Managed services reduce overhead substantially.

Customization

Sometimes self-managed databases are needed to use specific versions or features not yet supported by managed services.

Amazon RDS Custom allows accessing the underlying EC2 instance while still getting core managed services like provisioning, backups, and monitoring.

Understand your customization needs upfront. If not required, leverage managed services to reduce administrative burden.

Reliability Considerations

Availability and disaster recovery should be top priorities for any production database.

AWS managed databases are designed for high availability (HA), either by default or as a configuration option. For example:

  • Amazon Aurora stores six copies of your data across 3 Availability Zones and fails over automatically to a replica, typically in under a minute.
  • Amazon DynamoDB replicates data across 3 Availability Zones in a Region by default.
  • Amazon RDS supports Multi-AZ deployments for automatic failover.

In contrast, self-managed databases require you to architect and implement a reliable HA and disaster recovery (DR) solution.

Unless you need specific customization, managed databases make achieving enterprise-grade reliability much simpler. Their HA and DR capabilities run automatically without extensive ops work.

Evaluate if you can leverage the built-in HA/DR of managed services, or if you are ready to architect and manage it yourself with self-managed databases.

Performance Considerations

When choosing a database, you need to evaluate performance requirements like scalability and low latency.

Scalability

Many applications need to handle large, variable workloads. Managed databases like Amazon Aurora and DynamoDB are designed to scale seamlessly.

Relational databases can scale vertically by moving to larger instance types, and horizontally for read traffic by adding read replicas.

Non-relational databases like DynamoDB provide virtually unlimited, automated scaling.

If your workload varies or could grow suddenly, choose a database with excellent scalability to avoid issues down the road.

Latency

Applications like gaming, IoT, and financial trading need very low latency.

In-memory databases like Amazon ElastiCache and Amazon MemoryDB keep data in RAM, which enables microsecond read latency.

Other databases like Aurora and DynamoDB optimize performance through caching, lean designs, and automated scaling.

Understand latency needs before choosing a database. In-memory databases are ideal for the lowest latency, while other databases offer great performance for mainstream apps.

Security Considerations

When evaluating databases, you need to analyze the security risks and capabilities.

AWS managed databases support encryption at rest and in transit. They integrate with AWS Identity and Access Management (IAM) to control who can manage and access them.

However, certain data may require extra precautions:

  • Regulated data like healthcare records may need tighter access controls or added encryption.
  • Applications processing sensitive information need to prevent and monitor for data leakage.

Review security features like encryption schemes, network isolation, audit logging, user access controls, and compliance certifications. Ensure the database offers the security capabilities matching your data sensitivity and compliance needs.

Lean on IAM, data encryption, security groups, VPCs, and other AWS security services to enhance database protection. Consult with your security team on any regulated data or high risks early in your evaluation process.

Evaluating Requirements

Choosing the ideal database involves thoroughly evaluating your requirements in areas like:

  • Storage solutions - Do you need a data warehouse, caching layer, search engine, or time series data store? Match the storage model to your data and access patterns.
  • Technology stack - Review programming languages, frameworks, and other services that must integrate with the database. Ensure the database works smoothly with your tech stack.
  • Use cases - Databases excel in certain use cases. Key-value databases like DynamoDB are great for high-scale web apps, while graph databases like Neptune optimize relationship-based data.
  • Consistency needs - Determine if you need strong consistency or if eventual consistency works for your architecture. Relational databases often provide stronger consistency.
  • Data structure - Understand if you need to store structured, unstructured, or semi-structured data. This impacts whether a relational or non-relational database makes more sense.
  • Performance needs - Do you require microsecond or single-digit millisecond response times, or can you tolerate longer delays? In-memory databases deliver the lowest latency.
  • Scalability needs - Analyze how data storage and traffic may grow over time. Prioritize databases that scale smoothly like DynamoDB.

Thoroughly evaluating all technical and business requirements will point you to the ideal database choice and prevent surprises down the road.

Use Cases

Understanding common database use cases can help guide your decision. Some key examples:

eCommerce

  • Need to support heavy reads and writes at scale
  • Require strong consistency for order transactions
  • Must store structured data like customers, products, orders
  • Should be highly available across regions

For core commerce data, relational databases like Amazon Aurora provide ACID transactions and scale well. DynamoDB can store flexible product info.

IoT Applications

  • Need to ingest high volumes of time series data
  • Must scale massively as devices grow
  • Should integrate edge locations with cloud
  • Require flexibility for semi-structured data

Time series databases like Amazon Timestream efficiently store and analyze time series data. DynamoDB handles flexible data at scale from millions of devices.

Real-Time Applications

  • Need very low latency response times
  • Must support high read and write throughput
  • Require scalability for spiky workloads
  • Should be highly available

In-memory databases like Amazon ElastiCache deliver microsecond latency. DynamoDB provides single-digit millisecond latency at any scale.

Enterprise Applications

  • Need to support complex business processes and reporting
  • Require strong security and access controls
  • Must provide high availability and disaster recovery
  • Should integrate with existing systems

Relational databases like Amazon Aurora allow scaling business systems while maintaining correctness and controls.

Mobile/Web Applications

  • Need to store flexible structured, semi-structured and unstructured data
  • Require scalability and elasticity for growth
  • Should provide high availability across regions
  • Must have security controls for public-facing apps

Use relational databases for structured data and DynamoDB for flexible schemas. Leverage HA, security and scalability of AWS databases.

Making a Database Decision and Migration Plan

With your requirements gathered and options evaluated, you can make an informed database decision.

Decision Factors

Consider factors like:

  • Data characteristics - Structure, schema, and access patterns
  • Performance needs - Latency, throughput, scalability
  • Reliability requirements - Availability, DR, backup/recovery
  • Operational overhead - Managed vs self-managed
  • Security and compliance needs - Encryption, access control, regulatory requirements
  • Customization needs - OS, database engine, tooling access
  • Cost - Licensing, operational overhead
  • Use case fit - Purpose-built databases aligned to your workload

Match these needs to database capabilities to find the best fit. You may also choose multiple databases for different workloads rather than a one-size-fits-all approach.
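
One way to make this matching explicit is a weighted scoring matrix. The sketch below is illustrative only: the weights and the 1-5 scores are invented placeholders, not an assessment of any service, and you would replace them with values agreed on by your own team.

```python
# Hypothetical weights (importance) and 1-5 scores, for illustration only.
WEIGHTS = {
    "data_fit": 3,
    "performance": 2,
    "reliability": 2,
    "ops_overhead": 2,  # higher score = less operational work
    "cost": 1,
}

CANDIDATES = {
    "Amazon Aurora": {
        "data_fit": 5, "performance": 4, "reliability": 5, "ops_overhead": 4, "cost": 3,
    },
    "Amazon DynamoDB": {
        "data_fit": 3, "performance": 5, "reliability": 5, "ops_overhead": 5, "cost": 4,
    },
    "Self-managed on EC2": {
        "data_fit": 5, "performance": 4, "reliability": 3, "ops_overhead": 1, "cost": 2,
    },
}


def weighted_score(scores, weights):
    """Sum of weight * score over every factor in `weights`."""
    return sum(weights[factor] * scores[factor] for factor in weights)


def rank(candidates, weights):
    """Return (name, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(s, weights)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank(CANDIDATES, WEIGHTS):
        print(f"{name}: {score}")
```

Running the scoring per workload, rather than once for the whole organization, fits the idea of choosing different databases for different workloads.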

Migration Planning

Once you select a target database, decide on a migration strategy:

  • Re-host - Lift and shift existing databases to minimize changes
  • Re-platform - Migrate to cloud-optimized databases like Amazon Aurora
  • Re-architect - Refactor to microservices with purpose-built databases

Then determine a migration method like:

  • AWS Database Migration Service (DMS) - Fast, automated service for homogeneous or heterogeneous migrations
  • DMS with AWS Schema Conversion Tool - For converting schemas between different database types
  • AWS Database Freedom Program - Expert guidance and resources for migrations
  • Professional services - Assistance from AWS or partners for full-scope projects

With the right database choice and well-planned migration, you can optimize your workloads on AWS. Revisit decisions periodically as needs evolve.

Looking for help with AWS Databases or other advanced cloud technologies? The IT professionals at God Particle IT Group have the skills and experience to architect, build, and manage complex systems at scale. We specialize in cloud platforms like AWS and can provide enterprise-level support to develop and operate DynamoDB-based applications. Whether you need assistance with design, implementation, optimization, or managed services, contact us to see how we can help launch your next innovation using DynamoDB. With deep expertise across today’s leading technologies, God Particle IT Group offers responsive, high-touch services to help you innovate faster.