This is the fifth article in a series where I write about everything I’m learning about the AWS cloud. The last article was on auto scaling. In this post, I’ll talk about RDS, Amazon’s Relational Database Service.
RDS Databases
AWS Relational Database Service (RDS) is a collection of managed services that let you set up and operate SQL databases in the AWS cloud. RDS supports the following database engines:
- MySQL
- Aurora (AWS proprietary DB engine)
- MariaDB
- Oracle
- Microsoft SQL Server
- PostgreSQL
RDS is a managed database service that offers:
- automated provisioning and OS patching
- continuous backups and restore to specific timestamps (Point in Time Restore)
- monitoring dashboards
- read replicas for improved read performance
- multi Availability Zone setup for disaster recovery
- maintenance windows for upgrades
- scaling capability
- storage is backed by EBS (gp2 or io1)
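As an illustration of Point in Time Restore, here is a minimal sketch that builds the request for boto3's `restore_db_instance_to_point_in_time` call. The helper names `build_pitr_request` and `restore` and the instance identifiers are hypothetical, and the sketch assumes boto3 is installed and AWS credentials are configured before `restore` is called. Note that a point-in-time restore creates a new DB instance rather than overwriting the source.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_pitr_request(
    source_id: str,
    target_id: str,
    restore_time: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for restore_db_instance_to_point_in_time.

    With no restore_time, restore to the latest restorable time.
    """
    params: Dict[str, Any] = {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,  # the restore creates a NEW instance
    }
    if restore_time is None:
        params["UseLatestRestorableTime"] = True
    else:
        if restore_time.tzinfo is None:
            raise ValueError("restore_time must be timezone-aware")
        # Normalise to UTC so the timestamp is unambiguous.
        params["RestoreTime"] = restore_time.astimezone(timezone.utc)
    return params


def restore(params: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request. Requires boto3 (third-party) and AWS credentials."""
    import boto3  # imported lazily so the builder above works without it

    return boto3.client("rds").restore_db_instance_to_point_in_time(**params)
```

For example, `build_pitr_request("prod-db", "prod-db-restored")` (hypothetical identifiers) asks for a restore to the most recent restorable point.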
The downside of using RDS is that you cannot SSH into the DB instances you launch.
Scaling and Disaster Recovery
Because RDS is a managed service, AWS makes it straightforward to scale your database so it keeps up with the growing demands of your applications. RDS instances can be scaled vertically or horizontally. Vertical scaling means adding storage or compute capacity to an existing RDS instance. Horizontal scaling means adding more RDS instances, in practice read replicas that serve read traffic.
RDS Read Replicas
To scale horizontally, you can create read replicas of your database. Read replicas are read-only copies of a DB instance. You can reduce the load on your primary DB instance by routing read queries from your applications to a read replica. This allows you to scale beyond the capacity constraints of a single database for read-heavy workloads. AWS allows you to create up to 15 read replicas in the same AZ, across multiple AZs or in multiple regions.
When you add a new read replica, RDS makes the source database instance the primary instance. When you make updates to the primary database, RDS copies the updates asynchronously to the read replica. Clients have read/write access to the primary database and read-only access to the read replica.
Each read replica has its own endpoint, so applications must update their connection strings to send read queries to the replicas. AWS does not charge you for replication traffic between a DB instance and its read replicas in the same region.
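To make the connection-string point concrete, here is a minimal sketch of application-side read/write splitting. The `EndpointRouter` class and the endpoint hostnames are hypothetical, and the "starts with SELECT" check is a deliberate simplification rather than a robust SQL classifier.

```python
import itertools
from dataclasses import dataclass, field
from typing import List


@dataclass
class EndpointRouter:
    """Pick the primary endpoint for writes and a replica endpoint for reads."""

    primary: str
    replicas: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Round-robin over the replicas; with no replicas, reads go to the primary.
        self._read_cycle = itertools.cycle(self.replicas or [self.primary])

    def endpoint_for(self, sql: str) -> str:
        """Return the endpoint that should serve this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


if __name__ == "__main__":
    router = EndpointRouter(
        primary="mydb.example.internal",  # hypothetical hostnames
        replicas=["mydb-replica-1.example.internal", "mydb-replica-2.example.internal"],
    )
    print(router.endpoint_for("SELECT * FROM orders"))
    print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Because replication is asynchronous, a read sent to a replica can briefly return stale data. A query that must see the latest write should go to the primary.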
Read replicas — Use cases
Deploying a database with read replicas makes sense for a number of reasons:
- Implementing disaster recovery. A read replica can be promoted to a standalone instance if the primary DB instance fails.
- Serving two or more applications that need the same data, e.g. the main production application and an analytics/reporting app. The reporting app doesn’t write data to the DB, but its queries could slow down the main production DB if it ran them there. Pointing the reporting app at a read replica avoids this.
RDS Multi AZ (Disaster Recovery)
RDS Multi-AZ is used to increase redundancy, not for scaling. Multi-AZ creates one or more standby databases in separate Availability Zones for automatic failover if the main DB instance fails. Changes to the main instance are replicated synchronously to the standby. No manual intervention is needed to switch from the main DB to a standby instance. The change happens automatically because RDS uses a single DNS name for the primary DB and its standby instances, so applications keep connecting to the same endpoint.
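The single-DNS-name idea can be pictured with a toy model. This is not how RDS is implemented internally; `MultiAZDeployment`, the hostname and the AZ labels are hypothetical names used only to show why clients need no reconfiguration after a failover.

```python
class MultiAZDeployment:
    """Toy model: one stable DNS name that always resolves to the current primary."""

    def __init__(self, dns_name: str, primary_az: str, standby_az: str) -> None:
        self.dns_name = dns_name
        self.primary_az = primary_az
        self.standby_az = standby_az

    def resolve(self, name: str) -> str:
        """Return the AZ currently serving the given DNS name."""
        if name != self.dns_name:
            raise KeyError(name)
        return self.primary_az

    def fail_over(self) -> None:
        """Promote the standby; the DNS name itself does not change."""
        self.primary_az, self.standby_az = self.standby_az, self.primary_az


if __name__ == "__main__":
    db = MultiAZDeployment("mydb.example.internal", "az-a", "az-b")
    print(db.resolve("mydb.example.internal"))  # served from the primary's AZ
    db.fail_over()
    print(db.resolve("mydb.example.internal"))  # same name, now the former standby
```

The application only ever knows the DNS name, which is why a failover does not require a connection string change.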
AWS Aurora
AWS Aurora is a proprietary Amazon Database technology built and optimised for the AWS cloud. Aurora is compatible with MySQL and PostgreSQL. Aurora offers a number of advantages over traditional databases, including:
- Performance: AWS claims Aurora delivers up to five times the throughput of standard MySQL and up to three times that of standard PostgreSQL.
- Scalability: Aurora storage grows automatically in increments of 10 GB and can scale up to 128 tebibytes (128 TiB) per database cluster.
- Replication: Aurora supports up to 15 read replicas, and replication is faster than in MySQL (sub-10 ms replica lag).
- Availability: Failover is fast, typically completing within about a minute, and high availability is built in.
- Cost: Aurora costs more than standard RDS (roughly 20% more) but is more efficient.
Aurora and High Availability
Aurora is designed for high availability and offers a number of features to help keep your database up and running, even in the event of a failure. These features include:
- Multi-AZ deployment: Aurora can be deployed across multiple Availability Zones in a single AWS region. If one instance fails, your DB remains available in the other Availability Zones.
- Read replicas: Read replicas can be used to offload read traffic from your main DB instance and can also serve as failover targets if the main instance fails.
- Global database: For globally distributed applications, Aurora offers Global Database, an Aurora DB with multi-region replication and replication lag that is typically under one second.
- Fault-tolerant and self-healing storage: Aurora uses auto-expanding storage volumes, stores six copies of your data across three AZs, and automatically scans data for errors and repairs them.
RDS Security
- At-rest encryption:
  - The primary and its replicas are encrypted with a KMS key, which must be specified when the DB is created.
  - A read replica cannot be encrypted if the primary isn’t.
  - To encrypt an existing unencrypted DB, take a DB snapshot, copy the snapshot with encryption enabled, and restore a new DB instance from the encrypted copy.
- In-flight encryption: DB instances are TLS-ready. You can use the AWS TLS root certificates on the client side to secure communication between clients and DB instances.
- IAM authentication: IAM users and roles can connect to the DB with a short-lived authentication token instead of a password.
- Security Groups can be used to control network access to your RDS/Aurora DB.
- Audit logs can be retained by sending them to CloudWatch Logs.
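The snapshot route to encrypting an existing DB can be sketched as an ordered plan of RDS API operations. Each operation name below matches a boto3 RDS client method, but the `plan_encryption_migration` helper, the snapshot naming scheme and the KMS key alias are hypothetical. The encryption itself happens in the copy step, where a KMS key is supplied. This sketch only builds the plan. A real run would also need to wait for each snapshot to become available before starting the next step.

```python
from typing import Any, Dict, List, Tuple

Step = Tuple[str, Dict[str, Any]]


def plan_encryption_migration(db_id: str, kms_key_id: str) -> List[Step]:
    """Return (operation, parameters) pairs for encrypting an existing DB."""
    plain_snapshot = f"{db_id}-snapshot"            # hypothetical naming scheme
    encrypted_snapshot = f"{db_id}-snapshot-encrypted"
    return [
        # 1. Snapshot the unencrypted instance.
        ("create_db_snapshot", {
            "DBInstanceIdentifier": db_id,
            "DBSnapshotIdentifier": plain_snapshot,
        }),
        # 2. Copy the snapshot; passing a KMS key makes the copy encrypted.
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": plain_snapshot,
            "TargetDBSnapshotIdentifier": encrypted_snapshot,
            "KmsKeyId": kms_key_id,
        }),
        # 3. Restore a new, encrypted instance from the encrypted copy.
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": f"{db_id}-encrypted",
            "DBSnapshotIdentifier": encrypted_snapshot,
        }),
    ]


if __name__ == "__main__":
    for operation, params in plan_encryption_migration("prod-db", "alias/my-rds-key"):
        print(operation, params)
```

After the restore, applications have to be pointed at the new instance, since it gets its own endpoint.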
RDS Proxy
RDS Proxy is a managed database proxy service for RDS that allows applications to pool and share connections to the database. Sharing connections improves DB efficiency by reducing the number of open connections to the DB and helps avoid timeouts. RDS Proxy is serverless, autoscaling and highly available, and it can reduce failover time by up to 66%.
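The benefit of pooling can be shown with a small in-process sketch. This is a simplified illustration of connection sharing in general, not RDS Proxy's actual implementation. The `ConnectionPool` class and the `fake_connect` factory are hypothetical stand-ins.

```python
import queue
from contextlib import contextmanager
from typing import Any, Callable, Iterator, List, Optional


class ConnectionPool:
    """Open a fixed number of connections once and lend them out on demand."""

    def __init__(self, connect: Callable[[], Any], max_connections: int) -> None:
        self._idle: "queue.Queue[Any]" = queue.Queue(maxsize=max_connections)
        for _ in range(max_connections):
            self._idle.put(connect())  # pay the connection cost up front

    @contextmanager
    def connection(self, timeout: Optional[float] = None) -> Iterator[Any]:
        """Borrow a connection, blocking until one is free, then return it."""
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back for the next caller


if __name__ == "__main__":
    opened: List[object] = []

    def fake_connect() -> object:
        """Stand-in for a real database driver's connect() call."""
        opened.append(object())
        return opened[-1]

    pool = ConnectionPool(fake_connect, max_connections=2)
    for _ in range(10):
        with pool.connection():
            pass  # run a query here
    print(f"10 requests served with {len(opened)} database connections")
```

Ten requests reuse the same two connections, so the database sees two open connections instead of ten.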
Conclusion
That’s all for relational databases. In the next article, I’ll discuss non-relational databases.