How to build a highly available, highly scalable AWS secure cloud? – PART I

Introduction

This architecture guide introduces best practices for building a highly available, scalable, manageable and secure web application on the Amazon Web Services (AWS) Cloud. It specifically addresses the requirements and concerns described in the Requirements section below, and should be valuable to any individual or team that understands basic networking concepts. This is a two-part series. Part 1 (this post) discusses the general requirements, introduces AWS-specific concepts, defines the series terminology, and lays a foundation of general design concepts. Part 2 will address the security aspects of the architecture and several ways to leverage the available AWS services; it will also consolidate all the information and sketch the final architecture. So, let’s get started.

Methodology

This guide is divided into two main sections that answer the following questions and suggest a reference architecture:

How can a highly available, scalable and manageable web application be built on the AWS Cloud?
How can the web application environment be secured on the AWS Cloud?

It covers high-level concepts; technical and configuration details are out of scope. The focus is on the core production environment and is restricted to the components that address the customer requirements. The reference architecture is based on the following assumptions:

The architecture follows a layout similar to a traditional web hosting infrastructure.
Cost estimation for the project is not considered.
The focus is on DevOps automation, to allow easy management and replication of multiple environments.
The default purchasing option for all AWS services is “On-Demand”, due to the uncertainty of the workload.
Amazon Aurora is available in the region chosen for the architecture.

Requirements

Customer requirements can be logically grouped into six major criteria. The unique ID assigned to each requirement below will be referenced throughout the document:

High-availability (HA)

HA1. Provision for disaster recovery
HA2. Effective distribution of load
HA3. A self-healing infrastructure that recovers from failed service instances

High-scalability (HS)

HS1. Infrastructure scaling to meet the demand (load-based scaling)
HS2. On-demand services, given the uncertainty of the workload

Low-latency (LL)

LL1. Reduce latency of web application

Archival strategy (AS)

AS1. An archival strategy is needed for objects inactive for more than 6 months

Convenience and control (CC)

CC1. Capability to configure database and data access layer
CC2. Capability to effortlessly manage and replicate multiple environments based on blueprint architecture.

Security (S)

S1. Securing data at rest and in transit
S2. Manage access control as the user base expands

Architecture Design
This section addresses the high-availability, high-scalability, low-latency, and management requirements. The present environment is based on a LAMP stack (PHP and MySQL), which should be implemented as a multi-tier architecture as per best practices. The basic three-tier architecture consists of a Client Tier, a Web Tier and a Database Tier. Conceptually, however, our architecture will have a DMZ Tier, a Web Tier, a Database Tier, and an Ancillary Tier.
The AWS OpsWorks service offers control and convenience when implementing a LAMP stack as a multi-tier architecture. It is a configuration management service that offers Chef recipes (blueprints) and infrastructure orchestration, and also introduces “point and click” deployment alternatives. OpsWorks features can be mapped to the customer requirements as follows:

HS1 – The automatic instance scaling feature adds instances based on defined criteria (load-based or time-based) and is implemented using Auto Scaling groups.
HA2 – The ELB service layer distributes load effectively by performing health checks on the underlying EC2 instances.
CC2 – The OpsWorks management console allows a complete stack to be replicated in distinct Availability Zones or regions with little effort.
CC1 – The RDS service layer provides the necessary level of abstraction and control over the underlying RDS instances.
HA3 – The auto-healing feature ensures that any instance that fails in the stack is automatically replaced with a new instance.
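The auto-healing behaviour described for HA3 can be pictured as a simple reconcile step. The sketch below is only an illustration of that idea: the `heal` function, the status strings and the `web-N` naming scheme are assumptions made for this example, not part of the OpsWorks API.

```python
from itertools import count


def heal(fleet, new_ids):
    """Replace every failed instance with a fresh one; keep healthy ones as-is.

    fleet maps an instance id to a status string ("online" or "failed").
    new_ids is an iterator that yields numbers for replacement instances.
    """
    healed = {}
    for instance_id, status in fleet.items():
        if status == "failed":
            # Hypothetical naming scheme for the replacement instance.
            healed[f"web-{next(new_ids)}"] = "online"
        else:
            healed[instance_id] = status
    return healed


if __name__ == "__main__":
    fleet = {"web-1": "online", "web-2": "failed", "web-3": "online"}
    print(heal(fleet, count(4)))
```

The fleet size stays constant after healing, which is the property the requirement asks for.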

Core architecture components will be implemented as follows:

Web Server – Amazon EC2 instance implemented with PHP App Server AWS OpsWorks Layer blueprint.
MySQL Database server – MySQL database engine on RDS instance.

AWS CONCEPTS
The following key AWS concepts are needed to understand the architecture:
Auto scaling (High scalability)
Auto Scaling allows a group of instances or a service to scale its capacity automatically according to defined conditions. AWS services scale dynamically, without impacting the availability of the service. Auto Scaling ensures that instances scale out to meet demand.
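As a rough illustration of load-based scaling, the sketch below computes a desired instance count from an average CPU reading. The proportional rule, the function name `desired_capacity`, the 60% target and the size limits are all assumptions made for this example; they are not taken from the AWS Auto Scaling implementation.

```python
import math


def desired_capacity(current, avg_cpu, target_cpu=60.0, min_size=2, max_size=10):
    """Return an instance count that should bring average CPU near the target.

    Simplified proportional rule (an assumption for illustration):
    desired = ceil(current * avg_cpu / target_cpu), clamped to [min_size, max_size].
    """
    if current < 1 or target_cpu <= 0:
        raise ValueError("current must be >= 1 and target_cpu must be > 0")
    raw = math.ceil(current * avg_cpu / target_cpu)
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    print(desired_capacity(current=4, avg_cpu=90.0))  # load above target: scale out
    print(desired_capacity(current=4, avg_cpu=15.0))  # load below target: scale in
```

The minimum size keeps at least two instances running, which fits the Multi-AZ goal of never depending on a single instance.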

Multi-AZ (High availability)
Each geographic area in which Amazon operates data centres is called a region, and each region contains several distinct locations called Availability Zones, or AZs. AZs are engineered to be isolated from failures in other AZs, and provide low-latency, low-cost network connectivity to the other AZs in the same region. The high-availability requirements of the application can be met by launching all services in multiple Availability Zones (Multi-AZ).
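To see why a Multi-AZ layout helps, the following sketch spreads instances across zones in round-robin order and then lists what survives the loss of one zone. The instance names, the zone labels and both helper functions are hypothetical and exist only for this illustration.

```python
from itertools import cycle


def spread_across_azs(instance_ids, azs):
    """Assign instances to AZs round-robin so that no AZ holds more than one extra."""
    if not azs:
        raise ValueError("at least one Availability Zone is required")
    return dict(zip(instance_ids, cycle(azs)))


def survivors(placement, failed_az):
    """Return the instances still running if one whole AZ becomes unavailable."""
    return [name for name, az in placement.items() if az != failed_az]


if __name__ == "__main__":
    placement = spread_across_azs(["web-1", "web-2", "web-3", "web-4"], ["az-a", "az-b"])
    print(placement)
    print(survivors(placement, "az-a"))
```

With two zones, half of the capacity remains available after a zone failure, and Auto Scaling can then replace the lost half.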

APPLICATION LAYERS
All the application layers will be implemented within a Virtual Private Cloud (VPC) and OpsWorks Layers unless explicitly stated otherwise. The AWS services referenced in this architecture support server-side encryption and SSL/TLS, hence the security requirements are consolidated in the “Encryption” section.

DMZ Tier
The DMZ consists of an Elastic Load Balancer and a public subnet containing a NAT gateway and a bastion host. The load balancer will be implemented using the OpsWorks stack, but the NAT gateway and bastion instances will be placed in a public subnet that communicates directly with the Internet Gateway.
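One way to plan the public and private subnets is to carve equal-sized blocks out of the VPC address range, one per tier and Availability Zone. The sketch below does this with Python's standard `ipaddress` module. The `10.0.0.0/16` range, the /24 subnet size, the tier names and the zone labels are assumptions for illustration; the original design does not specify any of them.

```python
import ipaddress


def plan_subnets(vpc_cidr, azs, tiers=("public", "web", "db")):
    """Carve one /24 block per (tier, AZ) pair out of the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=24)  # generator of consecutive /24 networks
    plan = {}
    for tier in tiers:
        for az in azs:
            plan[(tier, az)] = next(blocks)
    return plan


if __name__ == "__main__":
    for (tier, az), block in plan_subnets("10.0.0.0/16", ["az-a", "az-b"]).items():
        print(f"{tier:<6} {az}  {block}")
```

Only the `public` blocks would be routed to the Internet Gateway; the `web` and `db` blocks stay private.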

Bastion Host
A bastion host is required to reduce the attack surface of the AWS cloud environment and allow secure management of resources. It will be a hardened, public-facing, special-purpose EC2 instance launched in a Multi-AZ deployment.
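The way a bastion host narrows the attack surface can be sketched as a small set of inbound rules: SSH reaches the bastion only from an administrative network, and reaches the web instances only from the bastion. The group names, the `203.0.113.0/24` admin range and the `is_allowed` helper are hypothetical; real security group rules would be defined in AWS, not in application code.

```python
import ipaddress

# Hypothetical inbound rules; group names and the admin CIDR are placeholders.
RULES = {
    "bastion-sg": [{"port": 22, "source_cidr": "203.0.113.0/24"}],
    "web-sg": [
        {"port": 22, "source_group": "bastion-sg"},
        {"port": 80, "source_group": "elb-sg"},
    ],
}


def is_allowed(target_group, port, source_ip=None, source_group=None):
    """Return True if any inbound rule on target_group matches the request."""
    for rule in RULES.get(target_group, []):
        if rule["port"] != port:
            continue
        if "source_group" in rule and rule["source_group"] == source_group:
            return True
        if "source_cidr" in rule and source_ip is not None:
            network = ipaddress.ip_network(rule["source_cidr"])
            if ipaddress.ip_address(source_ip) in network:
                return True
    return False


if __name__ == "__main__":
    print(is_allowed("bastion-sg", 22, source_ip="203.0.113.10"))
    print(is_allowed("web-sg", 22, source_ip="203.0.113.10"))
    print(is_allowed("web-sg", 22, source_group="bastion-sg"))
```

An administrator therefore has exactly one path to the web instances, and that path can be logged and hardened in one place.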

NAT Gateway
A NAT gateway is a managed service that allows outbound connections from private subnets to the Internet but prevents the Internet from initiating connections to those subnets. The NAT gateway will be implemented in a Multi-AZ deployment.
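The difference between a public and a private subnet comes down to where the default route points. The sketch below models two route tables and selects the most specific matching route, which is how route selection generally works. The table contents, the target labels and the `next_hop` helper are assumptions for illustration.

```python
import ipaddress

# Hypothetical route tables: the default route is the only difference.
ROUTE_TABLES = {
    "public": {"10.0.0.0/16": "local", "0.0.0.0/0": "internet-gateway"},
    "private": {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gateway"},
}


def next_hop(table, destination_ip):
    """Pick the most specific (longest-prefix) route that contains the destination."""
    dest = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in ROUTE_TABLES[table].items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    return max(matches)[1]


if __name__ == "__main__":
    print(next_hop("private", "10.0.3.15"))      # traffic inside the VPC
    print(next_hop("private", "93.184.216.34"))  # outbound traffic from a private subnet
    print(next_hop("public", "93.184.216.34"))   # outbound traffic from a public subnet
```

Because the private table has no route to the Internet Gateway, hosts on the Internet have no path by which to open a connection to the private subnets.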

Elastic Load Balancing (ELB)
AWS Elastic Load Balancing (ELB) improves fault tolerance and availability by placing compute instances behind a load balancer. ELB health checks increase availability and reduce the latency of the application by keeping traffic away from unhealthy instances. ELB features can be mapped to the customer requirements as follows:

HS1 – ELB automatically scales its request handling capacity in response to incoming application traffic.
HA1 – ELB can automatically balance traffic across multiple instances and multiple Availability Zones.
HA2 – ELB detects unhealthy EC2 instances and spreads the load across the remaining healthy instances.
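The health-check behaviour in HA2 can be sketched as a small state machine: a target changes state only after several consecutive checks disagree with its current state, and requests are spread over whichever targets are healthy. The `Target` class, the threshold values of 2 and the round-robin policy are assumptions for this example and do not reproduce ELB's internal logic.

```python
from itertools import cycle


class Target:
    def __init__(self, name, unhealthy_threshold=2, healthy_threshold=2):
        self.name = name
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold

    def record_check(self, passed):
        """Flip state only after enough consecutive contradicting results."""
        if passed == self.healthy:
            self._streak = 0
            return
        self._streak += 1
        limit = self.healthy_threshold if passed else self.unhealthy_threshold
        if self._streak >= limit:
            self.healthy = passed
            self._streak = 0


def route(targets, n_requests):
    """Round-robin n_requests over the targets that are currently healthy."""
    healthy = [t.name for t in targets if t.healthy]
    if not healthy:
        raise RuntimeError("no healthy targets")
    pool = cycle(healthy)
    return [next(pool) for _ in range(n_requests)]


if __name__ == "__main__":
    a, b = Target("web-1"), Target("web-2")
    b.record_check(False)
    b.record_check(False)  # second consecutive failure marks web-2 unhealthy
    print(route([a, b], 3))
```

Requiring consecutive failures avoids removing an instance because of a single slow response.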

Web Tier
The Web Tier consists of a PHP/Apache application server implemented using Amazon EC2 instances and the PHP App Server AWS OpsWorks Layer. EC2 combined with the OpsWorks layer can be mapped to the following customer requirements:

HA – The application layer is supported by one or more EC2 instances in a Multi-AZ deployment.
HS – Auto Scaling groups scale out EC2 instances based on defined criteria (load-based thresholds).
CC2 – The PHP App Server OpsWorks Layer automatically handles instance scaling and offers blueprints to launch PHP application server instances in the OpsWorks stack.

Database Tier
The Database Tier is responsible for persistent data management and runs on the standard MySQL database engine. Although Amazon RDS is an appropriate choice, it does not offer an auto-scaling feature for its compute configuration. Amazon Aurora is a database cluster service that builds on RDS instances and offers high availability, automatic scaling, and self-healing capability. Aurora’s features address the following customer requirements:

HA1 and HA3 – Aurora storage is self-healing and fault-tolerant. It also offers automatic backups and point-in-time restore for disaster recovery.
HS1 – Aurora scales storage automatically. Although compute scaling is not automatic, cluster replicas can handle the load while an instance is being upgraded.
LL1 – The query cache is enabled by default in Aurora, reducing latency.
CC1 and CC2 – Aurora builds on the RDS capabilities for managing administrative tasks such as automatic backups, software patching and maintenance of the underlying infrastructure.
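An Aurora cluster exposes a writer endpoint and a reader endpoint, so the data access layer can send read-only statements to the replicas. The sketch below shows one naive way to make that choice. The endpoint host names, the keyword list and the `endpoint_for` helper are placeholders invented for this example, and a real application would also have to account for replica lag and statements such as `SELECT ... FOR UPDATE`.

```python
# Placeholder host names; a real cluster has its own endpoint addresses.
WRITER_ENDPOINT = "writer.db.example.internal"
READER_ENDPOINT = "reader.db.example.internal"

# Statements treated as read-only in this simplified sketch.
READ_ONLY_KEYWORDS = ("select", "show", "describe", "explain")


def endpoint_for(sql):
    """Route read-only statements to the reader endpoint, everything else to the writer."""
    words = sql.split(None, 1)
    first_word = words[0].lower() if words else ""
    if first_word in READ_ONLY_KEYWORDS:
        return READER_ENDPOINT
    return WRITER_ENDPOINT


if __name__ == "__main__":
    print(endpoint_for("SELECT id FROM orders WHERE status = 'open'"))
    print(endpoint_for("UPDATE orders SET status = 'closed' WHERE id = 7"))
```

Keeping this decision in the data access layer is one way to use the control called for by CC1 while spreading read load across the cluster.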

Ancillary Tier
Amazon services stated below optimise latency and address archival, backup, and disaster recovery requirements.

Simple Storage Service (S3)
S3 is a highly available storage service that stores data as objects. We need two S3 buckets for our architecture: a public bucket of static files (HTML, images, video, etc.) for the CloudFront configuration, and a private bucket to store automatic backups and persistent EC2 data.

CloudFront
CloudFront is Amazon’s content delivery network; it works with S3 buckets to deliver web content rapidly across the globe and optimise the latency of an application. It supports private content, HTTPS authentication, and parameterised requests. CloudFront depends on the S3 bucket for the original data objects and refers back to the origin server (the S3 bucket) for updated information.
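The relationship between an edge cache and its origin can be sketched as a time-to-live (TTL) cache: content is served from the edge until it expires, and only then fetched again from the origin. The `EdgeCache` class, the 60-second TTL and the dictionary standing in for the S3 bucket are assumptions for illustration; actual CloudFront caching is governed by cache policies and origin headers and is more involved than this.

```python
class EdgeCache:
    def __init__(self, origin, ttl_seconds):
        self.origin = origin      # callable: key -> content (stands in for the S3 origin)
        self.ttl = ttl_seconds
        self._store = {}          # key -> (content, time the content was fetched)
        self.origin_fetches = 0

    def get(self, key, now):
        """Serve from the cache while the entry is fresh, otherwise refetch from the origin."""
        cached = self._store.get(key)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0]
        content = self.origin(key)
        self.origin_fetches += 1
        self._store[key] = (content, now)
        return content


if __name__ == "__main__":
    bucket = {"/logo.png": "v1"}
    cache = EdgeCache(bucket.__getitem__, ttl_seconds=60)
    print(cache.get("/logo.png", now=0))   # first request goes to the origin
    bucket["/logo.png"] = "v2"
    print(cache.get("/logo.png", now=30))  # still fresh: served from the edge
    print(cache.get("/logo.png", now=61))  # expired: fetched again from the origin
```

The TTL is the trade-off knob: a longer value lowers latency and origin load, while a shorter value shows updates sooner.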

Glacier
Glacier is a data archival service that integrates with S3, IAM, and CloudTrail. It will be configured to automate the archiving of data in S3 buckets based on defined criteria. Data stored in Glacier is immutable and encrypted, and every request must be signed. A low-cost archival strategy for objects older than 6 months can be achieved by combining the lifecycle management feature of S3 with Glacier.
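A lifecycle rule for this strategy might look like the following sketch, written as the dictionary that boto3's `put_bucket_lifecycle_configuration` call accepts as its `LifecycleConfiguration` argument. The rule ID and the 180-day figure are assumptions. Note that S3 lifecycle transitions count days since an object was created rather than since it was last accessed, so object age is used here as a stand-in for "inactive".

```python
import json

ARCHIVE_AFTER_DAYS = 180  # roughly six months; an assumption for requirement AS1

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-after-six-months",  # hypothetical rule name
            "Filter": {"Prefix": ""},          # empty prefix: apply to every object
            "Status": "Enabled",
            "Transitions": [
                {"Days": ARCHIVE_AFTER_DAYS, "StorageClass": "GLACIER"}
            ],
        }
    ]
}

if __name__ == "__main__":
    # Print the rule set for review; applying it to a bucket needs AWS credentials.
    print(json.dumps(lifecycle_configuration, indent=2))
```

A narrower `Prefix` could restrict archiving to part of the bucket, for example only the backup objects.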

Together, S3, CloudFront and Glacier address the following customer requirements:

HA – S3 automatically replicates data within the region and also offers cross-region replication.
HS – S3 scales automatically with demand.
LL1 – CloudFront and S3 reduce latency and improve the performance of the application.
AS1 – Glacier provides a low-cost archival service in conjunction with the S3 lifecycle management feature. S3 allows the customer to define the archival rules and specify the objects that need to be archived.

We will take a logical break here and cover cloud security in the next part, before drafting the final architecture.

About Ajinkya Patil
