Thursday, June 13, 2019

What should we know about AWS Networking?

I put together and used this summary, with bullets and excerpts taken from AWS papers and from online courses such as A Cloud Guru, Linux Academy, and Udemy, to study AWS Networking concepts.
I hope it is useful to someone.

OSI Model:
AWS Responsibility:
  • Physical (CAT5, fiber-optic cable)
  • Data Link (MAC)

Customer Responsibility:
  • Network (IP, ARP)
  • Transport (TCP)
  • Session (Setup, Negotiation, Teardown)
  • Application (web browser)


Unicast: Communication in which a frame is sent from one host and addressed to one specific destination.

Multicast: Communication in which a frame is sent to a specific group of devices or clients.

Broadcast: Communication in which a frame is sent from one address to all other addresses.

TCP:
  • Connection-based, stateful, acknowledged
  • "After everything I say, I want you to confirm that you received it"
  • Ex: web, email, file transfer
UDP:
  • Connectionless, stateless, simple, no retransmission delays
  • "I'm going to start talking and it's OK if you miss some words"
  • Ex: streaming media, DNS
ICMP:
  • Used by network devices to exchange information
  • "We routers can keep in touch about the health of the network using our own language"
  • Ex: traceroute, ping

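Ephemeral Ports:
Short-lived transport-protocol ports used in IP communications
Above the "well-known" ports (that is, 1024 and above)
Also called dynamic ports
Suggested range is 49152 to 65535, but
    Linux kernels generally use 32768 to 61000
    Older Windows platforms default to a range starting at 1025
NACL and security group implications: NACLs are stateless, so return traffic on ephemeral ports must be allowed explicitly; security groups are stateful and allow it automatically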
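
To make the NACL implication concrete, here is a minimal sketch that checks whether a rule's port range covers a client's ephemeral range. The names `EPHEMERAL_RANGES`, `port_allowed`, and `covers` are hypothetical, and the Linux range used here (32768 to 61000) is the commonly documented default.

```python
from typing import Tuple

# Ephemeral port ranges (inclusive bounds); illustrative values, not an AWS API.
EPHEMERAL_RANGES = {
    "iana_suggested": (49152, 65535),
    "linux_typical": (32768, 61000),
}

def port_allowed(port: int, rule_range: Tuple[int, int]) -> bool:
    """Return True if `port` falls inside an inclusive (low, high) rule range."""
    low, high = rule_range
    return low <= port <= high

def covers(rule_range: Tuple[int, int], client_range: Tuple[int, int]) -> bool:
    """Return True if a rule range fully covers a client's ephemeral range."""
    return rule_range[0] <= client_range[0] and client_range[1] <= rule_range[1]

# A rule opened only for the IANA range would drop some Linux return traffic.
print(covers(EPHEMERAL_RANGES["iana_suggested"], EPHEMERAL_RANGES["linux_typical"]))  # False
# A rule for 1024-65535 covers both ranges.
print(covers((1024, 65535), EPHEMERAL_RANGES["linux_typical"]))  # True
```

This is why a stateless NACL usually needs an outbound (or inbound) rule wide enough for every client platform you expect.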

AWS Managed VPN
  • AWS-managed IPsec VPN connection over your existing internet connection
  • Quick and usually simple way to establish a secure tunnelled connection to a VPC
  • Can serve as a redundant link for Direct Connect or another VPC VPN
  • Supports static routes or BGP peering and routing
  • Dependent on your internet connection
Limitations:
  • Network latency, variability, and availability are dependent on internet conditions
  • The customer-managed endpoint is responsible for implementing redundancy and failover (if required)
  • The customer device must support single-hop BGP (when leveraging BGP for dynamic routing)



Direct Connect

  • Dedicated network connection over private lines straight into the AWS backbone
  • Used when you require a big pipe into AWS, with lots of resources and services being provided on AWS to your corporate users
  • More predictable network performance, potential bandwidth cost reduction, up to 10 Gbps provisioned connections, supports BGP peering and routing.
  • Work with your existing data network provider; create virtual interfaces (VIFs) to connect to VPCs (private VIF) or to other AWS services like S3 or Glacier (public VIF)

Limitations:
  • May require additional telecom and hosting provider relationships and/or new network circuits.
 

VPN CloudHub

  • Connects locations in a hub-and-spoke manner using AWS's Virtual Private Gateway.
  • Used when you need to link remote offices for backup or primary WAN access to AWS resources or to each other.
  • Reuses existing internet connections; supports BGP routes to direct traffic (for example, use MPLS first, then the CloudHub VPN as backup).
  • Depends on the internet connection; no inherent redundancy.
  • Assign multiple Customer Gateways to a Virtual Private Gateway, each with its own BGP ASN and unique IP range.
Limitations:
  • Network latency, variability, and availability are dependent on the internet
  • User-managed branch office endpoints are responsible for implementing redundancy and failover (if required)


AWS Direct Connect + VPN

  • Use Case: IPsec VPN connection over private lines
  • Advantages are the same as the previous option with the addition of a secure IPsec VPN connection
Limitations:
  •  Same as the previous option with a little additional VPN complexity

Software VPN

  • You provide your own VPN endpoint and software.
  • Used when you must manage both ends of the VPN connection for compliance reasons, or when you want a VPN option not supported by AWS.
  • Ultimate flexibility and manageability.
  • You must design for any needed redundancy across the whole chain.
  • Install VPN via Marketplace appliance or on an EC2 instance.  
Limitations:
  • Customer is responsible for implementing HA (high availability) solutions for all VPN endpoints (if required)

  
Transit VPC

Software appliance-based VPN connection with a hub VPC
AWS-managed IPsec VPN connection for spoke VPC connections
Common strategy for connecting geographically dispersed VPCs and locations in order to create a global network transit center.
Used when locations and VPC-deployed assets across multiple regions need to communicate with one another.
Ultimate flexibility and manageability, combined with AWS-managed VPN hub-and-spoke between VPCs.
You must design for any needed redundancy across the whole chain.
Providers like Cisco, Juniper Networks, and Riverbed have offerings which work with their equipment and AWS VPC.

VPC peering

AWS-provided network connectivity between two VPCs.
Used when multiple VPCs need to communicate or access each other's resources.
Uses the AWS backbone without touching the internet.
If A is connected to B and B is connected to C, A cannot talk with C via B. (Transitive peering is not supported.)
A VPC peering request is made and the accepter accepts the request (either within an account or across accounts).
Leverages AWS networking infrastructure
Does not rely on VPN instances or a separate piece of physical hardware
No single point of failure and no bandwidth bottleneck
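
The A/B/C rule can be modelled in a few lines. This is only a toy sketch of the non-transitive behaviour, with hypothetical VPC names; it does not call any AWS API.

```python
# Each frozenset is one peering connection between two VPCs (hypothetical names).
PEERINGS = {frozenset({"A", "B"}), frozenset({"B", "C"})}

def can_talk(src: str, dst: str, peerings=PEERINGS) -> bool:
    """Two VPCs can communicate only over a direct peering; no transit via a third VPC."""
    return frozenset({src, dst}) in peerings

print(can_talk("A", "B"))  # True
print(can_talk("B", "C"))  # True
print(can_talk("A", "C"))  # False: A would need its own peering with C
```

For A to reach C, a third peering connection between A and C has to be created explicitly.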

PrivateLink

AWS PrivateLink provides network connectivity between VPCs and/or AWS services using interface endpoints.
Used when you need to keep a private subnet truly private by using the AWS backbone to reach other services rather than the public internet.
Redundant: uses the AWS backbone
As of October 2018, endpoints can be accessed over inter-region VPC peering.
Create an endpoint for the needed AWS or Marketplace service in all needed subnets; access it via the provided DNS hostname.

Limitation:
VPC endpoint services are only available in the AWS region in which they are created.

VPC Endpoint

It can be divided into two groups:

Interface Endpoint:
Elastic network interface with a private IP
Uses DNS entries to redirect traffic.
Can be used with API Gateway, CloudFormation, CloudWatch, etc.
Secured with security groups.

Gateway Endpoint:
A gateway that is a target for a specific route.
Uses a prefix list in the route table to redirect traffic.
Can be used with S3 and DynamoDB.
Secured with VPC endpoint policies.

[Figure: No VPC Endpoint]

[Figure: VPC Endpoint]


 
Internet Gateways:
  • Internet gateway 
  • Egress-Only Internet Gateway
  • NAT Instance 
  • NAT Gateway

Internet gateway
Horizontally scaled, redundant, and highly available component that allows communication between your VPC and the internet.
No availability risk or bandwidth constraints.
If your subnet has a route to the internet gateway, then it is a public subnet.
Supports IPv4 and IPv6.
Purpose 1: Provide a route table target for internet-bound traffic.
Purpose 2: Perform NAT for instances with public IP addresses.

Does not perform NAT for instances with private IPs only.

Egress-Only Internet Gateway
IPv6 addresses are globally unique and are therefore public by default.
Provides outbound internet access for instances with IPv6 addresses.
Prevents inbound access to those IPv6 instances.
Stateful: forwards traffic from instances to the internet and then sends back the response.
You must create a custom route for ::/0 to the Egress-Only Internet Gateway.
Use an Egress-Only Internet Gateway instead of NAT for IPv6.
Allows IPv6-based traffic within a VPC to reach the internet, while denying internet-based resources the ability to initiate a connection back into the VPC.


NAT Instance

EC2 instance launched from a special AWS-provided AMI
Translates traffic from many private-IP instances to a single public IP and back.
Doesn't allow the public internet to initiate connections into private instances.
Not supported for IPv6 (use an Egress-Only Internet Gateway)
NAT instances must live in a public subnet with a route to the Internet Gateway.
Private instances in private subnets must have a route to the NAT instance, usually as the target of the default route 0.0.0.0/0

NAT Gateway

Fully managed NAT service that replaces the need for NAT instances on EC2.
Must be created in a public subnet.
Uses an Elastic IP as its public IP for the life of the gateway.
Private instances in private subnets must have a route to the NAT gateway, usually as the target of the default route 0.0.0.0/0
Created in a specific AZ; for redundancy, create a NAT gateway in each AZ with routes so that each private subnet uses its local gateway.
Up to 5 Gbps bandwidth that can scale up to 45 Gbps.
A NAT gateway can't be used to access VPC peering, VPN, or Direct Connect connections, so be sure to include specific routes to those in your route table

Advice:
Only two components allow VPC-to-internet communication using IPv6 addresses: the Internet Gateway (inbound) and the Egress-Only Internet Gateway (outbound). NAT instances and NAT gateways explicitly don't support IPv6 traffic. A Direct Connect link carries data between a data center and an AWS VPC, but that traffic doesn't travel over the internet.


NAT Gateway vs NAT Instance

Availability 
NAT Gateway -> High availability within AZ
NAT Instance ->  On your own 

Bandwidth:
NAT Gateway -> Up to 45 Gbps
NAT Instance ->  Depends on bandwidth of instance type

Maintenance:
NAT Gateway -> Managed by AWS
NAT Instance ->  On your own 

Performance:
NAT Gateway -> Optimised for NAT
NAT Instance ->  Amazon Linux AMI configured to perform NAT 

Public IP:
NAT Gateway -> Elastic IP that cannot be detached
NAT Instance -> Elastic IP that can be detached

Security Groups:
NAT Gateway -> Cannot be associated with security groups
NAT Instance ->  Can use Security Groups

Bastion Server:
NAT Gateway -> Not supported 
NAT Instance ->  Can be used as bastion server

Routing Table

VPCs have an implicit router and a main route table
You can modify the main routing table or create new tables
Each route table contains a local route for the CIDR block
Most specific route for an address wins
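
"Most specific route wins" is longest-prefix matching. A minimal sketch with Python's standard `ipaddress` module follows; the route table contents and the targets `igw-example` and `pcx-example` are made-up placeholders, not real resource IDs.

```python
import ipaddress

# Hypothetical route table: destination CIDR -> target.
ROUTES = {
    "10.0.0.0/16": "local",
    "10.1.0.0/16": "pcx-example",
    "0.0.0.0/0": "igw-example",
}

def pick_route(dest_ip: str, routes=ROUTES):
    """Return the (cidr, target) whose prefix is the longest match for dest_ip."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route: the packet would be dropped
    net, target = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), target

print(pick_route("10.0.1.5"))  # ('10.0.0.0/16', 'local')
print(pick_route("8.8.8.8"))   # ('0.0.0.0/0', 'igw-example')
```

Both 10.0.0.0/16 and 0.0.0.0/0 match 10.0.1.5, but the /16 is more specific, so the local route is chosen.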

 
Border Gateway Protocol

Popular routing protocol for the internet
Propagates information about the network to allow for dynamic routing
Required for Direct Connect and optional for VPN
The alternative to using BGP with an AWS VPC is static routes
AWS supports BGP community tagging as a way to control traffic scope and route preference
Requires TCP port 179 plus ephemeral ports
Autonomous System Number (ASN) = Unique endpoint identifier.

Weighting is local to the router, and a higher weight marks the preferred path for outbound traffic.


Enhanced Networking

Generally used for High Performance Computing (HPC) use cases.
Uses single root I/O virtualisation (SR-IOV) to deliver higher performance than traditional virtualised network interfaces.
You might have to install a driver if using something other than an Amazon Linux HVM AMI
Intel 82599 VF interface (10 Gbps)
Elastic Network Adapter (25 Gbps)

Placement Groups:
  • Clustered
  • Spread 

Clustered:
Instances are placed into a low-latency group within a single AZ
Used when you need low network latency and/or high network throughput
Gets the most out of Enhanced Networking instances
Finite capacity; launching everything you might need up front is recommended.

Spread:
Instances are spread across distinct underlying hardware.
Reduces the risk of simultaneous failure if underlying hardware fails.
Can span multiple AZs
Max of 7 running instances per group per AZ.


 Route 53

Register domain names.
Check the health of your domain resources.
Routes internet traffic for your domain.

What is DNS?
The Domain Name System (DNS) is the phonebook of the Internet. Humans access information online through domain names, like google.com or yahoo.com.

DNS record types (A, CNAME, MX, TXT, etc)
  • A-Records (Host address)
The A-record is the most basic and the most commonly used DNS record type.
It is used to translate human friendly domain names such as "www.example.com" into IP-addresses such as 23.211.43.53 (machine friendly numbers).
A-records are the DNS server equivalent of the hosts file - a simple domain name to IP-address mapping.

  • CNAME-Records (Aliases)
CNAME-records are domain name aliases.
Computers on the Internet often perform multiple roles such as web server, FTP server, chat server, etc.
To mask this, CNAME-records can be used to give a single computer multiple names (aliases).
For example, the computer "computer1.xyz.com" may be both a web server and an FTP server, so two CNAME-records are defined, one per role (for instance "www.xyz.com" and "ftp.xyz.com"), both pointing to "computer1.xyz.com".

  • TXT-records are used to hold descriptive text.
They are often used to hold general information about a domain name such as who is hosting it, contact person, phone numbers, etc.
One common use of TXT-records is for SPF (see http://www.openspf.org).

  • ALIAS-Records (Auto Resolved Alias)
ALIAS-records are virtual alias records resolved by Simple DNS Plus at the time of each request, providing "flattened" (no CNAME-record chain) synthesized records with data from a hidden source name.
This can be used for different purposes - including solving the classic problem with CNAME-records at the domain apex (for the zone name / for "the naked domain").
Route 53 Concepts (alias, hosted zone, etc)

  • MX-Records (Mail exchange)
MX-records are used to specify the e-mail server(s) responsible for a domain name.
Each MX-record points to the name of an e-mail server and holds a preference number for that server.
When sending an e-mail to "user@example.com", your e-mail server must first look up any MX-records for "example.com" to see which e-mail servers handle incoming e-mail for "example.com".
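
As a small illustration of the MX preference number, the sketch below orders mail servers the way a sending server would try them: lowest preference value first, which is the standard MX convention. The records and host names are invented for the example.

```python
# Hypothetical MX records for example.com: (preference, mail server).
MX_RECORDS = [
    (20, "backup-mail.example.com"),
    (10, "mail.example.com"),
]

def delivery_order(mx_records):
    """Return mail servers in the order a sender would try them (lowest preference first)."""
    return [host for _, host in sorted(mx_records)]

print(delivery_order(MX_RECORDS))
# ['mail.example.com', 'backup-mail.example.com']
```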

Route 53 Routing Policies

Simple: Simple routing is the most basic and common DNS policy, which can accommodate a single FQDN (fully qualified domain name) or IP address. For an A record, you enter the IP address as the value. For load balancers, you use a CNAME (or, in Route 53, an alias record).

Failover: Failover routing allows you to route traffic to a resource when the resource is healthy and to another resource when the first one is unhealthy.

Geolocation: Geolocation routing directs users to resources based on the geographic location of the user or client.

Latency: Latency routing returns the IP address of the endpoint that offers the client lower latency than its identical peers hosted in other AWS Regions.

Multivalue Answer: Multivalue answer routing is like simple routing, but it can return multiple IP addresses associated with an FQDN.

Weighted: A result is returned based on the weight of each DNS record. This is used to distribute sessions equally or unequally among the servers.
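
To show what "based on a weight" means in practice, here is a rough sketch of weighted selection: each record is chosen with probability weight / sum of weights, and a weight of 0 takes a record out of rotation. The record values and the `pick_weighted` helper are hypothetical; this imitates the idea, not Route 53's actual implementation.

```python
import random

# Hypothetical weighted records: (value, weight).
RECORDS = [("192.0.2.10", 3), ("192.0.2.20", 1), ("192.0.2.30", 0)]

def pick_weighted(records, rng=random):
    """Pick one record value with probability weight / sum(weights)."""
    values = [value for value, weight in records if weight > 0]
    weights = [weight for _, weight in records if weight > 0]
    return rng.choices(values, weights=weights, k=1)[0]

# Over many lookups, roughly 75% of answers should be 192.0.2.10.
rng = random.Random(0)
answers = [pick_weighted(RECORDS, rng) for _ in range(4000)]
print(answers.count("192.0.2.10") / len(answers))
```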

CloudFront:

Distributed content delivery service for everything from simple static asset caching up to 4K live and on-demand video streaming.
CloudFront is integrated with AWS, both through physical locations that are directly connected to the AWS global infrastructure and through other AWS services. CloudFront works seamlessly with services including AWS Shield for DDoS mitigation; Amazon S3, Elastic Load Balancing, or Amazon EC2 as origins for your applications; and Lambda@Edge to run custom code closer to customers' users and to customize the user experience. Lastly, if you use AWS origins such as Amazon S3, Amazon EC2, or Elastic Load Balancing, you don't pay for data transferred between these services and CloudFront.

Elastic Load Balancing:

Distributes inbound connections to backend endpoints.
Three different options:
  • Application Load Balancer (layer 7)
  • Network Load Balancer (layer 4)
  • Classic Load Balancer (layer 4 or layer 7)
Can be used for public or private networks.

Protocols:
ALB: HTTPS, HTTP
NLB: TCP
Classic LB: TCP, SSL, HTTP, HTTPS

Path or Host-based Routing:
ALB: YES
NLB: NO
Classic LB: NO

SSL Offloading:
ALB: YES
NLB: NO
Classic LB: YES

Server Name Indication (SNI):
ALB: YES
NLB: NO
Classic LB: NO

Sticky Session:
ALB: YES
NLB: NO
Classic LB: YES

Static IP, Elastic IP:
ALB: NO
NLB: YES
Classic LB: NO

User Authentication:
ALB: YES
NLB: NO
Classic LB: NO


[Figure: Application Load Balancer]

[Figure: Classic Load Balancer]

[Figure: Network Load Balancer]

[Figure: Concept of Sticky Session]

MPLS is an encapsulation protocol used in many service provider and large-scale enterprise networks. Instead of relying on IP lookups to discover a viable "next-hop" at every single router within a path (as in traditional IP networking), MPLS predetermines the path and uses label push, pop, and swap operations to direct the traffic to its destination. This gives the operator significantly more flexibility and enables users to experience a greater SLA by reducing latency and jitter.

Customer Gateway
A customer gateway (CGW) is the anchor on your side of the connection between your network and your Amazon VPC. In an MPLS scenario, the CGW can be a customer edge (CE) device located at a Direct Connect location, or it can be a provider edge (PE) device in an MPLS VPN network.


CIDR = Classless Inter-Domain Routing
WAN =  Wide Area Network
MPLS =  Multiprotocol Label Switching


    

Sunday, June 2, 2019

What should we know about AWS Data Store?

I put together and used this summary, with bullets and excerpts taken from AWS papers and from online courses such as A Cloud Guru, Linux Academy, and Udemy, to study Data Store concepts.
I hope it is useful to someone.

Data store

  • Persistent data store: data is durable and survives restarts (RDS, Glacier)
  • Transient data store: temporary data that will be passed on to another database or process (SQS, SNS)
  • Ephemeral data store: data can be lost when the resource stops (EC2 instance store, Memcached)

IOPS vs Throughput

  • IOPS: input/output operations per second; measures how quickly read or write operations are performed
  • Throughput: measures the amount of data that can be moved over a period of time
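
The two measures are linked by the I/O size: throughput is roughly IOPS multiplied by the size of each operation. A quick worked sketch, where the IOPS figures and I/O sizes are just example numbers:

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput in MiB/s for a given IOPS figure and I/O size in KiB."""
    return iops * io_size_kib / 1024

# Many small operations: high IOPS, modest throughput.
print(throughput_mib_per_s(3000, 16))    # 46.875
# Few large operations: low IOPS, high throughput.
print(throughput_mib_per_s(500, 1024))   # 500.0
```

This is why small random I/O (databases) is usually sized by IOPS, while large sequential I/O (log processing, big data) is sized by throughput.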

Consistency is far better than rare moments of greatness.

Availability is a higher priority in the BASE model than in the ACID model, which values consistency over availability. Lazy writes are typical of eventual consistency rather than perpetual consistency.

ACID
  • Atomic -> transactions are all or nothing
  • Consistent -> transactions must be valid
  • Isolated -> transactions can't interfere with one another
  • Durable -> completed transactions must stick around

BASE

  • Basic Availability -> values availability even if stale
  • Soft State -> might not be instantly consistent across stores
  • Eventual Consistency -> will achieve consistency at some point

Database Options:

Database on EC2: ultimate control over the database; preferred DB not available on RDS
RDS: traditional database for OLTP; data is well structured and ACID compliant
DynamoDB: name/value or unpredictable data structure; in-memory performance with persistence; scales dynamically
Redshift: massive amounts of data; primarily OLAP workloads
Neptune: relationships between objects are a major portion of the data's value
ElastiCache: fast temporary storage for small amounts of data

SaaS partitioning models:
  • Silo: separate database for each tenant
  • Bridge: single database, multiple schemas
  • Pool: Shared database, single schema
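
One way to picture the three models is to ask where a given tenant's rows live. The sketch below is purely illustrative: the database and schema naming (`db_<tenant>`, `tenant_<tenant>`, `shared_db`) is an assumption made for the example, not an AWS convention.

```python
def locate_tenant_data(model: str, tenant_id: str) -> dict:
    """Return where a tenant's rows live under each partitioning model."""
    if model == "silo":
        # Separate database per tenant.
        return {"database": f"db_{tenant_id}", "schema": "public", "row_filter": None}
    if model == "bridge":
        # One database, one schema per tenant.
        return {"database": "shared_db", "schema": f"tenant_{tenant_id}", "row_filter": None}
    if model == "pool":
        # One database, one schema; rows are told apart by a tenant column.
        return {"database": "shared_db", "schema": "public", "row_filter": {"tenant_id": tenant_id}}
    raise ValueError(f"unknown partitioning model: {model}")

for model in ("silo", "bridge", "pool"):
    print(model, locate_tenant_data(model, "acme"))
```

Isolation is strongest in the silo model and weakest in the pool model, where every query must carry the tenant filter.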



[Figure: Silo Model]

[Figure: Bridge Model]

[Figure: Pool Model]


RDS:


Managed database option for MySQL, MariaDB, PostgreSQL, SQL Server, Oracle, and Aurora
Best for structured, relational data store needs.
Aims to be a drop-in replacement for existing on-prem instances of the same databases
Automated backups and patching in customer-defined maintenance windows.
Push-button scaling, replication, and redundancy.

Multi-AZ deployments
Read replicas serve regional users
For MySQL, non-transactional storage engines like MyISAM don't support replication; you must use InnoDB (or XtraDB on MariaDB)
MariaDB is an open source fork of MySQL

RDS Anti-patterns:
  • Lots of large binary objects (BLOBs) -> use S3
  • Automated scalability -> use DynamoDB
  • Name/value data structure -> use DynamoDB
  • Data is not well structured or is unpredictable -> use DynamoDB
  • Other database platforms like DB2 or HANA, or complete control over the database -> use EC2

S3

The maximum object size in S3 is 5 TB
The largest object in a single PUT is 5 GB
Multipart upload is recommended for objects larger than 100 MB
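
To get a feel for multipart uploads, here is a rough sketch that plans the number of parts for an object. It assumes the documented S3 multipart limits as I understand them (parts between 5 MiB and 5 GiB, at most 10,000 parts); the default part size of 100 MiB and the `plan_parts` helper are arbitrary choices for the example.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB    # assumed minimum part size (the last part may be smaller)
MAX_PART = 5 * GIB    # assumed maximum part size
MAX_PARTS = 10_000    # assumed maximum number of parts per upload

def plan_parts(object_size: int, part_size: int = 100 * MIB):
    """Return (part_size, part_count), growing the part size if 10,000 parts are not enough."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part_size = max(part_size, MIN_PART, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object is too large for a multipart upload")
    return part_size, math.ceil(object_size / part_size)

print(plan_parts(1 * GIB))  # (104857600, 11)
```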

The object name is a key, not exactly a file path
Amazon S3 is designed for 99.999999999 percent (11 nines) durability per object
and 99.99 percent availability over a one-year period.
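
As a quick back-of-the-envelope check, an availability percentage can be turned into allowed downtime per year. This is plain arithmetic, not an AWS SLA calculator.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 365) -> float:
    """Minutes of downtime per period implied by an availability percentage."""
    return (1 - availability_percent / 100) * days * 24 * 60

print(round(allowed_downtime_minutes(99.99), 2))  # 52.56
print(round(allowed_downtime_minutes(99.9), 1))   # 525.6
```

So 99.99 percent availability over a year corresponds to a little under an hour of unavailability.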




Consistency:

S3 provides read-after-write consistency for PUTs of new objects
HEAD or GET requests for a key made before the object exists result in eventual consistency
S3 offers eventual consistency for overwrite PUTs and DELETEs
Updates to a single key are atomic
If you make a HEAD or GET request for the S3 key name before creating the object, S3 provides eventual consistency for read-after-write. As a result, you may get a 404 Not Found error until the upload is fully replicated.


Security:

  • Resource-based->object ACL, Bucket policy 
  • User-based -> IAM policy 
  • Optional Multi-factor Authentication before delete

Data protection

Versioning: a new version is created with each write
Enables roll-back and un-delete capabilities
Old versions count toward billable size until they are permanently deleted
Integrated with lifecycle management
Cross-Region Replication

Lifecycle Management

Optimize storage cost
Adhere to data retention policies
Keep S3 buckets well maintained

Analytics

Data lake concepts -> with Athena, Redshift Spectrum, QuickSight
IoT streaming data repository -> Kinesis Firehose
Machine learning and AI storage -> Rekognition, Lex, MXNet
Storage Class Analysis -> S3 management analytics

Encryption at Rest

  • SSE-S3 -> use S3's existing encryption keys for AES-256
  • SSE-C -> upload your own AES-256 encryption key, which S3 will use when it writes the object
  • SSE-KMS -> use a key generated and managed by AWS Key Management Service
  • Client-side -> encrypt objects using your own local encryption process before uploading to S3

Tricks:

  • Transfer Acceleration -> speed up data uploads using CloudFront in reverse
  • Requester Pays -> the requester, rather than the bucket owner, pays for requests and data transfer
  • Tags -> assign tags to objects for use in costing, billing, and security
  • Events -> trigger notifications to SNS, SQS, or Lambda when certain events happen in your bucket
  • Static web hosting -> simple and massively scalable static website hosting
  • BitTorrent -> use the BitTorrent protocol to retrieve any publicly available object by automatically generating a .torrent file

Glacier

Cheap, slow to respond, seldom accessed
Cold storage
Used by AWS Storage Gateway Virtual tape library
Integrated with AWS S3 via lifecycle management
Faster retrieval speed options if you pay more
Amazon Glacier is designed to provide average annual durability of 99.999999999 percent (11 nines) for an archive.
Glacier Vault Lock is an immutable way to set policies on a Glacier vault such as retention or enforcing MFA before delete.

S3 for blobs 

EFS: provides scalable network file storage for EC2 instances
EBS: provides block storage volumes for EC2
EC2 instance store: temporary block storage volumes for EC2
Storage Gateway: an on-premises storage appliance that integrates with cloud storage
Snowball: transports large amounts of data to and from the cloud
Amazon S3 is designed for 99.999999999 percent (11 nines) durability and 99.99 percent availability
Each AWS Snowball appliance is capable of storing 50 TB or 80 TB of data
For Amazon S3, individual files are loaded as objects and can range up to 5 TB in size

CloudFront: provides a global content delivery network
Amazon CloudFront is designed for low-latency and high-bandwidth delivery of content
Because a CDN is an edge cache, Amazon CloudFront does not provide durable storage. The origin server, such as Amazon S3 or a web server running on Amazon EC2, provides the durable storage.

Amazon S3 offers a range of storage classes designed for different use cases, including:

S3 Standard: for general-purpose storage of frequently accessed data.
S3 Standard-Infrequent Access (Standard-IA): for long-lived but less frequently accessed data
Glacier: for low-cost archival data; retrieval jobs typically complete in 3 to 5 hours. You can improve the upload experience for large archives by using multipart upload for archives up to about 40 TB (the single archive limit)

S3 Usage Patterns:
  • First: used to store and distribute static web content and media
  • Second: used to host entire static websites
  • Third: used as a data store for computation and large-scale analytics, such as financial transactional analysis, clickstream analytics, and media transcoding.

Elastic Block Store (EBS)


Virtual hard drives
Can only be used with EC2
Tied to a single AZ
Variety of optimised choices for IOPS, throughput, and cost

Instance store (for comparison):
Temporary
Ideal for caches, buffers, work areas
Data goes away when the EC2 instance is stopped or terminated

EBS snapshots:
Cost-effective and easy backup strategy
Share data sets with other users or accounts
Migrate a system to a new AZ or Region

Amazon EBS provides a range of volume types that are divided into two major categories: SSD-backed storage volumes and HDD-backed storage volumes.

Volume types:
  • SSD-Backed Provisioned IOPS (io1): I/O-intensive NoSQL and relational databases
  • SSD-Backed General Purpose (gp2) : Boot volumes, low-latency interactive apps, dev & test
  • HDD-Backed Throughput Optimized (st1) : Big data, data warehouse, log processing
  • HDD-Backed Cold (sc1) : Colder data requiring fewer scans per day

Elastic File System (EFS)


Implementation of an NFS file share
Elastic storage capacity; pay only for what you use
Multi-AZ metadata and data storage
Configure mount points in one or many AZs.
Can be mounted from on-premises systems ONLY if using Direct Connect
Alternatively, use the EFS File Sync agent
If your overall Amazon EFS workload will exceed 7,000 file operations per second per file system, the recommendation is for the file system to use Max I/O performance mode.
Security: IAM permissions for API calls; security groups for EC2 instances and mount targets; and Network File System-level users, groups, and permissions.

Storage gateway


Virtual machine that you run on-premises with VMware or Hyper-V
Provides local storage resources backed by S3 and Glacier
Often used in disaster recovery preparedness to sync data to AWS
Useful in cloud migrations

File Gateway: allows on-prem or EC2 instances to store objects in S3 via an NFS or SMB mount point
Volume Gateway stored mode: async replication of on-prem data to S3
Volume Gateway cached mode: primary data stored in S3 with frequently accessed data cached locally on-prem
Tape Gateway: virtual media changer and tape library for use with existing backup software

WorkDocs
Secure, fully managed file collaboration service
Can integrate with AD for SSO
Web, mobile, and native clients (no Linux client)
Meets HIPAA, PCI DSS, and ISO compliance requirements
SDK available for creating complementary apps

Database on Ec2:


Run any database with full control and ultimate flexibility.
You must manage everything, such as backups, redundancy, patching, and scaling
Good option when you require a database not yet supported by RDS, such as IBM DB2 or SAP HANA
Good option if it is not feasible to migrate to an AWS-managed database

AWS offers two EC2 instance families that are purpose-built for storage-centric workloads:
- SSD-Backed Storage-Optimized (i2): NoSQL databases like Cassandra and MongoDB, scale-out transactional databases, data warehousing, Hadoop, and cluster file systems.

- HDD-Backed Dense-Storage (d2): Massively Parallel Processing (MPP) data warehousing, MapReduce and Hadoop distributed computing, distributed file systems, network file systems, log or data-processing applications.

DynamoDB


Managed, multi-AZ NoSQL data store with a cross-region replication option
Defaults to eventually consistent reads but can perform strongly consistent reads on request via an SDK parameter
Priced on throughput rather than compute
Provision read and write capacity in anticipation of need
Auto scaling adjusts capacity within configured min/max levels
On-demand capacity for flexible capacity at a small premium cost
Achieve ACID compliance with DynamoDB transactions
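
Provisioning capacity comes down to simple arithmetic. The sketch below assumes the standard DynamoDB capacity-unit definitions, which are not stated in these notes: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one write capacity unit covers one write per second of an item up to 1 KB. The function names are hypothetical.

```python
import math

def read_capacity_units(item_kb: float, reads_per_s: int, strongly_consistent: bool = False) -> int:
    """Estimate RCUs: item size is rounded up to 4 KB blocks; eventual consistency costs half."""
    units = math.ceil(item_kb / 4) * reads_per_s
    return units if strongly_consistent else math.ceil(units / 2)

def write_capacity_units(item_kb: float, writes_per_s: int) -> int:
    """Estimate WCUs: item size is rounded up to 1 KB blocks."""
    return math.ceil(item_kb) * writes_per_s

print(read_capacity_units(6, 100, strongly_consistent=True))  # 200
print(read_capacity_units(6, 100))                            # 100
print(write_capacity_units(1.5, 10))                          # 20
```

The halved figure for the default read mode is one practical reason eventual consistency is the default.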

Secondary index

Global Secondary index 
    -> Partition key and sort key can be different from those on the table
    -> Use when you want a fast query on attributes outside the primary key without having to do a table scan (reading everything sequentially)
  
Local Secondary index 
    -> Same partition key as the table but different sort key
    -> Use when you already know the partition key and want to quickly query on some other attributes

Max 5 local and 5 global secondary indexes
Max 20 attributes across all indexes
Indexes take up storage space

Example:
Suppose we created a global secondary index on customerNum; we could then query by this field very quickly.

If you need to:
-> access just a few attributes the fastest way possible: consider projecting just those few attributes in a global secondary index. Benefit: lowest possible latency access for non-key items.
-> frequently access some non-key attributes: consider projecting those attributes in a global secondary index. Benefit: lowest possible latency access for non-key items.
-> frequently access most non-key attributes: consider projecting those attributes or even the entire table in a global secondary index. Benefit: maximum flexibility.
-> rarely query but write or update frequently: consider projecting keys only for the global secondary index. Benefit: very fast writes or updates for non-partition-key items.
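
The difference between a table scan and an index lookup on customerNum can be imitated with plain Python data structures. This is only an in-memory analogy, not the DynamoDB API, and the sample items and attribute names other than customerNum are invented.

```python
from collections import defaultdict

# Hypothetical table items; orderId plays the role of the table's primary key.
ORDERS = [
    {"orderId": "o-1", "customerNum": "c-7", "total": 30},
    {"orderId": "o-2", "customerNum": "c-9", "total": 12},
    {"orderId": "o-3", "customerNum": "c-7", "total": 55},
]

def scan_by_customer(items, customer_num):
    """Table scan: read every item and keep the matches."""
    return [item for item in items if item["customerNum"] == customer_num]

def build_index(items, key):
    """Secondary-index analogy: group items by another attribute once, then look up directly."""
    index = defaultdict(list)
    for item in items:
        index[item[key]].append(item)
    return index

by_customer = build_index(ORDERS, "customerNum")
print([o["orderId"] for o in by_customer["c-7"]])  # ['o-1', 'o-3']
```

The index costs extra storage and must be updated on every write, which mirrors the trade-offs listed above.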

Redshift:


Fully managed, clustered, petabyte-scale data warehouse
Extremely cost-effective compared to some on-premises data warehouse platforms
PostgreSQL compatible, with JDBC and ODBC drivers available; compatible with most BI tools out of the box
Features parallel processing and a columnar data store, which are optimised for complex queries
Option to query directly from data files on S3 via Redshift Spectrum

Multitenancy on Amazon Redshift
Amazon Redshift also places some limits on the constructs that you can create within each cluster. Consider the following limits:
* 60 databases per cluster
* 256 schemas per database
* 500 concurrent connections per database
* 50 concurrent queries
* Access to a cluster enables access to all databases in the cluster

Datalake:
Query raw data without extensive pre-processing
Lessen time from data collection to data value
Identify correlations between disparate data sets

Neptune:
Fully managed graph database
Supports open graph APIs for both Gremlin and SPARQL



ElastiCache


Fully managed implementation of two popular in-memory data stores: Redis and Memcached
Push-button scalability for memory, writes, and reads
In-memory key/value store; not persistent in the traditional sense
Billed by node size and hours of use

Web session store - with load-balanced web servers, store web session information in Redis so that if a server is lost, the session info is not lost and another web server can pick it up.

Database caching - use Memcached in front of AWS RDS to cache popular queries, offloading work from RDS and returning results faster to users.

Leaderboards - use Redis to provide a live leaderboard for millions of users of your mobile app.

Streaming data dashboards - provide a landing spot for streaming sensor data on the factory floor, providing live real-time dashboard displays.
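
The database caching use case is often implemented as a "cache-aside" pattern. Here is a minimal sketch in which a plain dict stands in for Memcached and `fake_rds_query` is a hypothetical stand-in for a real database call; a real cache would also need expiry and invalidation.

```python
def cached_query(cache: dict, sql: str, run_query):
    """Cache-aside: return a cached result if present, else query and store it."""
    if sql in cache:
        return cache[sql]        # cache hit: the database is not touched
    result = run_query(sql)      # cache miss: fall through to the database
    cache[sql] = result
    return result

calls = []

def fake_rds_query(sql):
    """Hypothetical stand-in for a real RDS query; records each call."""
    calls.append(sql)
    return [("widget", 42)]

cache = {}
first = cached_query(cache, "SELECT * FROM popular", fake_rds_query)
second = cached_query(cache, "SELECT * FROM popular", fake_rds_query)
print(len(calls))  # 1: the second request was served from the cache
```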

Memcached: if you need a simple cache, for example for database query results

Redis:
If you need encryption, HIPAA compliance, or clustering support
High availability
Pub/sub capability
Geospatial indexing
Backup and restore

Alternatives to ElastiCache:

  • Amazon CloudFront content delivery network (CDN): this approach is used to cache web pages, image assets, videos, and other static data at the edge, as close to end users as possible
  • Amazon RDS Read Replicas: some database engines, such as MySQL, support the ability to attach asynchronous read replicas.
  • On-host caching: a simplistic approach to caching is to store data on each Amazon EC2 application instance, so that it's local to the server for fast lookup


When deciding between Memcached and Redis, here are a few questions to consider:
  • Is object caching your primary goal, for example to offload your database? If so, use Memcached.
  • Are you interested in as simple a caching model as possible? If so, use Memcached.
  • Are you planning on running large cache nodes, and require multithreaded performance with utilization of multiple cores? If so, use Memcached.
  • Do you want the ability to scale your cache horizontally as you grow? If so, use Memcached.
  • Does your app need to atomically increment or decrement counters? If so, use either Redis or Memcached.
  • Are you looking for more advanced data types, such as lists, hashes, and sets? If so, use Redis.
  • Does sorting and ranking datasets in memory help you, such as with leaderboards? If so, use Redis.
  • Are publish and subscribe (pub/sub) capabilities of use to your application? If so, use Redis.
  • Is persistence of your key store important? If so, use Redis.
  • Do you want to run in multiple AWS Availability Zones (Multi-AZ) with failover? If so, use Redis.
