[AWS] Amazon Elastic Block Store (EBS)

Give me a lever long enough and a fulcrum on which to place it, and I shall move the world. ― Archimedes

Amazon Elastic Block Store (Amazon EBS)

Amazon Elastic Block Store (Amazon EBS) provides persistent block storage volumes for use with Amazon EC2 instances in the AWS Cloud. Each Amazon EBS volume is automatically replicated within its Availability Zone to protect you from component failure, offering high availability and durability. Amazon EBS volumes offer the consistent and low-latency performance needed to run your workloads. With Amazon EBS, you can scale your usage up or down within minutes – all while paying a low price for only what you provision.

Amazon EBS is designed for application workloads that benefit from fine tuning for performance, cost and capacity. Typical use cases include Big Data analytics engines (like the Hadoop/HDFS ecosystem and Amazon EMR clusters), relational and NoSQL databases (like Microsoft SQL Server and MySQL or Cassandra and MongoDB), stream and log processing applications (like Kafka and Splunk), and data warehousing applications (like Vertica and Teradata).

Amazon EBS volumes are available in a variety of types that differ in performance characteristics and price. Multiple Amazon EBS volumes can be attached to a single Amazon EC2 instance, although a volume can only be attached to a single instance at a time.

Benefits

Reliable, secure storage

Each Amazon EBS volume provides redundancies within its Availability Zone to protect against failures. Encryption and access control policies deliver a strong defense-in-depth security strategy for your data.

Quickly Scale Up, Easily Scale Down

Amazon EBS allows you to optimize your volumes for capacity, performance, or cost, so you can adapt dynamically to the changing needs of your business.

Consistent, Low-latency Performance

Amazon EBS General Purpose (SSD) volumes and Amazon EBS Provisioned IOPS (SSD) volumes deliver low latency through SSD technology, along with consistent I/O performance scaled to the needs of your application.

Geographic Flexibility

Amazon EBS lets you copy snapshots across AWS regions. This enables geographical expansion, data center migration, and disaster recovery, giving your business both flexibility and protection.

Backup, Restore, Innovate

Protect your data by taking point-in-time snapshots of your Amazon EBS volumes, which provide long-term durability for your data. Boost the agility of your business by using Amazon EBS snapshots to create new EC2 instances.

Optimized Performance

An Amazon EBS–optimized instance provides dedicated network capacity for Amazon EBS volumes. This provides the best performance for your EBS volumes by minimizing network contention between EBS and your instance.

Use cases

Relational database

Amazon EBS scales with your performance needs, whether you are supporting millions of gaming customers or billions of e-commerce transactions. Databases such as Oracle, Microsoft SQL Server, MySQL and PostgreSQL are widely deployed on Amazon EBS.

NoSQL databases

Amazon EBS volumes provide the consistent and low-latency performance your application needs when running NoSQL databases.

Enterprise applications

Amazon EBS meets the diverse needs of your organization by providing reliable block storage to run mission-critical applications such as Oracle, SAP, Microsoft Exchange and Microsoft SharePoint.

Business continuity

Minimize data loss and recovery time by regularly backing up your data and log files across different geographic regions. Copy Amazon Machine Images (AMIs) and EBS Snapshots to launch applications in new AWS regions.

Development and test

Amazon EBS enables your organization to be more agile and responsive to customer needs. Provision, duplicate, scale, or archive your development, test and production environments with a few clicks.

Types of Amazon EBS Volumes

Amazon EBS provides the following volume types, which differ in performance characteristics and price, so that you can tailor your storage performance and cost to the needs of your applications. The volume types fall into two categories:

SSD-backed volumes optimized for transactional workloads involving frequent read/write operations with small I/O size, where the dominant performance attribute is IOPS

HDD-backed volumes optimized for large streaming workloads where throughput (measured in MiB/s) is a better performance measure than IOPS

The following table describes the use cases and performance characteristics for each volume type.

| Volume Type | General Purpose SSD (gp2)* | Provisioned IOPS SSD (io1) | Throughput Optimized HDD (st1) | Cold HDD (sc1) |
| --- | --- | --- | --- | --- |
| Drive Category | Solid-State Drives (SSD) | Solid-State Drives (SSD) | Hard Disk Drives (HDD) | Hard Disk Drives (HDD) |
| Description | General purpose SSD volume that balances price and performance for a wide variety of workloads | Highest-performance SSD volume for mission-critical low-latency or high-throughput workloads | Low-cost HDD volume designed for frequently accessed, throughput-intensive workloads | Lowest cost HDD volume designed for less frequently accessed workloads |
| API Name | gp2 | io1 | st1 | sc1 |
| Volume Size | 1 GiB - 16 TiB | 4 GiB - 16 TiB | 500 GiB - 16 TiB | 500 GiB - 16 TiB |
| Max. IOPS**/Volume | 16,000*** | 64,000**** | 500 | 250 |
| Max. Throughput/Volume | 250 MiB/s*** | 1,000 MiB/s† | 500 MiB/s | 250 MiB/s |
| Max. IOPS/Instance†† | 80,000 | 80,000 | 80,000 | 80,000 |
| Max. Throughput/Instance†† | 1,750 MiB/s | 1,750 MiB/s | 1,750 MiB/s | 1,750 MiB/s |
| Dominant Performance Attribute | IOPS | IOPS | MiB/s | MiB/s |

Use Cases

General Purpose SSD (gp2)

  • Recommended for most workloads
  • System boot volumes
  • Virtual desktops
  • Low-latency interactive apps
  • Development and test environments

Provisioned IOPS SSD (io1)

  • Critical business applications that require sustained IOPS performance, or more than 16,000 IOPS or 250 MiB/s of throughput per volume
  • Large database workloads, such as:
    • MongoDB
    • Cassandra
    • Microsoft SQL Server
    • MySQL
    • PostgreSQL
    • Oracle

Throughput Optimized HDD (st1)

  • Streaming workloads requiring consistent, fast throughput at a low price
  • Big data
  • Data warehouses
  • Log processing
  • Cannot be a boot volume

Cold HDD (sc1)

  • Throughput-oriented storage for large volumes of data that is infrequently accessed
  • Scenarios where the lowest storage cost is important
  • Cannot be a boot volume

\* The default volume type for EBS volumes created from the console is gp2. Volumes created using the CreateVolume API without a volume-type argument default to either gp2 or standard, depending on the region.

** gp2/io1 based on 16 KiB I/O size, st1/sc1 based on 1 MiB I/O size

*** General Purpose SSD (gp2) volumes have a throughput limit between 128 MiB/s and 250 MiB/s depending on volume size. Volumes greater than 170 GiB and below 334 GiB deliver a maximum throughput of 250 MiB/s if burst credits are available. Volumes with 334 GiB and above deliver 250 MiB/s irrespective of burst credits. An older gp2 volume may not see full performance unless a ModifyVolume action is performed on it. For more information, see Modifying the Size, IOPS, or Type of an EBS Volume on Linux.

**** Maximum IOPS of 64,000 is guaranteed only on Nitro-based instances. Other instance families guarantee performance up to 32,000 IOPS.

† Maximum throughput of 1,000 MiB/s is guaranteed only on Nitro-based instances. Other instance families guarantee up to 500 MiB/s. An older io1 volume may not see full performance unless a ModifyVolume action is performed on it. For more information, see Modifying the Size, IOPS, or Type of an EBS Volume on Linux.

†† To achieve this throughput, you must have an instance that supports it. For more information, see [Amazon EBS–Optimized Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html).
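The per-volume limits in the table and footnotes above can be captured in a small lookup, which is handy for sanity-checking a volume request before it is sent to AWS. The Python sketch below is illustrative only: `VOLUME_LIMITS` and `check_volume_request` are hypothetical names, not part of any AWS SDK, and the numbers are copied from the table as published here, so they may not match current AWS limits.

```python
# Per-volume limits copied from the table above (sizes in GiB; 16 TiB = 16384 GiB).
VOLUME_LIMITS = {
    "gp2": {"min_gib": 1, "max_gib": 16384, "max_iops": 16000, "bootable": True},
    "io1": {"min_gib": 4, "max_gib": 16384, "max_iops": 64000, "bootable": True},
    "st1": {"min_gib": 500, "max_gib": 16384, "max_iops": 500, "bootable": False},
    "sc1": {"min_gib": 500, "max_gib": 16384, "max_iops": 250, "bootable": False},
}

# Footnote ****: io1 reaches 64,000 IOPS only on Nitro-based instances.
IO1_MAX_IOPS_NON_NITRO = 32000


def check_volume_request(volume_type, size_gib, iops=None, nitro=True, boot=False):
    """Return a list of problems; an empty list means the request fits the table."""
    limits = VOLUME_LIMITS.get(volume_type)
    if limits is None:
        return [f"unknown volume type: {volume_type}"]

    problems = []
    if not limits["min_gib"] <= size_gib <= limits["max_gib"]:
        problems.append(
            f"{volume_type} size must be {limits['min_gib']}-{limits['max_gib']} GiB, "
            f"got {size_gib}"
        )
    if boot and not limits["bootable"]:
        problems.append(f"{volume_type} cannot be a boot volume")

    max_iops = limits["max_iops"]
    if volume_type == "io1" and not nitro:
        max_iops = IO1_MAX_IOPS_NON_NITRO
    if iops is not None and iops > max_iops:
        problems.append(f"{volume_type} supports at most {max_iops} IOPS, got {iops}")
    return problems
```

For example, `check_volume_request("st1", 100, boot=True)` reports two problems (the volume is too small and st1 cannot boot), while `check_volume_request("io1", 1000, iops=50000)` passes on a Nitro-based instance and fails with `nitro=False`.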

Protecting Data

Several practices and services help protect your data over the lifecycle of an Amazon EBS volume.

Backup/Recovery (Snapshots)

You can back up the data on your Amazon EBS volumes, regardless of volume type, by taking point-in-time snapshots. Snapshots are incremental backups, which means that only the blocks on the device that have changed since your most recent snapshot are saved.
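To make "incremental" concrete, here is a toy model in Python. It is only a sketch of the idea: the block dictionaries and the `take_snapshot` helper are invented for illustration and say nothing about how EBS tracks changed blocks internally. Deleted blocks are not modelled.

```python
def take_snapshot(volume_blocks, previous_state):
    """Return (snapshot, new_state).

    volume_blocks:  dict of block index -> bytes currently on the volume
    previous_state: dict of block index -> bytes as of the last snapshot
    The snapshot holds only blocks that are new or changed since then.
    """
    snapshot = {
        index: data
        for index, data in volume_blocks.items()
        if previous_state.get(index) != data
    }
    return snapshot, dict(volume_blocks)


volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
first, state = take_snapshot(volume, {})      # first snapshot: every block is new
volume[1] = b"data-v2"                        # one block changes afterwards
second, state = take_snapshot(volume, state)  # second snapshot: only that block

print(len(first), len(second))  # prints "3 1"
```

The second snapshot stores a single block, which is why a long chain of snapshots usually costs far less than the same number of full copies.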

Taking Snapshots

You can take snapshots in several ways, for example through the AWS Management Console, the AWS CLI, or the API.

Data for the snapshot is stored using Amazon S3 technology. The action of taking a snapshot is free. You pay only the storage costs for the snapshot data.

When you request a snapshot, the point-in-time snapshot is created immediately and the volume may continue to be used, but the snapshot may remain in pending status until all the modified blocks have been transferred to Amazon S3.
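The pending-then-completed behavior can be handled with a simple polling loop. The sketch below assumes an object that behaves like a boto3 EC2 client (for example `boto3.client("ec2")`) and uses its `create_snapshot` and `describe_snapshots` calls; the function name, the polling interval, and the retry count are arbitrary choices made for this example.

```python
import time


def create_snapshot_and_wait(ec2, volume_id, description="", poll_seconds=15, max_polls=240):
    """Start a snapshot and poll until it leaves the 'pending' state.

    `ec2` is expected to behave like a boto3 EC2 client. It is passed in
    rather than created here so the function can be exercised with a stub.
    """
    response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    snapshot_id = response["SnapshotId"]

    for _ in range(max_polls):
        snapshot = ec2.describe_snapshots(SnapshotIds=[snapshot_id])["Snapshots"][0]
        if snapshot["State"] == "completed":
            return snapshot_id
        if snapshot["State"] == "error":
            raise RuntimeError(f"snapshot {snapshot_id} failed")
        time.sleep(poll_seconds)  # still pending: blocks are being transferred

    raise TimeoutError(f"snapshot {snapshot_id} still pending after {max_polls} polls")
```

The volume stays usable while this loop runs; the wait only matters for steps that need the finished snapshot, such as copying it or creating a volume from it.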

It’s important to know that while snapshots are stored using Amazon S3 technology, they are stored in AWS-controlled storage and not in your account’s Amazon S3 buckets. This means you cannot manipulate them like other Amazon S3 objects. Rather, you must use the Amazon EBS snapshot features to manage them. Snapshots are constrained to the region in which they are created, meaning you can use them to create new volumes only in the same region. If you need to restore a snapshot in a different region, you can copy a snapshot to another region.
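A cross-region copy might look like the following sketch. It assumes a boto3-style EC2 client that is bound to the destination region (for example `boto3.client("ec2", region_name="eu-west-1")`), because the copy is requested in the destination region and names the source region as a parameter. The helper name is made up for this example.

```python
def copy_snapshot_to_region(dest_ec2, source_region, snapshot_id, description=""):
    """Copy a snapshot into the region that `dest_ec2` is bound to.

    `dest_ec2` is expected to behave like a boto3 EC2 client created for
    the destination region. Returns the ID of the new snapshot copy.
    """
    response = dest_ec2.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description=description,
    )
    return response["SnapshotId"]
```

The copy gets its own snapshot ID in the destination region, and new volumes in that region are created from the copy rather than from the original.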

Creating a Volume from a Snapshot

To use a snapshot, you create a new Amazon EBS volume from the snapshot. When you do this, the volume is created immediately but the data is loaded lazily. This means that the volume can be accessed upon creation, and if the data being requested has not yet been restored, it will be restored upon first request. Because of this, it is a best practice to initialize a volume created from a snapshot by accessing all the blocks in the volume.
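Initialization amounts to reading every block once and throwing the data away. The Python sketch below shows the idea under a few assumptions: the device path (something like `/dev/xvdf`) depends on your instance, reading a raw block device normally requires root, and in practice a dedicated tool such as `dd` or `fio` is the usual choice. The function name is hypothetical.

```python
def initialize_volume(device_path, chunk_size=1024 * 1024):
    """Read the whole device once, discarding the data; return bytes read.

    Touching every block forces blocks that have not been loaded yet to be
    pulled from the snapshot, so later reads do not pay that cost.
    """
    total = 0
    # buffering=0 avoids an extra layer of Python-side buffering.
    with open(device_path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:  # empty read means end of device
                break
            total += len(chunk)
    return total
```

On a large volume this takes a while, so it is typically run once, right after the volume is attached and before the workload starts.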

Snapshots can also be used to increase the size of an Amazon EBS volume. To increase the size of an Amazon EBS volume, take a snapshot of the volume, then create a new volume of the desired size from the snapshot. Replace the original volume with the new volume.
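The snapshot-and-recreate steps can be sketched as a single function. It assumes a boto3-style EC2 client and uses its `describe_volumes`, `create_snapshot`, `create_volume`, and `snapshot_completed` waiter; the function name is invented, and detaching the old volume, attaching the new one, and extending the file system are deliberately left out.

```python
def grow_volume_via_snapshot(ec2, volume_id, new_size_gib):
    """Snapshot `volume_id`, create a larger volume from it, return the new volume ID."""
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    if new_size_gib <= volume["Size"]:
        raise ValueError("new size must be larger than the current size")

    snapshot_id = ec2.create_snapshot(
        VolumeId=volume_id, Description=f"resize of {volume_id}"
    )["SnapshotId"]
    # Block until the snapshot reaches the 'completed' state.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    params = {
        "SnapshotId": snapshot_id,
        "Size": new_size_gib,
        # Keep the new volume in the same Availability Zone as the old one
        # so it can be attached to the same instance.
        "AvailabilityZone": volume["AvailabilityZone"],
        "VolumeType": volume["VolumeType"],
    }
    if volume["VolumeType"] == "io1":
        params["Iops"] = volume["Iops"]  # io1 needs an explicit IOPS value
    return ec2.create_volume(**params)["VolumeId"]
```

The extra space is not visible to the operating system until the file system on the new volume has been extended.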

Recovering Volumes

Because Amazon EBS volumes persist beyond the lifetime of an instance, it is possible to recover data if an instance fails. If an Amazon EBS-backed instance fails and there is data on the boot drive, it is relatively straightforward to detach the volume from the instance. Unless the DeleteOnTermination flag for the volume has been set to false, the volume should be detached before the instance is terminated. The volume can then be attached as a data volume to another instance and the data read and recovered.
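Before terminating a failed instance, it helps to know which of its volumes would be deleted along with it. The sketch below scans a response shaped like the output of the EC2 `DescribeInstances` call (as returned by boto3's `describe_instances`) and lists the volumes whose `DeleteOnTermination` flag is true. The helper name is hypothetical, and the function only reads the dictionary; it makes no AWS calls.

```python
def volumes_deleted_on_termination(describe_instances_response):
    """Return (instance_id, device_name, volume_id) tuples for every attached
    EBS volume that would be deleted when its instance is terminated."""
    at_risk = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            for mapping in instance.get("BlockDeviceMappings", []):
                ebs = mapping.get("Ebs", {})
                if ebs.get("DeleteOnTermination"):
                    at_risk.append(
                        (instance["InstanceId"], mapping["DeviceName"], ebs["VolumeId"])
                    )
    return at_risk
```

Any volume this reports should be detached, or have its flag set to false, before the instance is terminated.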

Encryption Options

Many workloads have requirements that data be encrypted at rest, either because of compliance regulations or internal corporate standards. Amazon EBS offers native encryption on all volume types.

When you create an encrypted Amazon EBS volume, Amazon EBS uses the AWS Key Management Service (KMS) to handle key management. A new master key is created unless you select a master key that you created separately in the service. Your data and associated keys are encrypted using the industry-standard AES-256 algorithm. The encryption occurs on the servers that host Amazon EC2 instances, so the data is encrypted both in transit between the host and the storage media and at rest on the media. (Consult the AWS documentation for a list of instance types that support Amazon EBS encryption.) Encryption is transparent: you access data the same way as on unencrypted volumes, and you can expect the same IOPS performance with a minimal effect on latency. Snapshots taken from encrypted volumes are automatically encrypted, as are volumes created from encrypted snapshots.
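Requesting encryption is a matter of two parameters on the CreateVolume call. The sketch below only builds the keyword arguments for a boto3-style `create_volume` call and does not contact AWS; the helper name and the default of `gp2` are choices made for this example.

```python
def encrypted_volume_params(availability_zone, size_gib, volume_type="gp2", kms_key_id=None):
    """Build keyword arguments for an encrypted CreateVolume request."""
    params = {
        "AvailabilityZone": availability_zone,
        "Size": size_gib,
        "VolumeType": volume_type,
        "Encrypted": True,
    }
    # Without KmsKeyId, the default key for EBS encryption is used.
    if kms_key_id is not None:
        params["KmsKeyId"] = kms_key_id
    return params
```

With a real client, the result would be passed along as `boto3.client("ec2").create_volume(**params)`.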