Amazon S3 bucket
What is an Amazon S3 bucket?
An Amazon S3 bucket is a public cloud storage resource available in Amazon Web Services (AWS) Simple Storage Service (S3) platform. It provides object-based storage, where data is stored inside S3 buckets in distinct units called objects instead of files.
Amazon S3 buckets are similar to file folders and can be used to store, retrieve, back up and access objects. Each object has three main components -- the object's content or data, a unique identifier for the object and the descriptive metadata, including the object's name, URL and size.
An object must exist within a bucket, as it can't exist alone. Each Amazon account could have hundreds of buckets, each containing numerous objects.
What are S3 buckets used for?
Amazon's Simple Storage Service buckets are mainly used to help individuals and enterprises meet their data storage, backup and delivery needs in the cloud.
An infinite amount of data can be stored and protected using Amazon S3 buckets for a variety of use cases:
- Data lakes.
- Dynamic websites.
- Mobile applications.
- Backup and restore operations.
- Big data analytics.
- User-generated content.
- Storage archives.
- Enterprise applications.
- IoT devices.
How to use an S3 bucket
An S3 user first creates a bucket in the AWS region of their choice and gives it a globally unique bucket name. It's crucial to know that Amazon S3 buckets are globally unique, which means that the bucket names of any two AWS accounts in the same region can't be the same. AWS recommends that users choose regions geographically close to them to reduce latency and storage costs.
Once the bucket is created, the user selects a tier for the data. Different S3 tiers have different levels of redundancy, pricing and accessibility. One bucket can store objects from different S3 storage tiers.
Next the user specifies access privileges for the objects stored in a bucket using the AWS Identity and Access Management service, bucket policies or access control lists (ACLs).
An AWS user can interact with an Amazon S3 bucket via the AWS Management Console, AWS Command Line Interface or application programming interfaces (APIs). The S3 access points that have the Amazon Resource Names and the bucket hostname can be used to access objects inside a bucket.
S3 bucket features
Amazon S3 offers numerous features to manage and organize data and support particular use cases.
Commonly used features that can be enabled for S3 buckets include the following:
- Versioning control. Versioning control can preserve every version of an object when a user performs an operation, such as copy or delete. This helps prevent the object from being accidently deleted. Multi-factor authentication can also be enabled on an S3 bucket to prevent accidental deletions.
- Object ownership. This bucket-level setting can be used to disable ACLs and to take ownership of every object inside a bucket, streamlining access management for data stored in Amazon S3. By default, when a user uploads an object to another Amazon account's S3 bucket, that account -- the object writer -- automatically becomes the owner of the object, has access to it and can grant other users access to it through ACLs. However, this default behavior can be modified by the bucket owner by using the Object Ownership feature.
- Object replication. The Amazon S3 Replication feature can replicate objects between buckets. Amazon S3 can be configured to automatically replicate S3 bucket objects across different AWS regions using S3 Cross-Region Replication. For buckets that need to be replicated within the same AWS region, Same-Region Replication is used.
- Transfer Acceleration. This feature helps execute fast, secure transfers from a client to an S3 bucket via AWS edge locations.
- Block Public Access. A set of security controls called S3 Block Public Access ensures that the general public can't access S3 buckets and objects. These settings can be easily applied to all buckets in the AWS account or specific S3 buckets with just a few clicks in the Amazon S3 Management Console.
- Audit logs. A user can configure an Amazon S3 bucket to capture all access log entries made to it. These server access logs are great for auditing, as they keep track of every request made against a bucket or the objects it contains.
- Object tagging. Users can restrict and manage access to S3 objects using the Amazon S3 Tagging feature. These tags are key-value pairs that can be added to, changed or removed from S3 objects at any point in their lifespan. They enable the creation of identity and access management (IAM) policies, the configuration of S3 lifecycle policies and the customization of storage metrics.
Bucket configurations
Amazon S3 supports a variety of configuration options for buckets. For example, a user can configure their bucket for website hosting, add a configuration to control the object lifecycle in the bucket or configure the bucket to record all accesses to it.
Amazon S3 supports sub-resources so users can store and manage bucket configuration data. Users can also create and manage these sub-resources using the Amazon S3 API. However, they can also use the AWS console or the Amazon SDKs for this purpose.
When setting up S3 buckets, the bucket owner can also create object-level configurations. For example, by setting up an ACL that's unique to an object, the owner of the bucket can specify object-level permissions.
Bucket permission options
By default, only the owner of the bucket can access the buckets and resources inside them. However, a bucket owner can grant cross-account permissions to another AWS account or users in another account to upload objects.
For objects stored inside a bucket, access privileges are typically granted through the following permission options:
- Bucket policies. The bucket owner can use a bucket policy to grant permissions to the bucket and any objects inside the bucket that belong to the owner. A bucket policy can be easily created using the AWS Policy Generator.
- AWS Identity and Access Management service. The AWS IAM web service lets users securely manage who has access to their Amazon S3 buckets and other AWS resources. Several users can be created under the same Amazon account using AWS IAM and user policies can be linked to these accounts to control S3 object access permissions.
- ACLs. In addition to bucket policies and IAM-based user policies, ACLs can be used to limit access to objects in an S3 bucket. Both S3 buckets and objects have ACLs that can be used to grant access to S3 objects.
S3 pros and cons
The Amazon S3 service offers stable and scalable storage choices. However, it also comes with a few drawbacks:
Pros
- High availability. The availability of a service is determined by how easily and readily it can be used. To guarantee high availability, AWS availability zones or regions are spread across several countries across the globe.
- Limitless server capacity. Amazon S3 provides unlimited server capacity, letting users store data without having to worry about hard drive failures or other service interruptions.
- Ease of use. Amazon S3 cloud storage is extremely user friendly and comes with an intuitive interface. It's specifically designed for fast, secure access and comes with a wealth of documentation, videos and information to help users get started with the service even if they don't have prior experience using cloud services.
- Durability. The probability of data loss is measured by durability. All of Amazon's services, including Amazon S3, are extremely durable. The durability of the S3 Standard Tier is 99.999999999%, which essentially means that if 100 billion objects are stored in S3, at most, only one object would be lost. For example, the data in S3 buckets is preserved even if two data centers fail simultaneously.
- Security. S3 provides many security and data protection options. Virtual private cloud endpoints enable users to connect to their S3 resources from their Amazon Virtual Private Cloud (Amazon VPC). Automatic encryption of data is also provided as soon as the data uploading process is finished. Various other security options are also offered, such as IAM, which enables only a certain person to access information. Also, bucket owners can monitor who is accessing their data and obtain information such as the location of access, time and device accessing the platform.
- Different storage classes. For diverse demands and requirements, AWS offers S3 Intelligent-Tiering in a variety of S3 storage classes, including Standard for frequent usage, Infrequent Access storage for infrequent use and Glacier, which is Amazon's long-term storage platform.
- Horizontal scaling. Amazon S3 scales horizontally, distributing data across multiple servers to handle massive volumes of data and requests. Horizontal scaling enables concurrent connections and auto-partitioning to boost its capacity and performance.
Cons
- No directories. No concept of directories and directory-like structures exist in S3. The "/" character is only a component of the key name or the object name.
- Data retrieval. Since an S3 bucket isn't a local disk, data retrieval requires sending queries via the internet or the internal network of AWS. Due to the nature of the internet, this can occasionally result in delays and possible request failures.
- Object store. S3 isn't a file system but just an object store. Therefore it can never be treated as a Portable Operating System Interface file system.
- Latency and availability. S3 isn't a real-time storage service even though it's designed for high durability and availability. When accessing S3 data, users could encounter some latency and downtime, especially when there's a spike in demand or an infrastructure problem.
- Limited data management. S3 lacks built-in data management tools such as version control, backup and recovery, and data compression because it's essentially a data storage service. To meet these objectives, users can create these functionalities themselves or use other AWS services built for these specific uses.
- Complex billing. For small business owners who aren't particularly tech-savvy, billing for most Amazon services, including S3, can be complex and difficult to understand. However, most consumers eliminate this problem by working with resellers, who make the billing procedure more intuitive and easier for the end-user to comprehend.
S3 bucket limitation and pricing
Although there's no limit to the number of objects that can be stored in a bucket, buckets can't exist inside of other buckets.
S3 performance remains the same regardless of how many buckets a user creates. Each AWS account can create 100 buckets, and users can request a service limit increase to obtain more. The AWS account that creates a bucket owns it, and ownership isn't transferable. An S3 user can delete a bucket, but another AWS user can claim that globally unique name.
AWS charges users for storing objects in a bucket and for transferring objects in and out of buckets. Amazon S3 bucket pricing varies by region.
S3 bucket alternatives and competitors
Although AWS S3 offers many exclusive options, there are several vendors that offer alternatives to S3 buckets and storage options:
- DigitalOcean Space. This object storage service features a built-in S3-compliant content delivery network for simple scaling. The service is $5 per month for 250 GB of storage, 1 TB of outgoing transfer and unlimited uploads. It provides a drag-and-drop interface for uploading files, while all data transfers are automatically secured by a Secure Sockets Layer certificate.
- Google Cloud Storage (GSC). One of the largest storage providers, GSC serves more than 140 facilities and more than 20 regions across the globe. It offers four billing options depending on the frequency of data usage: Standard, Nearline, Coldline and Archive. As part of the Google Cloud Free Tier, GSC offers a monthly limit of 5 GB storage, 5,000 Class A operations, 50,000 Class B operations and 1 GB egress. Other pricing varies by region.
- Wasabi Hot Cloud Storage. Wasabi currently offers services in Asia Pacific, Europe and North America. Monthly costs start at $0.0059 per GB or $5.99 per TB. Wasabi doesn't charge for egress or API requests.
- Blackblaze B2. Backblaze has five datacenters: one in Europe and four in the U.S. Download is free for cloud replication of files across regions. B2 is a pay-for-usage service, so users are only charged for the amount of data and storage used. The first 1 GB of data downloaded daily is free and then $0.01 per GB, per month. The first 10 GB of storage is free and is then $0.005 per GB, per month.
The thin distinction between cloud storage and cloud backup can be confusing. Find out how these two cloud choices differ from one another and which option is best for specific use cases.