Overview
Amazon S3 stores data as objects within buckets. An object consists of a file and optional metadata that describes it. To store an object, you upload the file to a bucket and can set permissions for both the object and its metadata.
Buckets serve as containers for objects. You can create and manage multiple buckets, control access permissions (who can create, delete, and list objects), view access logs, and select the geographic region for storage to minimize latency or meet compliance requirements.
Getting Started
Install the latest version of Boto3 using pip:
pip install boto3
Import Boto3 and specify the service you want to use:
import boto3
# Connect to Amazon S3
s3 = boto3.resource('s3')
Once you have the s3 resource, you can send requests to the service. The following code iterates through the bucket collection and prints all bucket names:
import boto3
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
You can also upload and download binary data. The example below uploads a new file to S3, assuming the bucket already exists:
# Open the file in binary mode; the with block closes it after the upload
with open('sample_image.jpg', 'rb') as data:
    s3.Bucket('my-bucket').put_object(Key='sample_image.jpg', Body=data)
Working with Amazon SQS
This section demonstrates how to use Boto3 with AWS services. The following examples show how to work with Amazon Simple Queue Service (SQS), which allows you to queue and process messages. You'll learn how to use "resources and collections" to create new queues, retrieve and use existing queues, push messages to queues, and process messages from queues.
Creating a Queue
Queues are created with a name. You can optionally set queue attributes, such as the number of seconds to wait before a new message becomes available for processing. This example creates a queue named test with a 5-second delay. Before creating a queue, you must first obtain the SQS service resource:
import boto3
# Obtain the service resource
sqs = boto3.resource('sqs')
# Create the queue. Returns an SQS.Queue instance
queue = sqs.create_queue(QueueName='test', Attributes={'DelaySeconds': '5'})
# Access the identifier and attributes
print(queue.url)
print(queue.attributes.get('DelaySeconds'))
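Queue attribute values are passed as strings, even when they represent numbers. The following is a minimal sketch of a helper that builds the Attributes dictionary from integers and checks the ranges SQS accepts for these two attributes. The name build_queue_attributes and its default values are hypothetical and not part of Boto3.

```python
def build_queue_attributes(delay_seconds=0, visibility_timeout=30):
    """Return an Attributes dict with string values for create_queue.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # SQS allows a delivery delay of 0 to 900 seconds (15 minutes)
    if not 0 <= delay_seconds <= 900:
        raise ValueError('DelaySeconds must be between 0 and 900')
    # SQS allows a visibility timeout of 0 to 43200 seconds (12 hours)
    if not 0 <= visibility_timeout <= 43200:
        raise ValueError('VisibilityTimeout must be between 0 and 43200')
    return {
        'DelaySeconds': str(delay_seconds),
        'VisibilityTimeout': str(visibility_timeout),
    }
```

The result can be passed directly, for example sqs.create_queue(QueueName='test', Attributes=build_queue_attributes(delay_seconds=5)).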
Accessing Existing Queues
Retrieve a queue by its name. If the queue does not exist, an exception is raised:
sqs = boto3.resource('sqs')
# Get the queue by name. Returns an SQS.Queue instance
queue = sqs.get_queue_by_name(QueueName='test')
# Access identifiers and attributes
print(queue.url)
print(queue.attributes.get('DelaySeconds'))
You can also list all existing queues:
for queue in sqs.queues.all():
    print(queue.url)
Sending Messages
Sending a message adds it to the end of the queue:
import boto3
sqs = boto3.resource('sqs')
# Get the queue
queue = sqs.get_queue_by_name(QueueName='test')
# Create and send a new message
response = queue.send_message(MessageBody='world')
# The response contains message ID and MD5 hash
print(response.get('MessageId'))
print(response.get('MD5OfMessageBody'))
You can also create messages with custom attributes:
queue.send_message(MessageBody='boto3', MessageAttributes={
    'Author': {
        'StringValue': 'Daniel',
        'DataType': 'String'
    }
})
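Writing the nested MessageAttributes structure by hand gets repetitive when a message carries several attributes. As a rough sketch, a plain dictionary can be converted into that structure. The helper name to_message_attributes is hypothetical, and the sketch handles only the String and Number data types (SQS also supports Binary, which is omitted here).

```python
def to_message_attributes(values):
    """Convert {'name': value} into the MessageAttributes structure.

    Hypothetical helper: str values become 'String', int and float values
    become 'Number'. Number values are still sent through 'StringValue'.
    """
    attributes = {}
    for name, value in values.items():
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool):
            raise TypeError('bool is ambiguous; pass a str or a number')
        if isinstance(value, (int, float)):
            data_type = 'Number'
        elif isinstance(value, str):
            data_type = 'String'
        else:
            raise TypeError('unsupported attribute type: %r' % type(value))
        attributes[name] = {'StringValue': str(value), 'DataType': data_type}
    return attributes
```

With this helper, the call above could be written as queue.send_message(MessageBody='boto3', MessageAttributes=to_message_attributes({'Author': 'Daniel'})).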
Messages can be sent in batches. The following sends two messages in a single request:
response = queue.send_messages(Entries=[
    {
        'Id': '1',
        'MessageBody': 'world'
    },
    {
        'Id': '2',
        'MessageBody': 'boto3',
        'MessageAttributes': {
            'Author': {
                'StringValue': 'Daniel',
                'DataType': 'String'
            }
        }
    }
])
# Check for any failures
print(response.get('Failed'))
The response contains both successful and failed message lists, allowing you to retry failures if needed.
See also: SQS.Queue.send_message(), SQS.Queue.send_messages()
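The retry idea mentioned above can be sketched as follows. This is one possible approach rather than a prescribed pattern: it splits the bodies into batches of at most 10 entries (the SQS batch limit) and resends only the entries listed under 'Failed'. The names chunked and send_with_retry are hypothetical, and queue is assumed to be any object with a send_messages(Entries=...) method, such as an SQS.Queue. For simplicity the sketch retries every failure, including ones flagged as sender faults that a retry is unlikely to fix.

```python
def chunked(items, size=10):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_with_retry(queue, bodies, max_attempts=3):
    """Send message bodies in batches and retry entries reported as failed.

    Hypothetical helper. Returns the bodies that still failed after
    `max_attempts` tries.
    """
    unsent = []
    for batch in chunked(list(bodies), 10):
        # Map batch-local ids to bodies so failures can be matched up
        pending = {str(index): body for index, body in enumerate(batch)}
        for _ in range(max_attempts):
            if not pending:
                break
            entries = [{'Id': key, 'MessageBody': body}
                       for key, body in pending.items()]
            response = queue.send_messages(Entries=entries)
            failed_ids = {item['Id'] for item in response.get('Failed', [])}
            # Keep only the entries that the service reported as failed
            pending = {key: body for key, body in pending.items()
                       if key in failed_ids}
        unsent.extend(pending.values())
    return unsent
```

A caller might log or persist the returned list so that messages that never succeeded are not silently dropped.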
Processing Messages
Messages are retrieved in batches:
import boto3
sqs = boto3.resource('sqs')
# Get the queue
queue = sqs.get_queue_by_name(QueueName='test')
# Process messages by printing body and optional author name
for message in queue.receive_messages(MessageAttributeNames=['Author']):
    # Retrieve custom author attribute if present
    author_text = ''
    if message.message_attributes is not None:
        # Default to an empty dict so a missing 'Author' attribute is not an error
        author_name = message.message_attributes.get('Author', {}).get('StringValue')
        if author_name:
            author_text = ' ({0})'.format(author_name)
    # Output body and author (if set)
    print('Hello, {0}!{1}'.format(message.body, author_text))
    # Acknowledge message processing
    message.delete()
Output:
Hello, world!
Hello, boto3! (Daniel)
See also: SQS.Queue.receive_messages(), SQS.Message.delete()
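A single receive_messages call returns at most one batch, so a worker usually polls in a loop. The following is a minimal sketch of such a loop under a few assumptions: the function name drain and the handler callback are hypothetical, queue is any object with a receive_messages method (such as an SQS.Queue), and the polling values (up to 10 messages per request, a 5-second wait) are illustrative. A message is deleted only after the handler succeeds, so a failed message stays on the queue and becomes visible again after its visibility timeout.

```python
import logging


def drain(queue, handler, max_empty_polls=1):
    """Poll the queue until `max_empty_polls` consecutive empty batches.

    Hypothetical helper. Returns the number of messages processed and deleted.
    """
    processed = 0
    empty_polls = 0
    while empty_polls < max_empty_polls:
        batch = queue.receive_messages(MaxNumberOfMessages=10,
                                       WaitTimeSeconds=5)
        if not batch:
            empty_polls += 1
            continue
        empty_polls = 0
        for message in batch:
            try:
                handler(message.body)
            except Exception:
                # Do not delete: the message will be redelivered later
                logging.exception('Failed to process message')
                continue
            # Acknowledge only after successful processing
            message.delete()
            processed += 1
    return processed
```

For example, drain(queue, print) would print each message body and remove it from the queue.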
Amazon S3 Examples
Amazon Simple Storage Service (Amazon S3) is an object storage service offering scalability, data availability, security, and performance.
Creating an S3 Bucket
S3 bucket names must be globally unique across all AWS regions. Buckets can be located in specific regions to minimize latency or meet regulatory requirements:
import logging
import boto3
from botocore.exceptions import ClientError
def create_bucket(bucket_name, region=None):
    """Create an S3 bucket in a specified region.

    If no region is specified, the bucket is created in the S3 default
    region (us-east-1).

    :param bucket_name: Bucket to create
    :param region: Region to create bucket in, e.g., 'us-west-2'
    :return: True if bucket created, else False
    """
    try:
        if region is None:
            s3_client = boto3.client('s3')
            s3_client.create_bucket(Bucket=bucket_name)
        else:
            s3_client = boto3.client('s3', region_name=region)
            location = {'LocationConstraint': region}
            s3_client.create_bucket(
                Bucket=bucket_name,
                CreateBucketConfiguration=location
            )
    except ClientError as e:
        logging.error(e)
        return False
    return True
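Because bucket names are shared globally and must follow S3 naming rules, a quick local check before calling create_bucket can catch obvious mistakes without a round trip to the service. The sketch below is a hypothetical helper (looks_like_valid_bucket_name is not part of Boto3) that covers only a subset of the rules: length of 3 to 63 characters, lowercase letters, digits, dots, and hyphens, a letter or digit at each end, no adjacent dots, and no IP-address format. It cannot tell whether a name is already taken.

```python
import re

# 3 to 63 characters; must start and end with a lowercase letter or digit
_BUCKET_NAME = re.compile(r'[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]')
# Names formatted like an IPv4 address (e.g. 192.168.0.1) are not allowed
_IPV4_LIKE = re.compile(r'\d{1,3}(\.\d{1,3}){3}')


def looks_like_valid_bucket_name(name):
    """Return True if `name` passes a subset of the S3 naming rules.

    Hypothetical helper; the service remains the final authority.
    """
    if _BUCKET_NAME.fullmatch(name) is None:
        return False
    if '..' in name:
        return False
    if _IPV4_LIKE.fullmatch(name) is not None:
        return False
    return True
```

A caller could run this check first and skip the create_bucket request when it returns False.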
Listing Existing Buckets
import boto3
s3 = boto3.client('s3')
response = s3.list_buckets()
# Display bucket names
print('Existing buckets:')
for bucket in response['Buckets']:
    print(f'  {bucket["Name"]}')
Uploading Files
The AWS SDK for Python provides two methods for uploading files to S3 buckets.
The upload_file method accepts a filename, bucket name, and object name. It handles large files by splitting them into smaller chunks and uploading them in parallel:
import logging
import boto3
from botocore.exceptions import ClientError
def upload_file(file_name, bucket, object_name=None):
    """Upload a file to an S3 bucket.

    :param file_name: File to upload
    :param bucket: Target bucket
    :param object_name: S3 object name. Defaults to file_name if not specified
    :return: True if uploaded successfully, else False
    """
    # Use the local file name as the object key when none is given
    if object_name is None:
        object_name = file_name

    s3_client = boto3.client('s3')
    try:
        s3_client.upload_file(file_name, bucket, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True
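To get a feel for how a large file is split into chunks, the arithmetic can be sketched as below. The function name estimate_part_count is hypothetical, and the 8 MiB threshold and chunk size are assumptions used for illustration. The real values come from the transfer configuration in use (see boto3.s3.transfer.TransferConfig), so treat the numbers as an example rather than a statement of what the SDK does.

```python
import math

MB = 1024 * 1024


def estimate_part_count(file_size, threshold=8 * MB, chunk_size=8 * MB):
    """Estimate how many parts an upload of `file_size` bytes is split into.

    Hypothetical helper: threshold and chunk_size are assumed values.
    Files below the threshold are treated as a single-part upload.
    """
    if file_size < threshold:
        return 1
    # Round up so the final, smaller chunk is counted as a part
    return math.ceil(file_size / chunk_size)
```

Under these assumptions, a 20 MiB file is sent as three parts: two full 8 MiB chunks and one 4 MiB remainder.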
The upload_fileobj method accepts a readable file-like object. The file object must be opened in binary mode, not text mode:
s3 = boto3.client('s3')
with open("FILE_PATH", "rb") as f:
    s3.upload_fileobj(f, "BUCKET_NAME", "OBJECT_KEY")
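Because upload_fileobj only needs a readable binary file-like object, data that already lives in memory can be uploaded without writing it to disk first. The following is a minimal sketch that wraps bytes in io.BytesIO. The helper name upload_bytes is hypothetical, and s3_client is assumed to be an S3 client such as the one returned by boto3.client('s3').

```python
import io


def upload_bytes(s3_client, data, bucket, key):
    """Upload an in-memory bytes object with upload_fileobj.

    Hypothetical helper; `s3_client` is assumed to expose upload_fileobj.
    """
    # Text must be encoded first, since the stream has to be binary
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError('data must be bytes; encode str values first')
    with io.BytesIO(data) as buffer:
        s3_client.upload_fileobj(buffer, bucket, key)
```

For example, upload_bytes(s3, 'hello'.encode('utf-8'), 'BUCKET_NAME', 'OBJECT_KEY') would store the encoded string as an object.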
ExtraArgs Parameter
Both upload_file and upload_fileobj accept an optional ExtraArgs parameter that can be used for various purposes. Valid settings are documented in boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS.
Setting metadata to attach to the S3 object:
s3.upload_file(
    'FILE_PATH', 'BUCKET_NAME', 'OBJECT_KEY',
    ExtraArgs={'Metadata': {'mykey': 'myvalue'}}
)
Setting an ACL (Access Control List) value of 'public-read' on the S3 object:
s3.upload_file(
    'FILE_PATH', 'BUCKET_NAME', 'OBJECT_KEY',
    ExtraArgs={'ACL': 'public-read'}
)
The ExtraArgs parameter also supports custom or multiple ACL grants:
s3.upload_file(
    'FILE_PATH', 'BUCKET_NAME', 'OBJECT_KEY',
    ExtraArgs={
        'GrantRead': 'uri="http://acs.amazonaws.com/groups/global/AllUsers"',
        'GrantFullControl': 'id="01234567890abcdefg"',
    }
)
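Another common use of ExtraArgs is setting the object's content type so that browsers and other clients interpret it correctly. As a rough sketch, the dictionary can be assembled from the file name using the standard mimetypes module. The helper name build_extra_args is hypothetical, and the guess is based only on the file extension.

```python
import mimetypes


def build_extra_args(file_name, metadata=None):
    """Build an ExtraArgs dict with a guessed ContentType and optional Metadata.

    Hypothetical helper; ContentType is omitted when the type is unknown.
    """
    extra_args = {}
    # guess_type returns (type, encoding); only the type is needed here
    content_type, _ = mimetypes.guess_type(file_name)
    if content_type:
        extra_args['ContentType'] = content_type
    if metadata:
        extra_args['Metadata'] = dict(metadata)
    return extra_args
```

The result can be passed as ExtraArgs=build_extra_args('sample_image.jpg', {'mykey': 'myvalue'}) in any of the upload calls above.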
Callback Parameter
Both upload_file and upload_fileobj accept an optional Callback parameter. The parameter takes a callable, typically an instance of a class, that the Python SDK invokes intermittently during the transfer operation.
The class must implement a __call__ method. Each invocation receives the number of bytes transferred in that increment, which can be accumulated to implement progress monitoring.
The following call creates an instance of the ProgressPercentage class and passes it as the callback. During the upload, the SDK invokes the instance's __call__ method periodically:
s3.upload_file(
    'FILE_PATH', 'BUCKET_NAME', 'OBJECT_KEY',
    Callback=ProgressPercentage('FILE_PATH')
)
Example implementation of the ProgressPercentage class:
import os
import sys
import threading
class ProgressPercentage(object):
    def __init__(self, filename):
        self._filename = filename
        self._size = float(os.path.getsize(filename))
        self._seen_so_far = 0
        # The callback may be invoked from several threads, so guard the counter
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        # To simplify, assume this is hooked up to a single filename
        with self._lock:
            self._seen_so_far += bytes_amount
            percentage = (self._seen_so_far / self._size) * 100
            sys.stdout.write(
                "\r{0} {1} / {2} ({3:.2f}%)".format(
                    self._filename, self._seen_so_far, self._size,
                    percentage))
            sys.stdout.flush()
Downloading Files
The AWS SDK for Python provides methods for downloading files similar to the upload methods.
The download_file method accepts the bucket name, object name, and the filename to save:
import boto3
s3 = boto3.client('s3')
s3.download_file('BUCKET_NAME', 'OBJECT_KEY', 'FILE_PATH')
The download_fileobj method accepts a writable file-like object. The file object must be opened in binary mode:
s3 = boto3.client('s3')
with open('FILE_PATH', 'wb') as f:
    s3.download_fileobj('BUCKET_NAME', 'OBJECT_KEY', f)
Like the upload methods, download methods are available on the S3 Client, Bucket, and Object classes, each providing the same functionality. Use whichever is most convenient.
Download methods also support the optional ExtraArgs and Callback parameters. Valid ExtraArgs settings are documented in boto3.s3.transfer.S3Transfer.ALLOWED_DOWNLOAD_ARGS. The Callback parameter serves the same purpose as it does for uploads.
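A progress callback for downloads cannot read the total size from the local file, because that file does not exist yet. One possible approach, sketched below, is to pass the expected size in explicitly. The class name DownloadProgress is hypothetical, and the total is assumed to come from elsewhere, for example the ContentLength value returned by the client's head_object call.

```python
import threading


class DownloadProgress:
    """Track download progress against a known total size in bytes.

    Hypothetical class; `total_bytes` must be supplied by the caller.
    """

    def __init__(self, total_bytes):
        self._total = total_bytes
        self._seen_so_far = 0
        # The callback may be invoked from several threads
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        # Called with the number of bytes transferred in each increment
        with self._lock:
            self._seen_so_far += bytes_amount

    @property
    def percentage(self):
        with self._lock:
            # Treat an empty object as already complete
            if self._total <= 0:
                return 100.0
            return self._seen_so_far / self._total * 100
```

An instance can be passed as Callback=progress to download_file or download_fileobj, and progress.percentage can be read at any time from another thread.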