AWS Lambda Python Tutorial Step by Step: From Zero to Production
I remember the first time I deployed a Python function to AWS Lambda. I had spent two days writing a perfectly good web scraper, only to hit a wall of cryptic errors about missing modules, handler paths, and timeout configurations. It felt like the documentation was written for people who already knew how everything worked.
That experience taught me something important: AWS Lambda has a deceptively simple concept—run code without managing servers—but the execution details trip up almost everyone the first time around. This guide is the one I wish I had back then.
Whether you are building your first serverless function or migrating existing Python workloads to Lambda, this step-by-step AWS Lambda Python tutorial walks you through everything from initial setup to deploying a production-ready API endpoint.
What Is AWS Lambda and Why Python?
AWS Lambda is a compute service that runs your code in response to events and automatically manages the underlying infrastructure. You do not provision servers, you do not apply OS patches, and you do not pay for idle time. You pay only for the compute time your code consumes, measured in millisecond increments.
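To make the billing model concrete, here is a rough sketch of the duration-based part of the charge, which scales with invocation count, run time, and configured memory (GB-seconds). The function name and the price value are placeholders I made up for illustration; look up the current rate for your region and architecture, and note that the separate per-request charge is not included.

```python
def estimate_compute_cost(invocations, avg_duration_ms, memory_mb, price_per_gb_second):
    """Estimate the duration-based charge only (per-request charges are excluded)."""
    # GB-seconds = number of runs x seconds per run x memory in GB
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second


# Hypothetical workload: 1,000,000 invocations of 120 ms each at 512 MB,
# with a placeholder price of 0.00002 per GB-second (not an official rate).
cost = estimate_compute_cost(1_000_000, 120, 512, 0.00002)
print(f"Estimated compute charge: {cost:.2f}")
```

Because memory is part of the formula, halving the configured memory halves this charge only if the function does not run slower as a result.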
Python is one of the most popular runtimes for Lambda, and for good reason. The ecosystem is enormous, the syntax is readable, and most data processing, automation, and API tasks take fewer lines of Python than they would in many other languages. This tutorial uses the Python 3.12 runtime. AWS maintains several Python 3 runtimes at any given time and retires older ones on a schedule, so check the Lambda runtimes documentation for the current list.
Lambda works best for short-lived, event-driven tasks: processing an uploaded image, transforming a database record, responding to an HTTP request, or running a scheduled cleanup job. It is not designed for long-running processes or persistent connections, though there are patterns to work around those limitations.
Prerequisites Before You Start
Before writing a single line of code, make sure you have these things in place:
- An AWS Account: If you do not have one, create it at aws.amazon.com. You will need a credit card on file, but everything in this tutorial stays within the free tier.
- Python 3.10 or later installed locally: Download it from python.org if you haven’t already.
- The AWS CLI installed and configured: Run pip install awscli, then aws configure with your access key and secret key. You can generate these in the AWS Console under IAM > Users > Security credentials.
- A code editor: VS Code with the AWS Toolkit extension is a solid choice, but any editor works.
- Basic Python knowledge: You should understand functions, dictionaries, and how to work with pip.
If you run aws sts get-caller-identity and see your account ID returned, you are ready to go.
AWS Lambda Python Tutorial Step by Step
Step 1: Set Up Your AWS Environment
First, create a dedicated IAM user or role for Lambda development rather than using your root account. In the AWS Console, navigate to IAM > Users > Create user. Give it a name like lambda-developer and attach the AWSLambda_FullAccess managed policy for learning purposes. In a production setting, you would narrow these permissions down significantly.
Verify your CLI setup:
aws sts get-caller-identity
You should see output similar to:
{
    "UserId": "AIDAXXXXXXXXXXXXXXXX",
    "Account": "123456789012",
    "Arn": "arn:aws:iam::123456789012:user/lambda-developer"
}
Step 2: Create Your First Lambda Function via the Console
Navigate to the Lambda service in the AWS Console and click Create function. Select Author from scratch, then configure these settings:
- Function name: hello-python
- Runtime: Python 3.12
- Architecture: x86_64 (arm64 works too and can be slightly cheaper, but x86_64 has broader package compatibility)
- Execution role: Use the default “Create a new role with basic Lambda permissions”
Click Create function. AWS creates the function with a basic generated handler, along with an IAM execution role that allows it to write logs to CloudWatch.
Step 3: Write the Handler Code
Replace the default code in the inline editor with this:
import json
def lambda_handler(event, context):
    """
    Entry point for the Lambda function.
    Args:
        event: The event data passed to the function (dict for most invocations)
        context: Runtime information (attributes such as function_name and
            aws_request_id, plus methods such as get_remaining_time_in_millis())
    """
    name = event.get('name', 'World')
    response = {
        'message': f'Hello, {name}!',
        'function_name': context.function_name,
        'log_group_name': context.log_group_name,
        'request_id': context.aws_request_id
    }
    return {
        'statusCode': 200,
        'body': json.dumps(response),
        'headers': {
            'Content-Type': 'application/json'
        }
    }
The lambda_handler name is the default entry point, but you can change it in the function configuration under Handler. The format is filename.handler_function_name, so for a file called app.py with a function called process_event, you would set the handler to app.process_event.
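To illustrate how a handler string maps to code, here is a conceptual sketch that splits the string on its last dot, imports the module, and looks up the function. The resolve_handler helper is hypothetical and is not the actual Lambda bootstrap code. It only mirrors the naming rule described above, using a standard-library module as a stand-in for your own file.

```python
import importlib


def resolve_handler(handler_string):
    """Split 'module.function' on the last dot and return the function object."""
    module_name, _, function_name = handler_string.rpartition(".")
    if not module_name:
        raise ValueError(
            f"Handler must look like 'file.function', got {handler_string!r}"
        )
    module = importlib.import_module(module_name)  # raises ImportError if missing
    return getattr(module, function_name)          # raises AttributeError if missing


# 'json.dumps' plays the role of 'app.process_event' here.
handler = resolve_handler("json.dumps")
print(handler({"ok": True}))
```

A typo in either half of the string fails at one of the two lookups, which is why a handler misconfiguration shows up as an import or attribute error rather than a problem inside your code.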
The event parameter contains the data that triggered your function. For an API Gateway trigger, this includes HTTP headers, query parameters, and the request body. For an S3 trigger, it contains bucket name and object key information. The context object gives you runtime metadata—function name, memory allocation, remaining execution time, and the request ID for tracing.
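Because the event shape depends on the trigger, a handler that serves more than one caller often normalizes its input first. The sketch below assumes two shapes only: a direct invocation where name is a top-level key, and a proxy-style API Gateway event where the JSON payload arrives as a string under body. The extract_name helper is hypothetical, and base64-encoded bodies are deliberately not handled.

```python
import json


def extract_name(event, default="World"):
    """Return the 'name' field from a direct-invoke or proxy-style event."""
    body = event.get("body")
    if isinstance(body, str):
        # Proxy-style event: the request payload is a JSON string in 'body'.
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            return default
        if isinstance(payload, dict):
            return payload.get("name", default)
        return default
    # Direct invocation, for example a console test event.
    return event.get("name", default)


print(extract_name({"name": "Ada"}))
print(extract_name({"body": "{\"name\": \"Grace\"}"}))
```

Falling back to a default instead of raising keeps the example short. A production handler would more likely answer malformed input with a 400 response.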
Step 4: Test Your Lambda Function
In the console, click the Test tab. Create a new test event with the name HelloTest and this JSON:
{
    "name": "Sexy Developer"
}
Click Test. You should see:
{
    "statusCode": 200,
    "body": "{\"message\": \"Hello, Sexy Developer!\", \"function_name\": \"hello-python\", \"log_group_name\": \"/aws/lambda/hello-python\", \"request_id\": \"...\"}",
    "headers": {
        "Content-Type": "application/json"
    }
}
Check the execution log below the result. You will see a REPORT line showing the duration, billed duration, and memory used. On a cold start it also includes an Init Duration, the time spent setting up the execution environment and running your module-level code, which we will discuss later.
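You can also exercise a handler on your own machine before uploading it. The sketch below calls a trimmed copy of the tutorial handler with a hand-built stand-in for the context object. The make_fake_context helper and all of its values are invented for local testing, and the real context object has more attributes than the few listed here.

```python
import json
from types import SimpleNamespace


def lambda_handler(event, context):
    # Trimmed version of the tutorial handler.
    name = event.get("name", "World")
    body = {"message": f"Hello, {name}!", "request_id": context.aws_request_id}
    return {
        "statusCode": 200,
        "body": json.dumps(body),
        "headers": {"Content-Type": "application/json"},
    }


def make_fake_context():
    """Build a stand-in for the Lambda context object (all values are made up)."""
    return SimpleNamespace(
        function_name="hello-python",
        log_group_name="/aws/lambda/hello-python",
        aws_request_id="local-test-request",
        get_remaining_time_in_millis=lambda: 3000,
    )


result = lambda_handler({"name": "Local Tester"}, make_fake_context())
print(result["statusCode"], json.loads(result["body"])["message"])
```

A local call like this catches typos and logic errors quickly, but it says nothing about IAM permissions, timeouts, or packaging, so the console test is still worth running.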
Step 5: Add External Dependencies with Layers
This is where most beginners hit their first wall. You try to import requests and get:
Runtime.ImportModuleError: Unable to import module 'lambda_function': No module named 'requests'
Lambda does not install packages from a requirements.txt automatically. You have two main options: attach a Lambda Layer, or bundle the packages with your code in a deployment package (ZIP file).
Using a Lambda Layer is the cleanest approach for shared dependencies:
# Create a clean directory for your layer
mkdir python-layers
cd python-layers
# Create the Python package directory (this exact structure matters)
mkdir -p python
# Install packages into it
pip install requests pytz -t python/
# Zip it up
zip -r layer.zip python/
# Create the layer in AWS
aws lambda publish-layer-version \
  --layer-name common-deps \
  --zip-file fileb://layer.zip \
  --compatible-runtimes python3.12 \
  --description "Common Python dependencies"
Note the response. You need the LayerVersionArn from the output. Go back to your function in the console, scroll to Layers, click Add a layer, choose your common-deps layer, and save. Now import requests will work.
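The top-level python/ folder matters because Lambda extracts layers under /opt, and for Python runtimes /opt/python is on the import path. If you would rather build the archive from a script, the sketch below writes every file under a python/ prefix and then checks the layout. Both helper names are hypothetical, and python/ is one accepted layout rather than the only one.

```python
import zipfile
from pathlib import Path


def build_layer_zip(packages_dir, zip_path):
    """Zip the contents of packages_dir so every entry sits under 'python/'."""
    packages_dir = Path(packages_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(packages_dir.rglob("*")):
            if path.is_file():
                arcname = Path("python") / path.relative_to(packages_dir)
                archive.write(path, arcname.as_posix())
    return zip_path


def layer_layout_is_valid(zip_path):
    """Return True if the archive is non-empty and every entry is under 'python/'."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return bool(names) and all(name.startswith("python/") for name in names)
```

Keep the output zip outside packages_dir, otherwise the archive would try to include itself. Running layer_layout_is_valid before publish-layer-version is a cheap way to catch a zip that was created one directory level too deep.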
Step 6: Deploy a Full API with API Gateway
A Lambda function sitting in isolation is not very useful. Let me walk you through connecting it to API Gateway so you can call it over HTTP.
Go to API Gateway in the AWS Console and create a REST API (not HTTP API for this example—REST API gives you more configuration options, though HTTP API is faster and cheaper for simple cases).
- Click Create API > REST API > Build
- Name it hello-api
- Under Resources, click Create method, select POST, and confirm
- Set Integration type to Lambda Function, select your hello-python function, and save
- Click Deploy API, create a new stage called prod
You will get an invoke URL like https://xxxxxxx.execute-api.us-east-1.amazonaws.com/prod. Test it:
curl -X POST https://xxxxxxx.execute-api.us-east-1.amazonaws.com/prod \
  -H "Content-Type: application/json" \
  -d '{"name": "Lambda Learner"}'
You should receive the JSON response from your function. That is a live, internet-accessible API endpoint running your Python code, with no servers to manage.
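The same request can be sent from Python using only the standard library. This is a minimal sketch: build_request and call_api are illustrative helper names, the URL in the usage comment is the placeholder from the curl example, and the endpoint is assumed to return a JSON body.

```python
import json
import urllib.request


def build_request(invoke_url, name):
    """Create a POST request carrying {"name": ...} as a JSON body."""
    payload = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        invoke_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(invoke_url, name, timeout=10):
    """Send the request and decode the JSON response."""
    request = build_request(invoke_url, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (replace the placeholder with your own invoke URL):
# print(call_api("https://xxxxxxx.execute-api.us-east-1.amazonaws.com/prod", "Lambda Learner"))
```

Building the request separately from sending it makes the payload easy to inspect in a test without touching the network.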
Building a More Practical Example: Image Metadata Extractor
Let me build something that demonstrates real-world patterns—extracting metadata from images uploaded to S3.
import json
import logging
import os
from urllib.parse import unquote_plus
import boto3
from PIL import Image
from PIL.ExifTags import TAGS
logger = logging.getLogger()
logger.setLevel(logging.INFO)
# Created once per execution environment and reused across invocations
s3_client = boto3.client('s3')
def get_exif_data(image_path):
    """Extract EXIF metadata from an image file."""
    metadata = {}
    # The context manager closes the file handle before the file is deleted
    with Image.open(image_path) as image:
        exif_data = image.getexif()
        for tag_id, value in exif_data.items():
            tag_name = TAGS.get(tag_id, tag_id)
            # Convert bytes to string for JSON serialization
            if isinstance(value, bytes):
                value = value.decode('utf-8', errors='replace')
            metadata[str(tag_name)] = str(value)
    return metadata
def lambda_handler(event, context):
    """
    Triggered by S3 PutObject events.
    Extracts image metadata; the DynamoDB write is left as a commented sketch.
    """
    results = []
    for record in event['Records']:
        bucket_name = record['s3']['bucket']['name']
        # Object keys in S3 event notifications are URL-encoded (spaces arrive as '+')
        object_key = unquote_plus(record['s3']['object']['key'])
        logger.info(f"Processing file: s3://{bucket_name}/{object_key}")
        # Download the image to /tmp (the only writable directory in Lambda)
        download_path = f"/tmp/{object_key.split('/')[-1]}"
        try:
            s3_client.download_file(bucket_name, object_key, download_path)
            metadata = get_exif_data(download_path)
            logger.info(f"Extracted {len(metadata)} metadata fields")
            # Here you would write to DynamoDB, e.g.:
            # dynamodb = boto3.resource('dynamodb')
            # table = dynamodb.Table('image-metadata')
            # table.put_item(Item={
            #     'object_key': object_key,
            #     'bucket': bucket_name,
            #     'metadata': metadata,
            #     'request_id': context.aws_request_id
            # })
            results.append({
                'object_key': object_key,
                'metadata_fields': len(metadata),
                'metadata': metadata
            })
        except Exception as e:
            logger.error(f"Error processing {object_key}: {str(e)}")
            raise
        finally:
            # Clean up /tmp to avoid filling up the 512MB ephemeral storage
            if os.path.exists(download_path):
                os.remove(download_path)
    # Return after the loop so every record in the event is processed
    return {
        'statusCode': 200,
        'body': json.dumps({'processed': results})
    }
This function uses Pillow for image processing. You would package it as a deployment package since the layer approach gets cumbersome for function-specific dependencies. Here is how to do that:
mkdir image-processor
cd image-processor
# Create the function file
cat > lambda_function.py << 'EOF'
# ... paste the code above ...
EOF
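# Install dependencies locally. Pillow ships compiled code, so on macOS or Windows
# request Linux wheels that match the Lambda runtime (x86_64, Python 3.12 here)
pip install Pillow -t . \
  --platform manylinux2014_x86_64 \
  --implementation cp \
  --python-version 3.12 \
  --only-binary=:all:
# Zip everything together (excluding hidden files)
zip -r ../image-processor.zip . -x ".*"
# Update the function code (the function must already exist)
aws lambda update-function-code \
  --function-name image-metadata-extractor \
  --zip-file fileb://../image-processor.zip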
To set up the S3 trigger, go to your Lambda function > Configuration > Triggers > Add trigger, select S3, choose your bucket, and set the event type to Put. Now every image uploaded to that bucket automatically gets processed.
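Before uploading real files, it helps to know what the handler will receive. The sketch below builds a heavily trimmed S3 notification event and pulls out the bucket and key. Object keys in these notifications arrive URL-encoded, so a space shows up as a plus sign and has to be decoded before the key is passed back to S3. The helper names and the bucket name are made up, and a real event carries many more fields.

```python
from urllib.parse import unquote_plus


def make_s3_put_event(bucket, encoded_key):
    """Build a trimmed-down S3 notification event with only the fields read below."""
    return {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket},
                    "object": {"key": encoded_key},
                },
            }
        ]
    }


def iter_s3_objects(event):
    """Yield (bucket, decoded_key) for each record in an S3 event."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


event = make_s3_put_event("my-example-bucket", "uploads/holiday+photo.jpg")
print(list(iter_s3_objects(event)))
```

The same dictionary can be pasted into a console test event as JSON to trigger the function without uploading anything.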
Deploying with AWS SAM (The Professional Way)
Using the console is fine for learning, but real projects need infrastructure as code. AWS SAM (Serverless Application Model) is the most straightforward framework for Lambda-based applications.
Install SAM CLI:
pip install aws-sam-cli
Initialize a new project:
sam init --name hello-sam --runtime python3.12 --app-template hello-world --package-type Zip
cd hello-sam
SAM generates a template.yaml file that defines your infrastructure. Here is a more complete version:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Hello World SAM Template
Globals:
  Function:
    Timeout: 10
    Runtime: python3.12
    Environment:
      Variables:
        LOG_LEVEL: INFO
Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.lambda_handler
      Events:
        HelloWorld:
          Type: Api
          Properties:
            Path: /hello
            Method: post
      Layers:
        - !Ref CommonDepsLayer
  CommonDepsLayer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      ContentUri: layers/common/
      CompatibleRuntimes:
        - python3.12
      RetentionPolicy: Retain
Outputs:
  HelloWorldApi:
    Description: API Gateway endpoint URL for Prod stage
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/hello/"
Build and deploy:
# Build the dependencies
sam build
# Deploy (first time, it will guide you through creating a SAM CLI managed stack)
sam deploy --guided
# Subsequent deployments
sam deploy
SAM handles packaging your dependencies, creating the API Gateway, setting up IAM roles, and deploying everything in one command. It also generates a samconfig.toml file so subsequent deploys are a single sam deploy command.
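The template sets a LOG_LEVEL environment variable for every function, but the variable only has an effect if the handler reads it. Here is a minimal sketch of what src/app.py could look like under that assumption. The logger name, the configure_logging helper, and the response message are all illustrative choices.

```python
import json
import logging
import os

logger = logging.getLogger("hello_sam")


def configure_logging():
    """Set the logger level from LOG_LEVEL, falling back to INFO if unset or invalid."""
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, logging.INFO)
    if not isinstance(level, int):
        # Guards against names that exist on the module but are not levels.
        level = logging.INFO
    logger.setLevel(level)
    return level


# Runs once when the module is loaded, so it is not repeated per request.
configure_logging()


def lambda_handler(event, context):
    logger.debug("Received event: %s", event)
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "hello from SAM"}),
    }
```

With this in place, changing LOG_LEVEL to DEBUG in template.yaml and redeploying turns on the verbose log line without touching the code.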
Common Pitfalls and How to Avoid Them
After deploying dozens of Lambda functions across production workloads, these are the issues I see most often:
Pitfall 1: Forgetting the /tmp Directory Limitation
Lambda gives you read-only access to the deployed code and, by default, only 512MB of writable storage in /tmp (you can increase this up to 10GB in the configuration). If you try to write to the current working directory or any other path, you get OSError: [Errno 30] Read-only file system.
# WRONG - will fail
with open('output.json', 'w') as f:
    f.write(data)
# CORRECT - use /tmp
import os
temp_path = os.path.join('/tmp', 'output.json')
with open(temp_path, 'w') as f:
    f.write(data)
Also remember that /tmp persists between invocations within the same execution environment, which can actually be useful for caching, but can also cause stale data bugs if you are not careful.
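One way to get the caching benefit without the stale-data risk is to treat files in /tmp as valid only for a limited time. The sketch below uses the file's modification time as a simple expiry check. The helper names are hypothetical, and tempfile.gettempdir() is assumed to resolve to /tmp inside Lambda, which is normally the case when no TMPDIR override is set.

```python
import json
import os
import tempfile
import time

CACHE_DIR = tempfile.gettempdir()  # normally /tmp inside Lambda


def write_cached(name, value, cache_dir=CACHE_DIR):
    """Store a JSON-serializable value in the cache directory."""
    path = os.path.join(cache_dir, name)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(value, f)
    return path


def read_cached(name, max_age_seconds, cache_dir=CACHE_DIR):
    """Return the cached value, or None if it is missing or older than max_age_seconds."""
    path = os.path.join(cache_dir, name)
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:
        return None  # file does not exist in this execution environment
    if age > max_age_seconds:
        return None  # treat old entries as stale
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

A handler would call read_cached first and fall back to the slow path, followed by write_cached, when it returns None. Since each execution environment has its own /tmp, this is a per-instance cache, not a shared one.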
Pitfall 2: Cold Start Latency
When a Lambda function has not been invoked for a while, AWS tears down its execution environment. AWS does not publish an exact idle timeout, though a range of roughly 5-15 minutes is commonly observed. The next invocation requires AWS to provision a new environment, load your code, and run the initialization code outside the handler. This is the cold start.
For a simple function, cold starts are typically in the 100-300ms range. For functions with large dependencies like Pandas or TensorFlow, they can exceed 5 seconds. You can mitigate this with:
- Provisioned Concurrency: Keeps a configured number of pre-initialized instances ready. This costs more but avoids cold starts for requests served by those instances.
- Minimizing package size: Only include the packages you actually need. A 5MB deployment package cold-starts faster than a 50MB one.
- Keeping initialization outside the handler: Database connections, SDK clients, and configuration loading should happen at the module level.
import boto3
import json
# These run once per cold start, not per invocation
s3_client = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('my-table')
def lambda_handler(event, context):
    # This runs on every invocation
    response = table.get_item(Key={'id': event['id']})
    return response.get('Item', {})
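A variation on the same idea is lazy initialization: the expensive object is created on first use and then reused for every later invocation in that execution environment. The sketch below is runnable without AWS because a plain dictionary stands in for a real client. The helper names and the INIT_CALLS counter exist only to make the reuse visible.

```python
import functools

INIT_CALLS = 0  # counts how many times the expensive setup actually runs


def _create_expensive_client():
    """Stand-in for something slow, such as creating an SDK client or DB connection."""
    global INIT_CALLS
    INIT_CALLS += 1
    return {"connected": True}


@functools.lru_cache(maxsize=None)
def get_client():
    # The first call does the work; later calls return the cached object.
    return _create_expensive_client()


def lambda_handler(event, context):
    client = get_client()
    return {"connected": client["connected"], "init_calls": INIT_CALLS}


first = lambda_handler({}, None)
second = lambda_handler({}, None)
print(first["init_calls"], second["init_calls"])
```

Lazy creation moves the cost from the init phase to the first request that needs the client, which can help when only some code paths use it. Plain module-level initialization is simpler when every invocation needs the client anyway.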
Pitfall 3: Incorrect Return Format for API Gateway
If your function is triggered by API Gateway but does not return the expected format, you get a 502 Bad Gateway error with the message “Malformed Lambda proxy response.” The return value must be a dictionary with statusCode (integer), body (string), and optionally headers (dictionary).
“`python