A Practical Kubernetes Deployment Tutorial for Beginners
I still remember the first time I looked at a Kubernetes YAML file. It felt like I was trying to read an alien language. There were nested indentations, strange abbreviations like svc and rs, and an overwhelming number of fields. If you are a developer looking to dip your toes into the cloud-native world, finding a solid Kubernetes Deployment tutorial for beginners that actually makes sense can be a challenge. Most guides either skip the fundamentals or dive so deep into cluster architecture that you lose sight of the code.
Let’s strip away the complexity. In this guide, we are going to focus on the core building block of Kubernetes: the Deployment. By the end of this article, you will have written a manifest, deployed a real application to a local cluster, scaled it, updated it, and learned how to avoid the mistakes that trip up most developers when they first make the leap from raw Docker containers to orchestrated workloads.
Prerequisites: What You Need Before Starting
Before we write a single line of YAML, let’s make sure your local machine is ready. You don’t need a massive cloud budget or a multi-node AWS cluster to learn this. Everything we do here can run entirely on your laptop.
Local Environment Setup
For this tutorial, I highly recommend using Docker Desktop with the Kubernetes feature enabled, or Minikube (version v1.34.0 or newer). Both options create a lightweight, single-node Kubernetes cluster that runs in a virtual machine or container on your machine.
If you choose Minikube, install it via Homebrew (on macOS) or Chocolatey (on Windows), and start it by running:
minikube start --driver=docker
You will also need kubectl (version v1.30.0 or newer), which is the command-line tool you will use to talk to your cluster. Once your cluster is running, verify your setup:
kubectl version --client
kubectl get nodes
If the get nodes command returns a Ready status, you are good to go. You will also need a basic understanding of Docker, as we will be containerizing a simple application before deploying it.
Understanding the Anatomy of a Kubernetes Deployment
It is tempting to think of a Kubernetes Deployment as just a wrapper around a Docker container. However, understanding what happens under the hood will save you hours of debugging later.
Pods vs. Deployments
In Kubernetes, you do not run containers directly. You run Pods. A Pod is the smallest deployable unit in Kubernetes, and it represents a single instance of a running process in your cluster. A Pod can hold one container, or multiple containers that need to share resources like storage and network space.
However, a Pod you create directly has no safety net. The kubelet may restart a crashed container inside it, but if the Pod itself is deleted or its node fails, nothing recreates it. That is where the Deployment comes in. A Deployment acts as a manager for your Pods (working through a ReplicaSet, the rs abbreviation mentioned earlier). You tell the Deployment, “I always want three instances of my web application running,” and Kubernetes continuously compares that desired state with what is actually running. If a node fails and a Pod dies, a replacement is created to restore the desired count.
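The desired-state idea can be pictured with a toy model in plain Python. This is only a sketch of the concept, not how Kubernetes is implemented: the ToyDeployment class, its fields, and the Pod naming scheme are all made up for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyDeployment:
    """Hypothetical stand-in for a Deployment: a desired count plus running Pods."""
    name: str
    replicas: int
    pods: set = field(default_factory=set)
    _ids: count = field(default_factory=count, repr=False)

    def reconcile(self) -> None:
        """Create or remove Pods until the actual count matches the desired count."""
        while len(self.pods) < self.replicas:
            self.pods.add(f"{self.name}-{next(self._ids)}")
        while len(self.pods) > self.replicas:
            self.pods.pop()  # remove an arbitrary surplus Pod


web = ToyDeployment(name="web", replicas=3)
web.reconcile()
print(sorted(web.pods))    # three Pods now exist

web.pods.discard("web-1")  # simulate a Pod lost to a node failure
web.reconcile()
print(sorted(web.pods))    # still three Pods; the lost one was replaced by a new name
```

The point of the sketch is that nobody "restarts" the lost Pod. The loop simply notices that two is not three and creates a fresh one.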
The Deployment Manifest Structure
Kubernetes resources are defined using YAML files. A typical Deployment manifest has four main sections:
* apiVersion: Tells Kubernetes which API version to use. For Deployments, this is apps/v1.
* kind: Specifies the type of resource we are creating (in this case, Deployment).
* metadata: Data that helps uniquely identify the object, such as the name and labels.
* spec: The actual desired state. This is where you define how many replicas you want, how to find the Pods (selector), and the blueprint for creating the Pods (template).
Step-by-Step: Your First Kubernetes Deployment
Let’s build this from scratch. We are going to create a simple Python web server, containerize it, and deploy it.
Step 1: Create a Simple Application
Create a new directory for your project. Inside it, create a file named app.py:
from flask import Flask
import os

app = Flask(__name__)


@app.route('/')
def hello():
    return "Hello from Kubernetes! I am running on pod: " + os.getenv('HOSTNAME', 'unknown')


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
This is a minimal Flask app that simply returns a greeting and the hostname of the Pod it is running inside. This hostname trick will become very useful later when we test scaling.
Step 2: Containerize the App
In the same directory, create a Dockerfile:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
You will also need a requirements.txt file:
Flask==3.0.3
Werkzeug==3.0.3
Now, build the Docker image. Since we are using a local cluster (like Minikube or Docker Desktop), we need to make sure the image is available to the cluster. If using Minikube, run this command to point your Docker CLI at the Docker daemon inside Minikube (it only affects the current terminal session):
eval $(minikube docker-env)
Then, build the image:
docker build -t local-flask-app:v1 .
Note: We are tagging this as v1. Tagging with latest is a common beginner trap that we will discuss later.
Step 3: Write the Deployment YAML
Create a file named deployment.yaml. This is where the magic happens.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-deployment
  labels:
    app: flask
spec:
  replicas: 2
  selector:
    matchLabels:
      app: flask
  template:
    metadata:
      labels:
        app: flask
    spec:
      containers:
        - name: flask-app
          image: local-flask-app:v1
          imagePullPolicy: Never
          ports:
            - containerPort: 5000
Let’s break down the spec section, as this is where most confusion lies:
* replicas: 2: We want exactly two Pods running at all times.
* selector.matchLabels: This tells the Deployment how to identify the Pods it owns. It looks for Pods with the label app: flask.
* template: This is the blueprint for the Pod. Notice that the template.metadata.labels must match the selector.matchLabels. If they don’t, the Kubernetes API will reject this file with a validation error.
* imagePullPolicy: Never: Because we built this image locally and did not push it to a registry like Docker Hub, we tell Kubernetes to never try to pull it from the internet.
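The label-matching rule is easy to get wrong, so here is a small sketch of it in Python. It works on a manifest represented as a plain dict and only illustrates the rule. The function names selector_matches and check_deployment are hypothetical, and this is not the validation code Kubernetes itself runs.

```python
def selector_matches(match_labels: dict, pod_labels: dict) -> bool:
    """True when every key/value pair in the selector is present in the Pod labels."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


def check_deployment(manifest: dict) -> bool:
    """Compare spec.selector.matchLabels with spec.template.metadata.labels."""
    spec = manifest["spec"]
    selector = spec["selector"]["matchLabels"]
    template_labels = spec["template"]["metadata"].get("labels", {})
    return selector_matches(selector, template_labels)


# The same structure as deployment.yaml, trimmed to the fields that matter here.
manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "flask-deployment"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "flask"}},
        "template": {"metadata": {"labels": {"app": "flask"}}},
    },
}
print(check_deployment(manifest))  # True: selector and template labels agree

manifest["spec"]["template"]["metadata"]["labels"] = {"app": "flask-typo"}
print(check_deployment(manifest))  # False: this mismatch is what the API rejects
```

Extra labels on the Pod template are fine. The selector only requires that its own pairs are present.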
Step 4: Apply the Deployment
Open your terminal and apply the manifest to your cluster:
kubectl apply -f deployment.yaml
You should see output like: deployment.apps/flask-deployment created.
Check the status of your Deployment:
kubectl get deployments
Give it a few seconds, and you will see the READY column show 2/2. Now, check the individual Pods:
kubectl get pods
You should see two Pods running. If you see a status of ErrImageNeverPull, ErrImagePull, or ImagePullBackOff, the cluster cannot find your image. Double-check that you built the image after running eval $(minikube docker-env) in the same terminal session, and that your imagePullPolicy is set to Never.
Step 5: Exposing Your Deployment with a Service
Right now, your Pods have internal cluster IP addresses, but you cannot reach them from your local web browser. We need a Service to act as a load balancer and expose the Pods.
Create a file named service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: flask-service
spec:
  type: NodePort
  selector:
    app: flask
  ports:
    - port: 80
      targetPort: 5000
      nodePort: 30001
Note that a Service selector is written differently from a Deployment selector: the labels go directly under selector, with no nested matchLabels key.
Apply it:
kubectl apply -f service.yaml
If you are using Minikube, you can open the application in your browser by running:
minikube service flask-service
If you are using Docker Desktop’s built-in Kubernetes, you can simply open http://localhost:30001 in your browser. Refresh the page a few times. Notice how the hostname changes? That is the Kubernetes Service load-balancing traffic between your two replicas.
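Browsers often reuse a connection, which can make every refresh land on the same Pod. As a rough alternative, the sketch below tallies which Pod answered each request. The helper names fetch and count_pods are made up for this example, and the URL in the comment assumes the Docker Desktop NodePort setup described above. The runnable part uses canned responses so it works without a cluster.

```python
from collections import Counter
from urllib.request import urlopen


def fetch(url: str) -> str:
    """Fetch the page body; urllib opens a fresh connection on every call."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def count_pods(get_body, requests: int = 20) -> Counter:
    """Call get_body() repeatedly and tally the Pod name at the end of each greeting."""
    seen = Counter()
    for _ in range(requests):
        body = get_body()
        pod_name = body.rsplit("pod: ", 1)[-1].strip()
        seen[pod_name] += 1
    return seen


# Against a live cluster (URL is an assumption; adjust it for Minikube):
#   print(count_pods(lambda: fetch("http://localhost:30001")))

# Offline demonstration with canned responses instead of a real Service:
fake_responses = iter([
    "Hello from Kubernetes! I am running on pod: flask-a",
    "Hello from Kubernetes! I am running on pod: flask-b",
    "Hello from Kubernetes! I am running on pod: flask-a",
])
print(count_pods(lambda: next(fake_responses), requests=3))
```

With a live Service you should not expect a perfectly even split. The distribution depends on how the cluster's proxy picks a backend for each connection.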
Updating and Scaling Your Deployment
This is where Kubernetes truly shines over simply running docker run.
Scaling Up for Traffic
Imagine your application suddenly goes viral. Two Pods are not enough to handle the traffic. With a traditional setup, you might be scrambling to provision new servers. With Kubernetes, you can scale with a single command:
kubectl scale deployment flask-deployment --replicas=5
Run kubectl get pods again. You will see Kubernetes instantly creating three new Pods to bring the total up to five. If traffic dies down, you can scale it back down just as easily.
Rolling Updates Without Downtime
You found a bug in your code and want to push an update. Update your app.py file to say “Hello from Kubernetes V2!”.
Rebuild your Docker image, making sure to bump the version tag:
docker build -t local-flask-app:v2 .
Now, you could edit your deployment.yaml file manually to change the image tag from v1 to v2. However, there is a faster way using the command line:
kubectl set image deployment/flask-deployment flask-app=local-flask-app:v2
Watch the rollout happen in real-time:
kubectl rollout status deployment/flask-deployment
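By default, Kubernetes performs a rolling update. It gradually starts new Pods running v2, waits for them to become ready, and then gracefully terminates the old v1 Pods, so the application stays available throughout. Keep in mind that without a readiness probe, Kubernetes treats a Pod as ready as soon as its container starts, so truly zero-downtime updates depend on the app being able to serve traffic right away. Refresh your browser, and you will eventually see the “V2” message taking over.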
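How many Pods can exist during a rollout? With the default RollingUpdate strategy, maxSurge and maxUnavailable are both 25%. The surge value is rounded up and the unavailable value is rounded down. The following sketch works through that arithmetic. The function name rolling_update_bounds is invented for this example, and it only handles percentage values.

```python
def rolling_update_bounds(replicas: int, max_surge_pct: int = 25,
                          max_unavailable_pct: int = 25) -> dict:
    """Compute Pod-count bounds during a rolling update for percentage settings."""
    # maxSurge rounds up: integer ceiling of replicas * pct / 100.
    surge = (replicas * max_surge_pct + 99) // 100
    # maxUnavailable rounds down: integer floor of replicas * pct / 100.
    unavailable = (replicas * max_unavailable_pct) // 100
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


print(rolling_update_bounds(2))  # {'max_total_pods': 3, 'min_available_pods': 2}
print(rolling_update_bounds(5))  # {'max_total_pods': 7, 'min_available_pods': 4}
```

So with two replicas, Kubernetes adds one v2 Pod before removing any v1 Pod. With five replicas, it may briefly run seven Pods while keeping at least four available.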
If something goes wrong and you realize v2 is broken, rolling back is incredibly simple:
kubectl rollout undo deployment/flask-deployment
Common Pitfalls and How to Avoid Them
In my early days of working with Kubernetes, I made almost every mistake in the book. Here are the most common pitfalls developers hit when learning Deployments, and how to sidestep them.
Forgetting Resource Requests and Limits
By default, if you do not specify resource requests and limits in your YAML, Kubernetes will let your container consume as much CPU and memory as the node can offer. In a shared cluster, a runaway memory leak in one Pod can starve the whole node, causing unrelated applications to be evicted or killed.
Always define resources. Update your deployment.yaml container spec to include this:
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "250m"
    memory: "256Mi"
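The units in a resources block like this can look cryptic at first. A CPU value ending in m is in millicores (thousandths of a core), and Mi means mebibytes (1024 × 1024 bytes). The sketch below converts the values used here. It is a simplified illustration that covers only a few suffixes, and the helper names cpu_to_cores and memory_to_bytes are made up.

```python
# Binary memory suffixes handled by this simplified sketch.
MEMORY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def cpu_to_cores(quantity: str) -> float:
    """Convert a CPU quantity such as "100m" or "2" into a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" into bytes."""
    for suffix, factor in MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: the value is already in bytes


print(cpu_to_cores("100m"))      # 0.1
print(memory_to_bytes("128Mi"))  # 134217728
```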
* requests: The minimum amount of CPU/Memory the Pod is guaranteed to get.
* limits: The maximum amount the Pod is allowed to use. If it tries to exceed the memory limit, the Pod is killed with an `OOM