Kubernetes - Volumes - Part-1
emptyDir, hostPath

In this article we're going to explore Kubernetes Volumes together. We'll discuss what they are, why they exist, and how they actually work behind the scenes.
If you've ever wondered “Where does my data go when a Pod restarts?“, you're in the right place. Let's dive in 🚀
🤔 Why Do We Even Need Volumes?
Let’s start with a simple question: why does Kubernetes need volumes at all?
We know that a Pod is made up of one or more containers. And we also know an important (and slightly scary) fact about containers: Containers are ephemeral. This means when a container crashes, restarts, or gets recreated, its filesystem is wiped clean. Any files written inside the container? 💥 Gone.
This is where Volumes come to the rescue. A Kubernetes Volume provides a way to decouple storage from the container's lifecycle. So, to answer the question:
🎯 Why We Use Volumes
Protect application data from container crashes
Share data between containers in a Pod
Make applications more reliable and production-ready
Kubernetes offers multiple volume types, each designed for different use cases—temporary storage, shared storage, cloud disks, network storage, and more.
In the next sections, we’ll explore these volume options one by one, understand when to use what, and avoid common mistakes along the way.
📦 emptyDir — The Simplest Kubernetes Volume
The most basic volume type in Kubernetes is emptyDir. An emptyDir volume is created at the Pod level, not at the container level.
The What ?
👉 But what does Pod level actually mean?
When a Pod is created, Kubernetes creates the emptyDir volume once, and every container inside that Pod can mount and access the same volume.
So the volume does not belong to a single container; it belongs to the Pod itself. As long as the Pod exists, the volume exists.
🤝 How Do Containers Share an emptyDir?
Let’s make this real with a common and very practical scenario. Imagine a Pod with two containers:
App Container: Runs your main application and generates logs
Sidecar Container: Collects those logs and ships them to a central logging system. You don't want to add extra load or logging logic inside your main app, so you offload that responsibility to a sidecar container.
Both containers mount the same emptyDir volume. As a result, the app writes its logs to the volume and the sidecar reads them from that same volume. They're isolated containers, yet they share data seamlessly.
The important catch: data in an emptyDir volume is lost when the Pod restarts or crashes. But the data is preserved even if any container in the Pod crashes or restarts, because emptyDir lives exactly as long as the Pod lives.
So in simple terms emptyDir:
Container lifecycle ❌ does NOT affect data
Pod lifecycle ✅ DOES affect data
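Before moving on, it's worth knowing that emptyDir takes a couple of optional fields from the core v1 API: medium and sizeLimit. A minimal sketch:

```yaml
volumes:
- name: cache
  emptyDir:
    medium: Memory   # back the volume with tmpfs (RAM) instead of node disk
    sizeLimit: 64Mi  # the kubelet evicts the Pod if usage grows past this cap
```

Keep in mind that a Memory-backed emptyDir counts against the container's memory limit, so size it carefully.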
The When ?
emptyDir is ideal for:
Temporary files
Cache data
Shared workspace between containers
Log sharing (like our sidecar example)
🚫 It is not meant for long-term or critical data storage.
The How ?
Here’s a minimal working example of exactly the scenario we discussed.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-workspace
spec:
  containers:
  - name: producer
    image: busybox
    # Writes data to the volume
    command: ["sh", "-c", "while true; do date >> /app/data/app.log; sleep 5; done"]
    volumeMounts: # <--- VolumeMount block (Producer)
    - name: shared-data
      mountPath: /app/data
  - name: consumer
    image: busybox
    # Reads data from the volume
    command: ["sh", "-c", "while true; do cat /app/input/app.log 2>/dev/null; sleep 5; done"]
    volumeMounts: # <--- VolumeMount block (Consumer)
    - name: shared-data
      mountPath: /app/input
  volumes: # <--- Volume definition
  - name: shared-data
    emptyDir: {} # <--- The magic keyword
```
🔍 Let’s Dissect the Example Step by Step
Let's begin at the bottom of the Pod spec, the volumes block. This is where we define:
The volume name - shared-data
The volume type - emptyDir
This tells Kubernetes: Create an empty directory when the Pod starts and attach it to this Pod. At this point, the volume exists—but no container is using it yet.
Now look at the two containers inside the Pod. Each container has its own volumeMounts block. Both containers mount the same volume (shared-data), but at different paths. This is intentional, and this is where things get interesting.
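One small refinement that fits the producer/consumer pattern: since the consumer only ever reads, its mount can be marked read-only using the standard readOnly field. A sketch of just the consumer's volumeMounts entry:

```yaml
volumeMounts:
- name: shared-data
  mountPath: /app/input
  readOnly: true   # the consumer can read the shared files but cannot modify them
```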
🤔 The Common Doubt (Very Natural One!)
You might be thinking: If the producer writes logs to /app/data, how does the consumer read them from /app/input?
Think of the volume as a room, and think of mountPath as a door. A room can have multiple doors; no matter which door you enter through, you end up in the same room.
With this analogy, in our example /app/data is one door and /app/input is another. Both doors lead to the same emptyDir volume: the producer writes logs into the room through the /app/data door, and the consumer reads those same logs from the room through the /app/input door.
With emptyDir, data survives container crashes and restarts. However, once the Pod is restarted or recreated, all the data in the volume is lost.
But what if we want our data to survive even Pod restarts or crashes? That’s where hostPath comes in. Let’s take a look.
📦 hostPath
The main limitation of emptyDir is that the data is lost when the Pod restarts. This problem is addressed by hostPath.
The What ?
With hostPath, the data is preserved no matter what happens to the Pod, as long as the Pod is scheduled on the same node.
🤔 How does this work?
The trick is simple: hostPath mounts a specific file or directory from the node’s filesystem directly into your Pod. So even if the Pod dies and a new Pod starts on the same node, the data is still there—because it never left the node in the first place.
The When?
Use hostPath when:
You want data to survive Pod restarts
You are okay with the Pod running on the same node
You are working in local development, single-node clusters, or testing environments
You need access to node-level files (logs, sockets, configs)
⚠️ Not recommended for multi-node production workloads due to portability and security concerns.
⚠️ During node maintenance or a node crash, extra care is required. Any Pod using a hostPath volume will lose its data if it gets drained and rescheduled onto a different node. The data is preserved only as long as the Pod remains on the same node and the node is up and running.
The How ?
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-workspace
spec:
  containers:
  - name: producer
    image: busybox
    command: ["sh", "-c", "while true; do date >> /app/data/app.log; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /app/data
  - name: consumer
    image: busybox
    command: ["sh", "-c", "while true; do cat /app/input/app.log 2>/dev/null; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /app/input
  volumes:
  - name: shared-data
    hostPath: # <--- The magic keyword hostPath
      path: /tmp/shared-data
      type: DirectoryOrCreate
```
🔍 Let’s Dissect the Example Step by Step
Instead of emptyDir, we now use hostPath
The data is stored on the node at /tmp/shared-data
The type field tells Kubernetes what it should expect at the given path on the node and what to do if it doesn't exist. With DirectoryOrCreate, the directory is created automatically if it does not exist.
📦 Common hostPath Types
| Type | What it means |
| --- | --- |
| "" (empty string) | Default value. No checks are performed |
| DirectoryOrCreate | Uses the directory, or creates it if it does not exist |
| Directory | Directory must already exist |
| FileOrCreate | Uses the file, or creates it if it does not exist |
| File | File must already exist |
| Socket | Unix domain socket must exist at the path |
| CharDevice | Character device must exist at the path |
| BlockDevice | Block device must exist at the path |
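To make the type column concrete: a classic use of the Socket type is exposing the container runtime's Unix socket from the node to a Pod. A hedged sketch (the path assumes a Docker-based node; other runtimes put their socket elsewhere):

```yaml
volumes:
- name: runtime-sock
  hostPath:
    path: /var/run/docker.sock  # assumed path; containerd nodes use a different one
    type: Socket                # the mount fails unless a socket already exists here
```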
🤔 Common Doubts
You might have this question in mind: We know that hostPath is tied to a specific node. What happens if I restart the Pod or apply a rollout that recreates Pods? Is there any guarantee that the Pod will be scheduled on the same node where my data exists? And if it gets scheduled on a different node, will the data be lost?
Yes, if the Pod is scheduled onto a different node, the data is effectively lost: it still sits on the old node, but your new Pod can no longer see it. Note the difference between a restart and a recreation, though. When the containers in a Pod restart, the kubelet restarts them in place on the same node; the Pod is not rescheduled at all. That is why, after normal restarts, your data appears to be “safe.”
❓ Why Does This Happen?
When a Pod is recreated (for example, during a rollout), the scheduler makes a fresh decision. It considers node availability, resource constraints, taints, and affinity rules, not where the previous Pod happened to run. On a small or single-node cluster the new Pod often lands on the same node again, which can make hostPath look more reliable than it really is.
However, there is no strict guarantee. If the node has crashed, is drained, is under maintenance, or is out of resources, the Pod will be scheduled onto a different node.
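If you must depend on hostPath data, you can pin the Pod to the node that holds it. A minimal sketch using nodeSelector with the built-in kubernetes.io/hostname label (worker-1 is a placeholder node name):

```yaml
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-1  # placeholder; pins scheduling to this node
  containers:
  - name: producer
    image: busybox
```

The trade-off: if worker-1 goes down, the Pod simply stays Pending instead of moving to another node.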
So far, we’ve seen how hostPath and emptyDir volumes work—great for temporary data, experiments, caching, and understanding how Kubernetes handles storage inside a Pod or a node. But what if you want your data to live beyond Pod restarts, survive node failures, and stay safe from all the usual Kubernetes chaos? 🤯
That’s exactly where Persistent Volumes (PV) come into the picture. They solve the problem of long-lived, reliable storage in Kubernetes. I’ve covered Persistent Volumes in detail in the next blog, breaking down how they work, why they matter, and when you should use them in real-world setups.
If you’ve made it this far — great job 👏 You now have a solid understanding of Kubernetes’ ephemeral storage story.
👉 Continue the journey here: Persistent Volumes in Kubernetes — and let’s level up your storage game 🚀



