A step-by-step walkthrough of how to set up a new MacBook Pro for product management work, covering everything from terminal and dev tools to AI assistants and note-taking apps.
Welcome, reader.
This is the magazine edition of my blog. Each article is presented in a classic three-column print layout, inspired by the tech magazines I grew up reading.
Use the arrow keys or click the page edges to flip through. On mobile, swipe left or right.
Every article links back to its full version on the blog if you want to read more or share it.
Murat Karslioglu
I just started a new role as a Staff Product Manager. Day one was approaching fast, and I had a blank MacBook Pro sitting on my desk. A 14-inch M4 Pro with 24GB of RAM and a 512GB SSD.
The question wasn’t what to install. It was how to set up a machine that lets a PM move at the speed of thought: writing PRDs to spinning up prototypes to jumping on a customer call, all without friction.
This is the guide I wish I’d had.
24
Tools Installed
1
Hour to Set Up
0
Excuses Left
There’s a tension at the heart of every PM’s toolkit. You’re not an engineer, but you need to speak their language. You’re not a designer, but you need to give precise feedback. You’re not in sales, but you need to demo the product on the fly.
My philosophy is simple:
Be technical enough to prototype, clear enough to document, and fast enough to never block your team.
That means optimizing for speed, clarity, and collaboration, not engineering perfection. Every tool below earns its place by making me better at one of three jobs: staying technical, writing clearly, or moving fast.
Hardware
MacBook Pro 14″ / M4 Pro
macOS Sequoia 15.4.1
Why does a PM need an M4 Pro? Because on any given Tuesday I’m running Docker containers, Figma with a 200-screen file, 40 browser tabs of customer research, a Zoom call, and a local AI model. All at once, without the fans spinning up.
Before installing a single app, I spend 20 minutes dialing in macOS itself. These tweaks are small individually, but compound into a noticeably smoother experience.
Dark mode, always. Auto-hide the Dock. Remove every app I won’t use daily. Show battery percentage. Turn on Night Shift for late writing sessions.
I keep exactly three apps allowed to interrupt me: Calendar (meetings), Slack (team comms), and Linear (project updates). Everything else gets turned off. Context switching is the PM’s worst enemy.
System Settings → Trackpad. Max tracking speed. Enable Tap to Click. These two changes alone make the MacBook feel twice as responsive.
If you’re using a Magic Mouse with your Apple Studio Display (or standalone), head to System Settings → Mouse. Enable Secondary Click and set it to Click Right Side. Right-click is essential for context menus everywhere, from Figma to the terminal.
System Settings → Keyboard → Keyboard Shortcuts → Spotlight: disable Spotlight’s Cmd+Space shortcut. We’re replacing it with something much better.
Open Finder, then go to Finder → Settings (Cmd+,). A few tweaks there save daily friction, and the defaults write commands below cover the ones that matter most.
Non-negotiable: FileVault ON (full disk encryption), Touch ID for everything. If you’re handling customer data, competitive intel, or roadmap docs (and you are), encryption isn’t optional.
Get a password manager. 1Password is my pick. It handles passwords, SSH keys, API tokens, and secure notes in one place. Bitwarden is a solid free alternative. Either way, stop reusing passwords and storing secrets in plain text files.
A handful of defaults write commands that macOS should ship with out of the box:
# Screenshots as JPG (smaller, good enough)
defaults write com.apple.screencapture type jpg
# Show hidden files, path bar, status bar in Finder
defaults write com.apple.finder AppleShowAllFiles YES
defaults write com.apple.finder ShowPathbar -bool true
defaults write com.apple.finder ShowStatusBar -bool true
# Unhide the Library folder
chflags nohidden ~/Library
killall Finder

Everything starts here. One command to install the macOS package manager that makes everything else possible:
/bin/bash -c "$(curl -fsSL \
https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

From this point on, installing software is just brew install or brew install --cask. No more dragging .dmg files around.
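One post-install step worth flagging (the installer prints it too): on Apple Silicon, Homebrew lives in /opt/homebrew, which isn’t on the default PATH. One line in your shell profile fixes it:

```shell
# Put Homebrew on the PATH for every new shell (Apple Silicon location)
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
```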
Here’s where things get opinionated. I split my apps into two categories: GUI apps I interact with visually, and terminal tools that power my command-line workflows.
As of this writing, these are my current preferences. They change over time, so check the last updated date at the top.
brew install --cask \
raycast google-chrome arc firefox \
ghostty visual-studio-code cursor \
figma notion linear-linear \
slack discord zoom loom \
cleanshot rectangle obsidian \
tableplus postman docker \
1password vlc maccy imageoptim

That’s 24 apps installed in under a minute. Let me walk through the ones that matter most.
brew install \
git gh wget nvm pnpm yarn \
jq tree htop tlrc bat \
fzf ripgrep eza claude-code

These are the quiet workhorses. bat is a better cat. eza is a better ls. fzf is fuzzy finding for everything. tlrc gives you practical examples instead of man pages. ripgrep searches code faster than you can think of what to search for.
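If you want the modern replacements to kick in automatically, a small ~/.zshrc drop-in does it. This is a sketch, not gospel: the alias choices are personal preference, and the fzf keybinding hook shown here needs a recent fzf (0.48+):

```shell
# ~/.zshrc: prefer the modern tools when they're installed
command -v bat >/dev/null && alias cat="bat --paging=never"
command -v eza >/dev/null && alias ls="eza --git"
command -v rg  >/dev/null && alias grep="rg"
# fzf keybindings: Ctrl+R history search, Ctrl+T file picker
command -v fzf >/dev/null && source <(fzf --zsh)
```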
This is the single most impactful app on this list. Raycast replaces Spotlight with something that actually understands how you work.
Cmd+Space opens it. From there I can:
I use three browsers, each with a distinct job:
Chrome is my primary. The dev tools are unmatched. Extensions: 1Password, uBlock Origin, React DevTools, JSON Viewer, Loom, Notion Web Clipper, Grammarly, and a design QA trio: ColorZilla, WhatFont, Page Ruler.
Arc is for research. Its Spaces feature lets me keep separate contexts (competitive research, customer interviews, documentation) without drowning in tabs.
Firefox Developer Edition is for cross-browser testing. Because “it works on Chrome” isn’t a shipping standard.
The default Terminal app is fine. Ghostty is better. It’s fast, memory-efficient, and GPU-accelerated. Split panes, native macOS feel, and none of the bloat.
Layer on Oh My Zsh for plugin management and Starship for a beautiful, informative prompt:
# Oh My Zsh
sh -c "$(curl -fsSL \
https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
# Starship prompt
brew install starship
echo 'eval "$(starship init zsh)"' >> ~/.zshrc
# Hack Nerd Font (for icons in the prompt)
brew install --cask font-hack-nerd-font

Three plugins that make the terminal feel like it can read your mind:
These are custom plugins, so you need to clone them first:
git clone https://github.com/zsh-users/zsh-autosuggestions \
${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-autosuggestions
git clone https://github.com/zsh-users/zsh-syntax-highlighting \
${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting
git clone https://github.com/zsh-users/zsh-completions \
${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-completions

Then add them to your plugin list in ~/.zshrc:
plugins=(
git
zsh-completions
zsh-autosuggestions
zsh-syntax-highlighting
docker
npm
)

These save me hundreds of keystrokes a day:
# Git (the ones I actually use)
alias gs="git status"
alias ga="git add ."
alias gc="git commit -m"
alias gp="git push"
alias gl="git lg"
# Navigation
alias projects="cd ~/Projects"
alias work="cd ~/Projects/work"
# Utilities
alias week="date +%V"
alias serve="python3 -m http.server 8000"

Although a traditional IDE is needed less and less with AI-powered tools like Cursor and Claude Code handling most of the heavy lifting, I’m still using VS Code to manually review and update code. Honestly, I expect this section to be removed in the next six months.
For now, VS Code is where I spend a good chunk of my day. Not just for code. I use it for Markdown, JSON, YAML, meeting notes, and PRDs.
My extensions fall into four buckets: writing, product work, development, and productivity.
A few settings that make VS Code feel like a focused writing environment, not an IDE:
{
"editor.fontSize": 14,
"editor.fontFamily": "Hack Nerd Font Mono",
"editor.minimap.enabled": false,
"editor.padding.top": 36,
"workbench.colorTheme": "GitHub Dark Default",
"workbench.sideBar.location": "right",
"workbench.activityBar.location": "hidden",
"files.autoSave": "afterDelay",
"files.autoSaveDelay": 1000,
"[markdown]": {
"editor.formatOnSave": false,
"editor.wordWrap": "on"
}
}

Sidebar on the right. Activity bar hidden. Minimap off. Auto-save on. It’s a writing tool that happens to also run code.
This is the 2025 part. Two tools that didn’t exist in my setup a year ago, and now I can’t imagine working without them.
For rapid prototyping and MVP development. When I need to go from “idea on a whiteboard” to “working prototype” in an afternoon, Cursor is where it happens.
This is the tool that changed how I work. Claude Code lives in my terminal and handles complex, multi-step coding tasks autonomously. Currently running with Opus 4.6, it’s the best I’ve found for handling PRDs and building quick MVPs.
npm install -g @anthropic-ai/claude-code

What I actually use it for:
# Generate a PRD from rough notes
claude "Create a comprehensive PRD for 'user-authentication'
with problem statement, user stories, success metrics,
and technical considerations"
# Analyze customer feedback
claude "Analyze this feedback file. Extract themes,
pain points, feature requests, and sentiment"
customer-feedback.txt
# Scaffold a prototype
claude "Build a landing page with hero, features,
and CTA using Tailwind CSS"

These tools don’t get the headlines, but they keep everything running smoothly.
Notion
Documentation hub. PRD templates, meeting notes, customer interview databases, competitive analysis. The single source of truth for everything written.
Obsidian
Personal knowledge management. Local-first, Markdown-based. Where I build my “second brain”: daily notes, product insights, reading notes, patterns I notice across customer calls.
Figma
Design collaboration. Review designs, create quick wireframes, annotate with feedback, prototype simple flows. Learn the shortcuts: C for comments, V for move.
Linear
Issue tracking that doesn’t feel like punishment. Keyboard-driven, fast, beautiful. C to create, / to search. Custom views per project.
CleanShot X
Screenshots and screen recording that beat the macOS built-ins in every way. Annotate instantly, record GIFs for bug reports, scrolling capture for long pages.
Loom
Async video for remote PMs. Share product demos, give design feedback, explain complex concepts, all without scheduling a meeting.
Rectangle + Maccy
Window management via keyboard (Ctrl+Opt+arrows) and clipboard history (Cmd+Shift+V). Small tools, massive time savings.
Docker Desktop
Containers for local development. Run databases, APIs, and full-stack apps without polluting your system. It’s the quickest way to get a reproducible dev environment. Worth noting: if your team runs Kubernetes in production, Rancher Desktop (free, ships with k3s) might be a better fit. Podman is another solid alternative if you want something lighter and daemonless.
TablePlus + Postman
Database GUI and API testing. For when you need to verify metrics, understand data models, or test endpoints yourself. Read-only production access is your friend.
Even as a PM, these are non-negotiable. You need to clone repos, review PRs, and run prototypes locally.
git config --global user.name "Murat Karslioglu"
git config --global user.email "your-email@company.com"
git config --global init.defaultBranch main
# A beautiful git log
git config --global alias.lg "log --color --graph \
--pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset \
%s %Cgreen(%cr) %C(bold blue)<%an>%Creset' \
--abbrev-commit"

ssh-keygen -t ed25519 -C "github"
ssh-add --apple-use-keychain ~/.ssh/github
# Add to GitHub with the CLI
gh auth login
gh ssh-key add ~/.ssh/github.pub -t github

# Install via NVM (version manager)
nvm install --lts
node -v && npm -v
# Global tools for quick prototyping
npm install -g serve http-server \
json-server netlify-cli vercel

Tools are nothing without workflows. Here are the three I run on repeat.
pmsetup in the terminal: starts Docker, opens Notion, Linear, Slack.

Customer call → Loom recording → transcribe with Otter.ai → extract insights into Notion → synthesize patterns in Obsidian → update the PRD
Every insight has a clear path from conversation to product decision.
Sketch in FigJam → mockup in Figma → build in Cursor or with Claude Code → deploy to Vercel → share a Loom walkthrough
From idea to deployed prototype, tested with real users, in a single day.
Simple, predictable, and hard to mess up:
~/Projects/
├── work/ # Company projects
│ ├── docs/ # Internal documentation
│ ├── prototypes/ # Quick MVPs
│ └── research/ # Customer interviews, analysis
├── blog/ # Personal blog
├── learning/ # Courses, tutorials
└── personal/ # Side projects

Everything has a home. Nothing lives on the Desktop.
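The whole tree is one brace-expansion command away (works in bash and zsh):

```shell
# Create the full ~/Projects layout in one shot
mkdir -p ~/Projects/{work/{docs,prototypes,research},blog,learning,personal}
```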
A setup is only as good as its maintenance. I follow a simple cadence:
Weekly: brew update && brew upgrade, clear Downloads, archive old Notion pages, push Obsidian vault to GitHub.
Monthly: Update VS Code extensions, remove unused apps, clear browser caches, npm update -g.
Quarterly: macOS system update, audit installed apps, review security settings, clean up SSH keys.
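The weekly and monthly update steps can also run themselves. A hypothetical crontab sketch (crontab -e); the binary paths are assumptions for an Apple Silicon Homebrew install, so check yours with which brew and which npm:

```shell
# Weekly, Monday 09:00: refresh Homebrew packages
0 9 * * 1 /opt/homebrew/bin/brew update && /opt/homebrew/bin/brew upgrade
# Monthly, 1st at 09:00: update global npm tools
0 9 1 * * /opt/homebrew/bin/npm update -g
```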
Backup strategy: Code lives on GitHub. Documents in Notion + Google Drive. Personal notes in Obsidian (synced to GitHub). Passwords in 1Password. Config files in a dotfiles repo. Time Machine to an external SSD for everything else.
If you’re starting a new PM role, treat everything above as your week-one checklist.
This setup takes about an hour from a blank MacBook to a fully operational PM workstation. It balances three things:
Technical
Build MVPs, understand the stack, speak the team’s language
Clear
Write PRDs, present effectively, document thoroughly
Fast
Minimal friction, zero context switching, great tools
Tools don’t make you a great PM. Solving customer problems does. But great tools let you move faster and think clearer.
Now close this tab and go set up your machine.
Every personal site starts with a question: what do I actually want this to be?
I didn’t want a bloated portfolio template. I wanted something fast, clean, and focused on writing. Something that felt like opening a well-designed book.
After evaluating several options, the choice was clear.
The design follows a few simple principles.
This site is a living project. I plan to keep iterating: adding new content, refining the design, and experimenting with new ways to tell stories on the web.
If you’re curious about the source code, it’s all on GitHub.
Thanks for scrolling along.
Welcome to my new website. I’ve been meaning to build this for a while, and here we are.
In the age of social media, having your own space on the internet feels more important than ever. It’s a place to think out loud, share what I’m learning, and connect with people who are curious about similar things.
I plan to write about a few topics that I spend most of my time thinking about.
Here’s a code snippet, because every developer blog needs one:
function buildSomethingGreat(idea: string): Product {
const validated = validateWithUsers(idea);
const prioritized = applyFramework(validated, "RICE");
return ship(prioritized);
}
I’ll be publishing regularly, or at least that’s the plan. If any of this sounds interesting, stick around.
“The best time to plant a tree was twenty years ago. The second best time is now.”
Thanks for reading.

Every storage protocol you need to understand, why it exists, and which ones will survive the next decade.
Before we get into it, let’s agree on terms. A storage protocol is the language a computer uses to talk to the thing holding its data. It defines how bytes get from Point A (your application) to Point B (a disk, a flash chip, a cloud bucket, a GPU’s memory). Different protocols exist because Point B keeps changing, and so does what we’re asking it to do.
Think of it like ordering food. You can walk into the kitchen and grab it yourself (local storage). You can call a waiter (a network protocol). You can use a delivery app (cloud API). Each method has trade-offs in speed, convenience, and how much control you have over what arrives.
The AI revolution didn’t invent new physics. Data still lives on magnetic platters, flash cells, or DRAM chips. But it did change what “fast enough” means, how much data we’re moving, and where it needs to go. Let’s walk through every layer.

Local storage means a physical device plugged directly into your machine. No network, no middleman. Three technologies dominate.
A spinning metal platter coated in magnetic material, with a tiny arm that floats nanometers above the surface reading and writing data. The same basic design since IBM shipped the first one in 1956 (it was the size of two refrigerators and held 5 MB).
How they work. The arm seeks to a position on the platter, waits for the right sector to spin underneath, and reads or writes magnetically. This mechanical motion is why HDDs have seek times measured in milliseconds. The arm literally has to move.
Speed. Sequential reads around 200-250 MB/s for modern drives. Random I/O is the killer: maybe 100-200 IOPS because each operation requires a physical seek.
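The IOPS ceiling falls straight out of the mechanics. A back-of-envelope sketch (7,200 RPM, and an assumed 8 ms average seek):

```shell
# Average rotational latency = half a revolution; add the seek and invert.
awk 'BEGIN {
  rpm  = 7200
  rot  = (60000 / rpm) / 2   # ms: half a revolution at 7,200 RPM
  seek = 8                   # ms: typical average seek (assumption)
  printf "service time %.2f ms -> ~%d IOPS\n", rot + seek, 1000 / (rot + seek)
}'
```

Shorter seeks, command queueing, and short-stroking push real drives toward the 100-200 range, but the order of magnitude is fixed by physics.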
Where they still win. Cost per terabyte. An 18TB HDD costs around $250. That’s roughly $0.014/GB. Nothing else comes close for bulk capacity. Cold archives, backup targets, surveillance footage, regulatory retention. Any workload where you need petabytes and can tolerate latency.
AI relevance. HDDs still hold the majority of the world’s training data in cold storage tiers. The dataset you download from Hugging Face probably lived on spinning rust before it reached you.
No moving parts. Data is stored in NAND flash cells: tiny transistors that trap electrons to represent bits. SATA SSDs plug into the same connectors that HDDs use, which made them a drop-in upgrade starting around 2010.
How they work. Flash cells are organized into pages (4-16KB) and blocks (256-512 pages). You can read or write individual pages, but you can only erase an entire block at once. This asymmetry (read a page, erase a block) is the source of most SSD complexity. A chip called the Flash Translation Layer (FTL) manages the mapping.
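That asymmetry is easy to quantify. In the worst case, rewriting a single page forces the FTL to relocate an entire block; a sketch using the 16 KB pages and 256-page blocks from above:

```shell
awk 'BEGIN {
  page_kb = 16    # one writable page
  ppb     = 256   # pages per erase block
  # Worst case: updating 16 KB triggers a read-modify-write of the whole block
  printf "block = %d KB; worst-case write amplification = %dx\n", page_kb * ppb, ppb
}'
```

Real FTLs blunt this with garbage collection and over-provisioning, but erase-block granularity is why SSDs slow down as they fill up.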
Speed. SATA tops out at 600 MB/s (the interface is the bottleneck, not the flash). Random IOPS around 50,000-100,000.
AI relevance. Minimal for training workloads. SATA’s 600 MB/s ceiling is a hard wall. But plenty of inference servers still have SATA SSDs for the OS and model weight storage where latency isn’t the primary concern.
NVMe (Non-Volatile Memory Express) is what happens when you throw away the legacy interface and design a protocol specifically for flash. Instead of talking through the SATA/AHCI stack (designed for spinning disks), NVMe talks directly over PCIe lanes, the same high-speed bus your GPU uses.
How they work. Same NAND flash as SATA SSDs, but the protocol supports 65,535 queues with 65,536 commands each (vs SATA’s single queue of 32 commands). That’s the difference between a single-lane road and a 65,535-lane highway.
Speed. PCIe Gen4 x4: 7 GB/s reads. Gen5 x4: 14 GB/s. Random IOPS: 1,000,000+. A single NVMe drive is faster than an entire rack of HDDs.
Form factors. M.2 (the little stick in your laptop), U.2 (2.5” enterprise), and the newer EDSFF (ruler-shaped, designed for maximum density: 32 drives in 1U for 4+ PB in less than 2 inches of rack space).
AI relevance. This is where it gets interesting. A single GPU training run might read hundreds of terabytes. NVMe’s bandwidth means a node with 24 drives can deliver 168 GB/s to local applications. That’s enough to feed multiple GPUs without starving them. NVIDIA’s GPUDirect Storage (GDS) can even bypass the CPU entirely. Data flows straight from NVMe to GPU memory over PCIe.
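The 168 GB/s figure is just drives times per-drive bandwidth, but it’s worth seeing what it buys. Assuming a rough 4 GB/s of ingest per training GPU (an illustrative planning number; real demand varies wildly by model and data format):

```shell
awk 'BEGIN {
  drives = 24; per_drive = 7   # GB/s sequential, PCIe Gen4 x4
  need   = 4                   # GB/s per GPU (illustrative assumption)
  total  = drives * per_drive
  printf "%d GB/s aggregate -> keeps ~%d GPUs fed\n", total, total / need
}'
```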
The cost. NVMe is 3-5x the price per TB of HDDs. But price-per-IOPS and price-per-GB/s tell a completely different story. For performance-sensitive workloads, NVMe is the cheapest option by far.
Here’s where it gets fun. What if the storage isn’t physically in your server but you want your applications to think it is?
DAS is technically “remote” in the sense that the drives live in a separate enclosure (a JBOD, Just a Bunch of Disks), connected to your server by a cable. But the connection is direct, not over a network. Common interfaces include SAS (Serial Attached SCSI) cables that can connect 100+ drives to a single server.
Think of it as an extension cord for your storage. Your server sees the drives as if they were internal. No network stack, no shared access. Simple, fast, cheap.
AI use case. DAS JBOFs (Just a Bunch of Flash) are the storage backbone of many GPU training clusters. NVIDIA DGX systems ship with NVMe SSDs as DAS. When you need raw bandwidth without network overhead, DAS wins.
NAS puts storage on the network and exposes it as a file system. Your server mounts a remote share and accesses files with standard read/write/open/close operations, the same POSIX semantics as a local filesystem.
The protocol. NFS (Network File System), invented by Sun Microsystems in 1984, is the Unix standard. SMB/CIFS is the Windows equivalent. NFSv4.1+ adds parallel NFS (pNFS) for distributing data across multiple servers.
How it feels. You mount -t nfs server:/export /mnt/data and then ls /mnt/data like it’s local. Applications don’t know the difference. That’s the magic, and the trap.
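To make that mount survive reboots, the same thing goes in /etc/fstab on Linux clients. A hypothetical entry (server name and export path are illustrative; ro because training data is read-only anyway, and nconnect opens multiple TCP connections on modern Linux kernels):

```
# /etc/fstab
storage01:/export/training  /mnt/data  nfs  vers=4.1,ro,noatime,nconnect=8  0  0
```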
The trap. POSIX file semantics (locks, permissions, open-close-delete atomicity) are expensive to maintain over a network. Every stat() call, every directory listing, every lock check crosses the network. At scale, metadata operations become the bottleneck, not data transfer.
AI relevance. NFS is the most common protocol for AI training data today. Why? Because PyTorch’s DataLoader, TensorFlow’s tf.data, and every ML framework expect a filesystem path. dataset = ImageFolder("/mnt/training-data/") just works. No special SDK, no API calls, no code changes. This simplicity is NFS’s superpower.
Here’s the dirty secret: NFS is often not the right protocol for AI workloads. Training data is read sequentially, shuffled, and never modified. POSIX semantics (locks, permissions, mtime tracking) are pure overhead. But NFS persists because changing the data loading code is friction, and engineers optimize for “works today” over “optimal tomorrow.”
Block storage strips away the file abstraction entirely. No filenames, no directories, no permissions. Just numbered blocks (typically 512 bytes or 4KB) on a logical volume. The server sees a raw disk and puts its own filesystem on top.
Think of it as renting an empty apartment. The building (SAN) provides the space, but you bring your own furniture (filesystem) and organize it however you want.
Storage Area Networks emerged in the late 1990s when databases outgrew local storage. The idea: build a dedicated high-speed network just for storage traffic, separate from the regular Ethernet LAN.
The protocols:
Fibre Channel (FC). The original SAN protocol. Dedicated switches, dedicated cables (fiber optic), dedicated HBAs (Host Bus Adapters). Blazing fast for its era (1 Gb/s in 1997, 64 Gb/s today). Extremely reliable. Extremely expensive. Think of FC like a private highway: fast and uncongested, but you have to build the entire road yourself.
iSCSI. “Let’s run SCSI commands over regular Ethernet.” Launched in 2003, iSCSI democratized SANs. Instead of dedicated FC infrastructure, you use your existing network. Slower than FC (Ethernet has more overhead), but dramatically cheaper. The Honda Civic to FC’s Ferrari.
Fibre Channel over Ethernet (FCoE). An attempt to get FC’s performance on Ethernet’s infrastructure. Required special “lossless” Ethernet switches. Never gained traction. It combined the complexity of both protocols with the advantages of neither.
The SAN era (roughly 2000-2015) was the golden age of enterprise storage vendors. EMC (now Dell EMC), NetApp, IBM, Hitachi, Pure Storage. These companies built empires selling SAN arrays that cost more than sports cars. A fully loaded EMC Symmetrix could run $5 million. The sales cycle involved steak dinners and golf outings. Storage admins had dedicated SAN teams.
What made SANs dominant was databases. Oracle, SQL Server, DB2: all needed consistent, low-latency block I/O. A SAN could deliver sub-millisecond latency with enterprise features like snapshots, replication, thin provisioning, and deduplication. Try doing that with a pile of local disks.
The decline. The cloud changed everything. AWS EBS (Elastic Block Store) is essentially a cloud SAN (block volumes over the network) but you don’t buy the hardware, configure the switches, or hire the SAN admin. On-premises SANs still exist (banks, hospitals, government), but new deployments are increasingly cloud-based or replaced by software-defined alternatives.
AI relevance. Block storage is critical for databases that support AI workflows: PostgreSQL for metadata, vector databases like pgvector, ML experiment tracking. But you don’t train models on block storage. The block interface (read block 47,382 from LUN 3) is a terrible match for “stream 50TB of images sequentially.”
NVMe-oF is the modern answer to the SAN. The concept: extend the NVMe protocol over a network, so remote flash drives appear as if they’re locally attached. Microsecond-level remote storage access.
Local NVMe is fast: 10 microsecond latency. But what if you have 1,000 NVMe drives in a rack and 100 compute nodes that need access? You can’t plug every drive into every server. NVMe-oF extends the NVMe queuing model over a network fabric, preserving the multi-queue architecture that makes NVMe fast.
| Transport | Latency Added | Infrastructure Required | Reality Check |
|---|---|---|---|
| RDMA (RoCEv2) | ~5-10 us | Lossless Ethernet (PFC/ECN), specialized NICs | Fastest, but configuring lossless Ethernet correctly is an art form. Misconfigure one switch and performance craters. |
| InfiniBand | ~2-5 us | Dedicated InfiniBand switches and HCAs | HPC standard, NVIDIA’s home turf. Fast and reliable, but separate network fabric. |
| TCP | ~30-80 us | Standard Ethernet | Easy to deploy, works everywhere. But 30-80us on top of NVMe’s 10us is a 3-8x latency hit. Still way faster than iSCSI. |
The promise. “Remote NVMe that feels local.” Disaggregated storage: separate your compute and storage into independent pools that scale independently.
The reality in 2026. NVMe/TCP works and is widely deployed, but “feels local” is a stretch when you 3x the latency. RDMA is genuinely close to local performance, but requires careful network engineering. InfiniBand delivers on the promise, but only within HPC/AI clusters that already run InfiniBand for GPU-to-GPU communication.
AI relevance. This is big. NVIDIA’s entire inference infrastructure assumes NVMe-oF as the transport between storage and compute. BlueField-4 DPUs speak NVMe-oF natively. When Jensen Huang talks about “AI factories,” the storage fabric connecting thousands of GPUs to petabytes of flash is NVMe-oF over InfiniBand or RoCEv2.

Object storage throws away everything you know about filesystems and block devices. No hierarchy. No directories. No block addresses. Just three things:
A bucket (a flat namespace), a key (a string like training-data/imagenet/n01440764/n01440764_10026.JPEG), and the object itself: the bytes plus their metadata.

You interact with it through HTTP: PUT to store, GET to retrieve, DELETE to remove, LIST to enumerate. That’s essentially the whole API.
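Spelled out as raw HTTP request lines (bucket and key hypothetical; real requests also carry an Authorization signature header), the whole surface looks like this:

```shell
# Print the four verbs that make up the entire object API
cat <<'EOF'
PUT    /training-data/imagenet/n01440764/n01440764_10026.JPEG   store an object
GET    /training-data/imagenet/n01440764/n01440764_10026.JPEG   retrieve it
DELETE /training-data/imagenet/n01440764/n01440764_10026.JPEG   remove it
GET    /training-data?list-type=2&prefix=imagenet/              list keys
EOF
```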
Amazon launched S3 (Simple Storage Service) on March 14, 2006. It was designed for one thing: giving web applications a place to store files without managing servers. Upload a profile photo, serve a static website, store log files. Nobody was thinking about AI.
Why S3 won. Three things that sound boring but changed everything: durability you never have to think about (eleven nines), scale you never have to plan for, and a simple HTTP API with pay-as-you-go pricing.
For its first decade, object storage had a reputation problem: it was slow. And honestly? It was. Early S3 latency was 50-200ms per request. You couldn’t run a database on it. You couldn’t mount it as a filesystem without hideous performance. It was “archive tier,” the place you put data when you didn’t need it anytime soon.
The reasons were architectural: HTTP overhead, eventually consistent reads (until 2020, S3 could return stale data after a write), and the simple fact that it was designed for throughput and durability, not latency.
Everything changed between 2020 and 2025:
Strong consistency (2020). S3 became strongly consistent at no extra cost. Read-after-write consistency for all operations. This single change eliminated the #1 objection from serious workloads.
S3 Express One Zone (2023). Purpose-built for latency-sensitive workloads. Single-digit millisecond first-byte latency. 10x faster than standard S3.
S3 Tables (2024). Native Apache Iceberg support. Object storage that understands tabular data. 3x faster queries, automatic compaction, built-in catalog.
S3 Vectors (2025). Native vector embedding storage and nearest-neighbor search. Sub-second queries over 2 billion vectors.
Performance at scale. Modern object stores (MinIO, Ceph RGW, and cloud-native ones) can deliver 100+ GB/s aggregate throughput on commodity hardware. That’s enough to feed a rack of GPUs.
Here’s the contrarian take that’s rapidly becoming consensus: object storage will eat everything.
Not because it’s the fastest protocol for every workload. It isn’t. But because it solves the problems that actually matter at scale:
Scale without limits. Filesystems break at billions of files (ask anyone who’s run ls on a directory with 10 million entries). Block storage requires LUN management and capacity planning. Object storage scales to trillions of objects by design. Flat namespace, hash-based distribution, no directory tree to maintain.
Economics. Object storage on commodity hardware costs 1/10th of enterprise SAN storage. Erasure coding gives you 11-nines durability at 1.5x raw capacity (vs. 3x for replication).
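The overhead numbers come straight from the layout. With a common 8+4 erasure-coding scheme (any 4 of 12 shards can be lost; the 8+4 split is an illustrative choice, not the only one):

```shell
awk 'BEGIN {
  data = 8; parity = 4   # 8 data shards + 4 parity shards per stripe
  printf "erasure coding:    %.1fx raw capacity\n", (data + parity) / data
  printf "3-way replication: %.1fx raw capacity\n", 3.0
}'
```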
HTTP is universal. Every language, every platform, every cloud speaks HTTP. No special drivers, no kernel modules, no vendor lock-in (assuming S3-compatible API).
Metadata is first-class. Unlike block and file storage, every object carries its own metadata. This is transformative for data management. Search, classify, govern, and lifecycle data based on its properties, not its location in a directory tree.
Immutability is natural. Objects are written once and read many times. This aligns perfectly with training datasets, model checkpoints, audit logs, and regulatory archives. No in-place updates means no corruption, no locking, no read-write conflicts.
A data lake is a fancy name for “dump everything into object storage and figure out the schema later.” Coined by Pentaho CTO James Dixon around 2010, the idea was to store raw data (structured, semi-structured, unstructured) in its native format on cheap storage (originally HDFS, now mostly S3-compatible object storage).
The appeal. No upfront schema design. No ETL pipeline to transform data before loading. Just dump your CSV files, JSON logs, Parquet tables, images, and videos into buckets. Analyze later with Spark, Presto, or Hive.
The problem. Data lakes became data swamps. Without schema enforcement, governance, or quality checks, organizations ended up with petabytes of data nobody could find, trust, or use. “Schema on read” sounds liberating until you realize nobody documented the schema.
The lakehouse architecture (Databricks coined the term in 2020) is the fix. It puts a structured table format (Apache Iceberg, Delta Lake, or Apache Hudi) on top of object storage. You get ACID transactions, schema enforcement and evolution, and time travel across table snapshots, while the underlying bytes still live in cheap object storage.
Why it matters for AI. A lakehouse is where training data lives in production. Your ML pipeline reads from Iceberg tables on object storage, trains a model, writes evaluation metrics back to another table, and stores model artifacts as objects. All in the same system.
The progression looks like this:
```
Raw data (logs, events, sensors)
        |
        v
Object Storage (S3-compatible, durable, cheap)
        |
        v
Iceberg Table (schema, versioning, ACID)
        |
        v
Feature Engineering (Spark, Flink, DuckDB)
        |
        v
Training Pipeline (PyTorch DataLoader)
        |
        v
Model Artifacts -> back to Object Storage
```
Everything in this pipeline speaks object storage. The lakehouse doesn’t replace S3. It adds structure on top of it. This is why object storage is the foundation layer that everything else builds on.

NVIDIA doesn’t build storage. But NVIDIA increasingly dictates what storage looks like through its certification programs, reference architectures, and the sheer gravitational pull of being the center of the AI universe.
Here’s something that surprises people: most AI training clusters today use NFS or Lustre for training data. Not object storage. File protocols.
Why? Three reasons:
PyTorch expects a filesystem. DataLoader(dataset=ImageFolder("/data/train/")) needs a mounted path. Rewriting data loaders to use S3 APIs is possible (via smart libraries) but adds complexity.
NVIDIA DGX and certification. NVIDIA’s validated designs (DGX SuperPOD, BasePOD) have historically certified file-based storage partners. WEKA, DDN Lustre/EXAScaler, VAST Data, NetApp: all primarily file/NFS vendors. The certification program ensures these systems can keep GPUs fed. If you want the “NVIDIA Certified” badge, you play by NVIDIA’s rules.
Random access patterns. Training with random shuffling requires random reads across a dataset. File protocols handle this naturally. Object storage traditionally adds HTTP overhead per request, making small random reads expensive.
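Range reads are the standard mitigation for that overhead. Here is a simplified sketch of the HTTP byte-range semantics involved (a toy GET handler, not any real server): a client can read a slice of a large object without fetching the whole thing.

```python
from typing import Optional

# Simplified GET handler showing HTTP byte-range semantics
# ("Range: bytes=start-end", with an inclusive end). Illustration only.

def get_object(data: bytes, range_header: Optional[str] = None) -> bytes:
    if range_header is None:
        return data                        # plain GET: the whole object
    unit, _, spec = range_header.partition("=")
    assert unit == "bytes", "only byte ranges are supported"
    start_s, _, end_s = spec.partition("-")
    start = int(start_s)
    end = int(end_s) if end_s else len(data) - 1
    return data[start:end + 1]             # HTTP ranges are inclusive

blob = bytes(range(256))                   # pretend this is a large shard
chunk = get_object(blob, "bytes=16-31")    # read 16 bytes at offset 16
```

One request still costs an HTTP round-trip, which is why random small reads remain the hard case; range reads help most when a loader can batch or prefetch them.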
But the tide is turning. NVIDIA’s storage ecosystem is evolving in a significant direction:
Larger datasets demand object scale. When your training set is 100TB, NFS can handle it. When it’s 10PB (common for foundation model training), you need object storage’s scale-out economics. No NFS server handles 10PB gracefully. Object storage distributes it across hundreds of nodes automatically.
Cloud training is object-native. Every major cloud’s AI training service (AWS SageMaker, Google Vertex AI, Azure ML) reads training data from object storage. Cloud-native training pipelines skip NFS entirely.
New protocols bridge the gap. S3-compatible APIs with range reads, batch operations, and prefetch hints are closing the performance gap. Libraries like AIStore, S3 connector for PyTorch, and fsspec abstract the protocol. Your DataLoader code stays the same, but reads come from S3 instead of NFS.
NVIDIA is broadening. The partner ecosystem is expanding beyond file storage. Fast object storage that can deliver sustained high-bandwidth reads to GPU clusters is becoming a validated tier. The writing is on the wall: object storage with performance guarantees will be a first-class citizen in NVIDIA’s reference architectures.
Here’s where it gets really interesting, and where most storage coverage misses the point entirely.
At CES 2026, NVIDIA announced ICMS (Inference Context Memory Storage). It’s not a product you buy. It’s a new tier in the memory hierarchy, sitting between local NVMe and shared object storage.
Why it exists. Modern AI inference (especially with large language models and AI agents) builds enormous context windows. When a chatbot maintains a 128K-token conversation, that context lives as a KV (key-value) cache in GPU memory. But GPU HBM is precious and limited. When the KV cache overflows HBM, it needs somewhere to spill.
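Some back-of-envelope arithmetic shows why spilling is unavoidable. All the model-shape numbers below are assumed (a 70B-class transformer with grouped-query attention), chosen only to illustrate the order of magnitude, not any specific product.

```python
# Back-of-envelope KV-cache size. Every parameter here is an assumption
# for illustration, not a real model's published configuration.

tokens   = 128_000   # context window length
layers   = 80        # transformer layers
kv_heads = 8         # grouped-query attention: few KV heads
head_dim = 128       # dimension per head
dtype_b  = 2         # bytes per value in fp16/bf16
kv_pair  = 2         # one K tensor and one V tensor per layer

cache_bytes = tokens * layers * kv_heads * head_dim * dtype_b * kv_pair
cache_gib = cache_bytes / 2**30   # roughly 39 GiB for one conversation
```

Under these assumptions, a single 128K-token conversation holds roughly 39 GiB of KV cache, a large fraction of one GPU's HBM before the model weights are even counted. Hence the tiered spill hierarchy below.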
The memory hierarchy for AI inference looks like this:
| Tier | What | Latency | What Lives There |
|---|---|---|---|
| G1 | GPU HBM | Nanoseconds | Active KV cache, model weights |
| G2 | Host RAM | Microseconds | Overflow KV cache, prefill buffers |
| G3 | Local NVMe | ~100 us | Warm context, model weight shards |
| G3.5 (ICMS) | Network flash (RDMA) | Low microseconds | Shared KV cache across pods |
| G4 | Object Storage | Milliseconds | Training data, checkpoints, datasets |
The magic is G3.5. Without ICMS, if Agent A builds a context on Node 1 and Agent B needs related context on Node 7, it has to be recomputed from scratch. ICMS creates a shared flash tier across the pod, powered by BlueField-4 DPUs with 800 Gb/s RDMA connectivity.
The rebrand: CMX. After GTC 2026 (March 16-19, San Jose), expect NVIDIA to rebrand ICMS as CMX (Context Memory eXtensions). Same technology, better name. “ICMS” is a mouthful; “CMX” matches NVIDIA’s naming style (think CUDA, NVLink, NIM) and positions context memory as a platform extension rather than a storage product. The technology doesn’t change (BlueField-4, NVMe-oF transport, NVIDIA Dynamo orchestration) but the messaging shifts from “storage infrastructure” to “memory architecture for agentic AI.”
Why should you care? Because CMX defines what the G4 object storage layer needs to be:
Fast enough to pre-stage into CMX. If your object store can’t deliver sustained 100+ Gb/s to the CMX tier, it becomes the bottleneck for the entire inference pipeline.
Smart enough to know what to pre-stage. The object store that understands inference patterns (which context windows are reused, which model shards are hot, which datasets feed which agents) will outperform one that treats everything as opaque blobs.
Integrated with the NVIDIA ecosystem. NVMe-oF transport, GDS support, Dynamo integration. The storage system that speaks NVIDIA’s language will get the certification, the reference architecture inclusion, and ultimately the deployment.
Let’s get practical. Here’s a cheat sheet:
Winner: Object Storage
Raw data arrives from everywhere: web scrapes, sensor feeds, user logs, public datasets. You need scale (petabytes), durability (don’t lose it), and cost efficiency (most of it won’t survive filtering). Object storage with lifecycle policies to tier cold data is the obvious choice.
Winner: Data Lakehouse (Object Storage + Iceberg)
Iceberg tables on object storage give you schema enforcement, versioned datasets, time travel for reproducibility, and engine-agnostic access. Run Spark or Flink for ETL, query with DuckDB for exploration, all reading from the same Iceberg tables.
Current Winner: NFS/Lustre. Future Winner: Object Storage
Today, NFS wins because of tooling compatibility and random read performance. But as datasets grow beyond what single NFS servers can handle, and as PyTorch’s data loading ecosystem adds first-class S3 support, object storage’s scale-out architecture becomes necessary. The crossover is happening now for datasets above ~500TB.
Winner: Object Storage
Checkpoints are large (multi-GB to TB), written periodically, and need durability. Object storage with versioning is ideal. Write checkpoint v47, keep the last 10 versions, auto-expire older ones. No filesystem to manage.
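A toy sketch of that retention policy (the CheckpointStore class is invented; a real deployment would express the same thing as an object-store lifecycle rule on versioned keys):

```python
# Toy in-memory sketch of "keep the last N checkpoint versions".
# Real systems express this as a lifecycle rule, not application code.

class CheckpointStore:
    def __init__(self, keep_last=10):
        self.keep_last = keep_last
        self.versions = {}            # version number -> checkpoint bytes

    def put(self, version, blob):
        self.versions[version] = blob
        # Auto-expire everything older than the newest keep_last versions.
        for old in sorted(self.versions)[:-self.keep_last]:
            del self.versions[old]

store = CheckpointStore(keep_last=10)
for step in range(1, 48):             # training just wrote checkpoint v47
    store.put(step, b"model-weights")

surviving = sorted(store.versions)    # only v38 .. v47 remain
```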
Winner: Local NVMe + CMX + Object Storage (tiered)
Hot model weights on local NVMe (G3). Shared context in CMX (G3.5). Model artifacts and full weight sets in object storage (G4). The tiers work together. CMX pre-stages from object storage, local NVMe caches the hottest data.
Winner: CMX (ICMS)
This is CMX’s reason for existing. Shared, transient, high-bandwidth context that doesn’t need durability but needs to be accessible across pods. Neither NFS nor object storage is designed for this workload.
Winner: Object Storage (with native vector support)
Billions of embeddings need scale-out storage, not a single-node vector database. AWS S3 Vectors showed the direction: vectors as a storage primitive, not a separate system.
Here’s my prediction for the next five years:
NFS and block storage won’t disappear, but they’ll shrink to niche roles. NFS for legacy compatibility and small-scale training. Block for databases. Neither grows.
Object storage becomes the universal foundation. Not because it’s perfect for every workload, but because it’s good enough for most and best for scale, economics, and data management. The performance gap with file/block protocols shrinks every year. When object storage is within 20% of NFS speed but 10x cheaper at 100x the scale, the math doesn’t lie.
Table formats (Iceberg) and vector indexes become standard features of object storage, not separate products. Just as S3 absorbed consistency, it will absorb tabular and vector capabilities. MinIO’s AIStor Tables and AWS S3 Tables/Vectors are the first wave.
CMX creates a new storage category that didn’t exist before: transient, shared, high-bandwidth context memory. It’s not file, block, or object. It’s something new, purpose-built for AI inference, and it will become as fundamental to AI infrastructure as GPUs are to training.
The storage protocol that wins the AI era isn’t the fastest one. It’s the one that understands data (schemas, embeddings, inference patterns, lifecycle) rather than just moving bytes. Object storage, extended with tables and vectors and integrated with CMX, is on that trajectory. Everything else is arguing about I/O latency while the world moves to data semantics.
The bytes still matter. They always will. But the protocol that just moves bytes and nothing else? That’s the one heading for the history books.
NVIDIA GTC 2026 runs March 16-19 in San Jose. Watch for the CMX announcement during Jensen Huang’s keynote (Monday, March 16, 8-11 AM PDT). For current ICMS details, see the NVIDIA Technical Blog and BlueField-4 announcement. NVMe-oF specifications from NVM Express. Apache Iceberg at iceberg.apache.org. S3 API reference at AWS S3 documentation.

A 37-year-old standard is holding storage back.
In 1983, the IEEE authorized a project to standardize the kernel interface across the proliferating zoo of Unix variants: AT&T System V, BSD, Xenix, SunOS, HP-UX, and others. The result, published in 1988 as IEEE Std 1003.1, was named POSIX (Portable Operating System Interface, the name suggested by Richard Stallman). Its goal was elegantly simple: write your program once, compile it on any conforming Unix, and it works.
POSIX’s I/O model reflected the reality of 1988 computing:
- Hierarchical paths (/home/user/data/file.txt) mapped directly to the on-disk structure of the Unix filesystem (UFS, later ext2/3/4, XFS, ZFS).
- Strong consistency: a write() followed by a read() on the same file descriptor returns the data you just wrote. Always. Immediately. No eventual consistency, no stale reads.
- Cheap metadata: stat(), chmod(), chown(), utimes(). Querying and modifying file metadata costs a few microseconds because the inode is on a local disk, cached in RAM.

For three decades, this model worked. It worked for workstations, for databases, for HPC (where parallel filesystems like Lustre and GPFS extended POSIX semantics across clusters), for web servers, for everything that ran on Unix.
It worked because the fundamental assumption held: storage is local, or close enough to local that the abstraction doesn’t leak.
That assumption is now false.
Every POSIX operation begins with metadata. open() traverses the directory tree, resolving each path component through inode lookups. stat() fetches inode attributes. readdir() enumerates directory entries. On a local ext4 filesystem with VFS caching, these operations complete in microseconds.
On a distributed filesystem with billions of files, they don’t.
A metadata server (MDS) in Lustre, CephFS, or HDFS must handle every stat(), every open(), every readdir() from every client. AI training pipelines that scan millions of small image files (ImageNet: 14 million files, average 100 KB) generate millions of metadata operations per minute. The MDS becomes the bottleneck long before the data servers are saturated.
The standard workaround (“pack your small files into tar archives”) is an admission that the filesystem abstraction has failed. When the recommended practice is to work around the interface rather than use it, the interface is wrong.
Object storage has no metadata server. A key like training/imagenet/n01440764/n01440764_10026.JPEG is hashed directly to a storage node. No directory traversal, no inode lookup, no centralized bottleneck. Flat namespaces scale linearly.
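A minimal sketch of what "hashed directly to a storage node" means, with SHA-256 and an invented node list standing in for whatever a real system uses:

```python
import hashlib

# Deterministic placement sketch: hash the key, pick a node. No directory
# walk, no inode lookup, no metadata server consulted. SHA-256 and the
# modulo scheme are stand-ins for illustration.

NODES = [f"node-{i}" for i in range(8)]

def place(key: str) -> str:
    digest = hashlib.sha256(key.encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]

owner = place("training/imagenet/n01440764/n01440764_10026.JPEG")
```

Every client computes the same answer independently, which is exactly what makes the namespace scale linearly.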
POSIX I/O is stateful. open() creates a file descriptor with an implicit seek position. The kernel tracks this state for every open file, across every process, on every node that mounts the filesystem.
In a distributed system with 1,000 clients, each with 100 open files, the filesystem must maintain 100,000 pieces of state and keep them consistent. If a client crashes, the server must detect the failure and clean up its state (file locks, lease renewals, buffered writes). NFS’s statd and lockd daemons exist solely to manage this complexity, and they are notoriously unreliable.
Object storage is stateless. PUT /key writes an object. GET /key reads it. No open, no close, no seek position, no file descriptor. Each request is self-contained. A crashed client leaves no state to clean up. A failed server leaves no orphaned locks to resolve.
POSIX defines two locking mechanisms: flock() (BSD advisory locks) and fcntl() (POSIX record locks). Both are broken in distributed environments.
The dysfunction is legendary:
- flock() doesn’t work over NFS. Prior to Linux 2.6.12, flock() on NFS files locked only locally. Other nodes saw no lock at all. Kernel 2.6.12 “fixed” this by silently converting flock() calls to fcntl() POSIX locks, which broke programs that acquired both lock types on the same file.
- fcntl() is unreliable over NFS. Different kernel versions implement it differently. Some lock locally and don’t notify the server. Some notify the server but do it wrong. There is no way to detect whether file locking actually works on a given NFS mount.
- flock() fails on NFS. fcntl() fails on SMB. There is literally no POSIX-compliant locking mechanism that works reliably across network filesystems.

Object storage doesn’t need locks. Objects are immutable once written (or versioned). Concurrent writes to the same key are resolved by last-writer-wins or conditional writes (ETags, If-Match). There is no shared mutable state to protect.
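The conditional-write mechanism can be sketched in a few lines. This is a toy in-memory model of If-Match semantics (the compare-and-swap pattern that replaces POSIX locks), not a real S3 client:

```python
import hashlib

# Toy model of conditional writes: a PUT succeeds only if the caller's
# If-Match ETag still identifies the current object version. Illustration
# only; the store and helpers are invented.

store = {}  # key -> (etag, data)

def compute_etag(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()   # S3-style ETag for simple PUTs

def put(key, data, if_match=None):
    current = store.get(key)
    if if_match is not None and (current is None or current[0] != if_match):
        return False                       # precondition failed (HTTP 412)
    store[key] = (compute_etag(data), data)
    return True

put("config", b"v1")
tag = store["config"][0]

won  = put("config", b"v2", if_match=tag)  # saw the latest version: accepted
lost = put("config", b"v3", if_match=tag)  # stale ETag: rejected, no lock held
```

The losing writer gets a clean rejection it can retry from, instead of an orphaned lock that a crashed peer left behind.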

POSIX guarantees close-to-open consistency at minimum, and many implementations provide stricter guarantees: a read() after a write() on the same file always returns the new data. In a distributed filesystem, maintaining this guarantee requires distributed locking, cache invalidation, and consensus protocols that scale poorly.
CephFS, which implements POSIX semantics over a distributed object store (RADOS), documents its deviations from POSIX explicitly, because full compliance is either impossible or prohibitively expensive at scale. Lustre similarly relaxes POSIX guarantees under concurrent access to maintain performance.
The irony: most modern applications don’t need POSIX consistency. AI training reads are embarrassingly parallel. Each worker reads different files, no sharing. Analytics queries read immutable Parquet files. Log ingestion appends to different partitions. The consistency guarantees that POSIX enforces (at enormous cost) are consumed by almost nobody.
Object storage offers tunable consistency. S3 achieved strong read-after-write consistency in December 2020, not because POSIX demanded it, but because applications needed it. The system provides exactly the guarantee required, no more.
POSIX namespaces are hierarchical: directories contain files and other directories, forming a tree. This model assumes that the organizational structure of data is known at write time and doesn’t change.
Modern data infrastructure violates this constantly. Partitioning schemes evolve, datasets are reorganized long after ingestion, and simply enumerating one slice of a large namespace can mean calling stat() on thousands of directory entries.

Object storage’s flat namespace with prefix-based listing is simpler and faster. ListObjectsV2(prefix="sales/2025/Q3/") returns matching keys without traversing a directory tree. Prefixes are metadata, not structure.
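Prefix listing is just string filtering over a flat, sorted key space, as this small sketch (invented keys) shows:

```python
# ListObjectsV2-style listing is a string filter over sorted keys,
# not a tree traversal. Keys below are invented for illustration.

keys = sorted([
    "logs/2025/app.json",
    "sales/2025/Q3/eu.parquet",
    "sales/2025/Q3/us.parquet",
    "sales/2025/Q4/us.parquet",
])

def list_objects(keys, prefix=""):
    return [k for k in keys if k.startswith(prefix)]

q3 = list_objects(keys, prefix="sales/2025/Q3/")
```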
Every POSIX I/O operation is a syscall: open(), read(), write(), close(), stat(), fstat(), lseek(), fsync(). Each syscall crosses the user-kernel boundary, triggering a context switch that costs 100-500 nanoseconds on modern hardware.
For a training pipeline reading millions of small files:
- open(): 1 syscall
- fstat(): 1 syscall (get file size)
- read(): 1-N syscalls (depending on file size)
- close(): 1 syscall

That’s 4+ syscalls per file, millions of files, hundreds of nanoseconds each. Millions of context switches per second just to read training data. This is why frameworks like NVIDIA DALI, WebDataset, and TFRecord exist. They pack files into sequential archives to amortize syscall overhead across thousands of samples.
Object storage replaces this with a single HTTP request: GET /key. One network round-trip, one response, no kernel state transitions.
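The archive-packing idea behind frameworks like WebDataset and TFRecord can be demonstrated with nothing but the standard library: many small samples become one sequential tar stream, so per-file open/stat/close overhead is paid once per archive rather than once per sample. File names and contents here are invented.

```python
import io
import tarfile

# Pack many small samples into one archive (done once, offline), then read
# them back as a single sequential stream, as a data loader would.

samples = {f"sample_{i:04d}.jpg": bytes([i % 256]) * 64 for i in range(100)}

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name, data in samples.items():
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

buf.seek(0)
restored = {}
with tarfile.open(fileobj=buf, mode="r") as tar:
    for member in tar:                    # one sequential pass, no per-file open()
        restored[member.name] = tar.extractfile(member).read()
```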
POSIX permissions (owner/group/other, rwx bits) and even POSIX ACLs were designed for multi-user Unix workstations. They assume a local user database, identities known to the kernel, and access decisions made once, at open() time.

None of this maps to modern cloud infrastructure, where identities are federated across organizations, every request is authenticated independently, and policy must express things like allow GetObject but deny ListBucket for the same prefix.

Object storage’s IAM model (policies attached to identities, evaluated per-request) is fundamentally more expressive and more secure than anything POSIX can offer.

The storage industry’s instinct, when confronted with a new paradigm, is to build a bridge. POSIX is everywhere. Applications expect it. So we’ll put a POSIX layer on top of object storage and everyone can keep their existing code.
This is how we got:
Ceph RGW. An S3-compatible gateway that translates HTTP requests into RADOS operations. Every S3 PUT becomes a series of internal RADOS writes with metadata bookkeeping. Small objects suffer disproportionately. The translation overhead (multipart handling, bucket index updates, metadata journal writes) can exceed the actual data I/O.
S3FS-FUSE. Mounts an S3 bucket as a local filesystem. Every read() becomes an HTTP GET. Every stat() becomes a HEAD request. Every readdir() becomes a ListObjects call. The latency of each operation goes from microseconds (local VFS cache) to milliseconds (network round-trip). SNIA has documented why S3FS fails for AI/ML workloads. The performance penalty is 10-100x compared to local filesystem access.
HDFS. Attempted to provide a filesystem interface over distributed storage while relaxing POSIX semantics (append-only, no random writes). Still requires a centralized NameNode for all metadata operations, which becomes the bottleneck at scale.
JuiceFS, cunoFS, Alluxio. Modern attempts to provide high-performance POSIX over object storage. Better engineered than S3FS, but still fundamentally constrained by the impedance mismatch: every POSIX operation must be translated into one or more object operations, with metadata consistency maintained by an external database (Redis, TiKV, PostgreSQL).
Every translation layer adds latency, complexity, and failure modes. Every gateway is a bottleneck. Every bridge is a constraint.
The solution is not a better bridge. The solution is to stop crossing the river.
Applications that need POSIX (legacy databases, desktop file managers, NFS-based workflows) will continue to use local or network filesystems. They always will. But every new application, every new training pipeline, every new analytics platform, every new AI agent framework should be built on native object storage APIs. Not because POSIX is bad. It was great for what it was designed to do. But the workloads have changed, the scale has changed, and the assumptions have changed.
Here’s the question that keeps appearing in storage futurism panels: “Will quantum computing make all of this obsolete? Will quantum storage replace object storage entirely?”
The short answer: no. Not in your career, and probably not in your children’s careers.
Quantum computing’s fundamental unit, the qubit, has a property that makes it useless for persistent storage: decoherence. A qubit’s quantum state (the superposition that gives it computational power) decays over time as the qubit interacts with its environment. Current coherence times range from microseconds to milliseconds for superconducting qubits.
For context: a modern NVMe SSD retains data for years. A qubit retains its state for millionths of a second.
Recent progress is encouraging. Researchers at the University of Innsbruck demonstrated a multi-ion quantum memory with a coherence time exceeding two hours in a cryogenic trap. But this required exotic laboratory conditions and stored a single qubit. Storing a petabyte (8 x 10^15 bits) with quantum fidelity is not an engineering challenge we’re within decades of solving.
Moreover, the no-cloning theorem (a fundamental law of quantum mechanics, not an engineering limitation) states that an unknown quantum state cannot be perfectly duplicated. This means:
Every classical storage system’s durability guarantee (eleven nines of durability, N+M redundancy, geographic replication) depends on the ability to copy data. Quantum mechanics forbids this for quantum states. You cannot build a durable storage system on a foundation that prohibits copies.
QRAM (Quantum Random Access Memory), the theoretical ability to query classical data in superposition, is a genuine research topic with real potential for quantum algorithms (Grover’s search, HHL linear system solving, quantum ML). But QRAM is about accessing classical data from a quantum computer, not about storing data in quantum states. The storage layer remains classical.
Quantum computing’s real impact on storage is not about replacing it. It’s about breaking its security model.
Shor’s algorithm, running on a sufficiently powerful quantum computer, can factor large integers and compute discrete logarithms in polynomial time. This breaks:
These are the cryptographic primitives that protect data at rest (AES key wrapping, disk encryption key management), data in transit (TLS), and data integrity (digital signatures on checksums).
The timeline is debated but converging: cryptographically relevant quantum computers (CRQCs) are projected for the 2030s, with nation-state actors potentially arriving earlier. Citi Research published an assessment in January 2026 calling this “the trillion-dollar security race.” The “harvest now, decrypt later” threat (adversaries capturing encrypted traffic today to decrypt it when quantum computers arrive) is already considered active by intelligence agencies.
NIST responded by finalizing three post-quantum cryptography standards in August 2024: ML-KEM (FIPS 203, key encapsulation), ML-DSA (FIPS 204, digital signatures), and SLH-DSA (FIPS 205, hash-based signatures).
A fourth, HQC, a code-based backup algorithm for ML-KEM, followed in March 2025.
For storage systems, this means auditing every cryptographic dependency: key-wrapping and key-exchange schemes, signature-based integrity checks, and TLS endpoints all need post-quantum replacements.
Quantum computing doesn’t replace object storage. It makes object storage’s security model obsolete, and demands a migration to post-quantum cryptography that most storage systems haven’t started.
The path forward is not incremental. It’s not “POSIX with some object features” or “object storage with a POSIX gateway.” It’s a clean break.
The S3 API (PUT, GET, DELETE, HEAD, ListObjects, multipart upload, pre-signed URLs) is the lingua franca of modern data infrastructure. Every cloud provider speaks it. Every AI framework reads from it. Every analytics engine queries through it. Kubernetes has COSI (Container Object Storage Interface) as the native standard for provisioning S3-compatible buckets, complementing CSI for block/filesystem storage.
New storage systems should speak S3 natively. Not through a gateway, not through a translation layer, but as their primary and only data interface. No POSIX shim. No FUSE mount. No NFS gateway. If an application needs POSIX, it can use a local filesystem or a purpose-built network filesystem. The object store should not contort itself to emulate something it isn’t.
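Pre-signed URLs, one of the S3 API features listed above, illustrate how far this model is from POSIX identity. Here is a deliberately simplified sketch of the idea (an HMAC over method, key, and expiry); real S3 uses the full SigV4 scheme, and every name and format below is invented for illustration.

```python
import hashlib
import hmac

# Simplified pre-signed URL sketch: the server signs (method, key, expiry)
# with a secret; anyone holding the URL can perform exactly that operation
# until the expiry, with no account and no POSIX identity involved.

SECRET = b"server-side-signing-key"

def presign(method: str, key: str, expires_at: int) -> str:
    msg = f"{method}\n{key}\n{expires_at}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"/{key}?method={method}&expires={expires_at}&sig={sig}"

def verify(method: str, key: str, expires_at: int, sig: str, now: int) -> bool:
    msg = f"{method}\n{key}\n{expires_at}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig) and now < expires_at

url = presign("GET", "reports/q3.pdf", expires_at=1_800_000_000)
```

The capability travels with the URL itself: change the method, the key, or the clock, and verification fails.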
As I wrote in Storage Is Dead. Long Live Data., the next storage system must understand its contents: Iceberg tables, vector embeddings, inference context. This is the opposite of POSIX, which treats everything as a bag of bytes with permissions attached.
Native object storage can embed table catalogs, vector indexes, and schema metadata directly into the storage engine. POSIX can’t. Its metadata model is fixed by a 37-year-old standard that knows about owners, groups, timestamps, and permission bits. Nothing else.
New storage systems being designed today will be in production in the 2030s, squarely within the CRQC threat window. Building with classical-only cryptography is technical debt with a known, approaching deadline.
The right architecture: ML-KEM for key exchange, ML-DSA for object integrity signatures, AES-256-GCM for data encryption (quantum-resistant at 256-bit key lengths), and crypto-agility built into the wire protocol so algorithms can be rotated without a format migration.
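Crypto-agility at the wire level can be as simple as a self-describing header. A sketch, with the magic bytes and algorithm IDs invented for illustration: rotating algorithms becomes a registry change, not a format migration.

```python
import struct

# Self-describing encrypted-object header: 4-byte magic, 2-byte algorithm ID,
# 4-byte payload length, then the ciphertext. All constants are invented;
# the point is that a new algorithm is a new ID, not a new storage format.

ALGOS = {1: "AES-256-GCM", 2: "ML-KEM-768+AES-256-GCM"}
MAGIC = b"OBJ1"

def wrap(algo_id: int, ciphertext: bytes) -> bytes:
    return MAGIC + struct.pack(">HI", algo_id, len(ciphertext)) + ciphertext

def unwrap(blob: bytes):
    assert blob[:4] == MAGIC, "unknown format"
    algo_id, length = struct.unpack(">HI", blob[4:10])
    return ALGOS[algo_id], blob[10:10 + length]

blob = wrap(2, b"\x00" * 32)          # stand-in ciphertext, not real crypto
algo, ciphertext = unwrap(blob)
```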
The defining architectural choice: no centralized metadata server (unlike HDFS’s NameNode, CephFS’s MDS, or Lustre’s MDT). Object placement computed deterministically via consistent hashing. BLAKE3 to a partition, HRW to a node. Metadata travels with the object or lives at computed locations. No gateway process translates between protocols. The storage engine is the API server.
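HRW (highest-random-weight, or rendezvous) hashing, the placement scheme named above, is compact enough to show in full. BLAKE2 from the standard library stands in for BLAKE3 here, which is an assumption of this sketch.

```python
import hashlib

# Rendezvous (HRW) hashing: every node gets a score for the key and the
# highest score wins. Removing a node remaps only the keys that node owned,
# and every client computes the same answer with no metadata server.

def hrw_node(key: str, nodes):
    def score(node):
        h = hashlib.blake2b(f"{node}:{key}".encode(), digest_size=8)
        return int.from_bytes(h.digest(), "big")
    return max(nodes, key=score)

nodes = [f"node-{i}" for i in range(16)]
owner = hrw_node("training/shard-0042.parquet", nodes)
```

The minimal-disruption property is what makes this preferable to naive modulo hashing: shrinking the cluster by one node leaves every other key's owner unchanged.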
This eliminates the metadata-server bottleneck, the gateway translation layer, and the failure modes both introduce.

POSIX earned its place in computing history. It unified Unix, enabled portable software, and provided a stable foundation for 37 years of systems engineering. That’s a remarkable achievement for any standard.
But POSIX was designed for a world where storage was a local disk, files numbered in the thousands, users sat at terminals, and “distributed” meant NFS over 10 Mbps Ethernet. It was not designed for petabyte-scale flat namespaces, billions of immutable objects, AI training pipelines that read millions of files per hour, or federated identity systems that span clouds.
The choice for new storage systems is clear:
Adopt native object storage (no gateway, no metadata server, no POSIX shim) or lose to the systems that did.
Quantum computing won’t save POSIX. It won’t replace object storage. What it will do is break the cryptographic foundations that both rely on, forcing a migration to post-quantum algorithms that’s easier to do in a clean, modern system than in one dragging 37 years of compatibility baggage.
The river has moved. Stop building bridges to the old bank.
POSIX history from IEEE Std 1003.1-1988 and The Open Group. POSIX I/O scalability analysis from The Next Platform and Frontiers in HPC. NFS locking problems from apenwarr and Lennart Poettering. CephFS POSIX deviations from Ceph documentation. S3FS limitations from SNIA. QRAM research from Quantum Journal. Post-quantum cryptography standards from NIST. Quantum security timeline from BCG and Citi Research. Kubernetes COSI from kubernetes.io.

Published ahead of NVIDIA GTC 2026 (San Jose, March 16-19)
Every decade, the storage industry reinvents itself. But each reinvention has shared the same core assumption: storage is about bytes. Store them, retrieve them, don’t lose them. The interface changes (SCSI, iSCSI, NFS, S3) but the contract doesn’t: you give me bytes, I give them back when you ask.
That contract is ending.
Network Attached Storage gave us the file abstraction. Hierarchical namespaces, POSIX semantics, NFS and SMB. It was the language of workstations, home directories, and shared drives. Files had names, permissions, and modification times. Storage understood nothing about what was inside them.
Storage Area Networks stripped away even the file abstraction. Raw blocks, addressed by LUN and offset, served databases and virtual machines that needed deterministic latency and their own filesystem semantics. Storage became dumber on purpose. Block devices are maximally generic, maximally fast, and maximally ignorant of their contents.
Amazon S3 reinvented storage as an HTTP API. Objects with keys, metadata, and flat namespaces. No hierarchy, no POSIX, no open/close semantics. Just PUT and GET over the internet. S3’s genius wasn’t technical. It was economic. Pay-per-request pricing, infinite namespace, and eleven nines of durability turned storage from a capital expenditure into a utility.
For nearly two decades, the entire industry orbited S3’s API. MinIO, Ceph RGW, Wasabi, Backblaze B2, Cloudflare R2. Every alternative object store exists because S3 defined the interface. The competition was on cost, performance, and deployment model. Never on capability.
We’re now entering the fourth era, and it breaks the pattern. For the first time, the storage system is expected to understand what it stores. Not just bytes, but rows, columns, embeddings, schemas, versions, and inference context. The contract is no longer “store my bytes.” It’s “understand my data.”
Three simultaneous shifts are driving this.

Apache Iceberg has quietly become the most consequential data infrastructure project since S3 itself.
The numbers tell the story: Iceberg adoption is projected to surpass Delta Lake within three years, with 31% current adoption and 29% planned adoption versus Delta’s 39%/23% split. The Iceberg catalog service market hit $578 million in 2024 and is projected to reach $4.18 billion by 2033, growing at 21.7% annually. Enterprises report 90% reductions in S3 API costs after migrating from Hive to Iceberg, and 20% savings on compute from more efficient query execution.
What’s happening is structural: organizations are replacing their data warehouses with Iceberg tables sitting on object storage. The lakehouse architecture (coined by Databricks, now an industry-wide movement) puts an open table format (Iceberg, Delta, Hudi) on top of S3-compatible storage and queries it directly with Spark, Trino, DuckDB, Flink, or any engine that understands the format.
This changes what object storage needs to be. An Iceberg table isn’t a single object. It’s a graph of metadata files (manifest lists, manifests, snapshots) pointing to data files (Parquet, ORC, Avro), all stored as objects. The catalog that tracks tables, schemas, and snapshots becomes the critical control plane. If your object store doesn’t speak Iceberg natively, you need an external catalog service. Another system to deploy, monitor, secure, and scale.
The hyperscalers got the memo. AWS launched S3 Tables in December 2024, the first S3 feature that understands tabular structure, with built-in Iceberg support delivering 3x faster query throughput and 10x higher TPS than self-managed tables, plus automatic compaction and snapshot management. S3 Tables added Iceberg REST Catalog APIs in March 2025, letting any Iceberg-compatible engine discover and query tables stored in S3 without an external metastore.
On the software-defined side, MinIO is the only company that has fully internalized this shift. AIStor Tables, announced GA in February 2026, embeds the full Apache Iceberg V3 Catalog REST API directly into the object store. No external Hive Metastore. No AWS Glue dependency. No separate catalog service. Tables and objects coexist in a single system. The catalog is the storage.
This is the right architectural instinct. When every analytics query begins with a catalog lookup that resolves to a set of objects, separating the catalog from the store is an artificial boundary that adds latency, complexity, and failure modes.
The rise of RAG (Retrieval-Augmented Generation), semantic search, and AI agents has created a new data type that doesn’t fit any existing storage abstraction: the vector embedding.
An embedding is a fixed-length array of floating-point numbers (typically 256-2048 dimensions) that represents the semantic meaning of a piece of content. A document paragraph, an image, a code snippet, a customer interaction. Querying vectors means finding the nearest neighbors in high-dimensional space, not matching keys or scanning columns.
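A brute-force version of that query, in pure Python: rank vectors by cosine similarity to the query. Toy 3-dimensional vectors stand in for real 256-2048 dimensional embeddings, and the keys are invented.

```python
import math

# Nearest-neighbor over embeddings by brute force: score every vector
# against the query with cosine similarity, return the top k keys.
# Real systems use approximate indexes (HNSW, IVF) for billions of vectors.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

index = {
    "doc-1": [1.0, 0.0, 0.0],
    "doc-2": [0.9, 0.1, 0.0],
    "doc-3": [0.0, 1.0, 0.0],
}

def nearest(query, index, k=2):
    return sorted(index, key=lambda key: cosine(query, index[key]), reverse=True)[:k]

hits = nearest([1.0, 0.05, 0.0], index)  # most similar documents first
```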
The first generation of vector databases (Pinecone, Weaviate, Qdrant, Milvus) built purpose-built systems for this workload. But as embedding counts scale into the billions, a pattern is emerging: vector storage is converging back into object storage.
AWS made this explicit with S3 Vectors, launched in preview in 2025 and generally available in December 2025 with support for 2 billion vectors per index. S3 Vectors reduces vector storage and query costs by up to 90% compared to purpose-built vector databases, delivers sub-second query latency for infrequent access patterns, and integrates natively with Amazon Bedrock for RAG workflows.
The lesson: vectors aren’t a separate workload that needs a separate database. They’re a data type that belongs in the same store as the objects and tables they index. A document lives in object storage. Its Iceberg-managed metadata lives in a table. Its embedding lives in a vector index. All three should be in the same system, governed by the same policies, replicated by the same engine, and queried through the same endpoint.
No software-defined object store handles this today. The ones that recognize the convergence first will define the next decade.

At CES 2026, Jensen Huang announced something that most storage coverage buried under GPU hype: the Inference Context Memory Storage (ICMS) platform. ICMS is not a storage product. It’s a new tier in the memory hierarchy, and it rewrites the relationship between GPUs and storage.
NVIDIA’s Rubin platform defines five tiers for inference data:
| Tier | Medium | Access Time | Purpose |
|---|---|---|---|
| G1 | GPU HBM | Nanoseconds | Active token generation |
| G2 | Host System RAM | Microseconds | KV cache staging, prefill buffers |
| G3 | Local NVMe SSDs | ~100 microseconds | Warm KV cache, short-term reuse |
| G3.5 (ICMS) | Ethernet-attached flash | Low microseconds (RDMA) | Shared KV cache across pods |
| G4 | Shared Object Storage | Milliseconds | Durable artifacts, checkpoints, datasets |
The breakthrough is G3.5. Traditional inference offloads KV cache from GPU HBM to host RAM (G2) or local SSD (G3). But these are per-node resources. When Agent A builds a 128K-token context on Node 1, and Agent B needs a related context on Node 7, there’s no shared tier. The context must be recomputed from scratch.
ICMS solves this with a pod-level shared flash tier, powered by BlueField-4 DPUs with 800 Gb/s connectivity, RDMA-accelerated NVMe-oF transport, and purpose-built KV cache management via NVIDIA Dynamo and NIXL (NVIDIA Inference Transfer Library).
The performance claims are striking: 5x higher tokens-per-second and 5x better power efficiency compared to traditional storage approaches for long-context inference. The key insight is that KV cache is transient, derived, and recomputable. It doesn’t need the durability guarantees of traditional storage, but it needs the bandwidth and shareability that local SSDs can’t provide.
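The lookup pattern ICMS enables can be sketched as a fall-through across progressively slower tiers, with recomputation as the last resort. The tier names follow the article's table; the code itself is an illustrative toy, not NVIDIA's API.

```python
# Sketch of tiered KV-cache lookup: check each tier in latency order,
# and on a total miss pay the prefill cost once, then publish the
# result to the pod-shared tier so other nodes can reuse it.

def make_hierarchy():
    # dicts stand in for the actual media; insertion order = latency order
    return {
        "G1_hbm": {},         # per-GPU, nanoseconds
        "G2_host_ram": {},    # per-node, microseconds
        "G3_local_nvme": {},  # per-node, ~100 microseconds
        "G3_5_icms": {},      # shared across the pod -- the new tier
    }

def get_kv_cache(tiers, context_id, recompute):
    for name, tier in tiers.items():
        if context_id in tier:
            return tier[context_id], name
    # Miss everywhere: recompute (expensive prefill), then publish
    # to the shared tier for the rest of the pod.
    kv = recompute(context_id)
    tiers["G3_5_icms"][context_id] = kv
    return kv, "recomputed"

tiers = make_hierarchy()
# Node 1 builds the 128K-token context (expensive prefill)...
get_kv_cache(tiers, "ctx-128k", lambda c: f"kv-blocks-for-{c}")
# ...Node 7 then finds it in the shared tier instead of recomputing.
_, hit = get_kv_cache(tiers, "ctx-128k", lambda c: f"kv-blocks-for-{c}")
print(hit)  # G3_5_icms
```

Without the G3.5 entry in that loop, the second lookup falls straight through to recomputation, which is exactly the Agent A / Agent B scenario described above.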
ICMS doesn’t replace object storage. It creates a new tier above it in the latency hierarchy and below it in the durability hierarchy. The infrastructure looks like:
┌──────────────────────────────────────────────────┐
│ GPU Cluster (Rubin Pods)                         │
│ G1: HBM ←→ G2: Host RAM ←→ G3: Local NVMe        │
│                        ↕                         │
│ G3.5: ICMS (BlueField-4 + Flash JBOFs)           │
│       Shared KV cache, RDMA, NVMe-oF             │
│                        ↕                         │
│ G4: Object Storage (S3-compatible)               │
│     Training data, checkpoints, Iceberg tables   │
│     Vector indexes, model artifacts              │
└──────────────────────────────────────────────────┘
G4, the object storage layer, is still the foundation. It holds the durable data: training datasets, model weights, fine-tuning artifacts, Iceberg-managed analytics tables, vector embeddings, and RAG corpora. ICMS doesn’t replace any of this. What it does is create a new consumer of object storage, one that pre-stages context from G4 into G3.5 for rapid inference access.
This has profound implications:
Object storage must be fast enough to feed ICMS. If the G4 tier can’t deliver data to G3.5 at wire speed, the entire memory hierarchy stalls. Slow object storage becomes the bottleneck for inference latency.
Object storage must understand data semantics. ICMS doesn’t want raw bytes. It wants KV cache blocks, embedding chunks, and context windows. The storage system that can organize, index, and pre-stage this data based on inference patterns will outperform one that treats everything as opaque objects.
The storage vendor ecosystem is mobilizing. NVIDIA named 12 storage partners for ICMS at launch: DDN, Dell, HPE, Hitachi Vantara, IBM, Nutanix, Pure Storage, Supermicro, VAST Data, and WEKA among them. Conspicuously, no software-defined object storage project is on that list. The ICMS ecosystem is being built by proprietary vendors.
At GTC 2026, expect NVIDIA to deepen the Dynamo + ICMS integration story, likely with live demos showing agentic AI workloads with shared context across inference pods. The storage systems that integrate with this stack (speaking NVMe-oF, understanding KV cache semantics, delivering RDMA-capable throughput) will be positioned as the G4 foundation for the next generation of AI infrastructure.
These three shifts (tables, vectors, and inference context) are not separate trends. They’re converging into a single requirement: the storage system must understand data, not just bytes.
Here’s what that means concretely:
The storage system must embed an Iceberg REST Catalog, manage table metadata (snapshots, manifests, schema evolution), and perform automatic maintenance (compaction, orphan file cleanup, snapshot expiration). Tables are not a separate product. They’re a view of the same objects.
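One of those maintenance jobs, snapshot expiration, is simple enough to sketch. This is an illustrative reduction only: Iceberg's real expiration logic also honors time-based retention and branch/tag references, and the snapshot layout here is simplified.

```python
def expire_snapshots(snapshots, keep_last=3):
    """Retain the newest `keep_last` snapshots; return (kept, expired).
    Illustrative toy -- real Iceberg expiration is richer than this."""
    ordered = sorted(snapshots, key=lambda s: s["timestamp_ms"], reverse=True)
    return ordered[:keep_last], ordered[keep_last:]

# Four table snapshots, oldest first (invented ids and timestamps).
snapshots = [
    {"id": 1, "timestamp_ms": 100},
    {"id": 2, "timestamp_ms": 200},
    {"id": 3, "timestamp_ms": 300},
    {"id": 4, "timestamp_ms": 400},
]
kept, expired = expire_snapshots(snapshots, keep_last=3)
print([s["id"] for s in expired])  # [1]
```

The article's point is where this loop runs: today it is an external Spark or Trino job; in a data-aware store it is a background task of the storage system itself, which already knows every snapshot it holds.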
AWS understood this with S3 Tables. MinIO understood this with AIStor Tables. The next software-defined storage system must understand it too.
Vector embeddings must be a first-class storage primitive, not a separate database that happens to use the object store as a backend. Store vectors, query nearest neighbors, and link embeddings to their source objects and table rows, all through the same API.
AWS understood this with S3 Vectors. No one else has followed.
The G4 tier must deliver sustained, high-bandwidth reads fast enough to feed ICMS pre-staging without stalling the tiers above it.
When storage understands tables and vectors, replication becomes semantic: replicate a table’s latest snapshot (not individual Parquet files), replicate an embedding index (not individual vector blobs), apply retention policies to table versions (not object prefixes). Governance becomes meaningful: column-level access control in Iceberg tables, embedding visibility policies for multi-tenant RAG, audit trails that reference table operations rather than raw PUTs and GETs.
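Snapshot-level replication can be sketched as a metadata walk: resolve the table's current snapshot to its manifests, then to the data files those manifests reference, and ship only that set. The metadata layout below is deliberately simplified (real Iceberg metadata has more levels and fields); all names are illustrative.

```python
# Semantic replication sketch: replicate what the current snapshot
# references, not every object under a prefix.

# Simplified stand-in for a table's metadata tree (invented names).
table_metadata = {
    "current_snapshot": "snap-2",
    "snapshots": {
        "snap-1": {"manifests": ["m1"]},
        "snap-2": {"manifests": ["m1", "m2"]},
    },
    "manifest_files": {
        "m1": ["data/a.parquet", "data/b.parquet"],
        "m2": ["data/c.parquet"],
    },
}

def files_for_current_snapshot(meta):
    # Walk snapshot -> manifests -> data files.
    snap = meta["snapshots"][meta["current_snapshot"]]
    files = []
    for manifest in snap["manifests"]:
        files.extend(meta["manifest_files"][manifest])
    return files

# Orphaned files from expired snapshots are never resolved by this
# walk, so they are never replicated -- prefix mirroring would copy
# them anyway.
print(sorted(files_for_current_snapshot(table_metadata)))
# ['data/a.parquet', 'data/b.parquet', 'data/c.parquet']
```

A store that only sees opaque keys cannot perform this walk; a store that embeds the catalog gets it for free.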
The worst outcome is the current state: one system for objects, another for tables, another for vectors, and a proprietary appliance for KV cache. Each with its own API, its own consistency model, its own failure modes, its own monitoring stack.
The right outcome is a single system that stores objects, manages Iceberg tables over those objects, indexes vectors alongside them, and serves as the durable foundation for ICMS-accelerated inference. All through one endpoint, on one cluster, with one operational model.
Let’s be honest about the competitive landscape.
The hyperscalers get it. AWS is systematically expanding S3 from “object store” to “data platform.” S3 Tables for Iceberg, S3 Vectors for embeddings, S3 Express One Zone for low-latency inference data. Each launch makes S3 harder to leave. That’s the point.
MinIO gets it. They’re the only software-defined storage company with no hardware lock-in that has shipped native Iceberg V3 support (AIStor Tables, GA February 2026), articulated a coherent lakehouse-on-object-storage strategy, and positioned their product as a data platform rather than just a byte store. AB Periasamy and the MinIO team have consistently been 12-18 months ahead of the rest of the software-defined storage world in recognizing architectural shifts.
The traditional storage vendors are adapting. Dell, Pure, NetApp, and VAST Data are all part of NVIDIA’s ICMS partner ecosystem. But their advantage is integration agreements, not architecture. They’re adding Iceberg support, adding vector capabilities, and adding RDMA endpoints to existing products. Bolted on, not built in.
The rest of the software-defined world doesn’t get it. Ceph is still arguing about RGW performance. SeaweedFS is focused on POSIX compatibility. Garage is optimizing for self-hosting. These are all valid goals, but they’re goals from Era 3. The data-aware storage system (the one that speaks Iceberg, indexes vectors, and feeds NVIDIA’s inference pipeline) doesn’t exist yet in the software-defined world outside of MinIO’s commercial offering.

There is a gap in the market that is about to become a chasm.
On one side: AWS, building the definitive data platform but locking it inside their cloud. On the other: MinIO, building the on-premises alternative but as a commercial product with enterprise licensing.
In between: no software-defined, cloud-native, data-aware object storage with no hardware lock-in that natively handles Iceberg tables, vector indexes, and ICMS-ready inference workloads. No system that an organization can deploy on their own hardware, on any cloud, and use as the foundation for both analytics and AI.
The infrastructure stack that Jensen Huang will showcase at GTC 2026 (Rubin GPUs, BlueField-4 DPUs, Dynamo inference framework, Spectrum-X networking, and ICMS) needs a G4 layer. NVIDIA doesn’t build storage. They build partnerships with storage vendors. The question is whether that G4 layer will be a proprietary appliance from a traditional vendor, a hyperscaler lock-in play, or a software-defined data platform with no hardware lock-in that runs anywhere.
Storage is no longer about storage. It’s about data. The system that understands this, that treats tables, vectors, and inference context as native citizens rather than afterthoughts, will define the next era.
The first three eras were about how to store bytes efficiently. The fourth era is about what those bytes mean.
NVIDIA GTC 2026 runs March 16-19 in San Jose. Jensen Huang’s keynote is Monday, March 16, 8-11 AM PDT. ICMS details from the NVIDIA Technical Blog and NVIDIA Newsroom. MinIO AIStor Tables coverage from Blocks and Files and MinIO Blog. Apache Iceberg adoption data from the 2025 State of the Iceberg Ecosystem survey. Amazon S3 Tables announcement and S3 Vectors GA announcement. NVIDIA Dynamo documentation.