Kubernetes — Lakshmi Narasimhan

Your Cloud Bill Is A Tax On Someone Else's Resume

Lakshmi Narasimhan — Fri, 24 Apr 2026 00:00:00 +0000

There’s an insurance company somewhere — real, working, profitable — with 100,000 monthly users and a peak concurrent load of about 5,000.

They spend high six figures a month on Kubernetes.

They employ twenty people to keep it running.

This story surfaced this week in the Hacker News thread on David Crawshaw’s cloud essay, and the comments section turned into a confessional. Engineer after engineer describing the same pattern: cluster adopted, cluster “optimized,” cloud spend doubled, incidents doubled, and somehow the only thing anyone can agree on is that they need to hire a platform engineer.

You don’t. You never did. Your entire application would run on a laptop.

The incentive nobody likes to say out loud

Here’s the quiet part: your DevOps team does not choose infrastructure based on what your application needs.

They choose it based on what their next job will pay for.

Kubernetes on a resume is worth more than Docker Compose on a resume. Terraform on a resume is worth more than “I SSH’d into the box.” Managed EKS on a resume is worth more than “I run a VM.” Every procurement decision in a modern engineering org is being made by someone who, at some level, is also writing the next page of their LinkedIn.

And management, god bless them, trusts the sales and marketing departments of Datadog and AWS and HashiCorp more than they trust their own engineers. So when someone internally says “we could do this on one server,” and someone externally sends a deck titled “Scaling Your Platform For The Future,” guess which one wins the meeting.

The decision was never technical. You just paid the technical price for it.

Kubernetes is not the villain. The scale is.

Let’s be precise, because “Kubernetes” is doing a lot of work in this essay.

Full enterprise Kubernetes — managed control planes, service meshes, operators for everything, a dedicated platform team, Helm charts nested inside Helm charts like Russian dolls of YAML — that thing was built for Google’s problem. Multi-tenant, multi-region, thousands of services, teams that don’t talk to each other.

If your org does not look like that, you are wearing a costume.

K3s on a single VPS is not the same animal. Docker Compose on a single VPS is not the same animal. Kamal shipping containers to one Debian box is not the same animal. Those are orchestration for people who want one sane way to deploy a container, not a career in platform engineering.

The HN thread is full of engineers who moved from full K8s to one of these simpler setups. The reports are boringly consistent: costs collapsed, incidents dropped, debugging became possible again. Nobody was shocked. Everyone had been waiting for permission to say it.

The solo founder’s version of this trap

You are not the insurance company. You do not have twenty people. You have you, and maybe a contractor, and a credit card that is getting nervous.

And yet — you will read the AWS Well-Architected Framework. You will follow a tutorial that starts with “first, let’s set up your VPC.” You will pay $80/month for a managed database to store 200 rows. You will provision a load balancer in front of one server. You will copy the shape of infrastructure you saw at your day job, because that shape felt legitimate, and you want to feel legitimate too.

This is how solo founders end up with a $600/month AWS bill for an app that has six users.

The shape of legitimacy is the trap. Nobody cares what your infrastructure looks like until you have customers, and once you have customers, “my app runs on one $12 VPS” is a story people love. It’s the opposite of suspicious. It’s proof that the thing works.

What to actually do

One machine until you can’t. One VPS. One Postgres on that VPS. One reverse proxy. Docker Compose or Kamal to deploy. You are allowed to stop here for years.
Scale vertically first. Hetzner will rent you a 48-core EPYC machine with 256 GB of RAM for €199/month. A mid-tier managed Kubernetes cluster on AWS starts at more than that before you’ve run a single pod. Most apps die from bad unit economics, not from running out of CPU.
When you outgrow that (and you might not), K3s on a few boxes gives you orchestration without the org chart. This is the actual sweet spot for a solo operator who needs more than one machine but less than a platform team.
Treat every infrastructure recommendation as a resume artifact until proven otherwise. Ask who benefits if you adopt this. If the answer is “the person telling me to adopt it,” weigh accordingly.
Your cloud bill is a leading indicator of how much time you are spending on things that do not make your product better. Watch it like you watch your weight.

The cloud was supposed to be leverage. For most people, most of the time, it has become the opposite: a recurring invoice for someone else’s credibility.

You are allowed to just run the server.

My Agent Runs 10 Cron Jobs. Three of Them Are Worth the Electricity.

Lakshmi Narasimhan — Mon, 20 Apr 2026 00:00:00 +0000

I have a daemon that runs on a server. It’s been up for seven weeks. It has ten scheduled jobs — some hourly, some daily, some weekly. Or at least, that’s what’s on paper.

This is what people are calling “the future of work.”

I’m not sure it is. I’m sure it’s what sells on Twitter.

The demo economy

Always-on agents photograph well. That’s most of what’s going on.

“My agent posted while I slept” is tweetable in a way that “I wrote a cron job” isn’t, even when the outputs are identical. The demo-industrial complex has figured this out. YouTubers build daemons. Framework authors build daemons. There are now three different subreddits comparing daemons. The flywheel is real, the content is prolific, and very little of it is honest about what the daemon is actually producing.

The hype bundles together several different things that deserve to be separated:

Agents that run work while you’re asleep (useful, conditionally)
Agents that react to things happening in the world (useful, conditionally)
Agents that capture things as they happen on your phone (useful, conditionally)
Agents that run heartbeats and ask themselves what to do (pure performance art)
Agents that self-evolve in a loop in the background (fun demos, almost no output)
Agents that spawn a hundred parallel subagents to research a topic (almost always worse than one good search)

The hype treats all six as the same thing. They aren’t.

The 20% that actually earns its keep

Honest list of when a background daemon does something a CLI or a 10-line bash cron can’t:

Scheduled work that has to happen when you’re not there. Crawl competitor sites at 3am. Pull last night’s Sentry errors. Summarize overnight industry chatter into a 7am brief. Your laptop is off, something has to be running somewhere. Legitimate.

Reactive triggers on external events.

Email arrives -> triage.

Substack comment -> draft reply.

Sentry alert -> diagnose + suggest fix.

The trigger comes from outside; compute has to meet it. Legitimate if the volume actually warrants automation (if you get three emails a day, triage is a solved problem — your inbox).

On-the-move capture.

Voice memo from your phone -> transcribed -> landed in memory.

Forwarding a link from your phone to your agent. The value is that capture happens when inspired, not when at desk. Real lift for content creators who have thoughts in elevators.

Judgment-laden monitoring.

Not “disk at 80%” (any shell script can do that). “Disk at 80% AND growing 2% per hour AND that’s unusual for this host.”

Requires context; needs to know what normal looks like. This is where LLMs in a daemon genuinely beat a threshold-based alerting stack.

That’s it. Four categories. Anything else is mostly burning tokens.
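The reactive-trigger category is, at its core, a dispatch table: an outside event arrives and gets routed to one handler. Here is a rough Python sketch of that shape. The event names and handler functions are made up for illustration; they are not any real webhook schema or this daemon's actual code.

```python
from typing import Callable, Dict

# Hypothetical handlers: each stands in for whatever the agent would really do.
def triage_email(payload: dict) -> str:
    return f"triage: {payload['subject']}"

def draft_reply(payload: dict) -> str:
    return f"draft reply to: {payload['comment']}"

def diagnose_alert(payload: dict) -> str:
    return f"diagnose: {payload['error']}"

# Event names are illustrative labels, not a real API.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "email.received": triage_email,
    "substack.comment": draft_reply,
    "sentry.alert": diagnose_alert,
}

def dispatch(event_type: str, payload: dict) -> str:
    """Route an external event to its handler; ignore anything unregistered."""
    handler = HANDLERS.get(event_type)
    if handler is None:
        return "ignored"
    return handler(payload)

print(dispatch("sentry.alert", {"error": "TypeError"}))
print(dispatch("heartbeat", {}))
```

Nothing in this sketch runs unless something outside the agent produces an event. A self-issued heartbeat has no entry in the table, so it falls through as ignored.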

The 80% that’s noise

Heartbeats that ask the agent “anything to do?”

The agent wakes up, loads context, decides there isn’t anything to do, goes back to sleep. You pay for the loaded context every time. Over a day this adds up to real money for the privilege of watching an agent shrug.

Self-evolution loops.

“The agent improves itself while you sleep.” What it’s usually doing is refactoring its own prompts in circles. Cool demo on YouTube. Zero measurable outcome delta after a month of running.

Parallel subagent fan-out for research.

Ten agents search the web about the same question and return ten lightly-paraphrased versions of the same top three results. One focused 10-minute session beats this, almost always.

“Long-running overnight research tasks.”

When the output lands in your morning inbox, is it better than what 30 focused minutes at your desk would produce? Honestly check. Usually no.

Replacing things you could cron in 10 lines of bash.

The test: could a $5 VPS with a shell script + cron + jq do this? If yes, you’re not using AI for the part that needs AI. You’re using it because daemons are cool.

Receipts: what’s actually on my VM

I pulled the daemon’s state file and the log directory while writing this. Fifty-four days of uptime. Ten jobs on paper. The picture is worse than I thought.

Three are running reliably.

sentry-monitor has fired 191 times since early March. Latest run: this morning. When the night throws errors it reads them, groups them, and suggests a fix — not a link to the stack trace, an actual “here’s what’s probably wrong and here’s the one-line change.” Category 2 plus category 4. Keep.

infra-health has fired 190 times on basically the same cadence. Knows what normal looks like per host. Stays quiet when a disk spike is a scheduled backup and shouts when it isn’t. Category 4. The whole reason an LLM beats a thresholds-and-Prometheus stack here, and no, you cannot Grafana your way to this in under six months of tuning. Keep.

scout has fired 71 times across seven weeks. Daily-ish. Scans Reddit, HN, and Substack for signal that feeds this blog’s content calendar. I do use the output. Category 2 if I’m generous. Keep, but it absorbs the next two jobs on the list below.

Now the uncomfortable part.

Three of the ten have straight-up stopped running and I didn’t notice.

morning-brief was scheduled daily at 6am. It last fired on March 18. A full month of no overnight brief. I did not miss it. I did not investigate. I did not know.

seo-audit was weekly. It has run exactly once in the daemon’s entire fifty-four-day lifetime, on March 1. Seven missed weeks. Nobody wrote a bug report to themselves. Nobody opened a file that wasn’t there.

auto-draft was supposed to produce a draft post every day. It has run exactly once, on April 11. Eight days of silence. Also unnoticed.

If a job stopped running a month ago and you didn’t miss it, the job was never producing anything that mattered. That’s not my heuristic. That’s the audit, evaluating itself while I was busy talking about audits on Twitter.

Four more are in some stage of limping.

reddit-scan: 27 runs over 45 days, last one April 10. Running, sort of, when the mood takes it. Nine days of silence so far on that one.

x-scan: identical pattern to reddit-scan. Same overlap. Same drift. Same silence since April 10. These two were supposed to be complementary; they’ve turned out to be redundant and unreliable, which is a rare trick.

engagement-brief: four runs, total, in the job’s entire lifetime. Not daily. Not weekly. More like “occasionally, if the stars align.”

x-analytics: three runs, last one March 16. Effectively dead, which is fine, because I check my X numbers roughly once a month anyway.

Final tally, the honest one.

Three jobs firing on schedule, producing output I use. Three jobs that silently stopped weeks ago and nobody in this house noticed, including me. Four jobs wandering between “running” and “not really” with no clear reason why.

Three-of-ten is the optimistic read. The pessimistic read is that six of the ten audited themselves — they cut themselves by going quiet, and I hadn’t even done them the courtesy of looking.

This is from someone who builds daemons for a living and writes about them for a job. What do you think yours looks like under the hood?
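The audit above can be automated in a few lines, and it is exactly the kind of thing that does not need an LLM. A minimal Python sketch follows, using the last-run dates quoted in this section. The `Job` structure, the grace threshold, the "today" date of April 19, and the sentry-monitor cadence are all assumptions for illustration, not the daemon's real state format.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Job:
    name: str
    cadence: timedelta  # how often the job is supposed to fire
    last_run: date      # last recorded run, e.g. read from a state file

def stale_jobs(jobs, today, grace=2):
    """Return (name, missed_runs) for jobs overdue by more than `grace` cadences."""
    flagged = []
    for job in jobs:
        missed = (today - job.last_run) // job.cadence  # whole cadences elapsed
        if missed > grace:
            flagged.append((job.name, missed))
    return flagged

TODAY = date(2026, 4, 19)  # assumed audit date
JOBS = [
    Job("sentry-monitor", timedelta(hours=6), date(2026, 4, 19)),  # cadence assumed
    Job("morning-brief", timedelta(days=1), date(2026, 3, 18)),
    Job("seo-audit", timedelta(weeks=1), date(2026, 3, 1)),
    Job("auto-draft", timedelta(days=1), date(2026, 4, 11)),
]

for name, missed in stale_jobs(JOBS, TODAY):
    print(f"{name}: roughly {missed} scheduled runs missed")
```

Run from plain cron once a day, a check like this would have flagged morning-brief weeks before I noticed it by accident.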

The five-question self-test

Before you keep any always-on agent job, make it answer these:

Would I actually miss this if it stopped? If you turned it off for two weeks and no one noticed, it’s not producing value. It’s producing comfort.
Does the cadence match downstream consumption? A job that fires 4x/day for output you read weekly is 27 extra runs a week of pure overhead.
Is the trigger genuinely external? (Scheduled time, incoming event, captured input.) If the agent is just checking on itself, you’ve built a Roomba that vacuums an empty room.
Could a shell script + cron + jq do this? If yes, you’re not using AI for the part that needs AI.
Does the output change my behaviour? If yesterday’s run and last Thursday’s run would have produced the same action from me (or none), one of them was wasted.

Honest answers will cull your cron list by half. Mine certainly did, once I stopped writing this post and actually did the audit.
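The cadence question is plain arithmetic, so it is worth writing down once. A tiny Python sketch, where the function name and parameters are mine rather than anything from a real tool:

```python
def extra_runs_per_week(runs_per_day: int, reads_per_week: int) -> int:
    """Runs per week beyond what anyone actually reads."""
    runs_per_week = runs_per_day * 7
    return max(runs_per_week - reads_per_week, 0)

# The example from question two: fires 4x/day, output read once a week.
print(extra_runs_per_week(4, 1))
```

Anything above zero is a candidate for a slower schedule.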

What this isn’t saying

I’m not arguing against always-on agents. I’m arguing against always-on agents that aren’t doing anything.

There’s real value when the conditions line up — work-while-you-sleep, external-trigger-response, on-the-move-capture, judgment-laden-monitoring. The reason I keep the daemon running (even after cutting half its jobs) is those four categories genuinely earn the monthly subscription. The reason I’m writing this is that the other six patterns — the ones that photograph well — are funding a lot of framework development and not much measurable outcome.

If your agent is doing category 1-4 work, the hype is warranted. If it’s doing category 5-6 work, you’re paying a subscription to a demo.

The uncomfortable question for most of the agent-community content right now is: which category is the thing being demoed, really? And whether the person demoing it has done the five-question audit on their own cron list.

My guess: very few have. The demo economy doesn’t reward the audit. It rewards the screenshot of the agent waking up at 3am and pretending to be useful.

The $30/Year Stack for Launching Small Bets

Lakshmi Narasimhan — Mon, 19 Jan 2026 00:00:00 +0000

Every time I launch a new small bet, I need the same boring stuff: professional email, a chat widget, uptime monitoring. The kind of infrastructure that’s completely unsexy but makes you look like you have your act together.

For years, I overcomplicated this. Custom SMTP servers. Self-hosted monitoring. Elaborate setups that took days to configure and broke whenever I looked at them wrong.

Then I realized something: I was spending more time on infrastructure than on validating whether anyone wanted my product.

So I built a repeatable stack. Total cost: about $30-42 per year, per small bet. Here’s the whole thing.

Domain & Hosting: Cloudflare (Free)

Buy your domain wherever you want, but point the nameservers to Cloudflare immediately.

Cloudflare’s free tier is absurd:

DNS management (fast, reliable)
Free SSL certificates (automatic)
DDoS protection
CDN caching
Cloudflare Pages (unlimited sites, unlimited bandwidth)

That last one is key. Your landing page goes on Cloudflare Pages. Connect your repo, push to main, it deploys. No servers. No bills. No thinking about infrastructure when you should be thinking about whether anyone wants your product.

I run every small bet’s landing page on CF Pages. Zero hosting cost.

Email: Google Workspace (The India Pricing Hack)

You want professional email. hello@yourdomain.com, not yourdomain.help@gmail.com like some kind of digital nomad running a dropshipping scam.

Google Workspace direct pricing: $6/month. Painful when you’re running multiple bets.

Google Workspace through an Indian reseller: Rs.125/month. That’s roughly $1.50.

Same product. Same Gmail experience. Same everything. Just… cheaper, because regional pricing exists and Google apparently forgot to close this loophole.

Recommended resellers: Medha Cloud, Host IT Smart, Shivaami. They’re authorized, they’re legit, and they’ll save you $50+/year per domain.

Setup takes 30 minutes: verify domain, add MX records, configure SPF/DKIM/DMARC so your emails don’t land in spam. Done.

Support: Crisp Chat (Free)

Intercom wants $74/month. For a small bet that might make $0.

Crisp’s free tier gives you:

2 team seats (it’s just you anyway)
Unlimited conversations
Mobile app for notifications
A widget that doesn’t look like it was designed in 2008

Copy-paste their script tag into your landing page. Five minutes.

Upgrade trigger: when you have so many support conversations that you need automation. Which means you have customers. Which means you can afford to pay for things.

Monitoring: BetterStack (Free)

Your app will go down at 3am on a Sunday. This is not a prediction, it’s a guarantee.

BetterStack’s free tier:

10 uptime monitors
1GB logs/month
Email and Slack alerts
3-day log retention

Is 3-day retention enough? For a small bet you’re validating? Yes. You’re not running a bank.

Alternative: Axiom gives you 500GB ingest and 30-day retention if you’re logging more aggressively. Also free.

Error Tracking: Sentry (Free)

Your code will throw exceptions in production that never happened locally. Classic.

Sentry’s free tier:

5K errors/month
10K performance transactions
1 user
90-day retention

For a small bet, 5K errors/month is plenty. If you’re hitting that limit, either your app is broken or you have enough users to pay for it.

Database: Supabase (Free Tier or Self-Hosted)

Every small bet needs a database. Supabase’s free tier is genuinely useful:

500MB database
1GB file storage
50K monthly active users
Unlimited API requests

That’s enough to validate most ideas. The catch: you get 2 free projects total. After that, it’s $25/month per project.

For small bets that graduate to real products, I self-host Supabase on a $6/month Hetzner VPS. Full Postgres, auth, storage, realtime: no project limits, no usage caps. (I’m building a service called Supabyoi to make this dead simple. More on that soon.)

The Complete Stack

Domain: ~$10-15/year
Cloudflare (DNS + Pages): Free
Google Workspace (India): ~$1.50/month (~$18/year)
Crisp: Free
BetterStack: Free
Sentry: Free
Supabase: Free

Total: ~$1.50/month, $30-42/year

That’s DNS, hosting, professional email, live chat, uptime monitoring, error tracking, and a database for less than a single month of most “startup” tools.

The Rules

Don’t upgrade until you have paying customers. Free tiers exist for validation. Use them.

Keep the setup identical across bets. Same tools, same patterns, same DNS records. You should be able to launch a new bet’s infrastructure in an afternoon, not a weekend.

Resist the urge to self-host. Yes, you can run your own mail server. You can also perform your own dental surgery. Neither is advisable.

When To Actually Upgrade

Google Workspace — You need >30GB storage → $7/mo
Crisp — You need chatbots or >2 team members → $25/mo
BetterStack — You’re pushing >1GB logs/month → $24/mo
Sentry — You’re hitting 5K errors/month → $26/mo
Supabase — You need >2 projects or more storage → $25/mo (or self-host)

Notice a pattern? These are all “you have real traction” problems. Good problems to have.

What’s Not Covered (Yet)

This is the skeleton — the basic infrastructure every small bet needs from day one.

I’ll cover these in separate posts:

Tech stack choices (frameworks, languages, deployment)
Payment processing (Stripe, Lemon Squeezy, regional considerations)
CI/CD pipelines (GitHub Actions, deployment automation)
Landing page patterns (what actually converts)

One thing at a time.

The Point

Infrastructure should be invisible. It should cost almost nothing while you’re validating. It should scale up only when you have revenue to pay for it.

$30/year per bet means you can run 10 small bets for less than most people pay for a single Notion subscription.

Stop building infrastructure. Start shipping products.

This is part of my “Deploy” series — simple infrastructure patterns for solo operators who’d rather build products than manage servers.

I Found a Cryptominer in My Client's Production Cluster. Claude Code Found the Attacker.

Lakshmi Narasimhan — Sat, 03 Jan 2026 00:00:00 +0000

New Year’s Day. Coffee in hand. Ready to ease back into work.

Then I saw the logs.

2026-01-02T06:34:27 GET xmrig-6.24.0-linux-static-x64.tar.gz
2026-01-02T06:34:30 GET http://37.32.6.33:7979/m
2026-01-02T06:34:30 spawn /opt/systemf/m ENOENT

xmrig. In production. Someone was mining Monero on my client’s Kubernetes cluster.

The horror.

The Investigation

I had a few hundred megabytes of JSON logs and approximately zero patience for manually correlating timestamps. So I did what any reasonable person would do: I asked Claude Code to analyze the logs and figure out what triggered the miner download.

Within seconds, it built a timeline:

Time      Event
06:34:26  Normal request to /onboarding
06:34:27  xmrig downloaded from GitHub
06:34:30  Secondary payload from sketchy IP
06:34:57  Container OOMKilled

The cryptominer was so resource-hungry it consumed 2GB of memory in 30 seconds and crashed the container. Ironic. The attacker’s greed saved us from a prolonged compromise.
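For anyone who wants to see the timestamp correlation without an agent in the loop, here is a minimal Python sketch of the same idea. The four events are copied from the timeline above. Putting all of them on 2026-01-02 is an assumption based on the log excerpt, and the real logs were JSON with far more noise than this.

```python
from datetime import datetime

# Events from the timeline; the date is assumed from the log excerpt.
EVENTS = [
    ("2026-01-02T06:34:57", "Container OOMKilled"),
    ("2026-01-02T06:34:26", "Normal request to /onboarding"),
    ("2026-01-02T06:34:30", "Secondary payload from sketchy IP"),
    ("2026-01-02T06:34:27", "xmrig downloaded from GitHub"),
]

def build_timeline(events):
    """Sort events by time and attach seconds elapsed since the first one."""
    parsed = sorted((datetime.fromisoformat(ts), label) for ts, label in events)
    start = parsed[0][0]
    return [(int((ts - start).total_seconds()), label) for ts, label in parsed]

timeline = build_timeline(EVENTS)
for offset, label in timeline:
    print(f"+{offset:>2}s  {label}")
```

The gap between the download and the OOM kill is the 30 seconds mentioned above.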

But how did they get in?

Chasing Red Herrings

Claude Code’s first suspect: a low-version npm package called device-unique-keygen. Added by a developer whose email matched the package maintainer. Classic supply chain attack pattern.

I got excited. Maybe too excited.

Claude Code fetched the GitHub repo, analyzed the source code, checked for postinstall scripts, looked for obfuscated code, searched for eval() calls.

Nothing. The package was clean. Just a browser fingerprinting library. Boring. Legitimate.

We moved on.

No malicious init containers. No sidecars. No .bashrc shenanigans. The Dockerfile was clean. The pod spec was clean.

Everything was clean except someone was definitely mining crypto on our infrastructure.

The Actual Answer

Claude Code ran npm audit on the codebase.

critical │ Next.js is vulnerable to RCE in React flight protocol
Package │ next
Patched │ >=15.3.6
Your ver │ 15.3.4
CVSS │ 10.0

CVSS 10. The maximum possible score. The “your house is actively on fire” of security ratings.

The app was running Next.js 15.3.4. A publicly disclosed RCE vulnerability. No authentication required. An attacker could run arbitrary commands on the server by sending a crafted request.

That’s exactly what happened. They sent a request, ran wget twice, downloaded the miner, and started extracting crypto value from compute cycles they weren’t paying for.

The container’s memory limit stopped them. A $20/month Kubernetes resource limit prevented what could have been ongoing theft.

What Claude Code Actually Did

I want to be clear about what happened here. I didn’t single-handedly unravel a sophisticated attack. I didn’t manually correlate log timestamps or reverse-engineer obfuscated npm packages.

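I said “check these logs” and Claude Code built the timeline, cleared the suspect npm package, ruled out the Dockerfile and pod spec, and ran the npm audit that surfaced the vulnerable Next.js version.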

The entire investigation took under an hour. Not because I’m fast. Because Claude Code is.

The Fix

pnpm update next@^15.3.6

One command. That’s the remediation for a CVSS 10.0 vulnerability.

We also orphaned the compromised pods for forensic analysis, rotated secrets, and added proper security contexts to prevent future wget adventures.

The Lesson

Two things saved us:

One thing would have prevented this entirely: running npm audit before deployment.

The attacker exploited a vulnerability that was publicly disclosed and patched. We just hadn’t updated yet.
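A pre-deploy gate for this can be very small. The Python sketch below reads the severity counts from an `npm audit --json` report and counts the findings you would want to block a deploy on. It assumes the report carries a `metadata.vulnerabilities` object keyed by severity, which is how recent npm versions lay it out; check your own npm's output before relying on it. The helper names and the sample report are mine.

```python
import json
import subprocess

def blocking_count(report: dict, levels=("critical", "high")) -> int:
    """Count vulnerabilities at the given severities in a parsed audit report."""
    counts = report.get("metadata", {}).get("vulnerabilities", {})
    return sum(int(counts.get(level, 0)) for level in levels)

def run_audit(cwd: str = ".") -> dict:
    """Run `npm audit --json` and parse stdout (npm exits non-zero on findings)."""
    result = subprocess.run(
        ["npm", "audit", "--json"], cwd=cwd, capture_output=True, text=True
    )
    return json.loads(result.stdout)

# Hand-written report fragment mirroring the finding above, not real output.
sample = {"metadata": {"vulnerabilities": {"critical": 1, "high": 0, "moderate": 2}}}
print(blocking_count(sample))
```

In CI you would call `run_audit()` and fail the build when `blocking_count` is above zero. If you would rather skip the script, `npm audit --audit-level=critical` sets a failing exit code on its own.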

Godspeed with your own dependency updates.

My Medium friends can read this over there as well.

I spent years on Kubernetes. Now I'm betting against it.

Lakshmi Narasimhan — Thu, 04 Dec 2025 00:00:00 +0000

I’ve spent years in the Kubernetes ecosystem. I wrote about K3s. I ran production clusters. I know my way around kubectl, Helm charts, and the CNCF landscape.

And I’m building a deployment tool that doesn’t use any of it.

Here’s why.

Kubernetes solves problems you don’t have

K8s is incredible engineering. It solves real problems:

Multi-team deployments without stepping on each other
Automatic failover across dozens of nodes
Fine-grained resource allocation at massive scale
Rolling updates for services with thousands of instances

If you’re Spotify, you need this. If you’re running a 50-person engineering org, you need this.

If you’re a solo dev with one FastAPI app and a Celery worker? You don’t.

As one dev put it: “Do you want to build a product, or do you want to build an infrastructure team? Kubernetes makes sense for the latter, but it’s often overkill for the former.”

You need:

git push → app is live
Rollback when you break something
Logs you can actually read
Alerts when the site goes down

That’s it. Everything else is ceremony.

The hidden cost isn’t the cluster

“But K3s is lightweight! You can run it on a $6 VPS!”

True. I’ve done it. Here’s what they don’t tell you:

A solo dev recently posted on r/kubernetes with a title that said it all: “Solo dev tired of K8s churn… What are my options?”

His pain point wasn’t learning Kubernetes. It was the maintenance:

“I don’t mind learning the topics and writing the config, I do mind having to deal with a lot of work out of nowhere just because the underlying tools are beyond my control and requiring breaking updates.”

He’d been burned by Bitnami charts pulling the rug, NGINX ingress breaking changes. Things that worked stopped working — not because he changed anything, but because the ecosystem did.

“It all felt very straightforward, and it worked so well for a bit, but it starts to crumble even when I haven’t changed anything on my side.”

This is the hidden cost. Not the setup — the churn.

The YAML tax: Every change requires editing manifests. Add an env var? YAML. Change a port? YAML. Want a cron job? That’s a whole new CronJob resource. One team had a production outage caused by an improperly indented YAML line. A single space broke prod.

The debugging tax: Something’s wrong. Is it the pod? The service? The ingress? The network policy? The PVC? Hope you remember how to read kubectl describe.

The upgrade tax: K3s made this easier, but you’re still running a distributed system. A 2024 report found over 77% of Kubernetes practitioners still have issues running their clusters — up from 66% in 2022. It’s getting harder, not easier.

The cognitive tax: Part of your brain is always allocated to “how does Kubernetes work” instead of “how do I ship features.”

As one commenter put it: “Choose your churn.” There’s always something.

The Reddit OP’s conclusion? He gave up on K8s entirely. Settled on plain NixOS on a single Hetzner VPS. Accepted that 99.9% uptime from one server is good enough. Skipped the redundancy he thought he needed.

“I am trying to write my software, I just want a reliable thing to host it with the freedom and reliability that one would expect from a system that stays out of your way.”

That’s the real ask. A system that stays out of your way.
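It helps to turn the 99.9% figure above into hours before deciding whether one server is good enough. A quick Python sketch of the arithmetic; the function name is mine, and it ignores leap years:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

def allowed_downtime_hours(uptime: float) -> float:
    """Hours per year a service may be down while still meeting `uptime`."""
    return (1.0 - uptime) * HOURS_PER_YEAR

print(round(allowed_downtime_hours(0.999), 2))
```

That works out to a bit under nine hours a year, which most solo projects can live with.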

For teams, the Kubernetes tax is worth paying. You split it across people, you build expertise, you amortize the cost.

Solo? You pay it all yourself, every time.

What actually works for solo devs

So if not Kubernetes, what?

The same Reddit OP nailed the PaaS problem too:

“These ‘managed-docker’ services charge per container/pod and force the user to over-provision. Your pod doesn’t run on 250mb RAM? Ok pay for 1GB even though you only need 500mb.”

I’ve tried everything:

Heroku (great until the bill hits)
Railway/Render (same story, nicer UX — $50-100/mo for what costs $5 on a VPS)
Dokku (solid, but showing its age)
Coolify (powerful, but now you’re babysitting another server)
K3s (overkill for most solo projects)
Raw Docker + nginx (works but tedious)

The best setup I’ve found: Kamal.

It’s from 37signals. They run Basecamp and HEY on it. It’s just Docker + SSH. No cluster, no orchestrator, no YAML manifests.

kamal deploy

That’s it. It SSHs into your server, pulls your container, does a zero-downtime swap. Rollback is one command. Logs are one command.

It’s boring. It works.

My bet: AI interface > dashboards > CLI > YAML

Here’s where it gets interesting.

Kamal solved the “deploy” problem. But ops is more than deploy:

Why is the app slow right now?
What happened at 3am?
Should I upgrade my VM or optimize my code?
Show me the errors from the last hour

These questions require jumping between tools. SSH into the box, grep the logs, check Grafana, cross-reference with your deploy history.

My bet: you shouldn’t need to do any of that.

You should just ask.

“Why is memory usage spiking?” → Here’s what’s using RAM, and here’s the trend over the last week.

“Roll back to yesterday’s deploy” → Done. Here’s what changed.

“Show me errors from the /api/checkout endpoint” → Found 47 errors, here’s the pattern.

This isn’t science fiction. LLMs are good at this now. The interface just doesn’t exist yet.

What I’m building

VMKit is my attempt at this interface.

Bring your own VPS (Hetzner, DigitalOcean, whatever)
It handles Kamal, Traefik, SSL, monitoring
The interface is conversation — web chat or MCP server in Claude Code

No Kubernetes. No YAML manifests. No 47-screen dashboards.

Just say what you want.

I might be wrong. Maybe solo devs actually love clicking through Render’s UI. Maybe the Kubernetes complexity is worth it for everyone.

But I don’t think so. I think the right answer for one person running one to three apps is radically simpler than what we have today.

vmkit.dev if you want to follow along.

The uncomfortable truth

I’m not anti-Kubernetes. I’m anti-complexity-for-its-own-sake.

K8s is a tool. An incredibly powerful one. But tools have contexts where they make sense and contexts where they don’t.

Solo dev shipping a SaaS? You don’t need pod autoscaling. You need deploys that work and a way to debug when they don’t.

That’s the bet.

AWS Is Overrated

Lakshmi Narasimhan — Sat, 11 Oct 2025 00:00:00 +0000

If you’re an indie dev building your first SaaS, AWS is not your friend.

It’s a maze of services, dashboards, and acronyms pretending to make you productive while quietly billing you for curiosity.

Sure, it’s “the industry standard.” But here’s the thing: you’re not Netflix. You’re not Stripe. You don’t need fifteen managed services to ship an MVP. You just need one working prototype in front of users.

When I started shipping my own SaaS projects, I defaulted to AWS too. Everyone said it was the “serious” choice. I spun up EC2s, tinkered with VPCs, IAM roles, and CloudWatch dashboards.

Two weeks later, my app still wasn’t live. But my bill was.

That’s when it clicked. AWS is optimized for scale, not speed. It’s designed for teams with DevOps pipelines, budgets, and compliance officers. Indie devs have none of those.

Here’s the real problem:

AWS makes you feel productive because it has a service for everything.

But it slows you down because you end up assembling infrastructure instead of shipping software.

You’re busy wiring VPCs while your users are waiting for a login page.

If you’re building your first SaaS, you’re better off with:

Render or Fly.io for fast deploys.
Railway, Supabase if you love simplicity.
DigitalOcean App Platform.
Or even your own K3s box on a $30 DigitalOcean droplet if you like to tinker. (More on this in future posts.)

You’ll have full control, predictable costs, and a deploy story you can explain in a single sentence.

That’s what matters at your stage — not five-nines availability across three regions.

AWS will always have its place. It’s incredible at running serious workloads, regulated systems, and multi-tenant platforms at scale.

But for indie devs trying to launch, learn, and iterate fast, it’s overkill.

Use the simplest stack that lets you ship.

Add complexity only when success forces you to.

Because nothing kills momentum faster than debugging IAM policies instead of building features.

TL;DR

If you’re a solo founder or small team, your advantage isn’t scale — it’s speed.

Don’t trade that away for a cloud that was never built for you.

Why Your SaaS Needs a Docker Compose Setup Even If You’re Just One Person

Lakshmi Narasimhan — Fri, 10 Oct 2025 00:00:00 +0000

If you’re building a SaaS solo, the biggest productivity killer isn’t writing code. It’s setting up your damn environment.

You know the story:

You clone your repo on a new laptop or spin up a new dev box, run flask run or uvicorn main:app --reload, and boom: connection refused on localhost:5432.

Postgres isn’t running.

Your .env file is half missing.

Supabase changed a port.

And now you’re googling “how to reset a Postgres user password” for the third time this month.

That’s why I’ve stopped messing around with manual setups and started containerizing my local environment using Docker Compose.

Not because it’s trendy.

Because it’s the only way to guarantee I can pull, build, and run my app in under a minute.

The indie dev reality

As solo devs, we move fast. We don’t have infra teams or onboarding docs. Most of our systems live in muscle memory and terminal history.

That’s fine when you’re in the groove — until you need to:

Revisit a project after a few months.
Share it with a collaborator.
Spin it up on a new machine.
Or just fix a quick bug and realize nothing runs anymore.

A solid local setup is like documentation that actually works.

Docker Compose is the simplest way to get there.

Why Docker Compose?

It’s not about “microservices” or “container orchestration.” Ignore that stuff.

Compose is just a YAML file that says:

“Here’s everything my app needs — run it all together.”

You can define your web app, Postgres, and even Supabase’s local stack if you want to mirror production closely.

When you run docker compose up, everything spins up consistently — same versions, same ports, same config — every time.

It’s reproducibility for humans.

The minimal example (Python + Postgres)

Let’s say you’re building a FastAPI app that talks to Postgres.

Here’s a dead-simple docker-compose.yml to make your life easier:

link to the gist

And a Dockerfile to go along with it.

Link to the gist

That’s it.

Now you can run your entire stack with one command:

docker compose up

Your Python app connects to Postgres instantly.

No need to brew install, no weird port conflicts, no “is Postgres running?” guessing game.
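On the app side, here is a minimal sketch of how the API container might locate the database. It assumes the Compose service is named db and reuses the dev user and app database from the seed command later in this post. The DATABASE_URL variable name, the password, and the db_settings helper are illustrative and not taken from the gists.

```python
import os
from urllib.parse import urlparse

# Hypothetical default: inside the Compose network the Postgres host is the
# service name ("db"), not localhost. User and database mirror the psql example.
DEFAULT_URL = "postgresql://dev:dev@db:5432/app"


def db_settings(env=os.environ):
    """Split DATABASE_URL into the pieces a Postgres driver expects."""
    url = urlparse(env.get("DATABASE_URL", DEFAULT_URL))
    return {
        "host": url.hostname,
        "port": url.port or 5432,
        "user": url.username,
        "password": url.password,
        "dbname": url.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(db_settings())
```

Keeping the host in an environment variable means the same image can point at the db container locally and at a managed Postgres in production without a code change.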

Want to mirror Supabase locally?

If you’re using Supabase in production but want to run locally, Supabase has its own CLI that uses Docker under the hood.

You can spin up a near-production clone with:

supabase start

That’ll run Postgres, API, auth, and storage locally in containers — no manual setup required.

It’s heavier, but it’s great if you’re testing row-level security, triggers, or anything that depends on Supabase’s stack.

But isn’t Docker heavy?

Yeah, a little.

The first time you pull images, it’ll download a few hundred MB. After that, it’s fast.

And honestly, the alternative is worse — debugging inconsistent environments and broken local databases.

The real magic isn’t that it’s fast. It’s that it’s reliable.

If you take a break from your project for a month, you can come back and it’ll just work.

That’s worth the disk space.

Bonus: the same setup works for production

Here’s the underrated part — once you have this docker-compose.yml, you’re halfway to a production deployment.

You can:

Build your app image with docker compose build api.
Push it to a registry like Docker Hub or GitHub Container Registry.
Deploy it to Fly.io, Render, Railway, or your VPS — all of which happily accept a pre-built Docker image.

That means your local setup = production setup.

No “works on my machine,” no separate Heroku config, no hand-tuned server differences.

You’re testing exactly what you’ll ship.

For example, to build your image for deployment:

docker compose build api
# Compose v2 names the image <project>-<service>; confirm with: docker image ls
docker tag yourapp-api your-registry.com/yourapp:latest
docker push your-registry.com/yourapp:latest

Then you can run it anywhere with:

docker run -p 8000:8000 your-registry.com/yourapp:latest

This alignment — same Dockerfile, same Compose config — is what makes deployment predictable, even as a one-person team.

Quality-of-life improvements

Once you’ve got Compose running smoothly, you can make it even nicer:

1. Add a Makefile or script for one-command startup:

make up

Your Makefile contents:

.PHONY: up
up:
	docker compose up --build

2. Add a seed script for your DB:

docker compose exec -T db psql -U dev -d app < seeds.sql

3. Run tests in the same containers:

docker compose run api pytest

You now have a full, consistent local dev environment that feels like production, without the cloud bill.

The point isn’t Docker — it’s repeatability

You’re not doing this to “learn containers.”

You’re doing it because your time is too valuable to waste on setup chores.

A docker-compose.yml file is the indie dev version of a safety net.

You can drop your laptop, clone your repo on a new one, and be productive in 60 seconds flat.

And when it’s time to deploy?

You’re already 90% there.

TL;DR

Your SaaS deserves a repeatable local setup.
Docker Compose makes it dead simple for Python + Postgres (and Supabase).
It doubles as your build foundation for production images.
You’ll thank yourself every time you reopen an old project or deploy something new.

It’s one of those rare decisions that’s both practical and future-proof.

Set it up once — and your dev-to-prod pipeline just became a lot less fragile.

The Kubernetes Controller That Auto-Reloads Your ConfigMaps

Lakshmi Narasimhan — Tue, 07 Oct 2025 00:00:00 +0000

Every now and then, you stumble upon a Kubernetes project that makes you stop and think, “Wait, why isn’t this built-in?”

Stakater Reloader is one of those for me.

Here’s the problem it quietly solves: Kubernetes Deployments, DaemonSets, and StatefulSets don’t automatically roll when the ConfigMaps or Secrets they reference change. You can edit the ConfigMap or rotate the Secret, but the pods? Values injected as environment variables stay frozen until the pod restarts, and mounted files are refreshed eventually, which only helps if the app re-reads them. It’s one of those “by design” quirks that has tripped up almost every engineer at least once.

Reloader fixes that. It’s a lightweight controller that watches for changes in ConfigMaps and Secrets. When it detects one, it simply triggers a rolling restart of the workloads that depend on them. Nothing fancy, nothing hacky — just Kubernetes done right.

Here’s how it works under the hood. It uses the Kubernetes watch API to monitor resource updates. When a change is observed, it looks for deployments or other workloads annotated with

reloader.stakater.com/auto: "true"

If it finds one, it patches the workload’s pod template, by bumping an annotation or a placeholder environment variable depending on the configured reload strategy, so Kubernetes treats it as a new revision and triggers a rolling update. No sidecars, no injection tricks, no external scripts. Just a clean use of the existing control-plane semantics.
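To make the trick concrete, here is a rough sketch of the annotation variant: hash the ConfigMap’s data and write that hash into the pod template. This is not Reloader’s actual code. The annotation key example.com/config-hash and both function names are made up for illustration.

```python
import hashlib
import json

# Hypothetical annotation key, chosen for this example only.
HASH_ANNOTATION = "example.com/config-hash"


def config_hash(data):
    """Stable SHA-256 over a ConfigMap's data mapping (key order ignored)."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rollout_patch(data):
    """Merge-patch body that changes the pod template when the config changes."""
    return {
        "spec": {
            "template": {
                "metadata": {"annotations": {HASH_ANNOTATION: config_hash(data)}}
            }
        }
    }


if __name__ == "__main__":
    old = rollout_patch({"LOG_LEVEL": "info"})
    new = rollout_patch({"LOG_LEVEL": "debug"})
    # Different config gives a different annotation, hence a new pod template.
    print(old != new)
```

Applying a body like this with kubectl patch or a client library changes the pod template, which is what makes the Deployment controller start a rolling update. Unchanged data yields the same hash, so re-applying it does nothing.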

It’s elegant precisely because it doesn’t reinvent anything. It leans into how Kubernetes already works, filling in an obvious usability gap.

You could argue this is the kind of feature that belongs in core Kubernetes. After all, it’s not “extra functionality”; it’s just common sense. If my config changes, my app should refresh. But Kubernetes’ philosophy has always been to stay minimal, leaving operators and tools to extend behavior. That’s how ecosystems like Stakater exist in the first place.

And yet, Reloader feels different. It doesn’t add complexity; it removes friction. It codifies a best practice we’ve all implemented in ad-hoc ways: shell scripts, kubectl rollout restart, or CI hacks. In a way, Reloader formalizes something that should have been declarative from day one.

If you look at its implementation, it’s almost deceptively simple — a few controllers, an event handler, and some logic to patch annotations. But simplicity is what makes it beautiful. It’s one of those tools that quietly runs in the background for years without drawing attention — until one day you disable it and everything starts to feel broken again.

The lesson? Some of the most powerful Kubernetes tools don’t add layers of abstraction; they close tiny gaps that make the system feel humane.

Reloader doesn’t try to be clever. It just keeps your pods honest.

When DIY Beats Managed Kubernetes

Lakshmi Narasimhan — Sun, 21 Sep 2025 00:00:00 +0000

When I first started working with Kubernetes, I immediately gravitated toward managed offerings like EKS, GKE, and AKS. The promise was compelling: let AWS/Google/Azure handle the control plane while you focus on your applications. Fast forward a few years, and I’ve come to a somewhat contrarian position—for many teams, especially those with some ops capability, running K3s on virtual machines often makes more sense than using managed Kubernetes.

Let me explain why, and the important caveats to make this approach work.

The Managed Kubernetes Tax

Managed Kubernetes services aren’t free—and I’m not just talking about the literal cost (though that’s significant). They come with several forms of “tax”:

Financial cost: You pay for control plane(s), often per cluster. For small to medium workloads, this can be disproportionately expensive.
Complexity tax: Managed K8s integrates deeply with cloud provider infrastructure—IAM, networking, storage—adding layers of abstraction and potential failure points.
Upgrade friction: Managed K8s upgrades are often more complex than they need to be, involving node group rotations and potential downtime.
Cognitive overhead: You still need to understand Kubernetes, plus the cloud provider’s implementation quirks and limitations.

Take EKS, for example. What starts as “just let AWS manage the control plane” quickly spirals into wrestling with IAM roles for service accounts, custom CNIs, AWS Load Balancer Controllers, and cluster autoscaler configurations that mysteriously stop working after upgrades. I’ve spent entire days debugging issues that stemmed from the interaction between EKS and AWS’s underlying services—time that could have been spent improving our actual applications.

Enter K3s: Kubernetes Without the Bloat

K3s is a certified Kubernetes distribution designed for resource-constrained environments. It’s packaged as a single binary under 100MB and uses significantly fewer resources than standard K8s. But don’t let the “lightweight” label fool you—K3s is a production-grade distribution that powers everything from IoT devices to large-scale production systems.

When deployed on standard VMs (whether AWS EC2, DigitalOcean Droplets, or your own infrastructure), K3s offers several advantages:

Simplicity: A K3s cluster can be bootstrapped with a single command. No complex cloud provider integration required.
Cost efficiency: Run your entire control plane and worker nodes on standard VMs, often at a fraction of the cost of managed offerings.
Portability: Your setup works the same way regardless of where your VMs are hosted, making multi-cloud and hybrid deployments straightforward.
Easier upgrades: K3s upgrades can be as simple as replacing a binary and restarting a service.
Full control: No mysterious behavior or limitations imposed by the cloud provider’s implementation.

The Critical Caveat: You Need Automation

Here’s where I need to be clear: this approach only makes sense if you invest in automation. You’re essentially building your own management layer, which requires:

Infrastructure as Code: Your entire VM fleet and K3s deployment should be defined in Terraform, Pulumi, or similar.
Automated scaling: Scripts or tools that can add/remove nodes based on cluster metrics.
Upgrade playbooks: Well-tested procedures for upgrading K3s versions with minimal disruption.
Monitoring and alerting: Comprehensive visibility into both VM and Kubernetes-level metrics.
Backup and disaster recovery: Regular etcd snapshots and documented recovery procedures.

Without these elements, you’re likely better off with managed Kubernetes. The goal isn’t to recreate every feature of EKS/GKE/AKS, but to build a simpler, more focused system that meets your specific needs.
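As a flavor of what the “automated scaling” piece can look like, here is a deliberately small sketch of the decision logic only. The thresholds, node limits, and function name are assumptions for illustration. Fetching utilization and actually adding or draining VMs is left to whatever tooling you already use.

```python
def desired_node_count(current_nodes, cpu_utilization, *,
                       scale_up_at=0.75, scale_down_at=0.30,
                       min_nodes=1, max_nodes=5):
    """Return the node count to aim for, given per-node CPU utilization (0.0-1.0).

    Moves by at most one node per call to keep changes gradual.
    All thresholds and limits here are illustrative defaults.
    """
    if not cpu_utilization:
        return current_nodes  # no metrics, no change
    average = sum(cpu_utilization) / len(cpu_utilization)
    if average > scale_up_at:
        target = current_nodes + 1
    elif average < scale_down_at:
        target = current_nodes - 1
    else:
        target = current_nodes
    # Clamp to the allowed cluster size.
    return max(min_nodes, min(max_nodes, target))


if __name__ == "__main__":
    print(desired_node_count(3, [0.9, 0.8, 0.85]))  # busy cluster
    print(desired_node_count(3, [0.1, 0.2, 0.15]))  # idle cluster
```

Keeping the decision a pure function makes it easy to unit test before wiring it to anything that can create or destroy machines.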

Real-World Example

For one of my recent projects, we replaced an EKS cluster costing roughly $250/month (control plane + required minimum nodes) with a K3s setup on three small VMs totaling $60/month. The migration took 3 days, and we’ve had fewer operational issues since.
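The arithmetic behind those numbers, using only the two monthly figures above (migration time and ongoing ops effort are not priced in):

```python
EKS_MONTHLY = 250  # USD, control plane + required minimum nodes (from the post)
K3S_MONTHLY = 60   # USD, three small VMs (from the post)

monthly_saving = EKS_MONTHLY - K3S_MONTHLY
annual_saving = monthly_saving * 12
percent_saving = 100 * monthly_saving / EKS_MONTHLY

# Prints: $190/month, $2280/year, 76% lower
print(f"${monthly_saving}/month, ${annual_saving}/year, {percent_saving:.0f}% lower")
```

That is real money for a small project, though not life-changing on its own.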

Our automation includes:

Terraform for VM provisioning
Ansible for K3s installation and configuration
Custom scripts for horizontal scaling based on node resource utilization
Prometheus + Grafana for monitoring
Weekly etcd snapshots stored in S3

The entire setup is documented in a Git repository, and new team members can spin up a local replica for testing using Vagrant.

The maintenance complexity with EKS was what ultimately pushed us over the edge. Every few months, AWS would deprecate something or introduce a new “recommended” way to handle networking, storage, or access control. We’d spend days reading through documentation changes and testing upgrades in staging environments. With K3s, upgrades are predictable and focused on Kubernetes itself, not the surrounding ecosystem of AWS-specific components.

When to Stick with Managed Kubernetes

This approach isn’t for everyone. You should probably stick with managed Kubernetes if:

You have large, complex clusters with hundreds of nodes
Your team has limited operations expertise
You need advanced features like managed node auto-scaling groups
You’re heavily invested in cloud-provider specific features

Conclusion

The beauty of the K3s-on-VMs approach is that it strips Kubernetes down to what it does best—orchestrating containers—without the added complexity that comes from deep cloud provider integration.

By building your own lightweight management layer through automation, you get the benefits of Kubernetes with more control, often at a lower cost. The key is being honest about your team’s capabilities and needs.

For startups, indie hackers, and teams that value simplicity and cost-efficiency, this approach is worth considering. You might find that a little investment in automation pays significant dividends in both cost savings and reduced operational complexity.

Of course, if you enjoy spending your weekends debugging why your EKS cluster suddenly can’t talk to your RDS instances despite no apparent changes, then by all means, stick with managed Kubernetes. Some people also enjoy jigsaw puzzles with missing pieces.
