I’ve been thinking recently about two things: The Roman Empire and the unnecessary bloat in Software Engineering.
You see, about 7 months ago I quit my job due to burnout.
Since then, I have spent a lot of time thinking about and analyzing what is so wrong with our industry.
I wrote once about how our industry is broken, but that article was referring more to the technical side and code bloat.
Today, I want to talk about the cultural and architectural sides.
Let’s build a Kafka queue with distributed microservices architecture
Seems like the microservices architecture hype is finally over.
People are coming back to traditional monoliths, or not caring at all about labeling their architecture.
But there still remains a trend that many follow.
I call it “the MANGA trend”.
Inexperienced engineers, or engineers who are driven by ego or a desire to be promoted (and sometimes by boredom, I’ll talk about it later), come up with one solution for every problem.
The “Kafka solution”.
Kafka, in this case, is just a placeholder for “your favorite MANGA tech stack” (MANGA: Meta, Apple, Netflix, Google, Amazon—The holy quintet of software engineering).
What happens is that one of these “big tech” companies posts on its super cool engineering blog about a super hard problem it had, and how it solved it by building a reverse-proxy-on-top-of-a-potato-using-ipv6-over-telephone-wires, which is now open-source, and all the inexperienced or ego-driven engineers are like: “Wow! I want that too!”
So they convince their fellow teammates, team leads, and engineering managers (using their superior writing skills which they acquired thanks to my recent book Technical Writing for Software Engineers)—that we, too, need to have such architecture.
It’s good for Google, why would it be bad for us?
Their plan could have been bulletproof, and would lead them to promotion, if only they didn’t forget one thing: they are not Google, and they have 12 MAU (monthly active users).
The desire to implement over-engineered solutions is caused by multiple factors.
The need for social validation is one of them.
We want to be as cool and as trendy as the other company.
But we forget that we are not Google.
We don’t have the same scale, and we are definitely not solving the same problems.
As Donald Knuth famously wrote: “There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”
But then comes the “anti-Donald” developer who says:
“We need to optimize for the future. Our product is going to grow, and we need to support its future growth.”
And while this statement by itself is not wrong (assuming it’s backed by actual data) there is another fallacy that comes attached to it.
The fallacy that the existing tech we use is not scalable.
Most of the time, that is wrong.
The following statements will piss off many engineers, so be warned:
And your bottleneck is not in your asynchronous processing technology.
It is, in fact, in your managerial inefficiency, and technological bloat.
All aboard the hyper-growth train
There is a hard truth that needs to be said: there are more developers today than meaningful jobs for them.
Here, I said it, you can cancel me now.
Companies are living organisms.
And they have many of the same social interactions as humans have.
One such interaction is mirroring.
When you, as a respectful and politically correct human being, find yourself in an environment of racists, you have two options:
Leave
Be racist (mirror your environment)
Yes, technically you have a third option: to stand your ground and stay true to your principles. But c’mon, I’m trying to make a point, so bear with me.
Companies rarely have the option to leave.
They all compete in the same market, for the same talent (most of the time).
And they all mirror each other.
This is why when Google announces layoffs, all the other MANAs will follow it.
And when one company releases an AI model, others will pivot to it as well.
And finally, when one enters a hyper-growth mode, others will follow.
They do so for various reasons, which most of the time are unrelated to actually needing this talent for meaningful jobs.
One reason is mimicking what others are doing.
Another reason is to keep candidates off the market.
Amazon would rather hire all the best AI engineers, so Nvidia wouldn’t get them, even though Amazon had no plans to release a general purpose LLM.
But FOMO is real, and it’s better to insure myself than to let a competitor get ahead.
This creates a weird situation where a company is now stuck with personnel who have no actual work to do.
But nobody is getting paid to watch Instagram Reels all day, so we have to come up with some sort of work.
And smart engineers, being smart, start to come up with over-engineered solutions.
This is why we end up with Kafka-infested development in every company.
In reality, it would be cheaper and easier to introduce Kafka once you are already milking your resources to the max, instead of prematurely optimizing.
And the reason is…
Legacy
I wrote about legacy code in this blog post.
And the truth is that legacy code is unavoidable in our modern engineering culture.
Modern development practices prioritize an endless feature list, while pushing the technical backlog down into the abyss.
But technical backlog is the building block of avoiding legacy code.
You need to have constant refactoring sessions and keep revisiting old code in order to avoid code rot.
By the way, the opposite scenario, where the majority of the time is spent on refactoring, leads to another pandemic that I call “lets-rewrite-it-to-v2-using-new-cool-tech”. This also stalls healthy development and leads to over-engineered solutions that disregard any product features.
If, instead of running “premature-optimization” development cycles, we concentrated on refactoring, we wouldn’t need to introduce Kafka “just-because”™.
But due to the desire to do cool stuff, “like Google does”, paired with the fact that a lot of the engineers are over-hired—we end up with premature optimization.
And in reality it always fails, because you are building for something you don’t have, hence you can’t know what you need.
It would be wiser to introduce a particular technology when you hit a bottleneck and understand what you need to fix.
The good news is that there seems to be a shift in the minds of developers and companies.
I see more and more people sharing how they abolish their overly complicated AWS setups in order to go back to traditional VPS + DevOps setups.
More engineers, myself included, are speaking about using simple technologies.
The above-mentioned blog post by Adriano Caloiaro about using PostgreSQL as a queue is a great example.
The increased mentions of simple technologies such as SQLite and HTMX are another example.
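To make the “database as a queue” idea a bit more concrete, here is a minimal sketch of a table-backed job queue. It is not taken from the blog post mentioned above; to keep it self-contained it uses SQLite from Python’s standard library rather than PostgreSQL, and the `jobs` table, its columns, and the function names are all made up for illustration. On PostgreSQL, the claiming step is commonly written with `SELECT ... FOR UPDATE SKIP LOCKED` instead of the write lock used here.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open the database and create the (hypothetical) jobs table if missing."""
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly with BEGIN/COMMIT below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               id      INTEGER PRIMARY KEY AUTOINCREMENT,
               payload TEXT NOT NULL,
               status  TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    return conn


def enqueue(conn, payload):
    """Add a job to the queue and return its id."""
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim_next(conn):
    """Mark the oldest pending job as running and return (id, payload), or None."""
    # BEGIN IMMEDIATE takes SQLite's write lock up front, so two workers
    # cannot select and claim the same row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


def complete(conn, job_id):
    """Mark a claimed job as done."""
    conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))


if __name__ == "__main__":
    conn = open_queue()
    enqueue(conn, "send-welcome-email:42")
    job = claim_next(conn)
    if job is not None:
        job_id, payload = job
        print("processing", payload)
        complete(conn, job_id)
```

This leaves out retries, timeouts for stuck jobs, and polling, but it is roughly the amount of machinery a product with 12 MAU needs before reaching for a dedicated broker.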
So there is hope.