Architecture decisions made for Netflix’s scale don’t survive contact with a four-person team.

The Trap That Catches Every Team Twice
The story plays out the same way every time. A team of five reads a post about how Uber decomposed their monolith. A senior engineer watches a Netflix talk about deploying 4,000 times a day. Excitement builds. A planning session happens. Three months later, the team has split their functional Laravel application into eleven services, added a service mesh, set up distributed tracing and hired a platform engineer to keep it running.
Feature velocity has dropped by half. Deployments take three times as long. The last production incident required four people to trace a single failed request across six services. One developer now spends most of their time on infrastructure instead of product work.
Microservices didn’t fail them. They applied the tool to a problem they did not have.
Here is what nobody says plainly: microservices are an organizational scaling solution. The architecture you get from a microservices approach maps onto your org chart, not your domain model. That is not a coincidence. It is physics. And it has a name.
Conway’s Law Is Not Optional
In 1968, Melvin Conway published a paper in Datamation that contained one of the most consistently ignored observations in software engineering: “Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations.”
Conway’s Law. You have heard it cited at conference talks. You probably have not internalized what it means for the architecture choices you make this quarter.
When Amazon organizes its product teams around the “two-pizza team” rule, each team owning a bounded service and able to deploy independently, the microservices architecture they produce directly reflects that organizational structure. Each team needs autonomy. Each team needs to release without coordinating with seven other teams. Each team needs a deployment boundary that matches its ownership boundary. The architecture serves the humans, not the other way around.
When a five-person startup builds microservices, the architecture serves nothing. There is no communication bottleneck between teams to eliminate because there is only one team. There is no deployment independence to achieve because everyone already coordinates everything in one Slack channel. The architectural overhead exists for its own sake and you pay for it in full.
This matters because the costs of microservices are not theoretical. They are concrete and paid every sprint.
The Microservices Tax You Are Already Paying
Consider a user request that touches five of your services sequentially. Even if every hop is healthy and returns in 10ms, the calls alone add 50ms to the request on top of whatever the business logic itself costs. Serialization, deserialization, network jitter and service discovery push each hop above that 10ms. Issuing the calls in parallel helps, because you then wait for the slowest hop rather than the sum of all hops, but the cost never disappears.
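The arithmetic is simple enough to sketch. The hop latencies below are illustrative assumptions, not measurements: sequential calls sum, while calls issued in parallel cost roughly as much as the slowest one.

```php
<?php
declare(strict_types=1);

/**
 * Added latency when each call waits for the previous one: the hops sum.
 *
 * @param list<float> $hopLatenciesMs per-hop round-trip time in milliseconds
 */
function sequentialLatencyMs(array $hopLatenciesMs): float
{
    return (float) array_sum($hopLatenciesMs);
}

/**
 * Added latency when all calls are issued at once: the slowest hop dominates.
 *
 * @param list<float> $hopLatenciesMs
 */
function parallelLatencyMs(array $hopLatenciesMs): float
{
    return $hopLatenciesMs === [] ? 0.0 : (float) max($hopLatenciesMs);
}

// Five hops at an assumed 10 ms each.
$hops = [10.0, 10.0, 10.0, 10.0, 10.0];

printf("sequential: %.0f ms\n", sequentialLatencyMs($hops)); // sequential: 50 ms
printf("parallel:   %.0f ms\n", parallelLatencyMs($hops));   // parallel:   10 ms
```

Real hops are rarely uniform, so in the parallel case the tail latency of the slowest dependency becomes the number to watch.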
That is just latency. The operational overhead is worse.
Each service needs its own deployment pipeline. Its own logging configuration. Its own health checks. Its own secrets management. Its own alerting thresholds. If you have twelve services and one developer nominally owns infrastructure, that developer spends more time keeping services alive than building features.
Debugging is where the real damage lands. A stack trace in a monolith tells you exactly what happened and on which line. In a distributed service architecture you get a request ID and a distributed trace you have to reconstruct manually. You need distributed tracing infrastructure. You need correlation IDs threaded through every log call in every service. You need someone who understands Jaeger or an equivalent at 2am when production is down and a customer is calling your CEO.
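To make "correlation IDs threaded through every log call" concrete, here is a minimal sketch of the plumbing each service has to carry. The `x-correlation-id` header name is an assumed convention rather than a standard, and all function names are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Reuse the caller's correlation ID if one arrived, otherwise mint a new one.
 * The header name is an assumed convention; headers are keyed lowercase here.
 *
 * @param array<string, string> $headers inbound request headers, lowercase keys
 */
function resolveCorrelationId(array $headers): string
{
    $incoming = trim($headers['x-correlation-id'] ?? '');

    return $incoming !== '' ? $incoming : bin2hex(random_bytes(16));
}

/** Prefix a log line so it can be matched to lines from other services. */
function logLine(string $correlationId, string $message): string
{
    return sprintf('[cid=%s] %s', $correlationId, $message);
}

/**
 * Headers to attach to every outbound call so the next service logs the same ID.
 *
 * @return array<string, string>
 */
function outboundHeaders(string $correlationId): array
{
    return ['x-correlation-id' => $correlationId];
}

$cid = resolveCorrelationId(['x-correlation-id' => 'req-123']);
echo logLine($cid, 'charging customer'), "\n"; // [cid=req-123] charging customer
```

Every service has to repeat this on every inbound request, every log call and every outbound call. Miss it in one place and the trace goes dark at exactly that hop.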
That debugging gap does not show up on your roadmap. It shows up as “investigation” tickets that take three days to close and features that slip by a week because two engineers spent Tuesday afternoon chasing a failure across five services with no clear owner.
The Case Study Nobody Wants to Sit With
In 2023, Amazon Prime Video published a technical case study on the monitoring tool run by its Video Quality Analysis (VQA) team. This was not a legacy system. It was a purpose-built real-time monitoring pipeline designed to check audio and video quality across thousands of concurrent Prime Video streams.
The original architecture was a distributed pipeline built on AWS services: video segments processed through AWS Step Functions, video frames written to Amazon S3 as intermediate storage between steps and individual defect detection components running as separate functions. The architecture was textbook microservices thinking.
The problems were structural. AWS Step Functions charged per state transition, and a video monitoring pipeline produces a lot of state transitions. S3 roundtrips for intermediate data between processing steps added both cost and latency at the throughput levels required. The distributed design created tight coupling through the shared S3 data store while surrendering the autonomy that actually justifies service separation.
The team’s solution was to consolidate the service into a single process. No more S3 for intermediate frame storage. No more Step Functions orchestration overhead between steps. Data moved in memory between detection components. Deployment complexity dropped. Debugging the pipeline became straightforward again.
The outcome was a 90% reduction in infrastructure costs, alongside better scalability and simpler operations than the distributed version had ever managed. Amazon’s Prime Video Tech Blog published the full write-up in 2023 if you want to read the specifics.
Amazon built a monolith for the component that needed to be a monolith. This is not an ironic reversal. It is engineering judgment applied without attachment to architectural fashion.
What the Modular Monolith Actually Is
The modular monolith is not a concession for teams that failed at microservices. It is a deliberate architectural pattern that delivers clean domain separation without the distributed systems tax.
The core constraint: one deployable artifact, multiple modules. Each module owns its domain logic, its data access layer and its public interface. Modules communicate through well-defined contracts, not HTTP calls across a network. You get the domain boundary without the network boundary, until the network boundary is actually justified.
Here is what this structure looks like in a Laravel application:

    app/
      Modules/
        Orders/
          Actions/
            CreateOrder.php
            CancelOrder.php
          Models/
            Order.php
            OrderItem.php
          Events/
            OrderCreated.php
          Services/
            OrderService.php
          routes.php
        Inventory/
          Actions/
            ReserveStock.php
            ReleaseStock.php
          Models/
            Product.php
            StockMovement.php
          Events/
            StockReserved.php
            StockInsufficient.php
          Services/
            InventoryService.php
          routes.php
        Billing/
          Actions/
            ChargeCustomer.php
            RefundOrder.php
          Services/
            BillingService.php
          routes.php

Modules communicate through events, not direct class imports between module internals. The Orders module dispatches an OrderCreated event. Inventory listens and reserves stock. Billing listens and initiates a charge. No module reaches into another module's private implementation.
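
    // Modules/Orders/Actions/CreateOrder.php
    // Namespaces assume Laravel's default PSR-4 mapping of App\ to app/.
    namespace App\Modules\Orders\Actions;

    use App\Modules\Orders\Events\OrderCreated;
    use App\Modules\Orders\Models\Order;
    use Illuminate\Contracts\Events\Dispatcher;
    // OrderRepository and CreateOrderData also need importing from wherever
    // the Orders module defines them.

    final class CreateOrder
    {
        public function __construct(
            private readonly OrderRepository $orders,
            private readonly Dispatcher $events
        ) {}

        public function execute(CreateOrderData $data): Order
        {
            $order = $this->orders->create($data);

            // Announce the fact. Orders does not know or care who listens.
            $this->events->dispatch(new OrderCreated(
                orderId: $order->id,
                total: $order->total
            ));

            return $order;
        }
    }

    // Modules/Inventory/Listeners/HandleOrderCreated.php
    namespace App\Modules\Inventory\Listeners;

    use App\Modules\Inventory\Actions\ReserveStock;
    // The event is the only Orders type that Inventory touches.
    use App\Modules\Orders\Events\OrderCreated;

    final class HandleOrderCreated
    {
        public function __construct(
            private readonly ReserveStock $reserveStock
        ) {}

        public function handle(OrderCreated $event): void
        {
            $this->reserveStock->execute($event->orderId);
        }
    }
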
This is real domain separation. No network overhead. No serialization. No distributed transaction complexity. You can test the Orders module in complete isolation from Inventory. You can onboard a developer to Billing without needing them to understand the full system.
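A minimal sketch of what testing Orders in isolation can look like. It deliberately avoids Laravel so that it runs on its own: `EventBus`, `RecordingEventBus` and the simplified `OrderCreated` and `CreateOrder` classes are hypothetical stand-ins for the framework dispatcher and the module classes, and the integer `total` is assumed to be in minor units.

```php
<?php
declare(strict_types=1);

// Stand-in for the event class published by the Orders module.
final class OrderCreated
{
    public function __construct(
        public readonly int $orderId,
        public readonly int $total, // assumed to be in minor units (cents)
    ) {}
}

// Hypothetical minimal contract standing in for the framework's dispatcher.
interface EventBus
{
    public function dispatch(object $event): void;
}

// Test double: records events instead of delivering them to listeners.
final class RecordingEventBus implements EventBus
{
    /** @var list<object> */
    public array $dispatched = [];

    public function dispatch(object $event): void
    {
        $this->dispatched[] = $event;
    }
}

// Simplified CreateOrder: persistence is replaced by an in-memory counter.
final class CreateOrder
{
    private int $nextId = 1;

    public function __construct(private readonly EventBus $events) {}

    public function execute(int $total): int
    {
        $orderId = $this->nextId++;
        $this->events->dispatch(new OrderCreated($orderId, $total));

        return $orderId;
    }
}

$bus = new RecordingEventBus();
$orderId = (new CreateOrder($bus))->execute(total: 4999);

// Orders is verified without loading Inventory or Billing at all.
assert($orderId === 1);
assert(count($bus->dispatched) === 1);
assert($bus->dispatched[0] instanceof OrderCreated);
```

The test asserts only on the event that left the module. Whether anyone listens, and what they do with it, is the listening module's test to write.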
And when the time genuinely comes to extract a module into its own service, you have a clean seam to cut along. The public interface is already defined. The event contract is already in place. The data ownership is already clear.
Shopify ran this pattern on their Rails codebase for years as the organization scaled. Service boundaries got drawn when deployment independence or genuinely different infrastructure requirements justified the cost, not before.
Enforcing Boundaries Before They Drift
The modular monolith only works if module boundaries hold under deadline pressure. Convention alone does not hold. When a developer needs to ship a fix fast, they reach for the shortest path. If that path goes straight through another module’s internals, it will be taken.
Tooling enforces what discipline cannot sustain.
For PHP projects, Deptrac is a static analysis tool that lets you define architectural layers and run violation checks in CI. A developer importing Inventory\Models\Product directly inside the Orders module gets a pipeline failure before the code merges.
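
    # deptrac.yaml
    deptrac:
      paths:
        - ./app
      layers:
        - name: Orders
          collectors:
            - type: directory
              value: app/Modules/Orders/.*
        - name: Inventory
          collectors:
            - type: directory
              value: app/Modules/Inventory/.*
      # Rulesets are allow-lists. "~" (YAML null) means the layer may
      # depend on no other layer, so cross-module imports fail the check.
      # Shared contracts such as Orders/Events would need their own
      # layer that listening modules are allowed to depend on.
      ruleset:
        Orders: ~
        Inventory: ~

For Python projects, including FastAPI applications, Import Linter serves the same purpose. Define your boundary rules in a configuration file and the tool checks your import graph on every CI run.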
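As a rough illustration of the Import Linter side, an independence contract forbids imports in either direction between the listed packages. The package names below are assumptions chosen to mirror the module layout in this article, not taken from a real project.

```ini
# .importlinter  (package names are assumptions for illustration)
[importlinter]
root_package = app

[importlinter:contract:module-independence]
name = Orders and Inventory do not import each other
type = independence
modules =
    app.modules.orders
    app.modules.inventory
```

Running `lint-imports` in CI then fails the build when either package imports the other.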
In Go, Google’s Service Weaver framework takes this further. You write the application as a modular monolith and decide at deploy time whether components run in a single process or as separate services. The boundary enforcement is part of the framework, not a convention you hope the team respects.
Turn architectural guidelines into pipeline failures. That is the only way boundaries survive contact with a real production codebase over two years.
When Microservices Are Actually the Right Tool
None of this means microservices are always wrong. There are conditions under which they genuinely justify the overhead.
Different infrastructure profiles per component. If your image processing pipeline needs 40 CPU cores and your user API needs two, running them as a single deployable artifact wastes money and creates deployment risk. Components with genuinely different scaling characteristics belong in separate units.
Incompatible technology stacks. If your application is PHP and you need a machine learning inference component in Python, you need a deployment boundary. Forcing incompatible runtimes into a single process is worse than the network overhead.
Multiple autonomous teams at genuine scale. When you have several product teams of eight or more developers, coordinating releases across a shared codebase becomes the bottleneck. Service boundaries reduce that coordination cost. Notice the precondition though: multiple large teams. Not one team. Not three developers who sit together.
Hard security or compliance isolation. A payments component with PCI-DSS requirements, a health data processor under HIPAA or a service handling government authentication may need physical deployment separation regardless of team size. Regulatory requirements create legitimate architectural constraints.
Most applications running twelve services in Kubernetes do not meet any of these conditions. Most of them have a module problem wearing a microservices costume.
The Migration Path for What You’ve Already Built
If you are already running microservices and this is landing somewhere uncomfortable, the solution is not to tear everything down in a weekend.
Audit your services against the criteria above. Which services have genuinely different scaling requirements? Which ones deploy independently more than once a month? Which ones are owned by separate teams with separate release cadences and separate on-call rotations?
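The audit can be as plain as a table of facts per service and one rule applied to all of them. This is a sketch under a stated assumption: a service keeps its boundary if at least one of the three questions above gets a yes. The field names and sample services are invented for illustration.

```php
<?php
declare(strict_types=1);

// Facts gathered per service during the audit; field names are illustrative.
final class ServiceAudit
{
    public function __construct(
        public readonly string $name,
        public readonly bool $hasDistinctScalingProfile,
        public readonly int $independentDeploysPerMonth,
        public readonly bool $ownedBySeparateTeam,
    ) {}
}

/**
 * Assumed rule: a service justifies its boundary if at least one criterion holds.
 */
function justifiesBoundary(ServiceAudit $s): bool
{
    return $s->hasDistinctScalingProfile
        || $s->independentDeploysPerMonth > 1
        || $s->ownedBySeparateTeam;
}

/**
 * @param list<ServiceAudit> $services
 * @return list<string> names of services worth folding back into the monolith
 */
function consolidationCandidates(array $services): array
{
    $failing = array_filter($services, fn (ServiceAudit $s) => !justifiesBoundary($s));

    return array_values(array_map(fn (ServiceAudit $s) => $s->name, $failing));
}

$services = [
    new ServiceAudit('image-processing', true, 6, true),
    new ServiceAudit('notifications', false, 0, false),
    new ServiceAudit('user-profile', false, 1, false),
];

print_r(consolidationCandidates($services)); // notifications and user-profile
```

A stricter team might require two of the three criteria. The point is to write the rule down and apply it to every service, including the ones people are attached to.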
The services that fail those tests are candidates for consolidation. Not a full rewrite. Consolidation: bring the code into a single application as distinct modules, maintain database separation initially, remove the HTTP calls between them and watch your operational overhead contract.
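Removing the HTTP calls is least painful when callers depend on an interface rather than on an HTTP client. The sketch below assumes that shape. `StockReservations`, `InProcessStockReservations` and `PlaceOrder` are hypothetical names, and the in-memory array stands in for real storage.

```php
<?php
declare(strict_types=1);

// The contract Orders depends on. All names here are hypothetical.
interface StockReservations
{
    public function reserve(int $orderId): bool;
}

// After consolidation: same contract, fulfilled by a direct method call
// into the Inventory module instead of an HTTP request to another service.
final class InProcessStockReservations implements StockReservations
{
    /** @var array<int, true> reserved order IDs, standing in for real storage */
    private array $reserved = [];

    public function reserve(int $orderId): bool
    {
        if (isset($this->reserved[$orderId])) {
            return false; // already reserved; no retry or timeout handling needed
        }
        $this->reserved[$orderId] = true;

        return true;
    }
}

// The caller is unchanged by the swap because it only knows the interface.
final class PlaceOrder
{
    public function __construct(private readonly StockReservations $stock) {}

    public function execute(int $orderId): string
    {
        return $this->stock->reserve($orderId) ? 'confirmed' : 'rejected';
    }
}

$placeOrder = new PlaceOrder(new InProcessStockReservations());
echo $placeOrder->execute(42), "\n"; // confirmed
echo $placeOrder->execute(42), "\n"; // rejected
```

If the former HTTP client implemented the same interface, the swap is a one-line change in the container binding, and the retry, timeout and serialization code around the old call can be deleted.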
The services that pass the tests are the ones that actually belong as services. Keep them. Operate them properly.
Fix the Modules First
The industry spent a decade treating microservices as the default architecture for any application expected to grow. The result is a generation of systems that are harder to debug, slower to ship and more expensive to run than they need to be.
The question to ask before drawing a service boundary is not “could this be a separate service?” Almost anything could be. The question is: what specific coordination problem does this boundary solve?
If you cannot name the coordination problem, you do not have a reason for the boundary.
Most applications do not have coordination problems between teams. They have tangled dependencies between modules, blurry domain boundaries and objects that do too much. Turning those modules into services does not fix the problem. It makes the problem distributed and gives it a latency bill.
Fix the module boundaries. Add enforcement tooling. Ship features faster because you are not debugging across six network hops.
Save the service extraction for the problem you actually have, not the problem Netflix solved in 2015.


