Unlocking PHP's Concurrency Secrets: How to Overcome Common Limitations and Optimize Performance

Contents

1 Why PHP’s concurrency feels weird (and what’s really going on)
2 The first lie: “PHP is single-threaded, so it has no concurrency”
3 The second lie: “Just scale horizontally”
4 Where PHP really hits its limits
5 Concurrency inside one PHP process: promises, generators, fibers
6 Where PHP’s model quietly helps you
7 Practical ways to stop fighting PHP’s concurrency model
8 The quiet skill: knowing when “enough” concurrency is enough

Why PHP’s concurrency feels weird (and what’s really going on)

Sometime around 23:40, with an almost-cold mug of coffee and three failing tests, I finally accepted something I’d been avoiding all week: the problem wasn’t MySQL, wasn’t Redis, wasn’t “the network”.

It was concurrency.

Or, more precisely: it was my assumptions about concurrency in PHP.

You probably know this feeling. A production graph quietly slopes upward: response times creep from 120ms to 800ms, then occasionally spike into “why is everything on fire” territory. Nothing obvious in the logs. CPU is… fine-ish. The database is a little grumpy, but not “I’m dying” levels.

And then someone asks the question:

“How many requests can one of our PHP workers actually handle at the same time?”

Silence.

We think we know. We’ve seen “WEB_CONCURRENCY” on Heroku, pm.max_children in PHP‑FPM, maybe even played with ReactPHP or Amp and said the words “event loop” out loud in a meeting.

But under all of that, PHP is still… PHP: a request comes in, code runs top to bottom, script ends. Clean, predictable, boring.

So how do we square this simple mental model with real-world load, job queues, async HTTP calls, and users who apparently like to hammer “Refresh” like it’s a minigame?

Let’s peel this back slowly. Not with benchmarks and academic diagrams, but with the stuff you and I actually touch: nginx, PHP‑FPM, jobs, I/O waits, and that one endpoint that randomly turns into molasses.

The first lie: “PHP is single-threaded, so it has no concurrency”

You’ve heard this, right?

“PHP is single-threaded. That’s why Node is better at concurrency.”

It’s one of those statements that’s technically not wrong and yet completely misleading.

Here’s the nuance:

A single PHP request (that usual index.php in your web app) is basically:
- single-threaded,
- synchronous, and
- blocking on I/O by default.
A PHP application in production is almost never just one request. Even the simplest setup with PHP‑FPM is actually:
- multiple PHP worker processes,
- each handling one request at a time,
- all sitting behind a web server (nginx, Apache) that accepts many connections concurrently.

That means:

Within one process: zero concurrency, unless you explicitly opt into it (Fibers, parallel, event loops, etc.).
Across many processes: your app absolutely has concurrency. Ten workers → ten requests in flight.

So when someone says “PHP has no concurrency”, what they usually mean is:
“PHP code, as we normally write it, does not interleave multiple tasks inside one process.”

And that’s true.

But if your production app is slow, it’s rarely because of some theoretical lack of threads. It’s usually because of something much more mundane:

Not enough PHP‑FPM workers.
Too much memory per process.
Blocking I/O in the wrong place.
A job queue that fans out tasks without understanding limits.

This is why on platforms like Heroku, they talk explicitly about PHP application concurrency as a function of memory_limit and FPM worker count: you get as many concurrent requests as you can afford to keep in RAM.

You want more concurrency? You pay with memory, CPU, or complexity. Often all three.

The second lie: “Just scale horizontally”

We’ve all said it in a retro at some point:

“If things get slow, we’ll just add more servers.”

And honestly, for PHP that’s not terrible advice. The “shared-nothing, stateless request” model is very good at horizontal scaling. Spin up more containers/dynos/VMs, point the load balancer, done.

Except.
You can feel it in your stomach when you say “just”.

Horizontal scaling is not magic. It doesn’t fix:

per-request slowness,
database bottlenecks,
queues that pile up faster than they drain,
bad worker configuration (too few or too many).

And there’s something else: horizontally scaling can hide concurrency problems by throwing money at them until the bill arrives.

The deeper question for us, as PHP developers, is not “can we scale”, but:

“What are the inherent concurrency limitations of our stack, and how do we work with them instead of against them?”

So let’s talk about where PHP actually struggles, not just in theory, but in the quiet, real ways that show up as “timeouts” and “job stuck” and “why does staging feel fine but prod melts”.

Where PHP really hits its limits

1. One request, one worker, one world

In typical PHP‑FPM land:

One incoming HTTP request is:
- assigned to one worker process,
- that worker runs your entire script,
- and does not touch any other request until it’s done.

That means:

If your code does file_get_contents('https://remote-api') and that API stalls for 5 seconds, your worker is blocked for 5 seconds. No other request will use that process.
If you hit a memory spike in one request, that request dies with a fatal error once it crosses memory_limit, and if the box itself runs out of RAM the kernel may kill the whole worker.

In isolation, that’s fine. Modern servers can run dozens of workers. But under load?

Picture this:

You have 8 PHP‑FPM workers.
A new feature does three slow curl_exec() calls to an upstream service, sequentially.
That endpoint suddenly becomes popular.

It doesn’t take a flamegraph to see what happens next:

Those workers fill up with slow requests.
New incoming requests pile up in the web server’s queue.
Eventually you hit gateway timeouts and everybody starts staring angrily at Grafana.

The limitation here is not theoretical. It’s painfully physical:

A blocked worker is dead weight until it returns.

There is no automatic “switch to something else while you wait” like you’d get in Node.js with async I/O, unless you build that yourself.
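To put rough numbers on the scenario above, here is a back-of-the-envelope sketch. The 1-second latency per upstream call is my assumption (the scenario doesn’t give one), and maxRequestsPerSecond is a made-up helper name, not part of PHP or PHP‑FPM.

```php
<?php
/**
 * Upper bound on throughput for a pool of blocking workers.
 * Each worker handles one request at a time, so it completes
 * at most 1 / $secondsPerRequest requests per second.
 */
function maxRequestsPerSecond(int $workers, float $secondsPerRequest): float
{
    return $workers / $secondsPerRequest;
}

// Assumed: 8 workers, three sequential upstream calls of about 1 s each.
$slow = maxRequestsPerSecond(8, 3 * 1.0);

// Same pool when the endpoint answers in 120 ms.
$fast = maxRequestsPerSecond(8, 0.120);

printf("slow endpoint: %.2f req/s, fast endpoint: %.2f req/s\n", $slow, $fast);
// prints "slow endpoint: 2.67 req/s, fast endpoint: 66.67 req/s"
```

Same hardware, same worker count, and the ceiling drops from about 67 requests per second to fewer than 3. Anything arriving faster than that waits in line.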

2. Memory as a concurrency throttle

This is the part a lot of us ignore… until a platform like Heroku forces us to care.

Each PHP worker process has its own memory usage. And there’s a simple, brutal relationship:

More memory per process → fewer workers → less concurrency.
Less memory per process → more workers → more concurrency.

On Heroku specifically, they make this explicit:

memory_limit is used to calculate how many PHP‑FPM children (pm.max_children) they can safely spawn for a dyno.
Halve the memory_limit → roughly double the workers.

That’s not a Heroku quirk; it’s just math.

On your own server, it’s the same dynamic:

1 GB of RAM
Each PHP process effectively uses ~100 MB under load
At best that is about 10 workers. You can’t safely run 50: you’ll thrash swap and everything will die in slow motion.
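The arithmetic is simple enough to write down. A minimal sketch, assuming 1 GB means 1024 MB and that you want to hold some RAM back for the OS and web server; the 256 MB reserve and the function name maxWorkersForMemory are illustrative choices of mine, not a recommendation.

```php
<?php
/**
 * How many workers fit in RAM, after setting aside a reserve for the OS,
 * the web server, and anything else running on the box.
 */
function maxWorkersForMemory(int $totalMb, int $perWorkerMb, int $reservedMb = 0): int
{
    if ($perWorkerMb <= 0) {
        throw new InvalidArgumentException('Per-worker memory must be positive.');
    }

    return max(0, intdiv($totalMb - $reservedMb, $perWorkerMb));
}

echo maxWorkersForMemory(1024, 100), "\n";      // 10: every last megabyte spent on PHP
echo maxWorkersForMemory(1024, 100, 256), "\n"; // 7: with 256 MB held back
echo maxWorkersForMemory(1024, 50, 256), "\n";  // 15: halve the per-worker footprint
```

The last two lines are the “halve the memory, roughly double the workers” rule in action.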

So the concurrency limitation is: how fat is each request?

Are you loading 50 MB CSVs into memory?
Hydrating thousands of ORM entities into arrays that live forever in the request?
Using image processing that doubles, triples, quadruples memory for a while?

Every time you say “ah, it’s fine, we have RAM”, you’re deciding—quietly—how many concurrent requests your app can actually handle.

You can try to crank pm.max_children to 100 and feel fast in benchmarks.
Then a real workload arrives, memory spikes, and your kernel becomes a full-time page-swapping enthusiast.

3. The built-in server fantasy

Some of us had this phase: “What if I just used the built‑in PHP server in production? Simple, pure, neat.”

That server, as the PHP manual itself warns, runs a single single-threaded process by default:

One request at a time.
Next request waits for the previous one to finish.
No concurrency at all unless you set the experimental PHP_CLI_SERVER_WORKERS environment variable (PHP 7.4+).

It’s perfect for local dev, and absolutely the wrong place to learn about PHP’s real concurrency behavior.

If you’ve ever wondered why your app feels blazing-fast locally and then chokes when 50 users hit it in staging, part of the answer is:

“You’ve never actually seen your code under real concurrency constraints.”

On your laptop, you send requests one at a time. In production, 200 people and five cron jobs show up at once.

The limitations only reveal themselves when everything is happening together.

Concurrency inside one PHP process: promises, generators, fibers

At some point, frustrated with blocking I/O, you find your way to things like:

ReactPHP
Amp
Fibers (PHP 8.1+)
The parallel extension
Or some brave homegrown generator-based scheduler

And suddenly, PHP doesn’t feel so “single-threaded” anymore.

You write something like this (pseudocode: getAsync and awaitAll stand in for whatever your library provides):

$promises = [
    $httpClient->getAsync('https://api-1'),
    $httpClient->getAsync('https://api-2'),
    $httpClient->getAsync('https://api-3'),
];

$responses = awaitAll($promises);

Or you use Fibers to interleave execution:

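$fiber = new Fiber(function (): void {
    echo "task: started\n";
    Fiber::suspend();   // pause here and hand control back to the caller
    echo "task: resumed and finished\n";
});
$fiber->start();        // runs the task up to Fiber::suspend()

// Do other things here, then resume:
$fiber->resume();       // continues the task after the suspend point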

It almost feels like multithreading. But it’s not.

Under the hood:

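With Fibers or an event loop, you still have one OS thread executing PHP. (The parallel extension is the exception: it runs real threads and needs a thread-safe ZTS build of PHP.)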
At any given time, only one fiber or coroutine is running.
Concurrency here means: you can pause one task while it waits on I/O and resume another.

This is where the distinction your brain probably glossed over becomes important:

Concurrency: dealing with multiple things at once.
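Parallelism: literally running multiple things at the same time on different CPU cores.

Fibers and event loops give you concurrency. Parallelism in PHP still comes from running more processes, or from real threads via the parallel extension.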
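If “pause one task, resume another” still feels abstract, here is a toy round-robin scheduler built on Fibers (PHP 8.1+). It is a sketch of the idea only: runInterleaved is a name I made up, and each Fiber::suspend() stands in for the spot where real code would be waiting on I/O. Real event loops like ReactPHP or Amp are far more involved.

```php
<?php
/**
 * Run several tasks round-robin on a single thread until all have finished.
 * $tasks maps a task name to the number of steps that task performs.
 * Returns the order in which the steps actually ran.
 */
function runInterleaved(array $tasks): array
{
    $log = [];
    $fibers = [];

    foreach ($tasks as $name => $steps) {
        $fibers[$name] = new Fiber(function () use ($name, $steps, &$log): void {
            for ($i = 1; $i <= $steps; $i++) {
                $log[] = "$name:$i";
                Fiber::suspend(); // pretend we are waiting on I/O here
            }
        });
    }

    // Start every task; each one runs until its first suspend.
    foreach ($fibers as $fiber) {
        $fiber->start();
    }

    // Keep resuming whatever is still unfinished, one task at a time.
    while ($fibers !== []) {
        foreach ($fibers as $name => $fiber) {
            if ($fiber->isTerminated()) {
                unset($fibers[$name]);
            } else {
                $fiber->resume();
            }
        }
    }

    return $log;
}

echo implode(' ', runInterleaved(['A' => 2, 'B' => 2])), "\n";
// prints "A:1 B:1 A:2 B:2"
```

The steps interleave, but nothing ever runs at the same instant: there is exactly one thread, and it hops between tasks at the suspend points.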

Where PHP’s model quietly helps you

It’s easy to complain about PHP as the slow, blocking, “boomer” language of the web. But the single-request, single-process model also gives you something incredibly valuable:

Isolation: One bad request dies alone, without poisoning others.
Simplicity: No shared mutable state across threads. No mutexes. No deadlocks at 3 AM.
Predictability: Every request starts from a clean slate. Memory leaks don’t slowly pile up across hours of runtime.

For hiring managers on a platform like Find PHP, this matters more than we admit. When you bring in a PHP developer, you’re buying into this operational model:

Scale horizontally by adding more workers/dynos/containers.
Keep each request or job simple, isolated, predictable.
Use infrastructure-level tuning (FPM, memory_limit, queues) to control concurrency.

It’s not as trendy as async/await everything. It also breaks less in subtle, nightmarish ways.

The trade-off is:

You don’t get magical, transparent concurrency inside one process.
You have to be explicit about where and how you handle multiple things at once.

Which brings us to the practical side: if these are the limitations, what do we do with them?

Practical ways to stop fighting PHP’s concurrency model

Let’s go from monitors and theories to concrete knobs and habits.

1. Tune PHP‑FPM like it actually matters (because it does)

So many PHP apps are “tuned” by accident: default configs copy-pasted from Stack Overflow, maybe changed once in 2017.

This is where concurrency is born or strangled.

Some key pieces:

pm (process manager): static, dynamic, or ondemand.
pm.max_children: the max number of worker processes = max concurrent requests.
memory_limit: the per-request memory cap (a php.ini setting, but it bounds how big each worker can get).

The mental model:

Figure out how much memory one worker uses under typical load.
Multiply by desired workers.
Stay within the RAM actually available.

Rough sketch:

Deploy.
Run a realistic load test (not just ab on /health).
Use top / ps / your host’s metrics to see average memory per php-fpm process.
Compute how many you can fit without swapping.
Adjust pm.max_children and memory_limit accordingly.
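For the “see average memory per process” step, a small sketch that summarises ps output. I’m assuming Linux, a process name of php-fpm (some distros use names like php-fpm8.2), and output from something like ps -o rss= -C php-fpm, which prints one resident-set size in KiB per line. The sample numbers are made up, and summariseRss is my own helper name. Keep in mind RSS counts shared memory such as OPcache once per process, so it overstates the real footprint a little.

```php
<?php
/**
 * Summarise per-process memory from `ps -o rss=` output
 * (one KiB value per line, possibly padded with spaces).
 */
function summariseRss(string $psOutput): array
{
    $lines = preg_split('/\R/', trim($psOutput), -1, PREG_SPLIT_NO_EMPTY);
    $kib = array_map(static fn (string $line): int => (int) trim($line), $lines);

    if ($kib === []) {
        return ['count' => 0, 'avg_mb' => 0.0, 'max_mb' => 0.0];
    }

    return [
        'count'  => count($kib),
        'avg_mb' => round(array_sum($kib) / count($kib) / 1024, 1),
        'max_mb' => round(max($kib) / 1024, 1),
    ];
}

// In practice (assumed command): $psOutput = shell_exec('ps -o rss= -C php-fpm');
$sample = " 81920\n102400\n 92160\n";

print_r(summariseRss($sample)); // count 3, avg_mb 90, max_mb 100
```

I’d size pm.max_children against max_mb rather than avg_mb: the average tells you what a typical request costs, the maximum tells you what happens on a bad day.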

On platforms like Heroku, this computation is done for you: the number of workers (WEB_CONCURRENCY) is derived from the dyno’s available RAM and your memory_limit. They even warn: if some other buildpack sets WEB_CONCURRENCY with Node-like assumptions, your PHP app will underperform badly.

The limitation here isn’t PHP; it’s us, pretending PHP‑FPM settings are “ops stuff” and not our concern.

Good PHP developers—good PHP teams—treat FPM config as part of the codebase’s performance story, not an afterthought.

2. Be honest about what is actually blocking

Have you ever profiled a “slow” request and realized most of its time is spent in:

curl_exec()
Doctrine fetching a huge result set
S3 downloads
A slow internal HTTP API

…not in the PHP code itself?

From the outside, it just looks like “The PHP app is slow.”

Under the hood, it’s “The PHP app is patiently waiting, blocking a worker.”

Some practical moves:

Reduce the number of external calls per request.
Add timeouts and fallbacks for network calls (nothing holds a worker hostage like a missing timeout).
Batch DB queries and limit result sizes.
For truly chatty flows with upstream APIs, consider background jobs instead of doing everything synchronously in the request/response cycle.
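The timeout-and-fallback advice looks roughly like this with the curl extension. A minimal sketch: the 2-second and 5-second limits are arbitrary defaults I picked, fetchWithTimeout is a hypothetical helper, and a real version would probably also check the HTTP status code and log the failure.

```php
<?php
/**
 * GET a URL with hard time limits.
 * Returns the response body, or $fallback if the transfer fails or times out.
 */
function fetchWithTimeout(
    string $url,
    string $fallback,
    int $connectSeconds = 2,
    int $totalSeconds = 5
): string {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $connectSeconds, // give up if we cannot connect in time
        CURLOPT_TIMEOUT        => $totalSeconds,   // cap on the whole transfer
    ]);

    $body = curl_exec($ch); // string on success, false on any curl-level failure

    return is_string($body) ? $body : $fallback;
}

// Usage (URL taken from the example earlier in the article):
// $json = fetchWithTimeout('https://remote-api', '{"status":"unavailable"}');
```

With this in place, the worst case for a stalled upstream is 5 seconds of a blocked worker, not “however long the other side feels like”.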

This is where event-loop tools like ReactPHP or Amp can shine:

If you must call multiple APIs from one request, make them concurrently with non-blocking I/O and await them together.
But adopt these consciously. They add cognitive overhead and deserve their own architecture conversations.
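You can try the “make the calls concurrently and await them together” idea without adopting a framework. To be clear, the sketch below is not ReactPHP or Amp: it uses the curl_multi functions that ship with the curl extension. fetchConcurrently is a name I made up, and the timeouts are arbitrary.

```php
<?php
/**
 * Fetch several URLs at the same time from one PHP process.
 * Returns an array with the same keys as $urls; each value is the
 * response body, or null if that transfer failed at the curl level.
 */
function fetchConcurrently(array $urls, int $totalSeconds = 5): array
{
    $multi = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_CONNECTTIMEOUT => 2,
            CURLOPT_TIMEOUT        => $totalSeconds,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000); // nothing to wait on yet; avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Read the per-transfer result codes reported by the multi handle.
    $codes = [];
    while (($info = curl_multi_info_read($multi)) !== false) {
        $key = array_search($info['handle'], $handles, true);
        if ($key !== false) {
            $codes[$key] = $info['result'];
        }
    }

    $results = [];
    foreach ($handles as $key => $ch) {
        $ok = ($codes[$key] ?? null) === CURLE_OK;
        $results[$key] = $ok ? curl_multi_getcontent($ch) : null;
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $results;
}

// Usage (URLs from the pseudocode earlier in the article):
// $responses = fetchConcurrently(['one' => 'https://api-1', 'two' => 'https://api-2']);
```

The worker is still blocked while this runs, but for roughly as long as the slowest call rather than the sum of all of them.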

3. Use queues as a pressure valve, not a trash bin

A common pattern:

“Requests are too slow? Throw everything into a queue.”
“Workers will process jobs in the background. Problem solved.”

Except:

Those workers are also PHP processes.
They also have concurrency limits.
They can also get overwhelmed and start lagging hours behind.

If you’re using queues (Laravel Horizon, Symfony Messenger, custom stuff):

Know how many workers you actually run.
Know how many jobs per second they can handle.
Know what happens during traffic spikes: does your queue length explode faster than workers drain it?
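Those three questions turn into arithmetic quickly. A rough sketch, assuming steady rates, which real traffic never has; the worker counts and job rates are invented for illustration, and backlogAfter is a hypothetical helper.

```php
<?php
/**
 * Estimated queue length after $seconds, given a steady arrival rate
 * and a pool of workers that each finish $jobsPerWorkerPerSecond.
 */
function backlogAfter(
    float $arrivalsPerSecond,
    int $workers,
    float $jobsPerWorkerPerSecond,
    int $seconds,
    int $initialBacklog = 0
): int {
    $drainPerSecond = $workers * $jobsPerWorkerPerSecond;
    $netChange = ($arrivalsPerSecond - $drainPerSecond) * $seconds;

    return max(0, (int) round($initialBacklog + $netChange));
}

// Assumed: 4 workers, each finishing 2 jobs/s, so 8 jobs/s in total.
echo backlogAfter(6.0, 4, 2.0, 600), "\n";  // 0: normal traffic, workers keep up
echo backlogAfter(20.0, 4, 2.0, 600), "\n"; // 7200: a 10-minute spike at 20 jobs/s
```

And once traffic drops back to 6 jobs/s, those 7,200 jobs drain at only 2 jobs/s of spare capacity, which is a full hour. That is how “lagging hours behind” happens with a perfectly healthy-looking worker pool.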

Queues are where concurrency limitations move when you “fix” the web layer.

It’s better than blocking users, but it’s still the same game: memory, CPU, number of processes.

4. Think in “flows”, not isolated endpoints

Concurrency problems rarely live inside one endpoint. They live in flows:

POST /checkout calls internal API A, which calls service B, which writes to a queue C.
Meanwhile GET /account hits the same DB table that’s being hammered by a cron job.
Meanwhile your queue workers are busy compressing 200 MB video uploads.

What looks like “intermittent latency” is really a concurrency story:

Several flows compete for the same resources.
Some of them are unexpectedly heavy.
You discover the limits live, with real users.

As a PHP dev, one underrated skill is to hold the whole flow in your head for a few minutes:

Where does this request go?
Which services does it call?
Which queues, DBs, caches does it touch?
What happens if 200 users do this at the same time, while a daily cron runs?

You start to see where concurrency will hit the ceiling before production tells you.

5. When you really need “modern async”

Sometimes you really do want Node-style, “let’s multiplex all the I/O things” behavior.

In the PHP world, that means:

ReactPHP / Amp for event loops and non-blocking I/O.
Fibers to make async code feel synchronous.
Or even Swoole/RoadRunner-type architectures that keep worker processes hot and handle many requests per process.

This is a different universe:

Your app logic might live in a long-running process.
You need to manage memory leaks, state, and lifecycle much more carefully.
Suddenly, concurrency bugs become possible inside your PHP code, not just at the infrastructure layer.
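Here is the kind of thing that bites first. The handler below is entirely hypothetical (ProfileHandler is not from any framework), and the loop only simulates a long-lived worker, but the pattern is real: a static cache that is harmless under FPM quietly becomes a leak once the process stops dying after each request.

```php
<?php
/**
 * Hypothetical handler with a static cache. Under PHP-FPM the cache starts
 * empty on every request; in a long-running worker it survives between them.
 */
final class ProfileHandler
{
    /** @var array<int, string> */
    private static array $cache = [];

    public static function handle(int $userId): string
    {
        // Never evicted: grows with every distinct user this process sees.
        self::$cache[$userId] ??= "profile-for-$userId";

        return self::$cache[$userId];
    }

    public static function cacheSize(): int
    {
        return count(self::$cache);
    }

    public static function reset(): void
    {
        self::$cache = [];
    }
}

// Simulate one long-lived worker serving 1,000 requests in a loop.
for ($userId = 1; $userId <= 1000; $userId++) {
    ProfileHandler::handle($userId);
}
echo ProfileHandler::cacheSize(), "\n"; // 1000 entries still held in memory

ProfileHandler::reset();                // what a per-request cleanup would have to do
echo ProfileHandler::cacheSize(), "\n"; // 0
```

The same static property can also hand one user’s data to the next request if the key is wrong, which is a much worse day than a slow memory climb.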

It’s powerful, but it asks something of you:

“Are you willing to take responsibility for the complexity that comes with this kind of concurrency?”

In many teams, the honest answer is: not yet. Or not everywhere.

A balanced approach I’ve seen work:

Keep the main web app on the familiar FPM model.
Use specialized services (often in PHP, sometimes not) for workloads that truly benefit from async concurrency.
Hire or grow developers who can straddle both worlds thoughtfully.

On a platform like Find PHP, you can tell when a candidate understands this. They don’t just say “Use ReactPHP.” They ask:

“What’s your ops model?”
“Who maintains this over time?”
“How comfortable is the team with long-lived PHP processes?”
“Is the complexity worth it for the actual bottlenecks you have?”

That’s concurrency literacy, not trend-chasing.

The quiet skill: knowing when “enough” concurrency is enough

Behind all the knobs and theory, there’s a human thing going on:

You, late at night, deciding whether to reduce memory_limit to fit more workers.
You, pushing back gently when someone wants to add three API calls inside a request with no timeout.
You, drawing a rectangle on a whiteboard labelled “queue” and writing “this is not a magic bucket” under it.

Concurrency limitations in PHP aren’t just technical constraints. They’re boundaries that shape the experience of using and building with PHP:

It nudges us toward isolation and statelessness.
It nudges us toward horizontal scale, not clever threads.
It asks us to be honest about memory, I/O, and the cost of “simple” blocking calls.

You don’t have to become an operating systems expert. But over time, the developers who stand out on teams and job platforms are often the ones who:

Understand both the promise and the price of concurrency tools.
Can read a PHP‑FPM config and tell a story about what it means in practice.
See beyond “the endpoint is slow” to “this is where our model of concurrency is cracking.”

And they’re usually the ones quietly making sure that when traffic spikes, the graphs bend but don’t break.

Some evenings, that just looks like re-running ab on a tuned endpoint and watching the p95 drop by 200ms. No fireworks, no big speeches, just a small feeling in your chest that the system breathes a little easier now.

That’s often all concurrency really is for us in PHP land: not a flashy superpower, but the quiet art of giving our code, our workers, and our future selves just enough room to move.
