Turn Your PHP Application into a Scalable Docker Image with Polyglot Images
A few years ago the company I was working for was migrating to Google Cloud and, as part of the company-on-company schmoozefest, our Google rep offered us some top-tier passes to the Cloud NEXT 2017 conference. After signing up for every possible talk being given by Kelsey Hightower, I browsed the schedule for other interesting things to fill my time. One that caught my eye was the lone talk with the word ‘PHP’ in the title, something truly rare in the high-falutin’ Cloud Startup 3.0 Agile Synergy Buzzword crowd. The subject was using Google Kubernetes Engine to deploy scalable PHP apps.
This caught my eye because:
- I’ve been writing PHP since roughly 2002.
- My job at the time was to wrangle literally thousands of disparate LAMP stacks.
- I viewed Kubernetes as my way out of that particular nightmare.
Fast-forward to the talk itself. I’m going to do my best not to speak ill of the presenters and their presentation; it was a tough subject and a tough crowd. However, the one unforgivable thing was that the lone PHP app container that they used in their demo was an agglomeration of Apache and mod_php, of all things. They also used WordPress, but let’s not dwell on that particular sin at the moment.
During the question phase someone asked the obvious:
“Why did you bundle Apache and PHP into the same container?”
And was given the worst possible answer:
“Oh, it’s just easier that way. Next question.”
[expletives deleted]
This ran counter to everything that I was led to believe that Docker demanded. One container, one service. It makes no sense to build an “omnibus” container with Apache and PHP in lockstep like that, no matter how lean Apache might be these days. Nor is it any more sane to try to figure out the magic number of PHP workers per Apache instance for resource limits to work without problems or waste.
I was really put off by that whole debacle, and it wasn’t until I was leaving the company in Fall 2018 and ticking things off my bucket list that I came back to tackle this K8s proof-of-concept to leave behind as part of my legacy.
My main goals were:
1. Segregate the HTTP server and CGI components into separate, scalable services.
The first thing I did was pull in an Apache image, a PHP-FPM image, and cobble them together into a basic K8s deployment to accomplish Point 1. Static files were served directly by the Apache side, and non-static requests were piped over to the FPM containers.
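For the curious, the static/dynamic split can be sketched roughly like this. This is not the config from my PoC, just a minimal illustration: the ConfigMap name `apache-fpm-proxy`, the Service hostname `php-fpm`, and the `/var/www/html` path are all assumptions, and Apache needs `mod_proxy` and `mod_proxy_fcgi` loaded for it to work.

```yaml
# Hypothetical ConfigMap carrying an Apache config fragment.
# All names here are placeholders, not taken from the original PoC.
apiVersion: v1
kind: ConfigMap
metadata:
  name: apache-fpm-proxy
data:
  php-fpm.conf: |
    # Static files are served by Apache straight from the DocumentRoot.
    DocumentRoot "/var/www/html"
    # Anything ending in .php is handed to the FPM Service over FastCGI.
    # The path after the port must match where FPM sees the app code.
    ProxyPassMatch "^/(.*\.php(/.*)?)$" "fcgi://php-fpm:9000/var/www/html/$1"
```

The fragment would be mounted into the Apache container's config directory; the FPM side needs nothing special beyond listening on a TCP port that other pods can reach.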
It was beautiful. If I were to start writing an app from scratch this is probably the way I’d structure the platform. However, with the existing app having no concept of keeping static and dynamic content separate [heavily the opposite, actually] I was constrained by Point 2…
2. Minimize the amount of application change necessary to dockerize the application, but without negating the benefits of the new platform.
Part of the reason why I’d found myself a new job was that I had long since shouted myself hoarse about such divides and the benefits that this and many other things would bring. Talk to enough people working in tech and you’ll find a lot of people similarly bound by history and organizational inertia, no matter how popular blog posts like “How I changed my Fortune 500 company’s entire application stack overnight with this one weird trick! [Sysadmins hate him]” might be.
My first solution to this problem seemed so obvious that I couldn’t believe that no one had thought of it before. All I needed to do was pack all the app code into a versioned bundle, push it into an object store, and then pull that into the container on initialization! Genius!
After stepping back to admire my creation I realized that I had just essentially re-invented a container registry, except much worse. “Ahh well,” I sighed, “I’ll just COPY the app source into both containers.” Which would have been all well and good, but for Point 3…
3. Minimize the amount of duplication between application images.
The way that the application and company were structured, we were looking at literally thousands of final images, and our app bundle was not anything I would describe as lean. Having app code layers on top of both the Apache and FPM images would wreak havoc down the line in terms of unnecessary storage and IO, both in the container registry and on the K8s nodes.
Inverting the problem by layering Apache and FPM on top of the app code wasn’t much better as we’d just wind up with several thousand Apache and FPM layers with slightly different hashes.
“Just how the hell am I supposed to build anything but an omnibus container?”
“I guess it’s time to get yelled at in #docker on IRC.”
After a few false starts over a few days I got into a very productive conversation with someone whose name, to my discredit, I never actually noted down. [Thank you, whoever you are]
“Why don’t you make a Polyglot image?”
“A What-y what?”
A Polyglot Image
In practice, a Polyglot image is virtually identical to the cursed Omnibus image, with Apache and FPM built into the same image, but with one crucial difference: Each service in the image has its own entry point.
The Dockerfile looks a little like this:
FROM centos:7

# this script handles the installation of apache and PHP, and the
# associated cleanup. [I like tidy Dockerfiles, ok]
COPY fs-overlay/tmp/container-init.sh /tmp/
RUN bash /tmp/container-init.sh

# fs-overlay contains all the relevant config and folder structure.
COPY fs-overlay/ /

WORKDIR /var/www/html
EXPOSE 80 9000
From here the client app images are built like:
FROM polyglot:1.0.0
COPY app/ /var/www/html
The Kubernetes deployments look like:
image: client-app:1.0.0
command: ["/usr/sbin/httpd", "-DFOREGROUND"]
and:
image: client-app:1.0.0
command: ["/usr/sbin/php-fpm", "-F", "-R"]
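Fleshed out into full manifests, those two fragments might look something like the sketch below. Only the `image` and `command` lines come from my actual setup; the Deployment names, labels, replica counts, and the `php-fpm` Service are assumptions made up for illustration.

```yaml
# Sketch only: names, labels, and replica counts are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client-app-httpd
spec:
  replicas: 2
  selector:
    matchLabels: {app: client-app, tier: httpd}
  template:
    metadata:
      labels: {app: client-app, tier: httpd}
    spec:
      containers:
        - name: httpd
          image: client-app:1.0.0
          # Overrides the image entrypoint: this pod runs only Apache.
          command: ["/usr/sbin/httpd", "-DFOREGROUND"]
          ports:
            - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client-app-fpm
spec:
  replicas: 4
  selector:
    matchLabels: {app: client-app, tier: fpm}
  template:
    metadata:
      labels: {app: client-app, tier: fpm}
    spec:
      containers:
        - name: fpm
          image: client-app:1.0.0
          # Same image, different process: this pod runs only PHP-FPM.
          command: ["/usr/sbin/php-fpm", "-F", "-R"]
          ports:
            - containerPort: 9000
---
# Gives the Apache pods a stable hostname for reaching the FPM pods.
apiVersion: v1
kind: Service
metadata:
  name: php-fpm
spec:
  selector: {app: client-app, tier: fpm}
  ports:
    - port: 9000
      targetPort: 9000
```

Because `command` replaces whatever entrypoint the image defines, the same `client-app:1.0.0` image yields two entirely different kinds of pod. One caveat: FPM has to listen on an address that other pods can reach rather than only on localhost, or the Apache side will never connect.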
With this one weird trick we’ve satisfied my 3 goals:
- In the Apache service group, call only the Apache entrypoint. In the FPM service group, call only the FPM entrypoint. Each container/group is running one service and can be scaled simply and automatically.
- The existing application can still do much of the same unsightly mixing of static and dynamic content it usually does, though in the documentation that I left accompanying the PoC I strongly recommended that stuff like that be addressed with cloud buckets, CDNs, and so forth.
- There is no duplication between the images, because there is only one polyglot image!
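As a rough illustration of the "scaled simply and automatically" part, an autoscaler for just the FPM side could look like the following. The Deployment name `client-app-fpm` and every number in it are assumptions rather than values from the PoC, and CPU-based scaling only works if the target containers declare CPU requests.

```yaml
# Hypothetical autoscaler for the FPM Deployment only; the Apache
# Deployment would get its own, with its own thresholds.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: client-app-fpm
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: client-app-fpm
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

This is exactly what the omnibus container makes awkward: there, scaling PHP workers means scaling Apache along with them, whether it needs it or not.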
While we’re only using a single image, we’re still running separate, process-specific containers that we can scale independently, and this sufficiently placates the Docker purist in me.