First, I’d like to preface this by saying that PHP is a complicated piece of software and that I am by no means an expert when it comes to PHP internals. If you find something wrong with this article, please let me know!

My current employer introduced me to using nginx with PHP-FPM over FastCGI. I quickly became a fan of nginx because of its small footprint and, in my opinion, cleaner, more modular design and implementation, and I soon switched to nginx/PHP-FPM as my main means of deploying PHP web applications. I recently became interested in the FastCGI protocol itself and in how exactly PHP-FPM works. This led me to investigate PHP internals and the different methods one may use to get PHP code to run in a meaningful manner. This article will attempt to give a high-level overview of what the SAPI layer is, why it’s important, a few common SAPI modules, and the very basics of how it works.

The PHP developers recognized fairly early the need for PHP code to run in different contexts and environments, and implemented a layer that allows the environment to be swapped out easily while still using the same Zend Engine and PHP core code. This layer is called the SAPI layer, or Server Application Programming Interface. It is a perfect example of the good programming practice of allowing flexibility and extensibility through modular design. For example, the PHP developers probably had no idea when they first designed this layer that it would one day be used to process PHP scripts in a separate process, given information from a webserver over FastCGI, and then pass the results back to the webserver over the same FastCGI transport. However, thanks to this modular design, such a SAPI module exists and is called PHP-FPM. I would wager that almost all distributions of PHP come with at least two SAPI module implementations, though you may not know them by that name.
Two of the most popular PHP SAPI modules are the CLI (command line interface) and mod_php for Apache. The CLI SAPI, of course, allows one to run a binary, usually named ‘php’, and execute PHP scripts passed in as arguments on the command line, with the output rendered to the standard output (stdout) and standard error (stderr) streams on the console. The mod_php SAPI allows the Apache webserver to service requests for specific PHP resources: PHP reads the PHP file, tokenizes and executes it, and the output (usually HTML, of course) is captured and sent back to the web client. The mod_php SAPI is also able to communicate certain information about the request (query string, POST data, server host name, remote client IP address, etc.) from the webserver to PHP so that PHP can use this information while processing the script. All of this is accomplished through the SAPI layer.

So, now that I think we have a good idea of what it is and why it is important, let’s take a little look at how the basics work. I would like to introduce the sapi_module_struct. This struct is a bit large and probably rather daunting at first, but it’s really not all that bad once we get to know it. I’d also like to say that unless you are a developer looking to develop your own SAPI module, or just a curious fellow/gal, you should probably go on about your business at about this point. Its definition comes straight out of the PHP source distribution under the path ‘main/SAPI.h’.
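The real definition is long and its exact members and signatures vary between PHP versions, so as a rough orientation here is a heavily trimmed toy model rather than a copy of main/SAPI.h. It covers only the members discussed in this article. The names toy_sapi_module, toy_cli_ub_write, toy_cli_module and toy_echo, as well as the simplified signatures, are assumptions made for illustration; only the "cli" name and the "Command Line Interface" pretty name mirror what the CLI SAPI reports.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy model, NOT the real sapi_module_struct: only the members this
 * article talks about, with simplified (assumed) signatures.
 */
typedef struct toy_sapi_module {
    const char *name;         /* short identifier */
    const char *pretty_name;  /* what phpinfo() shows as "Server API" */

    int (*startup)(struct toy_sapi_module *self);   /* once per process */
    int (*shutdown)(struct toy_sapi_module *self);  /* once per process */

    int (*activate)(void);    /* start of every request */
    int (*deactivate)(void);  /* end of every request */

    size_t (*ub_write)(const char *str, size_t len);        /* script output */
    int    (*send_headers)(const char *const *headers, size_t n);
    char  *(*read_cookies)(void);                            /* raw cookie string */
    size_t (*read_post)(char *buf, size_t count);            /* fill buf with body */
} toy_sapi_module;

/* A CLI-flavoured output hook: write everything to stdout. */
size_t toy_cli_ub_write(const char *str, size_t len)
{
    return fwrite(str, 1, len, stdout);
}

/* In this toy model, hooks the module does not need are left NULL. */
toy_sapi_module toy_cli_module = {
    .name        = "cli",
    .pretty_name = "Command Line Interface",
    .ub_write    = toy_cli_ub_write,
};

/* Emit a string through whichever output hook the module installed. */
size_t toy_echo(toy_sapi_module *m, const char *s)
{
    assert(m != NULL && m->ub_write != NULL);
    return m->ub_write(s, strlen(s));
}
```

Calling toy_echo(&toy_cli_module, "hello\n") from a main function writes the string to stdout. Swapping in a different ub_write function pointer is all it takes to send the same output somewhere else, which is the whole point of the layer.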
Okay, like I said, there’s a bit much to swallow here. Let’s just hit on a few key members and callbacks in this article, and hopefully we can get more specific and explore further in future articles.

First, name and pretty_name are pretty self-explanatory. I will say that pretty_name is what gets displayed in phpinfo() as Server API.

It seems to me that the startup and shutdown callbacks in the struct are, in most cases, thin wrappers around php_module_startup and php_module_shutdown and are fairly simple in their implementation. The startup callback needs to be called after sapi_startup(), and the shutdown callback needs to be called before sapi_shutdown(). I still need to dig a little deeper to discover everything that goes on at this stage.

Next are the activate and deactivate callback functions. Activate is called during php_request_startup(), which is run once at the beginning of each request. Deactivate is called at the end of each request; as far as I can tell from the source, it is invoked from within php_request_shutdown() by way of sapi_deactivate(). We use these hooks to perform any work that needs to happen before and after each request executes. For example, in the deactivate callback of my simple FastCGI SAPI, I send out my FCGI_END_REQUEST over the transport and close the connection, which signals to the webserver that there is no further output for this request.

ub_write is a very important callback which gets called during execution of the PHP script whenever PHP wants to write data out. In the CLI SAPI, ub_write in most cases simply writes the data passed into the callback to stdout. In contrast, ub_write in the PHP-FPM SAPI enqueues the data to be sent over the wire in an FCGI_STDOUT record back to the webserver. This callback is where we decide how to handle the data that PHP is outputting.

The send_headers callback gets passed a pointer to a sapi_headers_struct, where one might loop over the headers and modify or add to them while writing them out.
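To make the ub_write and deactivate descriptions above a little more concrete, here is a minimal sketch of how a FastCGI-style SAPI might frame its output. The record layout (an 8-byte header, type 6 for FCGI_STDOUT, type 3 for FCGI_END_REQUEST) follows the FastCGI specification. Everything prefixed with toy_ is hypothetical and is not PHP-FPM’s actual code: the sketch appends records to an in-memory buffer instead of a socket, uses a fixed request id of 1, and assumes padding is never needed. It also sends an empty FCGI_STDOUT record before FCGI_END_REQUEST, which is how the specification marks the end of a stream.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FCGI_VERSION_1   1
#define FCGI_END_REQUEST 3
#define FCGI_STDOUT      6
#define FCGI_HEADER_LEN  8

/* Stand-in for the socket: records are appended here (hypothetical). */
uint8_t toy_wire[256];
size_t  toy_wire_len = 0;
const uint16_t toy_request_id = 1;

/*
 * Write one record (8-byte header + payload) at out.
 * Returns the number of bytes written, or 0 if it does not fit.
 */
size_t toy_fcgi_frame(uint8_t *out, size_t cap, uint8_t type,
                      uint16_t request_id, const void *data, size_t len)
{
    if (len > 0xFFFF || cap < FCGI_HEADER_LEN + len)
        return 0;
    out[0] = FCGI_VERSION_1;
    out[1] = type;
    out[2] = (uint8_t)(request_id >> 8);    /* requestIdB1 */
    out[3] = (uint8_t)(request_id & 0xFF);  /* requestIdB0 */
    out[4] = (uint8_t)(len >> 8);           /* contentLengthB1 */
    out[5] = (uint8_t)(len & 0xFF);         /* contentLengthB0 */
    out[6] = 0;                             /* paddingLength */
    out[7] = 0;                             /* reserved */
    if (len > 0)
        memcpy(out + FCGI_HEADER_LEN, data, len);
    return FCGI_HEADER_LEN + len;
}

/* ub_write-style hook: wrap script output in one FCGI_STDOUT record. */
size_t toy_ub_write(const char *str, size_t len)
{
    size_t n;
    if (len == 0)
        return 0;  /* an empty STDOUT record would mean "end of stream" */
    n = toy_fcgi_frame(toy_wire + toy_wire_len, sizeof toy_wire - toy_wire_len,
                       FCGI_STDOUT, toy_request_id, str, len);
    toy_wire_len += n;
    return n ? len : 0;
}

/* deactivate-style hook: close the stdout stream, then end the request. */
int toy_deactivate(void)
{
    /* Body: 4-byte appStatus, 1-byte protocolStatus (0 = complete), 3 reserved. */
    static const uint8_t body[8] = {0};
    size_t n;

    n = toy_fcgi_frame(toy_wire + toy_wire_len, sizeof toy_wire - toy_wire_len,
                       FCGI_STDOUT, toy_request_id, NULL, 0);
    if (n == 0)
        return -1;
    toy_wire_len += n;

    n = toy_fcgi_frame(toy_wire + toy_wire_len, sizeof toy_wire - toy_wire_len,
                       FCGI_END_REQUEST, toy_request_id, body, sizeof body);
    if (n == 0)
        return -1;
    toy_wire_len += n;
    return 0;
}
```

In a real SAPI the buffer would be flushed to the connection and the connection closed afterwards, and output larger than 65535 bytes would have to be split across several records. The sketch only shows where the framing decisions live.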
The read_cookies callback simply returns a string, which gets parsed and used to populate the $_COOKIE PHP superglobal. In my simple FastCGI SAPI and in PHP-FPM, we pull this from the FCGI_PARAMS that have been sent across, under the key HTTP_COOKIE, and return it.

The read_post callback gets called (usually multiple times) and is passed a buffer to fill along with the number of bytes available in that buffer. In PHP-FPM, the data is drained from the FCGI_STDIN records in which the webserver forwards the POST data sent by the web client, and is written to the buffer accordingly. This is then parsed by PHP and used to populate the $_POST superglobal.

As you can see, all of these callbacks provided by the SAPI layer allow us to control how we pass certain data to PHP and how we handle the output generated by PHP. Again, this allows us to easily swap out the context or environment in which we run PHP code and harness the power of PHP in many different ways.

This has been just a brief introduction to how some of these things tie together to get the job done. All of this information was gleaned from other SAPI modules and from inspecting PHP core code. I still have a long way to go in understanding all the nuances and subtleties of the SAPI layer, and I look forward to bringing you more information about developing PHP SAPI modules.

Also, it has come to my attention that there is absolutely no documentation on how to go about writing one of these SAPI modules (other than the source code of existing ones), and I’d like to change that. I will be working on some tutorials for jumping into SAPI module development as I progress in my own endeavors in this area. Keep an eye out for them! Thanks for reading.