Methods for asynchronous processes in PHP 5.4


Dave Newson - April 15th 2012

Warning: This article is extremely out-of-date.


There's certainly more than one way to skin asynchronous tasks in PHP. Currently, one of the best resources on the subject is a thread on Stack Overflow about asynchronous PHP threads, which details a range of methods that can be used to achieve async processes.

A recent work project involved the need for an asynchronous task processor in PHP; the requirement was that a very large CSV file be uploaded by a user and processed by PHP. The processing itself was expected to take a long time for larger files, so we wanted to avoid having the user sitting on a blank page while the browser said “Connecting…” for hours on end as the file was processed. We also wanted to limit the system so that only one import could run at a time, and so that other users, as well as you, could see the progress of your import.

One option would be a frequently running cron job, which ran every minute or so and processed the file once it had been uploaded, but that would incur a delay, and it's inefficient to run a script constantly just because you want an occasional background task. PHP is getting pretty mature now, so there must be a good way to do this, right? Right..?

shell_exec()

The shell_exec() function allows us to run shell commands on the server from within PHP; this can be used to run the PHP CLI (Command Line Interface), and with a few extra commands we can send the CLI to the background.

$cmd = 'nohup nice -n 10 /usr/bin/php -c /path/to/php.ini -f /path/to/php/file.php action=generate var1_id=23 var2_id=35 gen_id=535 >> /path/to/log/file.log 2>&1 & echo $!';
$pid = shell_exec($cmd);   // returns the PID of the backgrounded process

For those who don’t know Bash/Linux too well, here’s the $cmd split into what does what:

nohup “No hangup”; make the following command immune to hangup signals, so it keeps running even after the user logs off.
nice -n 10 Set the scheduling priority (“niceness”) of the following command to 10, on a scale of -20 (highest priority) to 19 (lowest).
/usr/bin/php This is the assumed location of the PHP CLI executable.
-c /path/to/php.ini The -c argument allows you to specify a target php.ini to be used by the PHP CLI. This argument can be omitted in order to use the default php.ini.
-f /path/to/php/file.php action=generate var1_id=23 var2_id=35 gen_id=535 The -f argument specifies the PHP file to be executed, while the following text (e.g. “action=generate”) sets the command-line arguments for the script (note: CLI arguments are different from GET/POST/REQUEST vars).
>> /path/to/log/file.log Append any text output from the script to /path/to/log/file.log.
2>&1 Send error output (stderr) to the same log file as standard output.
& echo $! Run the whole command in the background and echo the process ID of that background job, which is what shell_exec() returns into $pid.
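
If you're building that command from user-supplied values, it's worth escaping each argument. Here's a minimal sketch of a helper that does this; the paths, function name and argument names are placeholders for your own setup:

// Hypothetical helper: launch a PHP worker script in the background via the CLI.
// All paths and argument names here are placeholders.
function launch_background_task(array $args)
{
    $php  = '/usr/bin/php';             // assumed CLI location
    $file = '/path/to/php/file.php';    // the worker script
    $log  = '/path/to/log/file.log';    // where the output should go

    // Build "key=value" CLI arguments, escaping each one.
    $argString = '';
    foreach ($args as $key => $value) {
        $argString .= ' ' . escapeshellarg($key . '=' . $value);
    }

    // Background the command and echo its PID so shell_exec() returns it.
    $cmd = 'nohup nice -n 10 ' . $php
         . ' -f ' . escapeshellarg($file)
         . $argString
         . ' >> ' . escapeshellarg($log) . ' 2>&1 & echo $!';

    return (int) shell_exec($cmd);
}

// Usage:
$pid = launch_background_task(array(
    'action'  => 'generate',
    'var1_id' => 23,
    'var2_id' => 35,
    'gen_id'  => 535,
));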

Sources:

http://stackoverflow.com/questions/209774/does-php-have-threading
http://stackoverflow.com/questions/45953/php-execute-a-background-process#45966

Pros

  • The PHP CLI is extremely powerful; you can even point it at your own php.ini if need be.

  • Logs all output from the PHP CLI, including crashes and fatal errors.

  • Runs entirely on your own server, so you don't need any extra verification or validation of the request.

  • Async tasks and cron tasks should be interchangeable, as they both use the CLI.

  • Works with most frameworks that have suitable CLI support.

Cons

  • shell_exec() is not available on every server; shared hosting in particular may not allow this method to be used.

  • Any process you create is tied to the webserver in a parent/child relationship, so if the parent process dies or times out, the executed script may be taken with it. To get around this, call set_time_limit(0) before calling shell_exec().

  • The commands used with shell_exec() differ between Windows and Linux, and the location of the CLI executable changes from server to server.

  • GET/POST/COOKIE/REQUEST/SESSION values do not persist through to the CLI; arguments passed to the CLI end up in $argv instead (see the sketch below). See the $argv documentation on php.net for more info on working with CLI arguments.
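
To illustrate, the worker script can turn those “key=value” arguments back into an associative array with a few lines along these lines (a sketch; the variable names are arbitrary):

// Sketch: rebuild "key=value" CLI arguments (e.g. action=generate) into an array.
// $argv[0] is the script's own filename, so skip it.
$args = array();
foreach (array_slice($argv, 1) as $arg) {
    if (strpos($arg, '=') !== false) {
        list($key, $value) = explode('=', $arg, 2);
        $args[$key] = $value;
    }
}

// $args now holds e.g. array('action' => 'generate', 'var1_id' => '23', ...)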

Notes

  • If you can't access the PHP CLI, you can always use wget (on Linux) instead to request your async page over HTTP. This effectively lets you start the new process with shell_exec(), but carries all the downsides of using cURL (method 3).
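
For example, a minimal sketch of that idea; the URL is a placeholder for your own async script:

// Sketch: hit the async page over HTTP with wget, backgrounded so that
// shell_exec() returns immediately. The URL is a placeholder.
shell_exec('wget -q -O /dev/null "http://your-site.example/async-worker.php?action=generate" > /dev/null 2>&1 &');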

Detach the Client

One of the main reasons we need asynchronous processes in PHP is to allow something very time consuming to happen in the background, but not keep the client “on hold” for that entire duration; as PHP is typically synchronous, anything that takes a long time on the server will appear to take a long time for the client.

In that scenario, one solution would be to “detach” the client from the currently loading page and give them control of their browser back while the PHP script continues to do its thing. We should be able to make this happen by sending some headers to the client that say “OK, we're done here, connection ends”, even though PHP is still running.

This contribution on the PHP documentation gives us an example of how this can be done:

ob_end_clean();
header("Connection: close");
ignore_user_abort(true);
ob_start();

echo('Text the user will see');

$size = ob_get_length();
header("Content-Length: $size");
ob_end_flush();                   // Strange behaviour; this will not work unless...
flush();                          // ...both functions are called!

// Do processing here
sleep(30);

echo('Text the user will never see');

Pros

  • All the processing happens under one call, simplifying the code path

  • GET/POST/COOKIE/REQUEST/SESSION vars are all present, based on the session of the user.

  • Coupled with a simple redirect header, you should be able to instigate the process and then bump the user back to another page.

Cons

  • If content is sent before your custom headers, your task will no longer behave as asynchronous; the client will remain stuck on that page until the task is complete.

  • Errors and other output will be lost, though these can be saved with ob_start() and custom error handlers; see the sketch below.
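
A rough sketch of that idea, assuming you simply want everything dumped into a log file (the log path is a placeholder):

// Sketch: keep output and errors from the detached part of the script,
// since the client is no longer listening. The log path is a placeholder.
$logFile = '/path/to/log/async.log';

// Send PHP's own error reporting to a log file.
ini_set('log_errors', '1');
ini_set('error_log', $logFile);

// Optionally route warnings/notices through a custom handler for extra context.
set_error_handler(function ($errno, $errstr, $errfile, $errline) {
    error_log("[$errno] $errstr in $errfile on line $errline");
    return false;   // fall through to PHP's normal logging as well
});

// Capture anything echoed after the client has gone away.
ob_start();
// ... long-running work happens here ...
error_log(ob_get_clean(), 3, $logFile);   // append the buffered output to the log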

Notes

  • I personally had very little success with this method, despite it being ideal for my situation, but this may have been caused by the framework in use at the time. If you have better luck, let me know.

cURL

cURL is a library, exposed in PHP via the curl extension, which allows you to fetch resources from the web from within PHP. Commonly you might use it to fetch a Twitter feed or scrape some data from another site. In this instance, I wanted to abuse cURL's timeout option to do the reverse of the previous option and detach us, as the requester, from a page we request on our own server.

To do this, it's a simple case of calling CURL with a few additional arguments:

$c = curl_init();
curl_setopt($c, CURLOPT_URL, $url);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, true);  // Follow the redirects (needed for mod_rewrite)
curl_setopt($c, CURLOPT_HEADER, false);         // Don't retrieve headers
curl_setopt($c, CURLOPT_NOBODY, true);          // Don't retrieve the body
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);  // Return from curl_exec rather than echoing
curl_setopt($c, CURLOPT_FRESH_CONNECT, true);   // Always ensure the connection is fresh

// Timeout super fast once connected, so it goes into async.
curl_setopt($c, CURLOPT_TIMEOUT, 1);

return curl_exec($c);

With the code above, we're instructing cURL not to return any headers (CURLOPT_HEADER) or body content (CURLOPT_NOBODY), and we're setting the timeout (CURLOPT_TIMEOUT) to 1 second. This means that when the cURL request is made, it should return almost immediately while simultaneously creating a new request on our own server.
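
Wrapped up as a function with a usage example, the dispatch side might look something like this (the function name and URL are placeholders, not part of any library):

// Hypothetical wrapper around the snippet above; the URL is a placeholder.
function fire_async_request($url)
{
    $c = curl_init();
    curl_setopt($c, CURLOPT_URL, $url);
    curl_setopt($c, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($c, CURLOPT_HEADER, false);
    curl_setopt($c, CURLOPT_NOBODY, true);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($c, CURLOPT_FRESH_CONNECT, true);
    curl_setopt($c, CURLOPT_TIMEOUT, 1);   // bail out quickly; the worker keeps running

    $result = curl_exec($c);
    curl_close($c);

    return $result;
}

// Kick off the worker, passing arguments as ordinary GET variables, and move on.
fire_async_request('http://your-site.example/async-worker.php?action=generate&file_id=123');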

On the async task side of things, we need to call a couple of functions to ensure the script can continue correctly. Some of these instructions are probably relevant to Method 2 also.

ignore_user_abort(true);
set_time_limit(0);
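
Putting it together, the worker script on the receiving end might look roughly like this; the file path, parameter names and CSV loop are placeholders for whatever your long-running job actually is:

// async-worker.php (hypothetical): the long-running end of the cURL request.
ignore_user_abort(true);   // keep running when cURL drops the connection
set_time_limit(0);         // remove the execution time limit for the long job

// Arguments arrive as ordinary GET variables in the URL.
$fileId = isset($_GET['file_id']) ? (int) $_GET['file_id'] : 0;

// ... long-running work goes here, e.g. importing a large CSV ...
$handle = fopen('/path/to/uploads/' . $fileId . '.csv', 'r');
if ($handle !== false) {
    while (($row = fgetcsv($handle)) !== false) {
        // Process $row, update a progress record for other users to see, etc.
    }
    fclose($handle);
}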

Pros

  • Works for almost any server with cURL enabled.

  • Same calls on Windows and Linux systems.

  • You can debug the script by calling the cURL target directly in a browser

Cons

  • While GET/REQUEST variables are inherited from the URL, sending COOKIE or SESSION data is far more complicated.

  • Not all servers have cURL enabled

  • cURL needs to be able to resolve the target hostname and access it. Servers with fake DNS, .htaccess restrictions or user/password verification might cause problems for this method.

  • Asynchronous task process needs to be exposed to the web so cURL can access it.

  • You'll need to use a custom error handler and ob_start in order to capture errors and crashes.

Notes

  • Passing SESSION variables reliably to cURL is a royal pain, so you may end up exposing your password-protected async function to the outside world to get it working.

  • If cURL isn't available, you can try using file_get_contents() or fopen() to request the file instead.

  • If none of these functions work for you, you can try shell_exec'ing an instance of wget to the same URL.

stream_socket_client()

An addition to PHP 5, stream_socket_client has an option to perform requests asynchronously, so we’re essentially able to do the same thing as cURL without needing the cURL module.

Wez Furlong has already written a big ol' article about using this method, Guru – Multiplexing. Simply take out the while loop for reading back the results, and you should have what you're after.
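
If all you want is to fire the request and walk away, a stripped-down version might look roughly like this; the host and path are placeholders, and if the request doesn't arrive reliably you can drop STREAM_CLIENT_ASYNC_CONNECT and rely on the short connect timeout instead:

// Sketch: fire-and-forget HTTP request using stream_socket_client().
// The host and path are placeholders for your own async script.
$host = 'your-site.example';
$path = '/async-worker.php?action=generate';

$errno  = 0;
$errstr = '';

// STREAM_CLIENT_ASYNC_CONNECT returns before the connection has completed.
$fp = stream_socket_client(
    'tcp://' . $host . ':80',
    $errno,
    $errstr,
    1,                                                   // 1 second connect timeout
    STREAM_CLIENT_ASYNC_CONNECT | STREAM_CLIENT_CONNECT
);

if ($fp !== false) {
    stream_set_blocking($fp, false);   // don't block while writing the request
    fwrite($fp, "GET {$path} HTTP/1.1\r\nHost: {$host}\r\nConnection: Close\r\n\r\n");
    fclose($fp);                       // don't wait for the response
}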

Pros and Cons

  • Same as cURL, except you don’t need to worry about the cURL module being available.

Notes

  • The process outlined by Wez Furlong is intended to fetch content from multiple sources at once and then read the data returned. While it can “time out”, I have to wonder whether PHP will internally hold on to that connection until it completes, or whether the connection is dropped when the script ends. I never personally got that far.

Personal Taste

I can't profess to know which method is “the best” as they all have their ups and downs.

Far and away the most professional method is shell_exec(), as it has the best error-handling capability of them all and gives you the most control over your CLI instance, but it can give you trouble if you've never worked with the CLI, if your environment is locked down, or if you want to pass a large number of variables from your user.

The most flexible solution is probably a variation on cURL, using either stream_socket_client(), cURL itself, or wget as a fallback. Though I haven't tested stream_socket_client(), it should have all the benefits of cURL without needing the cURL library, making it usable on the widest range of hosts and environments. You simply need to point it at a URL of your async script, and it'll call that URL asynchronously. Awesome.

