PHP HttpRequest class options and notes

In a recent post I talked about the PECL_HTTP extension for PHP. In this post I will cover a few of the options you can set for the HttpRequest object and related functions.

The PECL_HTTP extension allows you to set a number of options when you make a request. Usually you put the options in a key=>value array and either pass the array as an argument to one of the request functions (e.g. http_get(), http_post_data(), http_request()) or assign it to the HttpRequest object using the setOptions() method. Here is a code example using the HttpRequest object:

$http_req = new HttpRequest("http://www.example.com");
// 10 second overall timeout and a custom User-Agent header
$http_req->setOptions(array('timeout'=>10, 'useragent'=>"MyScript"));

In the above code the timeout is set to 10 seconds and the user agent, a request header that identifies the client to the server, is set to “MyScript”. I am going to cover just a few of the options, but the full list of request options can be found on the HttpRequest options page on PHP.NET.
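For comparison, the same options array can also be handed to one of the procedural request functions. The following is a minimal sketch, assuming the PECL_HTTP 1.x extension is loaded. The helper name fetchBody() is hypothetical, and it simply returns null when the extension is missing so the script still runs.

```php
<?php
// Hypothetical helper: fetch a URL with the procedural API and return the body.
function fetchBody($url, array $options)
{
    // http_get() and http_parse_message() are provided by PECL_HTTP 1.x.
    if (!function_exists('http_get') || !function_exists('http_parse_message')) {
        return null; // extension not available
    }

    // http_get() returns the raw response message (headers plus body) or false.
    $raw = http_get($url, $options);
    if ($raw === false) {
        return null;
    }

    // Parse the raw message so the body can be read on its own.
    $message = http_parse_message($raw);
    return $message->body;
}

// Same options as in the HttpRequest example.
$options = array('timeout' => 10, 'useragent' => 'MyScript');
$body = fetchBody('http://www.example.com', $options);
echo $body === null ? "No response (or PECL_HTTP not loaded)\n" : $body;
```

The object style is usually handier when you want to reuse one request or inspect the response in detail, while the function style is fine for a quick one-off fetch.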

timeout

The timeout option specifies the maximum amount of time, in seconds, that the whole request may take to complete. Set it too high and the HTTPD process that PHP is running in could be stalled for quite a while waiting for a request that may never complete. Set it too low and you might have problems with sites that are just slow to respond. This may require some tweaking depending on what you are doing. If you are making requests to a web server in Taiwan you might want to set the timeout a bit higher. The default timeout does not appear to be documented in the HttpRequest options page on PHP.NET (that I could find), but if you look at the http_request_api.c file in the PECL_HTTP source code, it looks like it is 0L, which means no timeout at all:

HTTP_CURL_OPT(CURLOPT_TIMEOUT, 0L);

This indicates it will wait forever unless you explicitly set a timeout so it might be a good idea to set one! I put together a page that makes an HTTP request and then another one that will sleep for some number of seconds that I can test against.

Here is the code for the page that will sleep:

<?php

echo "Sleeping.";
sleep(30);

?>

Here is the code for the page that will make the HTTP request:

<?php
$http_req = new HttpRequest("http://localhost/alongtime.php");
// Give up if the whole request takes longer than 10 seconds
$http_req->setOptions(array('timeout'=>10, 'useragent'=>"MyScript"));
try {
    $http_req->send();
} catch (HttpException $ex) {
    // Prefer the more specific inner exception message when one is attached
    if (isset($ex->innerException)){
        echo $ex->innerException->getMessage();
    } else {
        echo $ex;
    }
    exit;
} //end try block
echo $http_req->getResponseBody();
?>

When I pull up the page that makes the HTTP request in my browser I get the following error:

Timeout was reached; Operation timed out after 10000 milliseconds with 0 bytes received (http://localhost/alongtime.php)

If I don’t set the timeout option at all, the page responds about 30 seconds later, because the request waits as long as it takes and the target page sleeps for 30 seconds.
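Since no timeout is applied unless you ask for one, it can help to make sure every options array carries one. Here is a rough sketch of that idea. The helper name withDefaultTimeout() and the 10 second default are my own assumptions, not part of the extension.

```php
<?php
// Hypothetical helper: make sure an options array always carries a timeout.
function withDefaultTimeout(array $options, $defaultSeconds = 10)
{
    // The array + operator keeps keys that already exist on the left side,
    // so a timeout supplied by the caller is never overwritten.
    return $options + array('timeout' => $defaultSeconds);
}

$options = withDefaultTimeout(array('useragent' => 'MyScript'));
print_r($options);

// Only touch the extension if it is actually loaded.
if (class_exists('HttpRequest')) {
    $http_req = new HttpRequest("http://localhost/alongtime.php");
    $http_req->setOptions($options);
}
```

Running every options array through one helper like this means a forgotten timeout falls back to a sane value instead of waiting forever.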

connecttimeout

The connecttimeout option indicates the maximum amount of time, in seconds, that the request may take just to connect to the server. This does not include the time it takes for the server to process the request and return the data. The same considerations apply as for the timeout option, although the number should be considerably lower since it only covers the connection and not the whole request. Again, the default value is not documented, but if you look at the http_request_api.c file in the PECL_HTTP source code, it looks like it is 3 seconds:

HTTP_CURL_OPT(CURLOPT_CONNECTTIMEOUT, 3);
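To keep the two timeouts consistent, you could build them together. This is only a sketch: the helper name timeoutOptions() is hypothetical, and rejecting a connect timeout that is longer than the overall timeout is a design choice of mine (such a value could never fire before the overall timeout does), not something the extension enforces.

```php
<?php
// Hypothetical helper: build both timeout options and check they make sense together.
function timeoutOptions($totalSeconds, $connectSeconds)
{
    if ($totalSeconds <= 0 || $connectSeconds <= 0) {
        throw new InvalidArgumentException('Timeouts must be positive.');
    }
    // The overall timeout also covers connecting, so a longer connect
    // timeout would never be the one that triggers.
    if ($connectSeconds > $totalSeconds) {
        throw new InvalidArgumentException('connecttimeout should not exceed timeout.');
    }
    return array('timeout' => $totalSeconds, 'connecttimeout' => $connectSeconds);
}

// 10 seconds for the whole request, 3 of which may be spent connecting.
$options = timeoutOptions(10, 3);
print_r($options);
```

The resulting array can be passed to setOptions() like any other options array.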

dns_cache_timeout

One of the interesting features of the PECL_HTTP extension is that it will cache DNS lookups. Some of the Windows operating systems do this, but many of the Linux distributions do not by default. (By the way, if you want to clear your cached DNS lookup entries on a Windows box, use the command “ipconfig /flushdns”.) If you are making multiple requests to the same site, DNS lookup caching should provide a performance advantage because a round trip to the DNS server isn’t required for every request. The dns_cache_timeout option sets the number of seconds that pass before a cached DNS lookup result expires and a new DNS lookup is performed. Again, the default value is not documented, but if you look at the http_request_api.c file in the PECL_HTTP source code, it looks like it is 60 seconds, which is probably fine for most applications:

HTTP_CURL_OPT(CURLOPT_DNS_CACHE_TIMEOUT, 60L);

redirect

The redirect option determines how many redirects the request will follow before it returns with a response. The default is 0 (this IS documented), which may not work in many situations because some applications respond with one or two redirects for authentication, etc. If you set this too high your request may get bounced around too many times and take a long time to return. I have not tried it, but a server could probably keep a client with a very high limit stuck in a redirect loop for a while. Anyway, I would imagine a value of around 4 or 5 should be adequate for most applications.
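To illustrate why the limit matters, here is a toy model of redirect following in plain PHP. It does not use the extension at all and is not how PECL_HTTP implements it. The function name followRedirects() and the example URL maps are made up for illustration.

```php
<?php
// Toy model: $redirects maps a URL to the URL it redirects to.
// Follows redirects until none is left or the limit is used up.
function followRedirects(array $redirects, $url, $maxRedirects)
{
    $followed = 0;
    while (isset($redirects[$url]) && $followed < $maxRedirects) {
        $url = $redirects[$url];
        $followed++;
    }
    return array(
        'url'      => $url,                    // where we ended up
        'followed' => $followed,               // how many redirects were followed
        'gave_up'  => isset($redirects[$url]), // true if another redirect was pending
    );
}

// A site that redirects twice (e.g. for a login) and a two-page redirect loop.
$site = array('/start' => '/login', '/login' => '/home');
$loop = array('/a' => '/b', '/b' => '/a');

print_r(followRedirects($site, '/start', 0)); // a limit of 0 never leaves /start
print_r(followRedirects($site, '/start', 5)); // reaches /home after 2 redirects
print_r(followRedirects($loop, '/a', 5));     // the loop is cut off after 5 redirects
```

With a limit of 0 the client never gets past the first redirect, and with a small limit like 5 a redirect loop is cut off quickly instead of running on.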

useragent

The useragent option allows you to specify a different User-Agent request header to send to the server than the default which is “PECL::HTTP/x.y.z (PHP/x.y.z)” where x.y.z are the versions.

I made a little one-liner test page that returns the user agent info sent to the server:

<?php

echo $_SERVER['HTTP_USER_AGENT'];

?>

If I make an HTTP request to this page using the HttpRequest object without setting the useragent I get:

PECL::HTTP/1.6.2 (PHP/5.2.6-2ubuntu4)

If I do something like this:

$http_req->setOptions(array('timeout'=>10, 'useragent'=>"Mark’s Browser"));

I will get:

Mark’s Browser

The reason I bring this up is that some applications you make requests to may respond differently depending on your user agent setting. In some cases you may need to spoof a specific browser to get what you are after.
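As an illustration of what the other side might be doing, here is a hypothetical server-side check that picks a response variant from the User-Agent header. The function name variantForUserAgent() and the matching rules are assumptions for the sake of the example. Real applications use all sorts of rules.

```php
<?php
// Hypothetical server-side check: choose a response variant from the User-Agent header.
function variantForUserAgent($userAgent)
{
    // The default PECL_HTTP user agent starts with "PECL::HTTP".
    if (stripos($userAgent, 'PECL::HTTP') !== false) {
        return 'script';
    }
    // Most desktop browsers send a string beginning with "Mozilla".
    if (stripos($userAgent, 'Mozilla') !== false) {
        return 'browser';
    }
    return 'unknown';
}

// The header is missing when the script is run from the command line.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
echo variantForUserAgent($ua);
```

A page like this would treat the default PECL_HTTP user agent differently from a browser, which is why changing the useragent option can change the response you get back.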

Conclusion

As mentioned before, there are many more HttpRequest options. I just covered a few notable ones that I have some limited experience with.
