Bug Report: Solution to 416 errors, 502 errors and other errors with https
Hi guys:
In modules/checkers/http.php lines 226-246 there’s an attempt to optimize cURL requests for checking link targets:
    $nobody = !$use_get; //Whether to send a HEAD request (the default) or a GET request
    $parts = @parse_url($url);
    if( $parts['scheme'] == 'https' ){
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); //Required to make HTTPS URLs work.
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
        $nobody = false; //Can't use HEAD with HTTPS.
    }

    if ( $nobody ){
        //If possible, use HEAD requests for speed.
        curl_setopt($ch, CURLOPT_NOBODY, true);
    } else {
        //If we must use GET at least limit the amount of downloaded data.
        $request_headers[] = 'Range: bytes=0-2048'; //2 KB
    }

    //Set request headers.
    if ( !empty($request_headers) ) {
        curl_setopt($ch, CURLOPT_HTTPHEADER, $request_headers);
    }

Note that it uses a GET request for HTTPS and a HEAD request for HTTP. Note also that it applies a Range header to the GET request.
I’ve confirmed two things:
a) HEAD requests work fine over HTTPS. I’m not sure why the comment on line 232 claims they don’t. Maybe it’s a historical thing?
b) GET requests with Range headers break lots of pages, so working links get flagged as broken (false positives). They tend to fail with 416 Range Not Satisfiable if the server can’t satisfy the hardcoded byte range, or with 502 Bad Gateway if the site’s reverse proxy doesn’t know what to do with the Range request. I’m sure there are more cases I haven’t identified.
Here’s a quick snippet that demonstrates the issue. It’s an HTTPS request: HEAD works fine, GET works fine, GET with Range breaks.
    $nobody = !$use_get; //Whether to send a HEAD request (the default) or a GET request
    $parts = @parse_url($url);
    if( $parts['scheme'] == 'https' ){
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); //Required to make HTTPS URLs work.
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
        $nobody = false; //Can't use HEAD with HTTPS.
    }

    if ( $nobody ){
        //If possible, use HEAD requests for speed.
        curl_setopt($ch, CURLOPT_NOBODY, true);
    } else {
        //If we must use GET at least limit the amount of downloaded data.
        $request_headers[] = 'Range: bytes=0-2048'; //2 KB
    }

    //Set request headers.
    if ( !empty($request_headers) ) {
        curl_setopt($ch, CURLOPT_HTTPHEADER, $request_headers);
    }

Let me know if you want me to submit a patch. I’m not sure how to go about it.
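For reference, here is a rough sketch of what such a patch might look like. It is only an illustration, not the plugin’s actual code: the helper name sketch_link_check_options() is hypothetical, and it assumes the fix is simply to stop forcing GET on HTTPS and to stop adding the Range header. The SSL options are carried over unchanged from the existing code.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the plugin.
// Builds the cURL options for a link check without forcing GET on HTTPS
// and without adding a Range header to GET requests.
function sketch_link_check_options(string $url, bool $use_get, array $request_headers = []): array
{
    $options = [];

    // parse_url() returns null or false when no scheme can be extracted.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (is_string($scheme) && strtolower($scheme) === 'https') {
        // Carried over from the existing code: certificate checks stay disabled.
        $options[CURLOPT_SSL_VERIFYPEER] = false;
        $options[CURLOPT_SSL_VERIFYHOST] = 0;
    }

    if (!$use_get) {
        // HEAD request, for HTTP and HTTPS alike.
        $options[CURLOPT_NOBODY] = true;
    }
    // Deliberately no 'Range: bytes=0-2048' header for GET requests.

    // Only set request headers if the caller supplied any.
    if (!empty($request_headers)) {
        $options[CURLOPT_HTTPHEADER] = $request_headers;
    }

    return $options;
}

// Possible usage (assumes $url and $use_get are already defined):
// $ch = curl_init($url);
// curl_setopt_array($ch, sketch_link_check_options($url, $use_get));
```

The trade-off is that a GET without a Range header downloads the whole response body, so this gives up the bandwidth saving in exchange for fewer false positives.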