How do I find out what 500 errors are about?
-
I’m using the cloud version of BLC (recently switched from local), and I’m getting an error message:
Scan aborted: Too many server errors. This is to prevent overloading your server. Please retry the scan in a while or contact support if the issue persists.
I know that my links do not return a 500 error on my server (most of my links redirect), so I’d like to drill down into a bit more detail on why/where the 500 is getting returned. Is there a way to get that additional information?
Alternatively, is there a way to see where in the process the scan was aborted? My site has 91223 Total Links and 1054 Unique URLs. I’m not sure what those actual labels mean, but I’d like to understand how many of them were scanned before BLC found the “35” broken links and aborted the scan.
-
Hi @turbodb,
Hope this message finds you well.
I know that my links do not return a 500 error on my server (most of my links redirect), so I’d like to drill down into a bit more detail on why/where the 500 is getting returned. Is there a way to get that additional information?
Well, indeed, it should not be detected as a 500 error, since it is redirecting; other links should return a 403 status, since they redirect to Amazon. I can’t confirm it, but this might be caused by your server firewall. Still, I have notified our BLC team, and they may be able to provide further information.
Alternatively, is there a way to see where in the process the scan was aborted? My site has 91223 Total Links and 1054 Unique URLs. I’m not sure what those actual labels mean, but I’d like to understand how many of them were scanned before BLC found the “35” broken links and aborted the scan.
Unique URLs are the ones found on your site only.
You will find more information about it in our documentation at https://wpmudev.com/docs/wpmu-dev-plugins/broken-link-checker/#broken-link-summary, which also lists the error codes.
Since our BLC team works on very complex issues, getting a reply from them could take more time than usual. We will get back to this topic once we have an update from them.
Best regards,
Laura
Thanks Laura @wpmudevsupport3.
I’ve exported the CSV of the report from my wpmudev hub, in case it can be helpful to figure out what’s going on. It is available below.
All of the links in that report are currently resolving for me, so I don’t know why they are showing up as errors in BLC. It’s got me seriously considering either another tool, or going back to the local scanner (which has other issues, but at least seems to get through all the links instead of aborting due to errors).
Hi @turbodb,
We got feedback from our BLC team. They performed a few tests and confirmed what I mentioned in my previous reply: it seems you are using CloudFront, and it might be blocking our BLC bot. In such cases, you might need to whitelist our User Agent and our IPs. You will find them at this link: https://wpmudev.com/docs/wpmu-dev-plugins/broken-link-checker/#broken-link-checker-user-agent
Additionally, they shared the results in JSON format. You can review them at this link: https://drive.google.com/file/d/1V7_1bXrkAl3WMG4XylYyoB4v6CzPlcUm/view?usp=sharing
Kindly whitelist our User Agent and IPs, run a new scan, and let us know the results.
Best regards,
Laura
Hi Laura,
I am not using CloudFront at all, my site is hosted on a $5/mo Amazon Lightsail instance. A single box, with a public IP address, and a bitnami stack.
What tests were performed to determine that I am using CloudFront? (I suppose Amazon could be using it “for free” without my knowledge, but I doubt that would be the case, as they aren’t in the habit of giving away those types of services).
Thanks,
Dan
Hi @turbodb,
I hope you are doing well today!
According to the cURL results the site is using CloudFront and our BLC team noticed that our User Agent is being blocked by it.
You can confirm this by running the following:
curl -IL https://adventuretaco.com/go/if-everybody-did-jo-anne-stover/
The command above returns a 503 error. Once the UA is changed, like this:
curl -IL -A "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/81.0" https://adventuretaco.com/go/if-everybody-did-jo-anne-stover/
This returns 200, so you should unblock the BLC User Agent:
https://wpmudev.com/docs/wpmu-dev-plugins/broken-link-checker/#broken-link-checker-user-agent
Kind regards,
Zafer
Thanks Zafer @wpmudevsupport15,
I think what’s going on here is that, when you run the curl command described, a redirect occurs on my site (adventuretaco.com) and then CloudFront is being used at the redirect destination (amazon.com).
This means, I think, that my site, running on an AWS Lightsail instance, does not use CloudFront, but that BLC is running into HTTP 500 errors after the redirects because Amazon is doing something to block the bot traffic.
See below for the trace:
curl -IL https://adventuretaco.com/go/if-everybody-did-jo-anne-stover/
HTTP/2 302
x-robots-tag: noindex, nofollow
x-redirect-by: WordPress
location: https://amzn.to/3tvpTdU
cache-control: max-age=0
expires: Tue, 04 Jun 2024 00:16:01 GMT
vary: Accept-Encoding
content-type: text/html; charset=UTF-8
date: Tue, 04 Jun 2024 00:16:01 GMT
server: Apache

HTTP/2 301
cache-control: private, max-age=90
content-security-policy: referrer always;
content-type: text/html; charset=utf-8
date: Tue, 04 Jun 2024 00:16:02 GMT
location: https://www.amazon.com/If-Everybody-Did-Ann-Stover/dp/0890844879?crid=29R42WY6XBCF3&dchild=1&keywords=everybody+did&qid=1612561014&sprefix=everybody+did,aps,364&sr=8-2&linkCode=sl1&tag=srchamzn-20&linkId=ab58ffd173d8b3808625c3ed5343cf51&language=en_US&ref_=as_li_ss_tl
referrer-policy: unsafe-url
server: nginx
set-cookie: _bit=o540g2-c78698ce62c20457a9-00t; Domain=amzn.to; Expires=Sun, 01 Dec 2024 00:16:02 GMT
strict-transport-security: max-age=1209600
content-length: 395

HTTP/2 503
content-type: text/html
date: Tue, 04 Jun 2024 00:16:02 GMT
server: Server
accept-ranges: bytes
x-amz-rid: 1NXFP2E2NDKH7SRW08J9
vary: Content-Type,Accept-Encoding,User-Agent
etag: "a6f-6187f291ddc80"
strict-transport-security: max-age=47474747; includeSubDomains; preload
last-modified: Wed, 15 May 2024 14:44:50 GMT
x-cache: Error from cloudfront
via: 1.1 646b6f21a2659c68f7a3822d035b97d2.cloudfront.net (CloudFront)
x-amz-cf-pop: NRT57-C2
alt-svc: h3=":443"; ma=86400
x-amz-cf-id: Ga9AudCUHmBDKRawpBRiIak02Bq7xlhXTqcZSsrJNS4ETzZArurkeg==
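A trace like this is easier to read when condensed to one line per hop. The following is a minimal sketch, not something BLC provides: summarize_hops is a hypothetical helper name, and it assumes its input is the header output of curl -sIL. It prints the status code, Server header, and x-cache header (when present) of each hop, which is enough to see which host in the redirect chain answers with the 5xx.

```shell
#!/bin/sh
# summarize_hops (hypothetical helper): condense `curl -sIL` header output,
# read from stdin, to one line per hop.
summarize_hops() {
  tr -d '\r' | awk '
    # Return the text after "Name:" in a header line.
    function header_value(line) { sub(/^[^:]*:[ \t]*/, "", line); return line }
    function print_hop() {
      printf "hop %d: status=%s server=%s x-cache=%s\n", hop, status, server, xcache
    }
    # A status line such as "HTTP/2 302" starts a new hop.
    /^HTTP\// {
      if (hop > 0) print_hop()
      hop++; status = $2; server = "-"; xcache = "-"
      next
    }
    # Header names are case-insensitive, so compare them in lower case.
    tolower($1) == "server:"  { server = header_value($0) }
    tolower($1) == "x-cache:" { xcache = header_value($0) }
    END { if (hop > 0) print_hop() }
  '
}

# Example use (requires network access):
#   curl -sIL https://adventuretaco.com/go/if-everybody-did-jo-anne-stover/ | summarize_hops
```

For the three responses in the trace, this would list a 302 from Apache, a 301 from nginx, and a 503 from "Server" with x-cache "Error from cloudfront", which places the error at the final amazon.com hop rather than at the first one.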
It seems a little strange to me that BLC wouldn’t work for Amazon links, as it seems like *a lot* of the links that people would want to check for blogs would be their affiliate links to Amazon.
Given all this, I have two questions:
- Is it expected behavior that BLC doesn’t work with links to Amazon?
- It seems to me that BLC should not stop processing external links if a redirect has occurred prior to receiving an HTTP 500, or if the redirect is no longer on the same domain as the original external link. Would it be possible to change BLC’s behavior to continue processing these types of links?
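The rule proposed in the second question can be made concrete with a small sketch. This only illustrates the proposal and says nothing about how BLC actually works: host_of and classify are hypothetical helper names, and the labels they print are made up for this example. The idea is to compare the host of the original link with the host of the final URL, and to treat a 5xx differently once the redirect chain has left the original host.

```shell
#!/bin/sh
# host_of URL (hypothetical helper): print the host part of an http(s) URL.
host_of() {
  h=${1#*://}   # drop the scheme
  h=${h%%/*}    # drop the path
  h=${h%%\?*}   # drop a query string that follows the host directly
  h=${h%%:*}    # drop the port
  printf '%s\n' "$h"
}

# classify ORIGINAL_URL FINAL_STATUS FINAL_URL (hypothetical helper):
# label a result by whether a 5xx came from the originally linked host.
classify() {
  case "$2" in
    5??)
      if [ "$(host_of "$1")" = "$(host_of "$3")" ]; then
        echo "server-error-on-linked-host"
      else
        echo "server-error-after-offsite-redirect"
      fi
      ;;
    *)
      echo "not-a-server-error"
      ;;
  esac
}

# Example: a 503 that arrives only after the chain has moved to amazon.com.
classify "https://adventuretaco.com/go/if-everybody-did-jo-anne-stover/" 503 \
  "https://www.amazon.com/If-Everybody-Did-Ann-Stover/dp/0890844879"
```

The final status and URL for a real link can be read from curl with -sIL -o /dev/null -w '%{http_code} %{url_effective}\n'. Under this rule, only results labelled server-error-on-linked-host would count toward an abort threshold.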
Thanks!
Hello @turbodb
Hope you’re doing well.
Thank you for your observations. Indeed, it looks like CloudFront could be involved at the amazon.com end. However, I ran some additional tests using Postman and the results were a bit different. I am confirming more about this with our BLC team and have already shared my findings with them.
As for changing when BLC stops processing external links, I am checking with the BLC team whether something like that is possible.
We will share an update here as soon as we receive further insights on those points from the BLC team.
Regarding BLC reporting Amazon links: we have had other reports of affiliate links returning HTTP 5XX errors from the Amazon end during scans. This happens when we make multiple requests, which is why we already skip some links (https://wpmudev.com/docs/wpmu-dev-plugins/broken-link-checker/#scanned-skipped), and the developers are planning to add Amazon links to that list.
However, one of the reasons for the new engine is that we have a specific bot, so we can contact those providers and ask them to allow it, but we have no ETA and no guarantee that they will allow the scan.
Kind Regards,
Saurabh
Hi @turbodb,
As stated above, we have already brought this to our team’s attention to check whether any further improvements can be implemented down the roadmap.
Since our team will be exploring features to improve the workflow regarding this, I’ll go ahead and mark it as resolved for now.
For any new feature updates, you can follow our progress by subscribing to our roadmap at https://wpmudev.com/roadmap/.
Kind Regards,
Nithin
Hi Nithin,
Before you mark this as resolved, there were two issues that were going to be followed up on:
However, I ran some additional tests using Postman and the results were a bit different. I am confirming more about this with our BLC team and have already shared my findings with them.
As for changing when BLC stops processing external links, I am checking with the BLC team whether something like that is possible.
Hi @turbodb,
I have checked and confirmed that all the findings from our investigation have been brought to the attention of our developers for further review and improvement. Our developers are actively looking into this, and further updates will be included in our roadmap, as mentioned in our response above.
Kind Regards,
Nebu John