• Resolved tremlas

    (@tremlas)


    Having just upgraded Simply Static to 2.1.5.3, I was very surprised to see the number of URLs it scanned on my site leap from 3403 to 6755 when I didn’t add any pages or posts (just some edits). Looking at the generated export log, I see 3351 missing-URL errors where previously there were none.
    The missing URLs are all of the form:

    404 	https://virtual.internal/character/imprudentius/GNU Terry Pratchett 	Found on /character/imprudentius/	
    404 	https://virtual.internal/character/matthew-dixon/text/html; charset=UTF-8 	Found on /character/matthew-dixon/

    Looking at the HTML for these pages shows that Simply Static now tries to follow the
    meta http-equiv="Content-Type" content="text/html; charset=UTF-8"
    and
    meta http-equiv="X-Clacks-Overhead" content="GNU Terry Pratchett"
    (It also followed a meta name="msapplication-TileImage" tag, but that one was indeed a link.)

    It looks to me like Simply Static is now overeager in following meta tags. At the very least it needs to be more intelligent about the values of the http-equiv field, since of the official ones only default-style could be a link (X-Clacks-Overhead is NOT an official value for http-equiv, but ideally it would not trigger the scan).
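
To illustrate the kind of check I mean, here is a minimal sketch in Python. It is not Simply Static's actual code (the plugin is written in PHP); the names `looks_like_url`, `MetaLinkExtractor` and `extract_meta_urls` are hypothetical, and the tile image path in the sample is made up. The assumption is that a meta tag's content should only be queued for crawling when it plausibly is a URL, and that http-equiv tags are ignored unless they are a refresh directive carrying a target.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Matches a refresh directive such as "5; url=/new-page/" and captures the target.
REFRESH_URL = re.compile(
    r"^\s*\d+\s*[;,]\s*(?:url\s*=\s*)?['\"]?([^'\"]+)['\"]?\s*$",
    re.IGNORECASE,
)


def looks_like_url(value):
    """Heuristic: accept absolute http(s) URLs and root-relative paths only."""
    value = value.strip()
    if not value or any(ch.isspace() for ch in value):
        return False  # "GNU Terry Pratchett" is rejected here
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return True
    return value.startswith("/")


class MetaLinkExtractor(HTMLParser):
    """Collects only those meta tag values that plausibly are links."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        content = attr.get("content", "")
        if "http-equiv" in attr:
            # Skip every http-equiv pragma except a refresh with a target.
            if attr["http-equiv"].strip().lower() == "refresh":
                match = REFRESH_URL.match(content)
                if match:
                    self.urls.append(match.group(1).strip())
            return
        if looks_like_url(content):
            self.urls.append(content.strip())


def extract_meta_urls(html):
    parser = MetaLinkExtractor()
    parser.feed(html)
    return parser.urls


sample = (
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
    '<meta http-equiv="X-Clacks-Overhead" content="GNU Terry Pratchett">'
    '<meta name="msapplication-TileImage" content="https://virtual.internal/tile.png">'
)
print(extract_meta_urls(sample))
```

With a filter along these lines, the two http-equiv tags above would be skipped and only the tile image would be queued. The URL heuristic is deliberately strict and would miss relative paths that do not start with a slash, so treat it as a starting point rather than a finished rule.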

    In addition, the extra URLs doubled the time for the scan.

Viewing 2 replies - 1 through 2 (of 2 total)
  • The topic ‘2.1.5.3 following http-equiv fields in the html causing many irrelevant 404s’ is closed to new replies.