• Here’s a screenshot of the ? character appearing on pagespeed.ninja as well:
    https://pasteboard.co/HvY6Onc.png

    The issue happens when adjusting HTML Parser from “Fast simple” to “Standard full”. I would like to use “libxml” but it conflicts with too many things on my page.

    Basically anywhere an   is, it seems to get parsed into this character: ?

Viewing 15 replies - 1 through 15 (of 23 total)
  • Plugin Author Denis Ryabov

    (@dryabov)

    Hmm, do you use UTF-8 charset on the website?

    PS. Could you send me the link to the page you need help with? (hello-at-pagespeed-dot-ninja)

    Thread Starter SJF

    (@sjf)

    @dryabov – we are doing this on a local development environment, so unfortunately I cannot provide you with a link. Also, yes, we are using utf-8 charset.

    If you check the screenshot I sent of pagespeed.ninja, and even check the sourcecode of the page, you’ll see on line 113 it has the ? character I’m referring to.

    Plugin Author Denis Ryabov

    (@dryabov)

    What PHP version do you use? Do you use the mbstring extension in mbstring.func_overload=2 mode? Do you use mb_output_handler (maybe via a third-party plugin)?

    Thread Starter SJF

    (@sjf)

    I am unsure where to find that, but I will take a look. What should the setting be set to once I find it?

    Plugin Author Denis Ryabov

    (@dryabov)

    To find the PHP version (if unknown), you can look at the website’s response headers (e.g. via Chrome or Firefox devtools); usually there is a header like

    X-Powered-By: PHP/7.2.8

    where 7.2.8 is the PHP version. Alternatively, you can install the “Display PHP version” plugin to get a reliable result (the X-Powered-By header may be overwritten by either PHP or the web server).
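As a rough illustration of the header check described above, here is a minimal Python sketch that pulls the PHP version out of a set of response headers. The function name `php_version_from_headers` is hypothetical, and the headers are assumed to have been collected already (for example, copied from browser devtools) into a plain mapping.

```python
import re
from typing import Mapping, Optional


def php_version_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the PHP version advertised in X-Powered-By, or None if absent.

    Hypothetical helper for illustration only; the header may be missing or
    overwritten, so None does not prove that PHP is not in use.
    """
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower() == "x-powered-by":
            match = re.search(r"PHP/(\d+(?:\.\d+)*)", value)
            return match.group(1) if match else None
    return None
```

Since the header can be rewritten or removed by the web server, a missing value here only means the check was inconclusive.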

    The mbstring.func_overload parameter is set in the php.ini file (it is disabled by default and is deprecated as of PHP 7.2).

    The mb_output_handler function is rarely used, so it is unlikely to be in use in your case.

    Thread Starter SJF

    (@sjf)

    That helps, thank you. We are on an IIS server running v5.4.45 of PHP. It looks like we have the extension php_mbstring.dll enabled, and inside of the php.ini file it looks like all of the ‘mb’ things you mentioned are commented out… so nothing is actually applied there.

    I went ahead and installed v7.2.7 of PHP but got a 500 error. When turning WP Debugging on, it just keeps the 500 error… and when switching back to 5.4.45 (with debugging on) I see a lot of errors… so I’ll have to tackle those.

    Question: are you saying if we successfully upgrade to 7.2.7 (where there aren’t 500 errors) then we shouldn’t be seeing this “replacement character”?

    Plugin Author Denis Ryabov

    (@dryabov)

    > are you saying if we successfully upgrade to 7.2.7 then we shouldn’t be seeing this “replacement character”

    No. I’m just trying to guess where this issue may come from.

    A few more questions:

    1) What is the value of the Content-Type header in the website’s response headers? Is it “text/html” or “text/html; charset=utf-8”?

    2) Do you set the page charset using the corresponding meta tag, e.g.
    <meta charset="utf-8"/>
    (just open the page source in Chrome or Firefox using the Ctrl+U shortcut)?
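The two charset checks asked about above can be sketched in Python roughly as follows. Both function names are hypothetical, the regular expression is a simplification rather than a full HTML parser, and the header value and page source are assumed to have been copied by hand from the browser.

```python
import re
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the charset in lower case, or None.
    return msg.get_content_charset()


# Simplified pattern (an assumption, not a real HTML parser): it covers both
# <meta charset="..."> and the older http-equiv form with content="...; charset=...".
_META_CHARSET = re.compile(r"<meta[^>]+charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def charset_from_meta(html: str) -> Optional[str]:
    """Return the charset declared in the first matching meta tag, if any."""
    match = _META_CHARSET.search(html)
    return match.group(1).lower() if match else None
```

If both functions report "utf-8" and the page still shows replacement characters, the declaration is probably fine and the bytes themselves are the more likely culprit.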

    Plugin Author Denis Ryabov

    (@dryabov)

    and

    3) Do you use a third-party caching plugin?

    Thread Starter SJF

    (@sjf)

    1) Content-Type: text/html; charset=UTF-8
    2) Yes

    Thread Starter SJF

    (@sjf)

    It’s also worth noting… just as pagespeed.ninja has it… you can actually see the ? character in the source code. When switching from “Fast simple” to “Standard full” is when the ? appears, and switching it back I can see in the source code that it’s a nbsp

    Thread Starter SJF

    (@sjf)

    (just saw #3)…

    3) Yes, WP Cache, which was active but not on (through its settings there’s an option, we’ve had it off in dev, esp to test this) but I went ahead and deactivated it, and it still has the issue.

    Also I’ve emailed you the public URL to check out.

    Plugin Author Denis Ryabov

    (@dryabov)

    I fixed the pagespeed.ninja sources; there was an issue in the page sources that is not related to the PageSpeed Ninja optimizing engine.

    As to your website, could you disable PageSpeed Ninja, save the page content to a file, and send it to hello-at-pagespeed-dot-ninja?

    Thread Starter SJF

    (@sjf)

    No problem. Sent.

    Thread Starter SJF

    (@sjf)

    Also another interesting find… not sure how related this is (or how it may help)… but when I run a local version of the site via MAMP (Apache), with the same exact PageSpeed Ninja settings, it doesn’t have the ? characters where nbsp is. What gives? :'(

    Plugin Author Denis Ryabov

    (@dryabov)

    In your page sources you have bytes with value 160 (A0 in hex) that are not valid UTF-8. A0 is the non-breaking space in most single-byte code pages, but in UTF-8 it is written as the two-byte sequence C2 A0. This is a well-known issue, and most browsers internally replace A0 with C2 A0, but PHP doesn’t apply this transformation. You can use the “Fast” parser, and we will try to think about what can be done with the other parsers.
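The byte-level difference described above can be reproduced outside PHP. The following is a minimal Python sketch, not what PageSpeed Ninja does internally: it shows that a lone A0 byte decodes to the replacement character under strict UTF-8 rules, and sketches one possible repair that maps stray A0 bytes to a real non-breaking space. The names `repair_lone_nbsp` and `nbsp_fix` are hypothetical.

```python
import codecs

NBSP = "\u00a0"          # non-breaking space, encoded in UTF-8 as C2 A0
REPLACEMENT = "\ufffd"   # the replacement character shown for invalid bytes


def _nbsp_handler(exc):
    """Decode-error handler: turn stray A0 bytes into U+00A0."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Any invalid byte other than A0 still becomes the replacement character.
    fixed = "".join(NBSP if byte == 0xA0 else REPLACEMENT for byte in bad)
    return fixed, exc.end


# "nbsp_fix" is a made-up handler name registered for this sketch only.
codecs.register_error("nbsp_fix", _nbsp_handler)


def repair_lone_nbsp(raw: bytes) -> str:
    """Decode bytes as UTF-8, treating lone A0 bytes as non-breaking spaces."""
    return raw.decode("utf-8", errors="nbsp_fix")


# A lone A0 is invalid UTF-8 and decodes to the replacement character.
broken = b"a\xa0b".decode("utf-8", errors="replace")
# The proper two-byte sequence C2 A0 decodes to a non-breaking space.
valid = b"a\xc2\xa0b".decode("utf-8")
# The repair sketch recovers the non-breaking space from the lone byte.
repaired = repair_lone_nbsp(b"a\xa0b")
```

Valid multi-byte sequences never reach the handler, so correctly encoded text passes through unchanged; only bytes the decoder rejects are rewritten.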

  • The topic ‘Replacement Character (?) where nbsp should be’ is closed to new replies.