• Resolved nicole2292

    (@nicole2292)


    I am having a lot of trouble changing the encoding of an existing WordPress site. The site was mostly in English (Latin characters), but we now need to include many other languages, such as Thai and Chinese.

    I have changed the encoding in the wp-config file from
    define( 'DB_CHARSET', 'utf8mb4' );
    to
    define( 'DB_CHARSET', 'utf8mb4_unicode_520_ci' );

    This now allows me to save the foreign characters. That is working perfectly.

    I have also changed the collation of all tables in the database to utf8mb4_unicode_520_ci.

    However, this change is causing issues for the existing English content.

    In the English content I now see a question mark in a black diamond in place of many (but not all) spaces, and in place of apostrophes and quotation marks.

    The content in the two different formats appears like this:

    Using define( 'DB_CHARSET', 'utf8mb4_unicode_520_ci' );

    
    which?lies approximately 50-minutes from central Bangkok, is one of Thailand?s premier golfing destinations. The land at?the members only?</p>
    

    Using define( 'DB_CHARSET', 'utf8mb4' );

    
    which&nbsp;lies approximately 50-minutes from central Bangkok, is one of Thailand’s premier golfing destinations. The land at&nbsp;the members only&nbsp;
    

    The apostrophes and quotation marks are easily fixed with a search and replace on the database. The spaces, however, are still a problem, because I don’t know what to search for in the database in order to replace the “different” spaces.

    A search for &nbsp; returns no results as this is not actually stored in the database. So what can I do to replace all the “BAD” space characters with normal space characters?
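One way to find out what to search for is to list the code points of any space-like characters that are not a plain space. The following is a minimal sketch in Python, not part of the original post; the function name `find_odd_spaces` and the sample string are hypothetical, and the sample assumes the invisible character is a non-breaking space (U+00A0), which the `&nbsp;` in the second rendering suggests.

```python
import unicodedata


def find_odd_spaces(text):
    """Return (index, code point, Unicode name) for every character in the
    Unicode "space separator" category (Zs) that is not a plain U+0020 space."""
    hits = []
    for index, ch in enumerate(text):
        if ch != " " and unicodedata.category(ch) == "Zs":
            hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical sample: assumes the first "space" is really U+00A0.
sample = "which\u00a0lies approximately 50-minutes from central Bangkok"

for hit in find_odd_spaces(sample):
    print(hit)
```

Running this over text copied from an affected post would show whether the “BAD” space is U+00A0 or some other space-like character.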

    Please note that I have already tried

    
    remove_filter('the_content', 'wptexturize');
    remove_filter( 'the_content', 'wpautop' );
    

    and neither resolves this issue.

    I have also tried the UTF8 sanitise plugin, and that did not help either.

    Lastly, I have tried exporting the entire database, and there appears to be no difference in the .sql file between the normal spaces and the “BAD” spaces. However, even after exporting and reimporting the same .sql file the problem persists, so there must be a difference between the space characters.
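The two kinds of space look identical in a text editor but differ at the byte level. The sketch below (Python, not from the original post) shows the difference and one plausible way the black diamond can arise. It is an assumption, not something confirmed in this thread, that the old content holds single-byte Latin-1/Windows-1252 characters that end up being read as UTF-8; the helper name `read_as_utf8` is hypothetical.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, what a browser displays for &nbsp;

# The two "spaces" are different bytes, even though they look the same.
print(" ".encode("utf-8").hex())     # plain space
print(NBSP.encode("utf-8").hex())    # non-breaking space in UTF-8
print(NBSP.encode("latin-1").hex())  # non-breaking space in Latin-1


def read_as_utf8(raw):
    """Decode bytes as UTF-8, substituting U+FFFD (the black-diamond
    question mark) for any byte sequence that is not valid UTF-8."""
    return raw.decode("utf-8", errors="replace")


# Assumption: the text was stored as single-byte Latin-1/Windows-1252.
space_case = read_as_utf8("at\u00a0the".encode("latin-1"))
quote_case = read_as_utf8("Thailand\u2019s".encode("cp1252"))
print(space_case)
print(quote_case)
```

Under that assumption, both the odd spaces and the curly apostrophes turn into U+FFFD, which matches the symptoms described above. Viewing the .sql export in a hex viewer rather than a text editor would show which bytes are actually stored.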

    I’m completely stumped.

    Thanks for your help.

    • This topic was modified 4 years, 4 months ago by Jan Dembowski. Reason: Moved to Fixing WordPress, this is not a Developing with WordPress topic
Viewing 1 replies (of 1 total)
  • Thread Starter nicole2292

    (@nicole2292)

    SOLVED MYSELF:

    The solution was indeed to do a search and replace on the database searching for a “BAD” space character and replacing it with a “GOOD” space character.

    Please note that both space characters looked exactly like a normal space.

    I used the “Better Search Replace” WordPress plugin to run a search and replace over the entire database.

    I went into the database with phpMyAdmin and opened some affected content for editing. I then copied a space character that I knew was a “BAD” space (one that showed as ? on the front end of the website) and pasted it into the “Search for” field in Better Search Replace.

    I then copied a normal space which I knew would show correctly in the front end and pasted it into the “Replace with” field in Better Search Replace.

    I ran the search and replace across the entire database and it worked to remove all of the ? which should have been spaces.
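For illustration, the same replacement can be sketched in a few lines of Python. This is not what the plugin does internally; it assumes the “BAD” space is a non-breaking space (U+00A0), and the names `BAD_SPACE`, `GOOD_SPACE` and `replace_bad_spaces` are hypothetical.

```python
# Assumption: the "BAD" space is U+00A0 (NO-BREAK SPACE).
BAD_SPACE = "\u00a0"
GOOD_SPACE = " "  # ordinary U+0020 space


def replace_bad_spaces(text: str) -> str:
    """Replace every non-breaking space with an ordinary space."""
    return text.replace(BAD_SPACE, GOOD_SPACE)


before = "The land at\u00a0the members only\u00a0"
after = replace_bad_spaces(before)

# repr() makes the otherwise invisible difference visible.
print(repr(before))
print(repr(after))
```

A plain text replacement like this is only safe on ordinary text. In UTF-8 a non-breaking space takes two bytes and a normal space takes one, so replacing it inside PHP-serialized values would invalidate their stored string lengths. That is a good reason to use a serialization-aware tool, as was done here, rather than editing a raw .sql dump.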

    So it turns out a space is not a space, even when it looks just like a space.

    • This reply was modified 4 years, 4 months ago by nicole2292.
  • The topic ‘Character Encoding of spaces in wordpress: ? shows as ? in black diamond’ is closed to new replies.