• Resolved nhowarth

    (@nhowarth)


    A few weeks ago my self-hosted site was moved to another server. Since then non-Latin characters have not displayed correctly in my posts.

    E.g., if I type ‘Σε σταυροδρόμι φαίνεται να βρίσκεται η αγορά ακινήτων’, when I save a draft of the post it changes to ‘?? ??????????? ???????? ?? ????????? ? ????? ????????’.

    My wp-config.php contains:
    define('DB_CHARSET', 'utf8');
    define('DB_COLLATE', '');

    .htaccess includes:
    AddDefaultCharset UTF-8

    And my Header includes:
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

    I’m using WordPress 5.2.2 and PHP 7.7.2

    I’m using the ‘classic-editor’ plugin, but deactivating it makes no difference.

    Any thoughts on where to look next?

    TIA
    Nigel


Viewing 15 replies - 1 through 15 (of 16 total)
  • Moderator bcworkz

    (@bcworkz)

    I think the charset discrepancy lies in the database. Your data was probably imported into a new DB without the proper charsets specified. Check through the phpMyAdmin app, which is usually accessed through your cPanel. Not only does the entire DB have a charset, but each table has its own, and every text-type column has its own as well. The setting is actually labeled as “collation”, of which the charset is part. Any utf8mb4 collation should be fine. I use utf8mb4_unicode_ci.
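As a rough illustration of that check outside phpMyAdmin, here is a minimal Python sketch. The two queries read the standard `information_schema.TABLES` and `information_schema.COLUMNS` views. Everything else is an assumption for illustration: the helper names are hypothetical, the `%s` placeholder style depends on the database driver you use, and the rows are expected as `(name, collation)` pairs however you fetch them.

```python
from collections import defaultdict

# Standard information_schema queries; run them with whatever MySQL client
# or driver you have. The %s placeholder style is an assumption.
TABLE_COLLATION_QUERY = """
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = %s
"""

COLUMN_COLLATION_QUERY = """
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = %s AND COLLATION_NAME IS NOT NULL
"""


def group_by_collation(rows):
    """Group (table_name, collation) rows into {collation: [table names]}."""
    groups = defaultdict(list)
    for table_name, collation in rows:
        groups[collation].append(table_name)
    return dict(groups)


def tables_needing_attention(rows, wanted_prefix="utf8"):
    """Return the tables whose collation does not start with wanted_prefix."""
    return sorted(
        name for name, collation in rows
        if not (collation or "").startswith(wanted_prefix)
    )


# Hypothetical rows, shaped like the result of TABLE_COLLATION_QUERY.
sample_rows = [
    ("wp_posts", "latin1_swedish_ci"),
    ("wp_options", "utf8_general_ci"),
    ("wp_comments", "latin1_swedish_ci"),
]
print(group_by_collation(sample_rows))
print(tables_needing_attention(sample_rows))
```

The column-level query matters because a table's default collation only applies to columns created without their own setting, so a table can report utf8 while individual text columns are still latin1.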

    Thread Starter nhowarth

    (@nhowarth)

    Thanks for getting back to me – I’ve checked the database and the collation is ‘utf8mb4_unicode_ci’.

    But the tables in the actual database, which are all MyISAM, have two collations: ‘utf8_general_ci’ (with charset ‘utf8’) and ‘latin1_swedish_ci’ (with charset ‘latin1’). Could this be the problem?

    Regards
    Nigel

    Moderator bcworkz

    (@bcworkz)

    Yes it could. latin1 char mapping for Greek letters is nowhere close to utf8 mapping. You can’t simply change the collation because SQL will try to re-map the character encoding. Presumably all the data is utf8 despite the latin1 collation, so re-mapping will not end well.
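To make the mapping problem concrete, here is a small Python sketch using the sample sentence from the original post. It is only an approximation: Python's `latin-1` codec is ISO-8859-1, whereas MySQL's `latin1` is closer to Windows-1252, so the exact bytes on a real server may differ. The variable names are illustrative.

```python
import unicodedata

# Sample text from the original post, normalised to precomposed (NFC) form.
greek = unicodedata.normalize(
    "NFC", "Σε σταυροδρόμι φαίνεται να βρίσκεται η αγορά ακινήτων"
)

# 1) latin1 has no Greek letters, so a conversion to it can only
#    substitute a placeholder ("?") for each one.
as_latin1 = greek.encode("latin-1", errors="replace").decode("latin-1")
print(as_latin1)

# 2) UTF-8 bytes sitting in a column labelled latin1: if the server
#    "converts" them to UTF-8, every byte is treated as one latin1
#    character and encoded again, which roughly doubles the Greek bytes.
utf8_bytes = greek.encode("utf-8")
misread = utf8_bytes.decode("latin-1")
double_encoded = misread.encode("utf-8")
print(len(utf8_bytes), len(double_encoded))

# 3) Relabelling without converting leaves the bytes alone, so the
#    original text is still recoverable.
recovered = misread.encode("latin-1").decode("utf-8")
print(recovered == greek)
```

Case 1 resembles the question marks seen when saving a draft, and case 2 is the re-mapping risk described above. Case 3 is why a route that changes only the declared collation, and not the stored bytes, is the safer one.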

    I’m not even 100% sure changing the collation will fix the problem, but in any case all WP data should normally be utf8. Making wholesale changes to a DB table is always risky. What I will suggest is reasonably safe, but there is some risk. Proceed carefully.

    Export the table with the latin1 collation, using the SQL format defaults. Save the file somewhere safe as a backup. Make a copy of that file and open it in a text editor. One of the first entries in the file is a CREATE TABLE query. Change all COLLATE arguments in this query to ‘utf8mb4_unicode_ci’. Save the changes. Importing this file will use the proper collations without re-mapping the data.

    Rename the original table using a different prefix, such as “old_”. Import the edited SQL file. You should be able to save Greek titles now. After you’ve thoroughly verified there are no ill effects from altering the table this way, it’s probably safe to drop the original “old_” prefixed table. Keep the original export file in a safe place just in case an unnoticed issue surfaces. Of course, at some point after enough changes to the table have been made, even this backup becomes superfluous.
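The text-editor step above could also be scripted. The following is a minimal Python sketch, not a tested migration tool: the function name `retarget_create_tables` is hypothetical, and it assumes a mysqldump-style file with backtick-quoted identifiers and no semicolons inside the CREATE TABLE statements. It treats the two kinds of clause separately, because `CHARSET` / `CHARACTER SET` expect a charset name (such as `utf8`) while `COLLATE` expects a collation name (such as `utf8_general_ci`). Putting a collation name in a charset position is one plausible way to get an "Unknown character set" error on import.

```python
import re

_FLAGS = re.IGNORECASE


def retarget_create_tables(sql_dump, charset, collation):
    """Rewrite charset/collation names inside CREATE TABLE statements only.

    INSERT statements (the actual data) are left untouched.
    Assumes identifiers are backtick-quoted, as mysqldump does by default.
    """
    def fix_statement(match):
        stmt = match.group(0)
        # CHARSET=... and CHARACTER SET ... take a charset name.
        stmt = re.sub(
            r"\b(CHARSET|CHARACTER SET)(\s*=\s*|\s+)\w+",
            lambda m: m.group(1) + m.group(2) + charset,
            stmt, flags=_FLAGS,
        )
        # COLLATE ... takes a collation name.
        stmt = re.sub(
            r"\b(COLLATE)(\s*=\s*|\s+)\w+",
            lambda m: m.group(1) + m.group(2) + collation,
            stmt, flags=_FLAGS,
        )
        return stmt

    # Each CREATE TABLE statement runs up to its first semicolon.
    return re.sub(r"CREATE TABLE.*?;", fix_statement, sql_dump,
                  flags=_FLAGS | re.DOTALL)


# Hypothetical excerpt of an exported table.
sample_dump = (
    "CREATE TABLE `wp_posts` (\n"
    "  `post_title` text CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL\n"
    ") ENGINE=MyISAM DEFAULT CHARSET=latin1;\n"
    "INSERT INTO `wp_posts` VALUES ('CHARSET=latin1');\n"
)
print(retarget_create_tables(sample_dump, "utf8", "utf8_general_ci"))
```

Run it on the copy of the export, never on the backup itself. Whether to pass `utf8mb4` / `utf8mb4_unicode_ci` or `utf8` / `utf8_general_ci` depends on what the server supports; older MySQL versions do not know utf8mb4.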

    Thread Starter nhowarth

    (@nhowarth)

    Thanks bcworkz – let’s hope it works.

    Regards

    Nigel

    Moderator bcworkz

    (@bcworkz)

    You’re welcome. Let us know how it goes.

    Thread Starter nhowarth

    (@nhowarth)

    Unfortunately bcworkz it failed. When I tried to import the tables, phpMyAdmin reported:

    #1115 – Unknown character set ‘utf8mb4_unicode_ci’

    Thread Starter nhowarth

    (@nhowarth)

    I may have cracked it. I used ‘utf8’ rather than ‘utf8mb4_unicode_ci’.

    Created a new database and imported the export from the live database. The tables were shown with COLLATION utf8_general_ci.

    I’ll try this on my live database & keep you posted.

    Thread Starter nhowarth

    (@nhowarth)

    That was partly successful. I could read my posts and they all looked OK, but I couldn’t log into the WP dashboard (it took me to my home page). And the home page layout was screwed up.

    Thread Starter nhowarth

    (@nhowarth)

    I’ve made some progress and I think I’m on the right track. I’ve changed the collation in the following tables to utf8:

    wp_commentmeta
    wp_comments
    wp_links
    wp_options
    wp_postmeta
    wp_posts
    wp_terms
    wp_termmeta
    wp_term_relationships
    wp_term_taxonomy

    and can now log in – and the site is looking fine. I can edit and publish posts, BUT when I create a new post and try to save a draft, it causes an error in the Yoast SEO plugin.

    I’ll progress this over the next few days/weeks when I get the time.

    Thanks for your help bcworkz

    Regards
    Nigel

    Moderator bcworkz

    (@bcworkz)

    You’re welcome. Wow, that really turned into quite the process! I’m glad your site is working. If you’ve not already done so, maybe try completely removing Yoast through the plugin admin panel (not by FTP), then re-installing it. Be sure your WP is up to date too.

    Thread Starter nhowarth

    (@nhowarth)

    Removing Yoast SEO & re-installing it did the trick.

    I don’t know how my database got in this mess. There are still 37 tables with Collation ‘latin1_swedish_ci’ – plus a further 19 tables with Collation ‘utf8_bin’.

    Now everything seems to be running normally, I’ll leave it for a few days and work my way through the other latin1s.

    Thanks again
    Nigel

    Moderator bcworkz

    (@bcworkz)

    Cool! (about Yoast, not 37 tables) I had no idea why that might work, but it seemed like something to try. The power cycle fix for plugins that everyone forgets to try before calling tech support.

    Yikes! about tables. It seems like some DB distros defaulted to Latin1 Swedish when no collation was specified. My current MariaDB does not, but I know I’ve seen such behavior in the past. And why Swedish of all things? I guess Swedish collation would work for most Latin alphabet languages. It must have made sense before UTF-8 became common.

    Anyway, I’m glad things are working well. You’re most welcome.

    Thread Starter nhowarth

    (@nhowarth)

    Unfortunately it didn’t work. The following morning I couldn’t log in to the WP backend or view the site.

    I’m going to export the site and set up a test site where I can play around to fix the problem.

    Moderator bcworkz

    (@bcworkz)

    Oh no!! That’s entirely unexpected. It may be unrelated to collation settings. I can imagine how output might be corrupted or how your password could need resetting, but completely fail? Something else must be going on. It may sound like I’m trying to deflect blame from offering bad advice, but it’s not the case. I’ll own up to offering bad advice if that’s really the case, but I really don’t see how that could be. In the past I have changed other DBs’ collation from Swedish to UTF-8 with no ill effects.

    Not that it matters, but I was reminded yesterday by another forum regular that MySQL was originally developed by a small Swedish outfit. Thus the reason it defaults to Swedish collation.

    Thread Starter nhowarth

    (@nhowarth)

    Hi bcworkz

    You haven’t been offering bad advice – you set me on the right track to crack the problem.

    I have suspicions that the problem occurred when my hosting company moved the database to a new server. (A friend offered to host my site on their cloud server. I checked earlier today and the WP database is OK).

  • The topic ‘CHARSET non-Latin character problems on new server’ is closed to new replies.