Viewing 3 replies - 1 through 3 (of 3 total)
  • Moderator bcworkz

    (@bcworkz)

    Whether HTML content is saved as entity codes or actual Cyrillic letters doesn’t affect how the content appears in the browser. The entity codes were likely copy/pasted from another source. WP would not normally convert Cyrillic letters into entity codes since all content is UTF-8 by default.

    There are online converters that will convert entity codes to UTF-8, but I don’t think it’d be worth the bother.
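
    For anyone who does want to convert stored content, here is a minimal Python sketch of what such a converter might do. It is not WordPress code, and the function name `decode_non_ascii_refs` is made up for illustration. It assumes the content uses numeric references like `&#1055;` and deliberately leaves ASCII references and named entities such as `&amp;` alone so the markup stays valid.

```python
import re

# Matches decimal (&#1055;) and hexadecimal (&#x41F;) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_non_ascii_refs(markup: str) -> str:
    """Replace numeric character references above ASCII with real characters.

    Hypothetical helper for illustration only.
    """

    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Keep ASCII references (e.g. &#60; for "<") so the HTML is not altered.
        if code_point < 128:
            return match.group(0)
        # Keep anything that is not a valid, printable Unicode scalar value.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        return chr(code_point)

    return _NUMERIC_REF.sub(replace, markup)


if __name__ == "__main__":
    stored = "<p>&#1055;&#1088;&#1080;&#1074;&#1077;&#1090; &amp; &#60;b&#62;</p>"
    print(decode_non_ascii_refs(stored))
```

    Try it on a copy of the content first. The page will look the same in the browser before and after.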

    Thread Starter Ferrum-man

    (@russian-man)

    Thanks
    Are you saying that the Unicode is not an error on the site's part, and that search engines will index the content normally?
    The text is displayed correctly in Russian on the page. It only appears as Unicode in the page source viewed in the browser.

    Moderator bcworkz

    (@bcworkz)

    WordPress does not normally generate HTML entities (“Unicode”) for Cyrillic and other non-Latin characters, because UTF-8 can represent those characters directly. There would be no need to do so. In theory some theme or plugin might make such conversions, but I cannot imagine why any would actually do so.

    I’m unsure how some content became HTML encoded. I’m guessing it was copied from somewhere it was already encoded. Maybe it was copied from old web content from before UTF-8 became common? Back then HTML entities were necessary for text to display properly around the world.

    However it came to be, it will not affect search indexing. Search bots know how to decode HTML entities just as well as browsers do.
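
    To illustrate the point, here is a small sketch using Python's standard `html.parser` module. It only shows how one HTML parser behaves and is not how any particular search engine is implemented. The `TextExtractor` class and `visible_text` function are names made up for this example. Both forms of the markup yield the same text once parsed.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text content of a document, ignoring the tags."""

    def __init__(self) -> None:
        # convert_charrefs=True makes the parser decode entities in text itself.
        super().__init__(convert_charrefs=True)
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def visible_text(markup: str) -> str:
    """Return the text a parser sees in the markup (illustrative helper)."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


if __name__ == "__main__":
    as_entities = "<p>&#1056;&#1091;&#1089;&#1089;&#1082;&#1080;&#1081;</p>"
    as_utf8 = "<p>Русский</p>"
    # Both forms decode to the same text.
    print(visible_text(as_entities) == visible_text(as_utf8))
```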

  • The topic ‘Site does not see utf-8 and makes unicode from Russian letters’ is closed to new replies.