• Resolved lipskas

    (@lipskas)


    For months I imported tables from JSON and HTML files with PHP 8.0, and it worked fine. However, with PHP 8.1, all “foreign” characters in the HTML file are replaced with unknown symbols. If I view the HTML file in a browser, all characters display correctly, as always.

    I even tried adding <meta charset="utf-8"> to the HTML file; no luck. JSON works fine with both PHP 8.0 and 8.1. Any suggestions?

Viewing 6 replies - 1 through 6 (of 6 total)
  • Plugin Author Tobias Bäthge

    (@tobiasbg)

    Hi,

    thanks for your post, and sorry for the trouble.

    I’m not sure what could be causing this and haven’t seen differences myself, but my guess is that PHP 8.1 on your server is not loading certain PHP modules, such as iconv or libxml.

    Can you also test this with the development version of TablePress 2.0 from https://tablepress.org/8-million-downloads-tablepress-2-0/ to see if that makes a difference?

    And can you maybe provide an example HTML file with which you have this problem?

    Regards,
    Tobias

    Thread Starter lipskas

    (@lipskas)

    First of all, it’s no trouble at all. I realize this is a free product and you do your best to help users, for which I’m thankful.

    My first thought was also about missing modules in PHP. But I did a strict comparison between my PHP 8.0 and 8.1 installations, and they load exactly the same modules.

    Like I mentioned, JSON continues to work fine. Also, if I add text to a table manually (using the “Edit Table” feature in the plugin), all foreign characters are displayed fine. The website itself is non-English too, and WP displays all foreign characters (even on the same page where the table is inserted) just fine. Only imported HTML content is garbled.

    Here’s a sample file – https://ufile.io/1cypxiej
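A matching module list does not necessarily mean both installations link the same libxml build or use the same charset setting. The following is a minimal sketch, not something posted in this thread, that could be run under both PHP versions to compare them. The function name `encoding_report()` and the list of extensions are assumptions chosen for illustration.

```php
<?php
// Hypothetical diagnostic helper; the name and the extension list are
// assumptions for illustration, not part of TablePress.
function encoding_report(): array {
    $extensions = array();
    foreach ( array( 'dom', 'libxml', 'simplexml', 'iconv', 'mbstring' ) as $name ) {
        // extension_loaded() reports whether the module is active in this PHP.
        $extensions[ $name ] = extension_loaded( $name );
    }

    return array(
        'php_version'     => PHP_VERSION,
        'extensions'      => $extensions,
        // The constant only exists when the libxml extension is loaded.
        'libxml_version'  => defined( 'LIBXML_DOTTED_VERSION' ) ? LIBXML_DOTTED_VERSION : 'n/a',
        // Charset that PHP string functions fall back to when none is passed.
        'default_charset' => ini_get( 'default_charset' ),
    );
}

print_r( encoding_report() );
```

Running the script under PHP 8.0 and PHP 8.1 on the same server and diffing the two outputs would show whether the libxml version or `default_charset` differs, even when the module names are identical.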

    Plugin Author Tobias Bäthge

    (@tobiasbg)

    Hi,

    thanks for the file, I’ll run some tests with that!

    Whether JSON and the manual editing work has no influence here. These use different PHP functions.
    The HTML import uses e.g.
    https://www.php.net/DOMDocument
    and
    https://www.php.net/simplexml_import_dom
    so any issues must be coming from that somewhere.
    TablePress does set UTF-8 at least twice during the import (see the function at https://github.com/TablePress/TablePress/blob/main/libraries/html-parser.class.php#L32 ), so I really don’t have ideas here right now…

    Regards,
    Tobias
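To make the `DOMDocument` and `simplexml_import_dom` path concrete, here is a minimal sketch of loading UTF-8 HTML with an explicit encoding hint. It is not the TablePress parser, only an illustration of the two functions named above. The helper name `import_cells()` and the sample markup are hypothetical. Prepending an XML encoding declaration is a common workaround, since libxml's HTML parser may assume a single-byte encoding when the markup declares none.

```php
<?php
// Hypothetical helper, not taken from TablePress: returns the text of all
// <td> cells found in an HTML string that is assumed to be UTF-8 encoded.
function import_cells( string $html ): array {
    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors( true );

    $dom = new DOMDocument( '1.0', 'UTF-8' );
    // The leading declaration tells libxml to treat the input as UTF-8.
    $dom->loadHTML( '<?xml encoding="UTF-8">' . $html );

    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    $xml   = simplexml_import_dom( $dom );
    $cells = array();
    foreach ( $xml->xpath( '//td' ) as $td ) {
        $cells[] = (string) $td;
    }
    return $cells;
}

// Sample input made up for this sketch; not the file from this thread.
$cells = import_cells( '<table><tr><td>ąčęėįšųūž</td><td>abc</td></tr></table>' );
var_dump( $cells[0] === 'ąčęėįšųūž' );
```

If the comparison yields `true` under one PHP version and `false` under the other, that would point at the libxml build or configuration rather than at the plugin.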

    Thread Starter lipskas

    (@lipskas)

    As far as I know, there were some changes in PHP 8.1 regarding HTML entities ( https://php.watch/versions/8.1/html-entity-default-value-changes ) and encoding.

    Not sure if it’s the case, but https://github.com/TablePress/TablePress/blob/main/libraries/html-parser.class.php seems to use the html_entity_decode() function, which was affected by 8.1.
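For context on the change mentioned above: PHP 8.1 changed the default `$flags` of `html_entity_decode()` and related functions from `ENT_COMPAT` to `ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401`. This mainly affects how single quotes are handled; the default encoding still comes from the `default_charset` ini setting. The sketch below shows one way to make a call independent of those defaults by passing flags and charset explicitly. The helper name `decode_entities()` is hypothetical, and whether TablePress calls the function this way is not confirmed here.

```php
<?php
// Hypothetical helper, not part of TablePress.
function decode_entities( string $text ): string {
    // Explicit flags and charset, so the result does not depend on the
    // defaults of the running PHP version or on default_charset.
    return html_entity_decode( $text, ENT_QUOTES | ENT_HTML5, 'UTF-8' );
}

// Sample string made up for this sketch.
var_dump( decode_entities( '&#039;&scaron;&#039; &ndash; &eacute;' ) );

// With default arguments, the single-quote entity is the part that is
// treated differently by PHP 8.0 and PHP 8.1.
var_dump( html_entity_decode( '&#039;' ) );
```

Comparing the second `var_dump()` under both PHP versions shows the effect of the changed default, while the first call should behave the same on both.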

    Plugin Author Tobias Bäthge

    (@tobiasbg)

    Hi,

    that could indeed be a possibility here. Thanks for sharing this!
    I’ll run a few tests with your HTML file and look into this as soon as I have time in the next couple days (I’ll be traveling over the weekend, so I can’t promise a speedy response here, sorry).

    Regards,
    Tobias

    Plugin Author Tobias Bäthge

    (@tobiasbg)

    Hi,

    OK, I tried to replicate this problem with your test file, but was not successful.

    Neither in my local test setup (using Docker containers) where I can use different versions of PHP, nor on https://tastewp.com/ (a free service where you can spin up temporary WordPress sites, for which the PHP version can also be chosen, via the “Advanced Setup”) was I able to replicate this.

    The HTML file imported fine in all cases, both with PHP 8.0 and 8.1, and the UTF-8 characters worked perfectly fine…

    I really can only assume that this is some local problem with PHP 8.1 on your server…

    Regards,
    Tobias

  • The topic ‘PHP 8.1 – Wrong Encoding Importing Table from HTML’ is closed to new replies.