Viewing 8 replies - 1 through 8 (of 8 total)
  • I filed this upstream with php-openid:

    https://github.com/openid/php-openid/issues/108

    No movement lately, though. But I think just replacing that whole bit with a call to html_entity_decode() might well work these days.

    Well, it seems to work for me in a very quick test. Try just turning the replaceEntities function into this:

    function replaceEntities($str)
    {
        // Decode named and numeric HTML entities with the built-in function.
        $str = html_entity_decode($str);
        return $str;
    }

    Thread Starter Chaos Engine

    (@chaos-engine)

    Yup it does work for me too.
    html_entity_decode would be my first bet too; however, that complicated spaghetti regex code was there for a reason. Dunno what, though. Maybe some peculiar UTF cases?
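
    If the regex approach did turn out to be needed, the deprecated /e modifier could still be avoided with preg_replace_callback. Here is a rough sketch of that idea. The function names, the pattern, and the choice to emit UTF-8 are all my own assumptions for illustration; this is not the php-openid code.

```php
<?php
// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
function codepointToUtf8($cp)
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical replacement for a /e-based numeric entity decoder.
function replaceNumericEntities($str)
{
    return preg_replace_callback(
        // Matches hexadecimal (&#x41;) and decimal (&#65;) references.
        '~&#(x[0-9a-f]{1,6}|[0-9]{1,7});~i',
        function ($m) {
            $ref = $m[1];
            $cp = ($ref[0] === 'x' || $ref[0] === 'X')
                ? hexdec(substr($ref, 1))
                : (int) $ref;
            // Leave out-of-range and surrogate code points untouched.
            if ($cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
                return $m[0];
            }
            return codepointToUtf8($cp);
        },
        $str
    );
}
```

    The callback receives the match array directly, so no code string is ever evaluated, which is the thing the /e deprecation is about. Named entities are deliberately left alone in this sketch.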

    Chaos Engine: I think it’s there for the reason stated in the comment:

    // Replace numeric entities because html_entity_decode doesn’t
    // do it for us.

    But the thing is, it’s an extremely old bit of code, if you look at the git history, and I don’t think that’s true any more. i.e., what I think is going on is that they wrote this code because at the time html_entity_decode did not do what they needed, but now html_entity_decode *does* do what they needed and it’s safe to just replace the custom function with a call to html_entity_decode.

    https://ca3.php.net/html_entity_decode explicitly states: “More precisely, this function decodes all the entities (including all numeric entities) that a) are necessarily valid for the chosen document type — i.e., for XML, this function does not decode named entities that might be defined in some DTD — and b) whose character or characters are in the coded character set associated with the chosen encoding and are permitted in the chosen document type. All other entities are left as is.” (emphasis mine)
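
    A quick way to check the documented behaviour on a given PHP install is a snippet along these lines. The sample string is made up for illustration, and I pass the flags and encoding explicitly because the defaults have changed between PHP versions.

```php
<?php
// Made-up input: one decimal entity, one hexadecimal entity, one named entity.
$input = 'caf&#233; &#x263A; &amp; more';

// ENT_QUOTES and UTF-8 are given explicitly rather than relying on defaults.
$decoded = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

echo $decoded, "\n";
```

    If numeric entities are handled, none of the "&#...;" sequences should survive in the output.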

    Of course if I was upstream I’d want to verify this and then patch it up more cleanly by dropping the custom function entirely and replacing all calls to it with calls to html_entity_decode, rather than keeping the custom function as an unnecessary wrapper.

    One thing to note, though, is that entity replacement doesn’t seem to be needed all the time. I threw some debug prints into my change and watched the output for a while, and I didn’t see any case where the entity replacement did anything while logging in through two or three different OpenID providers: none of the strings that were run through the replaceEntities function had any entities to replace, so each string was just passed through unmodified. I wasn’t able to find an OpenID provider that put HTML entities (especially numeric ones) in the strings passed through replaceEntities, so I couldn’t verify that it was working as intended.
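
    For anyone who wants to repeat that kind of check, the debug prints could look roughly like this. The wrapper name and log message are hypothetical, and error_log is just one convenient place to send the output.

```php
<?php
// Hypothetical wrapper for temporary debugging only.
function replaceEntitiesLogged($str)
{
    $decoded = html_entity_decode($str, ENT_QUOTES, 'UTF-8');

    // Only log inputs that actually contained decodable entities.
    if ($decoded !== $str) {
        error_log('replaceEntities changed: ' . $str . ' => ' . $decoded);
    }

    return $decoded;
}
```

    If nothing ever shows up in the log during a few logins, the strings are passing through unmodified.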

    I’m trying to track the history of both php-yadis and the html_entity_decode function. It gets a bit obscure, though. There’s been some numeric entity capability in PHP 5 since at least 5.0.0beta1:

    https://php.net/ChangeLog-5.php

    “Added missing multibyte (unicode) support and numeric entity support to html_entity_decode(). (Moriyoshi)”

    PHP 4.4’s html_entity_decode function hasn’t been touched since 2003. unescape_html_entities has one commit in early 2006:

    https://github.com/php/php-src/commit/63251c4c5e66e9a1297dcfe64c73149c69f72875

    I can’t tell if that affects this or not. If someone held a gun to my head and made me guess, I’d guess that possibly this hack is still needed on PHP 4 but I very much doubt whether it’s needed on PHP 5 of any description.

    The good news is php-openid has a test suite which includes some testing of the code where replaceEntities is used, so if I can get the test suite running I should be able to test whether it passes with html_entity_decode instead. The patch itself is trivial: ParseHTML’s replaceEntities is only used in the getMetaTags function in the same source file (though there’s a similar replaceEntities function in the Auth/OpenID tree which could probably also stand replacing).

    https://github.com/php/php-src/commit/6ed4fd1666596b8e0a182d049a30bc63d8a554b1 is the commit that added numeric entity support to PHP 5, back in 2003.

    Thread Starter Chaos Engine

    (@chaos-engine)

    Myself, I haven’t noticed any strangeness with the simple html_entity_decode approach. However, I haven’t done much testing on my side. Let’s hope this goes upstream.

    https://github.com/openid/php-openid/pull/114

    I sent my change upstream, and I’m pretty sure it’s correct now. Apply that to the WordPress plugin’s bundled lib/Auth/Yadis/ParseHTML.php and it should work.

  • The topic ‘preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead’ is closed to new replies.