• The function wp_update_post appears to strip the css id attribute. Can anyone provide leads as to what might really be going on?

    The following code (when run from the cygwin command line) prints <div>bar</div> instead of <div id="foo">bar</div>, which suggests to me that wp_update_post is stripping the id attribute.

    The value of $page_id is the page id of an existing page (the home page), and the code fragment leading up to wp_update_post is stolen from the example section on the reference page for wp_update_post. The code updates the page content, reads it back, and gets the unexpected content.


    <?php

    set_include_path(get_include_path() . PATH_SEPARATOR . "./wordpress");
    require_once("wp-load.php");

    $page_id = 6878; /* page id for existing home page */

    $my_post = array();
    $my_post['ID'] = $page_id;
    $my_post['post_content'] = '<div id="foo">bar</div>';

    wp_update_post( $my_post );

    $page = get_page($page_id);
    echo $page->post_content . "\n";

    ?>

    I am running Windows XP SP3 with Apache 2.2 and PHP (Zend Engine v2.3.0). I am running WordPress 3.0.2 with plugins that include “tiny mce advanced” and “w3 total cache”. I’ve tried googling phrases including “wp_update_post, strip, css”. I’ve tried reading the code for wp_update_post, wp_insert_post, sanitize_post, etc. I’ve checked the value of $my_post before and after invoking wp_update_post, and I’ve confirmed that the value returned by wp_update_post is the page id and not 0.

    Thanks,
    Mark

Viewing 6 replies - 1 through 6 (of 6 total)
  • Thread Starter Mark Tuttle

    (@markrtuttle)

    FWIW, same behavior after deactivating all plugins and upgrading to WordPress 3.0.4.

    Thread Starter Mark Tuttle

    (@markrtuttle)

    The problem is in the KSES module (“kses strips evil scripts!”) in wp-includes/kses.php. This is an html filtering mechanism that strips malformed elements, attributes, and values with the intention of avoiding problems like the security vulnerability known as a cross-site scripting attack. KSES is described more completely here and here.

    The problem is that the default definition of $allowedposttags says that the ‘div’ element can have a ‘class’ attribute but not an ‘id’ attribute. So KSES is stripping the ‘id’ attribute given in the example above. Indeed, adding ‘id’ as an allowable attribute solves the problem.
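
    A minimal sketch of how one might confirm this diagnosis. The helper name demo_attr_is_allowed and the $sample_allowed array are hypothetical, and the sample is only an illustrative excerpt shaped like $allowedposttags as described above, not the full default list.

```php
<?php
// Hypothetical helper (not part of WordPress): reports whether a KSES-style
// allowlist permits a given attribute on a given tag.
function demo_attr_is_allowed( array $allowed_tags, $tag, $attr ) {
    $tag  = strtolower( $tag );
    $attr = strtolower( $attr );

    return isset( $allowed_tags[ $tag ] )
        && is_array( $allowed_tags[ $tag ] )
        && array_key_exists( $attr, $allowed_tags[ $tag ] );
}

// Illustrative excerpt only: 'div' may carry 'class' but not 'id'.
$sample_allowed = array(
    'div' => array( 'align' => array(), 'class' => array() ),
);

var_dump( demo_attr_is_allowed( $sample_allowed, 'div', 'class' ) ); // bool(true)
var_dump( demo_attr_is_allowed( $sample_allowed, 'div', 'id' ) );    // bool(false)
```

    With WordPress loaded, passing $GLOBALS['allowedposttags'] instead of the sample array should show what the running installation actually allows.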

    It appears possible to override this behavior by setting
    define('CUSTOM_TAGS',false); and giving your own values for $allowedposttags, $allowedtags, and $allowedentitynames. I’m not sure where you would set these, perhaps in your theme’s functions.php, although fixing this should be independent of your theme.

    But why would any html filtering mechanism allow an element like ‘div’ or ‘td’ to have one of the attributes ‘class’ or ‘id’ and not the other? Since the values of these attributes are always interpreted as strings or identifiers, how could they contribute to a security vulnerability?

    Shouldn’t it be an invariant that where ‘class’ is allowed ‘id’ is also allowed, and vice versa? Who should I be writing to to get this fixed in the next release of WordPress?
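
    Until the defaults change, the proposed invariant could be enforced locally. This is a sketch under assumptions: demo_allow_id_where_class is a hypothetical helper, and it only rewrites an array shaped like $allowedposttags; it says nothing about whether allowing 'id' everywhere is appropriate for a given site.

```php
<?php
// Hypothetical helper (not part of WordPress): returns a copy of a KSES-style
// allowlist in which every tag that allows 'class' also allows 'id'.
function demo_allow_id_where_class( array $allowed_tags ) {
    foreach ( $allowed_tags as $tag => $attrs ) {
        if ( is_array( $attrs )
            && array_key_exists( 'class', $attrs )
            && ! array_key_exists( 'id', $attrs )
        ) {
            $allowed_tags[ $tag ]['id'] = array();
        }
    }

    return $allowed_tags;
}

// Illustrative input: 'div' gains 'id', while 'br' is left untouched.
$patched = demo_allow_id_where_class(
    array(
        'div' => array( 'class' => array() ),
        'br'  => array(),
    )
);
print_r( array_keys( $patched['div'] ) );
```

    In a script like the one at the top of the thread, the result would be assigned back to the global $allowedposttags before wp_update_post is called.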

    Thread Starter Mark Tuttle

    (@markrtuttle)

    I meant override by setting define('CUSTOM_TAGS',true); and giving your own values for $allowedposttags, $allowedtags, and $allowedentitynames.
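
    A rough sketch of what that override might look like. Assumptions, none of them confirmed in this thread: the constant and the three globals have to be set before wp-includes/kses.php is loaded (for example from wp-config.php), and the lists shown are deliberately tiny placeholders rather than usable defaults.

```php
<?php
// Sketch only. Assumption: this runs before wp-includes/kses.php is loaded,
// so that KSES skips its built-in default lists.
if ( ! defined( 'CUSTOM_TAGS' ) ) {
    define( 'CUSTOM_TAGS', true );
}

// Illustrative allowlist for post content. Anything not listed here would be
// stripped, so a real list would need to be far more complete.
$GLOBALS['allowedposttags'] = array(
    'div' => array( 'class' => array(), 'id' => array() ),
    'p'   => array( 'class' => array(), 'id' => array() ),
    'a'   => array( 'href' => array(), 'title' => array() ),
);

// Illustrative allowlist for the more restricted contexts such as comments.
$GLOBALS['allowedtags'] = array(
    'a'      => array( 'href' => array(), 'title' => array() ),
    'em'     => array(),
    'strong' => array(),
);

// Illustrative list of named entities that are passed through as-is.
$GLOBALS['allowedentitynames'] = array( 'nbsp', 'amp', 'lt', 'gt', 'quot' );
```

    Because this replaces the defaults wholesale, extending $allowedposttags at run time is probably the less invasive route when only one or two attributes are missing.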

    Thanks for your posts.

    I’m having a similar problem with wp_update_post. Not only is it stripping out the id, but it is also removing an iframe statement.

    Interestingly it works fine when I call wp_update_post from the admin dashboard, but when I run it from wp-cron, it strips out the ID and iframe. I’ll start investigating KSES.

    To be continued…

    I fixed my problem, by inserting the following code just before the call to wp_update_post.

    // Extend the KSES allowlist before calling wp_update_post(): allow 'id' on <div> and 'src' on <iframe>.
    global $allowedposttags;
    $allowedposttags['div'] = array( 'align' => array(), 'class' => array(), 'id' => array(), 'dir' => array(), 'lang' => array(), 'style' => array(), 'xml:lang' => array() );
    $allowedposttags['iframe'] = array( 'src' => array() );

    Thanks again Mark, you saved me a lot of time!
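
    A variation on the fix above, sketched under assumptions: instead of replacing the whole 'div' entry, it merges the extra attributes into whatever is already allowed. demo_extend_allowed_tags is a hypothetical helper, not a WordPress function.

```php
<?php
// Hypothetical helper (not part of WordPress): merges extra tags and attributes
// into a KSES-style allowlist without discarding what is already allowed.
function demo_extend_allowed_tags( array $allowed_tags, array $extra ) {
    foreach ( $extra as $tag => $attrs ) {
        $existing = ( isset( $allowed_tags[ $tag ] ) && is_array( $allowed_tags[ $tag ] ) )
            ? $allowed_tags[ $tag ]
            : array();

        // array_merge keeps the existing attributes and adds the new ones.
        $allowed_tags[ $tag ] = array_merge( $existing, $attrs );
    }

    return $allowed_tags;
}

// Only touch the real allowlist when WordPress is loaded.
if ( function_exists( 'wp_update_post' ) && isset( $GLOBALS['allowedposttags'] ) ) {
    $GLOBALS['allowedposttags'] = demo_extend_allowed_tags(
        $GLOBALS['allowedposttags'],
        array(
            'div'    => array( 'id' => array() ),
            'iframe' => array( 'src' => array() ),
        )
    );
}
```

    As in the original fix, this has to run just before the call to wp_update_post, and the change lasts only for the current request.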

    Thread Starter Mark Tuttle

    (@markrtuttle)

    Good solution, Doug. Your experience with the admin dashboard is explained in this article where it says that $allowedposttags “is the set [of tags] that is allowed to be put into Posts by non-admin users (admins have the “unfiltered_html” capability, and can put anything they like in).” This capability is checked for in kses_init. I had a similar experience with the problem disappearing in contexts I couldn’t characterize as well as you did, and might never have noticed this sentence without your report.
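
    A small diagnostic sketch for telling the two contexts apart. It assumes that kses_init attaches wp_filter_post_kses to the content_save_pre hook when the current user lacks unfiltered_html; that is a reading of kses.php rather than something stated in the thread, and demo_describe_kses_state is a hypothetical name.

```php
<?php
// Hypothetical diagnostic (not part of WordPress): describes whether post
// content is likely to pass through KSES in the current context.
function demo_describe_kses_state() {
    if ( ! function_exists( 'current_user_can' ) || ! function_exists( 'has_filter' ) ) {
        return 'WordPress is not loaded';
    }

    // Admins normally hold this capability; cron and command-line scripts
    // usually run without a logged-in user and therefore do not.
    $unfiltered = current_user_can( 'unfiltered_html' );

    // Assumption: this hook and callback pair is how KSES filters post content.
    $filtering = ( false !== has_filter( 'content_save_pre', 'wp_filter_post_kses' ) );

    return sprintf(
        'unfiltered_html: %s, KSES post filter attached: %s',
        $unfiltered ? 'yes' : 'no',
        $filtering ? 'yes' : 'no'
    );
}

echo demo_describe_kses_state() . "\n";
```

    Calling this just before wp_update_post, once from the dashboard and once from wp-cron or the command line, should show whether the difference in behavior lines up with the capability check.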

  • The topic ‘wp_update_post strips css id attribute’ is closed to new replies.