Good catches there. Of course, relevanssi_post_title_before_tokenize should be the last thing to run before tokenizing.
As I was testing this, I noticed something funny: wptexturize() doesn’t touch an actual ellipsis character (…), but does change three dots (...) to an ellipsis entity. I was using an actual ellipsis in my testing, so I was a bit puzzled at first.
In any case, I thought about this a bit, and that filter is not the correct solution. The correct solution is to add html_entity_decode() to the punctuation remover. That way the entities are turned back into real characters first, and the punctuation remover can handle them correctly.
I will add
$a = html_entity_decode( $a, ENT_QUOTES );
as the first step in relevanssi_remove_punct() (in /lib/common.php), before
$a = preg_replace( '/<[^>]*>/', ' ', $a );
That should handle this in a neat way. I’m going to make the change in the next version of Relevanssi, but if you want it now, just patch common.php yourself.
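To show why the order matters, here is a minimal sketch. The function sketch_remove_punct() is a hypothetical stand-in, not the real relevanssi_remove_punct(); only its first two statements come from the lines quoted above, and the ellipsis and whitespace handling after them are assumptions made for illustration.

```php
<?php
// Hypothetical, simplified stand-in for the start of relevanssi_remove_punct().
// Only the first two statements are from the post; the rest is illustrative.
function sketch_remove_punct( $a ) {
    // Proposed new first step: turn entities such as &#8230; into real characters.
    $a = html_entity_decode( $a, ENT_QUOTES );
    // Existing step quoted in the post: replace HTML tags with a space.
    $a = preg_replace( '/<[^>]*>/', ' ', $a );
    // Assumption for illustration: treat the ellipsis character as a separator.
    $a = str_replace( "\u{2026}", ' ', $a );
    // Assumption for illustration: collapse runs of whitespace.
    return trim( preg_replace( '/\s+/', ' ', $a ) );
}

echo sketch_remove_punct( 'Wait&#8230;<b>what</b>' ), "\n";
```

With the decode step in place, the entity becomes a real ellipsis before any punctuation handling sees it, so the two words stay separate instead of leaving entity fragments such as "8230" in the text.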
And yeah, the shortcode disabling UI is a Premium feature. It can be used in the free version by directly adjusting the relevanssi_disable_shortcodes option.