• Resolved tahtu

    (@tahtu)


    The Better Search plugin searches inside the post_title and post_content fields.

    Imho, it would be nice to search inside the post_excerpt field too, to improve hit relevance. Posts that contain the search words in the post_excerpt field are more important than posts without these words in the excerpt.

    If you agree with me, you could implement this feature request. Thanks a lot for your great work!

Viewing 10 replies - 16 through 25 (of 25 total)
  • @tahtu Thank you for the detailed explanation of your choice! Actually, before I started developing WPFTS, I first tested all the search techniques that can be achieved using only the MySQL+PHP combination.

    I immediately threw out the LIKE '%WORD%' technique because it is DRAMATICALLY slow when you put '%' at the beginning of the expression, and without the leading '%' we could not search for specific words in the middle of the content.

    A Google plugin does not allow searching posts by internal “invisible” data like meta values. You first have to show them somewhere, let Google index them, and only then, MAYBE, Google will let you find that post.

    MATCH/AGAINST is a great solution. I tried to use it for a long time, but the problems I experienced made me switch to indexed search. These problems are: it’s hard to calculate the actual relevance; it cannot be adjusted with enough flexibility, for example to find words of 2 letters or digits; I cannot remove specific words from the search (like “the”, “a”, “so”, etc.), and each language has its own set of such “stop” words.
    Yes, most of these things can be adjusted in my.ini, but only per server (for all your sites) and only if you have admin access to the MySQL config.

    Additionally, I wanted the ability to find posts by very different kinds of content, like meta fields, generated texts, extracted texts (this function is used to search within PDF/DOC/DOCX, etc. files) and much more. Also, I noticed that complex queries run quite slowly when I use MATCH/AGAINST.

    All this led me to develop an indexed search solution which does not require changing any server configuration, creating complex queries, unserializing meta-field values on the fly, etc. It does not even change any of your existing WP tables; it creates its own tables to store index information.

    But you’re right. Match/Against looks very simple for simple tasks. And of course, it’s still great for some solutions.

    Oh, yes, you hit the nail on the head with the low download rate. It just so happened that people mostly use not the free version of the plugin but the paid one, because the most demanded feature, as it turned out, is search in the contents of PDF files. And many people do not even suspect that this plugin (the free version too) can easily search for anything, anywhere.

    But this is my serious mistake: I am not a marketer, and I still have to work on it.

    Thread Starter tahtu

    (@tahtu)

    @epsiloncool like I mentioned before, I want to search inside the excerpt field (and, of course, content and title). Today, I don’t have any attachments or custom fields, and it’s not important to search in any taxonomy.

    So in my case, I still think my simple MATCH ... AGAINST implementation is the best one for the moment. But I’m thinking about a translation of the whole site. In that case, I need a different search, depending on the currently selected language.

    Of course, this is not needed today or tomorrow, but it will be easier for me to find a solution then if I have not implemented another complex solution before.

    Once again: Imho, there is no single right solution. Of course, speed is a big problem for every search. So if I get into trouble with it, I will have to think about that again. The Better Search plugin does not seem to be optimized for speed, since it combines BOOLEAN and NATURAL modes. In addition, it uses an unneeded index, which takes up storage unnecessarily.

    Maybe your solution would be better for that.

    But at the moment, I’m working on two machines: one for development and post writing, and one live. It’s easier for me to transfer the data if I don’t have to rebuild the index manually each time. MySQL rebuilds its index automatically for me.

    And you are right: I don’t have access to the my.ini. Nevertheless, I think WordPress should implement the same solution I implemented. This would take very little additional code and just a few options in the settings. The user could decide whether to use the MySQL FULLTEXT index or not. If yes, he could choose between date and relevance ordering. And if wanted, he could use BOOLEAN mode instead of NATURAL too. Imho, this would be a great advantage for WordPress overall.

    I like small solutions which use as many standards as possible. This is more flexible for any future development and also more reliable, because there is not a lot of code that could be buggy. That’s the main reason why I decided to use the WordPress WP_Query solution together with the MySQL FULLTEXT index.

    Of course there are better implementations with more features, like those Better Search, Relevanssi and WPFTS offer. But I don’t need them. And those three tools didn’t support the excerpt field completely.

    We’re living in an imperfect world. And each of us has to choose a way for himself to handle this issue.

    Ok, I got your point completely.

    Just some remarks:

    1. WPFTS uses WP_Query() to perform the search.

    2. It also supports search in excerpts; you just need to add one simple line like

    $index['post_excerpt'] = $post->post_excerpt;

    3. For multilingual search, you can find the word in all posts and then remove all posts which do not belong to the selected language. However, this is not the only solution, just one way to solve it.
    Another way is to put the content of the different languages of the same post into different search DOMAINS in the WPFTS index. For example

    $index['post_content'] = $en_content;
    $index['post_content_es'] = $es_content;
    $index['post_content_fr'] = $fr_content;

    etc. Then, when making a WP_Query(), you can set the weights of these DOMAINS dynamically. For example, if you need French posts only, you can set “post_content_fr” = 0.5 and all other domains = 0. So only French content will be used for the search.
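    The weighting idea can be sketched outside of WordPress. The following is only a toy model in Python, not the WPFTS API: the `posts` data, the `score` and `search` functions and the scoring rule (weight times number of occurrences) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical per-post index: each search domain maps to its text.
posts = {
    1: {"post_content": "the ego and the self", "post_content_fr": "le moi et le soi"},
    2: {"post_content": "notes on search", "post_content_fr": "notes sur le moi"},
    3: {"post_content": "ego in english only"},
}

def score(domains, word, weights):
    """Sum weight * occurrences of `word` over all domains of one post."""
    total = 0.0
    for domain, text in domains.items():
        weight = weights.get(domain, 0.0)  # domains without a weight count as 0
        if weight:
            total += weight * Counter(text.lower().split())[word]
    return total

def search(word, weights):
    """Return (post_id, score) pairs with a positive score, best first."""
    hits = [(pid, score(domains, word, weights)) for pid, domains in posts.items()]
    hits = [(pid, s) for pid, s in hits if s > 0]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))

# Only the French domain has a non-zero weight, so only French text can match.
french_only = {"post_content_fr": 0.5}
print(search("moi", french_only))
print(search("ego", french_only))
```

    With these weights, a word that appears only in the English content of a post does not produce a hit, which is the effect described above.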

    Indexed search is a VERY VERY flexible thing.

    Thank you for this discussion. I have many more ideas now for my posts about WPFTS.

    Thread Starter tahtu

    (@tahtu)

    Indeed, indexed search has a lot of advantages, for example speed and resource usage. And having its own index has additional advantages, for example the option to index attachments too. The possibility to weight the placement of the search hit is important too, I think.

    But I still think there is no one solution that is the best.

    For example, I searched for the word “ego” on my site just now. I feel it should be one of the most used and most important words on it. But the MySQL fulltext search didn’t show me any posts. Now I’m trying to find out why. Maybe, with 3 characters, this word is too short for the fulltext index?

    Surely I should be able to configure it somewhere. But where? In the my.ini, which I can’t access? Inside the CREATE TABLE definition, I don’t know a way to specify the minimum word length…

    But since this word is very important on this site, not finding posts with it is a bad solution too. How do you handle this problem with WPFTS?

    I’m happy about this discussion too. I hope nobody is bothered that we are doing this here, inside the Better Search area…

    Thread Starter tahtu

    (@tahtu)

    @epsiloncool do you know a way to put some field from the WP_Query result into the global $post variable? I would like to see the relevance score in the result list. As far as I understand the output of Better Search, @ajay doesn’t know it, but the developer of Relevanssi knows the way.

    Hi @tahtu

    By default, MATCH/AGAINST indexing ignores words shorter than 3 characters for InnoDB tables (MySQL 5.6+ only) and words shorter than 4 characters for MyISAM tables. Which type of tables are you using? This behavior can be changed in my.ini. There is no way to change it in CREATE TABLE.
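    To make the effect of the minimum length visible, here is a small Python sketch. It is not MySQL code: `index_tokens` is a hypothetical helper that only imitates the length filter and ignores stopword handling. In MySQL itself, the corresponding settings are `innodb_ft_min_token_size` (InnoDB) and `ft_min_word_len` (MyISAM), and the FULLTEXT index has to be rebuilt after changing them.

```python
import re

def index_tokens(text, min_len):
    """Return the set of lowercase words that are at least `min_len` characters long."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {word for word in words if len(word) >= min_len}

text = "The ego is not the self"

innodb_like = index_tokens(text, 3)  # minimum length 3, like the InnoDB default
myisam_like = index_tokens(text, 4)  # minimum length 4, like the MyISAM default

print("ego" in innodb_like)
print("ego" in myisam_like)
```

    With a minimum of 4, “ego” never reaches the index, so no query can find it, no matter how often it occurs in the posts.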

    WPFTS stores complete words in its index and then uses LIKE 'word%' to search for words in this index. Above I said that the LIKE '%word%' approach is quite slow, and that is still correct: MySQL cannot use its B-tree indexes for LIKE queries when the pattern starts with '%'. But if the pattern starts with an exact letter, LIKE works very fast, because the index is still used.

    To summarize, I’d say WPFTS will find all the posts where the word ‘ego’ appears. It will also find all the posts with the word ‘ego’ for the query “eg”. And it will find words with “ego” at the start, for example “egonomic” or “egoes”, etc.
    But the limitation is that WPFTS will NOT find words like “subego” or “vertego” by default, because LIKE 'ego%' does not match them.
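    The difference between the two patterns can be shown with a sorted word list standing in for an indexed column. This is only a Python sketch under that assumption, not the WPFTS implementation; the word list and both function names are made up for the example.

```python
from bisect import bisect_left

# Hypothetical word index: sorted unique words (stand-in for an indexed column).
words = sorted({"ego", "egoes", "egonomic", "subego", "vertego", "search", "excerpt"})

def prefix_search(prefix):
    """Like LIKE 'prefix%': binary-search to the first candidate, then read a contiguous run."""
    start = bisect_left(words, prefix)
    hits = []
    for word in words[start:]:
        if not word.startswith(prefix):
            break  # sorted order guarantees no later word can match
        hits.append(word)
    return hits

def substring_search(fragment):
    """Like LIKE '%fragment%': sort order does not help, so every word is checked."""
    return [word for word in words if fragment in word]

print(prefix_search("ego"))
print(prefix_search("eg"))
print(substring_search("ego"))
```

    The prefix search only touches the matching run of words, while the substring search has to look at the whole list. That is why “subego” and “vertego” only show up in the slower variant.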

    Fortunately, there is an option in WPFTS Settings (this flag can also be set via WP_Query() parameters) to enable LIKE '%word%' for search. This will be significantly slower, but if you really need to find words like “subego”, it will help.

    I am still working on improving this, and I have good results.

    Thread Starter tahtu

    (@tahtu)

    @epsiloncool I’m using MyISAM, since my ISP hasn’t updated MySQL for some years. He still offers MySQL 5.5. But I have that account for free, so it’s OK for the moment. But with your information about the 4 characters needed for MyISAM, I now have a reason to change the ISP, or to use a plugin.

    Ok, I will test WPFTS.

    Until now I thought any fulltext index usage with MATCH … AGAINST is much faster than any LIKE 'ego%'. Am I wrong?

    I can’t say right now how big the difference in speed is, because you can’t find ‘ego’ in a big text using LIKE 'ego%'. This query can only be used to find an ID or a word in a big word-based index table (like the one WPFTS uses). So it’s hard to compare.

    But it could be interesting to compare a well-configured MySQL with MATCH/AGAINST and the WPFTS plugin on the same MySQL instance. I could check this later. I think MATCH/AGAINST will do the job faster, but I’m not sure.

    Thread Starter tahtu

    (@tahtu)

    I wrote a message to you on this page: https://fulltextsearch.org/contact/

    Maybe this is not the right place for a further discussion about WPFTS…

  • The topic ‘Search in excerpt field too’ is closed to new replies.