Some algorithmic improvements
-
Hello,
First, thank you for developing this fine plugin.
I don’t know exactly what algorithm you use to find related posts, as I haven’t read the code. I also understand that there are inherent limitations: complexity, resource usage, and the inability to understand semantically which words are more important.
However, it looks like it is a simple word count, and there is real room for improvement without adding much complexity.
Some posts on my website have 5,000 to 10,000 words, though most have fewer than 1,000. I had to disable matching on content, because only a handful of posts were ever suggested, and all they had in common was their great length.
The idea here: weight the matching words relative to the length of the post instead of relying on raw counts.
Using only percentages would likely have the opposite effect and favor very short posts, so there is a balance to find. An option to control that balance, alongside a sensible default, would probably help.
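To make the length idea concrete, here is a rough sketch in Python (not the plugin's code, which I haven't read). The function names and the `length_penalty` option are made up for illustration: 0 gives the raw shared-word count, 1 gives a pure percentage of the candidate post, and values in between are the balance I mean.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into simple word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


def overlap_score(current_text, candidate_text, length_penalty=0.5):
    # Hypothetical scoring function, not the plugin's actual algorithm.
    # length_penalty = 0 -> raw count of shared words (favors long posts)
    # length_penalty = 1 -> share of the candidate's words (favors short posts)
    current = Counter(tokenize(current_text))
    candidate = Counter(tokenize(candidate_text))
    candidate_length = sum(candidate.values())
    if candidate_length == 0:
        return 0.0
    shared = sum(min(current[w], candidate[w]) for w in current.keys() & candidate.keys())
    return shared / candidate_length ** length_penalty
```

With `length_penalty=0` a 10,000-word post wins almost every time; with a value around 0.5 a short but focused post can outrank it.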
Another idea on top of that: compare the tags/categories. I have no idea what it would take in code, but if it’s possible, it shouldn’t be expensive in CPU and could greatly improve this plugin.
I have posts on some topics whose titles differ widely but which are related to each other, and that is very clear once you look at the tags/categories.
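For the tags/categories comparison, one cheap option would be a Jaccard-style overlap: the number of shared terms divided by the number of distinct terms across both posts. This is only a sketch of what I have in mind; `taxonomy_similarity` is a made-up name and I am assuming each post's tags are available as a plain list of strings.

```python
def taxonomy_similarity(terms_a, terms_b):
    # Jaccard similarity between two collections of tags or categories.
    # Returns a value between 0.0 (nothing shared) and 1.0 (identical sets).
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Two posts tagged `linux, kernel, drivers` and `linux, kernel, networking` share 2 of 4 distinct tags, so they would score 0.5 even if their titles have no word in common.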
There could also be options to control the relative importance of title/content/tags/categories.
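Something like a weighted average of per-field scores is what I picture for those options. Again, just a sketch: the weight values below are arbitrary placeholders, and `combined_score` is a hypothetical name, assuming each field already produces a score between 0 and 1.

```python
# Placeholder weights; in the plugin these would come from the settings page.
DEFAULT_WEIGHTS = {"title": 1.0, "content": 1.0, "tags": 2.0, "categories": 1.0}


def combined_score(field_scores, weights=DEFAULT_WEIGHTS):
    # Weighted average of the per-field similarity scores.
    # Fields without a configured weight are ignored (weight 0).
    total_weight = sum(weights.get(field, 0.0) for field in field_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(weights.get(field, 0.0) * score for field, score in field_scores.items())
    return weighted_sum / total_weight
```

Setting a field's weight to 0 would be equivalent to disabling it, which is what I currently have to do for content.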
That would probably be enough to vastly improve the quality of the algorithm and make it really top-notch.