The code below is not meant for production; it is just a quick-and-dirty example:
<?php
global $wpdb;
// Fetch the content of every published post; {$wpdb->posts} respects a custom table prefix
$results = $wpdb->get_col( "SELECT post_content FROM {$wpdb->posts} WHERE post_type = 'post' AND post_status = 'publish'" );
$res = implode( ' ', $results );
/* https://stackoverflow.com/a/46874560/2476069 */
function most_frequent_words($string, $stop_words = [], $limit = 10) {
    $string = strtolower($string); // Make string lowercase
    $words = str_word_count($string, 1); // Returns an array containing all the words found inside the string
    $words = array_diff($words, $stop_words); // Remove black-listed words from the array
    $words = array_count_values($words); // Count the number of occurrences of each word
    arsort($words); // Sort by count, highest first
    print_r( array_slice($words, 0, $limit) ); // Print the $limit most frequent words with their counts
}
most_frequent_words( strip_tags($res), array('the','a','to','and','of','for') );
die();
?>
Even after filtering out a handful of common words, the result on my test site is a bunch of (to me) useless words.
The above is mostly meant to show that, in my opinion, you’ll need to filter out a lot more words, and possibly limit the results to longer words. That could be an issue, though: if you only count words of, say, 5+ letters but you talk about SPAM a lot, then it isn’t counted.
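One way to handle that trade-off is to combine a minimum word length with a list of short words you always want counted. The following is only a rough sketch, not a tested solution: the function name most_frequent_long_words and the $min_length and $always_keep parameters are made up for illustration, and unlike the example above it returns the array instead of printing it.

```php
<?php
// Hypothetical helper (not from the linked Stack Overflow answer): counts words,
// skipping stop words and short words, except those listed in $always_keep.
function most_frequent_long_words($string, $stop_words = [], $min_length = 5, $always_keep = [], $limit = 10) {
    $words = str_word_count(strtolower($string), 1); // Lowercase, then split into words
    $words = array_filter($words, function ($word) use ($stop_words, $min_length, $always_keep) {
        if (in_array($word, $always_keep, true)) {
            return true; // Always keep these words, regardless of length
        }
        // Otherwise keep only words that are long enough and not black-listed
        return strlen($word) >= $min_length && !in_array($word, $stop_words, true);
    });
    $counts = array_count_values($words); // Count the occurrences of each remaining word
    arsort($counts); // Sort by count, highest first
    return array_slice($counts, 0, $limit, true); // Keep only the top $limit entries
}

$text = 'Spam filters catch spam, but the plugin misses spam in comments and pingbacks.';
print_r( most_frequent_long_words($text, array('plugin'), 5, array('spam')) );
```

Here "spam" is still counted even though it has only four letters, while "but", "the", "in" and "and" drop out because of their length, without having to be listed as stop words.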
I saw some plugins but they are pretty old and don’t work with the current WordPress version.
Did you try them? Most likely they are just full of SQL queries so they should still work. Might be better not to reinvent the wheel here.
Good luck!