• Resolved

    mustafaxyz

    (@mustafaxyz)


    Like the title says, I need to preload 1 million blog posts, and I only need to do it about once a month. Is there a faster way to do this?

  • It’s probably best to do that using a command-line script on your server. Have it delete the cache files (in wp-content/cache/supercache/…) for the page being refreshed, then fetch that page. Rinse and repeat until finished, and adjust the speed to what your server can cope with.
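
    In shell terms, refreshing a single page boils down to something like the sketch below; the path and URL are placeholders, not this site’s real ones:

    # Drop the cached copy of one page, then request it again so
    # WP Super Cache regenerates it on the next hit.
    rm -rf /path/to/wp-content/cache/supercache/example.com/some-post/
    curl -s -o /dev/null https://example.com/some-post/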

    Thread Starter mustafaxyz

    (@mustafaxyz)

    How do I do that exactly? Do I need to send GET requests to every page, or something?

    #!/bin/bash
    
    # Define the base directory for WP Super Cache files.
    # Note: ~ does not expand inside quotes, so use $HOME instead.
    CACHE_DIR="$HOME/http/wp-content/cache/supercache/"
    
    # List of your post URLs
    POST_URLS="list_of_post_urls.txt"
    
    # Loop through each URL
    while IFS= read -r url
    do
        # Strip the scheme and host to get a relative path
        relative_url=$(printf '%s' "$url" | sed 's,https\?://[^/]*,,')
    
        # Build the cache directory path
        cache_path="${CACHE_DIR}${relative_url}"
    
        # Delete the cache for this URL (keep the path quoted so a
        # malformed line can't expand into something destructive)
        rm -rf "$cache_path"
    
        # Fetch the URL to rebuild the cache
        curl -s "$url" > /dev/null
    
        # Sleep for a short period to reduce load on the server
        sleep 1
    done < "$POST_URLS"
    

    Something like this maybe?
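
    A side note on the input file: if WP-CLI is available on the server, the URL list doesn’t have to be built by hand. Something along these lines should work, though the flags assume a standard install and published posts only:

    # Sketch: write the permalink of every published post, one per
    # line, into the list file the script reads. Run from the
    # WordPress root; a million posts may take a while in one pass.
    wp post list --post_type=post --post_status=publish --field=url > list_of_post_urls.txt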

    Thread Starter mustafaxyz

    (@mustafaxyz)

    I’ve written this script, but no luck:

    #!/bin/bash
    
    set -x
    
    # Define the base URL to be added
    BASE_URL="https://learnguitar.deplike.com/guitar-chords-and-lessons/"
    
    # Define the base directory for WP Super Cache files
    CACHE_DIR="/var/www/html/wp-content/cache/supercache/learnguitar.deplike.com"
    
    # List of your post URLs
    POST_URLS="post_urls.txt"
    
    # Loop through each URL
    while IFS= read -r url
    do
        # Add the base URL to the beginning of each URL
        full_url="${BASE_URL}${url}"
    
        # Strip the scheme and host to get a relative path
        relative_url=$(printf '%s' "$full_url" | sed 's,https\?://[^/]*,,')
    
        # Build the cache directory path
        cache_path="${CACHE_DIR}${relative_url}"
    
        # Delete the cache for this URL
        rm -rf "$cache_path"
    
        # Fetch the URL to rebuild the cache
        curl -A "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/81.0" -s "$full_url" > /dev/null
    
        echo "Full URL: $full_url"
        echo "Relative URL: $relative_url"
        echo "Cache Path: $cache_path"
    
        # Sleep for a short period to reduce load on the server
        sleep 1
    done < "$POST_URLS"
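
    A debugging aside: one way to narrow down the “no luck” is to smoke-test a single URL and confirm a cache file actually appears. The slug below is a placeholder:

    # Fetch one post, then look for the file WP Super Cache should
    # have written for it. A cached page normally shows up as
    # index.html (or index-https.html) inside the permalink directory.
    curl -s -o /dev/null "https://learnguitar.deplike.com/guitar-chords-and-lessons/some-post/"
    ls -l "/var/www/html/wp-content/cache/supercache/learnguitar.deplike.com/guitar-chords-and-lessons/some-post/"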
    Thread Starter mustafaxyz

    (@mustafaxyz)

    The code below works fine, in case anyone needs it:

    #!/bin/bash
    
    # Define the base URL to be added
    BASE_URL="https://mydomain.com/"
    
    # Define the base directory for WP Super Cache files
    CACHE_DIR="/var/www/html/wp-content/cache/supercache/..."
    
    # List of your post URLs
    POST_URLS="post_urls.txt"
    
    # Limit for the number of requests to send at a time
    LIMIT=1000
    
    # Function to process a single URL
    process_url() {
        url="$1"
        full_url="${BASE_URL}${url}"
        relative_url=$(printf '%s' "$full_url" | sed 's,https\?://[^/]*,,')
        cache_path="${CACHE_DIR}${relative_url}"
    
        # Check if the cache file exists
        if [ -e "$cache_path" ]; then
            echo "Cache file already exists for: $url"
        else
            wget -q -O /dev/null "$full_url"
            echo "Processed: $url"
    
            # Remove the processed URL from the file. \| is used as the
            # sed delimiter so slashes in the URL don't break the
            # expression; ^ and $ anchor the whole line.
            sed -i "\|^${url}\$|d" "$POST_URLS"
        fi
    }
    
    # Initialize a counter for requests sent
    counter=0
    
    # Read URLs from the file and process them in parallel
    while IFS= read -r url
    do
        # Process each URL in the background
        process_url "$url" &
    
        # Increment the counter
        ((counter++))
    
        # Once we hit the limit, wait for the background jobs to
        # finish and reset the counter
        if [ "$counter" -eq "$LIMIT" ]; then
            wait
            counter=0
        fi
    done < "$POST_URLS"
    
    # Wait for any remaining background jobs to finish
    wait
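
    A note on the batching: waiting after every LIMIT jobs stalls each batch on its slowest request. If the list holds full URLs rather than paths, a common alternative is to let xargs keep a fixed pool of workers busy; the file name and worker count below are illustrative:

    # Sketch: keep 10 curl workers running at all times instead of
    # batching (GNU xargs). Assumes post_full_urls.txt contains one
    # complete URL per line.
    xargs -d '\n' -n 1 -P 10 curl -s -o /dev/null < post_full_urls.txt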
  • The topic ‘What’s the best way to preload 1 million blog posts?’ is closed to new replies.