• Resolved Jeff Starr

    (@specialk)


    Hello,

    Couple of questions about the EDD tables added to the WP database.

    1. Why so many tables? In my 15 years as a WP developer I’ve never seen any plugin add so many tables to the database. Usually it’s just one, every now then you get something crazy like 5 or even 10 new tables. But 26+ tables just for EDD? Just curious as to why that’s necessary.
    2. I say 26 “plus” tables because I’m *guessing* that the following tables also are added now by EDD, which would put the number of EDD tables added to the WP database at a record-breaking 30.
    actionscheduler_actions
    actionscheduler_claims
    actionscheduler_groups
    actionscheduler_logs

    The question here is, can you confirm that those tables are in fact added by EDD? And also what is their purpose? Shouldn’t they be prefixed with edd_ like the others.

    Thanks for any help with this. We are working on cleaning up our database and need this information to help make optimal decisions, etc.

Viewing 6 replies - 1 through 6 (of 6 total)
  • Plugin Author Chris Klosowski

    (@cklosows)

    Hey @specialk

    Thanks for the inquiry concerning our custom tables. I’d be happy to answer your questions.

    First, the tables you specifically listed above are for Action Scheduler, a library that EDD does not use at the moment, so those tables would have been added by a different plugin. Many plugins in the WordPress space use Action Scheduler at this time, as it is great for background queue processing.

    So if we ignore those tables since they aren’t ours, we can talk about the EDD specific tables. One of the challenges that EDD has had for years is scaling. Previously we used the Posts table and postmeta for data that didn’t really ‘fit’ in the main posts table, because it was the ‘WordPress’ way. However, as a store grows over time, we end up pushing data into tables that are not structured for the dataset, leading us to run into performance issues when querying for this data later on (like reporting).

    For instance, an ‘order’ doesn’t really have all of the same columns that a ‘post’ does, so a bulk of the data was put into meta tables. As that trend continues, your result is millions of rows in the database in meta. Meta tables are notoriously bad for performance especially when we start talking about complex queries for reporting, as every ‘filter’ just results in adding another million rows to the end query.

    Moving to custom tables in the format we did was very intent driven. Each table has the columns necessary for itself, has typing specific to the data in the column, and typically only contains a ‘meta’ table so that developers can continue to extend EDD in the way they have always expected to. So why so many different tables? This was done to achieve a few things:

    1. EDD core should aim to add no data to meta tables. It is our goal to avoid finding it necessary to add a row into a meta table for EDD core alone. Extensions may take advantage of this but, the mission was to avoid putting two rows into the database for an object, when one would do. This means that for almost all the queries in EDD core, we can avoid doing expensive meta queries to filter the results.
    2. Performance. It may seem on the surface that some of these new data objects could share tables for instance: file download logs, api request logs, and generic logging. It could have been done to make all three of these live in the same table, however, the file download log is historically going to be one of the larger tables, and one of the most used. Therefore breaking it out into it’s own table was done to ensure that we can handle a larger scaling effort, even as a store delivers millions of file downloads over time. And, by knowing how we query that data specifically by columns, leads us into the 3rd point…
    3. Proper Indexes. One of the things that we can do to help database performance at scale is to have proper indexes on the tables. The reason we don’t frequently include more than one data type into a single table is so that we can add proper indexes based on the queries we know that we’ll end up using the most. If you put data into tables that don’t share a consistent query structure when retrieving it, then we’re producing a table that will inherently succumb to performance issues as the table grows in size.

    The reality is that most WordPress plugins don’t hold critical financial and transactional data, like an eCommerce Platform. Most of the data in an eCommerce database has a specific reason for being maintained. A single purchase produces an order, a customer, a list of items ordered, possibly a discount used, the log entry for someone downloading the file, any address information supplied by the customer, the confirmation of the transaction ID from the gateway, and even more if there are things like subscriptions, or licenses involved.

    All of these data points are extremely different in data structure and formatting, however all of them are important to an eCommerce business. Because they are so different, they need their own table to properly hold this information in a way that can scale.

    The other aspect of EDD that we’d heard for years is that the reporting was limited. The simple fact was, that it was due to our database structure being entirely meta driven and non-typed. When we set out to build EDD 3.0’s tables, we wanted to do it in a way that would help us fulfill the many reporting requests that we had heard for years. A great example of this is, what it takes to run an earnings report for a given time.

    In EDD 2.0, we would have had to go get all payments in the date range, then find all of their meta entries for the purchase total, then loop through each of them in PHP to get the sum of those values. In EDD 3.0 (this is slightly over simplified for some more advanced reports) however, we can simply use MySQL SUM functions, so that our reports can get the value directly from MySQL instead of using expensive and time consuming foreach loops in PHP.

    I can understand the initial concern for the number of tables created, as it is more than most plugins will ever add. No table was added to the custom set without intent and we were selective with what tables we did add asking ourselves if the data in these custom tables would dramatically benefit from it’s own table, or if it could exist in a table with some other data. The adjustments table is a great example of this. Taxes, Fees, and Discounts are all stored in the adjustments table, because at the end of the day they all share a commonality in their structure and purpose. Initially we had split all three of these into their own tables, but after some testing and discussion we determined they could live in a single table and meet the needs of a store just fine, and still maintain the performance improvements we were aiming for.

    I’m happy to answer any more questions you may have concerning this, but I hope that this long-form response gives you some insight into our through process and intent for long-term store performance and scaling.

    Thread Starter Jeff Starr

    (@specialk)

    Wow okay thanks for the detailed explanation. I’ll have to look closer at Woocommerce and other member/shopping plugins to see how they’re handling database structure, etc. I appreciate your helpful information.

    So to follow up, the four action scheduler tables were not added by EDD and are not required by EDD for anything. And thus are safe to delete as far as EDD is concerned, correct?

    Plugin Author Chris Klosowski

    (@cklosows)

    Hey Jeff,

    Based on the work that we’re seeing WooCommerce doing, they are adding a number of tables similar to our own. Where they differ a bit is how they handle customer data, but they’ve always had a table for order items, and the additional meta. In their new custom tables implementation they are also adding a table for the orders themselves (and their meta) as well as a table for order ‘operations’. Once they’ve fully gone custom tables, I’m pretty confident they’ll also have a number of tables as well and while they may differ, it’s likely that they will have a number that is larger than most plugins, again, due to the nature of eCommerce site data.

    As far as the action scheduler tables, they are not being used by EDD currently, but that doesn’t mean they are not necessary. These tables hold background jobs that another plugin on your site might be using as opposed to using WPCron as it can be configured in a more advanced way. You can visit the following Admin page and you’ll see the jobs that are registered with Action Scheduler, which may give you some insights to know if they are necessary to keep:
    /wp-admin/tools.php?page=action-scheduler

    Thread Starter Jeff Starr

    (@specialk)

    So just to be clear, on a default WP site using the default theme with only EDD installed and no other plugins, are those four tables safe to delete?

    Plugin Author Chris Klosowski

    (@cklosows)

    @specialk

    That appears to be correct yes. I can only truly vouch for EDD that we currently do not use Action Scheduler and do not create their tables. When I tested a clean WordPress install with just EDD activated, I do not have the action scheduler tables.

    Thread Starter Jeff Starr

    (@specialk)

    Brilliant, thanks much for your help and infos.

Viewing 6 replies - 1 through 6 (of 6 total)
  • The topic ‘Questions about EDD database tables’ is closed to new replies.