Fast Woo Order Lookup

This is a simple protytype plugin for big WooCommerce sites. It speeds up search on the Order page in the dashboard. It mitigates the performance issues in this WooCommerce issue and this Subscriptions issue.

The installable plugin zip file is here. The source code is at https://github.com/OllieJones/fast-woo-order-lookup . You can offer suggestions or report bugs at https://github.com/OllieJones/fast-woo-order-lookup/issues .

It makes the search box on the WooCommerce -> Orders page actually return results before it times out. Seriously, that Search box needs to do some really challenging MariaDB (or MySQL) queries to deliver the results store operators need. That search box looks for the string you type all over the historical orders in your system: Customer name, full text address, names of products in the owner, email addresses, even payment gateway transaction numbers. And to serve their customers and their businesses, they need to be able to find orders reliably. This search box looks in a lot of places to find matching orders, even when it is given only a few characters to search.

And that is a complex query, with multiple meta_value LIKE '%searchterm%' search predicates in it. Those predicates are the DBMS archetype of stupid slow query. And these product searches use them. Because store owners need to be able to find orders. (Notice that the PostgreSQL DBMS offers trigram indexes, so if WordPress used it this plugin wouldn’t be needed. )

This plugin deals with that performance problem by hooking into WordPress’s database query system, intercepting those queries and adding an extra predicate to each query:

WHERE order_id IN(a list of ids that have matching text strings) AND ...

And the point of this plugin is to efficiently and quickly retrieve that list of ids.

How does it work? Trigrams.

What’s a trigram? It’s a chunk of three consecutive letters from a text string. For example, the first trigram in the string “Beatles” is “Bea”. The second one is “eat”, and so forth. They are useful in database work because SQL databases handle tables with a lot of short rows efficiently. It’s much cheaper to satisfy a query saying trigram = 'Bea' than it is meta_value LIKE '%Bea%'.

This plugin makes a table of trigrams in the searchable text that describes each WooCommerce order. For example, if order # 123 is for a record called The Beatles! these entries appear in the trigram table.

order_idtrigramWhere it comes from
123TheThe Beatles!
123he_The Beatles!
123e_BThe Beatles!
123_BeThe Beatles!
123BeaThe Beatles!
123eatThe Beatles!
123atlThe Beatles!
123tleThe Beatles!
123lesThe Beatles!
123es!The Beatles!

As the example shows, every consecutive pair of three letters — every trigram — in the text string gets its own row in this table. For a lot of orders, each with a lot of metadata, that turns into many table rows. But they are short and efficiently indexable rows — a 64-bit integer and three Unicode characters. In fact the primary key of the table is the composite of the table’s only two columns (trigram, order_id). That means the entire content of the table is in the primary key. So searches such as

SELECT order_id FROM trigram
    WHERE trigam = 'Bea'

are satisfied by one of the fastest ways to access a BTREE table: an index range scan.

To search for the text string “Beatle” in the table, the actual query searches the table for trigrams in the search term with

SELECT order_id FROM trigram 
 WHERE trigram IN ( 'Bea', 'eat', 'atl', 'tle' ) 
 GROUP BY id
 HAVING COUNT(*) = 4

This search term contains four trigrams, and the query gets all the order_ids which have all four trigrams in them. That set of order_ids is guaranteed, by matching all four trigrams, to fetch all the matching ids. It may contain or may not contain some other order_ids, but it will definitely contain the ones we want. And, because of the marvellous SQL voodoo of the index range scan the query is fast, O(log n) fast, at finding that set of order_ids. That compares favorably to the O(n) query built in to WooCommerce itself using multiple meta_value LIKE '%Beatle%' search terms.

To make sure we get the correct results, we’ll continue using the meta_value LIKE '%Beatle%' — riddled query, but we’ll include the extra

WHERE order_id IN(a list of ids that have matching text strings) AND ...

search predicate into the query to rule out almost all the orders before the slow searches.

Maintaining the trigram table

This trigram table must be created and maintained. It is not part of core WordPress or WooCommerce.

On activation the plugin creates the tables from existing orders in the system. (I suggest you use WP-CLI to activate the plugin — wp plugin activate fast-woo-order-lookup — because table creating may take a few minutes. That will trigger a timeout if you activate the plugin from the web interface in the Plugins dashboard page.

The plugin’s sets up action handlers for order creation, updates, and deletions. Any order mentioned in any of those action handlers has its trigram list created, updated, or deleted as well.

The plugin deletes the trigram table when it is deactivated. Notice that the usual practice is to delete a plugin’s data when the plugin is deleted, not when it is deactivated. But, if the trigram table were preserved without being maintained, its content would no longer reflect the current orders. So this plugin deletes the table on deactivation rather than letting it go stale.

Where did this idea come from?

PostgreSQL has built in support for trigram indexes to support predicates such as meta_value LIKE '%Beatle%'. I don’t claim this plugin supports their precise algorithm. It does something similar, but from the application layer.