Query improvement suggestions
-
Hi
Some of the frequently run queries in “visitors-traffic-real-time-statistics” include:
(The date condition range is different each time)SELECT SUM(vst_visitors) AS vst_visitors, SUM(vst_visits) AS vst_visits FROM ahc_visitors WHERE 1 = 1 AND DATE(CONVERT_TZ(vst_date, '+00:00', '+09:00')) >= '2021-09-16' AND DATE(CONVERT_TZ(vst_date, '+00:00', '+09:00')) <= '2021-10-16';
The problem with that query is that the first lookup column doesn’t have an index.
Table Description
CREATE TABLE ahc_visitors ( vst_id int(11) unsigned NOT NULL AUTO_INCREMENT, vst_date datetime NOT NULL, vst_visitors int(11) unsigned DEFAULT '0', vst_visits int(11) unsigned DEFAULT '0', PRIMARY KEY (vst_id) ) ENGINE=InnoDB;
There is no index on the vst_date column.
The second problem is that the left side is a function processing.
If you use this function on the left, you can’t use the index in a normal b-tree index
ez)
DATE(CONVERT_TZ(vst_date, '+00:00', '+09:00')) >= '2021-09-16'
Of course, MySQL 8 adds a “function base index” feature.
However, it is not available in earlier 5.x versions, so its versatility is low.So I propose two amendments.
First: add index
alter table ahc_visitors add index idx_vst_date(vst_date);
Second: change the query
SELECT SUM(vst_visitors) AS vst_visitors, SUM(vst_visits) AS vst_visits FROM ahc_visitors WHERE 1 = 1 AND vst_date > DATE_FORMAT(CONVERT_TZ('2021-09-15 23:59:59','+00:00', '-09:00'),'%Y-%m-%d %H:%i:%s') and vst_date < DATE_FORMAT(CONVERT_TZ('2021-10-17 00:00:00','+00:00', '-09:00'),'%Y-%m-%d %H:%i:%s');
For reference, My region is Seoul, south Korea and gmt+9
After creating the index, let’s check the query plan.
Original SQL
explain SELECT SUM(vst_visitors) AS vst_visitors, SUM(vst_visits) AS vst_visits FROM ahc_visitors WHERE 1 = 1 AND DATE(CONVERT_TZ(vst_date, '+00:00', '+09:00')) >= '2021-09-16' AND DATE(CONVERT_TZ(vst_date, '+00:00', '+09:00')) <= '2021-10-16'; +----+-------------+--------------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+--------------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | ahc_visitors | NULL | ALL | NULL | NULL | NULL | NULL | 190112 | 100.00 | Using where | +----+-------------+--------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
Modify SQL
explain SELECT SUM(vst_visitors) AS vst_visitors, SUM(vst_visits) AS vst_visits FROM ahc_visitors WHERE 1 = 1 AND vst_date > DATE_FORMAT(CONVERT_TZ('2021-09-15 23:59:59','+00:00', '-09:00'),'%Y-%m-%d %H:%i:%s') and vst_date < DATE_FORMAT(CONVERT_TZ('2021-10-17 00:00:00','+00:00', '-09:00'),'%Y-%m-%d %H:%i:%s') ; +----+-------------+--------------+------------+-------+---------------+--------------+---------+------+-------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+--------------+------------+-------+---------------+--------------+---------+------+-------+----------+-----------------------+ | 1 | SIMPLE | ahc_visitors | NULL | range | idx_vst_date | idx_vst_date | 5 | NULL | 37256 | 100.00 | Using index condition | +----+-------------+--------------+------------+-------+---------------+--------------+---------+------+-------+----------+-----------------------+
After you create an index and modify your query, you can use index .
Of course, the data is output the same without any error.
SELECT SUM(vst_visitors) AS vst_visitors, SUM(vst_visits) AS vst_visits FROM ahc_visitors WHERE 1 = 1 AND DATE(CONVERT_TZ(vst_date, '+00:00', '+09:00')) >= '2021-09-16' AND DATE(CONVERT_TZ(vst_date, '+00:00', '+09:00')) <= '2021-10-16' UNION ALL SELECT SUM(vst_visitors) AS vst_visitors, SUM(vst_visits) AS vst_visits FROM ahc_visitors WHERE 1 = 1 AND vst_date > DATE_FORMAT(CONVERT_TZ('2021-09-15 23:59:59','+00:00', '-09:00'),'%Y-%m-%d %H:%i:%s') and vst_date < DATE_FORMAT(CONVERT_TZ('2021-10-17 00:00:00','+00:00', '-09:00'),'%Y-%m-%d %H:%i:%s'); +--------------+------------+ | vst_visitors | vst_visits | +--------------+------------+ | 16528 | 19920 | | 16528 | 19920 | +--------------+------------+
Currently, the total number of cases in my table is about 200,000
As data increases, Full Scan will increase the load on the DB.So I’m asking for this improvement.
Always Thanks
- The topic ‘Query improvement suggestions’ is closed to new replies.