tag:blogger.com,1999:blog-9191231274999121063.post4826075751008915919..comments2023-08-27T04:41:55.200-07:00Comments on About MySQL development (Optimizer mainly): Support for LATERAL derived tables added to MySQL 8.0.14Guilhem Bichothttp://www.blogger.com/profile/11253623513851616256noreply@blogger.comBlogger6125tag:blogger.com,1999:blog-9191231274999121063.post-78017630301379336452019-04-08T01:30:08.480-07:002019-04-08T01:30:08.480-07:00Thank you so much. It's black magic :) but it ...Thank you so much. It's black magic :) but it really works.<br /><br />I have already submitted bug report - http://bugs.mysql.com/94903Vasiliy Lyk'yanchikovhttps://www.blogger.com/profile/10464790268091263481noreply@blogger.comtag:blogger.com,1999:blog-9191231274999121063.post-41965415853775333222019-04-05T06:06:42.853-07:002019-04-05T06:06:42.853-07:00Vasiliy: my colleague has looked a bit and it coul...Vasiliy: my colleague has looked a bit and it could theoretically be improved (no idea when, though). Could you please file a bug report? Please note there that you're doing it on my suggestion. A workaround is: extending ORDER BY, like this:<br />select (select post_text from posts where t1.user_id = <br />posts.user_id order by **posts.user_id**, date_added limit 3) as s from posts t1;<br />then MySQL thinks it can use the index.<br />For your original query which orders DESC, it would be:<br /> order by posts.user_id desc, date_added desc<br />and then I get:<br />| 3 | DEPENDENT DERIVED | posts | NULL | ref | user_id | user_id | 4 | t1.user_id | 1 | 100.00 | Backward index scan |Guilhem Bichothttps://www.blogger.com/profile/11253623513851616256noreply@blogger.comtag:blogger.com,1999:blog-9191231274999121063.post-65978431572330287902019-04-04T08:54:49.928-07:002019-04-04T08:54:49.928-07:00Hi. Thanks. Not specific of LATERAL, also visible ...Hi. Thanks. Not specific of LATERAL, also visible with<br />explain select (select post_text from posts where t1.user_id=posts.user_id order by date_added limit 3) as s from posts t1;<br />I'll contact a colleague who knows more, and will let you know.Guilhem Bichothttps://www.blogger.com/profile/11253623513851616256noreply@blogger.comtag:blogger.com,1999:blog-9191231274999121063.post-35366844073404981302019-04-04T01:20:32.871-07:002019-04-04T01:20:32.871-07:00Thanks for the reply.
You can download optimizer ...Thanks for the reply.<br /><br />You can download optimizer trace <a href="https://sqlinfo.ru/forum/attachment.php?item=671" rel="nofollow">here</a><br /><br />important part:<br /> {<br /> "reconsidering_access_paths_for_index_ordering": {<br /> "clause": "ORDER BY",<br /> "steps": [<br /> ],<br /> "index_order_summary": {<br /> "table": "`posts`",<br /> "index_provides_order": false,<br /> "order_direction": "undefined",<br /> "index": "user_id",<br /> "plan_changed": false<br /> }<br /> }<br /> }<br /><br />I don't understand why it's so.Vasiliy Lyk'yanchikovhttps://www.blogger.com/profile/10464790268091263481noreply@blogger.comtag:blogger.com,1999:blog-9191231274999121063.post-52501620367365462782019-04-03T06:14:23.716-07:002019-04-03T06:14:23.716-07:00Hello Vasiliy. First, thanks a lot fot the transla...Hello Vasiliy. First, thanks a lot fot the translations!!<br />About your problem: that is indeed surprising; to decide if it's correct, we should look at the cost-based decision. For that, I suggest you look at the optimizer trace https://dev.mysql.com/doc/internals/en/optimizer-tracing.html ; it will tell which strategy was considered and why it was chosen or rejected.Guilhem Bichothttps://www.blogger.com/profile/11253623513851616256noreply@blogger.comtag:blogger.com,1999:blog-9191231274999121063.post-46788776217883173782019-04-02T22:10:10.616-07:002019-04-02T22:10:10.616-07:00Hello, Guilhem.
I wrote article How to select N ro...Hello, Guilhem.<br />I wrote article <a href="https://sqlinfo.ru/articles/info/45.html" rel="nofollow">How to select N rows for each group</a>. One of solutions uses LATERAL.<br />However, the performance was lower than expected - mysql do not uses unique index, instead of that it reads all rows from group and does filesort. Example<br /><br />CREATE TABLE `posts` (<br /> `post_id` int(11) NOT NULL AUTO_INCREMENT,<br /> `user_id` int(11) NOT NULL,<br /> `date_added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,<br /> `post_text` text NOT NULL,<br /> PRIMARY KEY (`post_id`),<br /> UNIQUE KEY `user_id` (`user_id`,`date_added`)<br />) ENGINE=InnoDB AUTO_INCREMENT=32754 DEFAULT CHARSET=utf8;<br /><br />/*<br />15962 rows in table<br />20 distinct user_id<br />~800 rows per group<br />*/<br /><br />mysql> explain select t2.* from (select user_id from posts group by user_id) as t1,<br /> -> lateral (select * from posts where t1.user_id=posts.user_id order by date_added desc limit 3) as t2;<br />+----+-------------------+------------+------------+-------+---------------+---------+---------+------------+------+----------+----------------------------+<br />| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |<br />+----+-------------------+------------+------------+-------+---------------+---------+---------+------------+------+----------+----------------------------+<br />| 1 | PRIMARY | | NULL | ALL | NULL | NULL | NULL | NULL | 21 | 100.00 | Rematerialize () |<br />| 1 | PRIMARY | | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | NULL |<br />| 3 | DEPENDENT DERIVED | posts | NULL | ref | user_id | user_id | 4 | t1.user_id | 801 | 100.00 | Using filesort |<br />| 2 | DERIVED | posts | NULL | range | user_id | user_id | 4 | NULL | 21 | 100.00 | Using index for group-by |<br />+----+-------------------+------------+------------+-------+---------------+---------+---------+------------+------+----------+----------------------------+<br /><br />mysql> show status like 'handler_%';<br />+----------------------------+-------+<br />| Variable_name | Value |<br />+----------------------------+-------+<br />| Handler_commit | 1 |<br />| Handler_delete | 0 |<br />| Handler_discover | 0 |<br />| Handler_external_lock | 4 |<br />| Handler_mrr_init | 0 |<br />| Handler_prepare | 0 |<br />| Handler_read_first | 1 |<br />| Handler_read_key | 102 |<br />| Handler_read_last | 1 |<br />| Handler_read_next | 15962 |<br />| Handler_read_prev | 0 |<br />| Handler_read_rnd | 60 |<br />| Handler_read_rnd_next | 101 |<br />| Handler_rollback | 0 |<br />| Handler_savepoint | 0 |<br />| Handler_savepoint_rollback | 0 |<br />| Handler_update | 0 |<br />| Handler_write | 80 |<br />+----------------------------+-------+<br /><br />mysql> show status like 'sort%';<br />+-------------------+-------+<br />| Variable_name | Value |<br />+-------------------+-------+<br />| Sort_merge_passes | 0 |<br />| Sort_range | 0 |<br />| Sort_rows | 60 |<br />| Sort_scan | 20 |<br />+-------------------+-------+<br /><br />In pseudo-code, this does:<br />select 20 distinct user_id, store into t1<br />for each row in t1:<br /> select all rows from posts where user_id=t1.user_id<br /> do filesort<br /> take 3 last rows.<br /><br />We have unique index (user_id, date_added). Optimal way like in this pseudo-code:<br />select 20 distinct user_id, store into t1<br />for each row in t1:<br /> take 3 rows using unique key<br /><br />Is it bug?<br /><br />P.S. Also I translated some of your articles (<a href="https://sqlinfo.ru/articles/info/42.html" rel="nofollow">LATERAL</a> and <a href="https://sqlinfo.ru/articles/info/44.html" rel="nofollow">emulate CHECK constraints</a>).<br />Vasiliy Lyk'yanchikovhttps://www.blogger.com/profile/10464790268091263481noreply@blogger.com