Comments on About MySQL development (Optimizer mainly): Support for LATERAL derived tables added to MySQL 8.0.14

Thank you so much. It's black magic :) but it ...

2019-04-08T01:30:08.480-07:00

Thank you so much. It's black magic :) but it really works.

I have already submitted bug report - http://bugs.mysql.com/94903

Vasiliy: my colleague has looked a bit and it coul...

2019-04-05T06:06:42.853-07:00

Vasiliy: my colleague has looked a bit and it could theoretically be improved (no idea when, though). Could you please file a bug report? Please note there that you're doing it on my suggestion. A workaround is: extending ORDER BY, like this:
select (select post_text from posts where t1.user_id =
posts.user_id order by **posts.user_id**, date_added limit 3) as s from posts t1;
then MySQL thinks it can use the index.
For your original query which orders DESC, it would be:
order by posts.user_id desc, date_added desc
and then I get:
| 3 | DEPENDENT DERIVED | posts | NULL | ref | user_id | user_id | 4 | t1.user_id | 1 | 100.00 | Backward index scan |

Hi. Thanks. Not specific of LATERAL, also visible ...

2019-04-04T08:54:49.928-07:00

Hi. Thanks. Not specific of LATERAL, also visible with
explain select (select post_text from posts where t1.user_id=posts.user_id order by date_added limit 3) as s from posts t1;
I'll contact a colleague who knows more, and will let you know.

Thanks for the reply. You can download optimizer ...

2019-04-04T01:20:32.871-07:00

Thanks for the reply.

You can download optimizer trace here

important part:
{
"reconsidering_access_paths_for_index_ordering": {
"clause": "ORDER BY",
"steps": [
],
"index_order_summary": {
"table": "`posts`",
"index_provides_order": false,
"order_direction": "undefined",
"index": "user_id",
"plan_changed": false
}
}
}

I don't understand why it's so.

Hello Vasiliy. First, thanks a lot fot the transla...

2019-04-03T06:14:23.716-07:00

Hello Vasiliy. First, thanks a lot fot the translations!!
About your problem: that is indeed surprising; to decide if it's correct, we should look at the cost-based decision. For that, I suggest you look at the optimizer trace https://dev.mysql.com/doc/internals/en/optimizer-tracing.html ; it will tell which strategy was considered and why it was chosen or rejected.

Hello, Guilhem. I wrote article How to select N ro...

2019-04-02T22:10:10.616-07:00

Hello, Guilhem.
I wrote article How to select N rows for each group. One of solutions uses LATERAL.
However, the performance was lower than expected - mysql do not uses unique index, instead of that it reads all rows from group and does filesort. Example

CREATE TABLE `posts` (
`post_id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`date_added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`post_text` text NOT NULL,
PRIMARY KEY (`post_id`),
UNIQUE KEY `user_id` (`user_id`,`date_added`)
) ENGINE=InnoDB AUTO_INCREMENT=32754 DEFAULT CHARSET=utf8;

/*
15962 rows in table
20 distinct user_id
~800 rows per group
*/

mysql> explain select t2.* from (select user_id from posts group by user_id) as t1,
-> lateral (select * from posts where t1.user_id=posts.user_id order by date_added desc limit 3) as t2;
+----+-------------------+------------+------------+-------+---------------+---------+---------+------------+------+----------+----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------------+------------+------------+-------+---------------+---------+---------+------------+------+----------+----------------------------+
| 1 | PRIMARY | | NULL | ALL | NULL | NULL | NULL | NULL | 21 | 100.00 | Rematerialize () |
| 1 | PRIMARY | | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | NULL |
| 3 | DEPENDENT DERIVED | posts | NULL | ref | user_id | user_id | 4 | t1.user_id | 801 | 100.00 | Using filesort |
| 2 | DERIVED | posts | NULL | range | user_id | user_id | 4 | NULL | 21 | 100.00 | Using index for group-by |
+----+-------------------+------------+------------+-------+---------------+---------+---------+------------+------+----------+----------------------------+

mysql> show status like 'handler_%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 4 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 1 |
| Handler_read_key | 102 |
| Handler_read_last | 1 |
| Handler_read_next | 15962 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 60 |
| Handler_read_rnd_next | 101 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 80 |
+----------------------------+-------+

mysql> show status like 'sort%';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| Sort_merge_passes | 0 |
| Sort_range | 0 |
| Sort_rows | 60 |
| Sort_scan | 20 |
+-------------------+-------+

In pseudo-code, this does:
select 20 distinct user_id, store into t1
for each row in t1:
select all rows from posts where user_id=t1.user_id
do filesort
take 3 last rows.

We have unique index (user_id, date_added). Optimal way like in this pseudo-code:
select 20 distinct user_id, store into t1
for each row in t1:
take 3 rows using unique key

Is it bug?

P.S. Also I translated some of your articles (LATERAL and emulate CHECK constraints).