Articles

Affichage des articles du 2012

Cost-based choice between subquery materialization and EXISTS

In a previous post , I had demonstrated how subquery materialization , introduced in MySQL 5.6.5, improves the performance of certain queries, like query Q16 of DBT3. Such improvement was easily explained: Subquery materialization has a high start up cost (it needs to create and fill the temporary table). But afterwards it has fast lookups (temporary table has a hash index, no duplicates, and is in memory). In other words, compared to EXISTS, the first evaluation of the IN predicate is slow (high start up cost) and all following evaluations are fast (just a hash lookup). In the DBT 3 setup, one outer table (named "part") has 200,000 rows, so there are 200,000 evaluations of IN, thus subquery materialization wins over EXISTS because the time it loses in the first evaluation is more than compensated by the many faster following evaluations. However, if there were only few outer rows, then subquery materialization should logically be slower than EXISTS (the compensation...

Faster subqueries with materialization

In a previous post , I analyzed how a query of the famous DBT3 benchmark was optimized by MySQL. It was this query, named "Q16" in the DBT3 jargon: select p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt from partsupp, part where p_partkey = ps_partkey and p_brand <> 'Brand#23' and p_type not like 'LARGE PLATED%' and p_size in (43, 1, 25, 5, 35, 12, 42, 40) and ps_suppkey not in ( select s_suppkey from supplier where s_comment like '%Customer%Complaints%' ) group by p_brand, p_type, p_size order by supplier_cnt desc, p_brand, p_type, p_size; Here is a brief recap of conclusions I had drawn: for this query, MySQL tranforms the IN condition to EXISTS and then evaluates it with the "unique_subquery" technique, which does an index lookup into the subquery's table. IN is evaluated 120,000 times (once per combined row of the outer tables). The total execution time of query Q16...

Optimizer tracing used by others!

In a previous post, I had explained how to use MySQL's optimizer tracing , a new feature which appeared in MySQL 5.6.3. As a developer, it feels really good to see others adopt my work and make something useful out of it! My colleague Dimitri Kravtchuk, who is one of our top Benchmarking experts, has written a blog post where he shows how the optimizer tracing has helped him to figure out why, under load, once in a while and randomly, a query performed badly. His investigation technique may be reusable by other people, so I encourage you to read more about it, here .