Cost-based choice between subquery materialization and EXISTS
In a previous post , I had demonstrated how subquery materialization , introduced in MySQL 5.6.5, improves the performance of certain queries, like query Q16 of DBT3. Such improvement was easily explained: Subquery materialization has a high start up cost (it needs to create and fill the temporary table). But afterwards it has fast lookups (temporary table has a hash index, no duplicates, and is in memory). In other words, compared to EXISTS, the first evaluation of the IN predicate is slow (high start up cost) and all following evaluations are fast (just a hash lookup). In the DBT 3 setup, one outer table (named "part") has 200,000 rows, so there are 200,000 evaluations of IN, thus subquery materialization wins over EXISTS because the time it loses in the first evaluation is more than compensated by the many faster following evaluations. However, if there were only few outer rows, then subquery materialization should logically be slower than EXISTS (the compensation