Are you using the operator IN with subqueries in MySQL and wondering about poor performance? Have you tried EXPLAINing your queries and noticed that they are reported as "DEPENDENT SUBQUERY" despite your best intent to use an uncorrelated subquery? If so, read on:
This post describes a MySQL bug (some call it a feature) which has been open for many years and according to official documentation still exists in version 5.6 (note that I only checked 5.0.x and 5.1.x myself and didn't bother to look for info about 6.0, but I have no high hopes based on the bug's age and character).
Consider the following two SQL statements:
SELECT id FROM t1 WHERE id IN (1, 2, 3)
SELECT id FROM t1 WHERE id IN (SELECT id FROM t2)
In MySQL they have very different performance characteristics: the first one will work fast as expected, the second one will actually cause index lookups to be executed repeatedly for each row from t1 rather than just once for the entire statement. Why?
Blame the "optimizer", as explained in the manual. Our fine uncorrelated query becomes forcingly "optimized" into an EXISTS query and thus correlated (rougly speaking; it's a bit more complicated, but you can view the result by using EXPLAIN EXTENDED followed by SHOW WARNINGS). MySQL of course must handle correlated queries per row, leading to disastrous algorithmic performance in what one would assume is a very common case.
I have no clever workaround - you can't "convince" the MySQL optimizer to behave differently. What you can do is use temporary tables to simulate what you really want to happen. But if you do, beware that you may only refer to a temporary table once in a SELECT statement due to another hugely annoying limitation.
For further information see:
- Subquery restrictions according to MySQL manual
- Optimization Problem with a subquery in an IN Operation
- MySQL WHERE IN subquery runs forever
- Problems with subquery transformations
- Update 20140331: Improvements in MySQL 5.6 - Øystein Grøvlen explains subquery materialization, also see his slides, starting from page 15.
P.S. If you found a MySQL version that works in a smarter way, don't hesitate to drop a comment.