‘EXISTS’ is better than ‘IN’

While working on my last project, I was facing a query which was taking 30 second to execute. The number of tables involved were 10 and total records were around 20k.  It was a search functionality for a web-app project.  There were lot of sub-queries involved in the first draft, all of them using ‘IN’. After doing some research ( within MySQL Community ) , I just replaced all ‘IN’  by ‘EXISTS’ and voila !  The query took 11 seconds to execute.

For example, you could change this query  :

SELECT *

FROM invitees i

WHERE i.event_id IN ( SELECT e.id

FROM event e

WHERE e.city LIKE = ‘ABC’ )

TO

SELECT *

FROM invitees i

WHERE EXISTS ( SELECT 1

FROM event e

WHERE e.city LIKE = ‘ABC’

AND e.id = i.event_id ).

Try it and let me know if you find the same difference in performance as I have. There were many more modifications  to the original query to bring it further down to 1 sec. But I will talk about that later.  For more information on  EXISTS and IN Sub Queries see Reference below.

References :

http://dev.mysql.com/doc/refman/5.0/en/optimizing-subqueries.html

http://dev.mysql.com/doc/refman/5.0/en/in-subquery-optimization.html

About these ads

9 Comments »

  1. Jay Pipes said

    1) You are using LIKE with no wildcard, and so it really should be = ‘ABC’

    2) You would get sub-second performance with a standard join:

    SELECT i.*
    FROM invitees i
    INNER JOIN event e
    ON e.id = i.event_id
    AND e.city = ‘ABC’;

  2. Hi there,

    in my opinion, these

    SELECT *
    FROM invitees i
    WHERE i.event_id IN (
    SELECT id
    FROM event e
    WHERE city LIKE ‘ABC’
    )

    Should all be rewritten to joins

    SELECT i.*
    FROM invitees i
    INNER JOIN event e
    ON i.event_id = e.id
    WHERE e.city LIKE ‘ABC’

    Also, the LIKE does not contain a wildcard, we can just as well write:

    WHERE e.city = ‘ABC’

    instead of

    WHERE e.city LIKE ‘ABC’

    Roland Bouman
    http://rpbouman.blogspot.com/

  3. Oh , I am sorry. Should have put LIKE ‘%ABC%’ or = ‘ABC’ for the example.

    Second, I agree with your example of using a standard join, it will be better. The 2 examples given were to show the difference between IN and EXISTS , so didn’t focus on JOIN. But yeah, there are ways to replace ‘IN’ Sub Queries with JOIN for better performance.

  4. Andrew said


    SELECT *

    FROM invitees i

    WHERE i.event_id IN ( SELECT e.id

    FROM event e

    WHERE e.city LIKE = ‘ABC’ )

    I believe this is slow because the inner query is evaluated, and then the outer query is evaluated – you do not reference the inner and outer queries directly.

    If you rewrote the query to be:

    SELECT *

    FROM invitees i

    WHERE i.event_id IN ( SELECT e.id

    FROM event e

    WHERE e.city LIKE = ‘ABC’ AND i.event_id = e.id )

    You would find the query to be much faster.

  5. Ahmed said

    Hi,

    I have this SQL query but instead of using IN i like to use the EXISTS operator, now how can I do that? My brain is hanging on this and need help. Thanks

    SELECT C.CUSTOMER_NUM, C.CUSTOMER_NAME
    FROM CUSTOMER C, ORDERS O
    WHERE
    C.CUSTOMER_NUM = O.CUSTOMER_NUM
    AND
    O.ORDER_DATE IN (’2007-10-21′);

    • Hi Ahmed,

      You can try this query :

      SELECT C.CUSTOMER_NUM, C.CUSTOMER_NAME
      FROM CUSTOMER C
      WHERE EXISTS (
      SELECT 1
      FROM ORDERS O
      WHERE O.ORDER_DATE = ’2007-10-21′
      AND C.CUSTOMER_NUM = O.CUSTOMER_NUM
      )

  6. Ahmed said

    that did work many many thanks for that..you guys deserve an oscar award for being so helpfull…cheers thanks

  7. Ajit said

    Hi….
    can some buddy help me…to use EXISTS …how i can use EXISTS in following query insted of IN

    ……………………….
    …………………….
    ………..WHERE application_opus.company_id = ’50′
    AND application_opus.application_id
    IN (
    SELECT event_log_link_opus.event_log_link_reference_id
    FROM event_log_link_opus
    LEFT JOIN event_log_link_type_opus ON event_log_link_type_opus.event_log_link_type_id = event_log_link_opus.event_log_link_type_id
    LEFT JOIN event_log_opus ON event_log_opus.event_log_id = event_log_link_opus.event_log_id
    LEFT JOIN event_log_type_opus ON event_log_type_opus.event_log_type_id = event_log_opus.event_log_type_id
    WHERE event_log_link_type_opus.name_short = ‘application_id’
    AND event_log_type_opus.event_type IN (‘dteen_fraud_rule_130′)
    )

  8. SacTiw said

    After looking at what Rishi has to offer, you this might work:

    WHERE application_opus.company_id = ’50′
    AND EXISTS (
    SELECT 1
    FROM event_log_link_opus
    LEFT JOIN event_log_link_type_opus ON event_log_link_type_opus.event_log_link_type_id = event_log_link_opus.event_log_link_type_id
    LEFT JOIN event_log_opus ON event_log_opus.event_log_id = event_log_link_opus.event_log_id
    LEFT JOIN event_log_type_opus ON event_log_type_opus.event_log_type_id = event_log_opus.event_log_type_id
    WHERE event_log_link_type_opus.name_short = ‘application_id’
    AND event_log_type_opus.event_type = ‘dteen_fraud_rule_130′
    AND application_opus.application_id = event_log_link_opus.event_log_link_reference_id
    );

    I, btw, know it’s too late to answer this but I saw it today and thought of giving it a try!!! :-)

RSS feed for comments on this post · TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: