MySql: Importance of Proper Indexing

As a programmer its easy to forget how our queries can impact the performance of an application. Especially since when doing local testing you can see that your queries execute lightning fast on your commodity hardware, how much faster will it run on a server then? It should be executing faster, but one thing we tend to forget is how fast will those queries perform when there are hundreds of users concurrently executing all those queries in your application.


That's one thing that the previous designers of this application where I work with forgot to take into consideration, basically during the few years they were online everything worked okay, but lately they've seen a tremendous growth in traffic and lately they're seeing humongous loads on their Mysql server.


Initially everyone was blaming the server, after all it was using outdated software, Mysql 4.1, the webserver using PHP4 and it was running a 32 bit OS on a 64bit platform with 16 GB of RAM, which according to some experts we might as well be running it on 4GB or RAM. Anyways since everyone thought the server was to blame I did everything I can to optimize its configuration, doing little tweaks here and there till I can't find anything else to optimize.


Well that didn't help much, one problem we were getting weekly were duplicate orders because the site was too slow and customers were resubmitting their orders which they thought hadn't gone through the system. So the application team decided to  break down the check out process and had a cron job doing part of the process they used to do in one script. For a while this helped, we still get duplicate orders, but we just let the finance team deal with it, until it became an issue again cause they got tired of handling duplicate orders.


I've brought up the issue of indexing the database before our IT manager and showed him a couple of the slow queries on the server based on the slow query log, he did index a couple of the tables but I don't think he was serious about it, most of the index were composite index and the queries never used those indexes at all. Some index were on status fields with "yes" and "no" values, very low cardinality. He actually mentioned before that he thinks adding indexes will further slow down our web application, I did mention that our select/insert ratio was 90/10. Anyways with a couple of days before we go live on a different system I decided to give indexing a try on this system.


You can optimize a query in 3 ways


* Altering the query from the application side, which I have no control of


* Reorganizing the data structure for the tables, which the developers would kill me for doing


* lastly reindexing the tables, which I did.


Using slow query logging and mysql EXPLAIN, I re-indexed several tables on 1-3 databases and after several days of monitoring the logs here's the result. I started Thursday morning on the Aug 26th and stopped re-indexing on the evening of the Aug 30th. From our zenoss load graph you can clearly see the difference of having those indexes, plus the finance team has reported no duplicate orders so far.


Mysql Server load average




Here's an example, this query takes 2 secs and examines a million rows to get 6 rows.


# Query_time: 2  Lock_time: 0  Rows_sent: 6  Rows_examined: 1735526

SELECT * FROM job_item_price_data WHERE job_item_id = '589772';

# User@Host: imsdb[imsdb] @  [10.0.0.179]

mysql> explain SELECT * FROM job_item_price_data WHERE job_item_id = '589772';

+----+-------------+---------------------+------+---------------+------+---------+------+---------+-------------+

| id | select_type | table               | type | possible_keys | key  | key_len | ref  | rows    | Extra       |

+----+-------------+---------------------+------+---------------+------+---------+------+---------+-------------+

|  1 | SIMPLE      | job_item_price_data | ALL  | NULL          | NULL |    NULL | NULL | 1735520 | Using where |

+----+-------------+---------------------+------+---------------+------+---------+------+---------+-------------+

1 row in set (0.01 sec)

mysql> show index from job_item_price_data;

+---------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+

| Table               | Non_unique | Key_name     | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |

+---------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+

| job_item_price_data |          1 | job_order_id |            1 | job_order_id | A         |      216940 |     NULL | NULL   |      | BTREE      |         |

+---------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+

1 row in set (0.00 sec)

mysql> alter table job_item_price_data add index (job_item_id);

Query OK, 1735520 rows affected (8.57 sec)

Records: 1735520  Duplicates: 0  Warnings: 0

mysql> optimize table job_item_price_data;

+----------------------------+----------+----------+----------+

| Table                      | Op       | Msg_type | Msg_text |

+----------------------------+----------+----------+----------+

| ims_db.job_item_price_data | optimize | status   | OK       |

+----------------------------+----------+----------+----------+

1 row in set (0.21 sec)

mysql> explain SELECT * FROM job_item_price_data WHERE job_item_id = '589772';

+----+-------------+---------------------+------+---------------+-------------+---------+-------+------+-------------+

| id | select_type | table               | type | possible_keys | key         | key_len | ref   | rows | Extra       |

+----+-------------+---------------------+------+---------------+-------------+---------+-------+------+-------------+

|  1 | SIMPLE      | job_item_price_data | ref  | job_item_id   | job_item_id |       4 | const |    9 | Using where |

+----+-------------+---------------------+------+---------------+-------------+---------+-------+------+-------------+

1 row in set (0.00 sec)

mysql> SELECT * FROM job_item_price_data WHERE job_item_id = '589772';

+--------------+-------------+------------+-------------+------------+

| job_order_id | job_item_id | field_name | field_value | date_added |

+--------------+-------------+------------+-------------+------------+

|       390170 |      589772 | attr4      | 1147        | 2010-08-26 |

|       390170 |      589772 | attr25     | 1153        | 2010-08-26 |

|       390170 |      589772 | attr26     | 1157        | 2010-08-26 |

|       390170 |      589772 | attr30     | 1165        | 2010-08-26 |

|       390170 |      589772 | attr27     | 2114        | 2010-08-26 |

|       390170 |      589772 | product_id | 49          | 2010-08-26 |

+--------------+-------------+------------+-------------+------------+

6 rows in set (0.00 sec)


References:

http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html

http://dev.mysql.com/doc/refman/5.0/en/explain.html

http://www.mysqlperformanceblog.com/2006/09/06/slow-query-log-analyzes-tools/

Comments

Popular Posts