MYSQL-TOOLS

Comparison of graphical tools for MySQL

Home Optimizing and Tuning Queries Query Execution Plan

Query Execution Plan

Print PDF
User Rating: / 4
PoorBest 

For tables that contain large number of rows, you might find that certain SQL statements take relatively longer time to complete.

In MySQL, each query is executed according to a plan generated by the database system's query optimizer, that aims to maximize performance. Selection of an execution plan depends on factors such as available indexes and the selectivity of the different filter conditions in the query.

Selectivity is a measure of the number of rows in a table which will satisfy a condition.

MySQL provides several tools to show how a query will be executed and can be optimized. Using this information, an application writer can modify the query, define a new index, or modify the schema to improve query performance.

The EXPLAIN statement provides an analysis of a specified SELECT statement. To use the EXPLAIN statement, simply include the EXPLAIN keyword, followed by the SELECT statement, as shown in the following syntax:

EXPLAIN <select statement>

Code Sample: OptimizeQueries/Demos/Explain-No-Index.sql

ALTER TABLE film_detail DROP primary key;

EXPLAIN
SELECT title, description, rating, category_name FROM film_detail WHERE film_id = 101;
Code Explanation

Run explain on a table without an Index.

The listing below doesn't show all columns.

+-------------+------+---------------+------+---------+------+
| table | type | possible_keys | key | key_len | rows |
+-------------+------+---------------+------+---------+------+
| film_detail | ALL | NULL | NULL | NULL | 907 |
+-------------+------+---------------+------+---------+------+
1 row in set (0.00 sec)

As there is no index on the film_detail table being read, MySQL must search through each row in the film table to find matching film ID. If the table contains a large number of rows, this process can be slow and very inefficient.

Once an index is created, MySQL searches the index, rather than searching the entire table row by row and, as a result, can find a certain value faster. Searches are faster in indexes because indexes are sorted and identical values are grouped together and organized in an easy-to-locate order. Also, sorting allows MySQL to stop the search as soon as it reaches the end of the matching rows.

Analyzing the output from EXPLAIN Statement

The SELECT statement after the EXPLAIN keyword is specified exactly as intended. The EXPLAIN statement then returns results that provide details about how the SELECT statement will be executed. From these details, you can determine whether indexes are being used effectively, whether you should add new indexes

Each column returned by the EXPLAIN statement provides specific information about how the optimizer plans to execute the SELECT statement. The following table describes each column returned by the EXPLAIN statement and provides information about the values displayed in the previous results set.

Column

Description

id

An identifier for the SELECT statement that is being analyzed. If the statement doesn't include subqueries or unions, the id value is 1 for each row, as is the case in the preceding example results set.

select_type

The type of SELECT statement. The SIMPLE value indicates that the statement doesn't include subqueries or unions. Other values indicate how the statement participates in a subquery or union.

table

The table being analyzed by the row. If an alias is used for the table name, the column displays the alias, rather than the actual table name.

type

The method used to match rows in different tables when the SELECT statement joins two or more tables. If ALL is specified, MySQL conducts a full table scan for each combination of rows from the current table and the joined table. Generally, there should be only one ALL if at all in the first row.

possible_keys

The indexes that MySQL can use to find rows. If NULL, no indexes can be used. In the previous example, the primary key in the Manufacturers table can potentially be used to process the SELECT statement. This index would be considered because it is one of the columns specified in the join condition.

key

The indexes that MySQL actually uses to return rows. If NULL, no indexes are used.

key_len

The length of the index used to retrieve rows. This is most useful in determining how many parts of a multicolumn index are used. For example, if an index is made up of two columns that are each configured as CHAR(4) columns and the key_len column value is 4, you know that only the first column in the index is used. The key_len value is NULL if the key value is NULL.

ref

The column used in conjunction with the index specified in the key column. This usually refers to the column referenced in a foreign key. If NULL, no columns are used.

rows

The number of rows that MySQL plans to examine in order to execute the query. This column is normally your best indicator of the efficiency of the column. The more rows that must be examined, the less efficient the query.

Extra

Additional information about the query. For example, if a query can be executed by referring only to the index, the value Using index is displayed. The Using filesort value is displayed if MySQL must make an additional pass to retrieve rows in a sorted order. The Using temporary value is displayed if MySQL will create a temporary table to execute the query. The Using where value indicates that the WHERE clause will be used to restrict which rows to retrieve.

This table described the basic information needed to understand the results returned by an EXPLAIN statement. For further information about the results returned by the statement, see the MySQL product documentation.

Explain Table

You can use the EXPLAIN statement to return details about a table. To view table details, use the following syntax:

Syntax
EXPLAIN <table name>

This variant of EXPLAIN statement way produces the same output as the DESCRIBE <table name> statement.

Join Optimization

The benefit of indexes becomes even more apparent for queries that join multiple tables. For joined two or more tables, rows in each table are matched together based on the specified join condition. If the tables are not indexed, MySQL must compare each row in a table with each row in the other table to determine which rows contain the values that meet the join criteria.

In a join, the results returned by the EXPLAIN statement include a row for each table that participates in the query.

Code Sample: OptimizeQueries/Demos/Explain-Non-Key-Join.sql

EXPLAIN SELECT name, title, rental_date, city
FROM rental_detail JOIN customer_detail USING (customer_id)
WHERE rental_date >= '2005-05-25'
AND rental_date <= '2005-05-26'
AND city LIKE 'AD%'
;
Code Explanation

Run explain on two joined tables with no indexes or references.

The explain plan of the query looks as shown (some columns have been dropped):

+-------------+-----------------+------+---------------+-------+
| select_type | table | type | possible_keys | rows |
+-------------+-----------------+------+---------------+-------+
| SIMPLE | customer_detail | ALL | NULL | 563 |
| SIMPLE | rental_detail | ALL | NULL | 16082 |
+-------------+-----------------+------+---------------+-------+
2 rows in set (0.00 sec)

Cartesian Product

In a SELECT statement joining tables app_group, app_user, and app_permission, none of which are configured with indexes. Each row in app_group is combined with each row in tables app_user and app_permission to determine whether each combination matches the join condition. In other words, the first row in app_group is combined with the first row in app_user and the first row in app_permission and so on. This continues until all possible combinations are compared, which can result in excessively large searches. This process leads to the dreaded Cartesian Product as a result. This matching gets worse as existing tables grow larger and more tables are joined together. For example, if each table contains just 100 rows, MySQL must compare one million rows (100 x 100 x 100) to locate rows matching the join conditions.

Match Elimination: Filtering Joins

Join queries often have filter conditions that restrict the set of rows that can participate in the join. In this case, indexes can be used to skip reading unwanted rows.

Code Sample: OptimizeQueries/Demos/Match-Elimination.sql

EXPLAIN
SELECT first_name, last_name, r.rental_date, f.title, f.rating
FROM rental r, customer c, address a, city, inventory i, film f
WHERE 1=1
AND c.address_id = a.address_id
AND a.city_id = city.city_id
AND r.customer_id = c.customer_id
AND r.inventory_id = i.inventory_id
AND i.film_id = f.film_id
AND r.rental_date > '2005-04-01'
AND r.rental_date < '2005-09-30'
AND city LIKE 'A%'
AND f.title LIKE 'SU%'
;

EXPLAIN
SELECT first_name, last_name, r.rental_date, f.title, f.rating
FROM rental r JOIN customer c USING(customer_id)
JOIN address a USING (address_id)
JOIN city USING (city_id)
JOIN inventory i USING (inventory_id)
JOIN film f USING (film_id)
WHERE 1=1
AND r.rental_date > '2005-04-01'
AND r.rental_date < '2005-09-30'
AND city LIKE 'A%'
AND f.title LIKE 'SU%'
;
Code Explanation

Run explain on various joined tables.

As most joined keys are indexed, those indexes are being used, eliminating extensive matching.

Using Joins in Aggregation

Aggregating queries generally have one or more joins to reference informative data from other non-aggregated tables as shown:

Code Sample: OptimizeQueries/Demos/Explain-Non-Key-Aggregate.sql

EXPLAIN SELECT city, count(rental_id)
FROM rental_detail JOIN customer_detail USING (customer_id)
WHERE rental_date >= '2005-01-01'
AND rental_date <= '2005-06-30'
AND city LIKE 'AD%'
GROUP BY city
ORDER BY 2 DESC
;
Code Explanation

Run explain on two joined tables with no indexes or references with grouping.

+--------+-----------------+------+-------+----------------------------------------------+
| select | table | type | rows | Extra |
+--------+-----------------+------+-------+----------------------------------------------+
| SIMPLE | customer_detail | ALL | 563 | Using where; Using temporary; Using filesort |
| SIMPLE | rental_detail | ALL | 16082 | Using where |
+--------+-----------------+------+-------+----------------------------------------------+
2 rows in set (0.00 sec)

In this case,the Extra column displays the value Using temporary; Using filesort for the first row and displays the value Using where in the second row.

In the two customer_details results we have seen, an ALL specified as the scan type means a full-table scan will be conducted on both tables or no indexes are being used to process the SELECT statement.

Optimizing Joins

The first step here is to determine if we need any more keys or indexes in the tables. One place to start is by adding a foreign key to the rental_detail table. The foreign key provides the important referential integrity and also creates an index on that column, which is specified in the join condition. Use ALTER TABLE statement to add the foreign key:

Code Sample: OptimizeQueries/Demos/Create-Ref-Key.sql

ALTER TABLE customer_detail
ADD PRIMARY KEY (customer_id);

ALTER TABLE rental_detail
ADD FOREIGN KEY (customer_id) REFERENCES customer_detail (customer_id);

Code Explanation

Create indexes on the previously joined tables.

After adding these indexes, execute the same two explain plans used above. The output is shown below (truncated and reformatted):

+-----------------+------+---------------+-------------+---------+
| table | type | possible_keys | key | key_len |
+-----------------+------+---------------+-------------+---------+
| customer_detail | ALL | PRIMARY | NULL | NULL |
| rental_detail | ref | customer_id | customer_id | 2 |
+-----------------+------+---------------+-------------+---------+

------------------------------------+------+-------------+
ref | rows | Extra |
------------------------------------+------+-------------+
NULL | 671 | Using where |
sakila.customer_detail.customer_id | 13 | Using where |
------------------------------------+------+-------------+
+-----------------+------+---------------+-------------+---------+
| table | type | possible_keys | key | key_len |
+-----------------+------+---------------+-------------+---------+
| customer_detail | ALL | PRIMARY | NULL | NULL |
| rental_detail | ref | customer_id | customer_id | 2 |
+-----------------+------+---------------+-------------+---------+

-----------------------------+------+--------------------------+
ref | rows | Extra |
-----------------------------+------+--------------------------+
NULL | 671 | temporary;Using filesort |
customer_detail.customer_id | 13 | |
-----------------------------+------+--------------------------+

Though the results look similar to ones displayed before, there are several differences. The new results show a lower row count in both the tables. The reduced multiple of these counts looks even more attractive than before. But still, one important thing to note is that we are matching on city which is not indexed, which is why type column above is ALL for customer_detail. As a result, MySQL will still do a near full table scan on both tables.

Essentially, indexes eliminate the need to compare all the rows in all the tables. Once the columns in our cartesian join conditions are indexed, MySQL uses an index to locate the first applicable row in app_permission. MySQL then uses an index on app_user to match the row in app_permission to the appropriate row in app_user.

MySQL uses an index on app_group to match the row in app_user to the appropriate row in app_group. At each step, MySQL is using indexes to locate values in order to match rows according to how the join condition has been defined. As a result, the number of rows that must be searched is significantly reduced, which results in a dramatic improvement in performance.

Driving Tables

The first table in this join description is called the driving table. Some database systems refer to the driving table as the outer table and the other table as the inner table. Database systems uses statistics, such as the distribution of data in a table, to determine which table should be used as the driving table.

Many database systems allow application writers to specify the driving table explicitly.

Consider the following join query that finds the names of all customers who rented films were in a given period:

Code Sample: OptimizeQueries/Demos/Join-Driv-Rental.sql

SELECT count(1)
FROM rental_detail
WHERE rental_date >= '2005-05-25'
AND rental_date <= '2005-05-26'
;

SELECT count(1)
FROM customer;

EXPLAIN SELECT name, title, rental_date
FROM rental_detail JOIN customer_detail USING (customer_id)
WHERE rental_date >= '2005-05-25'
AND rental_date <= '2005-05-26'
;

Code Explanation

Show a join between tables without keys.

In general, the driving table should be the smallest table after rows have been eliminated by the filter conditions. In example above, the rental_date filter condition reduces the number of rows of the rental_detail table making it the driving table if the reduced row count is less than the total number of rows in customer_detail.

In the query below, the customer table may become the driving table after being filtered:

Code Sample: OptimizeQueries/Demos/Join-Driv-Customer.sql

SELECT count(1)
FROM rental_detail
WHERE rating = 'PG';

SELECT count(1)
FROM customer_detail
WHERE country LIKE 'U%'
;


EXPLAIN SELECT name, title, rental_date, country
FROM rental_detail JOIN customer_detail USING (customer_id)
WHERE country LIKE 'U%'
AND rating = 'PG';

Code Explanation

Show a join between tables without keys.

When to Index

The following list provides some guidelines when to index:

  • Index columns that appear in search conditions. As a general rule, you should consider defining an index on any column that you commonly use in WHERE, GROUP BY, or HAVING clauses. Because these columns define the limitations of a query, they are good candidates for improving performance because they allow MySQL to identify quickly which rows should be included in a search and which should not.
  • Index columns that appear in join conditions. Index any columns that appear in a join condition. Because join conditions are often based on foreign key columns that reference primary key columns, MySQL creates the indexes automatically when you define the primary keys and foreign keys.
  • Do not index columns that appear only in the SELECT clause. If a column appears in the SELECT clause but not in WHERE, GROUP BY, or HAVING clauses, you usually shouldn't index these columns because indexing them provides no performance benefit but does require additional storage. Indexing columns in the SELECT clause provides no benefit because the SELECT clause is one of the last parts of a SELECT statement to be processed. MySQL conducts searches based on the other clauses.
  • Do not index columns with a small domain. If a column contains a few distinct (many duplicated) values, indexing that column provides little benefit. Because of the way in which MySQL accesses an index and uses that index to locate the rows in the tables, many duplicated values can actually cause the process to take longer than if no index is used.

    In fact, when MySQL finds that a value occurs in more than 30 percent of a table's rows, it usually doesn't use the index at all.

  • Use prefixes for indexes on columns that contain large string values. If you're adding an index to a string column, consider defining a prefix on that index so that your index includes only part of the entire values, as they're stored in the table. For example, if your table includes a CHAR(150) column, you might consider indexing only the first 10 or 15 bytes, or whatever number provides enough unique values without having to store the entire values in the index.
  • Create only the indexes that you need. Never create more indexes than you need. If a column is rarely used in a search or join condition, don't index that column. You want to index only those columns that are frequently used to identify the rows being searched.

Index Types

Once you determine whether to index a column you must decide which type of index to use. You can define one or more indexes as part of that statement. MySQL supports five types of indexes:

  • Primary key: Requires that each value or set of values be unique in the columns on which the primary key is defined. In addition, NULL values are not allowed. A table can include only one primary key.
  • Foreign key: Enforces the relationship between the referencing columns in the child table where the foreign key is defined and the referenced columns in the parent table.
  • Regular: Provides a basic index that permits duplicate values and NULL values in the columns on which the index is defined.
  • Unique: Requires that each value or set of values be unique in the columns on which the index is defined. Unlike primary key indexes, NULL values are allowed.
  • Full-text: Supports full-text searches of the values in the columns on which the index is defined. A full-text index permits duplicate values and NULL values in those columns. A full-text index can be defined only on MyISAM tables and only on CHAR, VARCHAR, and TEXT columns.

Using indexes will address many of the performance issues and yet even a well- indexed table can experience performance problems. As a result, we will consider other methods to optimize your queries.

SQL Queries: Other Optimizations

You are likely to perform a varied set of operations through applications that both retrieve and modify data. To ensure that your system is fully optimized, you must take into account methods to improve data retrieval, but also methods that improve inserting updating and deleting data.

Optimizing Data Retrieval

Of all the operations performed against a MySQL database, data retrieval operations via SELECT statements, are the most common. When you execute a SELECT statement, MySQL uses the query optimizer to analyze the statement and perform the query as effectively as possible. The optimizer is a MySQL component whose sole purpose is to ensure the best performance possible for each query by planning out the execution of that query.

The optimizer's primary goal is to try to use indexes whenever possible to process the statement. The optimizer tries to determine which indexes benefit the execution of the statement and which ones don't. In some cases, the optimizer determines that it is better to bypass the index, rather than use it, such as when there are too many duplicate values.

The optimizer also tries to determine the greatest number of rows that can be eliminated from the search. To better understand how this works, take a look at the following SELECT statements:

Code Sample: OptimizeQueries/Demos/Pick-Filter.sql

select count(1) from film WHERE rental_rate = 4.99;

select count(1) from film WHERE rating = 'PG';

select count(1) from film WHERE rental_rate = 4.99 AND rating = 'PG';
Code Explanation

See which expression results in fewer rows.

  • The statement returns 68 rows meeting both the conditions in the WHERE clause.
  • When the optimizer first looks at the statement, it tests each of these expressions in order to estimate the number of rows that must be examined to meet the conditions specified by the expressions.
  • The optimizer estimates that 336 rows need to be examined for the first expression.
  • If only 68 of the 1000 rows meet both search conditions, MySQL has to unnecessarily search through 266 rows that don't meet both search conditions.
  • The optimizer also estimates that only 194 rows have to be examined for the second expression, using which only 126 rows fail to meet both search conditions.
  • Based on the results of its initial tests of the two conditions, the query optimizer may decide that it is best to use the second expression first and then from those limited results, process the first expression.
  • This way, fewer rows have to be processed, which means that the SELECT statement requires less processing time and fewer disk I/O operations than if the first expression is processed first.

Though the optimizer includes other many capabilities such as expression rearrangement, you must sometimes analyze your SELECT statements to improve performance. The most effective method that you can use to analyze your SELECT statement is to use the EXPLAIN statement.

Cardinality of a Table

Index cardinality refers to the number of unique values in an index. For example, suppose that you have a column that permits only three values. If you create an index on the column, the cardinality for that index is 3 with no regard to number of rows in the table. Normally, MySQL ignores indexes with low cardinality as that index may not be efficient. As a result, when you add an index to a table or modify it significantly in any other way, you should ensure that the cardinality is correctly represented to the query optimizer. The easiest way to do this is to execute an OPTIMIZE TABLE statement.

You can view a table's cardinality by using the SHOW INDEX statement on that table. Viewing the cardinality does not necessarily tell you whether that setting is improperly set, so you still might want to run the OPTIMIZE TABLE statement.

Using the OPTIMIZE TABLE Statement

The OPTIMIZE TABLE statement performs a number of functions. For example, it defragments the table and sorts the table's indexes. It also updates the internal table statistics. One of these statistics is the cardinality of its indexes. If you add an index to an existing table, you might have to use the OPTIMIZE TABLE statement to ensure that the table statistics are accurate when read by the query optimizer. The following syntax shows how to create an OPTIMIZE TABLE statement:

Syntax
OPTIMIZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE <table name> [{, <table name>}...]

You can specify the synonymous options of LOCAL or NO_WRITE_TO_BIN_LOG to prevent the optimization process from being written to the binary log.

Now return to the examples from previous section. Because you added an index to the rental_detail table, you should run the OPTIMIZE TABLE statement on that table, as shown in the following example:

OPTIMIZE TABLE rental_detail;

When you execute this statement, you should receive results similar to the following:

+---------------+----------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+---------------+----------+----------+----------+
| rental_detail | optimize | status | OK |
+---------------+----------+----------+----------+
1 row in set (0.35 sec)

Adding indexes and executing OPTIMIZE TABLE statements are not the only methods that you can use to optimize query performance. MySQL also recommends other steps that you can take to maximize performance.

The Slow Query Log

The slow query log consists of all SQL statements that took more than a certain number of seconds to execute and are therefore candidates for optimization. The time to acquire the initial table locks is not counted as execution time. mysqld writes a statement to the slow query log after it has been executed and after all locks have been released, so log order might be different from execution order.

Use the option long_query_time to specify the time limit of slow queries in seconds. The minimum and default values of long_query_time are 1 and 10, respectively.

To enable the slow query log, start mysqld with the options - log-slow-queries[=file_name] and long_query_time options.

[mysqld]
log-slow-queries[=<path/filename>]
long_query_time=<time-in-seconds>

If no file_name value is given for --log-slow-queries, the default name is <host_name>-slow.log.

Use the following to set up slow query log via command parameters.

root> ./bin/mysqld_safe --log_slow_queries=/mysql/logs/slow_query.log --long_query_time=20

To simplify examining the queries in a long slow query log, you can use mysqldumpslow command on some versions of MySQL.

Queries that do not use indexes are logged in the slow query log if the -- log-queries-not-using-indexes option is specified.

The --log-slow-admin-statements server option enables you to request logging of slow administrative statements such as OPTIMIZE TABLE, ANALYZE TABLE, and ALTER TABLE to the slow query log.

Note: Queries handled by the query cache are not added to the slow query log, nor are queries that would not benefit from the presence of an index because the table has zero rows or one row.

Optimizing and Tuning Queries Conclusion

This lesson covered the optimization of a MySQL database for performance.

There are many steps that you can take to try to improve how well your SQL statements perform. The most important step is to ensure that your tables have been properly indexed. You want to be sure that each table contains the indexes that it needs, while at the same time ensuring that the table is not over-indexed. When necessary, you should also analyze your SELECT statements to ensure that they are being executed as optimally as possible. You should also take the steps necessary to maximize the performance of your data modification statements. In addition, you should look at how your tables have been created, and you should consider enabling query caching. To help you optimize your system, this lesson covered the following topics:

  • Setting up effective indexes

  • Using the EXPLAIN statement to analyze SELECT statement performance

  • Using the OPTIMIZE TABLE statement to optimize your table after changes have been made to that table

  • Improving the performance of data retrieval and modification

  • Enabling your system's query cache to improve query performance

The subject of optimization is a broad one that can cover many aspects of running MySQL. Although this lesson attempted to touch on many of the important issues concerning optimizing your system, it could not cover each subject as extensively as possible. In fact, the subject of optimization is a whole course in itself. In addition, this lesson does not cover hardware considerations in system optimization. Consequently, you're encouraged to refer to other resources for information about optimizing your system, particularly the MySQL product documentation and Web site.

To continue to learn MySQL go to the top of this page and click on the next lesson in this MySQL Tutorial's Table of Contents.
 
mysql tools

HTTP-tunnel - Ability to manage MySQL database via the Internet. More...

'mysql.proc' doesn't exist fixed problem

_

Table 'mysql.proc' doesn't exist. Fix error 1146 Table doesn't exist here...