Performance is influenced by characteristics of the system hosting the database server, for example:
- Disk input/output (I/O) speed.
- Amount of memory available.
- Processor speed.
- Network bandwidth.
All of these factors have an impact on performance. For example, performance of a SQL statement depends upon whether or not a complete table can fit in memory, the time it takes to load the table into memory, the time it takes to process the data in memory, and the network bandwidth available for interaction between the user and the database server.
Disk I/O Speed
Database performance depends heavily on disk I/O speed. Disk accesses are very slow compared to memory operations. In addition to using faster disks, database performance can be improved by minimizing disk accesses. Also, once data is brought into memory (data buffer), keeping it there for future use as long as memory is available will reduce disk accesses there by speeding up query execution. The larger the data buffer, the more the data that can be kept in memory.
Disks are mechanical devices with moving parts which is slow electronic memory. The time to read/write data consists of three components in order of decreasing time consumption:
- Seek Time: Time taken by the disk head to move to the appropriate track.
- Rotation Time: Time taken by the platter to rotate so that the appropriate block on the track is under the disk head.
- Transfer Time: Time to read or write the data block.
Optimizing Disk Accesses
- Database systems try to optimize disk access time by storing tables, indexes, and other database items to minimize seek and rotational times. Rows inserted into a table are stored contiguously in a block with other rows or in adjacent blocks, as appropriate. The execution time of a query, such as a SELECT statement without a WHERE clause, that needs to access a complete table will be minimized if the table is stored in contiguous blocks. Data placement decisions are made by the database system and are hidden from the users.
- A disk block is the unit of disk I/O and storage. The size of the block can be explicitly specified by a database designer or a DBA, overriding the default size. A typical block size ranges from 2K to 16K bytes. The block size is specified when a database is created and typically cannot be altered once defined.
- Large disk block sizes increase the number of bytes transferred per expensive seek thus reducing number of seeks. But using large blocks for applications with updates involving several blocks will cause each update to take longer as it gets expensive to write larger blocks. The default block size should suffice for most applications. In case of databases with special needs, say for multimedia databases with large items, a larger block size may be beneficial to reduce the number of items spread over more than one disk block for storage.
- Disk seeks problem becomes more pronounced as data volumes starts to grow so large that effective caching becomes harder. For large databases with random data access, you will likely need at least one disk seek to read and a couple of disk seeks to write things. For database servers, employ disks with lower seek times.
- Increase the number of available disk spindles which reduces the seek overhead by either symlinking files to different disks or striping the disks.
Using Symbolic Links
You can move tables and databases from the database directory to other locations and replace them with symbolic links to the new locations. You might want to do this, for example, to move a database to a file system with more free space or increase the speed of your system by spreading your tables to different disk.
The recommended way to do this is simply to symlink databases to a different disk.
If an installation has several disks, Striping means putting the first block on the first disk, the second block on the second disk, and the N-th block on the (N MOD number_of_disks) disk, and so on.
If your average data size is less than or equal to the stripe size, performance for multiple reads will improve.
Striping may be suitable for some parallel disk-access systems such as RAID.
We can also tune the Operating System settings for the filesystem that the database is deployed under.
- If the needed data is in memory or data buffers, then disk accesses are not required saving time. If not, then disk blocks containing the data must be read from the disk into the data buffer. Larger data buffers increase the chance of finding the needed data in the buffer.
- If the data buffer is running out of space for new items, then some items in the buffer must be deleted. And if these memory blocks have been altered via updates, then they must be written to disk before removal.
To determine which items should be deleted, database systems use algorithms similar to those used in operating systems such as the least- recently-used (LRU) and the first-in-first-out (FIFO) algorithms and their variants. Note that the data buffer is conceptually similar to a cache in an operating system.
- The DBA has the primary responsibility of determining and specifying memory needs of a database system to ensure optimal performance. To tune performance, the DBA must determine and specify buffer sizes.
In MySQL, it is possible to configure the buffer sizes for:
- Caching queries
- Caching query results
Buffers for caching queries and their results are very useful in situations where the database does not change often and the same queries are executed repeatedly. This is typical for many web servers that serve dynamic pages using database content.
How MySQL Uses Memory
- All threads share the same base memory. When a thread is no longer needed, the memory allocated to it is released and returned to the system unless the thread goes back into the thread cache. In that case, the memory remains allocated.
- The key buffer is shared by all threads; its size is determined by the key_buffer_size variable
- Each thread that is used to manage client connections uses some thread-specific space. The following list indicates these and which variables control their size:
- A stack (default 192KB, variable thread_stack)
- A connection buffer (variable net_buffer_length)
- A result buffer (variable net_buffer_length)
The connection buffer and result buffer both begin with a size given by net_buffer_length but are dynamically enlarged up to max_allowed_packet bytes as needed. The result buffer shrinks to net_buffer_length after each SQL statement. While a statement is running, a copy of the current statement string is also allocated.
- Most requests that perform a sort allocate a sort buffer and also zero to two temporary files depending on the result set size.
- Each request that performs a sequential scan of a table allocates a read buffer defined by variable read_buffer_size.
- For each table having BLOB columns, if you scan a table, a buffer as large as the largest BLOB value is allocated.
- Some memory is allocated for query itself for column structures, parsing, calculating and queries.
- MySQL tests mysqld with several memory-leakage detectors to prevent memory leaks.
- All joins are executed in a single pass, and most joins can be done without even using a temporary table.
- In some cases, the server creates internal temporary tables while processing queries. A temporary table can be held in memory and processed by the MEMORY storage engine, or stored on disk and processed by the MyISAM storage engine.
Faster processors are always a good thing. Typically, disks are likely to be the cause of performance problems because of disk I/O is very slow compared to processor execution time. However, if there are multiple disks and I/O is being performed in parallel, then processors can be a bottleneck. In such cases, faster processors can help speed up response time and improve throughput.
In a client-server architecture, one factor that can affect response time is available bandwidth. In case of intranets and broadband networks, network bandwidth is usually not an issue. However, even in such cases, transferring large amounts of data between the client and the database server can be time consuming and will adversely affect response time.
Tuning Server Parameters
You can determine the default buffer sizes used by the mysqld server using this command:
Code Sample: TuneMySQL/Demos/Show-Vars.bat
mysqld --verbose --help
This command produces a list of all mysqld options and configurable system variables. The output includes the default variable values
When tuning a MySQL server, the two foremost variables to configure are key_buffer_size and table_cache. You should set these appropriately before varying other variables.
The following examples indicate some typical variable values for different runtime configurations.
If you have at least 256MB of memory and many tables and want maximum performance with a moderate number of clients, you should use something like this:
shell> mysqld_safe --key_buffer_size=64M --table_cache=256 \
--sort_buffer_size=4M --read_buffer_size=1M &
If you have only 128MB of memory and only a few tables, but you still do a lot of sorting, you can use something like this:
shell> mysqld_safe --key_buffer_size=16M --sort_buffer_size=1M
If there are very many simultaneous connections, swapping problems may occur unless mysqld has been configured to use very little memory for each connection. mysqld performs better if you have enough memory for all connections.
With little memory and lots of connections, use something like this:
shell> mysqld_safe --key_buffer_size=512K --sort_buffer_size=100K \
Or even this:
shell> mysqld_safe --key_buffer_size=512K --sort_buffer_size=16K \
--table_cache=32 --read_buffer_size=8K \
For GROUP BY or ORDER BY operations on tables larger than available memory, you should increase the value of read_rnd_buffer_size to for faster reading of rows post-sorting.
Some other parameters are discussed in other lessons , and more details can be gathered using MySQL documentation.
MySQL Query Cache
- The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again.
- The query cache is extremely useful in an environment where you have tables that do not change very often and for which the server receives many identical queries. This is a typical situation for many Web servers that generate many dynamic pages based on database content.
- The query cache does not return stale data. When tables are modified, any relevant entries in the query cache are flushed.
- The query cache does not work in an environment where you have multiple mysqld servers updating the same MyISAM tables.
- A query also is not cached under these conditions:
- Server-side prepared statements, or sub-queries or Queries executed within the body of a stored function or trigger.
- user-defined functions (UDFs) or stored functions, user or internal variables, uses explicit locking,uses TEMPORARY tables, generates warnings, or if The user has a column-level privilege for any of the involved tables.
- A query cannot be cached if it contains some of the system-info functions such as DATABASE() or LAST_INSERT_ID().
- Searches for a single row in a single-row table are 238% faster with the query cache than without it.
- To disable the query cache at server startup, set the query_cache_size system variable to 0.
- Incoming queries are compared to those in the query cache before parsing
- Queries must be exactly the same (byte for byte) to be seen as identical. In addition, query strings that are identical may be treated as different for other reasons. Queries that use different databases, different protocol versions, or different default character sets are considered different queries and are cached separately.
- the following two queries are regarded as different by the query cache:
SELECT * FROM film
Select * from film
- If a table changes, all cached queries that use the table become invalid and are removed from the cache.
- The query cache also works within transactions when using InnoDB tables.
- Two query cache-related options may be specified in SELECT statements:
- SQL_CACHE: The query result is cached if it is cacheable and the value of the query_cache_type system variable is ON or DEMAND.
- SQL_NO_CACHE: The query result is not cached.
Here are a few examples:
Code Sample: TuneMySQL/Demos/Query-Cache-In-Select.sql
SELECT SQL_CACHE customer_id, first_name, last_name FROM customer;
SELECT SQL_NO_CACHE customer_id, first_name, last_name FROM customer;
The first query is cached if cache is enabled or on DEMAND.
The second query is not cached at all.
Query Cache Parameters
By default, the query cache is not enabled. You can set up query caching on your system by using the following three system variables:
- query_cache_type: Specifies the operating mode of the query cache, values are explained below.
- query_cache_limit: Specifies the maximum size for a result set that can be cached. For example, if the limit is set to 2M, no result set larger than 2M will be cached. The default limit is 1M.
- query_cache_size: Specifies the amount of memory allocated for caching queries. By default, this variable is set to 0, which means that query caching is turned off. To implement query caching, you should specify a query_cache_size setting in the [mysqld] section of your option file.
As you can see, the only action that you need to take to implement query caching is to set the query_cache_size variable, which is set to 0 by default.
If the query cache size is greater than 0, the query_cache_type variable influences how it works. This variable can be set to the following values:
- 0 or OFF: Prevents caching or retrieval of cached results.
- 1 or ON: Allows caching except of those statements that begin with SELECT SQL_NO_CACHE.
- 2 or DEMAND: Cache only those statements that begin with SELECT SQL_CACHE.
Code Sample: TuneMySQL/Demos/set-Query-Cache-Type.sql
SET SESSION query_cache_type = OFF;
SET SESSION query_cache_type = 2;
The first SET turns off query caching.
The second SET will cache only queries demanding caching via SQL_CACHE option.
You can defragment the query cache to better utilize its memory with:
Code Sample: TuneMySQL/Demos/Flush-Query-Cache.sql
FLUSH QUERY CACHE;
The statement does not remove any queries from the cache.
To remove all query results from the query cache:
Code Sample: TuneMySQL/Demos/Reset-Query-Cache.sql
RESET QUERY CACHE;
The FLUSH TABLES statement can also do the same thing.
To monitor query cache performance:
Code Sample: TuneMySQL/Demos/Show-Qcache.sql
SHOW STATUS LIKE 'Qcache%';
Descriptions of each of these variables can be found in MySQL documentation.
The system variables related to your query cache are not the only variables that can affect performance. There are other system variables related to your table and index cache, as well as other components of MySQL. Refer to the MySQL product documentation for more.
Warning: You cannot use a SET statement to specify the cache size.
The following exercise walks you through the process of viewing the settings for each of these variables and enabling the query cache.
The MyISAM Key Cache
To minimize disk I/O, the MyISAM storage engine exploits a strategy that is used by many database management systems. It employs a cache mechanism to keep the most frequently accessed index blocks in memory. A special structure called the key cache (or key buffer) is maintained. The structure contains a number of block buffers where the most-used index blocks are placed. When data from any table index block must be accessed, the server first checks whether it is available in some block buffer of the key cache.
You can set up multiple key caches and assign table indexes to specific caches.
Multiple threads can access the cache concurrently, governed by concurrent updates. Shared access to the key cache enables the server to improve throughput significantly.
To control the size of the key cache, use the key_buffer_size system variable.
Code Sample: TuneMySQL/Demos/Set-Key-Cache.sql
SET GLOBAL keycache1.key_buffer_size=128*1024;
SHOW VARIABLES LIKE 'key_buffer_size';
The key cache is set to 128K.
To destroy a key cache, set its size to zero:
Code Sample: TuneMySQL/Demos/Destroy-Key-Cache.sql
SET GLOBAL keycache1.key_buffer_size=0;
If there are enough blocks in a key cache to hold blocks of an entire index or the blocks corresponding to its non-leaf nodes, you may preload the key cache with index blocks. Preloading reads the index blocks from disk sequentially which is more efficient than the normal random reads.
Without preloading, the blocks are placed into the key cache as needed by queries. The blocks will stay in the cache if there are enough buffers, the fetching from disk in random order.
To preload an index into a cache, use the LOAD INDEX INTO CACHE statement. For example, the following statement preloads nodes (index blocks) of indexes of the tables table1 and table2:
Code Sample: TuneMySQL/Demos/Load-Index-Into-Cache.sql
CREATE TABLE table1 ( id1 SMALLINT NOT NULL AUTO_INCREMENT PRIMARY KEY ) Engine=MyISAM;
CREATE TABLE table2 ( id2 SMALLINT NOT NULL AUTO_INCREMENT PRIMARY KEY ) Engine=MyISAM;
LOAD INDEX INTO CACHE table1, table2 IGNORE LEAVES;
Create two tables table1 and table2 in MyISAM engine with indexes.
The IGNORE LEAVES modifier causes only blocks for the non-leaf nodes of the index to be preloaded. The statement shown preloads all index blocks from table1, but only blocks for the non-leaf nodes from table2.
Examining Thread Information
It may be worthwhile to diagnose what your MySQL server is doing, to identify any bottlenecks. One can examine the process list - the set of threads currently executing within the server:
Code Sample: TuneMySQL/Demos/Show-Process-List.sql
SHOW FULL PROCESSLIST;
-- Can also use (on command line)
-- myqladmin processlist;
Show detailed information about currently running MySQL threads.
Each process list entry contains several pieces of information:
- Id is the connection identifier for the client associated with the thread.
- User and Host indicate the account associated with the thread.
- db is the default database for the thread, or NULL if none is selected.
- Command and State indicate what the thread is engaged in. Most states correspond to very quick operations. If a thread stays in a given state for unusual time, there might be a problem that needs further investigation.
- Time indicates how long the thread has been in its current state.
- Info contains the text of the statement being executed by the thread, or NULL if it is not executing one. By default, this value contains only the first 100 characters of the statement. To see the complete statements, use SHOW FULL PROCESSLIST.
See MySQL documentation for various Command and State values and their meaning.
Tuning MySQL for Performance Conclusion
This lesson covered the aspects of tuning a MySQL installation at operating system level.