You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

456 lines
42 KiB
Markdown

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

# 23丨MySQL数据库级监控及常用计数器解析
上一篇文章中,我们讲了有关数据库的全局分析,那么在今天的文章中,我们继续看看在数据库中,如何做定向分析。
还记得我在上篇文章中提到的工具吗mysqlreport、pt-query-digest和mysql\_exportor+Prometheus+Grafana。我们在上一篇中已经讲完了mysqlreport今天我们来看看剩下的这几个。
## 定向抓取SQLpt-query-digest
`pt-query-digest`是个挺好的工具,它可以分析`slow log`、`general log`、`binary log`还能分析tcpdump抓取的MySQL协议数据可见这个工具有多强大。`pt-query-digest`属于Percona-tool工具集这个Percona公司还出了好几个特别好使的监控MySQL的工具。
`pt-query-digest`分析slow log时产生的报告逻辑非常清晰并且数据也比较完整。执行命令后就会生成一个报告。
我来稍微解释一下这个报告。我们先看这个报告的第一个部分:
```
# 88.3s user time, 2.5s system time, 18.73M rss, 2.35G vsz
# Current date: Thu Jun 22 11:30:02 2017
# Hostname: localhost
# Files: /Users/Zee/Downloads/log/10.21.0.30/4001/TENCENT64-slow.log.last
# Overall: 210.18k total, 43 unique, 0.26 QPS, 0.14x concurrency _________
# Time range: 2017-06-12 21:20:51 to 2017-06-22 09:26:38
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 118079s 100ms 9s 562ms 2s 612ms 293ms
# Lock time 15s 0 7ms 71us 119us 38us 69us
# Rows sent 1.91M 0 48.42k 9.53 23.65 140.48 2.90
# Rows examine 13.99G 0 3.76M 69.79k 101.89k 33.28k 68.96k
# Rows affecte 3.36M 0 1.98M 16.76 0.99 4.90k 0
# Query size 102.82M 6 10.96k 512.99 719.66 265.43 719.66
```
从上表中可以看得出来在这个慢日志中总执行时间达到了118079s平均执行时间为562ms最长执行时间为9s标准方差为612ms。
可见在此示例中SQL执行还是有点慢的。
这时也许会有人问SQL执行多长时间才是慢呢之前在一个金融机构我跟一个做核心系统的团队讨论他们的SQL执行时间指标。他们判断之后说希望SQL平均执行时间指标定在500ms。我说你们要500ms那前面还有一连串的节点才能到达最终的用户如果每个环节都这样要求自己那最终的用户不就明显感觉到很慢了吗
经过一轮轮的讨论最后定在了100ms以内。
其实从我的经验上来看对于大部分实时的业务一个SQL执行的平均时间指标定在100ms都多了。但是对性能来说就是这样在所有的环节中都没有固定的标准只有经验数据和不断演化的系统性能能力。
我们再接着分析上面的数据。再来看`pt-query-digest`给出的负载报表:
```
# Profile
# Rank Query ID Response time Calls R/Call V/M Item
# ==== ================== ================ ====== ====== ===== ===========
# 1 0x6A516B681113449F 73081.7989 61.9% 76338 0.9573 0.71 UPDATE mb_trans
# 2 0x90194A5C40980DA7 38014.5008 32.2% 105778 0.3594 0.20 SELECT mb_trans mb_trans_finan
# 3 0x9B56065EE2D0A5C8 3893.9757 3.3% 9709 0.4011 0.11 UPDATE mb_finan
# MISC 0xMISC 3088.5453 2.6% 18353 0.1683 0.0 <40 ITEMS>
```
从这个表中可以看到有两个SQL的执行时间占了总执行时间的94%显然这两个SQL是要接下来要分析的重点。
我们再接着看这个工具给出的第一个SQL的性能报表
```
# Query 1: 0.30 QPS, 0.29x concurrency, ID 0x6A516B681113449F at byte 127303589
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.71
# Time range: 2017-06-16 21:12:05 to 2017-06-19 18:50:59
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 36 76338
# Exec time 61 73082s 100ms 5s 957ms 2s 823ms 672ms
# Lock time 19 3s 20us 7ms 38us 66us 29us 33us
# Rows sent 0 0 0 0 0 0 0 0
# Rows examine 36 5.06G 3.82k 108.02k 69.57k 101.89k 22.70k 68.96k
# Rows affecte 2 74.55k 1 1 1 1 0 1
# Query size 12 12.36M 161 263 169.75 192.76 11.55 158.58
# String:
# Databases db_bank
# Hosts 10.21.16.50 (38297/50%)... 1 more
# Users user1
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms ################################################################
# 1s #########################################
# 10s+
# Tables
# SHOW TABLE STATUS FROM `db_bank` LIKE 'mb_trans'\G
# SHOW CREATE TABLE `db_bank`.`mb_trans`\G
UPDATE mb_trans
SET
resCode='PCX00000',resultMes='交易成功',payTranStatus='P03',payRouteCode='CMA'
WHERE
seqNo='20170619PM010394356875'\G
# Converted for EXPLAIN
# EXPLAIN /*!50100 PARTITIONS*/
select
resCode='PCX00000',resultMes='交易成功',payTranStatus='P03',payRouteCode='CMA' from mb_trans where
seqNo='20170619PM010394356875'\G
```
从查询时间分布图上来看这个语句的执行时间在100ms~1s之间居多95%的执行时间在2s以下。那么这个SQL就是我们接下来要调优的重点了。
第二个SQL我就不赘述了因为逻辑是完全一样的。
通过对慢日志的分析我们可以很快知道哪个SQL是慢的了。当然你用`mysqldumpslow`分析,也会得到一样的结果。
## SQL剖析profiling
在分析数据库的性能时显然对SQL的分析是绕不过去的一个环节。但是我之前也说过了上来就对SQL进行全面剖析也是不合逻辑的因为SQL那么多如果对每个SQL都进行详细的执行步骤解析显然会拖慢整个系统而且对一些执行快的SQL进行分析也没有什么必要徒增资源消耗。
通过上面的分析过程我们已经定位到了具体是哪个SQL执行得慢那么下面就是要知道SQL的执行细节。无论是在Oracle还是在MySQL中我们都要去看执行计划。
比如说下面这样的:
![](https://static001.geekbang.org/resource/image/0d/e6/0dc186fdc3757e4804030a9e19b5a0e6.png)
上图中`select_type`是子句类型的意思有简单有复杂但是它不能说明什么成本的问题。在这里最重要的内容是type因为type可以告诉你访问这个表的时候是通过什么样的方式访问的。上图中的ALL是全表扫描的意思。type还有如下几个值
![](https://static001.geekbang.org/resource/image/ba/ba/ba8d198f6dc35389a615c8e64b5c6dba.png)
执行计划中的`possible_keys`会列出可能使用到的索引值。key这一列会列出执行时使用到的索引值。
以上信息就是MySQL的执行计划中比较重要的部分了。这些信息可以帮助我们做SQL的分析为优化提供证据。
除了执行计划外MySQL还提供了`profiling`这个有什么用呢它可以把SQL执行的每一个步骤详细列出来从一个SQL进入到数据库中到执行完这整个生命周期。
MySQL的`profiling`在`session`级生效所以当你用了慢日志知道哪个SQL有问题之后再用这个功能是最见成效的。如果想一开始就把所有`session`的`SQL profiling`功能打开,那成本就太高了。
下面我来详细解释一下profiling的用法和功能。
### profiling操作步骤
profiling的操作步骤比较简单如下所示
```
步骤一 set profiling=1; //这一步是为了打开profiling功能
步骤二 :执行语句 //执行你从慢日志中看到的语句
步骤三 show profiles; //这一步是为了查找步骤二中执行的语句的ID
步骤四 show profile all for query id; //这一步是为了显示出profiling的结果
```
我们实际执行一下上面的步骤:
```
// 步骤一打开profiling功能
mysql> set profiling=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)
// 这一步只是为了确认一下profiles列表有没有值可以不用执行。
mysql> show profiles;
Empty set, 1 warning (0.00 sec)
// 步骤二:执行语句
mysql> select * from t_user where user_name='Zee0355916';
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
| id | user_number | user_name | org_id | email | mobile | create_time |
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
| 00000d2d-32a8-11ea-91f8-00163e124cff | 00009496 | Zee0355916 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 14:19:32 |
| 77bdb1ef-32a6-11ea-91f8-00163e124cff | 00009496 | Zee0355916 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 14:08:34 |
| d4338339-32a2-11ea-91f8-00163e124cff | 00009496 | Zee0355916 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 13:42:31 |
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
3 rows in set (14.33 sec)
// 步骤三查看profiles列表中有了我们刚才执行的语句
mysql> show profiles;
+----------+-------------+---------------------------------------------------+
| Query_ID | Duration | Query |
+----------+-------------+---------------------------------------------------+
| 1 | 14.34078475 | select * from t_user where user_name='Zee0355916' |
+----------+-------------+---------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
// 步骤四看这个语句的profile信息
mysql> show profile all for query 1;
+--------------------------------+-----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+------------------+-------------+
| Status | Duration | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function | Source_file | Source_line |
+--------------------------------+-----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+------------------+-------------+
| starting | 0.000024 | 0.000012 | 0.000005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NULL | NULL | NULL |
| Waiting for query cache lock | 0.000004 | 0.000003 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| init | 0.000003 | 0.000002 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| checking query cache for query | 0.000052 | 0.000036 | 0.000015 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | send_result_to_client | sql_cache.cc | 1601 |
| checking permissions | 0.000007 | 0.000005 | 0.000002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | check_access | sql_parse.cc | 5316 |
| Opening tables | 0.000032 | 0.000023 | 0.000009 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | open_tables | sql_base.cc | 5095 |
| init | 0.000042 | 0.000029 | 0.000013 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_prepare_select | sql_select.cc | 1051 |
| System lock | 0.000016 | 0.000011 | 0.000004 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_lock_tables | lock.cc | 304 |
| Waiting for query cache lock | 0.000003 | 0.000002 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| System lock | 0.000020 | 0.000014 | 0.000006 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| optimizing | 0.000012 | 0.000009 | 0.000004 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | optimize | sql_optimizer.cc | 139 |
| statistics | 0.000019 | 0.000013 | 0.000005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | optimize | sql_optimizer.cc | 365 |
| preparing | 0.000015 | 0.000010 | 0.000005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | optimize | sql_optimizer.cc | 488 |
| executing | 0.000004 | 0.000003 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | exec | sql_executor.cc | 110 |
| Sending data | 14.324781 | 4.676869 | 0.762349 | 1316 | 132 | 2499624 | 288 | 0 | 0 | 8 | 30862 | 0 | exec | sql_executor.cc | 190 |
| end | 0.000015 | 0.000007 | 0.000002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_execute_select | sql_select.cc | 1106 |
| query end | 0.000006 | 0.000005 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_execute_command | sql_parse.cc | 5015 |
| closing tables | 0.000016 | 0.000013 | 0.000003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_execute_command | sql_parse.cc | 5063 |
| freeing items | 0.000013 | 0.000010 | 0.000003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | mysql_parse | sql_parse.cc | 6490 |
| Waiting for query cache lock | 0.000003 | 0.000002 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| freeing items | 0.000014 | 0.000012 | 0.000003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| Waiting for query cache lock | 0.000003 | 0.000002 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| freeing items | 0.000003 | 0.000002 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| storing result in query cache | 0.000004 | 0.000002 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | end_of_result | sql_cache.cc | 1034 |
| logging slow query | 0.015645 | 0.000084 | 0.000020 | 2 | 0 | 16 | 8 | 0 | 0 | 0 | 2 | 0 | log_slow_do | sql_parse.cc | 1935 |
| cleaning up | 0.000034 | 0.000024 | 0.000006 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | dispatch_command | sql_parse.cc | 1837 |
+--------------------------------+-----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+------------------+-------------+
26 rows in set, 1 warning (0.02 sec)
```
非常长,从这样的数据中,我们就看到了一个语句在数据库中从开始到结束的整个生命周期。
对生命周期中的每个步骤进行统计之后,我们就可以看到每个步骤所消耗的时间。不仅如此,还能看到如下这些信息:
* BLOCK IO
* Context Switches
* CPU
* IPC
* MEMORY
* Page Fault
* SOURCE
* SWAPS
有了这些信息,我们基本上就可以判断语句哪里有问题了。
从上面这个示例语句中,你可以看到`Sending data`这一步消耗了14秒的时间并且从后面的数据中也可以看到主动上下文切换有1316次被动的有132次块操作的量也非常大。
碰到这样的情况,我们就得先知道这个`Sending data`到底是什么东西。下面我们结合之前说的到的执行计划,一起看一下:
```
mysql> explain select * from t_user where user_name='Zee0355916';
+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | t_user | ALL | NULL | NULL | NULL | NULL | 3868195 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)
```
这就是个典型的全表扫描,所以下一步就是检查有没有创建索引。
```
mysql> show indexes from t_user;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_user | 0 | PRIMARY | 1 | id | A | 3868195 | NULL | NULL | | BTREE | | |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)
mysql>
```
还是有一个主键索引的,但由于我们没用主键来查,所以用不到。
有些性能测试工程师面对这种情况可能会有这种想法:第一次没有查索引,但是把所有数据都调到缓存里了呀,所以第二次就快了嘛,于是有些人可能想尽快“完成”工作,就用重复的数据。
这里我再执行一遍,你可以看看是什么结果:
```
+----------+-------------+-----------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+-------------+-----------------------------------------------------------+
| 1 | 14.34078475 | select * from t_user where user_name='Zee0355916' |
| 2 | 0.00006675 | show profile all for 1 |
| 3 | 0.00031700 | explain select * from t_user where user_name='Zee0355916' |
| 4 | 0.00040025 | show indexes from t_user |
+----------+-------------+-----------------------------------------------------------+
6 rows in set, 1 warning (0.00 sec)
mysql> select * from t_user where user_name='Zee0355916';
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
| id | user_number | user_name | org_id | email | mobile | create_time |
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
| 00000d2d-32a8-11ea-91f8-00163e124cff | 00009496 | Zee0355916 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 14:19:32 |
| 77bdb1ef-32a6-11ea-91f8-00163e124cff | 00009496 | Zee0355916 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 14:08:34 |
| d4338339-32a2-11ea-91f8-00163e124cff | 00009496 | Zee0355916 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 13:42:31 |
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
3 rows in set (0.00 sec)
mysql> show profiles;
+----------+-------------+-----------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+-------------+-----------------------------------------------------------+
| 1 | 14.34078475 | select * from t_user where user_name='Zee0355916' |
| 2 | 0.00006675 | show profile all for 1 |
| 3 | 0.00031700 | explain select * from t_user where user_name='Zee0355916' |
| 4 | 0.00040025 | show indexes from t_user |
| 5 | 0.00027325 | select * from t_user where user_name='Zee0355916' |
+----------+-------------+-----------------------------------------------------------+
7 rows in set, 1 warning (0.00 sec)
mysql> show profile all for query 5;
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
| Status | Duration | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function | Source_file | Source_line |
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
| starting | 0.000029 | 0.000018 | 0.000004 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NULL | NULL | NULL |
| Waiting for query cache lock | 0.000006 | 0.000003 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| init | 0.000003 | 0.000003 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| checking query cache for query | 0.000008 | 0.000006 | 0.000002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | send_result_to_client | sql_cache.cc | 1601 |
| checking privileges on cached | 0.000003 | 0.000002 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | send_result_to_client | sql_cache.cc | 1692 |
| checking permissions | 0.000010 | 0.000192 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | check_access | sql_parse.cc | 5316 |
| sending cached result to clien | 0.000210 | 0.000028 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | send_result_to_client | sql_cache.cc | 1803 |
| cleaning up | 0.000006 | 0.000006 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | dispatch_command | sql_parse.cc | 1837 |
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
8 rows in set, 1 warning (0.00 sec)
mys
```
看到没有,在用重复数据的时候确实会让响应时间快很多,因为数据直接从`cache`中发给`client`了。
但是这种作法请你坚决制止因为它不符合真实生产的样子。当你再换一个数据的时候就会歇菜还要再经过14秒的时间做全表扫描。
所以正确的做法是创建合适的索引,让语句在执行任何一条数据时都能快起来,那么,我们现在就创建一个索引,再看执行结果。
```
// 创建索引
mysql> ALTER TABLE t_user ADD INDEX username_idx (user_name);
Query OK, 0 rows affected (44.69 sec)
Records: 0 Duplicates: 0 Warnings: 0
// 分析表
mysql> analyze table t_user;
+-----------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-----------+---------+----------+----------+
| pa.t_user | analyze | status | OK |
+-----------+---------+----------+----------+
1 row in set (0.08 sec)
// 执行语句
mysql> select * from t_user where user_name='Zee0046948';
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
| id | user_number | user_name | org_id | email | mobile | create_time |
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
| 000061a2-31c2-11ea-8d89-00163e124cff | 00009496 | Zee0046948 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-08 10:53:08 |
| 047d7ae1-32a2-11ea-91f8-00163e124cff | 00009496 | Zee0046948 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 13:36:42 |
| 1abfa543-318f-11ea-8d89-00163e124cff | 00009496 | Zee0046948 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-08 04:48:48 |
| 671c4014-3222-11ea-91f8-00163e124cff | 00009496 | Zee0046948 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-08 22:23:12 |
| 9de16dd3-32a5-11ea-91f8-00163e124cff | 00009496 | Zee0046948 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 14:02:28 |
| dd4ab182-32a4-11ea-91f8-00163e124cff | 00009496 | Zee0046948 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 13:57:05 |
| f507067e-32a6-11ea-91f8-00163e124cff | 00009496 | Zee0046948 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-09 14:12:04 |
| f7b82744-3185-11ea-8d89-00163e124cff | 00009496 | Zee0046948 | NULL | test9495@dunshan.com | 17600009498 | 2020-01-08 03:43:24 |
+--------------------------------------+-------------+------------+--------+----------------------+-------------+---------------------+
8 rows in set (0.02 sec)
// 查看Query_ID
mysql> show profiles;
+----------+-------------+-----------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+-------------+-----------------------------------------------------------+
| 1 | 14.34078475 | select * from t_user where user_name='Zee0355916' |
| 2 | 0.00006675 | show profile all for 1 |
| 3 | 0.00031700 | explain select * from t_user where user_name='Zee0355916' |
| 4 | 0.00005875 | show indexes for table t_user |
| 5 | 0.00005850 | show indexes for t_user |
| 6 | 0.00040025 | show indexes from t_user |
| 7 | 0.00027325 | select * from t_user where user_name='Zee0355916' |
| 8 | 0.00032100 | explain select * from t_user where user_name='Zee0355916' |
| 9 | 12.22490550 | select * from t_user where user_name='Zee0046945' |
| 10 | 0.00112450 | select * from t_user limit 20 |
| 11 | 44.68370500 | ALTER TABLE t_user ADD INDEX username_idx (user_name) |
| 12 | 0.07385150 | analyze table t_user |
| 13 | 0.01516450 | select * from t_user where user_name='Zee0046948' |
+----------+-------------+-----------------------------------------------------------+
13 rows in set, 1 warning (0.00 sec)
// 查看profile信息
mysql> show profile all for query 13;
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+------------------+-------------+
| Status | Duration | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function | Source_file | Source_line |
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+------------------+-------------+
| starting | 0.000030 | 0.000017 | 0.000004 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NULL | NULL | NULL |
| Waiting for query cache lock | 0.000005 | 0.000004 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| init | 0.000003 | 0.000002 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| checking query cache for query | 0.000060 | 0.000050 | 0.000011 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | send_result_to_client | sql_cache.cc | 1601 |
| checking permissions | 0.000009 | 0.000007 | 0.000002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | check_access | sql_parse.cc | 5316 |
| Opening tables | 0.000671 | 0.000412 | 0.000000 | 1 | 0 | 8 | 0 | 0 | 0 | 0 | 1 | 0 | open_tables | sql_base.cc | 5095 |
| init | 0.006018 | 0.000082 | 0.000899 | 1 | 0 | 5408 | 0 | 0 | 0 | 1 | 0 | 0 | mysql_prepare_select | sql_select.cc | 1051 |
| System lock | 0.000017 | 0.000011 | 0.000003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_lock_tables | lock.cc | 304 |
| Waiting for query cache lock | 0.000003 | 0.000003 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| System lock | 0.000019 | 0.000015 | 0.000004 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| optimizing | 0.000012 | 0.000010 | 0.000002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | optimize | sql_optimizer.cc | 139 |
| statistics | 0.001432 | 0.000167 | 0.000037 | 1 | 0 | 32 | 0 | 0 | 0 | 0 | 4 | 0 | optimize | sql_optimizer.cc | 365 |
| preparing | 0.000026 | 0.000043 | 0.000009 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | optimize | sql_optimizer.cc | 488 |
| executing | 0.000034 | 0.000005 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | exec | sql_executor.cc | 110 |
| Sending data | 0.006727 | 0.000439 | 0.001111 | 13 | 0 | 1536 | 0 | 0 | 0 | 0 | 1 | 0 | exec | sql_executor.cc | 190 |
| end | 0.000014 | 0.000007 | 0.000002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_execute_select | sql_select.cc | 1106 |
| query end | 0.000009 | 0.000008 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_execute_command | sql_parse.cc | 5015 |
| closing tables | 0.000015 | 0.000012 | 0.000003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_execute_command | sql_parse.cc | 5063 |
| freeing items | 0.000010 | 0.000008 | 0.000002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_parse | sql_parse.cc | 6490 |
| Waiting for query cache lock | 0.000003 | 0.000002 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| freeing items | 0.000027 | 0.000022 | 0.000005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| Waiting for query cache lock | 0.000003 | 0.000002 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| freeing items | 0.000003 | 0.000002 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | try_lock | sql_cache.cc | 468 |
| storing result in query cache | 0.000004 | 0.000004 | 0.000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | end_of_result | sql_cache.cc | 1034 |
| cleaning up | 0.000015 | 0.000012 | 0.000003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | dispatch_command | sql_parse.cc | 1837 |
+--------------------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+------------------+-------------+
25 rows in set, 1 warning (0.01 sec)
mysql>
```
从上面最后的profile信息你可以看出来步骤一点没少但是速度快了很多这才是正确的优化思路。
在上一篇文章中我描述了在一个数据库中如何从全局监控的角度查看数据今天讲的是如何找到具体慢的SQL以及如何定位这个SQL的问题。
当然不是所有的情况下都是SQL的问题也有可能是配置的问题也有可能是硬件的问题。不管什么样的问题其分析思路都是这样的也就是我总是在强调的全局监控-定向监控。
当然在这里我也应该给出MySQL分析决策树的思路。从`mysqlreport`的划分上,给出几个具体的分析决策树的树枝。
![](https://static001.geekbang.org/resource/image/5b/27/5b9c8e2f05ee467ffc834dad86dcb927.png)
这是常见的问题,如果你有兴趣,可以自己完善这棵完整的树,因为你可能会有不一样的划分计数器的工具或思路,所以这个树是可以灵活变化的。
你一定要记得,别人给你的东西,永远变不成自己的东西,它们只能引导你。如果你自己动手去做一遍,哪怕只画出一个分枝来,都会是很大的进步。
如果你想用其他的全局监控工具也可以考虑如下的组合也就是mysql\_exportor+Prometheus+Grafana。
## mysql\_exportor+Prometheus+Grafana
我在前面也屡次提到过这类组合不同的exportors结合Prometheus+Grafana可以实现实时监控及数据的保存。
在这里我们看一下`mysql_exportor`可以给我们提供什么样的监控数据。这里截几个图,给你大概看一下这个套装工具能看什么内容,有兴趣的话,你也可以自己搭建一下。
![](https://static001.geekbang.org/resource/image/d9/ac/d96a58c12ab279375d976f27ff17acac.png)
![](https://static001.geekbang.org/resource/image/29/a1/292761d3463870e0fe3a0b05a502d6a1.png)
![](https://static001.geekbang.org/resource/image/95/73/959feaeacefe12129a9337eee5e1c073.png)
## 总结
有关数据库的知识实在是太多了,在这两篇文章中,我重点想告诉你的,就是分析数据库应该具有的思路。至于其他的知识点,我想应该是你打开文章之前就应该储备的东西。
我们再来总结一下,在数据库的分析中,最有可能在三个方面出现问题:
1. 硬件配置
2. 数据库配置
3. SQL语句
对于硬件配置来说我们只能在解决了2和3的问题之后再来评估到底多少硬件够用的。而面对数据库配置问题这个实在没什么好招只能去了解数据库架构等一系列的知识之后再学着解决。而SQL的问题呢应该说是我们在性能测试和分析中最常见的了。SQL性能问题的分析思路也比较清晰那就是判断出具体的SQL瓶颈点进而做相应的优化切记不要蒙
现在的数据库类别比之前多太多了,每种数据库都有自己的架构和使用场景,我们要在充分了解了之后,才能下手去调。
## 思考题
我在这里照例留两个问题。你能说一下数据库分析的大体思路是什么吗如何在数据库中迅速找到一个慢SQL的根本原因呢
欢迎你在评论区写下你的思考,也欢迎把这篇文章分享给你的朋友或者同事,一起交流一下。