
A deep dive into the differences between EXISTS, IN, and JOIN queries in InnoDB, and when to use each

2021/01/23

Preface

A few days ago, a slow SQL query showed up in production, taking more than 150 seconds to execute. After over a day of analysis I found the problem in how the SQL was written, plus a fix, and the deeper digging branched into quite a few related topics, so I'm writing it all down here.

This is a long post; please bear with it~

Main content

Without further ado, here is the slow SQL (partially redacted):

SELECT
    st.state state,
    di.created_time creatTime
FROM
    d_table di,
    iiap_device_state st
WHERE
    di.device_id = st.device_id
    AND di.deleted = 0
    AND di.project_id = '3b7807466119a'
    AND di.space_id IN (
        SELECT
            a1.space_node_id
        FROM
            space a1,
            (
                SELECT
                    max( a.line ) AS min_line,
                    b.space_id,
                    b.line AS max_line
                FROM
                    space a,
                    ( SELECT space_id, LEVEL, line FROM space WHERE space_node_id = 'space.746a11df2' ) b
                WHERE
                    a.space_id = b.space_id
                    AND a.line < b.line
                    AND a.LEVEL = b.LEVEL
            ) a2
        WHERE
            a1.space_id = a2.space_id
            AND a1.line > IFNULL( a2.min_line, 0 )
            AND a1.line <= a2.max_line
            AND a1.is_tail = 1
    )
ORDER BY
    di.created_time DESC,
    di.id DESC
LIMIT 30300, 300

(screenshot: execution time of the IN query)

As you can see, even in the test environment, where the data volume is nowhere near production's, it still takes 80+ seconds.

The first thing to do, of course, is EXPLAIN. Let's read the EXPLAIN result together with the table structures:

-- the space table, built with the "Haidilao" algorithm
CREATE TABLE `space` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'unique record id',
  `line` bigint(20) NOT NULL COMMENT 'space node number',
  `parent_node_id` varchar(256) CHARACTER SET utf8mb4 DEFAULT '' COMMENT 'parent node id',
  `space_node_id` varchar(256) CHARACTER SET utf8mb4 NOT NULL COMMENT 'space node id (unique, system-generated)',
  `node_name` varchar(64) CHARACTER SET utf8mb4 NOT NULL COMMENT 'node name',
  `node_desc` varchar(128) CHARACTER SET utf8mb4 DEFAULT NULL COMMENT 'node description',
  `level` int(11) DEFAULT NULL COMMENT 'node level (root level is 1; END marker nodes have level 0)',
  `space_id` varchar(256) CHARACTER SET utf8mb4 NOT NULL COMMENT 'space id (id of the space root node)',
  `project_id` varchar(64) COLLATE utf8mb4_bin NOT NULL COMMENT 'project id',
  `is_tail` tinyint(4) DEFAULT '0' COMMENT '0: not tail; 1: tail',
  `sort_no` int(8) DEFAULT '0' COMMENT '',
  `tail_no` int(11) DEFAULT '1' COMMENT '',
  PRIMARY KEY (`id`) USING BTREE,
  UNIQUE KEY `u_idx` (`space_node_id`),
  KEY `r_idx` (`space_id`,`line`)
  -- .......
) ENGINE=InnoDB AUTO_INCREMENT=32143 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='project space subtree table';

-- iiap_device_state is just a simple association table and can be ignored


-- the d table
CREATE TABLE `d_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'device record id',
  `device_id` varchar(32) DEFAULT '' COMMENT '',
  `space_id` varchar(128) NOT NULL COMMENT '',
  -- ........
  `project_id` varchar(64) NOT NULL COMMENT '',
  `deleted` tinyint(1) NOT NULL DEFAULT '0' COMMENT '0: deleted is false; 1: deleted is true',
  PRIMARY KEY (`id`) USING BTREE,
  UNIQUE KEY `idx_device_id` (`device_id`),
  KEY `idx_project_v_did` (`project_id`,`space_id`) USING BTREE COMMENT ' '
  -- .......
) ENGINE=InnoDB AUTO_INCREMENT=783555 DEFAULT CHARSET=utf8mb4 COMMENT='solution device info';

The EXPLAIN result is as follows: (screenshot: EXPLAIN result of the IN query)

Focus on these output columns: id, table, type, key, ref, extra.

Briefly: id marks the execution order after the query optimizer's rewriting; table is the table being read; type is the access type; key is the index chosen; ref is the column the lookup is matched against; extra carries additional notes. Much more detailed write-ups are easy to find online, so I won't go deeper here.

The execution steps, as analyzed:

1. Execute the id=3 query group, i.e. the base filter of the Haidilao algorithm; its result set is the a2 table in the SQL above.

2. There are four entries in the id=1 query group. Executing top to bottom, and noting extra = 'Using temporary, Start temporary' on 1.1 and 'End temporary' on 1.3, we can see that 1.1 (a2) + 1.2 (a1) + 1.3 (di) together build the temporary table. The join relationships: a1 and a2 join on space_id via the space table's r_idx index, while di is filtered on its own via const on project_id. The two parts then form the temporary table, i.e. (a1+a2)+di, which is then filesorted. The remaining 1.4 (st) is a standalone outer table and needs no discussion.

So the problem is tentatively identified: the (a1+a2)+di join does not use an index, and since both sides hold a fair amount of data, that is what makes it slow.
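To build intuition for why a missing index on a join between two sizable tables hurts so much, here is a tiny Python sketch. It is my own illustration with made-up row counts, not MySQL's actual executor: it compares a nested-loop join that rescans the inner table for every outer row against one that probes a pre-built hash, which plays the role of an index.

```python
# Toy nested-loop join: count how many inner-table rows are touched
# with and without an "index" on the join column. Purely illustrative;
# the tables and sizes are invented.

outer = [{"space_id": f"s{i % 100}"} for i in range(1000)]   # di-like table
inner = [{"space_node_id": f"s{i}"} for i in range(100)]     # a1-like table

# No index: every outer row scans the whole inner table.
no_index_probes = 0
for o in outer:
    for i in inner:
        no_index_probes += 1  # one row comparison

# "Index": build a hash once, then each outer row does one lookup.
index = {}
for i in inner:
    index.setdefault(i["space_node_id"], []).append(i)
indexed_probes = len(inner) + len(outer)  # build cost + one probe per outer row

print(no_index_probes)  # 100000
print(indexed_probes)   # 1100
```

With 1,000 outer rows and 100 inner rows, the unindexed loop performs 100,000 row comparisons versus roughly 1,100 operations with the hash; a real B-tree probe costs O(log n) rather than O(1), but the gap is the same in spirit.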

To prove it further, I used the lower-level EXPLAIN FORMAT=JSON SELECT ... to verify. Below is the JSON returned by MySQL, with an annotated breakdown.

(screenshot: EXPLAIN FORMAT=JSON result of the IN query)

{
/* EXPLAIN JSON for the IN query */
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "2359.81"
},
"ordering_operation": {
/* the temporary table: (1.1+1.2+1.3) explained in JSON */
"using_temporary_table": true,
"using_filesort": true,
"cost_info": {
"sort_cost": "69.02"
},
"nested_loop": [
/* temporary-table details */
{
"duplicates_removal": {
"using_temporary_table": true,
"nested_loop": [
{
/* 1.1, i.e. <derived3>, a2 */
"table": {
"table_name": "a2",
/* full table scan */
"access_type": "ALL",
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "10.05",
"eval_cost": "0.20",
"prefix_cost": "10.25",
"data_read_per_join": "1K"
},
"used_columns": [
"min_line",
"space_id",
"max_line"
],
"materialized_from_subquery": {
"using_temporary_table": true,
"dependent": false,
"cacheable": true,
"query_block": {
"select_id": 3,
"cost_info": {
"query_cost": "892.40"
},
/* the innermost nested query, 3.1+3.2, combined into <derived3>, i.e. a2 */
"nested_loop": [
/* the steps that build a2; little impact on the whole, so not detailed here */
{
"table": {
"table_name": "space",
"access_type": "const",
"possible_keys": [
"u_idx",
"r_idx"
],
"key": "u_idx",
"used_key_parts": [
"space_node_id"
],
"key_length": "1026",
"ref": [
"const"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.00",
"eval_cost": "0.20",
"prefix_cost": "0.00",
"data_read_per_join": "5K"
},
"used_columns": [
"line",
"space_node_id",
"level",
"space_id"
]
}
},
{
"table": {
"table_name": "a",
"access_type": "ref",
"possible_keys": [
"r_idx"
],
"key": "r_idx",
"used_key_parts": [
"space_id"
],
"key_length": "1026",
"ref": [
"const"
],
"rows_examined_per_scan": 2047,
"rows_produced_per_join": 68,
"filtered": "3.33",
"index_condition": "(`aiot_iiap_luoshu`.`a`.`line` < '1000000000000')",
"cost_info": {
"read_cost": "483.00",
"eval_cost": "13.65",
"prefix_cost": "892.40",
"data_read_per_join": "379K"
},
"used_columns": [
"line",
"level",
"space_id"
],
"attached_condition": "(`aiot_iiap_luoshu`.`a`.`level` = '0')"
}
}
]
}
}
}
},
{
/* 1.2, i.e. a1 */
"table": {
"table_name": "a1",
"access_type": "ref",
"possible_keys": [
"u_idx",
"r_idx"
],
"key": "r_idx",
"used_key_parts": [
"space_id"
],
"key_length": "1026",
"ref": [
/* note: joined to a2 */
"a2.space_id"
],
"rows_examined_per_scan": 26,
"rows_produced_per_join": 0,
"filtered": "1.11",
/* uses the index on space_id and line to match a2 faster */
"index_condition": "((`aiot_iiap_luoshu`.`a1`.`line` > ifnull(`a2`.`min_line`,0)) and (`aiot_iiap_luoshu`.`a1`.`line` <= `a2`.`max_line`))",
"cost_info": {
"read_cost": "26.23",
"eval_cost": "0.06",
"prefix_cost": "41.73",
"data_read_per_join": "1K"
},
"used_columns": [
"id",
"line",
"space_node_id",
"space_id",
"is_tail"
],
/* attached condition */
"attached_condition": "(`aiot_iiap_luoshu`.`a1`.`is_tail` = 1)"
}
},
/*1.3*/
{
"table": {
"table_name": "di",
"access_type": "ref",
"possible_keys": [
"idx_device_id",
"idx_project_v_did",
"idx_project_p_did"
],
/* index: const lookup on project_id */
"key": "idx_project_v_did",
"used_key_parts": [
"project_id"
],
"key_length": "258",
"ref": [
"const"
],
"rows_examined_per_scan": 23689,
"rows_produced_per_join": 69,
"filtered": "1.00",
/* join condition: matched against a1 here, but without an index */
"index_condition": "(`aiot_iiap_luoshu`.`di`.`space_id` = `aiot_iiap_luoshu`.`a1`.`space_node_id`)",
"cost_info": {
"read_cost": "756.13",
"eval_cost": "13.80",
"prefix_cost": "2207.95",
"data_read_per_join": "287K"
},
"used_columns": [
"id",
"device_id",
"space_id",
"project_id",
"created_time",
"deleted"
],
/* attached condition */
"attached_condition": "((`aiot_iiap_luoshu`.`di`.`deleted` = 0) and (`aiot_iiap_luoshu`.`di`.`device_id` is not null))"
}
}
]
}
},
{
/* 1.4, st: no impact on the whole, ignore */
"table": {
"table_name": "st",
"access_type": "eq_ref",
"possible_keys": [
"idx_device_id",
"idx_device_id2"
],
"key": "idx_device_id",
"used_key_parts": [
"device_id"
],
"key_length": "130",
"ref": [
"aiot_iiap_luoshu.di.device_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 69,
"filtered": "100.00",
"cost_info": {
"read_cost": "69.02",
"eval_cost": "13.80",
"prefix_cost": "2290.78",
"data_read_per_join": "107K"
},
"used_columns": [
"id",
"device_id",
"state"
]
}
}
]
}
}
}

As you can see, the JSON matches my earlier column-by-column deduction: the problem sits in the a1–di join, and that is what makes the query slow.
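Reading these big FORMAT=JSON plans by hand gets tedious. A small helper script (my own, not part of MySQL) can walk the plan recursively and list each table with its access_type and key, which is how I spot-checked the summary above; the trimmed plan below only mimics the shape of the real output:

```python
import json

def walk_tables(node, out):
    """Recursively collect (table_name, access_type, key) from an
    EXPLAIN FORMAT=JSON plan fragment."""
    if isinstance(node, dict):
        if "table_name" in node:
            out.append((node["table_name"],
                        node.get("access_type"),
                        node.get("key")))
        for v in node.values():
            walk_tables(v, out)
    elif isinstance(node, list):
        for v in node:
            walk_tables(v, out)
    return out

# A trimmed-down plan in the same shape as the output above.
plan = json.loads("""
{"query_block": {"nested_loop": [
  {"table": {"table_name": "a2", "access_type": "ALL"}},
  {"table": {"table_name": "a1", "access_type": "ref", "key": "r_idx"}},
  {"table": {"table_name": "di", "access_type": "ref", "key": "idx_project_v_did"}}
]}}
""")

for name, access, key in walk_tables(plan, []):
    print(name, access, key)
# a2 ALL None
# a1 ref r_idx
# di ref idx_project_v_did
```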

So how do we optimize? My first thought was to try switching IN to EXISTS, since I remembered that EXISTS can stand in for IN when the subquery's data volume is large. Is that actually true? I rewrote the SQL with EXISTS:

SELECT
    st.state state,
    di.created_time creatTime
FROM
    d_table di,
    iiap_device_state st
WHERE
    di.device_id = st.device_id
    AND di.deleted = 0
    AND di.project_id = '3b7807466119a3'
    AND EXISTS (
        SELECT
            1
        FROM
            space a1,
            (
                SELECT
                    max( a.line ) AS min_line,
                    b.space_id,
                    b.line AS max_line
                FROM
                    space a,
                    ( SELECT space_id, LEVEL, line FROM space WHERE space_node_id = 'space.xxxx' ) b
                WHERE
                    a.space_id = b.space_id
                    AND a.line < b.line
                    AND a.LEVEL = b.LEVEL
            ) a2
        WHERE
            a1.space_id = a2.space_id
            AND a1.line > IFNULL( a2.min_line, 0 )
            AND a1.line <= a2.max_line
            AND a1.is_tail = 1
            AND a1.space_node_id = di.space_id
    )
ORDER BY
    di.created_time DESC,
    di.id DESC
LIMIT 30300, 300

(screenshot: execution time of the EXISTS query)

Sure enough, the query got more than 11 times faster. Why such a big jump? Again, the EXPLAIN output has the answer:

(screenshot: EXPLAIN result of the EXISTS query)

Comparing this EXPLAIN with the IN version, the differences are:

1. There are now three query groups; the IN version's query group 1 was split in two.

2. No temporary table is built out of a2+a1+di. Instead, 3.1+3.2 (a2) is generated first; the result set from filtering and sorting 1.1 (di) is then matched against 2.1+2.2 (a1) as a DEPENDENT SUBQUERY, i.e. every filtered row of di is joined and checked against a1.

3. (a2) changed from the IN version's ALL access to the much faster system; my guess is that it was optimized by being loaded into memory.

4. The di–a1 join, which previously used no index, now uses u_idx (see 2.2), while the a1–a2 join no longer uses an index.

Putting 3 and 4 together: because the join between the two large tables (a1 and di) now goes through an index, even though the a1–a2 join does not (a2 being optimized into a system access is another reason for the speedup), the query still ends up 11+ times faster.
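Conceptually, the correlated EXISTS becomes a semi-join: one probe into the subquery per outer row, stopping at the first match. Here is a minimal Python sketch of that evaluation order; it is purely illustrative, the table contents are invented, and MySQL's executor is of course far more involved:

```python
# Simulate how a correlated EXISTS is evaluated: for each outer row,
# probe the subquery and stop at the first match (semi-join semantics).
# Data below is invented for illustration.

di_rows = [{"space_id": "s1"}, {"space_id": "s2"}, {"space_id": "s9"}]
a1_index = {"s1", "s2", "s3"}  # stands in for the u_idx lookup on space_node_id

def exists_probe(outer_row):
    # One indexed probe per outer row; done as soon as a match is found.
    return outer_row["space_id"] in a1_index

kept = [r for r in di_rows if exists_probe(r)]
print([r["space_id"] for r in kept])  # ['s1', 's2']
```

The point is that each di row costs one indexed probe rather than participating in a join against a full temporary table.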

To prove my understanding in more detail, I verified with EXPLAIN FORMAT=JSON SELECT ... just as with IN; below is the JSON returned by MySQL, with annotations.

exists的explain2


{
// EXPLAIN JSON for the EXISTS query
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "10175.48"
},
"ordering_operation": {
// query group 1
"using_filesort": true,
"nested_loop": [
{
"table": {
//1.1
"table_name": "di",
"access_type": "ref",
"possible_keys": [
"idx_device_id",
"idx_project_v_did",
"idx_project_p_did"
],
// filter: const on project_id, same as the IN version
"key": "idx_project_v_did",
"used_key_parts": [
"project_id"
],
"key_length": "258",
"ref": [
"const"
],
"rows_examined_per_scan": 23689,
"rows_produced_per_join": 2368,
"filtered": "10.00",
// same filter: const on project_id, as in the IN version
"index_condition": "(`aiot_iiap_luoshu`.`di`.`project_id` <=> 'x')",
"cost_info": {
"read_cost": "2595.00",
"eval_cost": "473.78",
"prefix_cost": "7332.80",
"data_read_per_join": "9M"
},
"used_columns": [
"id",
"device_id",
"space_id",
"project_id",
"created_time",
"deleted"
],
// unlike IN, this is an attached condition: every row of di scanned is checked against a1 once, but since that probe goes through an index, it is actually fast compared with the IN version
"attached_condition": "(((`aiot_iiap_luoshu`.`di`.`deleted` = 0) and exists(/* select#2 */ select 1 from `aiot_iiap_luoshu`.`space` `a1` where ((`aiot_iiap_luoshu`.`a1`.`is_tail` = 1) and (`aiot_iiap_luoshu`.`a1`.`space_id` = 'space.746112931454820352') and (`aiot_iiap_luoshu`.`a1`.`line` > <cache>(ifnull(NULL,0))) and (`aiot_iiap_luoshu`.`a1`.`line` <= '1000000000000') and (`aiot_iiap_luoshu`.`a1`.`space_node_id` = `aiot_iiap_luoshu`.`di`.`space_id`)))) and (`aiot_iiap_luoshu`.`di`.`device_id` is not null))",
"attached_subqueries": [
{
// the nested query: the DEPENDENT SUBQUERY, i.e. query groups 2 and 3
"dependent": true,
"cacheable": false,
"query_block": {
"select_id": 2,
"cost_info": {
"query_cost": "1.20"
},
"nested_loop": [
{
"table": {
//2.1
"table_name": "a2",
"access_type": "system",
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.00",
"eval_cost": "0.20",
"prefix_cost": "0.00",
"data_read_per_join": "1K"
},
"used_columns": [
"min_line",
"space_id",
"max_line"
],
"materialized_from_subquery": {
// query group 3: materialized straight into memory for query group 2 to use; same as in the IN version, so no further commentary
"using_temporary_table": true,
"dependent": false,
"cacheable": true,
"query_block": {
"select_id": 3,
"cost_info": {
"query_cost": "892.40"
},
"nested_loop": [
{
"table": {
"table_name": "space",
"access_type": "const",
"possible_keys": [
"u_idx",
"r_idx"
],
"key": "u_idx",
"used_key_parts": [
"space_node_id"
],
"key_length": "1026",
"ref": [
"const"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.00",
"eval_cost": "0.20",
"prefix_cost": "0.00",
"data_read_per_join": "5K"
},
"used_columns": [
"line",
"space_node_id",
"level",
"space_id"
]
}
},
{
"table": {
"table_name": "a",
"access_type": "ref",
"possible_keys": [
"r_idx"
],
"key": "r_idx",
"used_key_parts": [
"space_id"
],
"key_length": "1026",
"ref": [
"const"
],
"rows_examined_per_scan": 2047,
"rows_produced_per_join": 68,
"filtered": "3.33",
"index_condition": "(`aiot_iiap_luoshu`.`a`.`line` < '1000000000000')",
"cost_info": {
"read_cost": "483.00",
"eval_cost": "13.65",
"prefix_cost": "892.40",
"data_read_per_join": "379K"
},
"used_columns": [
"line",
"level",
"space_id"
],
"attached_condition": "(`aiot_iiap_luoshu`.`a`.`level` = '0')"
}
}
]
}
}
}
// end of query group 3
},
{
"table": {
//2.2
"table_name": "a1",
"access_type": "eq_ref",
"possible_keys": [
"u_idx",
"r_idx"
],
"key": "u_idx",
"used_key_parts": [
"space_node_id"
],
"key_length": "1026",
"ref": [
// joined against di, and each side uses its own index
"aiot_iiap_luoshu.di.space_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 0,
"filtered": "5.00",
// indexed lookup: a1.space_node_id matched to di.space_id
"index_condition": "(`aiot_iiap_luoshu`.`a1`.`space_node_id` = `aiot_iiap_luoshu`.`di`.`space_id`)",
"cost_info": {
"read_cost": "1.00",
"eval_cost": "0.01",
"prefix_cost": "1.20",
"data_read_per_join": "284"
},
"used_columns": [
"line",
"space_node_id",
"space_id",
"is_tail"
],
// attached condition: compared with the in-memory result of query group 3; fast because it is in memory
"attached_condition": "((`aiot_iiap_luoshu`.`a1`.`is_tail` = 1) and (`aiot_iiap_luoshu`.`a1`.`space_id` = 'space.746112931454820352') and (`aiot_iiap_luoshu`.`a1`.`line` > <cache>(ifnull(NULL,0))) and (`aiot_iiap_luoshu`.`a1`.`line` <= '1000000000000'))"
}
}
]
}
}
]
}
},
{
"table": {
// 1.2, st: no impact on the overall result, just noise, so not discussed
"table_name": "st",
"access_type": "eq_ref",
"possible_keys": [
"idx_device_id",
"idx_device_id2"
],
"key": "idx_device_id",
"used_key_parts": [
"device_id"
],
"key_length": "130",
"ref": [
"aiot_iiap_luoshu.di.device_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 2368,
"filtered": "100.00",
"cost_info": {
"read_cost": "2368.90",
"eval_cost": "473.78",
"prefix_cost": "10175.48",
"data_read_per_join": "3M"
},
"used_columns": [
"id",
"device_id",
"state"
]
}
}
]
}
}
}

As you can see, the written analysis above holds up~

But! Even though the query got that much faster, it still averages 6+ seconds, which is still unacceptable. Can we optimize further? Of course. The key is query group 2, the DEPENDENT SUBQUERY: for every row scanned in the driving table di, the subquery has to be probed once. The fix is a JOIN; the correct rewrite for this scenario is:

SELECT
    st.state state,
    di.created_time creatTime
FROM
    d_table di,
    iiap_device_state st,
    (
        SELECT
            a1.space_node_id AS space_node_id
        FROM
            space a1,
            (
                SELECT
                    max( a.line ) AS min_line,
                    b.space_id,
                    b.line AS max_line
                FROM
                    space a,
                    ( SELECT space_id, LEVEL, line FROM space WHERE space_node_id = 'space.x' ) b
                WHERE
                    a.space_id = b.space_id
                    AND a.line < b.line
                    AND a.LEVEL = b.LEVEL
            ) a2
        WHERE
            a1.space_id = a2.space_id
            AND a1.line > IFNULL( a2.min_line, 0 )
            AND a1.line <= a2.max_line
            AND a1.is_tail = 1
    ) als
WHERE
    di.device_id = st.device_id
    AND di.deleted = 0
    AND di.project_id = 'x.'
    AND di.space_id = als.space_node_id
ORDER BY
    di.created_time DESC,
    di.id DESC
LIMIT 30300, 300

The query time:

(screenshot: execution time of the JOIN query)

The query got nearly 10 times faster again, now under 1 second. Why this jump? Once again, EXPLAIN has the answer:

(screenshot: EXPLAIN result of the JOIN query)

From the EXPLAIN result, the plan differs a lot from the EXISTS version but looks much like the IN version, with only these differences:

1. Query groups 1.2 and 1.3 swapped order: di is read first.

2. The temporary table combining 1.1, 1.2 and 1.3 is gone.

3. The a1–di join switched from r_idx to u_idx.

This combined query fixes the drawbacks of both IN and EXISTS: it keeps the correct index-backed join between the two large tables and eliminates the wasteful temporary table, which is why it got so much faster!!!
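The difference between a DEPENDENT SUBQUERY and a materialized derived table can be caricatured in a few lines of Python; this is an illustration with invented row counts, not the real executor. The correlated form is re-evaluated once per driving row, while the derived table is evaluated exactly once and then joined:

```python
# Contrast a correlated (dependent) subquery, re-evaluated once per outer
# row, with materializing the derived table once and joining against it.
# Row counts are invented; this only illustrates the evaluation pattern.

di_rows = [{"space_id": f"s{i % 50}"} for i in range(1000)]   # driving table
space_tails = [f"s{i}" for i in range(50)]                    # what a1 yields

evals = {"subquery": 0}

def correlated_exists(row):
    # DEPENDENT SUBQUERY: runs again for every row of the driving table.
    evals["subquery"] += 1
    return row["space_id"] in space_tails

matched_exists = sum(1 for r in di_rows if correlated_exists(r))

# Derived-table JOIN: evaluate the subquery once, then probe its result.
als = set(space_tails)  # materialized exactly once
matched_join = sum(1 for r in di_rows if r["space_id"] in als)

print(evals["subquery"])               # 1000 subquery evaluations
print(matched_exists == matched_join)  # True: same rows either way
```

Both forms return the same rows; what changes is how many times the subquery machinery runs.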

As before, let's check from the JSON whether things really are as I described:

(screenshot: EXPLAIN FORMAT=JSON result of the JOIN query)

{
// EXPLAIN JSON for the JOIN query
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "10317.61"
},
"ordering_operation": {
// the outermost sort
// arguably a quirk of MySQL's EXPLAIN output: putting 1.1's Using filesort here is misleading; it would be clearer shown separately
"using_filesort": true,
"cost_info": {
"sort_cost": "5389.49"
},
"nested_loop": [
{
// nested loop:
// 3.1+3.2 = 1.1, i.e. a2
"table": {
"table_name": "a2",
"access_type": "system",
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.00",
"eval_cost": "0.20",
"prefix_cost": "0.00",
"data_read_per_join": "1K"
},
"used_columns": [
"min_line",
"space_id",
"max_line"
],
"materialized_from_subquery": {
// query group 3, same as before, skipped
"using_temporary_table": true,
"dependent": false,
"cacheable": true,
"query_block": {
"select_id": 3,
"cost_info": {
"query_cost": "892.40"
},
"nested_loop": [
{
"table": {
"table_name": "space",
"access_type": "const",
"possible_keys": [
"u_idx",
"r_idx"
],
"key": "u_idx",
"used_key_parts": [
"space_node_id"
],
"key_length": "1026",
"ref": [
"const"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.00",
"eval_cost": "0.20",
"prefix_cost": "0.00",
"data_read_per_join": "5K"
},
"used_columns": [
"line",
"space_node_id",
"level",
"space_id"
]
}
},
{
"table": {
"table_name": "a",
"access_type": "ref",
"possible_keys": [
"r_idx"
],
"key": "r_idx",
"used_key_parts": [
"space_id"
],
"key_length": "1026",
"ref": [
"const"
],
"rows_examined_per_scan": 2047,
"rows_produced_per_join": 68,
"filtered": "3.33",
"index_condition": "(`aiot_iiap_luoshu`.`a`.`line` < '1000000000000')",
"cost_info": {
"read_cost": "483.00",
"eval_cost": "13.65",
"prefix_cost": "892.40",
"data_read_per_join": "379K"
},
"used_columns": [
"line",
"level",
"space_id"
],
"attached_condition": "(`aiot_iiap_luoshu`.`a`.`level` = '0')"
}
}
]
}
}
// end of query group 3
}
},
{
"table": {
// 1.2, di, read first
"table_name": "di",
"access_type": "ref",
"possible_keys": [
"idx_device_id",
"idx_project_v_did",
"idx_project_p_did"
],
"key": "idx_project_v_did",
"used_key_parts": [
"project_id"
],
"key_length": "258",
"ref": [
// const lookup on the project_id index, very fast
"const"
],
"rows_examined_per_scan": 23689,
"rows_produced_per_join": 2368,
"filtered": "10.00",
"index_condition": "(`aiot_iiap_luoshu`.`di`.`project_id` <=> 'x.')",
"cost_info": {
"read_cost": "2595.00",
"eval_cost": "473.78",
"prefix_cost": "7332.80",
"data_read_per_join": "9M"
},
"used_columns": [
"id",
"device_id",
"space_id",
"project_id",
"created_time",
"deleted"
],
// attached filter, negligible cost
"attached_condition": "((`aiot_iiap_luoshu`.`di`.`deleted` = 0) and (`aiot_iiap_luoshu`.`di`.`device_id` is not null))"
}
},
{
"table": {
//1.3,a1
"table_name": "a1",
"access_type": "eq_ref",
"possible_keys": [
"u_idx",
"r_idx"
],
// u_idx joins di and a1, hence the speed
"key": "u_idx",
"used_key_parts": [
// a1's space_node_id
"space_node_id"
],
"key_length": "1026",
"ref": [
// di's space_id
"aiot_iiap_luoshu.di.space_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 118,
"filtered": "5.00",
// indexed lookup
"index_condition": "(`aiot_iiap_luoshu`.`di`.`space_id` = `aiot_iiap_luoshu`.`a1`.`space_node_id`)",
"cost_info": {
"read_cost": "2368.90",
"eval_cost": "23.69",
"prefix_cost": "10175.48",
"data_read_per_join": "658K"
},
"used_columns": [
"id",
"line",
"space_node_id",
"space_id",
"is_tail"
],
// attached condition: joined with a2; a2 is tiny, so the impact on speed is small
"attached_condition": "((`aiot_iiap_luoshu`.`a1`.`is_tail` = 1) and (`aiot_iiap_luoshu`.`a1`.`space_id` = 'space.746112931454820352') and (`aiot_iiap_luoshu`.`a1`.`line` > <cache>(ifnull(NULL,0))) and (`aiot_iiap_luoshu`.`a1`.`line` <= '1000000000000'))"
}
},
{
// 1.4, st, same as above, skipped
"table": {
"table_name": "st",
"access_type": "eq_ref",
"possible_keys": [
"idx_device_id",
"idx_device_id2"
],
"key": "idx_device_id",
"used_key_parts": [
"device_id"
],
"key_length": "130",
"ref": [
"aiot_iiap_luoshu.di.device_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 118,
"filtered": "100.00",
"cost_info": {
"read_cost": "118.45",
"eval_cost": "23.69",
"prefix_cost": "10317.61",
"data_read_per_join": "184K"
},
"used_columns": [
"id",
"device_id",
"state"
]
}
}
]
}
}
}

The JSON analysis confirms the written analysis above, and the problem is solved~

Summary

Looking back at the EXPLAIN and JSON analysis, the reason the query went from 80+ seconds down to under 1 second boils down to: 1. the tables use the correct indexes; 2. the correct select_type; 3. the correct query_type.

So in day-to-day development, don't write SQL however feels convenient. Decide the query method and SQL shape based on the actual business scenario, the table structures, and the data volumes. From the case above, my conclusions on EXISTS, IN, and JOIN (given that the tables have suitable indexes) are:

  1. If the subquery's result set is small, all three forms are fine~
  2. If the subquery's result set is large but the driving table is small, EXISTS works well: EXISTS probes the subquery once per driving-table row, so the fewer driving rows, the faster~
  3. If both the subquery table and the driving table are large, you pretty much have to use a JOIN: pull the subquery out into an independent derived table and join it with the driving table; with indexes in place, it won't be slow~
  4. If it is still slow, it's time to ask whether the table design itself is reasonable and worth optimizing, which is a different topic~
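To make conclusion 1 concrete, the three forms can be placed side by side on a toy dataset. The sketch below runs them through SQLite from Python only to show that IN, EXISTS, and a derived-table JOIN return the same rows; SQLite's optimizer is not InnoDB's, and the miniature schema is invented, so this demonstrates equivalence of results, not relative speed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE d_table (device_id TEXT, space_id TEXT, deleted INT);
CREATE TABLE space (space_node_id TEXT, is_tail INT);
INSERT INTO d_table VALUES ('d1','s1',0), ('d2','s2',0), ('d3','s3',1);
INSERT INTO space VALUES ('s1',1), ('s2',0), ('s3',1);
""")

q_in = """SELECT device_id FROM d_table
          WHERE deleted = 0
            AND space_id IN (SELECT space_node_id FROM space WHERE is_tail = 1)"""
q_exists = """SELECT device_id FROM d_table di
              WHERE deleted = 0
                AND EXISTS (SELECT 1 FROM space a1
                            WHERE a1.is_tail = 1
                              AND a1.space_node_id = di.space_id)"""
q_join = """SELECT di.device_id
            FROM d_table di,
                 (SELECT space_node_id FROM space WHERE is_tail = 1) als
            WHERE di.deleted = 0 AND di.space_id = als.space_node_id"""

for q in (q_in, q_exists, q_join):
    print(sorted(r[0] for r in conn.execute(q)))  # ['d1'] each time
```

Only d1 survives: d2's space is not a tail node, and d3 is deleted; all three query shapes agree.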