分析优化Mysql 多表联合查询效率

mysql大数据查询优化对于许多站长来讲都不会仔细的去分析了，对于这个问题小编最近碰到一个100w数据优化问题了，下面整理了一些mysql关联查询优化的测试及相关分析希望能帮助到大家。
相关mysql视频教程推荐：《mysql教程》
一，简单的关联子查询的一种优化 .
很多时候，在mysql上实现的子查询的性能较差，这听起来实在有点难过。特别有时候，用到in()子查询语句时，对于上了某种数量级的表来说，耗时多的难以估计。本人mysql知识所涉不深，只能慢慢摸透个中玄机了。
假设有这样的一个exists查询语句：
select * from table1 where exists (select * from table2 where id>=30000 and table1.uuid=table2.uuid);
table1为十万行级的表，table2为百万行级的表，本机测试结果用时2.40s。
通过explain可以看到子查询是一个相关子查询(dependence subquery); mysql会首先对外表table1进行全表扫描，然后根据返回的uuid逐次执行子查询。如果外层表是一个很大的表，我们可以想象查询性能会表现得比此次测试更糟糕。
一种简单的优化方案为使用inner join的方法来代替子查询, 查询语句则可以改为:
select * from table1 innner join table2 using(uuid) where table2.id>=30000;
本机测试结果用时0.68s。
通过explain可以看到mysql使用了simple类型(子查询或union以外的查询方式); mysql优化器会先过滤table2，然后对table1和table2做笛卡尔积得出结果集后，再通过on条件来过滤数据。
二、多表联合查询效率分析及优化
1. 多表连接类型
1. 笛卡尔积(交叉连接) 在mysql中可以为cross join或者省略cross即join，或者使用',' 如：
01.select * from table1 cross join table2 02.select * from table1 join table2 03.select * from table1,table2 select * from table1 cross join table2 select * from table1 join table2 select * from table1,table2
由于其返回的结果为被连接的两个数据表的乘积，因此当有where, on或using条件的时候一般不建议使用，因为当数据表项目太多的时候，会非常慢。一般使用left [outer] join或者right [outer] join
2. 内连接inner join 在mysql中把inner join叫做等值连接，即需要指定等值连接条件在mysql中cross和inner join被划分在一起。 join_table: table_reference [inner | cross] join table_factor [join_condition]
3. mysql中的外连接，分为左外连接和右连接，即除了返回符合连接条件的结果之外，还要返回左表(左连接)或者右表(右连接)中不符合连接条件的结果，相对应的使用null对应。
例子：
user表:
id | name ——— 1 | libk 2 | zyfon 3 | daodao
user_action表:
user_id | action ————— 1 | jump 1 | kick 1 | jump 2 | run 4 | swim
sql:
01.select id, name, action from user as u 02.left join user_action a on u.id = a.user_id select id, name, action from user as u left join user_action a on u.id = a.user_idresult: id | name | action ——————————– 1 | libk | jump ① 1 | libk | kick ② 1 | libk | jump ③ 2 | zyfon | run ④ 3 | daodao | null ⑤
分析：
注意到user_action中还有一个user_id=4, action=swim的纪录，但是没有在结果中出现，
而user表中的id=3, name=daodao的用户在user_action中没有相应的纪录，但是却出现在了结果集中
因为现在是left join，所有的工作以left为准.
结果1，2，3，4都是既在左表又在右表的纪录，5是只在左表，不在右表的纪录
工作原理：
从左表读出一条，选出所有与on匹配的右表纪录(n条)进行连接，形成n条纪录(包括重复的行，如：结果1和结果3)，如果右边没有与on条件匹配的表，那连接的字段都是null.然后继续读下一条。
引申：
我们可以用右表没有on匹配则显示null的规律, 来找出所有在左表，不在右表的纪录，注意用来判断的那列必须声明为not null的。
如：
sql:
01.select id, name, action from user as u 02.left join user_action a on u.id = a.user_id 03.where a.user_id is null select id, name, action from user as u left join user_action a on u.id = a.user_id where a.user_id is null
(注意:
1.列值为null应该用is null 而不能用=null
2.这里a.user_id 列必须声明为 not null 的.）
上面sql的result:
id | name | action ————————– 3 | daodao | null——————————————————————————–
一般用法：
a. left [outer] join：
除了返回符合连接条件的结果之外，还需要显示左表中不符合连接条件的数据列，相对应使用null对应
01.select column_name from table1 left [outer] join table2 on table1.column=table2.column select column_name from table1 left [outer] join table2 on table1.column=table2.column b. right [outer] join：
right与left join相似不同的仅仅是除了显示符合连接条件的结果之外，还需要显示右表中不符合连接条件的数据列，相应使用null对应
01.select column_name from table1 right [outer] join table2 on table1.column=table2.column select column_name from table1 right [outer] join table2 on table1.column=table2.columntips:
1. on a.c1 = b.c1 等同于 using(c1)
2. inner join 和 , (逗号) 在语义上是等同的
3. 当 mysql 在从一个表中检索信息时，你可以提示它选择了哪一个索引。
如果 explain 显示 mysql 使用了可能的索引列表中错误的索引，这个特性将是很有用的。
通过指定 use index (key_list)，你可以告诉 mysql 使用可能的索引中最合适的一个索引在表中查找记录行。
可选的二选一句法 ignore index (key_list) 可被用于告诉 mysql 不使用特定的索引。如：
01.mysql> select * from table1 use index (key1,key2) 02.-> where key1=1 and key2=2 and key3=3; 03.mysql> select * from table1 ignore index (key3) 04.-> where key1=1 and key2=2 and key3=3; mysql> select * from table1 use index (key1,key2) -> where key1=1 and key2=2 and key3=3; mysql> select * from table1 ignore index (key3) -> where key1=1 and key2=2 and key3=3;
2. 表连接的约束条件
添加显示条件where, on, using
1.where子句mysql>
01.select * from table1,table2 where table1.id=table2.id; select * from table1,table2 where table1.id=table2.id;
2. on
mysql>
01.select * from table1 left join table2 on table1.id=table2.id; 02. 03.select * from table1 left join table2 on table1.id=table2.id 04.left join table3 on table2.id=table3.id; select * from table1 left join table2 on table1.id=table2.id;select * from table1 left join table2 on table1.id=table2.id left join table3 on table2.id=table3.id;
3. using子句，如果连接的两个表连接条件的两个列具有相同的名字的话可以使用using
例如：
select from left join using ()
连接多于两个表的情况举例：
mysql>
01.select artists.artist, cds.title, genres.genre 02. 03.from cds 04. 05.left join genres n cds.genreid = genres.genreid 06. 07.left join artists on cds.artistid = artists.artistid; select artists.artist, cds.title, genres.genrefrom cdsleft join genres n cds.genreid = genres.genreidleft join artists on cds.artistid = artists.artistid;
或者 mysql>
01.select artists.artist, cds.title, genres.genre 02. 03.from cds 04. 05.left join genres on cds.genreid = genres.genreid 06. 07. left join artists -> on cds.artistid = artists.artistid 08. 09. where (genres.genre = 'pop'); select artists.artist, cds.title, genres.genre
from cds
left join genres on cds.genreid = genres.genreid left join artists -> on cds.artistid = artists.artistid where (genres.genre = 'pop');
--------------------------------------------
另外需要注意的地方在mysql中涉及到多表查询的时候，需要根据查询的情况，想好使用哪种连接方式效率更高。
1. 交叉连接(笛卡尔积)或者内连接 [inner | cross] join
2. 左外连接left [outer] join或者右外连接right [outer] join 注意指定连接条件where, on，using.
3. mysql如何优化left join和right join
在mysql中，a left join b join_condition执行过程如下：
1)· 根据表a和a依赖的所有表设置表b。
2)· 根据left join条件中使用的所有表(除了b)设置表a。
3)· left join条件用于确定如何从表b搜索行。(换句话说，不使用where子句中的任何条件）。
4)· 可以对所有标准联接进行优化，只是只有从它所依赖的所有表读取的表例外。如果出现循环依赖关系，mysql提示出现一个错误。
5)· 进行所有标准where优化。
6)· 如果a中有一行匹配where子句，但b中没有一行匹配on条件，则生成另一个b行，其中所有列设置为null。
7)· 如果使用left join找出在某些表中不存在的行，并且进行了下面的测试：where部分的col_name is null，其中col_name是一个声明为 not null的列，mysql找到匹配left join条件的一个行后停止(为具体的关键字组合)搜索其它行。
right join的执行类似left join，只是表的角色反过来。
联接优化器计算表应联接的顺序。left join和straight_join强制的表读顺序可以帮助联接优化器更快地工作，因为检查的表交换更少。请注意这说明如果执行下面类型的查询，mysql进行全扫描b，因为left join强制它在d之前读取：
01.select * 02.from a,b left join c on (c.key=a.key) left join d on (d.key=a.key) 03.where b.key=d.key; select * from a,b left join c on (c.key=a.key) left join d on (d.key=a.key) where b.key=d.key;
在这种情况下修复时用a的相反顺序，b列于from子句中：
01.select * 02.from b,a left join c on (c.key=a.key) left join d on (d.key=a.key) 03.where b.key=d.key; select * from b,a left join c on (c.key=a.key) left join d on (d.key=a.key) where b.key=d.key;
mysql可以进行下面的left join优化：如果对于产生的null行，where条件总为假，left join变为普通联接。
例如，在下面的查询中如果t2.column1为null，where 子句将为false：
01.select * from t1 left join t2 on (column1) where t2.column2=5;
select * from t1 left join t2 on (column1) where t2.column2=5;因此，可以安全地将查询转换为普通联接：
01.select * from t1, t2 where t2.column2=5 and t1.column1=t2.column1;
select * from t1, t2 where t2.column2=5 and t1.column1=t2.column1;这样可以更快，因为如果可以使查询更佳，mysql可以在表t1之前使用表t2。为了强制使用表顺序，使用straight_join。
三、利用缓存来实现
现在社区分享类网站很火，就拿方维购物分享网站举例说明吧。也是对二次开发方维购物分享网站的一点总结，高手可以飞过。
购物分享的关键表有：分享表、图片表、文件表、评论表、标签表、分类表等。
围绕分享的表就么多，哇，那也不少啊。当我们查看一个图片的详细信息时，就要显示以上表里的信息。显示图片所属的分类、给图片打的标签、图片的评论、有文件的话还要显示文件下载信息等。难道让我们6个表去关联查询嘛，当然不能这么多关联来查询数据，我们可以只查询一个表即可，这怎么讲？这里分享表是主表，我们可以在主表里建立一个缓存字段。比如我们叫cache_data字段，赋予它text类型，这样可以存储很长的字符串，而不至于超过字段的最大存储。
这个缓存字段怎么用呢？在新增一条分享信息后，产生分享id。如果用户发布图片或文件的话，图片信息入图片表，文件信息入文件表，然后把新产生的图片或文件信息写入到缓存字段里。同样的，如果用户有选择分类、打了标签的话，也把相应的信息写入到缓存字段里。对于评论而言，没有必要把全部评论存到缓存字段里，因为你不知道他有多少条记录，可以把最新的10条存到缓存字段里用于显示，这样缓存字段就变成一个二维或三维数组，序列化后存储到分享表里。
array( 'img' = array( name => '123.jpg', url => 'http://tech.42xiu.com/123.jpg', width => 800, width => 600, ), 'file' = array( name => 'abc.zip', download_url => 'http: //tech.42xiu.com/abc.zip', size => 1.2mb, ), 'category' = array( 1 => array( id => 5, name => php乐知博客 ), 2 => array( id => 6, name => php技术博客 ), ), 'tag' => array( tag1 tag2 ...... ), 'message' => array( 1 => array(id, uid, name, content, time), 2 => array(id, uid, name, content, time), 3 => array(id, uid, name, content, time), 4 => array(id, uid, name, content, time), ),) //比如，上面的数组结构，序列化存入数据库。
update share set cache_data=mysql_real_escape_string(serialize($cache_data)) where id=1;这样查询就变得简单了，只需要查询一条就行了，取到缓存字段，把其反序列化，把数组信息提取出来，然后显示到页面。如果是以前那个结构，在几十万的数据量下，估计早崩溃了。数据缓存的方法也许不是最好的，如果你有更好的方法，可以相互学习，相互讨论。
相关推荐：
thinkphp多表联合查询的常用方法_php教程
mysql多表联合查询返回一张表的内容实现代码
mysql多表联合查询的问题
以上就是分析优化mysql 多表联合查询效率的详细内容。

分析优化Mysql 多表联合查询效率

VIP推荐