对PostgreSQL Merge Join 的理解
开始伪代码:
http://momjian.us/main/writings/pgsql/optimizer.pdf
sort(outer);
sort(inner);
i = 0;
j = 0;
save_j = 0;
while (i < length(outer))
{
if (outer == inner)
output(outer, inner);
if (outer <= inner && j < length(inner))
{
j++;
if (outer < inner)
save_j = j;
}
else
{
i++;
j = save_j;
}
}
上述描述中,可以把两列排序好的数组看成 由大到小排列。
Merge Join 先要对各表各自排序,然后从各自的排序表中抽取数据,到另一个排序表中做匹配。
通常来讲,能够使用merge join的地方,hash join 更快。
验证:
postgres=# EXPLAIN SELECT relname,nspname FROM pg_class join
pg_namespace ON (pg_class.relnamespace = pg_namespace.oid);
QUERY PLAN
-------------------------------------------------------------------------
Hash Join(cost=1.14..16.02 rows=290 width=128)
Hash Cond: (pg_class.relnamespace = pg_namespace.oid)
->Seq Scan on pg_class(cost=0.00..10.90 rows=290 width=68)
->Hash(cost=1.06..1.06 rows=6 width=68)
->Seq Scan on pg_namespace(cost=0.00..1.06 rows=6 width=68)
(5 rows)
postgres=#
postgres=# set session enable_hashjoin=false;
SET
postgres=# EXPLAIN SELECT relname,nspname FROM pg_class join
pg_namespace ON (pg_class.relnamespace = pg_namespace.oid);
QUERY PLAN
-------------------------------------------------------------------------
Merge Join(cost=23.90..28.28 rows=290 width=128)
Merge Cond: (pg_namespace.oid = pg_class.relnamespace)
->Sort(cost=1.14..1.15 rows=6 width=68)
Sort Key: pg_namespace.oid
->Seq Scan on pg_namespace(cost=0.00..1.06 rows=6 width=68)
->Sort(cost=22.76..23.49 rows=290 width=68)
Sort Key: pg_class.relnamespace
->Seq Scan on pg_class(cost=0.00..10.90 rows=290 width=68)
(8 rows)
postgres=#
[作者:技术者高健@博客园mail: luckyjackgao@gmail.com ]
改了 join 的顺序 对 结果也没有影响。
postgres=# EXPLAIN SELECT relname,nspname FROM pg_namespace join
pg_class ON (pg_class.relnamespace = pg_namespace.oid);
QUERY PLAN
-------------------------------------------------------------------------
Merge Join(cost=23.90..28.28 rows=290 width=128)
Merge Cond: (pg_namespace.oid = pg_class.relnamespace)
->Sort(cost=1.14..1.15 rows=6 width=68)
Sort Key: pg_namespace.oid
->Seq Scan on pg_namespace(cost=0.00..1.06 rows=6 width=68)
->Sort(cost=22.76..23.49 rows=290 width=68)
Sort Key: pg_class.relnamespace
->Seq Scan on pg_class(cost=0.00..10.90 rows=290 width=68)
(8 rows)
postgres=#
postgres=# EXPLAIN SELECT relname,nspname FROM pg_namespace,
pg_class where (pg_class.relnamespace = pg_namespace.oid);
QUERY PLAN
-------------------------------------------------------------------------
Merge Join(cost=23.90..28.28 rows=290 width=128)
Merge Cond: (pg_namespace.oid = pg_class.relnamespace)
->Sort(cost=1.14..1.15 rows=6 width=68)
Sort Key: pg_namespace.oid
->Seq Scan on pg_namespace(cost=0.00..1.06 rows=6 width=68)
->Sort(cost=22.76..23.49 rows=290 width=68)
Sort Key: pg_class.relnamespace
->Seq Scan on pg_class(cost=0.00..10.90 rows=290 width=68)
(8 rows)
postgres=#
结束
页:
[1]