DBCC FREEPROCCACHE;GODBCC DROPCLEANBUFFERS;GO SET STATISTICS IO ON;SET STATISTICS TIME ON; SELECT h.* FROM Sales.SalesOrderHeader hWHERE SalesOrderID IN ( SELECT SalesOrderID FROM Sales.SalesOrderDetail)
DBCC FREEPROCCACHE;GODBCC DROPCLEANBUFFERS;GOSET STATISTICS IO ON;SET STATISTICS TIME ON; SELECT h.* FROM Sales.SalesOrderHeader hINNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID 如下所示,两种写法的SQL的实际执行计划是几乎一致。而且对比IO开销也是一致。cpu time 与elapsed time 有所差别,这个是因为两者返回的数据有所差别的缘故(SQL 1 返回 31465行数据, SQL 2返回 121317行数据),两者在逻辑上实际上是不一致的。因为重复数据的缘故。撇开这个不谈,光从性能上来考察两种,它们几乎是一模一样。没有优劣之分。
CREATE TABLE P( PID INT , Pname VARCHAR(24)) INSERT INTO dbo.PSELECT 1, 'P1' UNION ALLSELECT 2, 'P2' UNION ALLSELECT 3, 'P3' CREATE TABLE dbo.C( CID INT , PID INT , Cname VARCHAR(24)) INSERT INTO dbo.cSELECT 1, 1, 'C1' UNION ALLSELECT 2, 1, 'C2' UNION ALLSELECT 3, 2, 'C3' UNION ALLSELECT 3, 3, 'C4'
其实下面SQL在逻辑上才是相等的,它们的实际执行计划与IO是一样的。没有优劣之分。
SELECT h.* FROM Sales.SalesOrderHeader hWHERE SalesOrderID IN ( SELECT SalesOrderID FROM Sales.SalesOrderDetail); SELECT DISTINCT h.* FROM Sales.SalesOrderHeader hINNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID;
那么我们再来看另外一个例子,测试一下两者的性能差别。如下所示
SET STATISTICS IO ON;SET STATISTICS TIME ON; SELECT C.*FROM Sales.Customer C INNER JOIN Person.Person P ON C.PersonID = P.BusinessEntityID; SELECT C.*FROM Sales.Customer CWHERE C.PersonID IN ( SELECT Person.Person.BusinessEntityID FROM Person.Person );
INNER JOIN与子查询IN的实际执行计划对比的百分比为66% VS 34% , 子查询IN的性能还比 INNER JOIN的性能要好一些. IO几乎无差别,cpu time 与elapsed time的对比情况来看,子查询IN的性能确实要好一些。
CREATE TABLE P( P_ID INT IDENTITY(1,1), OTHERCOL CHAR(500), CONSTRAINT PK_P PRIMARY KEY(P_ID))GO BEGIN TRANDECLARE @I INT = 1WHILE @I<=10000BEGIN INSERT INTO P VALUES (NEWID()) SET @I = @I+1 IF (@I%500)=0 BEGIN IF @@TRANCOUNT>0 BEGIN COMMIT BEGIN TRAN END ENDENDIF @@TRANCOUNT>0BEGIN COMMITENDGO CREATE TABLE C ( C_ID INT IDENTITY(1,1) , P_ID INT FOREIGN KEY REFERENCES P(P_ID), COLN CHAR(500), CONSTRAINT PK_C PRIMARY KEY (C_ID) ) SET NOCOUNT ON; DECLARE @I INT = 1WHILE @I<=1000000BEGIN INSERT INTO C VALUES ( CAST(RAND()*10 AS INT)+1, NEWID()) SET @I = @I+1ENDGO
构造完测试数据后,我们对比下两者的性能差异
SET STATISTICS IO ON;SET STATISTICS TIME ON; SELECT C.* FROM dbo.C CINNER JOIN dbo.P P ON C.P_ID = P.P_IDWHERE P.P_ID=8 SELECT * FROM dbo.CWHERE P_ID IN (SELECT P_ID FROM dbo.P WHERE P_ID=8)
增加对应的索引后,这个性能差距更更明显。 如下截图所示
USE [AdventureWorks2014]GOCREATE NONCLUSTERED INDEX [IX_C_N1]ON [dbo].[C] ([P_ID])INCLUDE ([C_ID],[COLN])GO
在生产环境遇到一个案例, 两个视图使用INNER JOIN 与 IN 两种写法,在性能上差距很大。 使用子查询IN的性能比使用INNER JOIN的性能要好很多。如下截图所示。因为视图里面涉及多表。这样肯定导致执行计划非常复杂,导致SQL用INNER JOIN 的写法在性能上没有用子查询IN的写法要快