Below is an example comparing two tables (TAB1, TAB2) with two columns (col1 number, col2 varchar2) using a PARTITION BY analytic function:
create table tab1 (col1 number, col2 varchar2(3)); create table tab2 (col1 number, col2 varchar2(3)); insert into tab1 (col1, col2) values (3, 'ccc'); insert into tab1 (col1, col2) values (1, 'aaa'); insert into tab2 (col1, col2) values (1, 'aaa'); insert into tab2 (col1, col2) values (2, 'aaa'); select * from -- list all results -- change to "select count(*) from" for a count of differences ( select rid, -- ROWID col1, -- column 1 of table to compare col2, -- column 2 of table to compare sum(summ) over (partition by col1, col2) summ -- must list all columns to compare in the PARTITION BY from (select 'In TAB1, NOT in TAB2' tab, rowid rid, -- need to alias ROWID -- list all/some columns of table to compare col1, -- column 1 of table to compare col2, -- column 2 of table to compare ------ 1 summ -- summ is used for sum() operation, -- a matched row from other table will sum to 0 from tab1 UNION ALL -- change to UNION for unique rows only select 'In TAB2, NOT in TAB1' tab, rowid rid, -- need to alias ROWID -- list all/some columns of table to compare col1, -- column 1 of table to compare col2, -- column 2 of table to compare ------ -1 summ -- summ used for sum() operation, -- a matched row from other table will sum to 0 from tab2) ) -- all the rows that didn't have a match have a non-zero sum() where summ != 0 / RID COL1 COL SUMM ------------------ ---------- --- ---------- AAAWK9AAGAAADNHAAB 2 aaa -1 AAAWK8AAGAAADM/AAA 3 ccc 1 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 4 | 164 | 7 (15)| 00:00:01 | |* 1 | VIEW | | 4 | 164 | 7 (15)| 00:00:01 | | 2 | WINDOW SORT | | 4 | 124 | 7 (15)| 00:00:01 | | 3 | VIEW | | 4 | 124 | 6 (0)| 00:00:01 | | 4 | UNION-ALL | | | | | | | 5 | TABLE ACCESS FULL| TAB1 | 2 | 56 | 3 (0)| 00:00:01 | | 6 | TABLE ACCESS FULL| TAB2 | 2 | 56 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("SUMM"<>0)
Two full table scans and a SORT. Compare this to what traditionally people have used:
( select 'IN TAB1, NOT TAB2', col1, col2 from tab1 MINUS select 'IN TAB1, NOT TAB2', col1, col2 from tab2 ) UNION ALL ( select 'IN TAB2, NOT TAB1', col1, col2 from tab2 MINUS select 'IN TAB2, NOT TAB1', col1, col2 from tab1 ) / 'INTAB1,NOTTAB2' COL1 COL ----------------- ---------- --- IN TAB1, NOT TAB2 3 ccc IN TAB2, NOT TAB1 2 aaa ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4 | 128 | 12 (75)| 00:00:01 | | 1 | UNION-ALL | | | | | | | 2 | MINUS | | | | | | | 3 | SORT UNIQUE | | 2 | 32 | | | | 4 | TABLE ACCESS FULL| TAB1 | 2 | 32 | 3 (0)| 00:00:01 | | 5 | SORT UNIQUE | | 2 | 32 | | | | 6 | TABLE ACCESS FULL| TAB2 | 2 | 32 | 3 (0)| 00:00:01 | | 7 | MINUS | | | | | | | 8 | SORT UNIQUE | | 2 | 32 | | | | 9 | TABLE ACCESS FULL| TAB2 | 2 | 32 | 3 (0)| 00:00:01 | | 10 | SORT UNIQUE | | 2 | 32 | | | | 11 | TABLE ACCESS FULL| TAB1 | 2 | 32 | 3 (0)| 00:00:01 | -----------------------------------------------------------------------------
Four full table scans, four sorts. For a large table the PARTITION BY wins.
No comments:
Post a Comment