Below is an example comparing two tables (TAB1, TAB2) with two columns (col1 number, col2 varchar2) using a PARTITION BY analytic function:
create table tab1 (col1 number, col2 varchar2(3));
create table tab2 (col1 number, col2 varchar2(3));
insert into tab1 (col1, col2) values (3, 'ccc');
insert into tab1 (col1, col2) values (1, 'aaa');
insert into tab2 (col1, col2) values (1, 'aaa');
insert into tab2 (col1, col2) values (2, 'aaa');
select * from
-- list all results
-- change to "select count(*) from" for a count of differences
(
select rid, -- ROWID
col1, -- column 1 of table to compare
col2, -- column 2 of table to compare
sum(summ) over (partition by col1, col2) summ
-- must list all columns to compare in the PARTITION BY
from
(select 'In TAB1, NOT in TAB2' tab,
rowid rid, -- need to alias ROWID
-- list all/some columns of table to compare
col1, -- column 1 of table to compare
col2, -- column 2 of table to compare
------
1 summ
-- summ is used for sum() operation,
-- a matched row from other table will sum to 0
from tab1
UNION ALL
-- change to UNION for unique rows only
select 'In TAB2, NOT in TAB1' tab,
rowid rid, -- need to alias ROWID
-- list all/some columns of table to compare
col1, -- column 1 of table to compare
col2, -- column 2 of table to compare
------
-1 summ
-- summ used for sum() operation,
-- a matched row from other table will sum to 0
from tab2)
)
-- all the rows that didn't have a match have a non-zero sum()
where summ != 0
/
RID COL1 COL SUMM
------------------ ---------- --- ----------
AAAWK9AAGAAADNHAAB 2 aaa -1
AAAWK8AAGAAADM/AAA 3 ccc 1
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 164 | 7 (15)| 00:00:01 |
|* 1 | VIEW | | 4 | 164 | 7 (15)| 00:00:01 |
| 2 | WINDOW SORT | | 4 | 124 | 7 (15)| 00:00:01 |
| 3 | VIEW | | 4 | 124 | 6 (0)| 00:00:01 |
| 4 | UNION-ALL | | | | | |
| 5 | TABLE ACCESS FULL| TAB1 | 2 | 56 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL| TAB2 | 2 | 56 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("SUMM"<>0)
Two full table scans and a SORT. Compare this to what traditionally people have used:
( select 'IN TAB1, NOT TAB2', col1, col2 from tab1 MINUS select 'IN TAB1, NOT TAB2', col1, col2 from tab2 ) UNION ALL ( select 'IN TAB2, NOT TAB1', col1, col2 from tab2 MINUS select 'IN TAB2, NOT TAB1', col1, col2 from tab1 ) / 'INTAB1,NOTTAB2' COL1 COL ----------------- ---------- --- IN TAB1, NOT TAB2 3 ccc IN TAB2, NOT TAB1 2 aaa ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4 | 128 | 12 (75)| 00:00:01 | | 1 | UNION-ALL | | | | | | | 2 | MINUS | | | | | | | 3 | SORT UNIQUE | | 2 | 32 | | | | 4 | TABLE ACCESS FULL| TAB1 | 2 | 32 | 3 (0)| 00:00:01 | | 5 | SORT UNIQUE | | 2 | 32 | | | | 6 | TABLE ACCESS FULL| TAB2 | 2 | 32 | 3 (0)| 00:00:01 | | 7 | MINUS | | | | | | | 8 | SORT UNIQUE | | 2 | 32 | | | | 9 | TABLE ACCESS FULL| TAB2 | 2 | 32 | 3 (0)| 00:00:01 | | 10 | SORT UNIQUE | | 2 | 32 | | | | 11 | TABLE ACCESS FULL| TAB1 | 2 | 32 | 3 (0)| 00:00:01 | -----------------------------------------------------------------------------
Four full table scans, four sorts. For a large table the PARTITION BY wins.
No comments:
Post a Comment