JOINS
A join combines the output from exactly two row sources, such as tables or views, and returns one row source. The returned row source is the data set. A join condition compares two row sources using an expression. The join condition defines the relationship between the tables. If the statement does not specify a join condition, then the database performs a Cartesian join, matching every row in one table with every row in the other table. Join Methods A join method is the mechanism for joining two row sources. Each join method has two children: the driving (also called outer) row source and the driven-to (also called inner) row source. There are 3 main type of JOINS .i.e Nested Loop,Hash join,Sort Merge Joins. Driving Table => The Outer table is called Driving table Inner Table => The table which iterates with Outer table called inner table. We will discuss JOIN Methods as below. => Nested Loop Joins Nested loops join an outer data set to an inner data set. For each row in the outer data set that matches the single-table predicates, the database retrieves all rows in the inner data set that satisfy the join predicate. If an index is available, then the database can use it to access the inner data set by rowid. The number of rows expected from the join is what drives the optimizer decision, not the size of the underlying tables. For example, a query might join two tables of a billion rows each, but because of the filters the optimizer expects data sets of 5 rows each. When Optimizer choose Nested Loop Nested loops joins are useful when the database joins small subsets of data, The database joins large sets of data with the optimizer mode set to FIRST_ROWS, The join condition is an efficient method of accessing the inner table. In general, nested loops joins work best on small tables with indexes on the join conditions. If a row source has only one row, as with an equality lookup on a primary key value (for example, WHERE employee_id=101), then the join is a simple lookup. The optimizer always tries to put the smallest row source first, making it the driving table. A nested loops join involves the following basic steps: The optimizer determines the driving row source and designates it as the outer loop. The optimizer designates the other row source as the inner loop. For every fetch request from the client, the basic process is as follows: Fetch a row from the outer row source Probe the inner row source to find rows that match the predicate criteria Repeat the preceding steps until all rows are obtained by the fetch request Sometimes the database sorts rowids to obtain a more efficient buffer access pattern. Nested Loops Controls You can add the USE_NL hint to instruct the optimizer to join each specified table to another row source with a nested loops join, using the specified table as the inner table. The related hint USE_NL_WITH_INDEX(table index) hint instructs the optimizer to join the specified table to another row source with a nested loops join using the specified table as the inner table. Example SQL> SELECT * FROM EMP where ENAME IN ('CLARKE','BLAKE'); EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 SQL> SELECT * FROM EMP where ENAME IN ('CLARK','BLAKE'); EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 7782 CLARK MANAGER 7839 09-JUN-81 2450 10 SQL> SELECT * FROM DEPT WHERE DNAME IN ('ACCOUNTING','SALES'); DEPTNO DNAME LOC ---------- -------------- ------------- 10 ACCOUNTING NEW YORK 30 SALES CHICAGO SQL> SELECT E.ENAME,E.SAL,D.DNAME FROM DEPT D,EMP E WHERE D.DNAME IN ('ACCOUNTING','SALES') AND E.ENAME IN ('CLARK','BLAKE') AND E.DEPTNO=D.DEPTNO; ENAME SAL DNAME ---------- ---------- -------------- CLARK 2450 ACCOUNTING BLAKE 2850 SALES SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR()); PLAN_TABLE_OUTPUT --------------------------- SQL_ID dk881s6gh1986, child number 0 ------------------------------------- SELECT E.ENAME,E.SAL,D.DNAME FROM DEPT D,EMP E WHERE D.DNAME IN ('ACCOUNTING','SALES') AND E.ENAME IN ('CLARK','BLAKE') AND E.DEPTNO=D.DEPTNO Plan hash value: 1237191068 -------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 9 (100)| | | 1 | NESTED LOOPS | | 2 | 44 | 9 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 1500 | 44 | 9 (0)| 00:00:01 | | 3 | INLIST ITERATOR | | | | | | | 4 | TABLE ACCESS BY INDEX ROWID BATCHED| DEPT | 2 | 16 | 2 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | DEPT_IDX_NAME | 2 | | 1 (0)| 00:00:01 | |* 6 | INDEX RANGE SCAN | DEPT_EMP_IDX | 750 | | 1 (0)| 00:00:01 | |* 7 | TABLE ACCESS BY INDEX ROWID | EMP | 1 | 14 | 6 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access(("D"."DNAME"='ACCOUNTING' OR "D"."DNAME"='SALES')) 6 - access("E"."DEPTNO"="D"."DEPTNO") 7 - filter(("E"."ENAME"='BLAKE' OR "E"."ENAME"='CLARK')) Note ----- - this is an adaptive plan 32 rows selected. SQL> => Hash Joins The database uses a hash join to join larger data sets. The optimizer uses the smaller of two data sets to build a hash table on the join key in memory, using a deterministic hash function to specify the location in the hash table in which to store each row. The database then scans the larger data set, probing the hash table to find the rows that meet the join condition. When the Optimizer Considers Hash Joins In general, the optimizer considers a hash join when a relatively large amount of data must be joined (or a large percentage of a small table must be joined), and the join is an equijoin."=" A hash join is most cost effective when the smaller data set fits in memory. In this case, the cost is limited to a single read pass over the two data sets. Because the hash table is in the PGA, Oracle Database can access rows without latching them. This technique reduces logical I/O by avoiding the necessity of repeatedly latching and reading blocks in the database buffer cache. If the data sets do not fit in memory, then the database partitions the row sources, and the join proceeds partition by partition. This can use a lot of sort area memory, and I/O to the temporary tablespace. This method can still be the most cost effective, especially when the database uses parallel query servers. Hash Join Controls The USE_HASH hint instructs the optimizer to use a hash join when joining two tables together. How Hash Joins Work Step 1=> Find out Smaller rowset table in Query. Step 2=> Hashing Smaller rowset table’s column on join columns Step 3=> Put hash table with hasing value and table content in PGA Step 4=> Hashing of Larger table column ,using same same has function as in Step-2 , on the fly. Step 5=> If hash function return same value as from Step 2 for Step 4 it proceed and join them else move to next value on larger dataset. Example As per below DEPT table will return very small rowset with mentioned filter ,where as EMP table will return larger rowset without fiter. So Table DEPT will be good candidate for HASING As we can see DEPT table got hashed i.e In a hash join, the data set for the build table always appears first in the list of operations (Step 2). In Step 3, the database performs a full scan of the larger EMP later, probing the hash table for each row. SQL> select count(*) from DEpt where DNAME IN ('ACCOUNTING','SALES'); COUNT(*) ---------- 2 SQL> select count(*) from EMP; COUNT(*) ---------- 9003 SQL> SELECT E.ENAME,E.SAL,D.DNAME FROM DEPT D,EMP E WHERE D.DNAME IN ('ACCOUNTING','SALES') AND D.DEPTNO=E.DEPTNO; ENAME SAL DNAME ---------- ---------- -------------- BLAKE 2850 SALES CLARK 2450 ACCOUNTING ALLEN 1600 SALES WARD 1250 SALES MARTIN 1250 SALES TURNER 1500 SALES JAMES 950 SALES MILLER 1300 ACCOUNTING 8 rows selected. SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR()); PLAN_TABLE_OUTPUT ----------------------------- SQL_ID d5u7qnkz4tm09, child number 0 ------------------------------------- SELECT E.ENAME,E.SAL,D.DNAME FROM DEPT D,EMP E WHERE D.DNAME IN ('ACCOUNTING','SALES') AND D.DEPTNO=E.DEPTNO Plan hash value: 615168685 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 20 (100)| | |* 1 | HASH JOIN | | 1501 | 33022 | 20 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL| DEPT | 2 | 16 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL| EMP | 9003 | 123K| 17 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("D"."DEPTNO"="E"."DEPTNO") 2 - filter(("D"."DNAME"='ACCOUNTING' OR "D"."DNAME"='SALES')) 22 rows selected. SQL> => Sort Merge Joins A sort merge join is a variation on a nested loops join. If the two data sets in the join are not already sorted, then the database sorts them. These are the SORT JOIN operations. For each row in the first data set, the database probes the second data set for matching rows and joins them, basing its start position on the match made in the previous iteration. This is the MERGE JOIN operation. When the Optimizer Considers Sort Merge Joins A sort merge join requires two sorts. The database accesses rows in the PGA rather than the SGA, reducing logical I/O by avoiding the necessity of repeatedly latching and reading blocks in the database buffer cache. The optimizer may choose a sort merge join over a hash join for joining large amounts of data when any of the following conditions is true: a. The join condition between two tables is not an equijoin, that is, uses an inequality condition such as <, <=, >, or >=. Because of sorts required by other operations, the optimizer finds it cheaper to use a sort merge. If an index exists, then the database can avoid sorting the first data set. However, the database always sorts the second data set, regardless of indexes. Sort merge joins offer the following advantages over a hash join: After the initial sort, the merge phase is optimized, resulting in faster generation of output rows. A sort merge can be more cost-effective than a hash join when the hash table does not fit completely in memory. A hash join with insufficient memory requires both the hash table and the other data set to be copied to disk. In this case, the database may have to read from disk multiple times. In a sort merge, if memory cannot hold the two data sets, then the database writes them both to disk, but reads each data set no more than once. How Sort Merge Joins Work A sort merge join reads two data sets, but sorts them when they are not already sorted. For each row in the first data set, the database finds a starting row in the second data set, and then reads the second data set until it finds a nonmatching row. Sort Merge Controls The USE_MERGE hint instructs the optimizer to use a sort merge join. Example SQL> select count(*) from DEPT1; COUNT(*) ---------- 4 SQL> select count(*) from emp1; COUNT(*) ---------- 15 SQL> SELECT E.ENAME,E.SAL,D.DNAME FROM DEPT1 D,EMP1 E WHERE D.DNAME IN ('ACCOUNTING','SALES') AND D.DEPTNO=E.DEPTNO; ENAME SAL DNAME ---------- ---------- -------------- CLARK 2450 ACCOUNTING MILLER 1300 ACCOUNTING KING 5000 ACCOUNTING WARD 1250 SALES MARTIN 1250 SALES TURNER 1500 SALES JAMES 950 SALES ALLEN 1600 SALES BLAKE 2850 SALES 9 rows selected. SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR()); PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID 5udkmzpbvz3zz, child number 0 ------------------------------------- SELECT E.ENAME,E.SAL,D.DNAME FROM DEPT1 D,EMP1 E WHERE D.DNAME IN ('ACCOUNTING','SALES') AND D.DEPTNO=E.DEPTNO Plan hash value: 2387524659 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 6 (100)| | | 1 | MERGE JOIN | | 9 | 243 | 6 (17)| 00:00:01 | |* 2 | TABLE ACCESS BY INDEX ROWID| DEPT1 | 2 | 26 | 2 (0)| 00:00:01 | | 3 | INDEX FULL SCAN | PK_DEPT1 | 4 | | 1 (0)| 00:00:01 | |* 4 | SORT JOIN | | 14 | 196 | 4 (25)| 00:00:01 | |* 5 | TABLE ACCESS FULL | EMP1 | 14 | 196 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(("D"."DNAME"='ACCOUNTING' OR "D"."DNAME"='SALES')) 4 - access("D"."DEPTNO"="E"."DEPTNO") filter("D"."DEPTNO"="E"."DEPTNO") 5 - filter("E"."DEPTNO" IS NOT NULL) 26 rows selected. SQL> Join Types A join type is determined by the type of join condition. Below are popular Types. A=> Inner Joins An inner join (sometimes called a simple join) is a join that returns only rows that satisfy the join condition. Inner joins are either equijoins or nonequijoins. A1=>Equijoins An equijoin is an inner join whose join condition contains an equality operator. The following example is an equijoin because the join condition contains only an equality operator: Like WHERE e.department_id=d.department_id; A2=>Non-Equijoins A nonequijoin is an inner join whose join condition contains an operator that is not an equality operator. Like e.hire_date BETWEEN h.start_date AND h.end_date; B=>Outer Joins An outer join returns all rows that satisfy the join condition and also rows from one table for which no rows from the other table satisfy the condition. Thus, the result set of an outer join is the superset of an inner join. The OUTER JOIN clause specifies an outer join. In the FROM clause, the left table appears to the left of the OUTER JOIN keywords, and the right table appears to the right of these keywords. The left table is also called the outer table, and the right table is also called the inner table. For example, in the following statement the employees table is the left or outer table: FROM employees LEFT OUTER JOIN departments Outer joins require the outer-joined table to be the driving table. In the preceding example, employees is the driving table, and departments is the driven-to table. B1=>Right Outer Joins Fetch all rows from RIGHT side table ofOUTER JOIN .
As per below exampl there is NULL entry in EMP1 table and we want this in our report. SQL> select * from DEPT1; DEPTNO DNAME LOC ---------- -------------- ------------- 10 ACCOUNTING NEW YORK 20 RESEARCH DALLAS 30 SALES CHICAGO 40 OPERATIONS BOSTON SQL> select * from emp1; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- 7839 KING PRESIDENT 17-NOV-81 5000 10 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 7782 CLARK MANAGER 7839 09-JUN-81 2450 10 7566 JONES MANAGER 7839 02-APR-81 2975 20 7788 SCOTT ANALYST 7566 19-APR-87 3000 20 7902 FORD ANALYST 7566 03-DEC-81 3000 20 7369 SMITH CLERK 7902 17-DEC-80 800 20 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30 7876 ADAMS CLERK 7788 23-MAY-87 1100 20 7900 JAMES CLERK 7698 03-DEC-81 950 30 7934 MILLER CLERK 7782 23-JAN-82 1300 10 7800 MILLER1 CLERK1 7782 23-JAN-82 1300 15 rows selected. SQL> select * from EMP1 where deptno is null; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- 7800 MILLER1 CLERK1 7782 23-JAN-82 1300 SQL> SQL> select E.ENAME,E.SAL,D.DNAME,D.DEPTNO FROM DEPT1 D RIGHT OUTER JOIN EMP1 E ON E.DEPTNO=D.DEPTNO ORDER BY D.DEPTNO; ENAME SAL DNAME DEPTNO ---------- ---------- -------------- ---------- KING 5000 ACCOUNTING 10 MILLER 1300 ACCOUNTING 10 CLARK 2450 ACCOUNTING 10 SMITH 800 RESEARCH 20 FORD 3000 RESEARCH 20 SCOTT 3000 RESEARCH 20 JONES 2975 RESEARCH 20 ADAMS 1100 RESEARCH 20 BLAKE 2850 SALES 30 ALLEN 1600 SALES 30 WARD 1250 SALES 30 MARTIN 1250 SALES 30 TURNER 1500 SALES 30 JAMES 950 SALES 30 MILLER1 1300 15 rows selected. B2=>Left Outer Joins Fetch all rows from LEFT side table of OUTER JOIN clause As we can see from Above example that there is no entry for DEPTNO 40 in EMP1 table, and we want that in our report. SQL> select E.ENAME,E.SAL,D.DNAME,D.DEPTNO FROM DEPT1 D LEFT OUTER JOIN EMP1 E ON E.DEPTNO=D.DEPTNO ORDER BY D.DEPTNO; ENAME SAL DNAME DEPTNO ---------- ---------- -------------- ---------- CLARK 2450 ACCOUNTING 10 MILLER 1300 ACCOUNTING 10 KING 5000 ACCOUNTING 10 FORD 3000 RESEARCH 20 SCOTT 3000 RESEARCH 20 JONES 2975 RESEARCH 20 SMITH 800 RESEARCH 20 ADAMS 1100 RESEARCH 20 WARD 1250 SALES 30 MARTIN 1250 SALES 30 TURNER 1500 SALES 30 JAMES 950 SALES 30 ALLEN 1600 SALES 30 BLAKE 2850 SALES 30 OPERATIONS 40 15 rows selected. C=> Semijoins A semijoin is a join between two data sets that returns a row from the first set when a matching row exists in the subquery data set. The database stops processing the second data set at the first match. The optimizer may choose a semijoin in the following circumstances: The statement uses either an IN or EXISTS clause. The statement contains a subquery in the IN or EXISTS clause. The IN or EXISTS clause is not contained inside an OR branch. SQL> SELECT DEPTNO, DNAME FROM DEPT1 WHERE EXISTS (SELECT 1 FROM EMP1 WHERE EMP1.DEPTNO = DEPT1.DEPTNO); 2 3 4 5 DEPTNO DNAME ---------- -------------- 10 ACCOUNTING 20 RESEARCH 30 SALES SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR()); PLAN_TABLE_OUTPUT ------------------------ SQL_ID 91rkb5stx3txy, child number 0 ------------------------------------- SELECT DEPTNO, DNAME FROM DEPT1 WHERE EXISTS (SELECT 1 FROM EMP1 WHERE EMP1.DEPTNO = DEPT1.DEPTNO) Plan hash value: 4203867091 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 6 (100)| | | 1 | MERGE JOIN SEMI | | 3 | 48 | 6 (17)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| DEPT1 | 4 | 52 | 2 (0)| 00:00:01 | | 3 | INDEX FULL SCAN | PK_DEPT1 | 4 | | 1 (0)| 00:00:01 | |* 4 | SORT UNIQUE | | 14 | 42 | 4 (25)| 00:00:01 | |* 5 | TABLE ACCESS FULL | EMP1 | 14 | 42 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("EMP1"."DEPTNO"="DEPT1"."DEPTNO") filter("EMP1"."DEPTNO"="DEPT1"."DEPTNO") 5 - filter("EMP1"."DEPTNO" IS NOT NULL) 25 rows selected. SQL> D=> Antijoins An antijoin is a join between two data sets that returns a row from the first set when a matching row does not exist in the subquery data set. The optimizer may choose an antijoin in the following circumstances: The statement uses either the NOT IN or NOT EXISTS clause. The statement has a subquery in the NOT IN or NOT EXISTS clause. The NOT IN or NOT EXISTS clause is not contained inside an OR branch. The statement performs an outer join and applies an IS NULL condition to a join column, as in the following example: SQL> SELECT DEPTNO, DNAME FROM DEPT1 WHERE not EXISTS (SELECT 1 FROM EMP1 WHERE EMP1.DEPTNO = DEPT1.DEPTNO); 2 3 4 5 DEPTNO DNAME ---------- -------------- 40 OPERATIONS SQL> SQL> SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR()); PLAN_TABLE_OUTPUT ---------------- SQL_ID 6ft31ctqwvqq4, child number 0 ------------------------------------- SELECT DEPTNO, DNAME FROM DEPT1 WHERE not EXISTS (SELECT 1 FROM EMP1 WHERE EMP1.DEPTNO = DEPT1.DEPTNO) Plan hash value: 2120297959 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 6 (100)| | | 1 | MERGE JOIN ANTI | | 1 | 16 | 6 (17)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| DEPT1 | 4 | 52 | 2 (0)| 00:00:01 | | 3 | INDEX FULL SCAN | PK_DEPT1 | 4 | | 1 (0)| 00:00:01 | |* 4 | SORT UNIQUE | | 14 | 42 | 4 (25)| 00:00:01 | |* 5 | TABLE ACCESS FULL | EMP1 | 14 | 42 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("EMP1"."DEPTNO"="DEPT1"."DEPTNO") filter("EMP1"."DEPTNO"="DEPT1"."DEPTNO") 5 - filter("EMP1"."DEPTNO" IS NOT NULL) 25 rows selected. SQL> E=>Cartesian Joins The database uses a Cartesian join when one or more of the tables does not have any join conditions to any other tables in the statement. The optimizer joins every row from one data source with every row from the other data source, creating the Cartesian product of the two sets. The optimizer uses a Cartesian join for two row sources only in specific circumstances. No join condition exists. If a Cartesian join appears in a query plan, it could be caused by an inadvertently omitted join condition. In general, if a query joins n tables, then n-1 join conditions are required to avoid a Cartesian join. TheORDERED
hint specifies a table before its join table is specified. Like SELECT /*+ORDERED*/ . SQL> EXPLAIN PLAN FOR SELECT * FROM DEPT1 D1,DEPT1 D2; Explained. SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY()); PLAN_TABLE_OUTPUT ------------------------ Plan hash value: 2590926802 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 16 | 640 | 10 (0)| 00:00:01 | | 1 | MERGE JOIN CARTESIAN| | 16 | 640 | 10 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | DEPT1 | 4 | 80 | 3 (0)| 00:00:01 | | 3 | BUFFER SORT | | 4 | 80 | 7 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL | DEPT1 | 4 | 80 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------ 11 rows selected. SQL>
© 2021 Ace2Oracle. All Rights Reserved | Developed By IBOX444