The Selectivity - Why INDEX is Unselective.
The Selectivity plays major role in Selection on INDEX during Query Optimization. The selectivity represents fraction of rows in row-set results. The row-set could be table / views / joins from other tables. Selectivity is tightly coupled with PREDICATES as like where name=’abhi’. Selectivity ranges from 0.0 to 1.0. A selectivity of 0.0 means that no rows are selected from a row set, whereas a selectivity of 1.0 means that all rows are selected. A predicate becomes more selective as the value approaches 0.0 and less selective (or more unselective) as the value approaches 1.0. Selective and Unselective Indexes An index could become Selective if There are less Duplicate data in INDEXED columns. Data uniqueness is pretty good There are very less count of number for each distinct value in Index. An Index could become Unselective if There are large Duplicate data in INDEXES columns. Data uniqueness is very bad There are very large counts of numbers for each distinct value in INDEX. INDEX UNSELECTIVITY CAUSE FULL TABLE SCAN The Optimizer may choose FULL table scan on basis of INDEX selectivity. If Index in unselective it will choose FULL TABLE SCAN. DEMOSTRATION As below example states that there are total 2500 Rows in table EMP_NAMES and rows distribution as per screenshot. SQL> select emp_lname,count(*) from emp_names group by emp_lname; EMP_LNAME COUNT(*) ---------- ---------- REDDY 300 DIWAN 1500 TIWARI 500 SINGH 4200 There is index on column emp_lname as in screenshot. SQL> SET LINE 500 PAGES 500 SQL> COL INDEX_NAME FOR A30 SQL> COL COLUMN_NAME FOR A20 SQL> COL TABLE_NAME FOR A20 SQL> select index_name,column_name,table_name from dba_ind_columns where table_name='EMP_NAMES'; INDEX_NAME COLUMN_NAME TABLE_NAME ------------------------------ -------------------- -------------------- IDX_SELECTIVE EMP_LNAME EMP_NAMES Let we Predict SELECTIVITY with below SQL for predicates. SQL> @selectivity_check.sql EMP_LNAME COUNT_OF_EMP_LNAME TOTAL_RECORDS SELECTIVITY_RATIO SELECTIVITY ---------- ------------------ ------------- ----------------- ----------- REDDY 300 6500 .046153846 SELECTIVE DIWAN 1500 6500 .230769231 UNSELECTIVE TIWARI 500 6500 .076923077 SELECTIVE SINGH 4200 6500 .646153846 UNSELECTIVE Now we will select data for each and every values and test whether INDEX is being used or not as per SELECTIVITY. AS we select EMP_LNAME=“DIWAN” which is having SELECTIVITY of 0.2 which is NEAR to UNSELECTIVITY ,hence INDEX is NOT being used. SQL> EXPLAIN PLAN FOR SELECT * FROM EMP_NAMES WHERE EMP_LNAME='DIWAN'; Explained. SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY()); PLAN_TABLE_OUTPUT -------------- Plan hash value: 2618431616 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1497 | 14970 | 7 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| EMP_NAMES | 1497 | 14970 | 7 (0)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("EMP_LNAME"='DIWAN') 13 rows selected. Now as we used predicate EMP_LNAME=’REDDY’ which is having selectivity ratio of .04 ,hence INDEX is being used for that selection as per below screenshots. SQL> EXPLAIN PLAN FOR SELECT * FROM EMP_NAMES WHERE EMP_LNAME='REDDY'; Explained. SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY()); PLAN_TABLE_OUTPUT ---------- Plan hash value: 4062229589 ----------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 284 | 2840 | 3 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| EMP_NAMES | 284 | 2840 | 3 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_SELECTIVE | 286 | | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("EMP_LNAME"='REDDY') 14 rows selected. After using predicate EMP_LNAME=’SINGH’ results as below screenshot. SQL> EXPLAIN PLAN FOR SELECT * FROM EMP_NAMES WHERE EMP_LNAME='SINGH'; Explained. SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY()); PLAN_TABLE_OUTPUT ---------- Plan hash value: 2618431616 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4167 | 41670 | 7 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| EMP_NAMES | 4167 | 41670 | 7 (0)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("EMP_LNAME"='SINGH') 13 rows selected. Below is selectivity script ** this is only for testing and demo ** SQL> !cat selectivity_check.sql Select distinct EMP_LNAME,count(EMP_LNAME) COUNT_OF_EMP_LNAME,(SELECT COUNT(*) FROM EMP_NAMES) TOTAL_RECORDS , Count(EMP_LNAME)/(select count(*) from emp_names) SELECTIVITY_RATIO , CASE WHEN Count(EMP_LNAME)/(select count(*) from emp_names) > .1 THEN 'UNSELECTIVE' ELSE 'SELECTIVE' END AS SELECTIVITY FROM emp_names group by EMP_LNAME;
© 2021 Ace2Oracle. All Rights Reserved | Developed By IBOX444