Ace2oracle | Histogram Column Statistic

Category - TUNING

Histogram - Column Statistic

Histograms - Statistics of Column

 A histogram is a type of Column statistics that provides more detailed information about the data distribution in a table column. 
 A histogram sorts values into "buckets".

Below terms and definition used in Doc

NDV - Number of distinct values in a column.
Data Skew – Distribution of Distinct value in column with huge variations. Like for Gender Y value is 80 rows and for Gender X value is 20 rows out of 100 rows. So this is skewed data.
Cardinality - The number of rows that is expected to be or is returned by an operation in an execution plan.

Why Histogram Coined in Oracle

By default the optimizer assumes a uniform distribution of rows across the distinct values in a column.

For columns that contain data skew (a nonuniform distribution of data within the column), a histogram enables the optimizer to generate accurate cardinality estimates for filter and join predicates that involve these columns.

Example => A table column contains 3 NDV as M,F,N and there are 9500 rows for M , 400 rows for F and 100 rows for N .When user queries for N values optimizer will estimate this as uniform distribution ,when there is no Histograms, resulting 10000/3 = 3333 rows and results to FULL tables scan. Whereas if Histogram is there will be index used.

When Oracle Database Creates Histograms

If DBMS_STATS gathers statistics for a table, and if queries have referenced the columns in this table, then Oracle Database creates histograms automatically as needed according to the previous query workload.

The basic process is as follows:

You run DBMS_STATS for a table with the METHOD_OPT parameter set to the default SIZE AUTO.
A user queries the table.
The database notes the predicates in the preceding query and updates the data dictionary table SYS.COL_USAGE$.
You run DBMS_STATS again, causing DBMS_STATS to query SYS.COL_USAGE$ to determine which columns require histograms based on the previous query workload.

Consequences of the AUTO feature include the following:

As queries change over time, DBMS_STATS may change which statistics it gathers. For example, even if the data in a table does not change, queries and DBMS_STATS operations can cause the plans for queries that reference these tables to change.
If you gather statistics for a table and do not query the table, then the database does not create histograms for columns in this table. For the database to create the histograms automatically, you must run one or more queries to populate the column usage information in SYS.COL_USAGE$.

Types of Histograms

1. Frequency
2. Hight-Balanced
3. Hybrid Histogram
4. Top n Frequency

How Oracle Database Chooses the Histogram Type

The histogram formula uses the following variables:

NDV => This represents the number of distinct values in a column. For example, if a column only contains the values 100, 200, and 300, then the NDV for this column is 3.
n => This variable represents the number of histogram buckets. The default is 254.
p => This variable represents an internal percentage threshold that is equal to (1–(1/n)) * 100. For example, if n = 254, then p is 99.6.
An additional criterion is whether the estimate_percent parameter in the DBMS_STATS statistics gathering procedure is set to AUTO_SAMPLE_SIZE (default).

Decision Tree for Histogram Creation

Cardinality Algorithms When Using Histograms

For histograms, the algorithm for cardinality depends on factors such as the endpoint numbers and values, and whether column values are popular or nonpopular.

1 => Endpoint Numbers and Values

An endpoint number is a number that uniquely identifies a bucket. In frequency and hybrid histograms, the endpoint number is the cumulative frequency of all values included in the current and previous buckets.

For example, a bucket with endpoint number 100 means the total frequency of values in the current and all previous buckets is 100. In height-balanced histograms, the optimizer numbers buckets sequentially, starting at 0 or 1. In all cases, the endpoint number is the bucket number.

An endpoint value is the highest value in the range of values in a bucket. For example, if a bucket contains only the values 52794 and 52795, then the endpoint value is 52795.

2 => Popular and Nonpopular Values

The popularity of a value in a histogram affects the cardinality estimate algorithm.
Specifically, the cardinality estimate is affected as follows:

Popular values

A popular value occurs as an endpoint value of multiple buckets. The optimizer determines whether a value is popular by first checking whether it is the endpoint value for a bucket. If so, then for frequency histograms, the optimizer subtracts the endpoint number of the previous bucket from the endpoint number of the current bucket. Hybrid histograms already store this information for each endpoint individually. If this value is greater than 1, then the value is popular.

The optimizer calculates its cardinality estimate for popular values using the following formula

cardinality of popular value = (num of rows in table) * (num of endpoints spanned by this value / total num of endpoints)

The optimizer calculates its cardinality estimate for popular values using the following formula

Nonpopular values

Any value that is not popular is a nonpopular value. The optimizer calculates the cardinality estimates for nonpopular values using the following formula:

cardinality of nonpopular value = (num of rows in table) * density

The optimizer calculates density using an internal algorithm based on factors such as the number of buckets and the NDV. Density is expressed as a decimal number between 0 and 1. Values close to 1 indicate that the optimizer expects many rows to be returned by a query referencing this column in its predicate list. Values close to 0 indicate that the optimizer expects few rows to be returned.

3 => Bucket Compression

In some cases, to reduce the total number of buckets, the optimizer compresses multiple buckets into a single bucket.

For example, the following frequency histogram indicates that the first bucket number is 1 and the last bucket number is 23:

Several buckets are "missing." Originally, buckets 2 through 6 each contained a single instance of value 52793. The optimizer compressed all of these buckets into the bucket with the highest endpoint number (bucket 6), which now contains 5 instances of value 52793. This value is popular because the difference between the endpoint number of the current bucket (6) and the previous bucket (1) is 5. Thus, before compression the value 52793 was the endpoint for 5 buckets.

The following annotations show which buckets are compressed, and which values are popular:

ENDPOINT_NUMBER	ENDPOINT_VALUE
1	52792
6	52793
8	52794
9	52795
10	52796
12	52797
14	52798
23	52799

ENDPOINT_NUMBER	ENDPOINT_VALUE	Comments
1	52792	-> nonpopular
6	52793	-> buckets 2-6 compressed into 6; popular
8	52794	-> buckets 7-8 compressed into 8; popular
9	52795	-> nonpopular
10	52796	-> nonpopular
12	52797	-> buckets 11-12 compressed into 12; popular
14	52798	-> buckets 13-14 compressed into 14; popular
23	52799	-> buckets 15-23 compressed into 23; popular

TYPE OF HISTOGRAM

1=> Frequency Histograms

In a frequency histogram, each distinct column value corresponds to a single bucket of the histogram. Because each value has its own dedicated bucket, some buckets may have many values, whereas others have few.

An analogy to a frequency histogram is sorting coins so that each individual coin initially gets its own bucket. For example, all the Rupees 1 coins are in bucket 1, all the Rupees 2 coins are in bucket 2, all the Rupees 5 coins in bucket 3.

Criteria for Frequency Histograms

The database creates a frequency histogram when the following criteria are met:

NDV is less than or equal to n, where n is the number of histogram buckets (default 254).
The estimate_percent parameter in the DBMS_STATS statistics gathering procedure is set to either a user-specified value or to AUTO_SAMPLE_SIZE.

Starting in Oracle Database 12c, if the sampling size is the default of AUTO_SAMPLE_SIZE, then the database creates frequency histograms from a full table scan. For all other sampling percentage specifications, the database derives frequency histograms from a sample.

Test Case for Frequency Histogram

For below example GENDER column has data skew .As shown below.

SQL> SELECT GENDER,COUNT(*) FROM EMP_DETAILS GROUP BY GENDER;

GENDER   COUNT(*)
---------- ----------
M          900
F          300
N          100

SQL>

SQL> SELECT COUNT(*) FROM EMP_DETAILS;

  COUNT(*)
----------
      1300

Let we create Index on that column as below

SQL> SELECT INDEX_NAME FROM USER_INDEXES WHERE TABLE_NAME='EMP_DETAILS';

no rows selected

SQL> CREATE INDEX EMP_DETAILS_IDX ON EMP_DETAILS(GENDER);

Index created.

SQL> SELECT INDEX_NAME FROM USER_INDEXES WHERE TABLE_NAME='EMP_DETAILS';

INDEX_NAME
--------------------------------------------------------------------------------------------------------------------------------
EMP_DETAILS_IDX

SQL> SELECT TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,HISTOGRAM FROM USER_tAB_COL_STATISTICS WHERE TABLE_NAME='EMP_DETAILS';

TABLE_NAME         COLUMN_NAME      NUM_DISTINCT HISTOGRAM
-------------------- -------------------- ------------ ---------------
EMP_DETAILS         ID                     1300 NONE
EMP_DETAILS         GENDER                 3    NONE

Let we gather stats with DBMS_STATS.AUTO_SAMPLE_SIZE

SQL> exec dbms_stats.gather_table_stats(ownname=>'abhi_test',tabname=>'EMP_DETAILS',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,cascade=>TRUE,degree =>4);

PL/SQL procedure successfully completed.

Before Histogram

Let we check Cardinality estimate for GENDER='N'

SQL> EXPLAIN PLAN FOR SELECT * FROM EMP_DETAILS WHERE GENDER='N';

Explained.

SQL> 
SQL> SELECT * FROM DBMS_XPLAN.DISPLAY();

PLAN_TABLE_OUTPUT
----------------------------
Plan hash value: 1421366555

-------------------------------------------------------------------------------------------------------
| Id  | Operation			    | Name	      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		    |		      |   433 |  2598 |     3	(0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| EMP_DETAILS     |   433 |  2598 |     3	(0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN		    | EMP_DETAILS_IDX |   433 |       |     1	(0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
--------------------------------------------------

   2 - access("GENDER"='N')

14 rows selected.

SQL> SELECT TABLE_NAME,COLUMN_NAME,HISTOGRAM FROM USER_tAB_COL_STATISTICS WHERE TABLE_NAME='EMP_DETAILS';

TABLE_NAME	     COLUMN_NAME	  HISTOGRAM
-------------------- -------------------- ---------------
EMP_DETAILS	     ID 		  NONE
EMP_DETAILS	     GENDER		  NONE

Let we again execute stats gather and see if HISTOGRAMS developed

SQL> exec dbms_stats.gather_table_stats(ownname=>'abhi_test',tabname=>'EMP_DETAILS',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,cascade=>TRUE,degree =>4); 

PL/SQL procedure successfully completed.

Post Histogram

SQL>
SQL>
SQL>
SQL> SELECT TABLE_NAME,COLUMN_NAME,HISTOGRAM FROM USER_tAB_COL_STATISTICS WHERE TABLE_NAME='EMP_DETAILS';

TABLE_NAME      COLUMN_NAME   HISTOGRAM
-------------------- -------------------- ---------------
EMP_DETAILS      ID         NONE
EMP_DETAILS      GENDER    FREQUENCY

SQL> 

Now let we again Explain above query again and check Cardinality Estimate as below,

SQL> SELECT TABLE_NAME,COLUMN_NAME,HISTOGRAM FROM USER_tAB_COL_STATISTICS WHERE TABLE_NAME='EMP_DETAILS';

TABLE_NAME	     COLUMN_NAME	  HISTOGRAM
-------------------- -------------------- ---------------
EMP_DETAILS	     ID 		  NONE
EMP_DETAILS	     GENDER		  FREQUENCY

SQL> 
SQL> 
SQL>  EXPLAIN PLAN FOR SELECT * FROM EMP_DETAILS WHERE GENDER='N';

Explained.

SQL> 
SQL> 
SQL>  SELECT * FROM DBMS_XPLAN.DISPLAY();

PLAN_TABLE_OUTPUT
---------------------------------------------
Plan hash value: 1421366555

-------------------------------------------------------------------------------------------------------
| Id  | Operation			    | Name	      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		    |		      |   100 |   600 |     2	(0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| EMP_DETAILS     |   100 |   600 |     2	(0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN		    | EMP_DETAILS_IDX |   100 |       |     1	(0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
--------------------------------------------------

   2 - access("GENDER"='N')

14 rows selected.

SQL> SELECT TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,HISTOGRAM FROM USER_TAB_COL_STATISTICS WHERE TABLE_NAME='EMP_DETAILS';

TABLE_NAME	     COLUMN_NAME	  NUM_DISTINCT HISTOGRAM
-------------------- -------------------- ------------ ---------------
EMP_DETAILS	     ID 			  1300 NONE
EMP_DETAILS	     GENDER			     3 FREQUENCY

SQL> SELECT COUNT(*) FROM EMP_DETAILS;

  COUNT(*)
----------
      1300

SQL> SELECT GENDER,COUNT(*) FROM EMP_DETAILS GROUP BY GENDER;

G   COUNT(*)
- ----------
M	 900
F	 300
N	 100
SQL> 
SQL>  SELECT TABLE_NAME,COLUMN_NAME,ENDPOINT_NUMBER FROM USER_HISTOGRAMS WHERE TABLE_NAME='EMP_DETAILS' AND COLUMN_NAME='GENDER';

TABLE_NAME	     COLUMN_NAME	  ENDPOINT_NUMBER
-------------------- -------------------- ---------------
EMP_DETAILS	     GENDER			      300
EMP_DETAILS	     GENDER			     1200
EMP_DETAILS	     GENDER			     1300

SQL> 


So as we can see after HISTOGRAM build up Cardinality estimation got corrected as GENDER='N' are ~ 100 rows in table.

Generating a Frequency Histogram

To generate FREQUENCY HISTOGRAM below condition has to met.

1-> Data Skew in Column.
2-> NDV < n where as n=254 (default)
3-> DBMS_STATS must be used with auto_sampling as below,
exec dbms_stats.gather_table_stats(ownname=>'abhi_test',tabname=>'EMP_DATA',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,cascade=>TRUE,degree =>4);|

So after running query multiple times HISTOGRAM got created, once above criteria met.

2=> Top-n Frequency Histograms

A top frequency histogram is a variation on a frequency histogram that ignores nonpopular values that are statistically insignificant.

For example, if a pile of 1000 coins contains only a coin of 1 rupees then you can ignore the only 1 coin when sorting the coins into buckets. A top frequency histogram can produce a better histogram for highly popular values.

Criteria for Top Frequency Histograms

If a small number of values occupies most of the rows, then creating a frequency histogram on this small set of values is useful even when the NDV is greater than the number of requested histogram buckets. To create a better quality histogram for popular values, the optimizer ignores the nonpopular values and creates a top frequency histogram.

The database creates a TOP frequency histogram when the following criteria are met:

NDV is greater than n, where n is the number of histogram buckets (default 254).
The percentage of rows occupied by the top threshold p, where p is (1-(1/n))*100.
The estimate_percent parameter in the DBMS_STATS statistics gathering procedure is set to AUTO_SAMPLE_SIZE.
Data Skew

Test Case for Top-n Frequency Histogram

Here to demonstrate we need to shorten BUCKET Size i.e “n” .For that DBMS_STATS provide below method.

method_opt => 'FOR COLUMNS '

SQL> SELECT GENDER,COUNT(*) FROM EMP_TEST_TOPN GROUP BY GENDER;

GENDER	     COUNT(*)
---------- ----------
M		  500
F		  100
N		  400
S		  100
V		  500
T		  400

6 rows selected.

SQL> 
SQL> 
SQL> SELECT TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,HISTOGRAM FROM USER_TAB_COL_STATISTICS WHERE TABLE_NAME='EMP_TEST_TOPN';

TABLE_NAME	     COLUMN_NAME	  NUM_DISTINCT HISTOGRAM
-------------------- -------------------- ------------ ---------------
EMP_TEST_TOPN	     ID 			  2000 NONE
EMP_TEST_TOPN	     GENDER			     6 NONE

SQL> 
SQL> 
SQL> EXPLAIN PLAN FOR SELECT * FROM EMP_TEST_TOPN WHERE GENDER='S';

Explained.

SQL> SELECT * FROM DBMS_XPLAN.DISPLAY();

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
Plan hash value: 2365112915

-----------------------------------------------------------------------------------
| Id  | Operation	  | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |		  |   333 |  1998 |	4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMP_TEST_TOPN |   333 |  1998 |	4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------
   1 - filter("GENDER"='S')

13 rows selected.

SQL> 
SQL> 

Here NDV =6 so we need to have n as less than 6 ,To achieve this we use below porc.

SQL> exec dbms_stats.gather_table_stats(ownname=>'abhi_test',tabname=>'EMP_TEST',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,method_opt => 'FOR COLUMNS GENDER SIZE 5',cascade=>TRUE;

PL/SQL procedure successfully completed.
SQL> 
SQL> SELECT TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,HISTOGRAM FROM USER_TAB_COL_STATISTICS WHERE TABLE_NAME='EMP_TEST_TOPN';

TABLE_NAME	     COLUMN_NAME	  NUM_DISTINCT HISTOGRAM
-------------------- -------------------- ------------ ---------------
EMP_TEST_TOPN	     ID 			  2000 NONE
EMP_TEST_TOPN	     GENDER			     6 TOP-FREQUENCY

SQL> 
SQL> EXPLAIN PLAN FOR SELECT * FROM EMP_TEST_TOPN WHERE GENDER='S';

Explained.

SQL>  SELECT * FROM DBMS_XPLAN.DISPLAY();

PLAN_TABLE_OUTPUT
---------------------------------------
Plan hash value: 2365112915

-----------------------------------------------------------------------------------
| Id  | Operation	  | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |		  |   199 |  1194 |	4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMP_TEST_TOPN |   199 |  1194 |	4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------

   1 - filter("GENDER"='S')

13 rows selected.

Let we check Number of Buckets formed for this Table Columns as below .

SQL> 
SQL> SELECT TABLE_NAME,COLUMN_NAME,ENDPOINT_NUMBER FROM USER_HISTOGRAMS WHERE TABLE_NAME='EMP_TEST_TOPN'  AND COLUMN_NAME='GENDER';

TABLE_NAME	     COLUMN_NAME	  ENDPOINT_NUMBER
-------------------- -------------------- ---------------
EMP_TEST_TOPN	     GENDER				  1
EMP_TEST_TOPN	     GENDER			      501
EMP_TEST_TOPN	     GENDER			      901
EMP_TEST_TOPN	     GENDER			     1301
EMP_TEST_TOPN	     GENDER			     1801

As we can see there are 5 bucket avialble for GENDER colmn due to Method_opt value in DBMS_STATS.

SQL> 
SQL> SELECT GENDER,COUNT(*) FROM EMP_TEST_TOPN GROUP BY GENDER;

GENDER	     COUNT(*)
---------- ----------
M		  500
F		  100
N		  400
S		  100
V		  500
T		  400

6 rows selected.

Now let we verify requirement for TOP-Frequency .

1=> NDV > n  i.e  6>5

2=> Value of p

1/5=.2 => 1-.02 => .8 *100 => 80 %
P= 80%

The percentage of rows occupied by the top 5 frequent values is equal to or greater than threshold p,  90% > 80%.

3=> Height-Balanced Histograms (Legacy)

In a legacy height-balanced histogram, column values are divided into buckets so that each bucket contains approximately the same number of rows.

For example, if you have 99 coins to distribute among 4 buckets, each bucket contains about 25 coins. The histogram shows where the endpoints fall in the range of values.

Criteria for Height-Balanced Histograms

The database creates a height-balanced histogram when the following criteria are met:

NDV is greater than n, where n is the number of histogram buckets (default 254).
The estimate_percent parameter in the DBMS_STATS statistics gathering procedure is not set to AUTO_SAMPLE_SIZE.

Generating a Height-Balanced Histogram

Here to demonstrate we need to shorten BUCKET Size i.e “n” .For that DBMS_STATS provide below method. And we also need to set estimate_percent as per user value.
We are continuing with last expample as covered above.

SQL> exec dbms_stats.gather_table_stats(ownname=>'abhi_test',tabname=>'EMP_TEST_TOPN',estimate_percent=>90,method_opt => 'FOR COLUMNS GENDER SIZE 5',cascade=>TRUE,degree =>4);

PL/SQL procedure successfully completed.

SQL> SQL> 
SQL> SELECT TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,HISTOGRAM FROM USER_TAB_COL_STATISTICS WHERE TABLE_NAME='EMP_TEST_TOPN';

TABLE_NAME	     COLUMN_NAME	  NUM_DISTINCT HISTOGRAM
-------------------- -------------------- ------------ ---------------
EMP_TEST_TOPN	     ID 			  2000 NONE
EMP_TEST_TOPN	     GENDER			     6 HEIGHT BALANCED

SQL>

4=> Hybrid Histograms

A hybrid histogram combines characteristics of both height-based histograms and frequency histograms. This "best of both worlds" approach enables the optimizer to obtain better selectivity estimates in some situations.

The height-based histogram sometimes produces inaccurate estimates for values that are almost popular. For example, a value that occurs as an endpoint value of only one bucket but almost occupies two buckets is not considered popular.

To solve this problem, a hybrid histogram distributes values so that no value occupies more than one bucket, and then stores the endpoint repeat count value, which is the number of times the endpoint value is repeated, for each endpoint (bucket) in the histogram. By using the repeat count, the optimizer can obtain accurate estimates for almost popular values.

Criteria for Hybrid Histograms

The database creates a hybrid histogram when the following criteria are met:

NDV is greater than n, where n is the number of histogram buckets (default is 254).
The percentage of rows occupied by the top n frequent values is less than threshold p, where p is (1-(1/n))*100.
The estimate_percent parameter in the DBMS_STATS statistics gathering procedure is set to AUTO_SAMPLE_SIZE.

Generating a Height-Balanced Histogram

Here to demonstrate we need to shorten BUCKET Size i.e “n” .For that DBMS_STATS provide below method. And we also need to set estimate_percent as per DBMS_STATS.AUTO_SAMPLE_SIZE.
We are continuing with last expample as covered above.
SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>'abhi_test',tabname=>'EMP_TEST_TOPN',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,method_opt => 'FOR COLUMNS GENDER SIZE 2',cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL> SELECT TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,HISTOGRAM FROM USER_TAB_COL_STATISTICS WHERE TABLE_NAME='EMP_TEST_TOPN';

TABLE_NAME      COLUMN_NAME   NUM_DISTINCT HISTOGRAM
-------------------- -------------------- ------------ ---------------
EMP_TEST_TOPN      ID      2000 NONE
EMP_TEST_TOPN      GENDER        6 HYBRID

SQL> SELECT TABLE_NAME,COLUMN_NAME,ENDPOINT_NUMBER,ENDPOINT_REPEAT_COUNT FROM USER_HISTOGRAMS WHERE TABLE_NAME='EMP_TEST_TOPN'  AND COLUMN_NAME='GENDER';

TABLE_NAME         COLUMN_NAME      ENDPOINT_NUMBER ENDPOINT_REPEAT_COUNT
-------------------- -------------------- --------------- ---------------------
EMP_TEST_TOPN         GENDER                  100            100
EMP_TEST_TOPN         GENDER                 2000            500

SQL>
SQL>

Views Used for HISTOGRAMS

USER_TAB_COL_STATISTICS
USER_HISTOGRAMS
USER_TAB_COLUMNS

Category - TUNING

Histogram - Column Statistic

Follow us on: