Skip to main content

Cluster Table


Cluster Table is the another type of table which is supported by Oracle.

A cluster is a schema object that contains one or more tables that all have one or more columns in common. Rows of one or more tables that share the same value in these common columns are physically stored together within the database. A cluster is to store data for more than 1 table in the same block.

Tables which are created upon this cluster are called as “Cluster Table”.

Cluster Index is an index which is created for this cluster.

HOW TO VERIFY:
How to verify whether the created table is “Cluster Table” or not? Fire this query,

Select distinct cluster_name from user_all_tables where table_name IN ('EMPLOYEE',’DEPARTAMENT’)
ð  This query would return the name of the cluster (eg., query returns “CLUSTER_DEPT_ID” if it is the created cluster’s name)

LITTLE-KNOWN FACTS TO BE REMEMBERED:
·         Clustered Index doesn’t occupy any memory space though it is created. In this type only, index table is part of data table but deals only with the way data is being stored. It means, functionality of the clustered index in the cluster is, it guides/or force data table to store each distinct cluster key value’s related data from both the tables in the same block
·         If Cluster tables are referred in a query, then in the explain plan, you never see the step, ‘TABLE ACCESS BY ROWID” instead, you would see only “TABLE ACCESS CLUSTER
·         you cannot load data into any clustered tables until you create the cluster index else any DML statement would throw the run time errors
·         If you drop a existing cluster index, data in the cluster (including those cluster tables) remains but becomes unavailable until you create a new cluster index
·         Clusters are of two types, indexed cluster & hast cluster. (in the below eg, I have explained the former)
·         Value of the cluster key is stored only once in the block (this has been explained in the below eg.)

ADVANTAGE:
·         Retrival of data from Cluster Tables would be much faster since the all the related records for each distinct index values are stored adjacent to each other physically.
·         Logical I/O would be much faster since only the less number of blocks have to be referred for a query.
·         Physical I/O would be much faster since only the less number of blocks to be read from the datafile.
·         Amount of memory space required for the storage is reduced since value of the cluster key is only stored once.

DISADVANTAGE:
·         DML activity would consume a lot of time since it couldn’t blindly insert the data in a new block. All the new/or updated record has to find out the cluster key first, traverse to that particular block and then do the respective activity.

EXAMPLE:

Create an EMP table(columns are empid,empname,salary,deptid) and inserts 16 records. Create a DEPT table(columns are deptid,deptname) and insertes 4 records. The data tables will look like this.



After doing this, Fire this query against these tables where the requirement is to display the all the employee names working for the deptid:3 along with the department names.
select a.empname,b.deptname from emp a, dept b
where a.deptid = b.deptid and b.deptid = 3;

To execute this query, Oracle has to fetch all the blocks since all the blocks contain an Employee belonging to deptid:3. So, it has to fetch all the blocks of EMP table. So, the traffic would be very high. In order to reduce this traffic, Cluster has to be created.

This is the syntax to create the cluster table,

Creation of “Cluster”:
CREATE CLUSTER cluster_dept_id (
department_id number) SIZE 1024;

Creation of “Clustered Index”:
CREATE INDEX indx_dept_id ON CLUSTER cluster_dept_id;

Creation of “Clustered Tables”:
create table emp
(empid number,
 empname varchar2(100),
 salary number,
 deptid number)
 CLUSTER cluster_dept_id(deptid);

create table dept
(deptid number,
 deptname varchar2(100))
 CLUSTER cluster_dept_id(deptid);

After this, the cluster, clustered index and the cluster tables will be similar to like this,


After creating this cluster, fire the same query again,

Now, block3 contains all the required EMP and DEPT records since all the related records of each index values would be stored adjacent physically. Because of this, oracle has to fetch only one block (block3). So, query would return the result set very fast.

Now, explain table will look like this,
SELECT STATEMENT

Cardinality=4
…………….


  TABLE ACCESS CLUSTER
Object name=DEPT
Cardinality=1
   INDEX UNIQUE SCAN
Object name=INDX_DEPT_ID
Cardinality=1
……………………


   TABLE ACCESS CLUSTER
Object name=EMP
Cardinality=4
    INDEX UNIQUE SCAN
Object name=INDX_DEPT_ID
Cardinality=1

If you closely look this explain plan, you can see that “TABLE ACCESS BY ROWID” is not mentioned and only “TABLE ACCESS CLUSTER” is mentioned. In both the cases, cardinality of the index (INDX_DEPT_ID) is 1. So, all the related records of both the tables are grouped together for each distinct cluster values. For DEPT table, explain table has mentioned Cardinality as 1 it means only one department record(deptid:3) is referred. For EMP table, explain table has mentioned Cardinality as 4 it means only four employee records(empid:103,107,111,115) are referred.

Comments

Popular posts from this blog

Listener refused the connection due to NetExcepti​on

I was testing some piece of code for calculation of new date on the basis of a given pattern and the specified date. I wrote a method to automate the test cases to generates those patterns and calculate the new date for each date of the specified date of the range of years. Since there were around 1 million pattern test cases are possible, so I want to insert this data in database for any future reference. After creating a pattern I was inserting data of the pattern and the calculation date along with the calculated date. It was working fine. I was prepare to hit the start button now, after testing different patterns individually. I hit the run button and it started its executions, but in the middle, I got this error. java.sql.SQLException: Listener refused the connection with the following error:ORA-12516, TNS:listener could not find available handler with matching protocol stack       at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:4...

B*Tree Reverse index

B*Tree Reverse index is one of the index types that is supported by oracle. It more or less behaves like b*tree unique & non-unique index but only difference is the way it constructs the tree. In this index, index value is stored in the reverse direction when compared to the values in the actual data table. This index is used in order to distribute the index data evenly across all the leaf index nodes if table data is skewed very high. Like b*tree index, this index doesn’t understand NULL. This B*Tree Reverse index can be created on a single column or more than one column. More than one B*Tree Reverse index can be created for the same data table. HOW TO VERIFY: How to verify whether the created index is reverse index or not? Fire this query, select index_type from user_indexes where index_name = 'DEPT_ID_INDX' ð   This query would return “ NORMAL/REV ” (it means it is B*Tree index but of “Reverse” type) LITTLE-KNOWN FACTS TO BE REMEMBERED: ·      ...

Hash Natural Join

To perform hash join, Oracle follows these steps: 1.      Oracle chooses the smallest of two tables as the hash table (otherwise called as driving table). Oracle built the hash table in RAM after applying the hash function on the joining column(s) of the driving table. 2.      Oracle chooses the other [big] table as the probe table (otherwise called as driven table or probing table). It traverse through all the records of this probe table, applies the same hash function on the joining column(s) [column(s) used to join these two tables] and will hit the corresponding entry in the hash table. 3.      Oracle returns the output if a record from driving table is already present in the same hash key, else no record will be returned. It may look like Nested loop join & Hash join have the same architecture since both these have the concept of driving & driven tables but they have entirely different design if you closely look a...