A cluster table is another type of table supported by Oracle.
A cluster is a schema object that contains one or more tables that have one or more columns in common. Rows from these tables that share the same value in the common columns (the cluster key) are physically stored together in the database. In other words, a cluster stores data for more than one table in the same block.
Tables created in such a cluster are called “Cluster Tables”.
A cluster index is an index created on the cluster key of this cluster.
HOW TO VERIFY:
How do you verify whether a table was created as a “Cluster Table”? Run this query:
SELECT DISTINCT cluster_name FROM user_all_tables WHERE table_name IN ('EMP', 'DEPT');
=> The query returns the name of the cluster the tables belong to (e.g., it returns “CLUSTER_DEPT_ID” if that is the name of the cluster you created). For a table that is not part of any cluster, cluster_name is NULL.
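The same check can be approached from the cluster side. The sketch below assumes the standard data dictionary views USER_CLUSTERS and USER_CLU_COLUMNS are accessible in your schema; it lists each cluster with its type and shows which table column is mapped to which cluster key column.

```sql
-- One row per cluster owned by the current user.
-- cluster_type is expected to be INDEX or HASH.
SELECT cluster_name, cluster_type, key_size
FROM   user_clusters;

-- Mapping of cluster key columns to the columns of each clustered table.
SELECT cluster_name, clu_column_name, table_name, tab_column_name
FROM   user_clu_columns
ORDER  BY cluster_name, table_name;
```

For the example in this post, the second query would be expected to map DEPARTMENT_ID of the cluster to DEPTID in both EMP and DEPT.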
LITTLE-KNOWN FACTS TO BE REMEMBERED:
· A cluster index is a real index and does occupy storage, but it holds one entry per distinct cluster key value rather than one entry per row, so it is usually much smaller than a regular index on the same column. Each entry points to the block that holds the rows for that key value. Its role in the cluster is to let Oracle locate that block, which is how the rows of both tables that share a cluster key value end up stored in the same block.
· When cluster tables are accessed through the cluster key in a query, the explain plan shows the step “TABLE ACCESS CLUSTER” for them instead of “TABLE ACCESS BY ROWID”.
· You cannot load data into a table in an indexed cluster until the cluster index has been created; until then, any DML statement against the table fails with a run-time error.
· If you drop an existing cluster index, the data in the cluster (including its cluster tables) remains but becomes unavailable until you create a new cluster index.
· Clusters are of two types: indexed clusters and hash clusters. (The example below covers the former.)
· The value of the cluster key is stored only once in the block (this is illustrated in the example below).
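The rule that DML fails until the cluster index exists can be tried out with a throwaway cluster. This is only a sketch: the names demo_cluster, demo_t and demo_cluster_idx are hypothetical, and the error number in the comment is what I would expect to see rather than something taken from the original post.

```sql
-- Hypothetical scratch objects, unrelated to EMP/DEPT.
CREATE CLUSTER demo_cluster (k NUMBER) SIZE 512;

CREATE TABLE demo_t (k NUMBER, v VARCHAR2(10))
CLUSTER demo_cluster (k);

-- Expected to fail because the cluster index is missing
-- (typically ORA-02032: clustered tables cannot be used
--  before the cluster index is built).
INSERT INTO demo_t (k, v) VALUES (1, 'a');

-- After the cluster index is built, the same insert should succeed.
CREATE INDEX demo_cluster_idx ON CLUSTER demo_cluster;
INSERT INTO demo_t (k, v) VALUES (1, 'a');

-- Clean up the scratch objects.
DROP CLUSTER demo_cluster INCLUDING TABLES;
```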
ADVANTAGE:
· Retrieval of data from cluster tables can be much faster, since all the related records for each distinct cluster key value are stored physically adjacent to each other.
· Logical I/O is reduced, since fewer blocks have to be visited for a query.
· Physical I/O is reduced, since fewer blocks have to be read from the datafile.
· Less storage space is required, since the value of the cluster key is stored only once.
DISADVANTAGE:
· DML activity takes more time, since Oracle cannot simply insert the data into any new block. For every new or updated record, it has to determine the cluster key, locate the block for that key, and then perform the operation there.
EXAMPLE:
Create an EMP table (columns: empid, empname, salary, deptid) and insert 16 records. Create a DEPT table (columns: deptid, deptname) and insert 4 records. The data tables will look like this.
After doing this, run the following query against these tables, where the requirement is to display the names of all employees working in deptid 3 along with the department name.
select a.empname,b.deptname from emp a, dept b
where a.deptid = b.deptid and b.deptid = 3;
To execute this query, Oracle has to fetch every block of the EMP table, because each block contains an employee belonging to deptid 3. The resulting I/O traffic is high. To reduce this traffic, a cluster can be created.
This is the syntax to create the cluster, the cluster index and the cluster tables:
Creation of “Cluster”:
CREATE CLUSTER cluster_dept_id (
department_id number) SIZE 1024;
Creation of “Clustered Index”:
CREATE INDEX indx_dept_id ON CLUSTER cluster_dept_id;
Creation of “Clustered Tables”:
create table emp
(empid number,
empname varchar2(100),
salary number,
deptid number)
CLUSTER cluster_dept_id(deptid);
create table dept
(deptid number,
deptname varchar2(100))
CLUSTER cluster_dept_id(deptid);
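The post does not list the rows themselves. One possible way to load 4 departments and 16 employees is sketched below. The names and salaries are invented for illustration; only the row counts and the fact that empid 103, 107, 111 and 115 fall into deptid 3 follow the post. The cluster index must already exist before these inserts are run.

```sql
-- Hypothetical sample data: department names are made up.
INSERT INTO dept (deptid, deptname)
SELECT LEVEL, 'DEPT_' || LEVEL
FROM   dual
CONNECT BY LEVEL <= 4;

-- Hypothetical sample data: empid 101..116, deptid cycling 1,2,3,4,
-- so empid 103, 107, 111 and 115 land in deptid 3.
INSERT INTO emp (empid, empname, salary, deptid)
SELECT 100 + LEVEL,
       'EMP_' || LEVEL,
       1000 + 100 * LEVEL,
       MOD(LEVEL - 1, 4) + 1
FROM   dual
CONNECT BY LEVEL <= 16;

COMMIT;
```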
After this, the cluster, the cluster index and the cluster tables will look similar to this:
After creating the cluster, run the same query again.
Now block 3 contains all the required EMP and DEPT records, since all the related records for each cluster key value are stored physically adjacent to each other. Because of this, Oracle has to fetch only one block (block 3), so the query returns the result set very quickly.
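To see the co-location for yourself, you can ask Oracle which block each row lives in. The sketch below uses DBMS_ROWID.ROWID_BLOCK_NUMBER on the EMP and DEPT tables from this example. The actual block numbers depend on your database, and with a small data set several cluster key values may share one block, so treat the output as an indication rather than a proof.

```sql
-- For each deptid, show the block number(s) holding its EMP and DEPT rows.
-- Rows with the same deptid are expected to report the same block_no
-- in both tables when they are stored in the cluster.
SELECT 'EMP' AS source_table,
       deptid,
       DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) AS block_no,
       COUNT(*) AS row_count
FROM   emp
GROUP  BY deptid, DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
UNION ALL
SELECT 'DEPT',
       deptid,
       DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID),
       COUNT(*)
FROM   dept
GROUP  BY deptid, DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
ORDER  BY 2, 1;
```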
Now the explain plan will look like this:
SELECT STATEMENT | Cardinality=4 | |
……………. | ||
TABLE ACCESS CLUSTER | Object name=DEPT | Cardinality=1 |
INDEX UNIQUE SCAN | Object name=INDX_DEPT_ID | Cardinality=1 |
…………………… | ||
TABLE ACCESS CLUSTER | Object name=EMP | Cardinality=4 |
INDEX UNIQUE SCAN | Object name=INDX_DEPT_ID | Cardinality=1 |
If you look closely at this explain plan, you can see that “TABLE ACCESS BY ROWID” does not appear and only “TABLE ACCESS CLUSTER” is shown. In both cases, the cardinality of the index (INDX_DEPT_ID) is 1, which reflects that all the related records of both tables are grouped together under each distinct cluster key value. For the DEPT table, the explain plan shows a cardinality of 1, meaning only one department record (deptid 3) is accessed. For the EMP table, the explain plan shows a cardinality of 4, meaning only four employee records (empid 103, 107, 111, 115) are accessed.
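If you want to reproduce a plan like the one above, one common approach is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY. This is a sketch that assumes a plan table is available in your schema; the layout of the output will differ from the condensed listing shown in this post, and the optimizer may choose different steps depending on statistics and version.

```sql
-- Store the plan for the example query in the plan table.
EXPLAIN PLAN FOR
SELECT a.empname, b.deptname
FROM   emp a, dept b
WHERE  a.deptid = b.deptid
AND    b.deptid = 3;

-- Display the most recently explained plan.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```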