Indexing Tables : Building block of performance tuning, an overview
DEFINITION:
An INDEX is a schema object that contains an entry for each value that appears in the indexed column(s) of the table and provides direct & fast access to rows(i.e. table records). INDEX is nothing but to create a logical relationship table (i.e. index table) where the mapping of indexed value of all the table records and ROWID is captured.
Index table is totally different from normal table where latter contains the actual data but former is a logical mapping table which is mentioned above in order to quickly retrieve the table records. Index table doesn’t contain any actual data. Index table can be read or written only by Oracle. DB users can’t understand/ or view the index table since it is in oracle readable format.
Index can be created on a single column or on a combination of multiple columns. More than one index table can be created for a single data table. Index is mainly used to improve the performance of the database (especially SQL queries)
ADVANTAGE:
- Using Index, we can speedily find the address (ROWID – which is nothing but the physical address where the table record is stored in the datafile) where the data is stored
- In Index, random search is made to look for the indexed values instead of sequential search. Performance gain is achieved since the random search (which is used while searching for the data in index table) is much faster than the sequential search (which is used while searching for the data in data table)
- In some cases, if all the required columns to be displayed available in the index table itself, then Oracle don’t need to refer the actual data table at all.
DISADVANTAGE:
· It consumes considerable amount of memory since it needs the space to store the index table.
· DML operations (Insert, Update, Delete, Merge) will be affected and this operations take addition time to execute the DML statements. This additional time is consumed by Oracle in order to update the index table when user tries to do any DML activity in the actual table since index table should be in sync with the actual table.
LITTLE-KNOWN FACTS TO BE REMEMBERED:
- Whenever any index is created on a column of a table, then in the index table, all the index record would be sorted by the column’s values in the ascending order by default. Since this indexed value is in ascending order, oracle uses the random search algorithm (which is nothing but binary search in most cases) to refer the index records which reduces the data retrieval time significantly.
- External table – It is one of the specific table format which oracle alone supports. In this external table, you can store the data but Oracle doesn’t allow to create any type of indexes on this table. Reason is, if you want to create the index table, then you must know ROWID. But in this table format, Oracle stores the data outside the database, so ROWID becomes immaterial.
EXAMPLE:
Create an employee table and inserts 10 records. The table will look like this,
DATA TABLE:
ROWID | empid | column2 | column3 | jobid | deptid | column4 | colum5 |
AAAAA1 | 3 | .. | .. | MGR | DB | .. | .. |
AAAAA2 | 1 | .. | .. | TL | JPMC | .. | .. |
AAAAA3 | 7 | .. | .. | PL | CITI | .. | .. |
AAAAA4 | 9 | .. | .. | TL | DB | .. | .. |
AAAAA5 | 8 | .. | .. | TRAINEE | CITI | .. | .. |
AAAAA6 | 2 | .. | .. | PL | DB | .. | .. |
AAAAA7 | 5 | .. | .. | MGR | CITI | .. | .. |
AAAAA8 | 4 | .. | .. | PL | CITI | .. | .. |
AAAAA9 | 6 | .. | .. | MGR | SS | .. | .. |
AAAAA10 | 10 | .. | .. | PL | JPMC | .. | .. |
Fire this query against this table where the requirement is to display the all the attributes of an employee whose
Select * from emp where empid = 10;
Here, we are yet to create index. Now, to execute this query, Oracle will take 10 seconds (assume oracle takes 1 second for single data table record search).
Why it is taking 10 seconds? Reason is, Oracle follows the sequential search when it looks for the data in the data table. Here, we are looking for the empid, 10. Since oracle is searching in the data table, it can’t go directly to 10th record. Oracle starts from first record and then search record by record till it reaches the last record of the same table.
Problem here is, oracle has unnecessarily referred the first 9 table records which are not meant to be referred for this sql.
In order to overcome this issue, we have to create the index on empid column.
Create an index on this table for empid column. (create unique index emp_id_indx on emp(empid)).
Index table will logically look like this,
INDEX TABLE:
INDEX | ROWID |
1 | AAAAA2 |
2 | AAAAA6 |
3 | AAAAA1 |
4 | AAAAA8 |
5 | AAAAA7 |
6 | AAAAA9 |
7 | AAAAA3 |
8 | AAAAA5 |
9 | AAAAA4 |
10 | AAAAA10 |
First column (INDEX) : it stores all the values of empid column in the ascending order.
Second column (ROWID) : it stores the ROWID of the corresponding record.
After creating this index table, fire the same query again,
Select * from emp where empid = 10;
When oracle executes this sql, first it looks for any index which has already been created on “empid” column. It comes to know that the index, “emp_id_indx” has already been created on “empid” column. So, oracle refers the index table first before hitting the actual data table. Since we are looking for the empid:10, oracle hits index table first and get the corresponding ROWID (AAAAA10). When oracle searches for the value in this index table, it follows random search (i.e. binary search) to go to the corresponding index record. Using this binary search, Oracle is intelligent enough to go to the 10th record directly in index table. So, it won’t touch the remaining 9 index records.
After the getting the required ROWID (AAAAA10), oracle directly refers the 10th record of data table since it knows the exact location (ROWID of 10th record is AAAAA10). With this, it doesn’t refer the remaining 9 table records.
Since this index table is used, oracle will take only 2 seconds (assume 1 second to get the ROWID of 10th index record since it follows binary search + 1 second to retrieve the actual table data record since we know the exact ROWID which is retrieved in the previous step)
Comments
Post a Comment