Tuesday, October 23, 2012

Teradata - Performace Tuning

Use group by instead of distinct because distinct occupies more spool space

Use EXISTS instead of IN because EXISTS stops the process once it finds the value

Join Columns of same datatype

Use UNION ALL instead of UNION

Remove secondary indexes when the data is being loaded,updated or deleted

Avoid correlated queries,cross join, product join

use Delete ALL - normal delete have rollback, for delete all there is no rollback(no transient journal overhead)

Does SDLC changes when you use Teradata instead of Oracle ?


If Teradata is just a database then no change
IF we are using utilities then it changes
If the schema is in 3rd normal form then there won't be huge changes

Teradata Utilities

Teradata Utilities continues...

Teradata vs Other RDBMS

Teradata: Shared nothing(Ex: Each VPROC is associated with one VDisk)
Other: Shared Everything

Process Million of Instructions per second
Process Thousands of Instructions per second

Stores billions of rows
stores millions of rows

unconditional parallelism(Each AMP works with its own drive)
Conditional Parallelism

Better storage and fast retrieval(using indexes)
fast retrieval only

bulk load utilities(BTEQ,FLOAD,MLOAD,TPUMP)
limited utilities


Teradata Questions - SQL Queries


Number of nodes in a system

select count(distinct(nodeid)) from dbc.resusagescpu

To find database size

select sum(currentPerm) from DBC.diskspace where databasename='mydb'

To find table size

select sum(currentperm) from DBC.tablesize where databasename='mydb' and tablename='mytable'

To find version of terdata

select * from dbc.dbcinfo

To find CPU time consumed

select username,accountname,sum(CPUTIME) from dbc.ampusage where username='xyz' group by 1,2

To find the space used by each table in MB's in a database

select tablename,sum(currentperm)/1024/1024 MB
from DBC.AllSpace
where databasename='mydb'
group by 1

To identify skewness

SELECT
   DatabaseName,
   TableName,
   SUM(CurrentPerm) AS CurrentPerm ,
   SUM(PeakPerm) AS PeakPerm,
   (100 - (AVG(CurrentPerm)/MAX(CurrentPerm)*100)) AS SkewFactor
  FROM DBC.TableSize
 WHERE databasename =  database
   AND TableName IN ('SALES' , 'CUSTOMERS')
 GROUP BY 1,2
 ORDER BY SkewFactor DESC;

Difference between Teradata V12 and Teradata V2R5

Stored Procedure Enhancements
Multilevel Partitioning introduced
Supports SQL invocation from external stored procedures
Returning multiple resultsets from a stored procedure
Supports DSW(Destination Selection Word) of 20 bits also
BLOB and CLOB extended till to 10 GB
V2R5 - 512 nodes, TD12 - 1024 nodes


Multilevel Partitioning


create set table test1
(
empid integer,
empname varchar(30),
sdate date
)
PRIMARY INDEX(empid)
partition by
(
Range_N(empid between 1 and 10,sdate between DATE '2010-01-01' and DATE '2010-01-01')
Case_N(empid=10,empid=5,NOCASE,UNKNOWN)
)


Returning resultset from a stored procedure


CREATE PROCEDURE "SCHEMA"."GETRESULTSET" (
    IN "p1" VARCHAR(30))
DYNAMIC RESULT SETS 1
BEGIN    
DECLARE cur1 CURSOR WITH RETURN ONLY TO CLIENT FOR
 SELECT "Partition" FROM "SCHEMA".SessionInfo where username = p1;  
  OPEN cur1;  
END;


Most frequest Teradata Production or Development issues

Numeric Overflow Error Occurred


*to avoid this error you can use decimal(18,0)*
select cast(empsalarytotal as decimal(18,0)) from Databasename.Tablename


No more room in database

*because of poor Primary Index *

select TableName, sum(CurrentPerm) as ActualSpace,
count(*)*(max(CurrentPerm)-avg(CurrentPerm)) as WastedSpace


Spool Space Issue


To avoid ---

join the columns that are of same data types
filter the subqueries just to have the rows that joins
collect the statistics(diagnostic helpstats on for session)

bad character in format or data


While I am trying to run the following query .
SELECT
(case
when d.So_Qty is NULL then NULL
when ctr.So_Qty = 0 then NULL
else ctr.So_Qty end) Soq

I am getting the following error
2620 The format or data contains a bad character.

compare string to string

ctr.So_Qty='0'