TERADATA Blogger: 2012

Wednesday 21 November 2012

Types of HASH functions used in Teradata?

SELECT HASHAMP (HASHBUCKET (HASHROW ())) AS “AMP#”, COUNT (*) FROM TABLE_NAME

GROUP BY 1 ORDER BY 2 DESC;

There are HASHROW, HASHBUCKET, HASHAMP and HASHBAKAMP.

The SQL hash functions are:

Ø HASHROW (column(s))

Ø HASHBUCKET (hashrow)

Ø HASHAMP (hashbucket)

Ø HASHBAKAMP (hashbucket)

Example:

Create Table emp

(

ID BIGINT NOT NULL GENERATED BY DEFAULT AS IDENTITY

(START WITH 1

INCREMENT BY 1

MINVALUE -999999999999999999

MAXVALUE 999999999999999999

NO CYCLE),

empname varchar(20),

empdept varchar(10)

)

PRIMARY INDEX(ID);

insert into emp(empname,empdept) values('D','X');

insert into emp(empname,empdept) values('C','X');

insert into emp(empname,empdept) values('B','X');

insert into emp(empname,empdept) values('A','X');

SELECT

HASHROW (EMP.empname) AS "Hash Value"

, HASHBUCKET (HASHROW (EMP.empname)) AS "Bucket Num"

, HASHAMP (HASHBUCKET (HASHROW (EMP.empname))) AS "AMP Num"

, HASHBAKAMP (HASHBUCKET (HASHROW (EMP.empname))) AS "AMP Fallback Num"

,EMP.empname

FROM EMP;

This is really good, by looking into the result set of above written query you can easily find out the Data Distribution across all AMPs in your system and further you can easily identify un-even data distribution

Monday 19 November 2012

Sql Query to print Calender Month in Teradata

SELECT

MAX(CASE WHEN day_of_week=1 THEN day_of_month ELSE NULL END ) SUN

,MAX(CASE WHEN day_of_week=2 THEN day_of_month ELSE NULL END ) MON

,MAX(CASE WHEN day_of_week=3 THEN day_of_month ELSE NULL END ) TUE

,MAX(CASE WHEN day_of_week=4 THEN day_of_month ELSE NULL END ) WED

,MAX(CASE WHEN day_of_week=5 THEN day_of_month ELSE NULL END ) THU

,MAX(CASE WHEN day_of_week=6 THEN day_of_month ELSE NULL END ) FRI

,MAX(CASE WHEN day_of_week=7 THEN day_of_month ELSE NULL END ) SAT

FROM sys_calendar.calendar

WHERE year_of_calendar='2013'

AND month_of_year='03'

GROUP BY week_of_month

ORDER BY week_of_month;

Pls post u r feedback :-)

Thursday 15 November 2012

String Reversal in Teradata Sql

String Reversal in Teradata Sql:

CREATE

TABLE Dummy_Emp(EMP_ID INTEGER NOT NULL ,EMP_NAME VARCHAR(30) NOT NULL);

INSERT INTO Dummy_Emp VALUES(1, 'RAJGOPAL GURRAPUSHALA');
INSERT INTO Dummy_Emp VALUES(2, 'RAJGOPAL');
INSERT INTO Dummy_Emp VALUES(3, 'GURRAPUSHALA');
INSERT INTO Dummy_Emp VALUES(4, '');

WITH
RECURSIVE REVNAME(EMP_ID, EMP_NAME, NAMELEN, LVL, RNAME)
AS(SELECT EMP_ID, EMP_NAME, CHARACTER_LENGTH(EMP_NAME), 1(INTEGER), '' (VARCHAR(30))
FROM Dummy_Emp --WHERE EMP_ID=2
UNION
ALL SELECT
EMP_ID, EMP_NAME, NAMELEN, LVL+1, SUBSTRING(EMP_NAME FROM LVL FOR 1) || RNAMEFROM REVNAMEWHERE LVL <= NAMELEN
)SELECT EMP_ID, EMP_NAME, RNAMEFROM REVNAMEWHERE NAMELEN+1 = LVLORDER BY 1;

Sunday 11 November 2012

What are the different return codes (severity errors) in Teradata utilities?

There are 3 basic return codes (severity errors) in teradata utilities.

0 - success
4 - Warning
8 - User error
12 - System error
16 - system error

Please note that apart from this there are separate error codes for each of the error returned from sql queries.

For more details on error codes and fixing them please refer to General reference manual available from teradata documentation section in the teradata site.

How to find out list of indexes in Teradata?

IndexType	Description
P	Nonpartitioned Primary
Q	Partitioned Primary
S	Secondary
J	join index
N	hash index
K	primary key
U	unique constraint
V	value ordered secondary
H	hash ordered ALL covering secondary
O	valued ordered ALL covering secondary
I	ordering column of a composite secondary index
M	Multi column statistics
D	Derived column partition statistics
1	field1 column of a join or hash index
2	field2 column of a join or hash index

What is error table? What is the use of error table?

Answers:

The Error Table contains information concerning:

Data conversion errors Constraint violations and other error conditions:
Contains rows which failed to be manipulated due to constraint violations or Translation error
Captures rows that contain duplicate Values for UPIs.
It logs errors & exceptions that occurs during the apply phase.
It logs errors that are occurs during the acquisition phase.

How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table?

Answers:

Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate.

If it’s a duplicate it silently skips it without throwing any error.

What are the different types of temporary tables in Teradata?

Answers:

a. Global temporary tables

b. Volatile temporary tables

c. Derived tables

Global Temporary tables (GTT) –

1. When they are created, its definition goes into Data Dictionary.

2. When materialized data goes in temp space.

3. That's why, data is active up to the session ends, and definition will remain there up-to its not dropped using Drop table statement. If dropped from some other session then its should be Drop table all;

4. You can collect stats on GTT.

5. Defined with the CREATE GLOBAL TEMPORARY TABLE sql

Volatile Temporary tables (VTT) -

1. Local to a session (deleted automatically when the session terminates)

2. Table Definition is stored in System cache .A permanent table definition is stored in the DBC data dictionary database (DBC.Temptables).

3. Data is stored in spool space.

4. That’s why; data and table definition both are active only up to session ends.

5. No collect stats for VTT( TD13 Onwards Collect stats Option is available) . If you are using volatile table, you cannot put the default values on column level (while creating table)

6. Created by the CREATE VOLATILE TABLE sql statement

Derived tables

1 Derived tables are local to an SQL query.

2 Not included in the DBC data dictionary database, the definition is kept in cache.

3 They are specified on a query level with an AS keyword in an sql statement

How do you find out number of AMP's in the given system?

Answer

1.running following query in queryman

Select HASHAMP () +1;

2. We can find out complete configuration details of nodes and amps in configuration screen of Performance monitor

How many error tables are there in fload and Mload and what is their significance/use?

Answers:

Fload uses 2 error tables

ET TABLE 1: where format of data is not correct.

ET TABLE 2: violations of UPI

It maintains only error field name, error code and data parcel only.

Mload also uses 2 error tables (ET and UV), 1 work table and 1 log table

1. ET TABLE - Data error

MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task.

2. UV TABLE - UPI violations

MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task

Apart from error tables, it also has work and log tables

3. WORK TABLE - WT

Mload loads the selected records in the work table

4. LOG TABLE

A log table maintains record of all checkpoints related to the load job, it is essential/mandatory to specify a log table in mload job. This table will be useful in case you have a job abort or restart due to any reason.

Friday 9 November 2012

what are different types of journals in teradata?

There are 3 different types of journals available in Teradata. They are

1. Transient Journal - This maintains current transaction history. Once the query is successful it deletes entries from its table . If the current query transaction fails, It rolls back data from its table.

2. Permanent Journal - This is defined when a table is created. It can store BEFORE or AFTER image of tables. DUAL copies can also be stored. Permanent Journal maintains Complete history of table.

3.Down AMP recovery Journal (DARJ) - This journal activates when the AMP which was supposed to process goes down. This journal will store all entries for AMP which went down. Once that AMP is up, DARJ copies all entries to that AMP and makes that AMP is sync with current environment.

How to find duplicates in a table?

To find duplicates in the table , we can use group by function on those columns which are to be used and then listing them if their count is >1 .

Following sample query can be used to find duplicates in table having 3 columns
select col1, col2,col3, count(*) from table
group by col1, col2, col3
having count (*) > 1 ;

How do you see a DDL for an existing table in Teradata?

By using show table command as follows
show table tablename ;

This will display DDL structure of table along with following details
Fallback
Before/after journal
set/multiset table type
Index(PI,SI,PPI)
details about column - datatype ,default,identity/primary -foreign key .

How do you find the No of AMP 's in the teradata Database?

1.) You can do a SELECT HASHAMP()+1 to get the total number of Amps in the given teradata system.
2.) Details about amps can also be checked in Configuration management on teradata PMON tool.

What is RAID, What are the types of RAID?

Redundant Array of Inexpensive Disks (RAID) is a type of protection available in Teradata. RAID provides Data protection at the disk Drive level. It ensures data is available even when the disk drive had failed.

There are around 6 levels of RAID ( RAID0 to RAID5) .

Teradata supports two levels of RAID protection
RAID 1 - Mirrored copy of data in other disk
RAID 5 - Parity bit (XOR) based Data protection on each disk array.

One of the major overhead's of RAID is Space consumption

What is hash collision?

This occurs when there is same hash value generated for two different Primary Index Values. It is a rare occurrence and has been taken care in future versions of TD.

What are different types of Spaces available in Teradata ?

There are 3 types of Spaces available in teradata ,they are

1. Perm space
-This is disk space used for storing user data rows in any tables located on the database.
-Both Users & databases can be given perm space.
-This Space is not pre-allocated , it is used up when the data rows are stored on disk.

2.Spool Space
-It is a temporary workspace which is used for processing Rows for given SQL statements.
-Spool space is assigned only to users . -
-Once the SQL processing is complete the spool is freed and given to some other query.
-Unused Perm space is automatically available for Spool .

3. TEMP space
-It is allocated to any databases/users where Global temporary tables are created and data is stored in them.
-Unused perm space is available for TEMP space

how do you list all the objects available in given database?

1.)select * from dbc.tables where databasename='<DATABASENAME>';

2.) By running a normal help command on that database as follows.
help database <DATABASENAME';

Can a macro be called inside a macro?

Answer:
The main purpose of run a set of repeated sql queries. Macro supports only DML queries .
Hence We cant call any-other macro or not even a procedure in a macro.

One trick to have closest functionality of this is to copy all the sql queries from macro2 to macro1 and add parameters if it is necessary as shown below.

replace macro1( val int)
as
( sel * from employee_table where empid= :val );

replace macro2( dept_no int)
as
( sel * from employee_table where deptno= :dept_no );

so, to call a macro2 inside a macro1..it is not possible. Hence follow this approach

Replace macro1 (val int, dept_no int)
as
(
sel * from employee_table where empid= :val;
sel * from employee_table where deptno= :dept_no ;
);

What is a join index ? What are benefits of using Join index?

Answer:

It is a index that is maintained in a system .It maintains rows joined on two or more tables. Join index is useful for queries where the index structure contains all the columns referenced by one or more joins, thereby allowing the index to cover all or part of the query

Benefits if using join index is
- to eliminate base table access
- Aggregate processing is eliminated by creating aggregate join index
- It reduces redistribution of data since data is materialized by JI.
- Reduces complex join conditions by using covering queries

What are the ways by which we can use zero to replace a null value for a given column ?

Answer -

1. By using Teradata SQL supported command as follows
Select Col1, ZEROIFNULL(Col2) from Table_name;

2. By using ANSI SQL command as follows
a. Coalesce
Select Col1,COALESCE(Col2,0) from Table_name;

b.Case operator
select
Case When Col2 IS NOT NULL
Then Col2
Else 0
End
from Table_name;It is always suggested to use ANSI standard while coding in Teradata , since any changes in Teradata version due to upgrade/patches installation will lead to
- Time for regression testing
- rework of code.

What is multivalued compression in Teradata?

Multivalued compression or just MVC is a compression technique applied on columns in Teradata . MVC has a unique feature of compressing up-to 255 distinct values per column in a given table.

The advantage of compression are

Reduced Storage cost by storing more of a logical data than physical data.
Performance is greatly improves due to reduced retrieval of physical data for that column.

Tables having compression always have an advantage since optimizer considers reduced I/O as one of the factors for generating EXPLAIN plan.

What is the acceptable range for skew factor in a table?

There is no particular range for skew factor. In case of production systems, it is suggested to keep skew factor between 5-10.
There are various considerations for skew factor
- Number of AMPS
- Size of row in a table
- number of records in a table
- PI of a table
- Frequent access of table (Performance consideration)
- whether table getting loaded daily /monthly or how frequently data is being refreshed

Query to find skew factor of a particular table?

SELECT
TABLENAME
,SUM(CURRENTPERM) /1024/1024 AS CURRENTPERM,
(100 - (AVG(CURRENTPERM)/MAX(CURRENTPERM)*100)) AS SKEWFACTOR
FROM
DBC.TABLESIZE
WHERE DATABASENAME= <DATABASENAME>
AND
TABLENAME =<TABLENAME>
GROUP BY 1;

What are permanent journals in teradata?

- Journals are used to capture information about table in Teradata. In case of Permanent journals they capture details of Journal enabled tables in teradata with all the pre transaction and post transaction details .
- Journal tables are assigned PERM space and they reside in same database as of parent or they can reside on different database.
- They are mainly used for protection of data and sometimes also for disaster recovery ( fallback is better in this case )
- Permanent journal tables can be enabled or disabled by running alter database <databasename> 'no journal' /' journal = <databasename.jrnltbl>'
- Arcmain utility provides the feature of backing up Journal tables
- We can find details about all journal tables present in teradata database using DBC.JOURNALS table.

Total Pageviews

Wednesday 21 November 2012

Monday 19 November 2012

Thursday 15 November 2012

Sunday 11 November 2012

What is error table? What is the use of error table?

Answers:

The Error Table contains information concerning:

Data conversion errors Constraint violations and other error conditions:

Contains rows which failed to be manipulated due to constraint violations or Translation error

Captures rows that contain duplicate Values for UPIs.

It logs errors & exceptions that occurs during the apply phase.

It logs errors that are occurs during the acquisition phase.

How many error tables are there in fload and Mload and what is their significance/use?

Friday 9 November 2012

what are different types of journals in teradata?

How to find duplicates in a table?

How do you see a DDL for an existing table in Teradata?

How do you find the No of AMP 's in the teradata Database?

What is hash collision?

What are different types of Spaces available in Teradata ?

how do you list all the objects available in given database?

1.)select * from dbc.tables where databasename='<DATABASENAME>';2.) By running a normal help command on that database as follows. help database <DATABASENAME';

Can a macro be called inside a macro?

What is a join index ? What are benefits of using Join index?

What are the ways by which we can use zero to replace a null value for a given column ?

What is multivalued compression in Teradata?

What is the acceptable range for skew factor in a table?

Query to find skew factor of a particular table?

What are permanent journals in teradata?

1.)select * from dbc.tables where databasename='<DATABASENAME>';

2.) By running a normal help command on that database as follows.
help database <DATABASENAME';