Total Pageviews

Wednesday 21 November 2012

Types of HASH functions used in Teradata?

Types of HASH functions used in Teradata?
SELECT HASHAMP (HASHBUCKET (HASHROW ())) AS “AMP#”, COUNT (*) FROM TABLE_NAME
GROUP BY 1 ORDER BY 2 DESC;

There are HASHROW, HASHBUCKET, HASHAMP and HASHBAKAMP.
The SQL hash functions are:
Ø  HASHROW (column(s))
Ø  HASHBUCKET (hashrow)
Ø  HASHAMP (hashbucket)
Ø  HASHBAKAMP (hashbucket)
Example:
Create Table emp
(
ID BIGINT NOT NULL GENERATED BY DEFAULT AS IDENTITY
           (START WITH 1
            INCREMENT BY 1
            MINVALUE -999999999999999999
            MAXVALUE 999999999999999999
            NO CYCLE),
empname varchar(20),
empdept varchar(10)
)
PRIMARY INDEX(ID);
insert into emp(empname,empdept) values('D','X');
insert into emp(empname,empdept) values('C','X');
insert into emp(empname,empdept) values('B','X');
insert into emp(empname,empdept) values('A','X');
insert into emp(empname,empdept) values('A','X');
SELECT
            HASHROW (EMP.empname)   AS "Hash Value"
            , HASHBUCKET (HASHROW (EMP.empname)) AS "Bucket Num"
            , HASHAMP (HASHBUCKET (HASHROW (EMP.empname))) AS "AMP Num"
            , HASHBAKAMP (HASHBUCKET (HASHROW (EMP.empname)))  AS "AMP Fallback Num"
            ,EMP.empname
FROM EMP;
This is really good, by looking into the result set of above written query you can easily find out the Data Distribution across all AMPs in your system and further you can easily identify un-even data distribution

Monday 19 November 2012

Sql Query to print Calender Month in Teradata

Sql Query to print Calender Month in Teradata

SELECT
MAX(CASE WHEN day_of_week=1 THEN day_of_month ELSE NULL END     ) SUN
,MAX(CASE WHEN day_of_week=2 THEN day_of_month ELSE NULL END   ) MON
,MAX(CASE WHEN day_of_week=3 THEN day_of_month ELSE NULL END   ) TUE
,MAX(CASE WHEN day_of_week=4 THEN day_of_month ELSE NULL END   ) WED
,MAX(CASE WHEN day_of_week=5 THEN day_of_month ELSE NULL END   ) THU
,MAX(CASE WHEN day_of_week=6 THEN day_of_month ELSE NULL END   ) FRI
,MAX(CASE WHEN day_of_week=7 THEN day_of_month ELSE NULL END   ) SAT
FROM sys_calendar.calendar
WHERE year_of_calendar='2013'
AND      month_of_year='03'
GROUP BY week_of_month
ORDER BY week_of_month;
Pls post u r feedback :-)

Thursday 15 November 2012

String Reversal in Teradata Sql

String Reversal in Teradata Sql:

CREATE
TABLE Dummy_Emp(EMP_ID INTEGER NOT NULL ,EMP_NAME VARCHAR(30) NOT NULL);


INSERT INTO Dummy_Emp VALUES(1, 'RAJGOPAL GURRAPUSHALA');
INSERT INTO Dummy_Emp VALUES(2, 'RAJGOPAL');
INSERT INTO Dummy_Emp VALUES(3, 'GURRAPUSHALA');
INSERT INTO Dummy_Emp VALUES(4, '');

  WITH
RECURSIVE REVNAME(EMP_ID, EMP_NAME, NAMELEN, LVL, RNAME)
AS(SELECT EMP_ID, EMP_NAME, CHARACTER_LENGTH(EMP_NAME), 1(INTEGER), '' (VARCHAR(30))
FROM Dummy_Emp --WHERE EMP_ID=2
  UNION
ALL  SELECT
EMP_ID, EMP_NAME, NAMELEN, LVL+1, SUBSTRING(EMP_NAME FROM LVL FOR 1) || RNAMEFROM REVNAMEWHERE LVL <= NAMELEN
)SELECT EMP_ID, EMP_NAME, RNAMEFROM REVNAMEWHERE NAMELEN+1 = LVLORDER BY 1;

Sunday 11 November 2012

What are the different return codes (severity errors) in Teradata utilities?

What are the different return codes (severity errors) in Teradata utilities?
There are 3 basic return codes (severity errors) in teradata utilities.
  • 0 - success
  • 4 - Warning
  • 8 - User error
  • 12 - System error
  • 16 - system error  
 
Please note that apart from this there are separate error codes for each of the error  returned from sql queries.
 
For more details on error codes and fixing them please refer  to General reference manual  available from teradata documentation section in the teradata site.

How to find out list of indexes in Teradata?

How to find out list of indexes in Teradata?
IndexType
Description
P
Nonpartitioned Primary
Q
Partitioned Primary
S
Secondary
J
join index
N
hash index
K
primary key
U
unique constraint
V
value ordered secondary
H
hash ordered ALL covering secondary
O
valued ordered ALL covering secondary
I
ordering column of a composite secondary index
M
Multi column statistics
D
Derived column partition statistics
1
field1 column of a join or hash index
2
field2 column of a join or hash index

What is error table? What is the use of error table?

What is error table? What is the use of error table?

Answers:

The Error Table contains information concerning:

  1. Data conversion errors Constraint violations and other error conditions:

  2. Contains rows which failed to be manipulated due to constraint violations or Translation error

  3. Captures rows that contain duplicate Values for UPIs.

  4. It logs errors & exceptions that occurs during the apply phase.

  5. It logs errors that are occurs during the acquisition phase.

How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table?

How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table?
Answers:
Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate.
If it’s a duplicate it silently skips it without throwing any error.

What are the different types of temporary tables in Teradata?

What are the different types of temporary tables in Teradata?
Answers:
a.       Global temporary tables
b.       Volatile temporary tables
c.       Derived tables
Global Temporary tables (GTT) –
1. When they are created, its definition goes into Data Dictionary.
2. When materialized data goes in temp space.
3. That's why, data is active up to the session ends, and definition will remain there up-to its not dropped using Drop table statement. If dropped from some other session then its should be Drop table all;
4. You can collect stats on GTT.
5. Defined with the CREATE GLOBAL TEMPORARY TABLE sql
Volatile Temporary tables (VTT) -
1. Local to a session (deleted automatically when the session terminates)
2. Table Definition is stored in System cache .A permanent table definition is stored in the DBC data dictionary database (DBC.Temptables).
3. Data is stored in spool space.
4. That’s why; data and table definition both are active only up to session ends.
5. No collect stats for VTT( TD13 Onwards Collect stats Option is available) . If you are using volatile table, you cannot put the default values on column level (while creating table)
6. Created by the CREATE VOLATILE TABLE sql statement
Derived tables
1 Derived tables are local to an SQL query.
2 Not included in the DBC data dictionary database, the definition is kept in cache.
3 They are specified on a query level with an AS keyword in an sql statement

How do you find out number of AMP's in the given system?

How do you find out number of AMP's in the given system?
Answer
1.running following query in queryman
Select HASHAMP () +1;
2. We can find out complete configuration details of nodes and amps in configuration screen of Performance monitor

How many error tables are there in fload and Mload and what is their significance/use?

How many error tables are there in fload and Mload and what is their significance/use?

Answers:
Fload uses 2 error tables
ET TABLE 1: where format of data is not correct.
ET TABLE 2: violations of UPI
It maintains only error field name, error code and data parcel only.
Mload also uses 2 error tables (ET and UV), 1 work table and 1 log table
1. ET TABLE - Data error
MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task.
2. UV TABLE - UPI violations
MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task
Apart from error tables, it also has work and log tables
3. WORK TABLE - WT
Mload loads the selected records in the work table
4. LOG TABLE
A log table maintains record of all checkpoints related to the load job, it is essential/mandatory to specify a log table in mload job. This table will be useful in case you have a job abort or restart due to any reason.

Friday 9 November 2012

what are different types of journals in teradata?

what are different types of journals in teradata?

There are  3 different types of journals available in Teradata. They are

1. Transient Journal  - This maintains current transaction history. Once the query is successful it deletes entries from its table .  If the current query transaction fails, It rolls back data from its table.

2. Permanent Journal  - This is defined when a table is created.  It can store BEFORE or AFTER image of tables. DUAL copies can also be stored. Permanent Journal maintains Complete history of table.

3.Down AMP recovery Journal (DARJ)  - This journal activates when the AMP which was supposed to process goes down.  This journal will store all entries for AMP which went down.  Once that AMP is up, DARJ copies all entries to that AMP and makes that AMP is sync with current environment.

How to find duplicates in a table?

How to find duplicates in a table?

To find duplicates in the table , we can use group by function on those columns which are to be used and then listing them if their count is >1 .

Following sample query can be used to find duplicates in table having  3 columns
select col1, col2,col3, count(*) from table
group by col1, col2, col3 
having count  (*) > 1 ;

How do you see a DDL for an existing table in Teradata?

How do you see a DDL for an existing table in Teradata?


By using show table command  as follows
show table tablename ;

This will display DDL structure of table along with following details
Fallback
Before/after journal
set/multiset table type
Index(PI,SI,PPI)
details about column - datatype ,default,identity/primary -foreign key .

How do you find the No of AMP 's in the teradata Database?

How do you find the No of AMP 's in the teradata Database?


1.) You can do a SELECT HASHAMP()+1 to get the total number of Amps in the  given teradata system.
2.) Details about amps can also be checked in Configuration management on  teradata PMON tool.

What is RAID, What are the types of RAID?


Redundant Array of Inexpensive Disks (RAID) is a type of protection available in Teradata. RAID  provides Data protection at the disk Drive level. It ensures data is available even when the disk drive had failed.

There are around 6 levels of RAID ( RAID0 to RAID5) . 
Teradata supports two levels of RAID protection
RAID 1 - Mirrored copy of data in other disk
RAID 5 - Parity bit (XOR) based Data protection on each disk array.

One of the major overhead's of RAID is Space consumption

What is hash collision?

What is hash collision?


This occurs when there is same hash value generated for two different Primary Index Values. It is a rare occurrence and has been taken care in future versions of TD.

What are different types of Spaces available in Teradata ?

What are different types of Spaces available in Teradata ?

There are 3 types of Spaces available in teradata ,they are

1. Perm space
-This is disk space used for storing user data rows in any tables located on the database.
-Both Users & databases can be given perm space.
-This Space is not pre-allocated , it is used up when the  data rows are stored on disk.

2.Spool Space
-It is a  temporary workspace which is used for processing Rows for given SQL statements.
-Spool space is assigned only to users . -
-Once the SQL processing is complete the spool is freed and given to some other query.
-Unused Perm space is automatically available for Spool . 

3. TEMP space
-It is allocated to any databases/users where Global temporary tables are created and data is stored in them.
-Unused perm space is available for TEMP space

 

how do you list all the objects available in given database?

how do you list all the objects available in given database?

1.)select * from dbc.tables where databasename='<DATABASENAME>';

2.) By running a normal help command on that database as follows. 
help database <DATABASENAME';

Can a macro be called inside a macro?

Can a macro be called inside a macro?

Answer:
The main purpose of run a set of repeated sql queries.   Macro supports only DML queries .
Hence We cant call any-other macro or not even a procedure in a macro.

One trick to have closest functionality of this is to copy all the sql queries from macro2 to macro1 and add parameters if it is necessary as shown below.


replace macro1( val int)
as
(  sel * from employee_table where empid= :val );

replace macro2( dept_no int)
as
(  sel * from employee_table where deptno= :dept_no );

so, to call a macro2 inside a macro1..it is not possible. Hence follow this approach

Replace macro1 (val int, dept_no int)
as
(
sel * from employee_table where empid= :val;
sel * from employee_table where deptno= :dept_no ;
);

What is a join index ? What are benefits of using Join index?

What is a join index ? What are benefits of using Join index?

Answer:

It is a index that is maintained in a system .It maintains rows  joined on two or more tables. Join index is useful for queries where the index structure contains all the columns referenced by one or more joins, thereby allowing the index to cover all or part of the query

Benefits if using join index is
- to eliminate base table access
- Aggregate processing is eliminated by creating aggregate join index
- It reduces redistribution of data since data is materialized by JI.
- Reduces complex join conditions by using covering queries

What are the ways by which we can use zero to replace a null value for a given column ?

What are the ways by which we can use zero to replace a null value for a given column ?

Answer -

1. By using Teradata SQL supported command as follows
Select Col1, ZEROIFNULL(Col2)  from Table_name;

2. By using ANSI SQL command as follows
a. Coalesce
Select Col1,COALESCE(Col2,0)    from Table_name;

b.Case operator
select
Case When Col2 IS NOT NULL
Then Col2
Else 0
End
from Table_name;
It is always suggested to use ANSI standard while coding in Teradata , since any changes in Teradata version due to upgrade/patches installation will lead to
- Time for regression testing
- rework of code.

What is multivalued compression in Teradata?

What is multivalued compression in Teradata?

Multivalued compression or just MVC is a compression technique applied on columns in Teradata .  MVC has a unique feature of compressing up-to 255 distinct values per column in a given table. 

The advantage of compression are
  • Reduced Storage cost by storing more of a logical data than physical data.
  • Performance is greatly improves due to  reduced retrieval of physical data for that column.   
Tables having compression always have an advantage since optimizer considers reduced I/O as one of the factors for generating EXPLAIN plan.

What is the acceptable range for skew factor in a table?

What is the acceptable range for skew factor in a table?


There is no particular range for skew factor.  In case of production systems, it is suggested to keep skew factor between 5-10. 
There are various considerations for skew factor
- Number of AMPS
- Size of row in a table
- number of records in a table
- PI of a table
- Frequent access of table (Performance consideration)
- whether table getting loaded daily /monthly or how frequently data is being refreshed

Query to find skew factor of a particular table?

Query to find skew factor of a particular table?

SELECT
TABLENAME
,SUM(CURRENTPERM) /1024/1024 AS CURRENTPERM,
(100 - (AVG(CURRENTPERM)/MAX(CURRENTPERM)*100)) AS SKEWFACTOR
FROM
DBC.TABLESIZE
WHERE DATABASENAME= <DATABASENAME>
AND
TABLENAME =<TABLENAME> 
GROUP BY 1;

What are permanent journals in teradata?

What are permanent journals in teradata?

- Journals are used to capture information about table in Teradata.  In case of Permanent journals they capture details of Journal enabled tables in teradata   with all the pre transaction and post transaction details . 
- Journal tables are assigned PERM space and they reside in same database as of parent or they can reside on different database.
- They are mainly used for protection of data and sometimes  also for disaster recovery ( fallback is better in this case )
- Permanent journal tables can be enabled or disabled by running alter database <databasename> 'no journal' /' journal = <databasename.jrnltbl>'
- Arcmain  utility provides the feature of backing  up Journal tables
- We can find details about all journal tables present in  teradata  database using DBC.JOURNALS table.