TERADATA Blogger: February 2013

Monday 25 February 2013

Which is faster? Select * from table or select 'all Columns' from table??

Which is faster? Select * from table or select 'all Columns' from table??

Many of us would have come across a scenario where listing column names in select SQL was found to be faster than using select * from table command. This indeed is interesting.

The reason being,

In case of using "select * from table", an extra stage is added where * is replaced by column names by teradata and then it would fetch the data.

But using "select <all Columns > from table” eliminates this extra stage of verifying and fetching on columns from the table.

Hence it is always recommended to use "select <all Columns > from table”

Teradata HELP commands

Teradata HELP commands

Teradata provides one of the user friendly featured like HELP commands.

Let us see the various help commands and their uses.

HELP SESSION;
This command is used to display the user name, account name, logon date and time, current database name, collation code set and character set being used and also , transaction semantics, time zone and character set data.

HELP DATABASE <database-name> ;
This command is used to display the names of all the tables (T), views (V), macros (M), and triggers (G) stored in a database and table comments

HELP USER <user-name> ;
This command is used to display the names of all the tables (T), views (V), macros (M), and triggers (G) stored in a user area and table comments

HELP TABLE <table-name> ;
This command is used to display the column names, type identifier, and any user written comments on the columns within a table.

HELP VOLATILE TABLE ;
This command is used to display the names of all Volatile temporary tables active for the current user session.

HELP VIEW <view-name> ;
This command is used to display the column names, type identifier, and comments on the columns within a view.

HELP MACRO <macro-name> ;
This command is used to display the characteristics of parameters passed to it at execution time.

HELP TRIGGER <trigger-name> ;
This command is used to display details created for a trigger, like action time and sequence.

HELP PROCEDURE <procedure-name> ;
This command is used to display the characteristics of parameters passed to it at execution time.

HELP COLUMN <table-name>.* OR HELP COLUMN <view-name>.* OR HELP COLUMN <table-name>.<column-name>, .…;
This command is used to display detail data describing the column level characteristics.

HELP INDEX <table-name> ;
This command is used to display the indexes and their characteristics like unique or non-unique and the column or columns involved in the index. This information is used by the Optimizer to create a plan for SQL.

HELP STATISTICS <table-name> ;
This command is used to display values associated with the data demographics collected on the table. This information is used by the Optimizer to create a plan for SQL.

HELP CONSTRAINT <table-name>.<constraint-name> ;
This command is used to display the checks to be made on the data when it is inserted or updated and the columns are involved.

HELP 'SQL';
This command is used to display a list of all available SQL commands and functions.

HELP 'SQL <command>';
This command is used to display the basic syntax and options for the SQL command used in place of the <command>.

HELP 'SPL';
This command is used to display a list of available SPL commands.

HELP 'SPL <command>';
This command is used to display the basic syntax and options for the SPL command used in place of the <command>.

Distinct Vs group by in Teradata

There is always a debate going on when finding out unique values in a given table. The problem comes into picture when we use Group by or distinct to find it.
Both return same number of rows, but with some execute time difference between them. Execution time is always a very important factor considering performance as one of the major factors is teradata warehouse.

So which is more efficient - DISTINCT or GROUP BY?

Since DISTINCT redistributes the rows immediately, more data may move between the AMPs, where as GROUP BY that only sends unique values between the AMPs.
So, we can say that GROUP BY sounds more efficient.
    But when you assume that data is nearly unique in a table, GROUP BY will spend more time attempting to eliminate duplicates that do not exist at all.Therefore, it is wasting its time to check for duplicates the first time. Then, it must redistribute the same amount of data .

Let us see how these steps are used in each case for elimination of Duplicates
(can be found out using explain plan)

DISTINCT
1. It reads each row on AMP
2. Hashes the column value identified in the distinct clause of select statement.
3. Then redistributes the rows according to row value into appropriate AMP
4. Once redistribution is completed , it
    a. Sorts data to group duplicates on each AMP
    b. Will remove all the duplicates on each amp and sends the original/unique value

P.s: There are cases when "Error : 2646 No more Spool Space " . In such cases try using GROUP BY.

GROUP BY
1. It reads all the rows part of GROUP BY
2. It will remove all duplicates in each AMP for given set of values using "BUCKETS" concept
3. Hashes the unique values on each AMP
4. Then it will re-distribute them to particular /appropriate AMP's
5. Once redistribution is completed , it
    a. Sorts data to group duplicates on each AMP
    b. Will remove all the duplicates on each amp and sends the original/unique value

hence it is better to go for

GROUP BY - when Many duplicates

DISTINCT - when few or no duplicates

GROUP BY - SPOOL space is exceeded

Difference between count(*),count(column) and count(1) ?

Difference between count(*),count(column) and count(1) ?

Count is one of very important functions used in any database. But what many don’t know that is the result count we get from COUNT function might be different based on how it is used. Let us consider the example of count (*) and Count (1) and understand how there is some difference the usage.

CREATE MULTISET TABLE TEMP ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      X CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      Y CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC)
NO PRIMARY INDEX ;

INSERT INTO TEMP VALUES(,);

INSERT INTO TEMP VALUES(,'3');

INSERT INTO TEMP VALUES('2',);

INSERT INTO TEMP VALUES('1','1');

SELECT * FROM TEMP;

X	Y
2	?
?	3
?	?
1	1

SELECT COUNT(*), COUNT(x), COUNT(1) FROM TEMP;

Result Set:

Count(*)	Count(X)	Count(1)
4	2	4

Tuesday 5 February 2013

Why MultiLoad Utility supports only Non Unique Secondary Index(NUSI) in the Target Table ?

Whenever we define a Secondary index, an Secondary index subtable will be created.In case of UPI, when they go for has distribution subtable is in one AMP and actual data row pointed by USI subtable is in another AMP. So the AMPs need to communicate, which is not supported by Multiload. But in case of NUSI, the subtable and the references of the actual data will store in the same AMP hence AMPs no need to communicate here.

So in case of NUSI, AMPs work in parallal and hence Mload supports that.

What is the difference between Global temporary tables and Volatile temporary tables?

Global Temporary tables (GTT):-

1. Whenever we create GTT, its definition is stored into Data Dictionary.

2. Whenever we insert data into GTT,data is stored into temp space. So definition of the table will be active until we can delete using the drop table statement and data remains active up to end of the session only.

3. We can Collect statistics on Global temporary tables.

Volatile Temporary tables (VTT) :-

1. Whenever we create VTT, its Definition is stored into System cache.

2. Whenever we insert data into VTT, data is stored into spool space.So table definition and data both are remains active only up to session end only.
3. We cannot able to collect statistics on volatile tables.

Note: From TD13 Onwards we can collect temporary Stats on VTT Tables.

Teradata SQL basic commands

Teradata basic SQL commands commonly are divided into three categories:

1.Data Definition Language (DDL): - We can define and create database objects such as tables, macros, views, databases and users etc by using DDL commands.

SQL statement	Function
CREATE	To define a table, macro,view, index, trigger or stored procedure.
DROP	To remove a table, view, macro, index, trigger or stored procedure.
ALTER	To change table structure or protection definition.

2.Data Manipulation Language (DML): - Work with the data including tasks such as inserting data rows into a table and updating an existing row or performing queries on the data by using the DML commands.

SQL statement	Function
SELECT	To select data from one or more tables.
INSERT	To place a new row into a table.
UPDATE	To change data values in one or more existing rows.
DELETE	To remove one or more rows from a table.

3.Data Control Language (DCL): - Here we can perform the administrative tasks such as granting and revoking privileges to database objects or controlling ownership of those objects by using DCL commands.

SQL statement	Function
GRANT	To give user privileges.
REVOKE	To remove user privileges.
GIVE	To transfer database ownership.

Teradata HELP TABLE Command

The Teradata HELP TABLE command displays the information regarding name, data type, and comment (if applicable), of all columns in particular table:

HELP TABLE Customer_Service.emp;

ColumnName	Type	Comment
emp_number	I	System assigned identification
mgr_emp_number	I
dept_number	I	Department employee works in
job_code	I	Job classification designation
l_name	CF	Employee surname
f_name	CV	Employee given name
hire_date	DA	Date employee was hired
birth_date	DA
sal_amt	D	Annual compensation amount

Data type Representations are as follows:

Type Description

BF BYTE
BV VARBYTE

CF CHARACTER FIXED

CV CHARACTER VARIABLE

DA DATE

I1 BYTEINT

I2 SMALLINT

AT TIME

TS TIMESTAMP

D DECIMAL

I INTEGER

Monday 4 February 2013

What is Join? Explian about Inner and Outer Joins?

We can use Join condition whenever we need to retrieve data from two (or) more tables.

Join can be used in either Where or From clause. But Join specification in From clause is recommended. We can also use Having and Where clause for filtering the data.

Joins can be classified into following types:

ü 1.Inner Join (Equi Join & Natural Join)

ü 2.Outer Join (Left Outer Join, Right Outer Join & Full Join)

ü 3.Self Join

ü 4.Cross Join

Inner Join: To retrieve matched records we can use comparison operators (i.e. =, < >).Whenever we need to get the matched records from two tables,this type of Join is used.

Inner Join Also call as Natural Join or Equi Join.

Syntax:

SELECT emp.emp_id

, emp. sal

, ph.phone_no

FROM employee emp

INNER JOIN phone ph

ON emp.emp_id = ph. emp_id

WHERE emp.sal > 2000;

Outer Join: To retrieve matched and unmatched records from two tables we can use Outer Join. Unmatched rows will be displayed as NULL from both tables.

Outer Join can be classified as: Left Outer Join , Right Outer Join and Full Outer Join.

Left Outer Join: All matched and unmatched records from left table will be displayed and unmatched records will be displayed as NULL from left table.But from right table only matched records will be displayed as it is and unmatched records will not be displayed.

Syntax:

SELECT emp.emp_id

, emp. sal

, ph.phone_no

FROM employee emp

LEFT JOIN phone ph

ON emp.emp_id = ph.emp_id

WHERE emp.sal > 2000 ;

Right Outer Join: All Records from right table will be displayed either matched or unmatched and unmatched records will be displayed as NULL from right table.But from left table only matched records will be displayed as it is and unmatched records will not be displayed.

Syntax:

SELECT emp.emp_id

, emp. sal

, ph.phone_no

FROM employee emp

RIGHT JOIN phone ph

ON emp.emp_id = ph.emp_id

WHERE emp.sal > 2000 ;

Full Outer Join: All matched and unmatched records from both left table and right table will be displayed and unmatched records will be displayed as NULL from both the tables.

Cross Join: This Join returns all rows from the left table, each and every row from the left table is in combination with all rows from the right table. Cross join also called as Cartesian Product Join.

Syntax:

SELECT au.au_fname

, au.au_lname

, p.pub_name

FROM authors AS au

INNER JOIN publishers AS p

ON au.city = p.city

AND au.state = p.state

ORDER BY au.au_lname ASC

, au.au_fname ASC ;

Self Join: A table can be joined to itself is a self-join.

For example, we can use a self-join to find out the authors in New Zealand who live in the same ZIP Code area.

SELECT au.au_fname

, au.au_lname

, a.au_fname

, au2.au_lname

FROM authors au

INNER JOIN Jauthors a

ON au.zip = a.zip

WHERE au.city = ' New Zealand'

ORDER BY au.au_fname ASC

, au.au_lname ASC ;

TERADATA Blogger

Total Pageviews

Monday 25 February 2013

Which is faster? Select * from table or select 'all Columns' from table??

Teradata HELP commands

Distinct Vs group by in Teradata

Difference between count(*),count(column) and count(1) ?

Tuesday 5 February 2013

Why MultiLoad Utility supports only Non Unique Secondary Index(NUSI) in the Target Table ?

Why MultiLoad Utility supports only Non Unique Secondary Index(NUSI) in the Target Table ?

What is the difference between Global temporary tables and Volatile temporary tables?

What is the difference between Global temporary tables and Volatile temporary tables?

Teradata SQL basic commands

Teradata SQL basic commands

Teradata HELP TABLE Command

Teradata HELP TABLE Command

Monday 4 February 2013

What is Join? Explian about Inner and Outer Joins?

What is Join? Explian about Inner and Outer Joins?

About Me

Blog Archive