Total Pageviews

Monday, 4 February 2013

Advantages of OLAP Functions over Subqueries in Teradata

Advantages of OLAP Functions over Sub queries in Teradata
Significance of OLAP Features:
1. Execution Time of the Query is minimized to a significant level after using OLAP features.
2. Less complex query can be formulated by OLAP functions.
Description of the table [PRDT_RECVD_DT_TBL] used:

Table contains the 5 columns namely PRODUCT, BRAND, CATEGORY, SALES_CENTER (i.e Store), PRDT_RCVD_DATE. Consider 'Product, Brand, Category' as a MERCHANDISE.

High level Requirement Specification:

Ultimate Goal is to compute the Earliest Product Received Date. Earliest Product Received Date is the most recent date when majority of the sales center received the goods at the Merchandise level.

Code level Specification:
We need to build a table containing Merchandise with its Earliest Product Received Date from the source table which has only product received date [PRDT_RCVD_DATE].

Earliest Product Received Date is the date when particular Merchandise is received by maximum number of SALES CENTERS (read as STORES). For particular Merchandise, if the maximum number of SALES CENTERS is same for two different PRDT_RCVD_DATEs then we should take the earliest date.
For Example - If the product is received at 1000 different stores on 2 different dates say on 11/01/2010 & 15/01/2010, then the query should return 11/01/2010[earliest date amongst the two dates] as Earliest Product Received Date.

OLAP Features Used:

We have used the following OLAP features in our module to enhance the performance of the query.

The QUALIFY clause allows restriction of rows to be output in the final result. In the below query, the QUALIFY clause restricts the output to be based on product received by majority of the STORES and latest PRODUCT RECEIVED DATE and return the firstSTORE satisfying this criteria.

The PARTITION BY clause may be used in conjunction with a RANK function to change the scope of the ranking. PARTITION BY clause controls scope, i.e., rank sales within store. In the below query, the scope limits to MERCHANDISE.

The RANK () OVER ORDER BY clause is used to assign the ranking sequence.

  
Implementation of the above requirement with OLAP Functions:

SELECT DISTINCT A.PRODUCT, A.CATEGORY, A.BRAND, A.PRDT_RCVD_DATE
AS ERLST_RCVD_DATE
FROM 
(
            SELECT A.PRODUCT, A.CATEGORY, A.BRAND, PRDT_RCVD_DATE,
            COUNT(*) OVER(PARTITION BY A.PRODUCT, A.CATEGORY,
                                              A.BRAND, A.PRDT_RCVD_DATE) AS
                                              SLSCNTR_COUNT
            FROM PRDT_RECVD_DT_TBL A
) A
QUALIFY RANK() OVER (PARTITION BY A.PRODUCT, A.CATEGORY, A.BRAND
ORDER BY SLSCNTR_COUNT DESC, PRDT_RCVD_DATE ASC)=1;

Execution time: 9 min 14 sec

Implementation of the above requirement without OLAP Functions:

SELECT
T1.PRODUCT AS PRODUCT
,T1.CATEGORY AS CATEGORY
,T1.BRAND AS BRAND
,MIN (T1.PRDT_RCVD_DATE) AS ERLST_RCVD_DATE
FROM
(
                SELECT PRODUCT, CATEGORY, BRAND, PRDT_RCVD_DATE, 
                COUNT (SALES_CENTER) AS SLSCNTR_COUNT
                FROM PRDT_RECVD_DT_TBL
                GROUP BY PRODUCT, CATEGORY, BRAND, PRDT_RCVD_DATE
) T1
WHERE (T1.PRODUCT, T1.CATEGORY, T1.BRAND,T1.SLSCNTR_COUNT) IN
                (
                                SELECT T2.PRODUCT
                                                , T2.CATEGORY
                                                , T2.BRAND,
                                                MAX (T2.SLSCNTR_COUNT)
                                FROM
                                                (
                                                                SELECT PRODUCT
                                                                                ,CATEGORY
                                                                                ,BRAND
                                                                                ,PRDT_RCVD_DATE
                                                                                , COUNT (SALES_CENTER) AS SLSCNTR_COUNT
                                                                FROM PRDT_RECVD_DT_TBL
                                                                GROUP BY 1,2,3,4
                                                ) T2
                GROUP BY T2.PRODUCT, T2.CATEGORY, T2.BRAND
)
GROUP BY T1.PRODUCT, T1.CATEGORY, T1.BRAND

Execution Time:  Executed till 11 minutes and then aborted with spool space error as the query is space intensive.

Test the above query:
Create the sample table with the below CREATE statement. Load the table with the DATALOAD statements that follow & then one can test.

Table Creation:

CREATE SET TABLE ETL_TEMP_PROD_ANLYTC.PRDT_RECVD_DT_TBL, NO FALLBACK,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT
     (
      PRODUCT BYTEINT NOT NULL,
      SALES_CENTER SMALLINT NOT NULL,
      CATEGORY INTEGER NOT NULL,
      BRAND SMALLINT NOT NULL,
      PRDT_RCVD_DATE DATE FORMAT 'MM/DD/YYYY'
     )
PRIMARY INDEX (PRODUCT, SALES_CENTER, CATEGORY, BRAND);

Insert Query to load the tables:

Table: PRDT_RECVD_DT_TBL

INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
VALUES( '1' , '9101' , '777409' , '158' , '10/11/2000' );

INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
   VALUES( '1' , '9101' , '778878' , '158' , '05/16/2000' );
  
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '1478213' , '158' , '07/25/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '2768695' , '398' , '10/18/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '2771103' , '380' , '10/10/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '2772580' , '398' , '09/20/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '2772614' , '323' , '09/20/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '2785590' , '2220' , '10/11/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '2790228' , '3145' , '10/05/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '2790228' , '2402' , '10/05/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '1' , '9101' , '2790368' , '2048' , '10/05/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '4' , '9104' , '2026201' , '2048' , '12/04/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '4' , '9104' , '2690576' , '1123' , '10/16/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '4' , '9104' , '2691160' , '4515' , '10/16/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '4' , '9104' , '2691160' , '4473' , '10/16/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '4' , '9104' , '2691202' , '4507' , '09/08/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '4' , '9104' , '2692200' , '1040' , '12/09/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '4' , '9104' , '2768398' , '398' , '08/01/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
  VALUES( '4' , '9104' , '2781714' , '596' , '10/10/2000' );
 
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
   VALUES( '4' , '9104' , '2781722' , '5967' , '10/10/2000' );
  
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
   VALUES( '4' , '9104' , '2781748' , '554' , '10/10/2000' );
  
INSERT INTO PRDT_RECVD_DT_TBL
(PRODUCT,SALES_CENTER,CATEGORY,BRAND,PRDT_RCVD_DATE)
   VALUES( '4' , '9104' , '2781748' , '554' , '11/10/2000' );


 Result:

PRODUCT
CATEGORY
BRAND
ERLST_RCVD_DATE
1
777409
158
10/11/2000
1
778878
158
5/16/2000
1
1478213
158
7/25/2000
1
2768695
398
10/18/2000
1
2771103
380
10/10/2000
1
2772580
398
9/20/2000
1
2772614
323
9/20/2000
1
2785590
2220
10/11/2000
1
2790228
2402
10/5/2000
1
2790228
3145
10/5/2000
1
2790368
2048
10/5/2000
4
2026201
2048
12/4/2000
4
2690576
1123
10/16/2000
4
2691160
4473
10/16/2000
4
2691160
4515
10/16/2000
4
2691202
4507
9/8/2000
4
2692200
1040
12/9/2000
4
2768398
398
8/1/2000
4
2781714
596
10/10/2000
4
2781722
5967
10/10/2000




No comments:

Post a Comment