Simplify analytics on Amazon Redshift using PIVOT and UNPIVOT

Amazon Redshift is a fast, fully managed cloud data warehouse that makes it easy and cost-effective to analyze all of your data using standard SQL and your existing business intelligence (BI) tools.

Many customers look to build their data warehouse on Amazon Redshift, and they have many requirements where they want to convert data from row level to column level and vice versa. Amazon Redshift now natively supports PIVOT and UNPIVOT SQL operators with built-in optimizations that you can use for data modeling, data analysis, and data presentation. You can apply PIVOT and UNPIVOT to tables, sub-queries, and common table expressions (CTEs). PIVOT supports the COUNT, SUM, MIN, MAX, and AVG aggregate functions.

You can use PIVOT tables as a statistics tool that summarizes and reorganizes selected columns and rows of data from a dataset. The following are a few scenarios where this can be useful:

  • Group the values by at least one column
  • Convert the unique values of a selected column into new column names
  • Use in combination with aggregate functions to derive complex reports
  • Filter specific values in rows and convert them into columns or vice versa
  • Use these operators to generate multidimensional reporting

In this post, we discuss the benefits of PIVOT and UNPIVOT, and how you can use them to simplify your analytics in Amazon Redshift.

PIVOT overview

The following code illustrates the PIVOT syntax:

SELECT …
FROM
    (<get_source_data>)
    AS <alias_source_query>
PIVOT
(
    <agg_func>(<agg_col>)
    FOR [<pivot_col>]
    IN ( [pivot_value_first], [pivot_value_second],
    ... [pivot_value_last])
) AS <alias_pivot>
<optional ORDER BY clause>;

The syntax contains the following parameters:

  • <get_source_data> – The SELECT query that gets the data from the source table
  • <alias_source_query> – The alias for the source query that gets the data
  • <agg_func> – The aggregate function to apply
  • <agg_col> – The column to aggregate
  • <pivot_col> – The column whose value is pivoted
  • <pivot_value_n> – A list of pivot column values separated by commas
  • <alias_pivot> – The alias for the pivot table
  • <optional ORDER BY clause> – An optional parameter to apply an ORDER BY clause on the result set

The following diagram illustrates how PIVOT works.

PIVOT instead of CASE statements

Let’s look at an example of analyzing data from a different perspective than how it’s stored in the table. In the following example, book sales data is stored by year for each book. We want to look at the book_sales dataset by year and analyze whether any books were sold, and if so, how many books were sold for each title. The following screenshot shows our query.

The following screenshot shows our output.
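
The screenshots are not reproduced in this text. To follow along, a table shaped like book_sales could be set up roughly as follows. The column names (bookname, year, sales) come from the queries in this post; the column types and every row are invented purely for illustration.

```sql
-- Hypothetical setup: column types and all rows below are made up for illustration.
CREATE TABLE book_sales (
    bookname VARCHAR(50),
    year     INTEGER,
    sales    INTEGER
);

-- Some title/year combinations are deliberately left out, so a pivot by year
-- has no input row for them and returns NULL in those cells.
INSERT INTO book_sales (bookname, year, sales) VALUES
    ('LOTR',         2020, 500),
    ('LOTR',         2021, 350),
    ('GOT',          2020, 200),
    ('Harry Potter', 2021, 900),
    ('Sherlock',     2019, 150),
    ('Sherlock',     2021, 100);
```

With rows like these, each year becomes one output row once the data is pivoted by bookname, and a title with no sales in a given year shows up as NULL.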

Previously, you had to derive your desired result set using a CASE statement. This requires you to add an individual CASE statement with the column name for each title, as shown in the following code:

SELECT year,
MAX (CASE WHEN bookname = 'LOTR' THEN sales ELSE NULL END) LOTR,
MAX (CASE WHEN bookname = 'GOT' THEN sales ELSE NULL END) GOT,
MAX (CASE WHEN bookname = 'Harry Potter' THEN sales ELSE NULL END) "Harry Potter",
MAX (CASE WHEN bookname = 'Sherlock' THEN sales ELSE NULL END) sherlock
FROM book_sales GROUP BY year ORDER BY year;

With the out-of-the-box PIVOT operator, you can use a simpler SQL statement to achieve the same results:

SELECT *
FROM
(
  SELECT bookname, year, sales
  FROM book_sales
) AS d
PIVOT
(
  MAX(sales)
  FOR bookname IN ('LOTR', 'GOT', 'Harry Potter', 'Sherlock')
) AS piv
ORDER BY year;

UNPIVOT overview

The following code illustrates the UNPIVOT syntax:

SELECT ...
FROM
    (<get_source_data>)
    AS <alias_source_query>
UNPIVOT <optional INCLUDE NULLS>
(
    <value_col>
    FOR <name_col>
    IN (column_name_1, column_name_2 ..... column_name_n)
) AS <alias_unpivot>
<optional ORDER BY clause>;

The code uses the following parameters:

  • <get_source_data> – The SELECT query that gets the data from the source table.
  • <alias_source_query> – The alias for the source query that gets the data.
  • <optional INCLUDE NULLS> – An optional parameter to include NULL values in the result set. By default, NULLs in input columns aren’t inserted as result rows.
  • <value_col> – The name assigned to the generated column that contains the row values from the column list.
  • <name_col> – The name assigned to the generated column that contains the column names from the column list.
  • <column_name_n> – The column names from the source table or subquery to populate value_col and name_col.
  • <alias_unpivot> – The alias for the unpivot table.
  • <optional ORDER BY clause> – An optional parameter to apply an ORDER BY clause on the result set.

The following diagram illustrates how UNPIVOT works.

UNPIVOT instead of UNION ALL queries

Let’s look at the following example query with book_sales_pivot.

We get the following output.
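
The query and output images are not reproduced here. The post does not show how book_sales_pivot was created; one plausible way, assuming it simply stores the result of the earlier PIVOT query, is a CREATE TABLE AS statement along these lines. Treat it as a sketch to adapt, not the authors' exact setup.

```sql
-- Assumption: book_sales_pivot holds the pivoted form of book_sales
-- (one row per year, one column per book title).
CREATE TABLE book_sales_pivot AS
SELECT *
FROM (SELECT bookname, year, sales FROM book_sales)
PIVOT (
    MAX(sales)
    FOR bookname IN ('LOTR', 'GOT', 'Harry Potter', 'Sherlock')
);
```

A table built this way has a year column plus one column per title, which is the shape the UNION ALL and UNPIVOT queries in this section expect.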

Previously, you had to derive this result set using UNION ALL, which resulted in a long and complex query form, as shown in the following code:

SELECT * FROM
(SELECT year, 'lotr' AS book, LOTR AS sales FROM (SELECT * FROM book_sales_pivot)
UNION ALL
SELECT year, 'got' AS book, GOT AS sales FROM (SELECT * FROM book_sales_pivot)
UNION ALL
SELECT year, 'harry potter' AS book, "Harry Potter" AS sales FROM (SELECT * FROM book_sales_pivot)
UNION ALL
SELECT year, 'sherlock' AS book, "Sherlock" AS sales FROM (SELECT * FROM book_sales_pivot)
)
ORDER BY year;

With UNPIVOT, you can use the following simplified query:

SELECT * FROM book_sales_pivot UNPIVOT INCLUDE NULLS
(sales FOR book IN ("LOTR", "GOT", "Harry Potter", "Sherlock"))
ORDER BY year;

UNPIVOT is straightforward compared to UNION ALL. You can further clean this output by excluding NULL values from the result set. For example, you can exclude book titles from the result set if there were no sales in a year:

SELECT * FROM book_sales_pivot UNPIVOT
(sales FOR book IN ("LOTR", "GOT", "Harry Potter", "Sherlock"))
ORDER BY year;

By default, NULL values in the input column are skipped and don’t yield a result row.
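
One way to read the default behavior is as INCLUDE NULLS followed by a filter on the value column. The following sketch assumes the same book_sales_pivot table and uses a hypothetical subquery alias u; it should return the same rows as the default UNPIVOT query above.

```sql
-- Sketch: emulate the default (NULL-skipping) UNPIVOT by keeping NULLs first
-- and then filtering them out explicitly. The alias "u" is illustrative only.
SELECT *
FROM (
    SELECT *
    FROM book_sales_pivot UNPIVOT INCLUDE NULLS
    (sales FOR book IN ("LOTR", "GOT", "Harry Potter", "Sherlock"))
) AS u
WHERE sales IS NOT NULL
ORDER BY year;
```

In practice the shorter default form is preferable; the explicit filter is only shown to make the semantics visible.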

Now that we understand the basic interface and usage, let’s dive into a few complex use cases.

Dynamic PIVOT tables using stored procedures

The PIVOT query is static, meaning that you have to enter the list of PIVOT column values manually. In some scenarios, you may not want to supply your PIVOT values by hand because your data keeps changing, and it becomes difficult to maintain the list of values and update the PIVOT query manually.

To handle these scenarios, you can take advantage of the dynamic PIVOT stored procedure:

/*
         non_pivot_cols : Text list of columns to be added to the SELECT clause
         table_name     : Schema qualified name of table to be queried
         agg_func       : Name of the aggregate function to apply
         agg_col        : Name of the column to be aggregated
         pivot_col      : Name of the column whose value will be pivoted
         result_set     : Name of cursor used for output
 */

CREATE OR REPLACE PROCEDURE public.sp_dynamicPIVOT
(
non_pivot_cols IN VARCHAR(MAX),
table_name IN VARCHAR(MAX),
agg_func IN VARCHAR(32),
agg_col IN VARCHAR(MAX),
pivot_col IN VARCHAR(100),
result_set INOUT REFCURSOR )
AS $$
DECLARE
sql        VARCHAR(MAX) := '';
result_t   VARCHAR(MAX) := '';
PIVOT_sql  VARCHAR(MAX);
cnt INTEGER := 1;
no_of_parts INTEGER := 0;
item_for_col character varying := '';
item_pivot_cols character varying := '';
BEGIN

-- Collect the distinct values of the pivot column as one comma-separated string
sql := 'SELECT  listagg (distinct ' || pivot_col || ', '','') within group (order by ' || pivot_col || ')  from ' || table_name || ';';

EXECUTE sql ||' ;' INTO result_t;

-- Number of commas in the list; the number of values is no_of_parts + 1
no_of_parts := (select REGEXP_COUNT ( result_t , ','  ));

-- Build two lists: single-quoted values for the IN clause
-- and double-quoted column names for the SELECT list
<<simple_loop_exit_continue>>
  LOOP
    item_for_col := item_for_col + '''' + (select split_part("result_t",',',cnt)) +'''';
    item_pivot_cols := item_pivot_cols + '"' + (select split_part("result_t",',',cnt)) +'"';
    cnt := cnt + 1;
    IF (cnt < no_of_parts + 2) THEN
        item_for_col := item_for_col + ',';
        item_pivot_cols := item_pivot_cols + ',';
    END IF;
    EXIT simple_loop_exit_continue WHEN (cnt >= no_of_parts + 2);
  END LOOP;

PIVOT_sql := 'SELECT ' || non_pivot_cols || ',' || item_pivot_cols || ' from ( select * from ' || table_name || ' ) as src_data PIVOT ( ' || agg_func || '(' || agg_col || ') FOR ' || pivot_col || ' IN (' || item_for_col || ' )) as PIV order by ' || non_pivot_cols || ';';

-- Open the cursor and execute the SQL
OPEN result_set FOR EXECUTE PIVOT_sql;

END;
$$ LANGUAGE plpgsql;


Example:
BEGIN;
CALL public.sp_dynamicPIVOT ('year','public.book_sales','MAX','sales','bookname', 'PIVOT_result');
FETCH ALL FROM PIVOT_result; CLOSE PIVOT_result;
END;
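
To make the string building concrete, here is roughly what the procedure generates for the example call. This sketch assumes public.book_sales contains exactly the four titles used throughout this post; with different data, the lists would differ. Whitespace is tidied for readability.

```sql
-- Step 1: the statement that collects the distinct pivot values.
-- Under the stated assumption it returns: GOT,Harry Potter,LOTR,Sherlock
SELECT LISTAGG(DISTINCT bookname, ',') WITHIN GROUP (ORDER BY bookname)
FROM public.book_sales;

-- Step 2: the PIVOT statement assembled from that list and opened on the cursor.
SELECT year, "GOT", "Harry Potter", "LOTR", "Sherlock"
FROM (SELECT * FROM public.book_sales) AS src_data
PIVOT (MAX(sales) FOR bookname IN ('GOT', 'Harry Potter', 'LOTR', 'Sherlock')) AS piv
ORDER BY year;
```

Because the procedure concatenates its arguments directly into SQL text, it is best called only with trusted table and column names.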

PIVOT example using CTEs

You can use PIVOT as part of a CTE (common table expression). See the following example code:

WITH dataset1 AS
(SELECT bookname, sales FROM public.book_sales)
SELECT * FROM dataset1 PIVOT (
 SUM(sales)
 FOR bookname IN ('LOTR', 'GOT', 'Harry Potter', 'Sherlock')
);

Multiple aggregations for PIVOT

The following code illustrates multiple aggregations for PIVOT:

WITH dataset1 AS
(
 SELECT 1 AS "rownum",
 bookname,
 sales
 FROM public.book_sales)
SELECT *
FROM (
 SELECT rownum, "LOTR" AS avg_sales_lotr, "GOT" AS avg_sales_got, "Harry Potter" AS avg_sales_harrypotter, "Sherlock" AS avg_sales_sherlock
 FROM dataset1 PIVOT (AVG(sales) FOR bookname IN ('LOTR', 'GOT', 'Harry Potter', 'Sherlock')) AS avg_sales) a
JOIN
 (
 SELECT rownum, "LOTR" AS sum_sales_lotr, "GOT" AS sum_sales_got, "Harry Potter" AS sum_sales_harrypotter, "Sherlock" AS sum_sales_sherlock
 FROM dataset1 PIVOT (SUM(sales) FOR bookname IN ('LOTR', 'GOT', 'Harry Potter', 'Sherlock')) AS sum_sales) b
USING (rownum);
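
The renaming done in the SELECT lists above can also be pushed into the PIVOT clause itself. The sketch below relies on the alias support described in the Amazon Redshift PIVOT documentation (an alias on the aggregate and on each IN-list value); the alias names are illustrative, and unlike the query above it groups by year rather than by a constant key.

```sql
-- Sketch: name the output columns inside PIVOT instead of renaming them afterwards.
-- Per the documented naming rule, each column is expected to be the value alias
-- followed by an underscore and the aggregate alias (for example lotr_avg_sales).
SELECT *
FROM (SELECT year, bookname, sales FROM public.book_sales)
PIVOT (
    AVG(sales) AS avg_sales
    FOR bookname IN ('LOTR' AS lotr, 'GOT' AS got,
                     'Harry Potter' AS harrypotter, 'Sherlock' AS sherlock)
);
```

Each PIVOT clause still takes a single aggregate, so combining AVG and SUM continues to need a join of two pivoted subqueries as shown above.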

Summary

Although PIVOT and UNPIVOT aren’t entirely new paradigms of the SQL language, the new native support for these operators in Amazon Redshift can help you achieve many robust use cases without the hassle of using alternate operators. In this post, we explored a few ways in which the new operators may come in handy.

Adopt PIVOT and UNPIVOT into your workstreams now and work with us as we evolve the feature, incorporating more complex option sets. Please feel free to reach out to us if you need further help to achieve your custom use cases.

About the authors

Ashish Agrawal is currently a Sr. Technical Product Manager with Amazon Redshift, building cloud-based data warehouse and analytics cloud services. Ashish has over 24 years of experience in IT. Ashish has expertise in data warehouses, data lakes, and Platform as a Service. Ashish is a speaker at worldwide technical conferences.

Sai Teja Boddapati is a Database Engineer based out of Seattle. He works on solving complex database problems to contribute to building the most user-friendly data warehouse available. In his spare time, he loves travelling, playing games, and watching movies & documentaries.

Maneesh Sharma is a Senior Database Engineer at AWS with more than a decade of experience designing and implementing large-scale data warehouse and analytics solutions. He collaborates with various Amazon Redshift Partners and customers to drive better integration.

Eesha Kumar is an Analytics Solutions Architect with AWS. He works with customers to realize the business value of data by helping them build solutions leveraging the AWS platform and tools.
