Quantcast
Channel: SCN : All Content - SAP HANA Developer Center
Viewing all articles
Browse latest Browse all 6745

How 'count' aggregation sometimes behaves differently in SAP HANA graphical views vs traditional SQL

$
0
0

This blog is to explain a common problem which occurs when we try to add an alphanumeric column as a measure (with 'COUNT' aggregation) in graphical views.

 

Problem Description


For example, if we have two views - VIEW_1 and VIEW_2, each having 2 alphanumeric columns "A" & "B".

And if we are trying to get a count of distinct values ofcolumn "B", grouped by the unique values of column "A", we would normally build a graphical view, similar to the one shown below.

intiial_flow.PNG

 

View_1.PNGview_2.PNG

In this case lets assume that we have only two unique values ('X' & 'Y') for COLUMN "A", across VIEW-1 & VIEW-2.


From an SQL perspective, we would expect HANA to perform the following query


SELECT A, COUNT(B) FROM

(

     SELECT A, B FROM VIEW_1

     UNION

     SELECT A, B FROM VIEW_2

)

GROUP by A;

 

But the graphical view data preview will always give the following result, whatever values are present in the views.

 

DATAPREVIEW.PNG

 

This result is definitely wrong.

 

Reason

 

The HANA optimizer, whenever possible, tries to always push down the aggregation/filters to a lower level nodeto reduce the number of rows transferred across levels/nodes.

 

So, in our case, HANA optimizer tries to push the 'COUNT' aggregation down.

 

flow.png

And the following property setting -"ALWAYS AGGREGATE RESULT"if set to'TRUE', always enforces a final aggregation in the semantics node.

This wouldn't be a problem for aggregation types like SUM, MAX and MIN.

 

Ex:  SUM ( SUM(A,B,F), SUM(C,D) ) is same as SUM(A,B,C,D,F)  or

        MAX ( MAX(A,B,F), MAX(C,D) ) is same as MAX(A,B,C,D,F)  or

        MIN ( MIN(A,B,F), MIN(C,D) ) is same as MIN(A,B,C,D,F)

 

BUTCOUNT ( COUNT(A,B,F), COUNT(C,D) )  => COUNT ( 3, 2 ) => 2, which is definitely wrong vsCOUNT(A,B,C,D,F) => 5

 

pre_aggre_true.PNG

 

So what actually gets executed, in the graphical view is the following query -

 

SELECT A, COUNT(B) AS B FROM

(

     SELECT A, COUNT(B) AS B FROM

     {

          SELECT A, B FROM VIEW_1

          UNION

          SELECT A, B FROM VIEW_2

     }

)

GROUP by A;

 

Resolution:

 

I guess the resolution is evident by now

 

ForCOUNT aggregation on alphanumeric values, switch "ALWAYS AGGREGATE RESULT" to 'FALSE'. to get accurate results,

 

always_aggre.PNG

 

or even simpler, use the COUNTER feature provided by HANA 

 

Regards

Ajay


Viewing all articles
Browse latest Browse all 6745

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>