Friday, July 3, 2020

Building a filter-proof calculation with Oracle Analytics - Part I

In some Analytics dashboards, we sometime need to represent values of filtered data, as a ratio to total data in the dataset. Meaning, filtering the numerator wihtout filtering the denominator of the fraction...
For example, I could filter my Global Revenue dashboard for only France, for 2019, and yet still want to see the % France represents out of the total Global revenue in 2019. That's a common functional need. 

This need has long been supported with OAC (and OBIEE earlier on) by leveraging the Semantic Layer (RPD Modelling) using 'level-based' aggregations and more advanced modelling. But what if the need is to compute this calculation without a metadata layer, with no access to an Oracle Analytics RPD ?

There is a reasonably simple solution to achieve this in Oracle Analytics Self Service (DV), without hard-coding any numbers in a calculation, this brief video walks through the whole process, and the blog below describes the steps :


In short, the trick lies into duplicating the dataset that is being used, and configuring one dataset to not react to filters from the other one. Let's say we have a sales dataset from which we query Revenue. Below is a screenshot example of our revenue breakdown by customer segment and by product category. The second table in green shows the ratio of each revenue cell compared to the total revenue. So far, this is classic need.
 
The green Pivot Table can be built directly using revenue, and tweaking pivot table properties to show values as percent of total, or it can also be built using a simple calculated metric, like what we have done here. It's similar behavior in both cases :
 
 
 
Now, let's filter this canvas for a specific value of, say Ship Mode : Express Air. The whole canvas filters properly :
 
The percentage recalculates on the basis of a new filtered total, which is normal behavior. But precisely in our case, we would like the total to remain un-filtered, so the percentage keeps showing relative value, compared to total revenue, not filtered revenue. 

How can we achieve this ?

Simply by duplicating the dataset we are using, and joining it back to the original dataset, without passing any 'filter' information in the join. We can achieve this by creating a 'dummy join' of type '1=1' for example. Once we duplicate the dataset, we just need to create a dummy-column with constant value '1' in both datasets. This is done via data prep : 

After that column is added to both datasets, we can simply create a join using it on both sides :

That's it, with this setup, we can now use objects from both versions of the same dataset in any visualization, and filters from one dataset will not apply to the second dataset. Check the screenshot below, it shows two tiles, Total Sales A (initial dataset) and Total Sales B (duplicated dataset). The filter at the top on Ship Mode = 'Regular Air' only applies to data first tile, the second tile still shows the total unfiltered sales. 
 
So in the bottom pivot table, we have replaced the percent calculation by simple division : 'Sales of dataset A' / 'Sales of dataset B'. As a result, we can see that the table computes percent of sales based on the total, unfiltered value of the sales. That's the expected behavior in this case.

An important hidden step :

The tile visualization behaved properly right away, no need for any tweaks. The pivot table did require one more tweak to show proper results : configuring the type of join. With our current process, we need to set the 'blending' characteristics of the join (ie its type : inner, left outer, right outer) at the dataviz level. Since the pivot table is aggregating data from both datasets at once, it does require us to override the defaults in it's property tab and set the join to full outer.
 
To do this, simply click on the dataviz, check the properties tab at bottom left of the screen and clikc the 'blending' tab there (two overlapping circles). There, simply set the blending to 'All Rows' for both dataset : 

That's all, the viz will behave properly from this point onwards.
 
In conclusion, we have just built a mini model here with a dummy join of type 1=1, that allows us to combine multiple versions of the same dataset with different levels of aggregation / filtering in the same DV query. This is somewhat similar to the technique used when building a robust semantic layer in OAC Admintool, but obviously in a slightly more rudimentary way. However it only takes a few minutes to get configured, and will operate dynamically. 
 
Now, notice one more detail : the question we started with was to show revenue for France in 2019 as a ratio of Global revenue... in 2019.
So in this case, both datasets should still be filtered at least for year, but not for geographies. The modeling to resolve this need is actually very simple once we have achieved the first part we just did. We address this in a second part of this blog entry here : 
http://oracledataviz.blogspot.com/2020/07/building-filter-selective-calculation.html
 
Thanks for your reading time, Happy modeling !

No comments:

Post a Comment