Thursday, July 4, 2019

Unleash your Database power within OAC using Evaluate function


Database systems like Oracle have tremendous analytical and data manipulation capabilities. These help in performing powerful transformations and gaining valuable insights out of massive amounts of data. Databases typically provide a wide range of inbuilt functions that can perform different kinds of deep analyses in a very scalable manner. Often, users are also able to define their own custom functions to meet specific query needs.

As users of Oracle Analytics (OA Cloud or Server), it is extremely valuable to be able to leverage these DBs functions directly from within the OA framework. 

The Oracle Analytics (OA) Evaluate function helps in doing just that. It enables users to create powerful analyses by directly invoking database functions. Evaluate provides a syntax to pass the specified database function with optional OA referenced columns as parameters. It is particularly useful when users want to call a specialized database function to process a heavy calculation directly in the underlying database. Evaluate can be used to call functions from any underlying database, Oracle, Essbase cube, DB2, MSSQL, Apache Hive and so on. It is a Scalar function, that is, which takes one or more values but returns a single value for every row.

The following video guides you through the process of leveraging Evaluate with Oracle Analytics, the rest of the blog below also gives the details needed.


Syntax

The syntax of Evaluate wraps around the syntax of the underlying DB function : EVALUATE('db_function(%1...%N)' AS data_typecolumn1columnN) where 
  • db_function() syntax is a valid SQL function available in the underlying database.
  • Column1 to columnN are the optional input parameters to the DB function. %1 is substituted with the value of column1, %2 with column2 and so on, for as many arguments as needed in the db function.
  • As data_type is an optional parameter that specifies the datatype of the return value (an OA datatype). Specifying data_type is not meant for type casting. Use it only when the return type cannot be easily predicted by the input columns. If there is a need to cast the return value, then add an explicit cast function.

Prerequisites

In order to be able to use Evaluate function in DV, following two conditions need to be met
  1. First, set Evaluate Support Level parameter to 2 in OAC system configuration. By default, Evaluate support level is set to 0, which prevents OAC from executing evaluate functions. To enable it, go to System Settings on the  OAC environment and set Evaluate Support Level to 1 or 2. Setting it to 1 means Evaluate is supported to users with ManageRepositories permission. Setting it to 2 means Evaluate is supported for all users.
  1. Make sure Data Access is set to Live For the datasets that will be referenced by the Evaluate function. By setting it to Live, we are ensuring that OAC Server always sends the query to the underlying database and does not try to simply retrieve results from cache.


How to use Evaluate




Let’s consider the Customers table from the sample data SH schema in Oracle Database. 
This table contains information about Customer attributes like name, gender, address, phone number etc.  

Querying the customers table in OA DataViz, here's how customer phone numbers look, with mixed format : some of the customer records have the phone number in the format xxx-xxx-xxxx while some are in the format (xxx) xxx-xxxx. Let’s say we want to change all the phone numbers to a common format (xxx) xxx-xxxx. 

Let's take the example of running some re-gex based replacement rule to achieve this. The database we are using (Oracle DB) has a REGEXP function that allows to do this, let us invoke it just for illustration of Evaluate syntax here. 




To begin with, let's first go to our OA System Settings and make sure Evaluate Support Level is set to 1 or 2. This is done only once for the OA environment.

Next, let's create an OA DV data-set sourcing from the Customers table in SH schema. Let's make sure we set the Data Access to Live for this dataset in order to be able to use Evaluate function.  


In the Oracle Database, the SQL syntax of the REGEXP_REPLACE function we want to use to change the Phone Number formats would be the following :

REGEXP_REPLACE(cust_main_phone_number, '([[:digit:]]{3})\-([[:digit:]]{3})\-([[:digit:]]{4})', '(\1) \2-\3')

To use this in DV, let's create a custom calculation directly from a project connected to the datasource and use the Evaluate function with the following syntax.

In the Evaluate Expression, we have the syntax for REGEXP_REPLACE as the db function that needs to be called, %1 is the column value being passed to the function which will be substituted by the first column, CUST_MAIN_PHONE_NUMBER (in blue). There is only one parameter in our example.
Notice that quotes is the 'escape character' : so for every single quote used within the original SQL function, we use the escape character which means we precede it by another single quote. That's why you see several places with twice single quotes in the Evaluate formula.

Bringing the new calculation into a visualization looks like this. Phone number format has changed by invoking the database REGEXP_REPLACE function. 


If we look at the underlying log with physical SQL sent by OA to the database (use the Session Monitor screen for accessing this), we see that Evaluate has been removed and only REGEXP_REPLACE function is being sent to the underlying database. 



Evaluate function can also be used within an OA dataflow. The same REGEXP_REPLACE can be called using an Evaluate function in the Add Columns step from any OA dataflow. Evaluate will properly functionship if all parameters of the dataflow are present in source tables from the same underlying database.


EVALUATE vs EVALUATE_AGGR
Evaluate formula will return the result of the database function as a column of type Attribute in OA. This result cannot be used as a metric or be broken down by other dimensions/attribute in the Datasets.
Evaluate_Aggr is a variation of the Evaluate syntax for OA that allows to invoke aggregate functions in the underlying database and return them as aggregated Metrics in OA. This function is intended for aggregate functions with a GROUP BY clause. 

Oracle database, for instance, supports a wide range of aggregate functions. APPROX_COUNT_DISTINCT is one such function that returns the approximate number of distinct values of an expression. This function is particularly useful when used on large amounts of data wherein APPROX_COUNT_DISTINCT is significantly faster than an exact COUNT(DISTINCT) with negligible deviation from the exact result. 
Let's see how to invoke this function from within OA using Evaluate_Aggr. Let us use the SH schema that has Sales, Customers and Products table. Let's approximate the number of distinct customers who purchased the various product categories in our fact table. We could run a COUNT(DISTINCT cust_id) on the Sales table, grouped by Product Categories. But this could be process time consuming if we had large amounts of data. 

Instead of this, we can invoke APPROX_COUNT_DISTINCT function using Evaluate_Aggr in OAC. The OA DV calculation would look like the following.

The result of this calculation is a metric in OA because we are using Evaluate_Aggr syntax. So in our DV or Answers visualization, we can simply add some attributes to break down the result further. For example we can bring PROD_CATEGORY column and show a bar chart. The bar chart below compares the result of Evaluate_Aggr(APPROX_COUNT_DISTINCT) with the results of actual count distinct calculation. We can see that the approximation of number of customers is very close to the actual values.


So both Evaluate and Evaluate_Aggr OA functions offer a powerful way to leverage any database capabilities directly within the OA framework, both for Cloud (OAC), Server (OAS), or even DV Desktop.

Are you an Oracle Analytics customer or user?

We want to hear your story!

Please voice your experience and provide feedback with a quick product review for Oracle Analytics Cloud!
 

5 comments:

lost_in_woods said...

nice blog , very helpful and visit us for VISUALIZATION SERVICES in India

Veera Blogspot said...

Hi,
Nice article,keep Updating,,,,

Thank you,,,

Power BI Training

UnoGeeks said...

I truly appreciate the time and work you put into sharing your knowledge. I found this topic to be quite effective and beneficial to me. Thank you very much for sharing. Continue to blog.

Mulesoft is the Most Widely Used Integration Platform. If you want to become Mulesoft Certified Developer, attend this Best Mulesoft Training Course offered by the Unogeeks (Top Mulesoft Training Institute)

bhanu said...

Excellent blog I visit this blog. It's really awesome.
Mulesoft Online Training
Mulesoft Training in Hyderabad

bhanu said...

Thanks for sharing such quality information.
Best Mulesoft Training
Mulesoft Online Training in Hyderabad

Post a Comment