Thursday, May 2, 2019

To Bat first or Bowl first? Strategy for the upcoming ICC Cricket World Cup 2019


2019 is finally here and along with it comes the biggest extravaganza in the world of cricket - the ICC Cricket World Cup 2019!! The tournament is one of the world's most viewed sporting events and is considered the "flagship event of the international cricket calendar" by the International Cricket Council. This 12th edition of the Cricket World Cup will be hosted by England and Wales.

Have you ever thought about the innumerable factors influencing a cricket match outcome? 

  1. The weather on the day of the match dictates whether it would end up in a high scoring or a low scoring affair. 
  2. The condition of the pitch can also play an important role in the outcome. 
  3. The toss... who doesn't dread losing the toss? 

While there are several factors influencing the outcome, one of the most difficult decisions captains have to make is whether to bat first or bowl first. This is a crucial decision and a strongly debated topic. One key input to this decision is history: how teams fared batting first vs chasing, how this pattern varies across venues, how some teams have done better batting first vs chasing, and so on.

Can analytics help in making this decision?

The following are simple OAC (Oracle Analytics Cloud) visualizations built on all the World Cup matches played from 1975 to 2015. Using OAC, it took only a few minutes to gain interesting insights on the strategy to adopt for teams competing at the 2019 World Cup:



Eleven World Cups have been played so far, and England hosted four of them. Australia has won the World Cup 5 times, India and the West Indies are a distant second with 2 wins each, and Sri Lanka and Pakistan have won it once apiece.

A Marker in Finalists' Strategy

Teams batting first won 63% of the time, while teams chasing won only 37% of the time. So if a team is skilled enough to reach the final, batting first nearly doubles its chances of holding that cup!!
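The split itself is easy to reproduce once match results are tabulated. Here is a minimal sketch with made-up records, not the actual World Cup dataset used for the visualizations:

```python
# Illustrative sketch: computing the batting-first vs chasing win split from
# match records. The results list below is made up; the real analysis was
# done on the 1975-2015 World Cup dataset in OAC.
from collections import Counter

# One entry per match, recording whether the winning side batted first or chased.
results = ["first", "first", "chased", "first", "chased", "first",
           "first", "chased", "first", "first", "chased"]

counts = Counter(results)
total = len(results)
for strategy, wins in counts.items():
    print(f"Winner batted {strategy}: {wins}/{total} ({wins / total:.0%})")
```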




Team-wise Strategies that worked

Digging a bit deeper, this pattern varies by team: for Australia and India, batting first or second didn't seem to matter when they won the World Cup. The West Indies, however, only won by batting first. So if the West Indies make it to the final at Lord's this time around, their bet should definitely be on batting first.




Team Strategy when England is hosting

Since England is the host nation for the upcoming event, let's shift our focus to only those World Cups played in England. Interestingly, when hosted in England, finals were won most of the time (75%) by teams batting first.
When it comes to semi-finals, it's the opposite: teams batting second have won most semi-finals!!
These numbers contrast sharply with semi-finals and finals hosted outside of England:
in all the other countries, teams batting first have won most finals and semi-finals.




Venue-wise Strategy

Could the venue itself play a part in this pattern? Let's consider the premier venues in England where the majority of the games in the 2019 World Cup will be played: Birmingham, Leeds and Lord's. Teams playing there have done equally well batting first or second.
But if you are playing in Manchester (where a 2019 semi-final will be played), history suggests it's better to chase, while at The Oval (London), teams batting first have fared far better.

Let's look at the venues where teams have won with the highest average runs margin. Birmingham, Taunton and Chester-le-Street are the preferred grounds to bat first, where the winning margin is quite high.

Let's now look at those venues where teams bowling first have won with the highest wickets margin. Canterbury, Lord's and Leeds are the venues where chasing teams have fared well.





Looking at overall batting win rate across premier venues in England, Nottingham and The Oval are the most preferred venues to bat first. 



Team strategy across World Cup games

Some individual teams historically fare better batting first than chasing in World Cup matches, irrespective of venue and location. Australia, the top performer, has mostly won by batting first (60%). In contrast, arch rivals New Zealand are far better chasers (60%)!!



In the early days of World Cups, or other cricket matches for that matter, teams always preferred to bat first: bat first, score big and put the opposition under pressure. But from the 2007 World Cup onwards, the stats show there is not much of a difference between batting first and chasing. Does this indicate that teams are getting better at playing under pressure while chasing? Perhaps they are!


Conclusion

Most cricket enthusiasts would argue that batting first has been the best winning strategy so far, but the data does not always support that position: captains winning the toss would benefit from looking at historical stats to make an informed decision.
For the upcoming World Cup in England, insights from historical data suggest a few distinct approaches:

Semi Final 1 (in Manchester): data shows that Manchester has been a ground where teams have dominated while chasing. It also shows that, historically in England, semi-finals were won by chasing. So the team winning the toss is more likely to win the match if they let the opposition bat first.

Semi Final 2 (in Birmingham): though the venue doesn't favor teams batting first or bowling first, data suggests that Birmingham has seen big wins by runs margin (i.e., by batting first). The team winning the toss is more likely to win the match if they bat first.

Finals (Lord's): historically, finals in England have been won by teams batting first. The team winning the toss improves its chances of lifting the World Cup by batting first.

May the best team win the World Cup!!

Are you an Oracle Analytics customer or user?

We want to hear your story!

Please voice your experience and provide feedback with a quick product review for Oracle Analytics Cloud!
 

Wednesday, April 24, 2019

How to perform incremental data loads in Oracle Analytics Cloud

Oracle Analytics Cloud offers the ability to create data flows and perform incremental loads on a target table. Data flows can operate only on the incremental data that becomes available in the source between the previous run and the current run.

In this blog, let's see how to perform an incremental load on a database table. This video gives a sense of the whole experience, and the blog below shares a few more details as well.


Prerequisites for performing incremental loads

1) Incremental loads can be performed only on those data sets where the source and target tables are database-based.
2) If there are multiple data sets in a data flow, only one of the sources can be set for incremental update.

Defining a source and new data identifier for the source

The first step is to have a source table and identify a column with which new data can be identified in the table. In this example, I have a revenue fact table with month key as the new data identifier.


The next step is to create the Oracle Analytics datasource pointing to this table. While creating the datasource, make sure you mark the new-data identifier column by clicking on the 3rd node in the process flow. This is an important step, as this column defines how the system will identify new rows in the dataset.



Define a data flow

Now that our datasource is created, let's define a data flow by importing the revenue fact table from the source connection. The key here is to check the "Add new data only" box to ensure that the source table is marked for incremental load.


To make my data flow a bit more functionally representative, I will add a business example: converting currency values. Let's bring in a spreadsheet which has exchange rates for every month and join it on the month key column. Let's also add a new calculation to convert revenue.

Finally, let's select the 'Save Data' step and specify a name for the resulting data set. Make sure you choose a database as the target connection and specify the table name where the result set needs to be saved. There are 2 options available in the "When Run" drop-down:
  1. "Replace existing data": the data flow truncates the target table and reloads all the records.
  2. "Add new data to existing data": keeps the existing records intact and loads only the new records from the source table. New records are identified by the column we defined in the source dataset above.
Let's set the When Run option to "Add new data to existing data" and save the data flow.
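Conceptually, "Add new data to existing data" follows the classic watermark pattern: the flow remembers the highest new-data identifier it has already processed and only appends source rows beyond it. Here is a sketch of that pattern in plain Python; it is an illustration, not OAC's actual implementation:

```python
# Watermark sketch of incremental loading. The data flow remembers the highest
# month_key it has processed (the watermark) and appends only newer source rows.
# Illustrative only - OAC's internal mechanics may differ.

def incremental_load(source_rows, target_rows, state, key="month_key"):
    """Append source rows newer than the remembered watermark, then advance it."""
    wm = state.get("watermark")
    new_rows = [r for r in source_rows if wm is None or r[key] > wm]
    target_rows.extend(new_rows)
    if new_rows:
        state["watermark"] = max(r[key] for r in new_rows)
    return len(new_rows)

source = [{"month_key": m, "revenue": 100.0} for m in range(201801, 201813)]
target, state = [], {}

print(incremental_load(source, target, state))  # first run: all 12 months load

# Deleting rows from the target does not reset the watermark, so a re-run
# with no new source data appends nothing:
target = [r for r in target if r["month_key"] != 201812]
print(incremental_load(source, target, state))  # 0 - the deleted month stays missing

# Three new months arrive in the source; only those are appended:
source += [{"month_key": m, "revenue": 100.0} for m in (201901, 201902, 201903)]
print(incremental_load(source, target, state))  # 3
```

This little model also explains the behavior demonstrated later in the post: deleting a month from the target does not bring it back on the next run, because no source row is beyond the watermark.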

Now, let's run the data flow for the first time. As it completes, we can see in our underlying database that the target table has been created. Since this was the first run of the data flow, all the records in the source table are inserted into the target table.



Now, for the purposes of this blog, let's delete a month of data (201812) from our newly created target table. After doing this, our source table still has its 12 months of data (Jan to Dec), but our target table has only 11 months; it is missing December. Notice that we did not change data in our source table, so there are no new records there since the last run of the data flow.

So, when we run the data flow a second time, the target table does not get incremented at all. The data flow was set to only bring across new data, and since there is no new data in the source, nothing changes in the target table. We can verify that the target table is still missing the deleted month's data. Had the data flow been set to full load, all the data would be in the target.



Now, to complete the test, let's manually load our source table with 3 more months of data to represent some incremental data, then re-run the data flow once again. We find that the target table has been incremented with the 3 new months of data coming from the source table. But notice that the target table is still missing the month whose records were deleted:

Just to confirm, for the purpose of this blog: if we go back to the data flow definition, set the When Run option to "Replace existing data" and run the data flow, all the data gets loaded, including the deleted month's data.
 




Tuesday, April 9, 2019

DV Aggregation : Metric Level vs Visualization Level



Oracle DV supports different aggregation rules for any metric, such as Sum, Average, Minimum, Maximum, etc. The rule defines how the metric aggregates when queries return data at an aggregated level (e.g., sum of sales by customer when source data is at the sales-order level).

A metric's aggregation rule can be edited in several places while building an analysis in OAC. It can be changed at the Metric property level, via different dialog boxes, or in a specific Visualization, for that Viz only. If you have ever wondered about the difference in behavior between these two main scenarios, let's dig deeper and find out.

1) Setting aggregation at a Metric level

This can be done in two ways: in the Prepare tab or in the Visualize tab when building a DV project. The resulting behavior is the same in both cases: it sets the default aggregation rule for the metric in this specific dataset. This establishes the default aggregation rule for any project using this dataset column, not just the project we are working on.

Setting it directly from the Prepare Tab:

To set a metric aggregation in the Prepare tab, click the Metric and, in the properties pane, go to the General tab and change the aggregation to the desired option.


Once you apply this change, the new aggregation method applies to all visualizations that use this metric across different DV projects. For example, if we set the aggregation of the Sales metric to Sum, every time this metric is used in a viz, the value is computed using Sum aggregation. Let's look at Sales by Product Category just below:


Now that the property is set for this metric at the dataset level, every project or visualization using this metric will apply the new aggregation rule. To check, when we inspect the dataset from the dataset pane and look at its elements, we see the metric aggregation is set to Sum.


Setting the aggregation rule from the DV Canvas (Visualize tab) :

Another option for setting the same level of aggregation is to change it directly from a Metric column in the DV Canvas (the Visualize tab, as opposed to the Prepare tab). To set a Metric aggregation from there, simply click on the metric among the data elements and change the aggregation rule in the properties pane (bottom left).


The scope here is the same as setting it in the Prepare tab: it becomes the default aggregation for that metric and is used every time the metric value is computed, in any project.

2) Setting aggregation at a Viz Level

Metric aggregation can also be set for a single visualization in a given DV project only. In this case, the setting is specific to the visualization and overrides the default aggregation rule set at the dataset level. This can be done by clicking on any Viz on your canvas, going to the Values sub-tab within the Properties pane (the # sign) and setting the Aggregation method there. If a Viz has multiple metrics, each metric can have a different aggregation method. Changing the rule there only applies to the Viz you edited and does not impact the dataset: any other viz using the same metric, in this project or any other project, keeps the original aggregation rule.

At first sight, changing the aggregation method at the Viz level may seem of trivial use: if the Viz includes a Total, the aggregation used to arrive at the Total value is the one specified in the Viz property. For example, take the table of Sales by Product Category with the default aggregation Sum for this column, and let's add a Total to the Viz. By default, the Total value is calculated as a Sum of Sales. If we change the aggregation method for this Viz only to Average, the Total of the report now shows an Average of Product Category Sales.


Note that this only changed the Total line; the values for the other lines did not change. Why? The calculation done by OAC here involves two passes. First, it retrieves Sales by each dimension requested in the Viz, in this case Product Category; this retrieval is still done using the default dataset aggregation rule (Sum in our example). Then, in the second pass, at the Viz level, it aggregates the retrieved data using the specific Viz rule for each row in the Viz: in this case, it simply shows the Avg of Sales for each Product Category and for the Total of the report. For each single Product Category, the Avg of Total Sales is the same as Total Sales, but for the set of all Product Categories, we get the average of the Product Category Sales values.
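The two passes can be sketched in a few lines of Python. The numbers here are made up for illustration; the point is the order of operations:

```python
# Sketch of the two-pass logic: pass 1 aggregates Sales per Product Category
# using the dataset rule (Sum); pass 2 applies the Viz-level rule (Average)
# only to produce the Total line. Data values are illustrative.
from collections import defaultdict
from statistics import mean

orders = [  # (product category, sales) at the sales-order grain
    ("Furniture", 100), ("Furniture", 150),
    ("Technology", 300),
    ("Office Supplies", 50), ("Office Supplies", 70),
]

# Pass 1: dataset-level aggregation (Sum) by Product Category
by_category = defaultdict(float)
for category, sales in orders:
    by_category[category] += sales

# Pass 2: Viz-level aggregation (Average) over the pass-1 results
total_line = mean(by_category.values())

print(dict(by_category))  # each row keeps its summed value
print(total_line)         # the Total line shows the average of the category sums
```

Because each row's value comes out of pass 1, changing the Viz rule to Average moves only the Total line, exactly as described above.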

But if you look closer at the options, there is a very powerful subtlety that can be leveraged when using Viz-level aggregation rules: the 'by' clause option.
Let's keep using our table of Sales by Product Category, and let's now say we want to see average monthly Sales for each Product Category. For each Product Category, we want the average of sales by Month... That's what the 'by' clause lets us achieve.
As we set the Viz-level aggregation method to Average, we can click on the By field and set it to any set of attributes (one or many). Let's just pick Order Month in our example. The view now shows monthly average Sales for each Product Category, and this aggregation is specific to the viz.



In this case, OAC actually retrieved Sales summed by Product Category and Order Month in the first pass, and then, for each Product Category, calculated the average of all the monthly sales values. This applies to any other aggregation, with the 'by' columns of your choice, and for any Viz. This is a powerful analytic calculation achieved with a single, user-friendly click.
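The 'by' clause version of the two passes can be sketched the same way, again with illustrative numbers:

```python
# Sketch of the 'by' clause: pass 1 sums Sales by Product Category AND
# Order Month; pass 2 averages the monthly sums within each Product Category.
# Numbers are made up for the illustration.
from collections import defaultdict
from statistics import mean

orders = [  # (product category, order month, sales)
    ("Furniture", "Jan", 100), ("Furniture", "Jan", 50), ("Furniture", "Feb", 150),
    ("Technology", "Jan", 300), ("Technology", "Feb", 100),
]

# Pass 1: Sum by (Product Category, Order Month)
monthly = defaultdict(float)
for category, month, sales in orders:
    monthly[(category, month)] += sales

# Pass 2: Average the monthly sums per Product Category
per_category = defaultdict(list)
for (category, month), total in monthly.items():
    per_category[category].append(total)
avg_monthly = {category: mean(totals) for category, totals in per_category.items()}

print(avg_monthly)  # average monthly Sales per Product Category
```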


Wednesday, February 27, 2019

Synopsis Mobile Application sourcing from Oracle Analytics Cloud

Have you ever felt the need for a rapid, automatic overview of any of the datasets available to you on your Oracle Analytics Cloud (OAC) instance? From your mobile, right away, within a few seconds?

The Oracle Synopsis application now directly connects to any dataset lying on your OAC instance and immediately creates meaningful analytics out of it. It lets you interact with spreadsheets and business data in a visual and intuitive way while you're on the go, on your mobile, within seconds. No technical training or specific skills required: the app installs and connects in seconds.

This 6-minute video gives a sense of the whole experience. The blog below shares a few more details as well.



Synopsis is available on both Android and iOS devices. At the time of writing (Feb 2019), the Android version is a little ahead, as it already lets you connect to Oracle Analytics Cloud. The same feature on iOS will be available in a few weeks with an upcoming update.

How to connect to Oracle Analytics Cloud using Synopsis

From your Synopsis app, tap the green + icon to get the option to connect to OAC. Provide the server details and credentials to connect. Once connected, you will see a list of all the files available inside OAC. Click on a spreadsheet from all the choices you have and let the app run some background analysis before rendering the first visuals.



The first visuals

After a bit of background analysis, the first visuals are rendered. The top row on the screen shows 3 attribute columns that you can toggle; the bottom of the screen has 3 metrics suggested from the datasource you analyzed. Clicking each displays a metric-by-attribute pair. In the screenshot, you see "Profit by Order Priority".


Performance tiles of all metrics

If you wish to see the metrics available in the datasource, simply click on "OAC". Against each tile, there are options to choose the aggregation you want.


Choose/Edit the right columns for analysis

Click on the settings icon in the top pane to reach a screen where you can delete columns, move columns between metrics and attributes, and rename columns. Hold a column until a strikethrough appears to unselect it. Hold and drag columns from the Numbers area to the Text area; dragging columns from Text to Numbers makes them metrics. Turn on "Edit Column Labels" to edit the column names.


Further Analysis

By now I am sure you are wishing to do more with Synopsis. Let's say you wish to analyze a metric by all the available attributes: click on a performance tile, say "Sales". On this screen you will see various analyses of Sales by all the attributes.


If you wish to edit the visualizations, click on one of them to open the edit screen. In edit mode, you can change the chart type, add/remove metrics and attributes, and filter the chart based on attributes.


Statistical Analysis

Once you are in the edit mode of a chart, you can get some statistical information such as mean, min, max, etc., by clicking on the yellow "i" icon just above the edit pencil icon.



Natural Language Generation - Project Insights

Clicking on the button shown in the image here generates some interesting insights from the data. Things like "if a metric value goes down, another metric goes up" are generated by the built-in analytics engine.



When you edit out a metric or an attribute, the insights automatically rerun to provide fresh project insights with the new metrics and attributes.

Sharing the report on Social media

Reports and visuals generated in Synopsis can be shared via various media options like WhatsApp, email, Twitter, etc. Click on the share icon at the top of each report to share that particular visual.


By now you will have realized how much can be accomplished using Synopsis. To conclude, Synopsis is an app that puts actionable insights for smart decision making at your fingertips. Soon, you will be able to connect to Oracle cloud sources and leverage the same capabilities on other sources with Synopsis.


Thursday, February 14, 2019

Jan 2019: What's new in Oracle Analytics Cloud 105.1.0


Oracle Analytics Cloud Release 105.1.0 (January 2019) offers several new administration features and enhancements to improve your overall product lifecycle experience.

New Features

Admin Related : 
  • Snapshot Enhancements: users now have fine control over the type of content to include in a Snapshot, and the scope of this content has widely increased since the last release. While creating a Snapshot, the user either chooses 'Everything', which backs up the entire environment, or uses the 'Custom' option to selectively pick specific content to back up or migrate. File-based datasets, custom visualization plug-ins and extensions are available as options to select. Similar options are available while restoring Snapshots: users can choose to restore all the contents of the Snapshot or selectively restore only a few types of objects. These enhancements improve the overall backup, restore, and migration experience from environment to environment.
  • Data file migration utility: This utility complements the Snapshot process for cases when connection between the source and target environments may not have access to the same back-end Cloud infrastructure. Such situations can interrupt migration of some file-based data sets included in the snapshots. This utility provides an alternative way to directly migrate data files from one environment to another in these cases.
  • Configure System Settings and Restart: OAC Service Administration now offers an option within the Console screen to edit the configuration of the environment server. Clicking on the 'Configure System Settings' tile in the Console launches the OAC Environment Manager, which lets you override several system settings such as
    • ‘Allow HTML content’,
    • ‘Currency preferences’,
    • ‘Prompt autocomplete’ options,
    • ‘Timezone settings’,
    • ‘Default scrolling behavior’,
    • ‘Evaluate support level’,
    • etc….
Once settings are edited, services can be restarted with a click from the same interface so the changes take effect. The respective services that need to be restarted (OBI Server, OBI Presentation Server...) are automatically identified and selected in the Restart pop-up. The user only needs to click OK to restart those services.
  • Catalog Manager: the Catalog Manager utility is now available as part of the OAC client install and connects to online OAC services. Using it, administrators can connect to an OAC instance and directly manage the Web Catalog: edit permissions on catalog objects, move objects from one folder to another, create a report on catalog objects and save it locally, etc.
  • DSS/Data Prep Public REST APIs: several Data Set Service REST endpoints can be called from the UNIX curl command, POSTMAN, the Swagger UI console, etc. Using these APIs, developers can perform operations such as listing all connections on an OAC environment, creating/updating/deleting connections, replacing/deleting datasets, creating/updating/deleting data flows, etc. These APIs open several powerful capabilities to the developer community for leveraging OAC.
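As a hypothetical sketch, a call to such an endpoint can be prepared from Python as below. The base URL, path and authentication scheme are placeholders, not the documented OAC paths; consult the OAC REST API reference for the exact URLs and payloads of your instance.

```python
# Hypothetical sketch of preparing an authenticated REST call from Python.
# The host and "/api/connections" path are placeholders for illustration only.
import base64
from urllib import request

def build_get_request(base_url: str, path: str, user: str, password: str):
    """Build an authenticated GET request (basic auth) without sending it."""
    req = request.Request(f"{base_url}{path}", method="GET")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

# e.g. listing connections (placeholder path and credentials):
req = build_get_request("https://myoac.example.com", "/api/connections",
                        "admin_user", "secret")
print(req.full_url, req.get_method())
```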
Data Source & Data Visualization Enhancements:
  • Oracle Autonomous Transaction Processing: a data source connector to Oracle Autonomous Transaction Processing is now available. Connecting to both Oracle Autonomous Transaction Processing and Oracle Autonomous Data Warehouse Cloud is now easier, as the Create Connection screen directly accepts the zipped wallet file for credentials. Simply drag and drop the zip file and all the back-end details are automatically identified for the connection: the host server name and port number are filled in automatically, and the list of service names is available for selection in a drop-down, eliminating the previous manual entry process.
  • New Table Viz properties: new properties have been introduced in the Table visualization.
    • Suppress Repeating Values: can be toggled on/off (default off) and controls whether values are repeated in a Table visualization.
    • Show Duplicate Rows: can be toggled on/off (default off). When set to Off, the metric values displayed in the Table are aggregated using the metric's aggregation rule. When set to On, the Table displays the granular level of the dataset without applying any aggregation to the metric.

Pixel Perfect Reporting (BI Publisher) Enhancements:
  • POV (Point Of View) parameter in an MDX query: POV as a parameter was introduced in the previous release and is improved in the 105.1 release. Users can now search a dimension in a cube and include a POV parameter value in an MDX query. As with a SQL query, including a parameter in an MDX query automatically creates a List of Values and a parameter prompt in the data model. Users will be prompted to complete the POV any time they consume or design a report using this datasource.
  • Snapshots include pixel-perfect reporting: Snapshot files now include BIP-related objects: credentials, configurations, and scheduled jobs of pixel-perfect reporting. This makes it easy to migrate content from one environment to another using snapshots.

Friday, January 11, 2019

Modify System Settings in Oracle Analytics Cloud



Have you ever felt the need to modify the analytics system settings in an easier way than getting into the OS file system and editing files? The latest update of Oracle Analytics Cloud offers a comprehensive capability to configure system settings easily through a user interface, where users can alter configuration and restart services with the click of a button.
To access this new capability, from the DV home page, click on the burger icon and navigate to Console. Under Console, navigate to Service Administration. Clicking on Configure System Settings opens a new page where the parameters are available for modification. Below is a screenshot of the new screen.



To edit a parameter, double-click on its value; a text box appears where you can type the appropriate value. As soon as the value is changed, a callout appears next to the field indicating that the value has been updated and a restart is required. The Restart button in the top right corner can be used to bounce the services for the parameter to take effect.
Below is a digest of some of the configuration parameters that can be modified via this interface in the Jan 2019 update of OAC.

Default Scrolling Enabled


This parameter specifies the data view for tabular, pivot and trellis views in BI dashboards. Setting it to true makes reports show the output with the Data View as 'Fixed headers with scrolling content'.

If the parameter is set to false, reports show the Data View as 'Content Paging'.

Dashboard Prompt Parameters

Show Null Value when column is Nullable

Have you ever been troubled by the ever-present NULL in your dashboard prompts? Have you ever wondered how to get rid of that NULL value from the prompt? Well, here is the answer. Set the parameter to one of these values to get the corresponding behavior:
always — Always shows the term "NULL" above the column separator in the drop-down list.
never — Never shows the term "NULL" in the drop-down list.
asDataValue — Displays NULL as a data value in the drop-down list, not as the term "NULL" above the separator.

Dashboard Prompt Auto Complete.

Have you ever wished your dashboard could work like the Google search engine, providing auto-complete suggestions? It can: just set the parameter “Support Auto Complete” to true.
If you set Support Auto Complete to true, the “Prompt Auto Complete” option appears under My Account. The same option appears under Dashboard settings as well.

                       

In addition to auto-complete, here are 2 more parameters you can play with for more options.

Case Insensitive Auto Complete: (Default: True)

Specifies whether the auto-complete functionality is case-insensitive. If set to true, case is not considered when a user enters a prompt value such as "Oracle" or "oracle." If set to false, case is considered when a user enters a prompt value, so the user must enter "Oracle" and not "oracle" to find the Oracle record. The system recommends the value with the proper case.

Matching Level: (Default: MatchAll)

Specifies whether the auto-complete functionality uses matching to find the prompt value that the user enters in the prompt field. These settings do not apply when the user accesses the Search dialog to locate and specify a prompt value.
                StartsWith — Searches for a match that begins with the text that the user types
                WordStartsWith — Searches for a match at the beginning of a word or group of words
                MatchAll — Searches for any match within the word or words.
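The three matching levels (and the case-insensitivity option above) can be mimicked in a few lines. This is a sketch of the documented behavior, not the product's actual code:

```python
# Sketch of the three Matching Level behaviors for prompt auto-complete,
# with the Case Insensitive Auto Complete option folded in. Illustrative only.

def matches(value: str, typed: str, level: str = "MatchAll",
            case_insensitive: bool = True) -> bool:
    v, t = (value.lower(), typed.lower()) if case_insensitive else (value, typed)
    if level == "StartsWith":
        return v.startswith(t)           # match at the start of the value
    if level == "WordStartsWith":
        return any(w.startswith(t) for w in v.split())  # start of any word
    return t in v                        # MatchAll: any occurrence

products = ["Stereo Kit", "Home Stereo", "Plasma TV", "Mast Stand"]
print([p for p in products if matches(p, "St", "StartsWith")])
print([p for p in products if matches(p, "St", "WordStartsWith")])
print([p for p in products if matches(p, "St", "MatchAll")])
```

With the typed text "St", StartsWith keeps only values beginning with it, WordStartsWith also keeps "Home Stereo", and MatchAll additionally keeps "Mast Stand" because "st" occurs inside "Mast".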

In the screenshot below, you can see that I have a Product Name prompt; I typed “St” and it is matched against every occurrence in the prompt field.


Time zone parameters.


Have you ever wanted the date-time columns in your report displayed in your preferred time zone? The parameters below control the display of time zones.

Default Data Offset Time zone (Default: None):

The time zone offset of the original data. To enable the time zone to be converted so that users see the appropriate zone, you must set the value of this element or variable. If you do not set this option, no time zone conversion occurs because the value is "unknown". The value is an offset indicating the distance from GMT.
For example: "GMT-05:00" or "-300" (minutes), both of which mean minus 5 hours.
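A small sketch of what that offset means in practice. This is only an illustration of the arithmetic; OAC performs the conversion internally:

```python
# "-300" is an offset in minutes, equivalent to "GMT-05:00": it shifts a GMT
# timestamp back by 5 hours. Illustrative sketch of the arithmetic only.
from datetime import datetime, timedelta, timezone

def apply_offset_minutes(ts_gmt: datetime, offset_minutes: int) -> datetime:
    """Shift a GMT timestamp into the zone at the given offset in minutes."""
    return ts_gmt.astimezone(timezone(timedelta(minutes=offset_minutes)))

t = datetime(2019, 1, 11, 15, 0, tzinfo=timezone.utc)
shifted = apply_offset_minutes(t, -300)
print(shifted.isoformat())  # 2019-01-11T10:00:00-05:00
```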

Default User Preferred Time zone:

Specifies the users' default preferred time zone before they select their own in the My Account dialog. Setting the value here affects the default time zone under My Account in BI Answers and dashboards.
For example: (GMT +02:00) Cairo


Setting Currency preferences


With the new user interface, currency settings can be managed in Oracle Analytics Cloud by uploading currencies.xml and userPrefcurrencies.xml, which previously had to be saved in the file system. Double-click the property value of the respective parameter to enter edit mode.
Upload the appropriate XML files for currencies.xml and user currency preferences.xml to obtain the desired currency behavior.
In OAC, currency settings work the same way as in on-premises OBIEE. Please refer to the documentation for more details.

