Wednesday, April 22, 2020

How to retrieve valuable information related to your ML models in Oracle Database - Part I


Oracle Database offers powerful Machine Learning algorithms that enable data analysts to discover hidden patterns and insights. With Oracle Machine Learning (OML), you can build and apply predictive models inside the database that help predict customer behavior, identify cross-selling opportunities, build customer profiles and detect anomalies. These algorithms are implemented as SQL functions that can be directly invoked inside the database without moving the data and thereby leveraging the inherent strengths of the database.

OML offers a broad range of in-database algorithms to solve different kinds of business problems. For classification type of problem, there are algorithms like Naïve Bayes, Decision Tree, Support Vector Machine etc. For clustering , there is Enhanced K Means, O Clusters and so on.
For every ML model that is created in the database, different kinds of information related to the model are stored in several specific DB Tables and Views within the database.

In this blog, let’s create a simple OML model and explore the various mining model views that are created with the model in the database and how these can be leveraged to get valuable information about the model itself.

Sample Problem 

Let’s assume we are given a demographic data set about a set of customers and we would like to predict the customer response to an affinity card program using classification function based on Decision Tree algorithm. This dataset contains 3000 records with customer attributes like cust_gender, cust_marital_status, education, occupation, household_size and so on. Grain of this dataset is cust_id so every record has a unique cust_id.


Create an ML Model

Let’s create a classification model using Decision Tree (DT) to address this problem. The database offers several options to enhance the overall accuracy of a model like how to handle missing value for predictors, outlier treatment for predictors, method for binning high cardinality data and so on. For the purpose of this blog,  let us use the basic minimum steps to build a DT model.
Default mining algorithm used to build a classification model is Naïve Bayes. In order to build a Decision Tree classification model, we need to first create and populate a settings table and use this settings table as input to the model building procedure.
- Sample Syntax to create/populate model settings table
CREATE TABLE dt_sh_sample_settings
(setting_name  VARCHAR2(30),
setting_value VARCHAR2(4000));

                  INSERT INTO dt_sh_sample_settings VALUES
    (dbms_data_mining.algo_name, dbms_data_mining.algo_decision_tree);

- Sample Syntax to create a Classification model using Decision Tree algorithm.
                BEGIN
                DBMS_DATA_MINING.CREATE_MODEL(
model_name          => 'DT_TEST,
mining_function     => dbms_data_mining.classification,
data_table_name     => 'mining_data_build_v',
case_id_column_name => 'cust_id',
target_column_name  => 'affinity_card',
settings_table_name => 'dt_sh_sample_settings');
END;

Query model related information

Every time an ML model is built, there are several details about the model that are stored in a few data mining related views. Following section describes the important mining views in the database that can be leveraged to understand the model details. The information from these views can be easily visualized using Oracle Analytics Cloud.

a)       ALL_MINING_MODELS – This view in the database describes all the mining models accessible to the current user. It has details about the model owner, model_name, mining_function used in the model (Classification, Regression, Clustering and so on) , algorithm used to build the model (Naïve Bayes, Decision Tree and so on) , model creation date , model size and so on.  This view has one record for every OML model created.
There are two related views: 
             i) DBA_MINING_MODELS that describes all the mining models in the database.
             ii) USER_MINING_MODELS  that describes the models owned by the current user.

Information in this view can be visualized within Oracle Analytics Cloud by creating a dataset sourcing from this view.
Important Note: When you navigate to the schema in the data set creation screen, these data mining views are not listed as objects. However, you can enter a custom SQL similar to ‘Select * from ALL_MINING_MODELS’ to retrieve the model details.


Here's an example of the output of the ALL_MINING_MODELS view related to the model called DT_TEST.


b)       ALL_MINING_MODEL_ATTRIBUTES – This is an important view and contains information about the columns in the training data that were used to build the model. For each attribute, details about the attribute name, datatype, whether attribute is Target etc is available in this view. When this model is applied on a data set for scoring, columns with the same names and data types should be available in the scoring data set. In this case as well, there are two related views DBA_MINING_MODEL_ATTRIBUTES and USER_MINING_MODEL_ATTRIBUTES. 

Here's a sample output of this view for a model called DT_TEST as visualized within OAC. 




c)     ALL_MINING_MODEL_SETTINGS – This view describes the settings of the mining model. Here's a example of the model settings for a model called DT_TEST visualized using OAC.




d)       ALL_MINING_MODEL_VIEWS - This is another important view that provides slightly different kind of information. Every time an ML model is created in the database, there are several underlying model views that get created relevant to this model. The number and kind of these model views created differ from algorithm to another. An ML model built using Naive Bayes algorithm will have a set of related model views. An ML model built using Decision Tree will have another set of related model views created and so on. For example, consider a Decision Tree model called DT_TEST. Following are the model views created :              


Note that the view names have a fixed format. It begins with DM$V followed by an alphabet that is specific to the model view followed by the model name. DM$V<Alphabet><Model Name>. 
We will refer to the details on these specific model views for various mining algorithms in an upcoming Part II of blog entry on this topic. 

Conclusion

For every ML model that is created in the Oracle Database, there are a lot of metadata about the model available as part of the dictionary views. If you are interested in knowing more about this topic, kindly refer to Oracle Documentation, or stay tuned for the second part of this blog entry coming soon.

Wednesday, April 15, 2020

How to Access OAC and OAS Repositories Online


In this blog we will we will go through accessing Oracle Analytics OAC and OAS Repositories (RPD) in a live mode directly on the active server (Cloud or On Prem). Accessing RPD online gives Admin users the ability to directly make changes to live Oracle Analytics instance deployments with zero downtime.

Note :
Accessing RPD online is a task that cab modify live metadata in the OA semantic layer, it must be done with extreme caution and exact understanding of what's going on. Keeping an offline back-up version of the RPD before modifying online is highly recommended, to allow safe returning to initial state.

Accessing your OAC instance live...

 

First, download and install your the OAC Admintool from this location
https://www.oracle.com/solutions/business-analytics/analytics-server/analytics-server.html
https://www.oracle.com/middleware/technologies/oac-tools-downloads.html


Once installed locally, open OAC Admintool and lets begin accessing your online RPD from an OAC instance

             a. click on the 'open in cloud' icon shown below
             b. enter your federated user information in the username/password used to access OAC
               example pete.monteiro@oracle.com/mypassword)
             c. in the 'cloud' field enter bootstrap
             d. for hostname enter your FQDN- this is the URL used to access OAC
             for example: oac123456-oacppacm12345.uscom-central-1.oraclecloud.com
             e. for port number leave the default 443
             f. click open


You now have access to your online OAC RPD to update and make changes

Accessing your OAS instance live...

 

For Oracle Analytics Server download version 5.5 from the following site
https://www.oracle.com/middleware/technologies/oac-tools-downloads.html


Unlike with OAC where we use the 'open in cloud' option, for OAS we will use the 'open online'   option.  To start we need to identify the bi server port of your OAS deployment by logging into the WLS EM, for example http://YOURIP:9500/em

     a. the BI server port in OAS is 9514

   

   



Next lets create a 64bit ODBC connection to our OAS deployment

  


 a. click on System DSN and then select your datasource.  Note this data source is created when you   install the OAC/OAS admin tool

 b. click finish

  

c. Enter in the details below as shown below.
  •   provide a name and description
  •   for server name enter the IP address of your OAS instance
  •   click next....

 

 d. Enter in the OAS admin username/password as well as the port number in step 2 then click next then click finish.  Close the ODBC connection UI




Now we can launch your OAC/OAS admin tool and logon to your rdp
    a. click 'open online' as shown with the red arrow below
    b. select the connection you created and click open


 c. note, larger rpds will take longer to fully open.

 


Are you an Oracle Analytics customer or user?

We want to hear your story!
Please voice your experience and provide feedback with a quick product review for Oracle Analytics Cloud!