In this blog post we will discuss how to add related datasets for custom Machine Learning models in Oracle Analytics Cloud (OAC).
After a Machine Learning model is created, it is important to evaluate how well it performs before going ahead and applying it. Various accuracy metrics exist for this purpose: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Relative Absolute Error (RAE), and residuals for numeric prediction (regression) algorithms, and False Positive Rate (FPR), False Negative Rate, and the confusion matrix for classification algorithms. The Machine Learning feature in Oracle Analytics Cloud has built-in methods that compute most of these accuracy metrics and store them in related datasets. Related datasets are tables that contain information about the model, such as accuracy metrics and prediction rules. Our previous blog post, "Understanding the Performance of a Oracle DV Machine Learning models using related datasets feature", covers related datasets in depth.
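To make the metrics named above concrete, here is a minimal pure-Python sketch of how they are commonly defined. It is not the OAC implementation; the function names `regression_metrics` and `binary_error_rates` are hypothetical, and the sketch assumes non-degenerate input (actual values that are not all identical, and at least one positive and one negative row).

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE and RAE for two equal-length numeric sequences."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    abs_error = sum(abs(r) for r in residuals)
    mae = abs_error / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    # RAE compares the total absolute error with the error of a
    # baseline that always predicts the mean of the actual values.
    mean_actual = sum(actual) / n
    baseline_error = sum(abs(a - mean_actual) for a in actual)
    rae = abs_error / baseline_error
    return {"MAE": mae, "RMSE": rmse, "RAE": rae}


def binary_error_rates(actual, predicted, positive):
    """Return confusion-matrix counts, FPR and FNR for one positive label."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return {
        "TP": tp, "FN": fn, "FP": fp, "TN": tn,
        "FPR": fp / (fp + tn),  # share of negatives predicted as positive
        "FNR": fn / (fn + tp),  # share of positives predicted as negative
    }


print(regression_metrics([3, 5, 7, 9], [2, 5, 8, 11]))
print(binary_error_rates(["yes", "yes", "no", "no", "no"],
                         ["yes", "no", "no", "yes", "no"], "yes"))
```

Values like these are the kind of content a related dataset would hold, one row or column per metric.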
This post explains how to add such related datasets in custom Train model code. Oracle Analytics Cloud provides built-in methods for adding related datasets. The user has to define the structure of each dataset: the columns it should contain, the data type of each column, and the aggregation rule for each column (if it is numeric). Once the required related datasets are added, they can be found under the Related tab in the Model Inspect pane.
Let us discuss in detail how to add Related datasets for a model:
The ModelDataset() class, implemented in the Model module, is a generic class that represents related datasets. Pass it the name of the dataset you want to create, along with the dataframe that holds the data and a mappings dataframe that describes its columns; it returns a related dataset. The generic Model class has a built-in method called add_output_dataset() that adds the passed dataset as a related dataset for that model. The following lines of code show how to add a sample related dataset called "Predicted Results" using the df1 dataframe.
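# Assumes pandas is imported as pd, and that target, y_test, y_pred1,
# y_pred_prob1, clf1 and model are defined earlier in the train script.
# One row per test record: actual value, predicted value, and the
# probability the classifier assigned to the predicted class.
df1 = pd.DataFrame({
    target: y_test,
    "PredVal": y_pred1,
    "PredProb": [y_pred_prob1[i][list(clf1.classes_).index(y_pred1[i])] for i in range(len(y_pred1))]})
# Column names with their data types and aggregation rules.
df1_mappings = pd.DataFrame({
    'name': [target, 'PredVal', 'PredProb', 'Target', 'Model Name'],
    'datatype': ["varchar(100)", "varchar(100)", "double", "varchar(100)", "varchar(2000)"],
    'aggr_rule': ["none", "none", "avg", "none", "none"]})
# Register the dataframe as a related dataset of the model.
model.add_output_dataset(ModelDataset("Predicted Results", df1, df1_mappings))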
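The PredProb expression in the snippet above is dense, so here is a small standalone sketch of what it appears to compute: for each row, the probability assigned to the label that was actually predicted. It assumes that y_pred_prob1 holds one probability per class per row, in the same order as clf1.classes_ (as scikit-learn-style classifiers do); the post does not state this explicitly. The helper name `predicted_class_probability` and the sample data are hypothetical.

```python
def predicted_class_probability(classes, predictions, probabilities):
    """For each row, return the probability of the predicted label.

    classes       -- label order used for the probability columns
    predictions   -- predicted label for each row
    probabilities -- per-row list of probabilities, one per class
    """
    # Build the label -> column lookup once instead of calling
    # list.index() for every row.
    column_of = {label: i for i, label in enumerate(classes)}
    return [row[column_of[label]]
            for label, row in zip(predictions, probabilities)]


classes = ["no", "yes"]
predictions = ["yes", "no", "yes"]
probabilities = [[0.2, 0.8], [0.7, 0.3], [0.45, 0.55]]

pred_prob = predicted_class_probability(classes, predictions, probabilities)
print(pred_prob)  # [0.8, 0.7, 0.55]
```

The result is the same list the original comprehension would produce for this data; the dictionary lookup only avoids rescanning the class list on every row.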
The mappings dataframe maps each column name to its data type and aggregation rule. The framework uses some of these related datasets to populate the Quality tab in the Model Inspect page. More details on how to populate the Quality tab can be found in this blog: How to Populate Quality Tab.
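Because the mappings are three parallel lists, it is easy for them to drift out of alignment with each other or with the data columns. One possible way to guard against that, sketched below, is to describe each column as a single (name, datatype, aggr_rule) tuple and derive the lists from it. The helpers `build_mappings` and `missing_mappings` are hypothetical and not part of the OAC framework; the resulting dictionary could then be passed to pd.DataFrame() to obtain the mappings dataframe.

```python
def build_mappings(columns):
    """Turn (name, datatype, aggr_rule) tuples into parallel lists."""
    mappings = {"name": [], "datatype": [], "aggr_rule": []}
    for name, datatype, aggr_rule in columns:
        mappings["name"].append(name)
        mappings["datatype"].append(datatype)
        mappings["aggr_rule"].append(aggr_rule)
    return mappings


def missing_mappings(data_columns, mappings):
    """Return the data columns that have no entry in the mappings."""
    mapped = set(mappings["name"])
    return [col for col in data_columns if col not in mapped]


mappings = build_mappings([
    ("Actual", "varchar(100)", "none"),
    ("PredVal", "varchar(100)", "none"),
    ("PredProb", "double", "avg"),   # numeric column, so it gets a rule
])

# Check the mappings against the columns of the data to be registered.
unmapped = missing_mappings(["Actual", "PredVal", "PredProb"], mappings)
print(mappings)
print(unmapped)  # [] when every data column is described
```

Keeping one tuple per column means a data type or aggregation rule can never be attached to the wrong column name by miscounting list positions.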
Related Blogs: How to build Train/Apply custom model scripts in OAC, How to Populate Quality Tab, How to use inbuilt methods in OAC to Prepare data for Training/Applying ML Model