Oracle Cloud Infrastructure Finally Gets Data Science And ML Services

This article is more than 4 years old.

Oracle recently announced the availability of Cloud Data Science Platform, a comprehensive collection of services including a data catalog, machine learning, Cloudera Hadoop and Apache Spark distributions on Oracle Cloud Infrastructure (OCI). 

According to Oracle, the key differentiating factors for its data science platform are its team collaboration features and tight integration with a variety of data sources available in OCI. 

Oracle Cloud Data Science Platform supports Jupyter notebooks for building and deploying machine learning models. Developers and data scientists can create a notebook session, which is a development environment launched on top of virtual machines and block storage. A session created in a Virtual Cloud Network (VCN) with appropriate permissions can access data from Autonomous Data Warehouse (ADW). Data scientists can use familiar libraries such as Pandas to fetch data from ADW for exploratory data analysis and visualization. This enables existing customers to bring predictive analytics closer to the data warehouse, which is the repository for historical data. 

Machine learning algorithms are tightly integrated with Oracle Autonomous Database with new support for Python and AutoML. When Oracle Cloud Infrastructure Data Science is fully integrated with the Autonomous Database, it will enable data scientists to develop models using both open source and scalable in-database algorithms. 

Oracle has also added AutoML capabilities to the data science platform. The service automates algorithm selection and tuning by running tests against multiple algorithms and hyperparameter configurations. Automated predictive feature selection simplifies feature engineering by identifying key predictive features in larger datasets. Customers can then choose the best model based on a comprehensive evaluation process that uses multiple evaluation metrics. 
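
To make the idea of automated algorithm selection and tuning concrete, here is a minimal sketch in plain Python. It is not Oracle's AutoML implementation or API. The toy dataset, the candidate models (k-nearest neighbors with several values of k, plus a nearest-centroid classifier), and every function name are assumptions chosen purely for illustration. The sketch scores each candidate with cross-validated accuracy and picks the best one.

```python
import random
from collections import Counter
from statistics import mean


def make_data(n=120, seed=0):
    """Toy two-class dataset: two Gaussian blobs in 2-D (illustrative only)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.5
        rows.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    rng.shuffle(rows)
    return rows


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def knn_factory(k):
    """Return a fit function for k-nearest neighbors; k is the hyperparameter."""
    def fit(train):
        def predict(x):
            nearest = sorted(train, key=lambda r: sq_dist(r[0], x))[:k]
            return Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
        return predict
    return fit


def centroid_fit(train):
    """Fit a nearest-centroid classifier: one mean point per class."""
    centroids = {}
    for lbl in {l for _, l in train}:
        pts = [p for p, l in train if l == lbl]
        centroids[lbl] = (mean(p[0] for p in pts), mean(p[1] for p in pts))

    def predict(x):
        return min(centroids, key=lambda l: sq_dist(centroids[l], x))
    return predict


def cross_val_accuracy(fit, data, folds=5):
    """Mean accuracy over interleaved folds (row i goes to fold i % folds)."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [r for i, r in enumerate(data) if i % folds != f]
        predict = fit(train)
        scores.append(mean(1.0 if predict(x) == y else 0.0 for x, y in test))
    return mean(scores)


def auto_select(data):
    """Evaluate every candidate configuration and return the best by accuracy."""
    candidates = {f"knn(k={k})": knn_factory(k) for k in (1, 3, 5, 9)}
    candidates["nearest_centroid"] = centroid_fit
    results = {name: cross_val_accuracy(fit, data) for name, fit in candidates.items()}
    best = max(results, key=results.get)
    return best, results


if __name__ == "__main__":
    best, results = auto_select(make_data())
    for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
        print(f"{name:18s} accuracy={acc:.3f}")
    print("selected:", best)
```

A production service would presumably search a far larger space and weigh several evaluation metrics rather than accuracy alone, but the loop of "try each configuration, score it on held-out data, keep the winner" is the core idea.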

Oracle is emphasizing the explainability of the models. Oracle Cloud Infrastructure Data Science provides an automated explanation of the relative weighting and importance of the factors that go into generating a prediction. Oracle claims that its platform offers the industry’s first commercial implementation of model-agnostic explanation. For example, a data scientist working with a fraud detection model can explain which factors are the biggest drivers of fraud so the business can modify processes or implement safeguards.
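
The article does not say which explanation technique Oracle uses. As an illustration of what "model-agnostic" means, the sketch below uses permutation importance, a widely known technique that treats the model as a black box: shuffle one input column at a time and measure how much accuracy drops. The synthetic fraud data, the feature names (amount, hour, noise), and the stand-in model are all hypothetical.

```python
import random


def make_transactions(n=400, seed=1):
    """Synthetic data: fraud depends strongly on amount, weakly on hour, not on noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        amount = rng.uniform(0, 1)
        hour = rng.uniform(0, 1)
        noise = rng.uniform(0, 1)
        X.append([amount, hour, noise])
        y.append(1 if 3.0 * amount + 0.5 * hour > 1.75 else 0)
    return X, y


def black_box_model(row):
    """Stand-in for any trained model; the explainer only calls it, never inspects it."""
    return 1 if 3.0 * row[0] + 0.5 * row[1] > 1.75 else 0


def accuracy(predict, X, y):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(r) == t for r, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average accuracy drop when each feature column is shuffled independently."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [r[j] for r in X]
            rng.shuffle(col)
            # Rebuild each row with only column j replaced by a shuffled value.
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


if __name__ == "__main__":
    X, y = make_transactions()
    names = ["amount", "hour", "noise"]
    for name, imp in zip(names, permutation_importance(black_box_model, X, y)):
        print(f"{name:8s} importance={imp:.3f}")
```

Because the explainer only needs a `predict` function, the same code works for any model, which is the property the "model-agnostic" label refers to. In this toy setup the amount feature should rank well above hour, and the noise feature should score zero because the model never reads it.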

The Oracle Cloud Data Science Platform also includes GPU-backed OCI VMs for Data Science; Oracle Big Data Service, based on a full Cloudera Hadoop implementation; a data catalog that allows users to discover, organize, enrich and trace data assets on Oracle Cloud; and Cloud SQL, which enables SQL queries on data in HDFS, Hive, Kafka, NoSQL and Object Storage.

Some of Oracle Cloud Infrastructure’s key competitors (AWS, Azure, GCP, and IBM Cloud) have offered ML PaaS services since 2016.

Since databases and data-driven workloads are key for Oracle, an integrated ML service helps customers build intelligent applications. 

Even though Oracle is late to the ML PaaS party, it has unique capabilities in the form of AutoML, tight integration with the data warehouse, model explainability, and comprehensive model management.
