Modules & Capabilities of Azure Machine Learning – Azure ML Part 03

Through the journey of getting familiar with Azure Machine Learning, cloud based machine learning platform of Microsoft, we discussed about the very first steps of getting started.
When you open up the online studio through your favorite web browser, you’ll directed to create a blank experiment. Let’s start with it.

start screen
Blank Experiment in Azure ML Studio

In your left hand side of the studio, you can see the pre-built modules that you can use to develop your experiments. If they are not enough for your case, you can use R or Python scripts in your experiment.
With Azure ML Studio, you get the ability to deploy models for almost all the machine learning problem types. The algorithms you can use for classification, regression and clustering are in the AML cheat sheet that you can download from here.(http://download.microsoft.com/download/A/6/1/A613E11E-8F9C-424A-B99D-65344785C288/microsoft-machine-learning-algorithm-cheat-sheet-v6.pdf)    machine-learning-algorithm-cheat-sheet-small_v_0_6-01

Will take a look into the sections that modules are categorize. If you want to find a specific module, what you have to do is search the experiment item from the search box.

Saved datasets – You can find out a set of sample datasets that you can use for experiments. Most of the popular machine learning related datasets like “iris dataset” are available here. If you want your own dataset in the studio, you can upload it to here.

Trained models – These are the models that you get as the output after training the data using an appropriate algorithm and methodology. They can be used for building another experiment or a web service later.

Data Format Conversions – The data comes in and going out from the experiment can be converted into a desired format using the modules in this section. If you wish to convert the output of your experiment to ARFF format (which supported in Weka) or to a CSV file you can use the modules here.

Data input & output – Azure ML has the ability to get data from various sources directly.  You can use an Azure SQL database, Azure BLOB storage or a hive query to get the data. Fetching data from a local SQL server is on preview yet (August 2016).

Data transformation – Data transformation tasks like normalization, clipping etc. can be done using the modules listed in this section. You can use SQL queries to do the data transformations if want.

Feature Selection – Appropriate feature selection increases the accuracy of your machine learning model drastically. There are three different methods as “Filter bases feature selection, Fisher linear discrimination and Permutation feature importance” that you can use according to your requirement.

Machine Learning – Within this section you can find out the modules built for training machine learning models, evaluate accuracy etc. Most of the popular machine learning algorithms used for classification, clustering and regression problems are listed down here as modules. The parameters of each module can be changed or use can you Tune Model Hyperparameters module to tune-up the experiment to get the optimal output.

OpenCV library Modules – ML is widely using in image recognition. In Azure ML there’s Predefined Cascade Image Classification that is trained to identify the images with front facing human faces.

Python language models – Python is one of the widely using languages in data mining and machine learning applications. With Azure ML studio you have the ability to execute your own python script using this module. 200+ common python libraries are supported with Azure ML right now.

R language models – Same as Python, R is one of the most favorite statistical languages among data scientists. You can use your favorite R scripts and train models with R using these modules. Most of the R packages are supported in Azure ML. If the package is not there you can import the packages for the experiment. (Unfortunately there are some limitations in this. Some R packages like RJava, openNLP are not supported yet with Azure ML – Aug.2016)

Statistical Functions – If you want to do some mathematical functions for the data or perform statistical operations, here you can find out the modules for that. A basic descriptive statistical analysis on the dataset also can be performed using the modules.

Text Analytics – Machine learning models can be used for text analytics. There are some modules included in Azure ML studio for text preprocessing (omit the stop words, punctuation marks, white spaces etc.), Named entity recognition (Pre trained module) and many more. Vawpal Wabbit learning system library is also included in the modules for the use.

Web service – One of the most notable advantages in Azure ML is the ability to deploy as a web service. Here’s the web service input and output modules that can be used for the built experiments.

Deprecated – Assigning data for clusters, binning, quantizing data, cleansing missing data can be done using these modules.

Building Azure ML experiments and deploying web applications using them are not that hard.

This is one of the best step by step guide for that task from MSDN.

In the coming posts will discuss on interesting applications in Azure ML hacks to build your predictive models.
Play with the tool and leave your experience as comments below.  🙂

  

Behind the Scene – Azure ML Part 02

OverviewOfAzureML_960With the power of cloud, we going to play with data now! 🙂

Machine Learning is a niche part of predictive analysis. Predictive analysis gets its power from the tools and techniques like mathematics, statistics, data mining, machine learning etc.… Predictive analysis doesn’t refer only predicting future events; real-time fraud credit card transaction detection also falls under a usage of predictive analysis.

Am not going to discuss the usages of machine learning and what you can do with machine learning methods. Let’s see what are the benefits that you getting by using Azure ML Studio for your analysis.

Fully managed scalable cloud service –You have to deal with thousands, mostly with millions of data records when you doing your analysis. The computation power of the local machine may not be sufficient for those kind of mammoth tasks. Get the use of Azure scalable & efficient cloud. It’ll make your predictions super-fast.

Ability to develop & deploy –Want to deploy an application that get intelligence with a ML backend? AzureML Studio is the best solution then. It provides you the ability to easily deploy a web service from your built ML model and use that in your application. REST will do the rest. J

Friendly user interface for data science workflow –I’m pretty sure dragging and dropping is your ‘thing’ right? So AML Studio suits for you! D from data loading to deployment of the web service, you get a friendly UI where mostly you can just drag and srop the modules into the workspace without bothering about their underlying complex algorithms.

Wide range of ML algorithms inbuilt –No need to start from the scratch. There are plenty of ML algorithms pre built as models in AML Studio. You can use them right away for building models.

R & Python integration –For data scientists, R and Python are like life blood. IF you wish to do intergrade your own scripts in the model, with AML Studio you have the chance here. You can choose either R/python or the both. AML Studio takes care of it.

Support for R libraries –R language has its vibrant user community and the rich set of libraries. With AML studio you get the access for most of the R libraries and you can add more libraries if want too.

1602.image_3FBAEFDE

Azure Machine Learning Process

Let’s go with the process. All starts with defining the objective. Before jumping into the problem, you should have a clear idea on what you going to do. Whether it’s a classification, linear regression, recommendation… you should be able to figure out it by skimming through the data sources and the problem definition.

 

Then the Data! Data maybe a set of sales data in your enterprise cloud or in your local storage. Identify the relevant data fields and components that you want for building up the model. If dataset exceeds 10GB, it’s better to store the data in Azure SQL database first and get the data through the ‘Import Data’ module. You can use HDInsight stored data using Hive queries too.

Pay attention on the data quality. Normally real world data is noisy, full of outliers, error values, missing values etc. So data preprocessing should be done first.  Make sure the data fields are in the appropriate type (Numerical, categorical, etc.) In Azure ML there are plenty of modules that you can perform data preprocessing tasks.

Model Development! Here’s the fun part. You can use ML algorithms comes with studio or you can go with your own scripts in R or python here. If you familiar with ML model development platforms like Weka, RapidMiner, Orange you will find out this is, it is not so different. You have to put the right module at the right place. Have to use right algorithm to take the right decision.

After developing the model, normally we should train the models. For that you can use the past data that you have. You must always keep a portion from your dataset for testing the model too.

Is it over after training the model? No. Many more in the process. You should score and evaluate the model you built. It is useless if the predictions you making with the model you built is having a high error rate. You may haven’t use the appropriate algorithm or you may haven’t use the correct and optimal parameters. So using the ‘score model’ and ‘evaluate model’ you can compare different algorithms for the particular task and pick the best one out from them.

It’s obvious that ML algorithms are not 100% accurate always. But the model you building should have an accuracy more than a wild guessing.

After building your predicting magic box, you can publish it as a web service. This allows you to consume it either by a custom application, Microsoft Excel or similar tool.

For more accuracy, normally this process goes in an iterative manner.

Finishing up the theories and let’s get our hands dirty with our experiments!

Simply there are 3 steps to start working with Azure ML

  1. Navigate to AzureML and choose your subscription plan
  2. Create a Machine Learning workspace in Azure Portal
  3. Sign in to ML Studio

Step 01 –Go to http://www.azure.com and products -> Analytics -> Machine Learning

cap1You can use AzureML absolutely for free. But if you want to deploy a web service and play with serious tasks have to go for an appropriate subscription. If you have a MSDN subscription, you can use it here 🙂

cap2

Azure ML subscriptions

Step 02 –You need an Azure account here. If you don’t have one go for the 3-month free trial.

cap3In the portal go for new -> data + analytics -> Machine Learning

From there you can create your workspace to do the machine learning tasks.

Step 03 –Sign in to the Azure ML Studio from https://studio.azureml.net

cap4Now you are there! Click on the new -> Blank experiment!

We are ready to start the now.

The GUI of the AML Studio is pretty clear and easy to understand. Try to find out the way to upload the datasets and the modules that contains the ML algorithms from the pane in the left hand side.

Will explore some cool capabilities of Azure ML in the coming posts. Here’s a video for your motivation.

Part 01

Let’s Jump In! – Azure ML Part 01

ImageArtScience5

“In the world of intelligent applications, data will be the king!”. Despite of way they making the revenue, data has become the main asset of each company. Sales and distribution data, customer data repos, employee records, all sort of structured and unstructured data have become the life blood of the company’s business process because it is vital to get the accurate and relevant data to get the correct business decisions and do relevant business related predations.

Digital data and cloud storage follow Moore’s law: the world’s data doubles every two years, while the cost of storing that data declines at roughly the same rate.

pic67f7d1878ee27c87a401e8948934f751

This abundance of large amounts of data enables more features and tasks, and better machine learning models and methodologies should to be created for predictive analytics.

When the data is widely available in the cloud, and when it needs large computation power and infrastructure to process and analyze data repositories, the best move is the cloud!

Machine learning (ML) is starting to move to the cloud, where a scalable web service is an API call away. Data scientists will no longer need to manage infrastructure or implement custom code. The systems will scale for them, generating new models on the fly, and delivering faster, more accurate results.

What is Machine Learning?

Simply, machine learning is teaching the silicon chips to think! 😀 If we use the general definition: “Machine learning is the systematic study of algorithms and systems that improve their knowledge or performance with experience”

When you going through the theories behind machine learning you may find it is closely related to computational statistics, where you use computers in prediction making.  Machine learning comes out with range of computing tasks to solve problems where designing and programming explicit algorithms is unfeasible.

All of these things mean it’s possible to quickly and automatically produce models that can analyze bigger, more complex data and deliver faster, more accurate results – even on a very large scale. The result? High-value predictions that can guide better decisions and smart actions in real time without human intervention.

Where the hell ML is used?

Did you notice that eBay is pushing you to buy a protective glass after you buying a fancy phone case for your iPhone? Netflix is suggesting movies for you? Siri or Cortana speech recognition? All these tiny miracles have been possible with the power of machine learning. Spam filtering you emails, speech recognition, recommender systems in electronic commerce are some famous applications of machine learning.

So… How we going to do?

If you google or do a Bing search on machine learning, you’ll find out hundreds of ways of applying machine learning techniques in practical applications and tools that we can use to create machine learning models.

Screen-Shot-2016-06-08-at-3.35.53-PM-1024x730

Here’s a glimpse of Intelligent App Stack

With my post series, mainly am going to take you a journey with Azure Machine Learning Studio, which comes under the Cortana Intelligence Suite.

Why AzureML?

cortana-intel-suite-640x343

With advanced capabilities, free access, strong support for R, cloud hosting benefits, drag-and-drop development and many more features, Azure ML is ready to take the consumerization of ML to the next level.

It’s easy as ABC and powerful enough to handle petabytes of data with the power of Azure.

Theories??

Basics on computing and statistics will be useful to go forward. It’s fantastic if you have a rough idea about the machine learning algorithms, data pre preparation methods kind of stuff. Don’t worry. Here’s a book to read!  🙂

So will take the first step to Azure ML in the coming post.

Part 02