Unlocking the Power of Language with GPT-3

Yes, this is all about the ChatGPT hype. It's obvious that most of us are obsessed with it, spending a lot of time with that amazing tool and even using it as our regular search engine! (Is that a bye-bye, Google? 😀 )

I thought of discussing the underlying mechanics of ChatGPT, Large Language Models (LLMs), and the applicability of these giants in intelligent application development.

What actually is ChatGPT?

ChatGPT is a conversational AI model developed by OpenAI. It uses the GPT-3 architecture, which is based on Transformer neural networks. GPT-3 is a large language model with about 175 billion parameters. The model has been trained on a huge corpus of text data (about 45TB), most of it harvested from the public internet, to generate human-like responses to text inputs. ChatGPT can perform a variety of language tasks such as answering questions, generating text, translating languages, and more.

ChatGPT is only a single use case of a massive research effort. The underlying power is the neural network architecture GPT-3. Let's dig in step by step while discussing the following points.

What are Large Language Models (LLMs)?

LLMs are deep learning algorithms that can recognize, summarize, translate, predict and generate text and other content based on knowledge gained from massive datasets. As the name suggests, these language models are trained with massive amounts of textual data using unsupervised learning. (Yes, there’s no data labelling involved with this). BLOOM from Hugging Face, ESMFold from Meta AI, Gato by DeepMind, BERT from Google, MT-NLG from Nvidia & Microsoft, GPT-3 from OpenAI are some of the LLMs in the AI space.

Large language models are among the most successful applications of transformer models. They aren’t just for teaching machines human languages, but for understanding proteins, writing software code and much more.

What are Transformers?

The encoder-decoder structure of the Transformer architecture (taken from "Attention Is All You Need")

Transformers? Are we going to talk about Bumblebee here? Actually, no!

Transformers are a type of neural network architecture (similar to Convolutional Neural Networks, Recurrent Neural Networks etc.) designed for processing sequential data such as text, speech, or time-series data. They were introduced in the 2017 research paper "Attention is All You Need". Transformers use self-attention mechanisms to process the input sequence and compute a weighted sum of the features at each position, allowing the model to efficiently process sequences of varying length and capture long-range dependencies. They have been successful in many natural language processing tasks such as machine translation and have become a popular choice in recent years.
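
To make the idea of self-attention a bit more concrete, here is a toy NumPy sketch of the scaled dot-product attention described in that paper (illustration only; real transformers add learned projections, multiple heads and masking):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (sequence_length, d_k) matrices for one attention head
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                           # similarity between every pair of positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
    return weights @ V                                        # weighted sum of the value vectors

Each output position is a mixture of all input positions, which is how long-range dependencies get captured regardless of distance in the sequence.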

For a deep learning enthusiast this may sound similar to the RNN architecture, which is mostly used for learning sequential tasks. Unlike RNNs, transformers are capable of capturing long-term dependencies, which makes them so good at complex natural language processing tasks.

GPT stands for “Generative Pre-trained Transformer”. As the name implies it’s built with the blessing of transformers.

Alright… now GPT-3 is the hero here! So, what’s cool about GPT-3?

  • GPT-3 is one successful innovation among LLMs (it's not the only LLM in the world).
  • GPT-3 model itself has no knowledge; its strength lies in its ability to predict the subsequent word(s) in a sequence. It is not intended to store or recall factual information.
  • It’s a pretrained machine learning model. You cannot download or retrain the model since it’s massive! (fine-tuning with our own data is possible).
  • GPT-3 is accessed through a closed API, which requires an API key.
  • GPT-3 works best for English-language tasks.
  • A bit of a downside: the outputs can be biased and abusive, since it's learning from data fetched from the public internet.

If you are really interested in learning the science behind GPT-3, I would recommend taking a look at the paper: Language Models are Few-Shot Learners

What’s OpenAI?

OpenAI, a research organisation founded in 2015, is the creator of the GPT-3 architecture. GPT-3 is not the only interesting innovation from OpenAI. If you have seen AI-generated art created from a natural language phrase as the input, it's most probably from the DALL-E 2 neural network, which is also from OpenAI.

OpenAI provides a set of APIs which developers can easily adopt in their intelligent application development tasks.

Check the OpenAI APIs here: https://beta.openai.com/overview    

What can be the use cases of GPT-3?

We all know ChatGPT is ground-breaking. Our focus should be exploring the ways we can use its underlying architecture (GPT-3) in application development.

Since the beginning of deep neural networks, there has been a lot of research and innovation in the computer vision space. Networks like ResNet were ground-breaking and even surpassed human-level accuracy in tasks like image classification on the ImageNet dataset. We got the advantage of having pre-trained state-of-the-art networks for computer vision tasks without worrying about large training datasets.

LLMs like GPT-3 address the lack of such networks for natural language analysis tasks. Simply put, it's a massive pre-trained model that understands language.

There are many interesting use cases of GPT-3 as a language model, including but not limited to:

  • Dynamic chatbots for customer service use cases which provide more human-like interaction with users.
  • Intelligent document management by generating smart tagging/ paraphrasing, summarizing textual documents.
  • Content generation for websites, news articles, educational materials etc.
  • Advanced text classification tasks
  • Sentiment analysis
  • Semantic search capabilities which provide natural language query capability.
  • Text translation, keyword identification etc.
  • Programming code generation and code optimisation

Since GPT-3 can be fine-tuned with a given set of training data, the possibilities of its natural language understanding capability are limitless. You can be creative and come up with the next big idea that improves the productivity of your business.
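
As a quick taste of what building on GPT-3 looks like, here is a minimal sketch using the openai Python package (the API key, model name and prompt are placeholders, and the exact API surface may vary between SDK versions):

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder: obtain a key from the OpenAI dashboard

# Example: summarising a support ticket with a single completion call
response = openai.Completion.create(
    model="text-davinci-003",    # example GPT-3 family model name
    prompt="Summarize the following customer complaint in one sentence:\n\n"
           "The mobile app keeps logging me out every time I switch networks.",
    max_tokens=60,
    temperature=0.2,             # low temperature keeps the output focused
)

print(response["choices"][0]["text"].strip())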

What is Azure OpenAI?

Azure OpenAI is a collaboration between Microsoft’s Azure cloud platform and OpenAI, aimed at providing cloud-based access to OpenAI’s cutting-edge AI models and tools. The partnership provides a seamless platform for developers and organizations to build, deploy, and scale AI applications and services, leveraging the computing resources and technology of the Azure cloud.

Users can access the service through REST APIs, the Python SDK, or through the Azure OpenAI Service Studio, the web-based interface dedicated to OpenAI services.

In enterprise application development scenarios, using OpenAI services through Azure makes integration much easier.

Azure OpenAI reached general availability very recently, and I'm pretty sure there'll be vast improvements to the product in the coming days.

Let's keep our eyes open and start innovating on ways we can use this super-power wisely.

What’s Best for Me? – 5 Data Analytics Service Selection Scenarios Explained

With the extensive usage of cloud-based technologies to perform machine learning and data science related experiments, choosing the right toolset/platform to perform the operations is a key part of project success.

Since selecting the perfect toolset for our ML workloads may be a bit tricky, I thought of sharing my thoughts on a couple of generic use cases. Please keep in mind that the use cases I have chosen and the decisions I'm suggesting are totally my own view on the scenarios, and this may differ based on the factors (amount of data, time frame, allocated budget, abilities of the developers etc.) you have in your project. Plus, the suggestions I'm pointing out here are based on the services that come with the Microsoft Azure cloud; they may be easily adapted for other cloud providers too.

Scenario 1:

We are a medium-scale micro-financing company with our data stored on Microsoft Azure. We have a plan to build a datalake and use it for analytical and reporting tasks. We have a diverse data team with abilities in Python, Scala and SQL (most of the data engineers are only familiar with SQL). We need to build a couple of machine learning models for predictions. What would be the best platform to go forward with: Azure Databricks or Azure ML Studio?

Suggestion: Azure Databricks

Reasons:

  • Databricks is more flexible in ETL and datalake-related data operations compared to AzureML Studio.
  • You can perform data curation and machine learning within a single product with Azure Databricks.
  • Databricks can connect with Azure Data Factory pipelines to handle data flow and data curation tasks within the datalake.
  • Since the data engineers are more familiar with SQL, they’ll easily adapt with SparkSQL on Databricks.
  • Data team can develop their machine learning experiments with any language of their choice with Databricks notebooks.
  • Databricks notebooks can be used for analytical and reporting tasks, even in combination with Power BI.

Scenario 2:

I'm a computer science undergrad. I'm doing a software project to identify several types of wildflowers from images captured with a mobile phone. I'm planning to build my computer vision model using TensorFlow and Keras and expose the service as a REST API. Since I don't have the infrastructure to train the ML models, I'm planning to use Azure for that. Which tool on Azure should I choose?

Suggestion: Azure ML Studio

Reasons:

  • AzureML provides a complete toolset to train, test and deploy a deep learning model using any open-source framework of your choice.
  • You can use the GPU training clusters on AzureML to train your models.
  • It’s easy to log your model training and experiments using AzureML python SDK.
  • AzureML gives you the ability for model management and exposing the trained model as a REST API.
  • Small learning curve and adaptability.

Scenario 3:

I'm the CEO of a retail company. I don't have vast experience with computing or programming, but I have a background in maths and statistics. I have a plan to use machine learning to perform predictive analysis with the data my company currently has. Most of the data is still in Excel! Someone suggested I use Azure. What product on Azure should I choose?

Suggestion: Azure ML Studio

Reasons:

  • For a beginner in machine learning and data science, Azure ML Studio is a good start.
  • AzureML Studio provides no-code environments (Azure ML designer and AutoML) to develop ML models.   
  • Since you are mostly at the experimental stage and not using big datasets, Databricks would be overkill.
  • You can easily import your existing data and start experimenting and playing around with it without any local environment setup.

Scenario 4:

I'm the IT manager of a large enterprise that relies heavily on data assets in its decision-making process. We have to run iterative jobs daily to retrieve data from different external sources and internal systems. Currently we have an on-prem SQL database acting as the data warehouse. The company has decided to move to the cloud. Can Azure serve our needs?

Suggestion: Yes. Azure can serve your needs with different tools in the data & AI domain.

Reasons:

  • You can use Azure Synapse Analytics or Azure Data Factory to build data pipelines and perform ETL operations.
  • The local data warehouse can be easily migrated to Azure cloud.
  • You can use Azure Databricks in order to perform analytics tasks.
  • Since the enterprise is large and scaling, Databricks would be a better fit with its Spark-based computation abilities.

Scenario 5:

We are an agricultural company moving forward by adopting modern agri-tech into the business. We collect numerous data values from our plantations and store them in our cloud databases. We have a set of data scientists working on data modelling and building predictive models related to crop fertilizing and harvesting. They are currently using their own laptops to perform analysis, and it's troublesome with the data volume, platform configurations and security. Will Azure ML come in handy in our case?

Suggestion: Yes. Azure ML Studio would be a good choice.

Reasons:

  • AzureML can easily be adopted as an analytical platform.
  • The cloud databases can be connected to AzureML, and data scientists can straight-up start working on the data assets.
  • AzureML is relatively cheap compared to Databricks (given the data volume is manageable on a single computer).
  • It’s easy to perform prototyping of models using AutoML/ AzureML Designer and then implement the models within a short time frame.  

Generally, these are the factors I would keep in mind when selecting the services for ML/ data related implementations on Azure.

  • Azure ML Studio is good when you are training with limited data; though AzureML provides training clusters, the data distribution among the nodes has to be handled in your code.
  • AzureML Studio comes in handy for prototyping with AzureML Designer and Automated ML.
  • Azure Databricks, with its RDDs, is designed to handle data distributed across multiple nodes, which is advantageous when you have big datasets.
  • When your data is small enough to fit on a scaled-up single machine, or you are working with a pandas dataframe, using Azure Databricks is overkill.
  • Services like Azure Data Factory and Data Lake Storage can be easily interconnected for building end-to-end data pipelines.

Let me know your thoughts on these scenarios as well. Add your queries in the comments too. I’ll try my best to provide my suggestions for those use cases.

Different Approaches to Perform Machine Learning Experiments on Azure

We have discussed a lot about Azure Machine Learning Studio, the one-stop portal for all ML-related workloads on the Azure cloud. AzureML Studio provides different approaches to working on machine learning experiments based on the needs, resources and constraints you have. Selecting the plan of attack is completely your choice.

We all have our own way of performing machine learning experiments. While some prefer working on Jupyter notebooks, some are more into less code environments. Being able to onboard data scientists with their familiar development environment without a big learning overhead is one of the main advantages of AzureML.

In this article, let’s have a discussion on different methods available in AzureML studio and their usage in practical scenarios. We may discuss pros and cons of each approach as well.

Please keep in mind that these are my personal thoughts based on my experiences with ML experiments, and this may change in different scenarios.

Automated ML

Summary of an Automated ML experiment

As the name implies, this is all automated. Automated ML is the easiest way of producing a predictive model in just a few minutes. You don't need any coding experience to use Automated ML; you just need an idea of machine learning basics and an understanding of the problem you are going to solve with ML.

The process is pretty straightforward. You start by selecting the dataset you want to use for ML model training and specifying the ML task you want to perform (right now it supports classification, regression and time series forecasting; computer vision and NLP tasks are in preview). Then you can specify the algorithms you want to test and other optional parameters. Azure does all the hard work for you and provides a deployment-ready model which can be exposed as a REST API.
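
For anyone who prefers code over the portal, the same Automated ML run can be kicked off with the AzureML Python SDK. A minimal sketch; the dataset name, label column and compute cluster name are placeholders:

from azureml.core import Workspace, Dataset, Experiment
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
train_ds = Dataset.get_by_name(ws, "my-training-dataset")    # placeholder dataset name

automl_config = AutoMLConfig(
    task="classification",                                   # or "regression" / "forecasting"
    training_data=train_ds,
    label_column_name="Label",                               # placeholder target column
    primary_metric="AUC_weighted",
    compute_target=ws.compute_targets["cpu-cluster"],        # an existing compute cluster
    n_cross_validations=5,
    experiment_timeout_hours=1,
)

run = Experiment(ws, "automl-classification").submit(automl_config, show_output=True)
best_run, fitted_model = run.get_output()                    # best model found by Automated ML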

Pros:

  • Zero code process.
  • Easy to use and well suited for fast prototyping.
  • Eliminates the environment setup step in ML model development.
  • Only limited machine learning knowledge is needed to get a production-viable result.

Cons:

  • Limited machine learning capabilities.
  • Right now, only works with supervised learning problem scenarios.
  • Works well with relational data; computer vision and NLP are still in preview.
  • There’s no way of using custom machine learning algorithms in the process.

Azure ML Designer

Azure ML Designer

Azure ML Designer is an upgraded version of the pretty old drag-and-drop Azure ML Studio. It has a similar drag-and-drop interface for building machine learning experiment pipelines. You have a set of prebuilt components which you can connect together in a flowchart-like manner to build machine learning experiments, and you can use SQL queries or Python/R scripts in the process too. After training a viable ML model, you can deploy it as a web service with just a few clicks.

I personally prefer this for prototyping. Plus, I see a lot of potential in Azure ML Designer for educational purposes. It's really easy to visualize the ML process through the designer pipelines, and it increases the interpretability of the operation.

Pros:

  • Zero code/ Less code environment
  • Easy to use graphical interface
  • No need to worry about development/training environment configurations
  • Having the ability to expand the capabilities with python/ R scripts
  • Easy model deployment

Cons:

  • Less flexibility for complex ML model development.
  • Less support for deep learning workloads.
  • Code versioning has to be handled separately.

Azure ML notebooks

Performing data visualization on AzureML notebooks

This may be the favourite feature of Azure ML for data scientists. I know Jupyter notebooks are like bread and butter for data scientists. Azure ML offers a fully managed notebook experience without the hassle of setting up the dev environment on local computers. You just have to connect the notebook to a compute instance, and it allows you to do your model development and training on the cloud in the same way you did on a local machine or anywhere else you use notebooks.

I would recommend this as the go-to option for most machine learning experiments, since it's really easy to spin up a notebook instance and get the job done. Most ML-related libraries are pre-installed on the compute instance, and you even have the flexibility to install the 3rd-party packages you need through conda or pip.

Pros:

  • Familiar notebook experience on cloud.
  • Option to use different python kernels.
  • No need to worry about dev environment setup on local compute.
  • Can use the powerful compute resources on Azure for model training.
  • Flexibility to install required libraries through package managers.

Cons:

  • Comes with a price for computation.
  • No direct support for Spark workloads.
  • Code version control has to be managed separately.

Developing locally and connecting to the AzureML service through the AML Python SDK

This is the option I would suggest for more advanced users. Think of a scenario where you have a deep learning based computer vision experiment to run on Azure with a complex code base. If this is the case, I would definitely use the AzureML Python SDK and connect my existing code base to the AzureML service.

In this approach, your code base sits on your local computer and you are using Azure for model training, deployment and monitoring purposes. You have the total flexibility of using the power of the cloud for computations as well as the flexibility of using your local machine for development.
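
A minimal sketch of what this looks like with the AzureML Python SDK; the folder, script, environment and cluster names are placeholders for your own setup:

from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig

ws = Workspace.from_config()                     # uses the config.json downloaded from the portal

# Placeholder curated environment name; you can also build a custom Environment
env = Environment.get(ws, "AzureML-pytorch-1.10-ubuntu18.04-py38-cuda11-gpu")

src = ScriptRunConfig(
    source_directory="./src",                    # local folder containing your existing code base
    script="train.py",                           # entry point of the experiment
    compute_target="gpu-cluster",                # training runs on the cloud cluster
    environment=env,
)

run = Experiment(ws, "cv-experiment").submit(src)
run.wait_for_completion(show_output=True)        # stream the training logs back to your terminal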

Pros:

  • Total flexibility in performing machine learning experiments with the dev tools we are comfortable with.
  • AzureML python SDK is an open-source library.
  • Code version controlling can be handled easily.
  • Whole ML process can be managed using scripts. (Easy for automation)

Cons:

  • Setting up the local development environment may take some effort.
  • Some features are still in experimental stage.

Choosing the most convenient approach for your ML experiment is totally based on the needs and resources you have. Before getting into the big picture, start small: start with a prototype, then a workable MVP, and gradually you can move forward, expanding it with complex machine learning approaches.

What’s your most preferred way of model development from these options? Please mention in the comments.

Cheers!

MLOps : Let’s start plumbing ML experiments!

What's all this hype about MLOps? What's the difference between machine learning and MLOps? Is MLOps essential? Why do we need MLOps? Through this article series we are going to start a discussion on MLOps to get a good start with the upcoming trend. The first post is not going to go deep into technicalities, but will cover the essential concepts behind MLOps.

What is MLOps?

As the name implies, it obviously has some connection with DevOps. So, let's see what DevOps is first.

“A compound of development (Dev) and operations (Ops), DevOps is the union of people, process, and technology to continually provide value to customers.”

Microsoft Docs

This is the formal definition of DevOps. In simpler terms, DevOps is the approach of streamlining the application development life cycle of the software development process. It ensures the quality engineering and security of the product while making sure team collaboration and coordination are managed effectively.

Imagine you are a junior-level developer in a software development company that builds a mission-critical system for a surveillance application. The DevOps process makes sure each and every line of code you write is tracked, managed and integrated into the final product reliably. It doesn't stop at managing the code base; it involves managing all development life cycle steps, including the final deployment and iterative monitoring of the final product.

That's DevOps. Machine Learning Operations (MLOps) is influenced by DevOps principles and practices to increase the efficiency of machine learning workflows. Simply, it's a way of managing ML workflows in a streamlined manner to ensure the quality, reliability, and interpretability of machine learning experiments.

Is MLOps essential?

We have been playing around with machine learning experiments with different tools, frameworks and techniques for a while. To be honest, most of our experiments didn’t end up in production environments :D. But, that’s the ultimate goal of predictive modeling.

Machine Learning experiment is an iterative process
Source : https://azure.microsoft.com/en-au/resources/gigaom-delivering-on-the-vision-of-mlops/

Building a machine learning model and deploying it is not a single step process. It starts with data collection and goes in an iterative life cycle till monitoring the deployed model in the production environment. MLOps approaches and concepts streamline these steps and interconnect them together.

Answer is Yes! We definitely need MLOps!

Why do we need MLOps?

As I said earlier, MLOps interconnects the steps in the ML life cycle and streamlines the process.

I grabbed these points from Microsoft docs. As it implies, these are the goals of MLOps.

  • Faster experimentation and development of models

Good MLOps practices lead to more code and component reusability, which leads to faster experiments and model development. For example, instead of having separate training loops or data loading components for each experiment, we can reuse an abstract set of methods for those tasks and connect them with a machine learning pipeline to run different experiment configurations. That makes the developer's life a lot easier!

I do a lot of experiments with computer vision. In my case, I usually have a set of abstract Python methods that can be used for model training and model evaluation. When performing different experiments, I pass the required parameters to the pipeline and reuse the methods, which makes life easy with less coding hassle.
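
A minimal, framework-agnostic sketch of the idea (scikit-learn is used purely for illustration, and the run_experiment helper is hypothetical):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def run_experiment(model, X, y, test_size=0.2, random_state=42):
    """Reusable train/evaluate step shared by every experiment configuration."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))

X, y = load_iris(return_X_y=True)

# Only the configuration changes between runs; the plumbing code is reused
for C in (0.1, 1.0, 10.0):
    acc = run_experiment(LogisticRegression(C=C, max_iter=1000), X, y)
    print(f"C={C}: accuracy={acc:.3f}")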

  • Faster deployment of models into production

Machine learning model deployment is always a tricky part. Managing the endpoints and making sure the deployment environment has all the required platform dependencies may be hard to keep track of with manual processes. A streamlined MLOps pipeline helps to manage deployments by keeping track of a model registry and deployment slots, enabling us to choose which trained model should go to production.
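
In AzureML, for example, the model registry part can be as simple as the sketch below (the path, model name and tags are placeholders):

from azureml.core import Workspace, Model

ws = Workspace.from_config()

# Register a trained artifact so deployments can pick an exact version from the registry
model = Model.register(
    workspace=ws,
    model_path="outputs/model.pkl",        # placeholder path to the trained model file
    model_name="churn-classifier",         # placeholder model name
    tags={"stage": "candidate", "data_version": "v3"},
)
print(model.name, model.version)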

  • Quality assurance and end-to-end lineage tracking

Maintaining good coding practices, version control, dataset versioning etc. ensures the quality of your experiments. Good MLOps practices help you to easily find the points where errors are occurring rather than breaking down the whole process. Say your trained model is not performing well on the test data some time after model deployment; that might be caused by data drift that happened over time. A correctly configured MLOps pipeline can periodically track such changes in the inference data and make sure such incidents are notified.

  • Trustworthiness and ethical AI

This is one of the most important use cases of MLOps. It's crucial to have transparency in machine learning experiments. The developer/data scientist should be able to interpret each and every decision they took while performing the experiment. Since handling data is the key component of an ML model, there should be ways to maintain correct security measures in experiments too. MLOps pipelines ensure these ethical AI principles are met by streamlining the process with a defined set of procedures.

How we gonna do this?

Now we all know MLOps is crucial. It's not just having a set of Python scripts sitting in a notebook; it's all about interconnecting all the steps of a machine learning experiment in an iterative process pipeline. There are many methods and approaches to go forward with. Some sit on-prem, while most of the solutions take a hybrid approach with the cloud. I usually use a lot of Azure services in my experiments, and Azure Machine Learning Studio provides a one-stop workbench to handle all these MLOps workloads, which comes in pretty handy. Let's start with a real-world scenario and see how we can use Azure Machine Learning Studio in the MLOps process to streamline our machine learning experiments.

FAQs on Machine Learning Development – #AskNaadi Part 1

Happy 2022!

It's been almost 7 years since I started playing with machine learning and related domains. These are some FAQs that come to me from peers, with my thoughts added. Feel free to ask any questions or concerns you have about the domain; I'll try my best to add my thoughts on them. Note that all these answers are my personal opinions and experiences.

01. How to learn the theories behind machine learning?

The first thing I'd suggest would be 'self-learning'. There are plenty of online resources out there where you can start studying on your own. Most of them are free; some may need a payment for the certification (that's totally up to you). I've listed some of the famous places to get a kickstart for learning AI. Just take a look here.

Next would be to keep practising. Never stop coding and training models in various domains. Kaggle is a good place to sharpen your skill set. Keep learning and keep practising at the same time.

02. Do we really need mathematics for ML?

Yes. To some extent you should know the theories behind probability and some basic mathematics. No need to worry a lot about that. As I said previously, there are plenty of places to catch up on your maths too.

03. Is there a difference between data analysis and machine learning?

Yes, there is. Data analysis is about finding patterns in existing data and drawing inferences from those patterns; it may include data visualization components too. When it comes to machine learning, you train a system to learn those patterns and try to predict upcoming patterns.

04. Does the trend in AI/ML going to fade out in the near future?

Mmm... I don't think so. I can't exactly say AI is going to be 'the' future, but since all these technical advancements are going to generate a hell of a lot of data, there should be a way to understand the patterns in that data and get value out of it. So, data science and machine learning are going to be the approach to go for.

Right… those are some general questions I frequently get from people. Let’s move into some technicalities.

05. What’s the OS you use on your work rig?

Ubuntu! Yes, it's FOSS and super easy to set up all the dependencies I need on it. (I did a complete walkthrough of my setup previously; here it is.) Sometimes I use Windows too, but if it's with Docker and all, yes, Ubuntu is the choice I'm going with.

06. What’s your preferred programming language to perform machine learning experiments?

I’m a Python guy! (Have used R a little)

07. Any frameworks/ libraries you use most in your experiments?

Since I'm more into deep learning and computer vision, I use the PyTorch deep learning framework a lot. NumPy, scikit-learn, pandas and all the other ML-related Python toolkits are always in my toolbox.

08. Machine learning is all about neural networks right?

No it's not! This is one of the biggest myths! Artificial neural networks (ANNs) are only one family of algorithms with which we can perform machine learning. There are plenty of other algorithms widely used in ML: decision trees, Support Vector Machines and Naive Bayes are some popular ML algorithms which are not ANNs.

09. Why we need GPUs for training?

You need GPUs when you need to do parallel processing. The normal CPUs we have in our machines typically have 4-5 cores and can handle only a limited number of threads simultaneously. A GPU, on the other hand, has thousands of small cores which can handle thousands of computational threads in parallel (for example, the Nvidia 2080 Ti has 4352 CUDA cores). In deep learning, we have to perform millions of calculations to train models, and running these workloads on GPUs is much faster and more efficient.
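
A tiny PyTorch snippet to illustrate the point; the same matrix multiplication is spread across thousands of GPU cores when a GPU is available (PyTorch is just the framework I mentioned above):

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Large matrix multiplications like this are massively parallel on a GPU
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b        # runs on the GPU when one is available, otherwise falls back to the CPU
print(c.device)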

10. When to use/ not to use Deep learning?

This is a tricky question. Deep learning is always good at understanding non-linear data; that's why it performs really well in the computer vision and natural language processing domains. If you have such a task, or your feature space is really large and you have a massive amount of data, I'd suggest you go with deep learning. If not, sticking with traditional machine learning algorithms might be the best choice.

11. Do I need to know all complex theories behind AI to develop intelligent applications?

Yes and no. In some cases you may have to understand the theories behind AI/ML in order to develop machine learning based applications; mostly, I would say the model training and validation phases need this knowledge. Say you are a software developer who's very good with .NET/Java and you are developing an application with a component where you have to read some text from a scanned document. You have to do it using computer vision. Fortunately, you don't have to build the component from scratch; there are plenty of services which can be used as REST endpoints to complete the task. No need to worry about the underlying algorithms at all. Just use the JSON!

12. Should I build all my models from scratch?

This is a yes/no answer too. This question comes up mostly with deep learning model development. In some complex scenarios you may have to develop your models from scratch, but in most cases the problem you have can be defined as an object detection/image classification/key phrase extraction from text etc. kind of problem. The best approach to go forward would be something like this (a small transfer-learning sketch follows the list):

  • Use a simple ANN and check that your data loading and related components are working fine.
  • Use a pre-trained model and see its performance (a widely used SOTA model would be the best choice).
  • If that's not working out, do transfer learning and see the accuracy of the trained model. (You should get good results most of the time by this step.)
  • Do some tweaks to the network and see if it's working.
  • If none of these are working, then think of building a novel model.
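
A minimal PyTorch/torchvision sketch of the transfer-learning step, freezing a pre-trained backbone and replacing the final layer (the class count and model choice are just examples):

import torch.nn as nn
from torchvision import models

# Start from a pre-trained SOTA backbone instead of training from scratch
model = models.resnet50(pretrained=True)

# Freeze the backbone weights
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer for our own task (e.g. 5 wildflower classes)
num_classes = 5
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head gets updated during fine-tuning
trainable = [p for p in model.parameters() if p.requires_grad]
print("Trainable parameters:", sum(p.numel() for p in trainable))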

13. Is cloud based machine learning a good option?

In most industrial use cases, yes! Since most of the data in existing systems is already sitting in the cloud and industries are relying heavily on cloud services these days, cloud based ML is a good approach. Obviously it comes with a price. When it comes to research phases, the price of purchasing computation power may be a problem. In those cases, my approach would be doing the research phase on-prem and moving the deployment to the cloud.

14. I have huge computer vision datasets to train. Shall I move all my stuff to the cloud?

Ehh... As I said previously, if you are planning a research project which runs for a long time and needs a lot of computational hours, I'd suggest going with a local setup first, finalizing the model and then moving to the cloud. (If dollars aren't your problem, no worries at all! Go for the cloud! Obviously it's easier and more reliable.)

15. Which cloud provider to choose?

There are a lot of cloud providers out there with various ML-related services. Some provide out-of-the-box services where you can just call an API to do the ML tasks (Microsoft Cognitive Services etc.). There are also services where you can use your own data to train existing models (the Custom Vision service by Azure etc.).

If you want end-to-end ML life cycle management, I personally find the Azure ML service a good solution, since you can use any of your ML-related frameworks and tools and just use the cloud to train, manage and deploy the models. I find the MLOps features that come with Azure Machine Learning pretty useful.

16. I've trained and deployed a pretty good machine learning model. I don't need to touch it again, right?

No way! You have to continuously check the model's performance and the accuracy it provides on the newest data coming into the service. The data coming into the service may be skewed. It's always a good idea to train the model with more data, so it's better to have re-training MLOps pipelines to iteratively check your models.

17. My DL models take a lot of time to train. If I have more computation power, will things speed up?

Mm... not all the time. I have seen cases where data loading takes more time than model training. Make sure you are using the correct coding approaches with sufficient memory and process management, and make sure you are not using old libraries which may be the cause of slow processing times. If your code is clean and clear, then try adjusting the computation power.

These are just a few questions I noted down. If you have any other questions or concerns in the domain of machine learning/deep learning and data science, just drop a comment below. I'll try to add my thoughts there.

“Stay Hungry Stay Foolish” – Let’s Learn Machine Learning without Code!

“Stay Hungry Stay Foolish”

Steve Jobs

This famous quote from Steve Jobs is one of the most precious quotes I always keep in mind. Stepping into the IT industry 11 years back as a teenager, I was always eager to push myself beyond the barriers and keep trying new things. That hunger led me to explore Artificial Intelligence, undoubtedly the most used buzzword in today's industry. I always make sure to keep myself foolish and open to learning new things.

When I started exploring data science and related technologies 6 years ago, almost all the new things I experimented with in my work life were self-taught from online resources. Even today I really enjoy going through documentation on different technologies and making myself familiar with them.

In the AI space, Microsoft Azure is a dominant player with its vast variety of tools and services. Having worked with different AI-related tools for years, I'm super thrilled to see the advancements in the Azure data & AI space. When the MVP cloud skills challenge was launched, I had no hesitation in going forward with the Data & AI path, since I needed to sharpen my skills and update myself on the new capabilities Azure AI is providing.

Azure AI is equipped with tools and services for anyone who's interested in AI, no matter what expertise level the person is at. You can easily use Azure Cognitive Services to add AI capabilities to your application by just calling a REST API. If you want to develop an advanced machine learning/deep learning experiment, Azure AI allows you to use your favorite open-source tools and frameworks and adopt the power of the cloud for your development.

What did I actually learn?

The challenge consists of learning modules which cover most of the prominent parts of the Azure AI domain, including Azure Machine Learning, Azure Cognitive Services, Azure Cognitive Search and the Bot Framework. It has been a long time since I built bots, so working with the new capabilities and functions of the Bot Framework was a pretty good experience. In addition to that, Azure Cognitive Search is one of the services I've used least in my developments, and I always wanted to give it a try. The simple but well-managed learning modules gave me a perfect start for sketching my first cognitive search application.

Here comes the most interesting part!

One of the most common questions I get when doing sessions in the community is “do we actually need to know coding to perform machine learning experiments?”

With no hesitation I say no, because Azure Machine Learning offers two powerful tools for zero-code machine learning experiments. Automated machine learning supports training supervised machine learning models for classification, regression and time series forecasting. You can create and publish a machine learning experiment as a REST API with just a few clicks!

Azure Machine Learning Designer provides a simple drag-and-drop interface where you can create machine learning pipelines and publish them as REST endpoints. If you want advanced functionality, it allows you to add Python or R code snippets inside the pipelines too.

Machine Learning with Zero-code learning collection

I know you are super eager to learn about these zero-code machine learning development tools. Check out the Microsoft Learn collection I created specifically focusing on these two tools. Don't forget to share your learning experience with me.

Click here for the Machine Learning with Zero-code learning collection!

Happy Learning!  

Docker + Machine Learning : A Perfect Combo

Docker has become the new norm in the software industry. Everyone is obsessed with it, since Docker solves most of the issues software engineers and system administrators have had with platform dependencies in application development and deployments.

“Docker is a tool that helps users to exploit operating-system-level virtualization to develop and deliver software in packages called containers.”  

~ Wikipedia

Though the technical explanation sounds a bit complicated, simply put, Docker can be identified as a 'VM-like' environment where you can build and deploy your software applications.

Why docker for machine learning/ deep learning?

We have endless discussions on how hard it is to configure development and deployment environments in machine learning. Since Python is the most used language for ML and DL experiments, dealing with Python packages and making them all work seamlessly on your hardware can be a nightmare. Using cloud-based machine learning platforms or virtual machines are some of the options we can utilize to deal with this problem.

Being more flexible than virtual machines and offering easy migration capabilities, Docker is one of the best ways of managing machine learning environments. Since Docker has become a key component of MLOps, it's time for data scientists to adopt Docker in their development work.

Where and how we can use docker?

For me, Docker helps out in 4 main stages of the machine learning experiment pipeline.

  1. As a development environment.

I do a lot of experiments in the domain of computer vision and deep learning. You may have experienced the pain of dealing with libraries like OpenCV in Python. So, I always use custom Docker images with all the dependencies installed for running my experiments. This makes it easy for me to collaborate with my peers without the hassle of replicating my development environment on their machines.

What about the huge amounts of data? Should those also be included inside the Docker container? Nah. I always keep the data, as well as the output files created by the experiments, in mounted volumes.

If you need GPU-supported Docker images, NVIDIA provides Docker image variations that match your needs on Docker Hub.

2. As a training environment.

You all know ML/DL models normally take quite a long time to train. In my case, I use remote shared servers with GPUs for training my experiments. For that, the easiest way is containerizing the experiment and pushing it to the server.

3. As a deployment environment.

Another popular use case of Docker is in the deployment phase. Normally the deployment environment should fulfil the required dependencies in order to run inference on the ML/DL model seamlessly. Since a Docker container can be shipped across platforms easily without worrying about hardware-level dependencies, it's really easy to use Docker for deploying ML models.

4. Docker for cloud-based machine learning

Most data scientists are using cloud-based machine learning platforms like Azure Machine Learning today for their flexibility and resources. Containerized experiments are the main mechanism these services use to run workloads on the cloud. When it comes to Azure ML, you can use the default Docker image for experiments or you can specify your own custom base image for model development and training.
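
A minimal sketch of specifying a custom Docker base image for an AzureML environment with the Python SDK (the environment name, image tag and packages are placeholders):

from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

env = Environment(name="cv-training-env")   # placeholder environment name

# Placeholder base image tag; swap in the CUDA/cuDNN image that matches your training needs
env.docker.base_image = "mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.1-cudnn8-ubuntu18.04"

# Python dependencies layered on top of the base image
env.python.conda_dependencies = CondaDependencies.create(
    pip_packages=["torch", "torchvision", "opencv-python-headless"]
)

# env.register(workspace=ws)   # register the environment so it can be reused across runs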

Take a look at this documentation for deploying Azure ML models using a custom Docker base image.

So, Docker has become a life saver for me since it reduces a lot of the headaches that occur in the machine learning model life cycle. I'll come up with a sample experiment on using Docker for training a machine learning model in the next post.

Connecting Azure SQL server with Azure Machine Learning

Accessing data in different data sources is one of the main tasks in the machine learning model development life cycle. Let's discuss one of the most common data access scenarios.

Scenario :

We have a set of relational data points stored in an Azure SQL server and want to develop a machine learning model using Azure Machine Learning. Let's see how to leverage data stored in an Azure SQL database in an Azure Machine Learning experiment.

The process contains three main steps.

  1. Set access permissions of Azure SQL database
  2. Connect Azure SQL database to an Azure ML datastore
  3. Register the data in datastore as an Azure ML dataset.

1. Set access permissions of Azure SQL database

Allow Azure services and resources to access this server

By default, Azure SQL databases are protected by a firewall which limits outside access to the data. Since we are going to allow traffic from IPs belonging to Azure resources and services, make sure you allow Azure services to access your SQL server.

2. Connect Azure SQL database to an Azure ML datastore

Azure ML datastores can be defined as the abstraction of data sources for the ML workspace, or as the interconnection between the data resource and the AzureML workspace.

Go to your Azure Machine Learning Studio (ml.azure.com) and click ‘New datastore’. Provide a datastore name and select ‘Azure SQL database’ as the datastore type. Make sure to authenticate the access with the Azure SQL server’s user ID and password.

Register a new datastore
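
If you prefer doing this from code, the same datastore can be registered with the AzureML Python SDK. A minimal sketch; the server, database and credential values are placeholders:

from azureml.core import Workspace, Datastore

ws = Workspace.from_config()

sql_datastore = Datastore.register_azure_sql_database(
    workspace=ws,
    datastore_name="sales_sql_datastore",   # placeholder datastore name
    server_name="my-sql-server",            # server name without .database.windows.net
    database_name="salesdb",                # placeholder database name
    username="sqladminuser",                # SQL server user ID
    password="<password>",                  # SQL server password
)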

3. Register the data in datastore as an Azure ML dataset.

AzureML supports two types of datasets (take a look here to get an overview of the difference between them). Since we are dealing with a set of relational data, a Tabular dataset is the option we have to use for creating the dataset.

Create dataset from datastore

Select ‘Create dataset’ from the ‘Datasets’ tab in AML Studio and choose the ‘From datastore’ option.

Select the datastore we created in the previous step, which establishes the connection between the AML workspace and the data source.

Provide the SQL query to select the required data from the SQL server. Make sure to validate the data before configuring the schema.

Preview dataset

All done! Now you have access to the data in your Azure SQL database from the AzureML workspace. You can easily refer to this dataset in your experiments.

Validate dataset

In cases where your database is updated from time to time, all you have to do is refresh the dataset to fetch the newest data points specified by the SQL query.
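
For completeness, the dataset creation step can also be done programmatically with the AzureML Python SDK. A minimal sketch; the datastore name, query and dataset name are placeholders:

from azureml.core import Workspace, Datastore, Dataset
from azureml.data.datapath import DataPath

ws = Workspace.from_config()
sql_datastore = Datastore.get(ws, "sales_sql_datastore")         # datastore registered earlier

# Placeholder query; select only the columns the experiment needs
query = DataPath(sql_datastore, "SELECT * FROM dbo.SalesRecords")
tabular_ds = Dataset.Tabular.from_sql_query(query)

# Register the dataset so it can be referenced by name in experiments
tabular_ds = tabular_ds.register(workspace=ws, name="sales-data", create_new_version=True)
print(tabular_ds.take(5).to_pandas_dataframe())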

Different Computation Options on Azure Machine Learning

In a previous article we discussed the different data storage methods we can use with Azure Machine Learning. In this article I'm gonna briefly discuss the different computation options we have with Azure ML.

Since computation power is one of the key advantages we get from cloud based machine learning, choosing the correct compute resource for our machine learning experiments is important.

AzureML offers 4 main compute types.

01. Compute instances –

If you don't wanna spend the time setting up your local computer for ML experiments, or you wanna leverage GPUs or powerful CPUs for your experiments, Azure compute instances offer fully managed virtual machines loaded with most of the essential frameworks/libraries for performing machine learning and data science experiments. When you are using AzureML notebooks (the Jupyter notebook instance attached to AzureML), the compute instance is the place where the Jupyter notebook is running.

Different methods can be used to access compute instances

You can access the compute instances using different methods. Accessing them through Jupyter notebooks and JupyterLab is the all-time favourite of most data scientists. If you are an R folk, you can use RStudio with the compute instances. Accessing the compute instance through SSH is really useful (you may have to enable SSH access when creating the compute instance) on occasions where you have to install custom packages and such on the compute instance. (The machine is Ubuntu-based and you can use all your bash scripts there!)

Basically, a compute instance can be defined as a virtual machine fully loaded with data science and machine learning essentials which you can use right out of the box.

02. Compute clusters –

Compute clusters differ from compute instances in their ability to have one or more compute nodes. These compute nodes can be created with our desired hardware configurations.

Why have more than one node? That brings the ability to use parallel processing for computations. If you are going to do hyperparameter tuning, GPU-based complex computations, or several machine learning runs at once, you may have to create a compute cluster.

If you are running an Automated Machine Learning experiment with AzureML, you must have a compute cluster to perform computations.

When selecting the node configuration, you can go with either CPU-based nodes or GPU-based nodes. GPU-based nodes (NC type etc.) are a bit pricey. If you are not using GPU-based computing, don't waste your dollars by creating a compute cluster with some fancy configs.

One other key setting is ‘Virtual machine priority’. If you are OK with pushing your experiment to the cloud and getting the result without a hurry, you can go with low-priority nodes, which will save you a lot of dollars compared to using dedicated VMs. No harm is gonna happen to the experiment accuracy or anything.
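
A minimal sketch of creating such a cluster with the AzureML Python SDK, using low-priority nodes that scale down to zero when idle (the cluster name and VM size are examples):

from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()

compute_config = AmlCompute.provisioning_configuration(
    vm_size="STANDARD_DS3_V2",     # example CPU SKU; pick an NC-series size for GPU work
    vm_priority="lowpriority",     # cheaper than dedicated nodes
    min_nodes=0,                   # scale down to zero so idle time costs nothing
    max_nodes=4,
)

cluster = ComputeTarget.create(ws, "cpu-cluster", compute_config)
cluster.wait_for_completion(show_output=True)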

03. Inference clusters –

There are two options for deploying Azure machine learning web services as REST endpoints: 1) use ACI (Azure Container Instances), or 2) use AKS (Azure Kubernetes Service).

Deploying the REST web service on ACI is good for testing and development use, and AKS would be the go-to for production-level large deployments. You can configure the AKS cluster according to your needs through AzureML as well as from the Azure portal. These AKS clusters are pretty much similar to the AKS clusters you have worked with in any other Azure-based deployments.

04. Attached compute –

Azure Machine Learning is not limited to doing computations on compute clusters. You can attach Azure Databricks, Data Lake Analytics, HDInsight or an existing VM as a compute target for your workspace. Keep in mind that Azure Machine Learning only supports virtual machines running Ubuntu. These compute targets will not be managed by Azure Machine Learning itself, so you may have to perform some additional steps to make sure they are compatible with your experiments.

Choosing the correct compute resource is a key component in the success of developing machine learning experiments. On the other hand, bad computation choices may leave you with huge Azure bills! 😀

There are no hard-and-fast rules on selecting different compute options for your machine learning life cycle. Just make sure you use the right tool at the right time.

AzureML Python SDK – Installation & Configuration

In the last blog post, we discussed developing a machine learning classifier with the Azure Machine Learning service. As mentioned there, we are going to utilize familiar development tools and frameworks for model development.

Key areas of the SDK include:

  • Explore, prepare and manage the lifecycle of your datasets used in machine learning experiments.
  • Manage cloud resources for monitoring, logging, and organizing your machine learning experiments.
  • Train models either locally or by using cloud resources, including GPU-accelerated model training.
  • Use automated machine learning, which accepts configuration parameters and training data. It automatically iterates through algorithms and hyperparameter settings to find the best model for running predictions.
  • Deploy web services to convert your trained models into RESTful services that can be consumed in any application.

~ Ref : https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py

The AzureML Python SDK acts as the connector between all the resources on the cloud and the dev environment.

Installing Python SDK –

The AzureML SDK can be easily installed on your local computer through pip. Refer to this guide for the installation process. I'd suggest going with the default installation, since it's enough for most of the operations used in the experiment. It's a good idea to upgrade the SDK before running an experiment, since the package is updated rapidly.

Download config file –

To connect to the AzureML workspace, we need the Azure subscription ID, the resource group in which the workspace has been created, and the workspace name. The easiest way to grab these details is to download the config.json file from the Azure portal. Place this file inside the experiment directory.

Downloading config.json from Azure portal

Connect AzureML Workspace –

Connecting to the AzureML workspace and listing the resources can be done using simple Python syntax from the AzureML SDK (sample code is provided below). Refer to the Python SDK documentation to modify the resources of the AML service.

#!pip install --upgrade azureml-sdk[notebooks]
import azureml.core
from azureml.core import Workspace
from azureml.core import ComputeTarget, Datastore, Dataset

print("Ready to use Azure ML", azureml.core.VERSION)
ws = Workspace.from_config()
print(ws.name, "loaded")

#View resources in the workspace 
print("Compute Targets:")
for compute_name in ws.compute_targets:
    compute = ws.compute_targets[compute_name]
    print("\t", compute.name, ":", compute.type)
    
print("Datastores:")
for datastore in ws.datastores:
    print("\t", datastore)
    
print("Datasets:")
for dataset in ws.datasets:
    print("\t", dataset)

print("Web Services:")
for webservice_name in ws.webservices:
    webservice = ws.webservices[webservice_name]
    print("\t", webservice.name)

In the next blog article, we'll discuss data loading and experiment creation.