What’s next after ChatGPT?

A Large Language Model as a Van Gogh Artwork (Dall-E2)

The hype around Generative AI is still going strong. Everyone is looking for applications of GenAI and investing in AI-based interventions to gain a competitive advantage for their businesses. Here are a few questions I've come across, along with my view on LLMs and their future.

Can we achieve everything with LLMs now?

The straightforward answer is NO! LLMs can’t handle all tasks and are best suited as tools for most natural language processing tasks, particularly in information retrieval and conversational applications. Despite their strengths, there are still plenty of simple approaches that find practical use in real-world scenarios. In simpler terms, LLMs have their unique use cases, but they can’t do everything like a wizard!

Are LLMs taking over the tech world?

Is this the end of traditional ML? Not at all. As mentioned earlier, LLMs don’t cover all machine learning tasks. Most data analytical and machine learning use cases involve numerical data often organized in a relational structure, where traditional machine learning algorithms excel. Traditional machine learning techniques are expected to remain relevant for the foreseeable future.

Artificial General Intelligence (AGI)?

Have we reached it? Artificial General Intelligence (AGI) envisions AI systems with human-like abilities across a wide range of tasks. However, we are not there yet. While some level of AGI may eventually be achievable, current LLMs, including applications like ChatGPT, should not be confused with AGI. LLMs are proficient at predicting the next token in a sequence using transformers, but they struggle with complex analytical tasks where human expertise is crucial.

Are enterprises ready for the AI hype?

Having worked with numerous enterprises, I’ve observed a willingness to invest in AI projects for streamlining business processes. However, many struggle to identify suitable use cases with a considerable Return on Investment (RoI). Some organizations, even if prepared for advanced analytics, face extensive groundwork in their IT and data infrastructure. Despite these challenges, the AI hype has prompted businesses to recognize the potential of leveraging organizational data resources effectively. In the coming months, we anticipate a significant boost not only in LLM-based applications but also in traditional machine learning and deep learning applications across industries.

Ethical AI? What’s happening there?

With the public's increasing adoption of ChatGPT and large language models, conversations about responsible AI use have gained traction. The European Union has passed pioneering AI legislation, and countries such as Australia are actively working on regulating AI systems, introducing AI ethics frameworks and establishing National AI Centres to promote responsible AI practices and innovation.

Leading companies like Microsoft are contributing to responsible AI by introducing guidelines and toolboxes for transparent machine learning application development. Governments and corporations are moving towards regulating and controlling AI applications, a positive development in ensuring responsible AI use.

There’s no turning back now. We must all adapt to the next wave of AI and prepare to harness its full potential.

Data Selection for Machine Learning Models

Data is the key component of machine learning. Thus, a high-quality training dataset is always the main success factor of a machine learning training process. A good dataset leads to more accurate models and faster convergence, and it is also the main deciding factor in a model's fairness and bias.
Let's discuss the dos and don'ts when selecting or preparing a dataset for training a machine learning model, and the factors we should consider when composing the training data. These apply to structured numerical data as well as to unstructured data types such as images and videos.

What does the dataset’s distribution look like?

This is mostly important for numerical datasets. Calculating and plotting the frequency distribution (how often each value occurs in the dataset) gives us insights into the problem formulation as well as the class distribution. ML engineers tend to prefer datasets with a normal distribution to make sure they have sufficient data points to train the models.


Though the normal distribution is common in nature and psychology, there's no need for every dataset you use for model training to be normally distributed. (Obviously, some real-world data collections simply don't fit the familiar bell curve.)
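
As a minimal sketch of the kind of quick check I mean (the file and column names are hypothetical), using pandas and matplotlib:

import pandas as pd
import matplotlib.pyplot as plt

# hypothetical dataset: replace the path and column names with your own
df = pd.read_csv("training_data.csv")

# frequency distribution of a numerical feature
df["age"].plot(kind="hist", bins=30, title="Distribution of age")
plt.show()

# class distribution of the target label
print(df["label"].value_counts(normalize=True))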

Does the dataset represent the real world?

We are training machine learning models to solve real-world problems, so the data should be real too. It's OK to use synthetic data if you have no other option for collecting more data or need to balance the classes, but always prefer real-world data since it makes the model more robust in testing and production. Please don't feed some random numbers into a machine learning model and expect it to solve your business problem with 90% accuracy 😉

Does the dataset match the context?

We must always make sure the characteristics of the dataset used for training the model match the conditions the model will face when it goes live in production. For example, let's say we need to train a computer vision model for a mobile application that identifies certain types of tree leaves from images captured with the phone camera. There's no use in training the model only with images captured in a lab environment. Your training set should contain images captured in the wild, closely resembling the real-world use case of the application.

Is the data redundant?

Data redundancy, or data duplication, is an important point to pay attention to when training ML models. If the dataset contains the same data points repeatedly, the model overfits to those points and will not perform well on unseen data during testing.
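
For tabular data, a quick duplicate check with pandas looks something like this (the file name is hypothetical):

import pandas as pd

# hypothetical training file
df = pd.read_csv("training_data.csv")

# count exact duplicate rows before training
num_duplicates = df.duplicated().sum()
print(f"{num_duplicates} duplicate rows out of {len(df)}")

# drop them so the model doesn't see the same points repeatedly
df = df.drop_duplicates().reset_index(drop=True)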

Is the dataset biased?

A biased dataset never produces an unbiased trained model. The dataset we choose should always be balanced and not biased towards certain cases.
Let's take the example of a supervised computer vision model that identifies a person's gender from their face. Assume the model is trained only with images of people from the USA, but we are going to use it in an application deployed worldwide. The model will produce unrealistic predictions since the training data is biased towards certain ethnicities. To get a better outcome, the training set should include images of people from different ethnicities as well as from different age groups.
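
A simple sanity check is to look at how the training samples are spread across the groups you care about. A sketch assuming a hypothetical metadata file with demographic columns:

import pandas as pd

# hypothetical metadata file describing the training images
df = pd.read_csv("face_metadata.csv")

# proportion of samples in each group we care about
for column in ["ethnicity", "age_group", "gender"]:
    print(df[column].value_counts(normalize=True), "\n")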

Which data is there too little/too much of?

"How much data do we need to train a model with good accuracy?" This question comes up quite often when planning ML projects. The simple answer is: "we don't know!" 😀
There are no exact numbers for how much data is needed to train an ML model. We know that deep learning models are data hungry: we need large datasets for training deep neural networks since we are using them to represent complex non-linear relationships. Even with traditional machine learning algorithms, we should make sure to have enough data from all the classes, including the edge/corner cases.
What happens if we have too much data? That doesn't necessarily help either. It can make the training process lengthy and costly without improving the model's accuracy, and it may even end up producing an overfitted model.

These are only a few of the points to consider when selecting a dataset for training a machine learning model. Please add your thoughts on dataset selection in the comments.

Different Approaches to Perform Machine Learning Experiments on Azure

We have discussed Azure Machine Learning Studio a lot; it's the one-stop portal for all ML-related workloads on the Azure cloud. AzureML Studio provides different approaches for working on machine learning experiments based on the needs, resources and constraints you have. Selecting the plan of attack is completely your choice.

We all have our own way of performing machine learning experiments. While some prefer working in Jupyter notebooks, others are more into low-code environments. Being able to onboard data scientists into a familiar development environment without a big learning overhead is one of the main advantages of AzureML.

In this article, let's discuss the different methods available in AzureML Studio and their usage in practical scenarios, along with the pros and cons of each approach.

Please keep in mind that these are my personal thoughts based on my experiences with ML experiments, and they may change in different scenarios.

Automated ML

Summary of an Automated ML experiment

As the name implies, this is all automated. Automated ML is the easiest way to produce a predictive model in just a few minutes. You don't need any coding experience to use Automated ML; you just need an idea of machine learning basics and an understanding of the problem you are going to solve with ML.

The process is pretty straightforward. You start by selecting the dataset you want to use for model training and specifying the ML task you want to perform (right now it supports classification, regression and time-series forecasting, with computer vision and NLP tasks in preview). Then you can specify the algorithms you want it to try and other optional parameters. Azure does all the hard work for you and provides a deployment-ready model which can be exposed as a REST API.

Pros:

  • Zero code process.
  • Easy to use and well suited for fast prototyping.
  • Eliminates the environment setup step in ML model development.
  • Only limited machine learning knowledge is needed to get a production-viable result.

Cons:

  • Limited machine learning capabilities.
  • Right now, it only works with supervised learning scenarios.
  • Works well with relational data; computer vision and NLP support are still in preview.
  • There’s no way of using custom machine learning algorithms in the process.

Azure ML Designer

Azure ML Designer

Azure ML Designer is an upgraded version of the fairly old Azure ML Studio drag-and-drop tool. It offers a similar drag-and-drop interface for building machine learning experiment pipelines: you have a set of prebuilt components which you connect together in a flowchart-like manner, and you can also plug SQL queries or Python/R scripts into the process if needed. After training a viable ML model, you can deploy it as a web service with just a few clicks.

I personally prefer this for prototyping. Plus, I see a lot of potential in Azure ML Designer for educational purposes: it's really easy to visualize the ML process through the designer pipelines, which makes the whole operation more interpretable.

Pros:

  • Zero-code/low-code environment
  • Easy-to-use graphical interface
  • No need to worry about development/training environment configuration
  • Ability to extend the capabilities with Python/R scripts
  • Easy model deployment

Cons:

  • Less flexibility for complex ML model development.
  • Less support for deep learning workloads.
  • Code versioning has to be handled separately.

Azure ML notebooks

Performing data visualization on AzureML notebooks

This may be the favourite Azure ML feature for data scientists. Jupyter notebooks are like bread and butter for data scientists, and Azure ML offers a fully managed notebook experience without the hassle of setting up the dev environment on a local computer. You just connect the notebook to a compute instance, and it lets you do your model development and training in the cloud the same way you would in a notebook on a local machine or anywhere else.

I would recommend this as the go-to option for most machine learning experiments since it's really easy to spin up a notebook instance and get the job done. Most ML-related libraries are pre-installed on the compute instance, and you even have the flexibility to install any third-party packages you need through conda or pip.

Pros:

  • Familiar notebook experience on the cloud.
  • Option to use different Python kernels.
  • No need to worry about dev environment setup on local compute.
  • Can use the powerful compute resources on Azure for model training.
  • Flexibility to install required libraries through package managers.

Cons:

  • Comes with a price for computation.
  • No direct support for Spark workloads.
  • Code version control has to be managed separately.

Developing locally and connecting to the AzureML service through the AzureML Python SDK

This is the option I would suggest for more advanced users. Think of a scenario where you have a deep learning based computer vision experiment with a complex code base to run on Azure. In that case, I would definitely use the AzureML Python SDK and connect my existing code base to the AzureML service.

In this approach, your code base sits on your local computer and you use Azure for model training, deployment and monitoring. You get the full power of the cloud for computation as well as the flexibility of your local machine for development.
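
As a rough sketch of what this looks like with the azureml-core (v1) SDK, here's how a local training script can be submitted to a remote compute target. The workspace config, environment name, compute target and script names below are placeholders:

from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig

# authenticate against the workspace using a downloaded config.json
ws = Workspace.from_config()

# a (hypothetical) environment already registered in the workspace, holding the training dependencies
env = Environment.get(workspace=ws, name="my-training-env")

# point the run at a local training script and an existing remote compute cluster
config = ScriptRunConfig(
    source_directory="./src",      # local code base
    script="train.py",             # hypothetical training script
    compute_target="gpu-cluster",  # name of an existing compute target
    environment=env,
)

run = Experiment(workspace=ws, name="cv-experiment").submit(config)
run.wait_for_completion(show_output=True)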

Pros:

  • Total flexibility to perform machine learning experiments with the dev tools you are comfortable with.
  • The AzureML Python SDK is an open-source library.
  • Code version control can be handled easily.
  • The whole ML process can be managed using scripts (easy to automate).

Cons:

  • Setting up the local development environment may take some effort.
  • Some features are still in experimental stage.

Choosing the most convenient approach for your ML experiment depends entirely on your needs and the resources you have. Before getting into the big picture, start small: start with a prototype, then a workable MVP, and gradually expand it with more complex machine learning approaches.

Which of these options is your preferred way of developing models? Please mention it in the comments.

Cheers!

FAQs on Machine Learning Development – #AskNaadi Part 1

Happy 2022!

It's been almost 7 years since I started playing with machine learning and related domains. These are some FAQs I get from peers, with my thoughts added. Feel free to ask any questions or raise concerns you have about the domain, and I'll try my best to share my thoughts. Note that all these answers are based on my personal opinions and experiences.

01. How to learn the theories behind machine learning?

The first thing I'd suggest is self-learning. There are plenty of online resources out there where you can start studying on your own. Most of them are free; some may require a payment for the certification (whether to pay for it is totally up to you). I've listed some of the popular places to get a kickstart in learning AI. Just take a look here.

Next, keep practising. Never stop coding and training models in various domains. Kaggle is a good place to sharpen your skill set. Keep learning and keep practising at the same time.

02. Do we really need mathematics for ML?

Yes. To some extent you should know the theory behind probability and some basic mathematics. No need to worry a lot about it; as I said previously, there are plenty of places to catch up on your maths too.

03. Is there a difference between data analysis and machine learning?

Yes, there is. Data analysis is about finding patterns in existing data and drawing inferences from those patterns; it may include data visualization components too. When it comes to machine learning, you train a system to learn those patterns and try to predict what comes next.

04. Is the AI/ML trend going to fade out in the near future?

Mmm.. I don't think so. I can't say for sure that AI is going to be 'the' future, but since all these technical advancements are going to generate a hell of a lot of data, there has to be a way to understand the patterns in that data and get value out of it. So data science and machine learning are going to be the approaches to go for.

Right… those are some general questions I frequently get from people. Let’s move into some technicalities.

05. What’s the OS you use on your work rig?

Ubuntu! Yes, it's FOSS and it's super easy to set up all the dependencies I need on it. (I did a complete walkthrough of my setup previously; here it is.) Sometimes I use Windows too, but if Docker and friends are involved, Ubuntu is the choice I go with.

06. What’s your preferred programming language to perform machine learning experiments?

I’m a Python guy! (Have used R a little)

07. Any frameworks/ libraries you use most in your experiments?

Since I'm more into deep learning and computer vision, I use the PyTorch deep learning framework a lot. NumPy, scikit-learn, pandas and the other ML-related Python toolkits are always in my toolbox.

08. Machine learning is all about neural networks right?

No, it's not! This is one of the biggest myths. Artificial neural networks (ANNs) are only one family of algorithms with which we can perform machine learning. There are plenty of other algorithms widely used in ML: decision trees, Support Vector Machines and Naive Bayes are some popular algorithms that are not ANNs.

09. Why do we need GPUs for training?

You need GPUs when you need to do massively parallel processing. The ordinary CPUs in our machines typically have a handful of cores and can handle only a limited number of threads simultaneously. A GPU, on the other hand, has thousands of small cores which can handle thousands of computational threads in parallel (for example, the Nvidia 2080 Ti has 4352 CUDA cores). In deep learning, we have to perform millions of calculations to train models, and running these workloads on GPUs is much faster and more efficient.

10. When to use/ not to use Deep learning?

This is a tricky question. Deep learning is good at modelling non-linear data, which is why it performs really well in the computer vision and natural language processing domains. If you have such a task, or your feature space is really large and you have a massive amount of data, I'd suggest going with deep learning. If not, sticking with traditional machine learning algorithms might be the better option.

11. Do I need to know all complex theories behind AI to develop intelligent applications?

Yes and no. In some cases, you may have to understand the theory behind AI/ML in order to develop a machine learning based application; mostly, I'd say the model training and validation phases need this knowledge. Let's say you are a software developer who's very good with .NET/Java and you are developing an application with a component that has to read text from a scanned document. You have to do it using computer vision. Fortunately, you don't have to build the component from scratch: there are plenty of services that can be used as REST endpoints to complete the task. No need to worry about the underlying algorithms at all. Just use the JSON!

12. Should I build all my models from scratch?

This is a yes/no answer too. This question comes up mostly in deep learning model development. In some complex scenarios you may have to develop your models from scratch, but in most cases the problem you have can be framed as an object detection, image classification or key-phrase extraction kind of problem. The best approach going forward would be something like this:

  • Use a simple ANN to check that your data loading and the related plumbing are working fine.
  • Use a pre-trained model and check the performance (a widely used SOTA model would be the best choice).
  • If that's not working out, do transfer learning and check the accuracy of the trained model. (You should get good results by this step most of the time; see the sketch after this list.)
  • Do some tweaks to the network and see if it's working.
  • If none of these are working, then think about building a novel model.
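
As a concrete illustration of the transfer-learning step above, here's a minimal PyTorch sketch; the class count is hypothetical:

import torch.nn as nn
import torchvision.models as models

# start from an ImageNet-pretrained ResNet
model = models.resnet18(pretrained=True)

# freeze the backbone so only the new head gets trained
for param in model.parameters():
    param.requires_grad = False

# replace the final layer to match the number of classes in your problem
num_classes = 5  # hypothetical
model.fc = nn.Linear(model.fc.in_features, num_classes)

# train as usual: only model.fc parameters will receive gradient updates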

13. Is cloud-based machine learning a good option?

In most industrial use cases, yes! Since most of the data in existing systems is already sitting in the cloud and industries rely heavily on cloud services these days, cloud-based ML is a good approach. Obviously it comes with a price. In research phases, the cost of purchasing computation power may be a problem; in those cases, my approach would be to do the research phase on-prem and move the deployment to the cloud.

14. I have huge computer vision datasets to train. Shall I move all my stuff to the cloud?

Ehh… as I said previously, if you are planning a research project that runs for a long time and needs a lot of computational hours, I'd suggest going with a local setup first, finalizing the model and then moving to the cloud. (If dollars aren't a problem for you, no worries at all! Go for the cloud! It's obviously easier and more reliable.)

15. Which cloud provider to choose?

There are a lot of cloud providers out there with various ML-related services. Some provide out-of-the-box services where you can just call an API to do the ML task (Microsoft Cognitive Services etc.). There are also services where you can use your own data to train existing models (the Custom Vision service on Azure etc.).

If you want end-to-end ML life-cycle management, I personally find the Azure ML service a good solution, since you can use any of your ML-related frameworks and tools and just use the cloud to train, manage and deploy the models. I also find the MLOps features that come with Azure Machine Learning pretty useful.

16. I’ve trained and deployed a pretty good machine learning model. I don’t need to touch that again right?

No way! You have to continuously check the model's performance and the accuracy it provides on the newest data coming into the service. That incoming data may be skewed, and it's always a good idea to train the model with more data. So it's better to have re-training MLOps pipelines that iteratively check your models.

17. My DL models take a lot of time to train. Will more computation power speed things up?

Mm.. not all the time. I have seen cases where data loading takes more time than model training. Make sure you are using the right coding approaches, with sufficient memory and proper process management, and that you are not using old libraries that may be causing slow processing times. If your code is clean and clear, then try scaling up the computation power.

These are just a few questions I noted down. If you have any other questions or concerns in the domain of machine learning, deep learning or data science, just drop a comment below and I'll try to add my thoughts there.

Analyzing Performance of Neural Networks with PyTorch Profiler – Part 2

PyTorch Profiler output for model training

In the previous post, we explored the basic concepts of PyTorch Profiler and the newest capabilities that come with its recent updates. One of the coolest things I tried is the TensorBoard plugin that comes with PyTorch Profiler. Yes, you heard that right: the well-known deep learning visualisation platform TensorBoard has a Profiler plugin which makes network analysis much easier.

I just tried the official PyTorch Profiler tutorials, and the visualisations seem pretty descriptive for analysis. I'll do a complete deep dive into the tool in the next article.

One of the cool things I've noticed is the performance recommendations. Most of the recommendations made by the tool make sense, and I'm pretty sure they are going to improve model training performance.

In the meantime, you can play around with the tool and see how convenient it is to use in your deep learning experiments. Here's the script I used for the initial steps with the tool.

import torch
import torch.nn
import torch.optim
import torch.profiler
import torch.utils.data
import torchvision.datasets
import torchvision.models
import torchvision.transforms as T

#load data
transform = T.Compose(
    [T.Resize(224),
     T.ToTensor(),
     T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
train_set = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

#create model 
device = torch.device("cuda:0")
model = torchvision.models.resnet18(pretrained=True).cuda(device)
criterion = torch.nn.CrossEntropyLoss().cuda(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
model.train()


#train function
def train(data):
    inputs, labels = data[0].to(device=device), data[1].to(device=device)
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

#use profiler to record execution events
with torch.profiler.profile(
        schedule=torch.profiler.schedule(wait=1, warmup=1, active=3, repeat=2),
        on_trace_ready=torch.profiler.tensorboard_trace_handler('./log/resnet18'),
        record_shapes=True,
        profile_memory=True,
        with_stack=True
) as prof:
    for step, batch_data in enumerate(train_loader):
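        # stop once the profiler schedule above is covered: (wait + warmup + active) * repeat steps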
        if step >= (1 + 1 + 3) * 2:
            break
        train(batch_data)
        prof.step()

Analyzing Performance of Neural Networks with PyTorch Profiler

Deep neural networks are complex! It literally takes quite a lot of effort and time to make them work near perfectly. Beyond the effort you put into fitting the model well to your data and getting admirable accuracy, you have to keep an eye on model efficiency and performance; sometimes it's a trade-off between model accuracy and inference efficiency. To manage this, analysing the memory and computation usage of the networks is essential. This is where profiling neural networks comes into the scene.

Since PyTorch is my preferred deep learning framework, I've been using the profiling tool it has had for a while in torch.autograd.profiler. It was pretty sleek and had some basic functionality for profiling DNNs. With a major update, PyTorch 1.8.1 announced PyTorch Profiler, the improved performance-debugging profiler for PyTorch DNNs.

One of the major improvements it has received is the performance visualisations attached to TensorBoard. As mentioned in the release article, there are 5 major features included in PyTorch Profiler:

  1. Distributed training view
  2. Memory view
  3. GPU utilization
  4. Cloud storage support
  5. Jump to source code

You don't need an extensive amount of code to analyze the performance of the network, just a few simple Profiler API calls. To get things started, let's see how you can use PyTorch Profiler to analyze the execution time and memory consumption of the popular resnet18 architecture. You need PyTorch 1.8.1 or higher to perform these actions.

import torch
import torchvision.models as models
from torch.profiler import profile, record_function, ProfilerActivity

use_cuda = torch.cuda.is_available()
device = torch.device("cuda:0" if use_cuda else "cpu")

#init simple resnet model
model = models.resnet18().to(device)

#create a dummy input
inputs = torch.randn(5,3,224,224).to(device)

# Analyze execution time
with profile(activities=[
        ProfilerActivity.CPU, ProfilerActivity.CUDA], record_shapes=True) as prof:
    with record_function("model_inference"):
        model(inputs)

#print the output sorted with CPU execution time
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))


#Analyzing memory consumption
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
        profile_memory=True, record_shapes=True) as prof:
    model(inputs)

#print the output sorted with CPU memory consumption
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))
Output from the execution time analysis
Output from the memory consumption analysis

We'll discuss using the Profiler visualizations for analyzing model behaviour in the next post.

Handling Imbalanced Classes with Weighted Loss in PyTorch

When it comes to real-world data collections, we rarely have the luxury of perfectly balanced labelled datasets for training models. Most machine learning algorithms are not immune to imbalanced classes, and the result is less accurate, biased models. There are many approaches we can follow to tackle the imbalanced data problem: either we choose an ML algorithm that is robust to imbalanced data, or we generate synthetic data in order to balance the classes.

Neural networks are trained using backpropagation, which treats each class the same when calculating the loss. If the data is not balanced, this makes the model biased towards one class over another.

A, B, C, D classes are imbalanced

I had to face this issue when experimenting with a computer vision based multi-class classification problem. The data I had was heavily skewed, and some classes had very little data compared to the majority class. The model was not performing well at all, and I needed to take some action to tackle the class imbalance problem.

These were the solutions I thought of trying out.

  1. Creating synthetic data –
    Creating new synthetic data points is one of the main methods, used mostly for numerical data and in some cases for image data too, with the help of GANs and image augmentations. As a starting point, I decided not to go with synthetic data generation since it may introduce abnormal characteristics to my dataset, so I kept that option for later.
  2. Sampling the dataset with balanced classes –
    In this approach, what we normally do is sample a similar number of data points for each label. For example, say we have a dataset with 3 classes named A, B & C with 100, 50 and 20 data points respectively. When sampling, we randomly select 20 samples from each of A, B & C and end up with a dataset of 60 data points.

In some cases this approach is a better option, if we have very large amounts of data for each class (even for the minority classes). In my case, I could not afford the cost of losing a huge portion of my data just by sampling it down to the number of data points in the minority class.

Since neither method worked well for me, I used a weighted loss function for training my neural network. As this is a multi-class classification problem, I used cross-entropy loss in PyTorch as my loss function. (You can follow a similar approach if you are using BCELoss for binary classification too.)

import torch
import torch.nn as nn

#class weights for 6-class multi-class classification (must be a Tensor with one weight per class)
class_weights = torch.tensor([0.5281, 0.8411, 0.9619, 0.8634, 0.8477, 0.9577])

#loss function with class weights
criterion = nn.CrossEntropyLoss(weight=class_weights)

How did I calculate the weight for each class?

This is quite simple. What I did was calculate a manual re-scaling weight for each class and pass it to the "weight" parameter of the loss function. Make sure the class weights are a Tensor whose size equals the number of classes (in simpler words, each class should have a weight).
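
For illustration, one simple recipe (I'm not claiming it's the only one) is to derive each weight from the class counts so that rarer classes get larger weights, e.g. weight_i = 1 - (count_i / total). A sketch with hypothetical per-class counts:

import torch

# hypothetical sample counts for the 6 classes
class_counts = torch.tensor([4719., 1589., 381., 1366., 1523., 423.])

# weight_i = 1 - (count_i / total): rarer classes get larger weights
class_weights = 1.0 - class_counts / class_counts.sum()

criterion = torch.nn.CrossEntropyLoss(weight=class_weights)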

Hint: if you are using a GPU for model training, make sure to move your class weights tensor to the GPU too.

Did it work? Hell yeah! Using this simple trick, I was able to train my model accurately, with less bias and without overfitting to a single class. Let me know any other tricks you use for training neural network models with imbalanced data.

Happy coding 🙂

Docker + Machine Learning : A Perfect Combo

Docker has become the new norm in the software industry. Everyone is obsessed with it, since Docker solves most of the issues software engineers and system administrators had with platform dependencies in application development and deployment.

“Docker is a tool that helps users to exploit operating-system-level virtualization to develop and deliver software in packages called containers.”  

~ Wikipedia

Though the technical explanation sounds a bit complicated, put simply Docker can be thought of as a 'VM-like' environment where you can build and deploy your software applications.

Why docker for machine learning/ deep learning?

We have endless discussions on how hard it is to configure development and deployment environments for machine learning. Since Python is the most used language for ML and DL experiments, dealing with Python packages and making them all work seamlessly on your hardware can be a nightmare. Using cloud-based machine learning platforms or virtual machines are some of the options we can use to deal with this problem.

Being more flexible than virtual machines and easy to migrate, Docker is one of the best ways to manage machine learning environments. Since Docker has become a key component of MLOps, it's time for data scientists to adopt Docker in their development workflows.

Where and how can we use Docker?

For me, Docker helps out in 4 main stages of the machine learning experiment pipeline.

  1. As a development environment.

I do a lot of experiments in the domain of computer vision and deep learning, and you may have experienced the pain of dealing with libraries like OpenCV in Python. So I always use custom Docker images with all the dependencies installed for running my experiments. This makes it easy to collaborate with my peers without the hassle of replicating my development environment on their machines.

What about the huge amounts of data? Should that go inside the Docker container too? Nah. I always keep the data, as well as the output files created by the experiments, in mounted volumes.

If you need GPU-enabled Docker images, NVIDIA provides image variations on Docker Hub to match your needs.

2. As a training environment.

You all know ML/DL models normally take quite a long time to train. In my case, I use remote shared servers with GPUs for training my experiments, and the easiest way to do that is to containerize the experiment and push it to the server.

3. As a deployment environment.

Another popular use case of Docker is in the deployment phase. Normally the deployment environment has to satisfy the required dependencies in order to run inference on the ML/DL model seamlessly. Since a Docker container can be shipped across platforms easily without worrying about platform-level dependencies, it's really easy to use Docker for deploying ML models.

4. Docker for cloud-based machine learning

Most data scientists today use cloud-based machine learning platforms like Azure Machine Learning for their flexibility and resources. Containerized experiments are the main mechanism these services use to run workloads on the cloud. When it comes to Azure ML, you can use the default Docker image for experiments or specify your own custom base image for model development and training.
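
As a rough sketch of what that looks like with the azureml-core (v1) Environment API (the environment name and image URL below are placeholders):

from azureml.core import Environment

# hypothetical environment built on a custom docker image that already holds all dependencies
env = Environment(name="custom-cv-env")
env.docker.base_image = "myregistry.azurecr.io/cv-training:latest"  # placeholder image
env.python.user_managed_dependencies = True  # dependencies come from the image, not from conda

# the environment can then be passed to a ScriptRunConfig when submitting a run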

Take a look at this documentation on deploying Azure ML models using a custom Docker base image.

So, Docker has become a lifesaver for me since it removes a lot of the headaches in the machine learning model life cycle. I'll come up with a sample experiment on using Docker for training a machine learning model in the next post.

10 Tips for Designing & Developing Computer Vision Projects

Computer vision based applications have become one of the most popular research areas and have gained a lot of interest across different industrial domains. The popularity and advancement of deep learning have given the computer vision hype a further boost.

Having been a researcher focused on computer vision based applications for nearly 3 years, here are some tips I'd give a developer who's stepping into a computer vision related experiment or deployment.

Before going further into the discussion, you may want to get an idea of the difference between traditional computer vision approaches and deep learning based approaches. Here's a quick overview of that.

01. Do we really have to use deep learning based computer vision approaches to solve this?

This is the very first thing to consider! When you see a problem for the first time, you may think applying deep learning is the saviour. That's not true in some cases: you may be able to solve the problem easily with traditional techniques such as line-detection filters, without wasting the time and energy of training a deep learning model for the task. Study the problem thoroughly and then decide whether to move forward or not.

02. Analyze the input data and the desired output

Obviously, deep learning based computer vision models take images or videos as their input modalities. Before starting the project implementation, we should consider the following factors of the input data we have.

Size of the data –

Since DL models need a huge amount of data (in most cases) to train without overfitting, we need to make sure we have a good amount of data in hand. We can't specify exact numbers here; I'd say the more the better!

Quality of the data –

Some of the image inputs or video streams we get are blurred and don't cover the most important features we need to build the models. Getting images/videos in higher resolution is always better. When considering the quality of the data, it's also worth looking at factors like class imbalance if it's a classification problem.

Similarity of training data and data inputs in the inference time –

I've seen cases where the data the model receives at inference time is very different from the data used in training (for example, the model is trained using cat images from cartoons and receives real-life cat images at inference time). Unless the model is specifically designed for domain adaptation, you should NEVER make this mistake.

03. Building from the scratch? Is it necessary?

As I said previously, computer vision is one of the most widely researched areas in deep learning. Because of that, you have the privilege of using pre-built models as well as online services to perform your computer vision workloads.

Services such as Azure Cognitive Services, the Google Vision APIs etc. provide pre-built web APIs which you can use directly for many vision-related tasks. From OCR tasks like reading text in a scanned document, to APIs that can even identify human faces and their emotions, there's no need to build from scratch; you can just consume the service as a web service in your application.

Going a step beyond the pre-built services, Microsoft Azure Cognitive Services offers a Custom Vision service where you can train your own image classification models with your own data. This may come in handy in many practical applications, where you don't need to spend time building the model or configuring the training environment.

04. Building from scratch? Is it REALLY necessary?

Yep! Again, a decision to take. If your problem cannot be addressed with the pre-built computer vision services available online, the way forward is to build a deep learning model and train it using your own data. When it comes to model development, one of the big mistakes we make is ignoring the existing models built by researchers for various purposes.

I'm pretty sure most of the computer vision tasks you have fall under well-known areas such as image classification, action recognition in videos, human pose detection, human/object tracking etc. There are many pre-built methods which have achieved state-of-the-art accuracy in solving these problems and have been benchmarked on most of the publicly available big datasets. For example, ResNet models were specifically designed for image classification and achieved top accuracy on the ImageNet dataset. You can easily use these models (most of them are available in the model zoos of popular deep learning frameworks), adapt their last layers to your needs, and get higher accuracy than by building your own model from scratch.

Papers with Code is a great place to search for existing models for various computer vision tasks.

I recently came across the OpenMMLab repositories, which come in pretty handy for such tasks (mostly for video analysis stuff).

05. Use the correct method

When building the models, make sure you follow the path that matches your input data. For example, if you only have a few training images for your classification model, you may need to look at areas like few-shot learning. Tricks such as adding batch normalization, using the correct loss functions, adding more input modalities, using learning rate schedulers and transfer learning will surely increase your model accuracy.
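
For instance, wiring up a learning rate scheduler in PyTorch takes only a couple of lines. A toy sketch (the model and schedule values here are placeholders, not a recommendation):

import torch
import torch.nn as nn

# a toy model just to illustrate attaching a scheduler to an optimizer
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# drop the learning rate by a factor of 10 every 10 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(30):
    # ... run one epoch of training here (forward, loss, backward, optimizer.step()) ...
    scheduler.step()
    print(epoch, scheduler.get_last_lr())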

06. Data augmentation is a saviour!

The more data the better! Always look for sensible data augmentation methods to make sure your model doesn't overfit to the training data, and always visualize your inputs before using them for model training to make sure your augmentations make sense.
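
As a sketch, a sensible torchvision augmentation pipeline for image classification might look like the following; the exact transforms and parameters are assumptions you should tune for your own data:

import torchvision.transforms as T

# a typical augmentation pipeline for training an image classifier
train_transform = T.Compose([
    T.RandomResizedCrop(224),
    T.RandomHorizontalFlip(),
    T.ColorJitter(brightness=0.2, contrast=0.2),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])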

07. Model training should not be a nightmare

This is the most time-consuming part of developing computer vision models. We all know training deep learning models needs a lot of computation power, so make sure you have enough of it. It'll be a nightmare to train an image classifier on 100,000 images using just your CPU! Make sure you have good enough GPUs for the computations and that they are configured correctly for training models.

08. Model inference time should not be years!

Model inference is often the least-considered part of model development, yet it is vital, since this is where the outcome is shown to the outside world. Sometimes your trained model may take a long time for inference, which can make it useless in a real-world application. Think of a human detection system you implemented taking 1-2 minutes to identify a person accessing a secured location… there's no use for such a system, since it doesn't meet the needs of real-time surveillance. Always aim to develop the simplest model that gives the best accuracy. Sometimes you may have to give up a few digits of accuracy to increase model efficiency, and that's totally fine in a real-world application. Before pushing the model into production, look at converting the model to ONNX or applying model pruning; it'll help you deploy efficient models.

09. Take a look on your deployment target

This connects directly with the points we discussed about model inference time. We don't have the luxury of high-end machines powered with GPUs, or high-powered cloud services, in all deployment locations. Sometimes your deployment target may be an IoT device, so make sure you design a lightweight model that still provides good performance while consuming fewer resources.

10. Privacy concerns

Last but not least, we have to look at privacy concerns. Since we are dealing with image and video data which may contain a lot of people's personal information, we need to make sure we follow the privacy guidelines and that the data we use for model training has the necessary security clearance for such tasks.

A bit lengthy… but I hope you got some clues before getting into your next computer vision project. Happy coding 😊

Open Neural Network Exchange (ONNX)

In the current AI landscape, there are plenty of programming languages, frameworks, runtime environments and hardware devices used by practitioners for developing and deploying machine learning and deep learning models. This technology stack widens even further when it comes to integrating these machine learning models into software development processes.

From experience with software development, we know that handling platform dependencies and getting all the components to work together smoothly is one of the biggest headaches developers face. It's no different in the machine learning space.

To address the problem of interoperability between different machine learning development frameworks, the industry is now adopting the "Open Neural Network Exchange" (ONNX).

What is ONNX?

ONNX acts as the open standard for representing ML/DL models

ONNX is an open format for representing both deep learning and traditional machine learning models. It increases the interoperability of models without tying them to a particular runtime environment or development tool.

In simple words, you can build your neural network in a deep learning framework like PyTorch and then run inference on it in a TensorFlow environment by converting it into an ONNX model!

ONNX is widely supported by most frameworks, tools and hardware (and since it's evolving rapidly, I'm pretty sure many more frameworks will come under ONNX in the near future).

Since ONNX is backed by big players in the AI space such as Facebook, Microsoft and AWS, you can easily use your familiar frameworks with ONNX.

Why ONNX?

Consider a scenario where you have built a deep learning based classification model for classifying grocery items, using PyTorch as your deep learning framework. At a later stage of development you need to use the model in an iOS mobile application, where machine learning operations are based on Core ML. You can export the PyTorch model to an ONNX model and then convert it for the Core ML runtime for inference.
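
As a rough sketch of the first step of that journey, exporting a PyTorch model to ONNX takes only a few lines; the model, input shape and file name here are placeholders:

import torch
import torchvision.models as models

# hypothetical classifier: in practice this would be your trained PyTorch model
model = models.resnet18(pretrained=True)
model.eval()

# a dummy input with the shape the model expects at inference time
dummy_input = torch.randn(1, 3, 224, 224)

# export to ONNX; the resulting file can then be consumed by ONNX Runtime or converted further
torch.onnx.export(model, dummy_input, "classifier.onnx",
                  input_names=["image"], output_names=["logits"],
                  opset_version=11)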

ONNX has proven its worth in scenarios where we have to deploy deep learning based models on IoT devices with limited computation power, and has shown a noticeable improvement in inference times.

With ONNX, you don't need to package the various framework dependencies on the deployment target; you just need the ONNX Runtime.

You can find the list of ONNX-supported tools and frameworks through this link.

In the coming posts, I'm going to discuss my experiences with setting up ONNX Runtime and using it with my favourite deep learning framework, PyTorch!

Happy coding 🙂