Tips | NaadiSpeaks

Computer vision has become one of the most popular research areas and has gained a lot of interest across different industrial domains. The popularity and advancement of deep learning have boosted the hype around computer vision.

Having been a researcher focused on computer vision based applications for nearly 3 years, here are some tips I’d give a developer who’s stepping into a computer vision related experiment or deployment.

Before going further into the discussion, you may want to get an idea of the difference between traditional computer vision approaches and deep learning based approaches. Here’s a quick overview of that.

01. Do we really have to use deep learning based computer vision approaches to solve this?

This is the very first thing to consider! When you look at a problem from scratch, you may think deep learning is the saviour. That’s not true in some cases. You may be able to solve the problem easily with traditional techniques such as line detection filters, without spending the time and energy to train a deep learning model. Observe the problem thoroughly and then decide whether to go forward with deep learning or not.
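To make that concrete, here is a minimal sketch of a traditional filter at work: a Sobel kernel that finds a vertical edge without any training at all. The tiny 5x5 image and the threshold are made-up values for illustration, and the function names are my own, not from any library.

```python
# Minimal sketch: find a vertical edge with a Sobel filter, no deep learning needed.
# The tiny 5x5 "image" and the threshold are made-up values for illustration.

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def sobel_x(image):
    """Return the horizontal-gradient response for every interior pixel."""
    height, width = len(image), len(image[0])
    response = [[0] * width for _ in range(height)]
    for row in range(1, height - 1):
        for col in range(1, width - 1):
            total = 0
            for i in range(3):
                for j in range(3):
                    total += SOBEL_X[i][j] * image[row + i - 1][col + j - 1]
            response[row][col] = total
    return response


def edge_columns(image, threshold=100):
    """Columns where any pixel's absolute gradient exceeds the threshold."""
    response = sobel_x(image)
    return sorted({col for line in response
                   for col, value in enumerate(line) if abs(value) > threshold})


# Dark left half (0), bright right half (255): one vertical edge.
image = [[0, 0, 0, 255, 255] for _ in range(5)]
print(edge_columns(image))  # [2, 3]
```

If a few lines like this already answer your question, a deep learning model is probably overkill.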

02. Analyze the input data and the desired output

Obviously, deep learning based computer vision models take images or videos as their input modalities. Before starting the project implementation, we should consider the following factors of the input data we have.

Size of the data –

Since DL models need a huge amount of data (in most cases) to train without overfitting, we need to make sure we have a good amount of data in hand. We can’t specify exact numbers here. I’d say the more the better!

Quality of the data –

Some image inputs or video streams we get are blurred and don’t cover the most important features we need to build the models. Getting images/videos in higher resolution is always better. When considering the quality of the data, it’s also good to look at factors like class imbalance if it’s a classification problem.
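A quick way to spot class imbalance is simply to count the labels. The sketch below assumes your labels are available as a plain list of strings; the cat/dog labels are hypothetical. It also computes inverse-frequency weights, one common way (not the only one) to make rare classes count more in the loss.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the dataset as a fraction."""
    counts = Counter(labels)
    total = len(labels)
    return {name: count / total for name, count in counts.items()}


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * count), so rare classes count more."""
    counts = Counter(labels)
    total = len(labels)
    return {name: total / (len(counts) * count) for name, count in counts.items()}


labels = ["cat"] * 90 + ["dog"] * 10   # hypothetical, heavily imbalanced labels
print(class_balance(labels))
print(inverse_frequency_weights(labels))
```

Here 90% of the images are cats, so a model that always answers "cat" would already look 90% accurate. That is exactly why the counts are worth checking before training.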

Similarity of training data and data inputs in the inference time –

I’ve seen cases where the data the model gets at inference time is very different from the data used in training (for example, the model is trained on cat images from cartoons and gets real-life cat images at inference time). Unless the model is specifically designed for domain adaptation, you should NEVER make this mistake.

03. Building from scratch? Is it necessary?

As I said previously, computer vision is one of the most widely researched areas in deep learning. So you have the privilege of using pre-built models as well as online services to perform your computer vision workloads.

Services such as Azure Cognitive Services and the Google Vision APIs provide pre-built web APIs that you can use directly for many vision related tasks. From an OCR task such as reading text in a scanned document to APIs that can identify human faces and even their emotions, there’s no need to build from scratch. You can just consume the service as a web service in your application.

Going a step beyond the pre-built services, Microsoft Azure Cognitive Services offers a Custom Vision service where you can train your own image classification models with your own data. This comes in handy in many practical applications where you don’t want to spend time building the model or configuring the training environment.

04. Building from scratch? Is it REALLY necessary?

Yep! Again, a decision to make. If your problem cannot be addressed by the pre-built computer vision services available online, the way forward is to build a deep learning model and train it on your own data. When it comes to model development, one of the biggest mistakes we make is neglecting the existing models that researchers have already built for various purposes.

I’m pretty sure most of the computer vision tasks you have fall under well-known computer vision areas such as image classification, action recognition in videos, human pose detection, human/object tracking etc. There are many pre-built methods that have achieved state-of-the-art accuracy on these problems and have been benchmarked on most of the big publicly available datasets. For example, ResNet models were designed for image classification and achieved state-of-the-art accuracy on the ImageNet dataset when they were introduced. You can easily take these models (most of them are available in the model zoos of popular deep learning frameworks), adapt their last layers to your needs, and get higher accuracy than you would by building your own model from scratch.

Papers with Code is a great place to search for existing models for various computer vision tasks.

I recently came across the OpenMMLab repositories, which come in pretty handy for such tasks. (Mostly for video analysis stuff)

05. Use the correct method

When building the models, make sure you follow the path that matches your input data. For example, if you only have a few training images for your classification model, you may need to look at areas like few-shot learning. Tricks such as adding batch normalization, using the correct loss function, adding more input modalities, using learning rate schedulers and transfer learning can often increase your model accuracy.
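As an illustration of one of those tricks, here is a minimal sketch of a learning rate scheduler using cosine annealing. The function name and the default `base_lr` and `min_lr` values are assumptions picked for the example; deep learning frameworks ship their own schedulers, which you would normally use instead.

```python
import math


def cosine_lr(step, total_steps, base_lr=0.1, min_lr=0.001):
    """Cosine annealing: decay smoothly from base_lr to min_lr over total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


for step in (0, 50, 100):
    print(step, round(cosine_lr(step, total_steps=100), 4))
# 0 0.1
# 50 0.0505
# 100 0.001
```

The idea is to take large steps early, when the model is far from a good solution, and small steps near the end, when it only needs fine adjustments.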

06. Data augmentation is a saviour!

The more data the better! Always look for sensible data augmentation methods to make sure your model doesn’t overfit the training data. Always visualize your inputs before using them for model training to make sure your data augmentations make sense.
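Here is a minimal sketch of one augmentation, a random horizontal flip, plus a crude text "visualization" to eyeball the result. The 3x3 image and the helper names are hypothetical; in a real project you would use your framework's augmentation utilities and plot actual images.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def augment(image, rng, flip_probability=0.5):
    """Randomly flip the image; return it unchanged otherwise."""
    if rng.random() < flip_probability:
        return horizontal_flip(image)
    return image


def show(image):
    """Print a crude text rendering so the augmentation can be eyeballed."""
    for row in image:
        print("".join("#" if pixel else "." for pixel in row))


image = [[1, 0, 0],
         [1, 1, 0],
         [1, 1, 1]]
show(image)
print()
show(horizontal_flip(image))
```

A flipped cat is still a cat, so this augmentation is sensible there. A flipped page of text is not readable text any more, which is the kind of thing you only notice when you look at the augmented inputs.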

07. Model training should not be a nightmare

This is the most time-consuming part of developing computer vision models. We all know that training deep learning models needs a lot of computation power, so make sure you have enough of it. It’ll be a nightmare to train an image classifier on 100,000 images using just your CPU! Make sure you have good enough GPUs for the computations and that they are configured correctly for training.

08. Model inference time should not be years!

Model inference is the least considered part of model development, though it is the most vital one, since this is where the outcome is shown to the outside world. Sometimes your trained model may take so long at inference that it becomes useless in a real-world application. Think of a human detection system that takes 1-2 minutes to identify a person who’s accessing a secured location…. There’s no use for such a system since it doesn’t meet the need for real-time surveillance. Always try to develop the simplest model that gives the accuracy you need. Sometimes you may have to give up a few digits of accuracy to increase the model’s efficiency. That’s totally fine in a real-world application. Before pushing the model into production, take a look at converting the model to ONNX or at model pruning. It’ll help you deploy efficient models.
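Before arguing about whether a model is fast enough, measure it. The sketch below times a prediction function with a few warm-up calls and reports the median and worst latency. `dummy_predict` is a hypothetical stand-in for your real model's forward pass, and the run counts are arbitrary.

```python
import statistics
import time


def measure_latency(predict, sample, warmup=3, runs=20):
    """Return the median and worst wall-clock time (seconds) of predict(sample)."""
    for _ in range(warmup):          # warm-up calls are not timed
        predict(sample)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings), max(timings)


def dummy_predict(image):
    """Hypothetical stand-in for a real model's forward pass."""
    return sum(sum(row) for row in image) > 0


sample = [[1] * 64 for _ in range(64)]
median_s, worst_s = measure_latency(dummy_predict, sample)
print(f"median {median_s * 1000:.3f} ms, worst {worst_s * 1000:.3f} ms")
```

Run it on the hardware you will actually deploy to. The numbers from your training machine can be very misleading.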

09. Take a look at your deployment target

This connects directly with the points we discussed under model inference time. We don’t have the luxury of high-end machines powered with GPUs in every deployment location, or of high-powered cloud services. Sometimes our deployment target may be an IoT device. So make sure you design a lightweight model that still performs well while consuming fewer resources.

10. Privacy concerns

Last but not least, we have to look at privacy concerns. Since we are dealing with image and video data that may contain a lot of personal information about people, we need to make sure we are following the privacy guidelines and that the data we use for model training has the clearance needed for such tasks.

Bit lengthy… but hope you got some clues before getting into your next computer vision project. Happy coding 😊

We all wait for Fridays! Chill out in the evening… hang out with besties… roam… shopping… sleep for long hours… relax… and many more…

No matter how exciting your office is, and even if you have the funniest office buddies, at some point you’re bound to feel like… “Why are weekdays so boring!”

This happens because we are all stuck in the same routine every day. Getting on the bus/train at the same time… walking the same route… eating from the same food spot… leaving the office at the same time… then you get bored!

Here are some low-cost or no-cost tips to try on a weekday evening to change your normal boring routine.

Change your way of traveling

If you travel home or to your boarding place by bus every day after work, why not try the train? At least for a short distance. You’ll meet new travel buddies and stories. Your Cinderella might be there too 😉

Walk… as long as you could…

One thing I hate is the traffic after office hours. Fully packed, snail-paced buses make me so bored!! Walk… that’s the solution! Leave the office, then walk towards your home. You’ll be able to save 10-20 rupees on your bus fare. Use a shortcut that is too narrow for buses to squeeze through. Google Maps may help you here. You’ll be able to skip your gym session for a day and enjoy a good breeze without AC!

TIP: ladies, put your heels in your bag and put on your slippers. Your boss is not on the road!

Use alternate routes

Taking the same route every day makes us robots! So use alternate routes. Sometimes you may have to spend a few more minutes on the road. Maybe at crowded bus halts… in tuk tuks… but trust me. It’s worth a try! A small adventure is better than nothing!

Click click click

You don’t have to go to a beach or on a safari to get great clicks! Imagine the streets as your studio. Get your smartphone out & try different angles of the buildings. Sky… road signs… men… women… litter… anything! Create some art with it. Post on insta! Ta da!!

If you have a DSLR, get that giant out! Beware of the security guards at high-security zones and shopping malls!

Be a food adventurer!

If you always eat the same food from the same place, either you don’t have taste buds or they are dead! So try different food from different places. There may be some delicious dishes in the nearest small “Bath Kade”! If you’re afraid of putting new things into your digestive system, use “Siddalepa asamodagam” after the meal!

Explore your office and its surroundings.

Have you ever checked what’s outside your cubicle and where the service elevator of your tower is? For me, looking for the spots that few people use is fun! Try a ride in your service elevator! It may sound weird! But try it. Peep out the back door. There might be a friendly car waiting to be your companion.

See the sea

If you work in a coastal city like Colombo, getting to the sea is easy, a 30-minute ride at most. So try it. Yeah, alone… the sea will heal you… She may take all your sufferings…

Greet strangers

You meet a hell of a lot of people regularly. In the elevator… on the road… Break that awkward elevator silence. Do say good morning to them! Greet them with a good smile. It costs you nothing. But he/she may grab you for lunch the next day!