Citation: All the videos/images used in this post are taken from https://www.pexels.com/search/videos/german%20shepherd/

In the part 1 post, we saw how to set up the data pipeline, i.e. converting the downloaded videos to frames and annotating the images with bounding boxes using CVAT. Now we are all set to implement the object detection model.

Step 1: Gathered 13 videos from pexels.com. One point to take care of here: when we download the videos, there should be some similarity between the train, validation and test set videos. …


In the past, we’ve seen different types of neural network architectures, from simple fully connected networks to CNNs for image processing (computer vision). Another category in that list is Recurrent Neural Networks (RNNs), whose main target is sequential data. In a sequence, the current data point depends on the past. Examples include time series (sales prediction), speech, text and video, where the current information results from the accumulation of previous details.

For sequential data, ordinary neural networks might not be very effective, as these networks can only process…


Today’s post is all about words to vectors, i.e. converting words into numbers that computers can process efficiently. Let’s start with some examples:

Text1: The cat chases the rat.

Text2: The dog chases the cat.

We can start by assigning sequence numbers to the words:

The = 1, cat = 2, chases = 3, rat = 4, dog = 5

Can we directly replace the words with the respective numbers? Let’s see what happens when we do that,

Text1: 1 2 3 1 4

Text2: 1 5 3 1 2
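The numbering above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `tokenize`, `build_vocab` and `encode` are made up for illustration, and it assumes words are lowercased and the trailing period is dropped before indexing.

```python
def tokenize(sentence):
    # Assumption: lowercase everything and drop periods before splitting on spaces.
    return sentence.lower().replace(".", "").split()


def build_vocab(sentences):
    # Give each distinct word a 1-based index in order of first appearance.
    vocab = {}
    for sentence in sentences:
        for word in tokenize(sentence):
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def encode(sentence, vocab):
    # Replace every word with its index.
    return [vocab[word] for word in tokenize(sentence)]


texts = ["The cat chases the rat.", "The dog chases the cat."]
vocab = build_vocab(texts)
encoded = [encode(text, vocab) for text in texts]

print(vocab)    # {'the': 1, 'cat': 2, 'chases': 3, 'rat': 4, 'dog': 5}
print(encoded)  # [[1, 2, 3, 1, 4], [1, 5, 3, 1, 2]]
```

The indices only record the order in which words were first seen, so a larger number does not mean a "bigger" or more important word.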

Naively substituting numbers like this might alter the meaning, i.e. some of the words…


This post continues the preprocessing steps followed when dealing with NLP use cases.

  • Usage of the WordNet library
  • Lemmatization

Photo by Antonio Gabola on Unsplash

Usage of the WordNet library: WordNet can be thought of as a dictionary of synonyms. We can use it from Python (for example through NLTK) to check whether a token is a proper English word. If it is a junk token, it can be ignored, thus reducing the overall word count.

Let’s consider a sample text,

Text = 'skype problem Requires approval erp details'

We can say the words ‘skype’ and ‘erp’ are not standard English words but technical terms. Now, we can…
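One way to drop such tokens, which may differ from what the full post does, is sketched below. The function `drop_unknown_tokens` and the tiny `SAMPLE_DICTIONARY` are hypothetical stand-ins so the example runs without any corpus download; a real check would query WordNet instead of a hand-written set.

```python
def drop_unknown_tokens(text, is_known_word):
    # Keep only the tokens that the supplied predicate recognises as words.
    kept = [token for token in text.split() if is_known_word(token.lower())]
    return " ".join(kept)


# Stand-in for a real dictionary lookup (assumption: WordNet would be used here).
SAMPLE_DICTIONARY = {"problem", "requires", "approval", "details"}

text = "skype problem Requires approval erp details"
cleaned = drop_unknown_tokens(text, lambda word: word in SAMPLE_DICTIONARY)

print(cleaned)  # problem Requires approval details
```

With NLTK installed and its WordNet corpus downloaded, the predicate could be something like `lambda w: bool(wordnet.synsets(w))`. That substitution is an assumption, and whether a given token is found depends on the corpus contents.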


In the previous article, we got a grip on regular expressions and lowercase conversion. The next interesting step is to further refine the input text before the actual model training. The next couple of posts will cover a series of preprocessing steps for the input data.

Table of Contents:

  • Pycontractions
  • Stop words removal
  • Word display using Word cloud

Pycontractions: We usually follow an informal tone while communicating on social media such as Twitter or leaving a review comment on Amazon or other online stores. We tend to use a lot of contractions to keep the text simple &…
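To make the idea of expanding contractions concrete, here is a deliberately naive sketch. It does not use the Pycontractions package; it swaps in a plain lookup table, and the `CONTRACTIONS` mapping and `expand_contractions` helper are hypothetical names chosen for illustration. It assumes the text has already been lowercased.

```python
import re

# Tiny hand-written table (assumption: a real project would need far more entries).
CONTRACTIONS = {
    "can't": "cannot",
    "don't": "do not",
    "it's": "it is",
    "i'm": "i am",
}

# One pattern that matches any key of the table as a whole word.
_PATTERN = re.compile(r"\b(" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b")


def expand_contractions(text):
    # Replace each matched contraction with its expanded form.
    return _PATTERN.sub(lambda match: CONTRACTIONS[match.group(0)], text)


expanded = expand_contractions("i'm sure it's fine, don't worry")
print(expanded)  # i am sure it is fine, do not worry
```

A fixed table like this cannot resolve ambiguous cases such as "it's" (it is / it has), which is one reason to reach for a dedicated library.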


Photo by Amador Loureiro on Unsplash

We’ve all heard the popular data science saying “Garbage In, Garbage Out”. Data preprocessing plays a vital role in any model-building effort. A simple model can produce accurate results with well-curated data. Conversely, a complex model can fail to meet its targets because of poorly selected input data.

Just like any other deep learning model, NLP models require properly cleaned and processed input data to be powerful. The very first step in NLP data preprocessing involves regular expressions. Let’s gain a better intuition…


Photo by Brett Jordan on Unsplash

So far we’ve extensively discussed image processing and the corresponding subfield, computer vision. It’s time to switch gears to yet another interesting topic in deep learning: NLP, which stands for Natural Language Processing. The core theme of NLP is to make computers understand human languages.

The wide range of NLP applications includes chatbots, automatic voice response systems, language translation, speech-to-text conversion and many more. Popular NLP products we see in daily life include Alexa and Google Translate, to name a few.

Why do computers need to understand language? Starting from customer reviews…


Photo by Zac Harris on Unsplash

Nowadays, in many high-security places (banks, government offices, etc.) we can see face verification systems. The critical job of such a system is to confirm whether the person is an authorized employee. Ever wondered how the system works in the background? Siamese networks assist this kind of verification process. The application is not confined to face verification; it also extends to signature verification, checking whether two signatures are from the same customer.

Let’s unravel the working principle behind the Siamese Neural Networks from the original research paper.

What is one-shot learning? If we take the case of usual image classification…


This article will be a quick overview of autoencoders.

Fig 1 — shows the schematic structure of an autoencoder from wiki

An autoencoder is a type of neural network whose target output is its own input. It is trained in an unsupervised manner, without the need for target labels. The network is divided into two major components: (1) Encoder and (2) Decoder.

Encoder: The prime job of the encoder unit is to extract the critical features from the input vector (images, text, etc.), thereby reducing the original feature space.

Decoder: The decoder portion of the network tries to reconstruct the original input from the extracted features.

Learning Process: The entire learning happens by comparing…


In this post, we’ll discuss Google’s EfficientNetV2, released in April 2021. The article is a quick review of the original research paper. EfficientNetV2 is a new family of convolutional networks that is more efficient in terms of both training speed and accuracy. The core idea is to combine training-aware neural architecture search and scaling. The experiments show that the new models train faster than the current SOTA architectures while being up to 6.8x smaller.

Fig 1: comparison of accuracy (source)

  • The training process is further enhanced by progressively increasing the image size while adaptively adjusting regularization techniques such as dropout and data…

Nandhini N

AI Enthusiast
