Citation: All the videos/images used in this post are taken from https://www.pexels.com/search/videos/german%20shepherd/

In the part1 blog, we understood how to set up the data pipeline (i.e) converting the downloaded videos to frames and annotating the images(bb boxes) using CVAT. Now we are all set for the implementation of the object detection model.

step1: Gathered 13 videos from pexels.com. One point to be taken care of here is, when we download the videos there should be some similarity between the train, validation and test set videos. …


For this post, let’s focus specifically on how to leverage remote sensing images in Agriculture(to monitor plant health). Before jumping into the aerial images, let’s take a moment to explore some of the sources from where these images could be retrieved. As we all know, multiple earth observation satellites take pictures of the earth as they orbit around. These images carry a wide range of information that could be applied to many areas and one such field is Agriculture(precision farming).

Let’s consider some of the popular aerial Imagery Data Sources including images from Landsat-8 and Sentinel-2. Landsat-8 is the 8th…


Earlier we’ve gained some interesting ideas on how object detection models are applied to aerial images. Detection models are effective when the objects are clearly distinguishable (i.e) the user can effortlessly place a bounding box on the desired objects. But it is not always the case, consider the scenario of a vegetation land where we want to infer about the quality, growth etc. In these situations, localisation using bb boxes might not be a good fit. However, Semantic Segmentation would bring in beneficial information.

Photo by USGS on Unsplash

To get the maximum details, Semantic segmentation is processed in the spectral, spatial and temporal domain…


Till now we have seen how to locate objects in custom datasets and the images considered so far have been taken from cameras/video recorders. Let’s extend this knowledge to object detection for satellite imagery datasets. The application of remote sensing images are boundless and some of them include urban planning, precision agriculture, locating water resources and post-disaster recovery. Effective processing of these images can assist us in a lot many beneficial ways.

One of the prime differentiating factors between the two types of images mentioned above is the count of the objects in an image. In the case of satellite…


Semantic Segmentation is a technique where each pixel in the image is categorized into a specific class. It is applied in various fields such as medicine to locate tumours, track medical instruments in operations and scene parsing for autonomous vehicles.

Fig 1 — Show a sample Image from Cityscapes

Since its introduction, it has evolved and undergone various stages. In this article, we’ll discuss different methods and procedures that have been leveraged for semantic segmentation.

Fully Convolutional Networks for Semantic Segmentation: Pre-trained Convolutional NNs play a crucial role in feature extraction. By using existing CNNs such as AlexNet, VGG net & GoogLeNet and having the last layers as another…


In the previous article, we had a quick glimpse of different types of data and also the volume. In general, the humongous volume of data is referred to as Big Data with the below characteristics,

  • Volume of the data goes beyond petabytes(>1000⁵)
  • Veracity due to the inconsistency of the data(unreliability)
  • Variety in the data (from multiple resources and different formats)
  • Velocity of the data (i.e) the rate at which the data flows into the system

These properties pose a challenge while processing and storing Big data. To handle such data elegantly, a couple of platforms have been introduced(Hadoop, Spark). Some…


Photo by Patricia Prudente on Unsplash

Nowadays we have super-rich data available from various resources(online, mobile devices and the Internet of Things). Initially, we had only data in kilobytes, followed by a huge data expansion(explosion) resulted in a volume of petabytes, exabytes, zettabytes and yottabytes. In general, data can be broadly classified into three categories (1) Structured — takes tabular format with rows and columns (2) Semi-structured — has data in XML formats with tags (3) Unstructured — images, texts and speech.

Almost 90% of the available data is Unstructured and any transaction related details would result in structured data. These data are either generated by…


Massive advancements have been made in the area of object detection. But the computational cost of SOTA models is high due to the complex architectures. The gigantic model size and the huge computation pose challenges for real-world applications such as Self-driving cars(where the latency and size of the model are bounded). Because of this constraint, model efficiency became the focal point of the researchers.

Now comes the question of how to build a scalable detection architecture with higher accuracy and better efficiency. By making careful design choices while building the detection models, the aforementioned circumstance has been neatly handled. …


Photo by Christopher Burns on Unsplash

For this post, let’s focus on how Transfer Learning can be best utilized in the field of medicine for categorizing clinical images. Transfer Learning(using pre-trained weights) in MIC has shown promising results. The research paper (Rethink Transfer Learning in MIC) explores various strategies of TL on both shallow and deep networks for classification on two chest x-ray datasets. The experiments reveal that finetuned truncated deep models outperform when there is inadequate training data.

Employing Transfer learning for medical image classification as well as segmentation has become standard practice. Usually, the convolutional neural networks are pre-trained on large-scale datasets such as…


This article will explore the research paper “Rethinking Image Net Pre-training”. For computer vision tasks, employing pre-trained weights(trained on ImageNet) as a starting point is the norm. This paper takes a different route of training the model from scratch using randomly initialized weights. The accuracy results on the COCO dataset(object detection and instance segmentation) are encouraging and on par with the architectures that use pre-trained weights of ImageNet datasets.

During the new training process(from scratch), two prime points to be taken into account (1) Normalization technique needs to be used for optimisation (2) The model needs to be trained for…

Nandhini N

AI Enthusiast | Blogger✍

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store