“The flexibility and cost efficiency of traffic monitoring using Unmanned Aerial Vehicles (UAVs) has made such a proposition an attractive topic of research.”


Unmanned Aerial Vehicles (drones) are emerging as a promising technology for both environmental and infrastructure monitoring, with broad use in a plethora of applications. Many such applications require the use of computer vision algorithms in order to analyse the information captured from an on-board camera. In particular, Road Traffic Monitoring (RTM) systems constitute a domain where the use of UAVs is receiving significant interest.

In traffic monitoring applications, UAVs can perform vehicle monitoring without the need for sensors embedded within cars, and can be deployed over an area of interest at no additional cost. In this post, motivated by the work in [1], we are interested in counting the vehicles that pass through each lane. We will use computer vision for this purpose, as UAVs usually come equipped with cameras 📷, making them an affordable and attractive sensor option for our application. …

Keras is a useful API for deep learning that also includes various pretrained models that you can use for transfer learning.

UPDATE! Now works with tf.keras!

The Keras API

Keras is a high-level API (Application Programming Interface) for deep learning. That is, it does not itself implement deep learning functionality but is built on top of existing deep learning frameworks such as TensorFlow, and provides improved functionality, faster implementation cycles, and added features. One of its most useful features is that it provides access to a large pool of existing deep learning models which are pre-trained on ImageNet (a rather time-consuming and computationally demanding process). …
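As a rough sketch of the transfer-learning workflow described above (the model choice, input size, and class count here are illustrative, not from the original post):

```python
import tensorflow as tf

def build_transfer_model(num_classes=2, input_shape=(224, 224, 3),
                         weights="imagenet"):
    # Pretrained convolutional base, without the ImageNet classifier head
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights)
    base.trainable = False  # freeze the pre-trained features

    # Attach a small classifier head for the new task
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With the base frozen, only the new head is trained, which is what makes transfer learning so much cheaper than training from scratch.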

Deep learning approaches have demonstrated state-of-the-art performance in various computer vision tasks such as object detection and recognition. In this post I provide details on how to develop and train a Convolutional Neural Network (CNN) to detect top-view vehicles from UAV footage.


Unmanned Aerial Vehicles (drones) are emerging as a promising technology for both environmental and infrastructure monitoring, with broad use in a plethora of applications. In particular, Road Traffic Monitoring (RTM) constitutes a domain where the use of UAVs is receiving significant interest. Under the above deployments, UAVs are responsible for searching, collecting and sending, in real time, vehicle information for traffic regulation purposes from on-board camera sensors. For this purpose, a deep convolutional neural network (CNN) is developed to detect vehicles in images, and an appropriate detection algorithm is formed. …

The objective of this project is to label pixels corresponding to road in images using a Fully Convolutional Network (FCN).


This specific module was a collaboration between Udacity and NVIDIA's Deep Learning Institute. It covers semantic segmentation and inference optimization.

Semantic segmentation identifies free space on the road at pixel-level granularity, which improves decision-making ability. Inference optimizations accelerate the speed at which neural networks can run, which is crucial for computationally intensive models like semantic segmentation networks.

The objective is to build and train fully convolutional networks that output an entire image, instead of just a classification output. For this purpose we implement the three special techniques that FCNs use (1x1 convolutions, upsampling, and skip layers) to train our own FCN models. …
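The three techniques can be illustrated with plain NumPy (a conceptual sketch, not the trained network from the project): a 1x1 convolution is just a per-pixel linear map over channels, upsampling grows the spatial resolution, and a skip layer adds in higher-resolution features.

```python
import numpy as np

def conv_1x1(x, w):
    # x: (H, W, C_in); w: (C_in, C_out). A 1x1 convolution mixes channels
    # independently at every pixel, i.e. a matrix multiply over channels.
    return x @ w

def upsample_nearest(x, factor):
    # Nearest-neighbour upsampling: repeat rows and columns.
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def fcn_head(features, skip, w):
    # Reduce channels with a 1x1 conv, upsample to the skip layer's
    # resolution, then add the skip connection element-wise.
    x = conv_1x1(features, w)
    x = upsample_nearest(x, skip.shape[0] // x.shape[0])
    return x + skip
```

In a real FCN the upsampling is a learned transposed convolution, but the data flow is the same as above.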

The goal of this project was to design a path planner that is able to create smooth, safe paths for the car to follow along a 3-lane highway with traffic. The path planner should be able to keep inside its lane, avoid hitting other cars, and pass slower moving traffic all by using localization, sensor fusion, and map data.


The goal of this project is to safely navigate a vehicle around a virtual highway. The requirement is to design a path planner that creates smooth trajectories for the car to follow that are comfortable for a passenger; hence, the path should minimize jerk. In simulation, the car was able to travel more than 4.32 miles without an incident. …
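One standard way to generate a jerk-minimizing segment in this kind of planner is to fit a quintic polynomial between a start and end state; the sketch below shows the idea (it is not necessarily the project's exact implementation):

```python
import numpy as np

def jerk_minimizing_trajectory(start, end, T):
    # start, end: [position, velocity, acceleration]; T: segment duration.
    # The minimum-jerk trajectory is a quintic polynomial
    # s(t) = a0 + a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5.
    # The first three coefficients follow directly from the start state;
    # the last three come from a 3x3 linear solve on the end state.
    a0, a1, a2 = start[0], start[1], start[2] / 2.0
    A = np.array([[T**3,    T**4,     T**5],
                  [3*T**2,  4*T**3,   5*T**4],
                  [6*T,     12*T**2,  20*T**3]])
    b = np.array([end[0] - (a0 + a1*T + a2*T**2),
                  end[1] - (a1 + 2*a2*T),
                  end[2] - 2*a2])
    a3, a4, a5 = np.linalg.solve(A, b)
    return [a0, a1, a2, a3, a4, a5]
```

Evaluating the polynomial and its derivatives at t = T reproduces the requested end position, velocity, and acceleration, which is what keeps the ride smooth across segment boundaries.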

LBPs are local patterns that describe the relationship between a pixel and its neighborhood.

Local Binary Patterns (LBPs) have been used for a wide range of applications ranging from face detection [1], [2], face recognition [3], facial expression recognition [4], pedestrian detection [5], to remote sensing and texture classification [6] amongst others in order to build powerful visual object detection systems [7]. Many variants of LBPs have been proposed in literature [8]. The most common approach however, dictates that each 3×3 window in the image is processed to extract an LBP code. The processing involves thresholding the center pixel of that window with its surrounding pixels using the window mean, window median or the actual center pixel, as thresholds. …
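The 3×3 thresholding step described above can be written in a few lines (a sketch of the common center-pixel variant; the bit ordering is a convention and varies between implementations):

```python
import numpy as np

def lbp_code(window, threshold=None):
    # window: 3x3 grayscale patch. Each of the 8 neighbours is thresholded
    # against the center pixel (window mean or median are alternative
    # thresholds), and the resulting bits are packed into one 8-bit code.
    if threshold is None:
        threshold = window[1, 1]
    # Neighbours taken clockwise, starting at the top-left corner
    neighbors = [window[0, 0], window[0, 1], window[0, 2], window[1, 2],
                 window[2, 2], window[2, 1], window[2, 0], window[1, 0]]
    code = 0
    for bit, p in enumerate(neighbors):
        if p >= threshold:
            code |= 1 << bit
    return code
```

Sliding this over every 3×3 window of an image and histogramming the codes gives the texture descriptor used by the detection systems cited above.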

Filtering and feature extraction are both very important tasks for efficient object recognition in embedded vision systems.

Perhaps one of the simplest, but also effective, forms of filtering is using color information, which can be a very important factor in recognizing and detecting specific objects. For example, if you are interested in developing an app to detect a red car, then a wise first step during pre-processing is to try and identify regions within the image that are mostly red. Of course, things are not always this simple, but such approaches can be effective in many applications.

Below is an example of code written in Python using the OpenCV computer vision library that interfaces with a camera and recognizes the red color in the video stream. Using the same code you can recognize different colors by changing the lower and upper color bounds. …

“Parallel computing” stands for the ability of computer systems to perform multiple operations simultaneously. The main driver behind parallel computing is the fact that large problems can be divided into smaller ones which can then be solved in parallel — i.e. executed concurrently on the available computing resources.


The quest for squeezing more performance per silicon has been going on since the dawn of computing; from the deep execution pipelines of processors in the early 2000s, to the multicore SoCs found in today's mobile phones, to the General Purpose Graphics Processing Units that will power AI applications in the future. …
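The divide-and-conquer idea described above can be sketched in a few lines of Python (a toy illustration using threads; the chunking scheme is the point, not the specific workload):

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(bounds):
    # Solve one sub-problem: sum one chunk of the range.
    lo, hi = bounds
    return sum(range(lo, hi))

def parallel_sum(n, workers=4):
    # Divide the large problem [0, n) into `workers` chunks...
    step = n // workers
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    # ...solve the chunks concurrently, then combine the results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))
```

For CPU-bound work in Python a process pool (or a compiled language) is needed to get true parallel speedup, but the decomposition pattern is identical.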

Object detection deals with determining whether an object of interest is present in an image/video frame or not. It is a necessary task for embedded vision systems as it enables them to interact more intelligently with their host environment, and increases their responsiveness and awareness with regards to their surroundings.

The detection/discovery of visual objects is a perceptual and cognitive task fundamental to vision and intelligence. It can be useful for a wide range of embedded applications ranging from robotics, surveillance and census systems, human-computer interaction, and intelligent transport systems, to military applications.

Object Detection Process

The process of visual object detection deals with determining whether an object of interest is present in an image/video frame or not, regardless of its size, orientation, and the environmental conditions in which it is found. This high degree of variability makes it difficult to describe an object analytically by following a step-by-step algorithmic approach. Hence, object detection is typically viewed as a machine learning pattern recognition problem where the goal is, given an image region, to classify it as object or non-object. There are different methods used to perform object detection, the most notable of which…
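The classic formulation of this "object or non-object" question is a sliding window: scan the image with a fixed-size window and run a binary classifier at each location. A minimal sketch (the classifier here is a stand-in for a trained model):

```python
import numpy as np

def sliding_window_detect(image, window, step, classify):
    # Scan the image with a (h, w) window at the given stride and collect
    # every location the classifier labels as an object.
    h, w = window
    detections = []
    for y in range(0, image.shape[0] - h + 1, step):
        for x in range(0, image.shape[1] - w + 1, step):
            patch = image[y:y + h, x:x + w]
            if classify(patch):
                detections.append((x, y, w, h))
    return detections
```

Real detectors repeat this over an image pyramid to handle scale, and merge overlapping detections (e.g. with non-maximum suppression).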

In this project I have implemented a Model Predictive Control (MPC) algorithm in C++ to drive a car around a track. The algorithm uses a kinematic model to predict the car's future trajectory and optimizes the actuation inputs to minimize a cost function. It also handles a 100-millisecond latency between actuation commands.

Kinematic Models

Kinematic models are simplifications of dynamic vehicle motion models that ignore tire forces, gravity, and mass. At low and moderate speeds, kinematic models often approximate actual vehicle dynamics well and hence can be used to test algorithms for self-driving cars.

Basic Equations of the Kinematic model
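The equation image is not reproduced here; the standard kinematic bicycle-model update it refers to can be sketched as follows (Lf, the distance from the front axle to the center of gravity, is an illustrative value):

```python
import math

def kinematic_update(x, y, psi, v, delta, a, dt, Lf=2.67):
    # One timestep of the kinematic bicycle model:
    #   x, y  : position          psi : heading
    #   v     : speed             delta : steering angle
    #   a     : acceleration      dt  : timestep
    x_next = x + v * math.cos(psi) * dt
    y_next = y + v * math.sin(psi) * dt
    psi_next = psi + (v / Lf) * delta * dt
    v_next = v + a * dt
    return x_next, y_next, psi_next, v_next
```

An MPC controller rolls this update forward over its prediction horizon to evaluate candidate actuation sequences.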

MPC Controller

The implemented MPC model drives the car around the track in the simulator. The model takes as input the state of the vehicle <x,y,v> and the errors <cte,epsi> (the cross-track and orientation errors). The controller controls the steering angle and throttle of the vehicle. The algorithm constantly re-evaluates the given input state to find the optimal actuations that minimize a cost function. It predicts the next points and tries to find the optimal actuations using the above kinematic model. The solver takes the constraints as well as the cost function and finds the optimal actuation. …


Christos Kyrkou

PhD in Computer Engineering, Self-Driving Car Engineering Nanodegree, Computer Vision, Visual Perception and Computing
