
Deep Learning for Computer Vision

200.00 EGP


Welcome to our course, Deep Learning for Computer Vision: From Pixels to Semantics. The course is organized into three main parts.

The first part covers the essentials of the traditional computer vision pipeline and how to work with images in the OpenCV and Pillow libraries, including image pre-processing steps such as thresholding, denoising, blurring, filtering, edge detection, and contours. We will build simple apps like Car License Plate Detection (LPD) and activity recognition. This leads us to the revolution that deep learning brought to computer vision: turning traditional hand-designed filters into learnable parameters using Convolutional Neural Networks (ConvNets). We will cover all the basics of ConvNets, including the details of the vanilla architecture for image classification, hyperparameters like kernel size and stride, max pooling, and feature map size calculations. Beyond the vanilla architecture, we also cover state-of-the-art ConvNet meta-architectures and design patterns, like skip connections, Inception, and DenseNet.

In the second part, we will learn how to use ConvNets to solve practical problems in different situations: how to work with a small amount of data, how to use transfer learning and the different scenarios for it, and finally how to debug ConvNets and visualize their learned kernels.

In the last part, we will learn about different computer vision applications built on ConvNets. We will learn about the Encoder-Decoder design pattern. We start with the task of semantic segmentation, where we will build a U-Net architecture from scratch for the Cambridge-driving Labeled Video Database (CamVid). Then we will learn about object detection, covering both two-stage architectures and one-shot architectures like SSD and YOLO. Next, we will learn how to deal with video data using spatio-temporal ConvNet architectures. Finally, we will introduce 3D deep learning, which extends ConvNets to 3D data such as LiDAR point clouds.
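As a small taste of the feature map size calculations mentioned above, here is a minimal sketch in plain Python. It assumes the standard formula for a convolution or pooling layer without dilation, out = floor((in + 2 * padding - kernel) / stride) + 1. The function name `conv_output_size` and the three-layer stack are hypothetical illustrations, not taken from the course material.

```python
def conv_output_size(input_size: int, kernel_size: int,
                     stride: int = 1, padding: int = 0) -> int:
    """Spatial output size along one dimension of a conv or pooling layer.

    Assumes no dilation: floor((in + 2*padding - kernel) / stride) + 1.
    """
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Hypothetical stack applied to a 32x32 input (one spatial dimension shown).
size = 32
size_after_conv = conv_output_size(size, kernel_size=3, stride=1, padding=1)
size_after_pool = conv_output_size(size_after_conv, kernel_size=2, stride=2)
size_after_conv2 = conv_output_size(size_after_pool, kernel_size=5)

print(size_after_conv, size_after_pool, size_after_conv2)  # 32 16 12
```

A 3x3 kernel with padding 1 and stride 1 preserves the spatial size, a 2x2 max pool with stride 2 halves it, and an unpadded 5x5 kernel shrinks it by 4.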

    Prerequisites

  • Python

  • Probability

  • Linear Algebra

  • Machine Learning

    Topics Covered

  • From traditional Computer Vision to Deep Learning

  • The basics of ConvNets in Computer Vision

  • The practical aspects of DL in CV, like data augmentation and transfer learning

  • ConvNet Architectures and Pre-trained ConvNets

  • Debugging ConvNets by visualizing their filters and features

  • Image Classification

  • Semantic Segmentation

  • Object Detection

  • Video Analysis: Spatio-Temporal Models

  • 3D Deep Learning in Computer Vision

    What you will learn

  • Build a solid understanding of computer vision foundations, using traditional and deep learning methods

  • Gain a deep understanding of Convolutional Neural Networks and their usage in computer vision

  • Build practical projects with ConvNets, like image classification, multi-object detection and semantic segmentation

  • Understand and practice the concepts of Transfer Learning in practical problems

  • Learn how to visualize and debug ConvNets and understand their underlying dynamics in a practical way

  • Learn how to use and apply data augmentation and how to deal with large and small datasets using ConvNets

  • Understand the basics of dealing with temporal and video data using spatio-temporal models

  • Understand the basics of 3D deep learning and how to deal with 3D datasets