TSBB19 Machine Learning for Computer Vision

This course gives a theoretical and practical introduction to machine learning tasks in computer vision. Theory is introduced in a series of lectures that present and illustrate current methods for object recognition, detection and tracking. The course also has two small projects that are solved in groups. Projects are presented both in written form and orally, at two seminars. The course ends with a written exam on the theoretical content.

Common Computer Vision Tasks

General Information

  • Course syllabus

    The course syllabus can be found in the Study guide.

  • Registration

    The 2025 course starts on September 1. To receive emails about the course and to have your results entered into Ladok you must be registered on the course. If you intend to take the course but have not yet registered, make sure to do it promptly through Ladok, or via the Student Portal Lisam.

  • Schedule

    The course schedule can be found in TimeEdit.

  • Examination

    See the examination page.

People

Per-Erik Forssén
Lectures, Examiner
Urs Waldmann
Lectures
Michael Felsberg
Lectures
Ziliang Xiong
Supervisor
James Waguespack
Supervisor
Cuong Le
Supervisor
Ioannis Athanasiadis
Supervisor
Urs Waldmann
Supervisor
Anmar Karmush
Supervisor
Gulnaz Zhambulova
Supervisor

Course material

The main course material consists of lecture slides and selected articles. We also recommend the two books listed below for a more in-depth treatment of the content.

  • Course Git repositories

    The course has three Git repositories. These are hosted on LiU's GitLab (accessible to anyone with a LiU login). One repository contains the course material (e.g. lecture slides and articles for background reading), and the other two contain support code for the projects.

  • Book 1:

    Goodfellow et al., Deep Learning, MIT Press (2016).
    Relevant contents: Theory on machine learning, convolutional neural networks, loss functions, and regularization.
    See the book webpage for chapter PDFs.

  • Book 2:

    Richard Szeliski, Computer Vision: Algorithms and Applications, 2nd ed., Springer Verlag (2022).
    Relevant contents: classical computer vision, feature extraction, optimization.
    The book is available as an on-campus e-book via the LiU library. See also the book webpage.

Lecture Schedule HT2025

Before the lectures, the lecture slides from last year can be found in the course material repository. Updated slides will be pushed after the lecture has taken place. The repository also contains additional relevant literature for each lecture.

Date | Time | Room | Activity | Topic | Teacher
September 1 | 08.15-10 | A38 | Lecture 1 | Introduction | Per-Erik Forssén
September 2 | 10.15-12 | A31 | Lecture 2 | Feature Descriptors | Per-Erik Forssén
September 3 | 13.15-15 | U2 | Lecture 3 | Convolutional Neural Networks: Introduction and Theory | Urs Waldmann
September 5 | 08.15-10 | A31 | Lecture 4 | Image Classification with Convolutional Neural Networks | Urs Waldmann
September 9 | 10.15-12 | C3 | Lecture 5 | Project 1: Image Classification | Per-Erik Forssén
September 10 | 13.15-15 | FB113 | Lecture 6 | Compound Descriptors, Metric Learning, and Evaluation | Per-Erik Forssén
September 12 | 08.15-10 | A32 | Lecture 7 | Visual Object Detection | Per-Erik Forssén
September 24 | 13.15-15 | A33, A34 | Seminar 1 | Presentation of project 1 | Per-Erik Forssén, Urs Waldmann
September 24 | 15.15-17 | A31 | Lecture 8 | Semantic and Panoptic Segmentation | Urs Waldmann
September 26 | 08.15-10 | A36 | Lecture 9 | Project 2: Semantic Segmentation | Per-Erik Forssén
October 6 | 08.15-10 | S26 | Lecture 10 | Visual Object Tracking with Deep Features | Michael Felsberg
October 15 | 13.15-15 | KY31, KY34 | Seminar 2 | Presentation of project 2 | Urs Waldmann, Per-Erik Forssén
October 15 | 15.15-17 | KY34 | Seminar 2 | Discussion of project 2 and exam | Per-Erik Forssén, Urs Waldmann

A more extensive schedule can be found in TimeEdit. It also contains scheduled project time (labelled "Projekttid"). These are times when you have exclusive access to the computer rooms Egypten, Asgård and Olympen. TimeEdit also lists backup lectures. Note that teachers will only be present at activities listed in the table above.

Projects

The projects are conducted in groups of 5 or 4 students (in order of preference).
Groups and supervisor assignments are finalized at the introductory lecture of project 1, and will be published in the course repository.

Project 1: Visual Object Recognition

Project 2: Semantic Segmentation

Compute resources

Run your jobs on the computers in Egypten, Asgård and Olympen.

General resources

We recommend using the following software:

  • PyTorch: a deep learning framework for Python (support code for the projects is written for PyTorch).
  • PyCharm: a Python IDE (installed in the Linux labs).
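
Before starting on the project support code, it can be useful to confirm that PyTorch is importable on the machine you are using. The following is a minimal sketch, not part of the course support code: the function name `describe_environment` is hypothetical, and whether a GPU is reported depends entirely on the lab computer and Python environment in use.

```python
import importlib.util


def describe_environment():
    """Return a short, human-readable summary of the local PyTorch setup."""
    # find_spec returns None when the package is not installed,
    # so this check does not raise if PyTorch is missing.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this Python environment."

    import torch  # imported lazily, only once it is known to exist

    # torch.cuda.is_available() reports whether a usable CUDA GPU was found.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"PyTorch {torch.__version__} found; default device would be '{device}'."


if __name__ == "__main__":
    print(describe_environment())
```

Running the script in the same environment that PyCharm uses for the project shows whether training would run on a GPU or fall back to the CPU.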

Other related software of interest:

  • OpenCV 4.8 (Open Source Computer Vision): a library that is installed in the ISY labs. Python bindings exist.
  • LIBSVM: a library for support vector machines (Matlab, Python).
  • VLFeat: a useful code library, for both Matlab and C/C++.

Project Repositories

Project code should be developed under version control, with changes tracked by the LiU-ID of each participating group member. Project groups should get their repositories from GitLab at LiU.
Note: this is not GitHub, and GitHub should not be used.
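
To check that commits in a group repository are attributed to individual LiU-IDs, the commit history can be summarised per author. The sketch below is only an illustration under stated assumptions: it assumes each member commits with an email address whose part before the `@` is their LiU-ID, the function names are hypothetical, and the IDs in the usage example are made up.

```python
import subprocess
from collections import Counter


def commits_per_author(log_output):
    """Count commits per author from log output with one email per line."""
    counts = Counter()
    for line in log_output.splitlines():
        email = line.strip().lower()
        if email:
            # Keep only the part before '@' (assumed here to be the LiU-ID).
            counts[email.split("@")[0]] += 1
    return counts


def read_author_emails(repo_path="."):
    """Return the author email of every commit in the repository at repo_path."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%ae"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Usage with made-up IDs; in a real repository, pass read_author_emails() instead.
sample = "abcde123@student.liu.se\nfghij456@student.liu.se\nabcde123@student.liu.se\n"
print(dict(commits_per_author(sample)))
```

A member who does not appear in the summary has probably committed under a different name or email, which is worth correcting in the local Git configuration early in the project.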