Learning to Analyze what is Beyond the Visible Spectrum
Author | : Amanda Berg |
Publisher | : Linköping University Electronic Press |
Total Pages | : 111 |
Release | : 2019-11-13 |
ISBN-10 | : 9789179299811 |
ISBN-13 | : 9179299814 |
Rating | : 4/5 (814 Downloads) |
Download or read book Learning to Analyze what is Beyond the Visible Spectrum written by Amanda Berg and published by Linköping University Electronic Press. This book was released on 2019-11-13 with total page 111 pages. Available in PDF, EPUB and Kindle. Book excerpt: Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing camera price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as there exists a measurable temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy. This thesis addresses the problem of automatic image analysis in thermal infrared images with a focus on machine learning methods. The main purpose of this thesis is to study the variations of processing required due to the thermal infrared data modality. In particular, three different problems are addressed: visual object tracking, anomaly detection, and modality transfer. All these are research areas that have been and currently are subject to extensive research. Furthermore, they are all highly relevant for a number of different real-world applications. The first addressed problem is visual object tracking, a problem for which no prior information other than the initial location of the object is given. The main contribution concerns benchmarking of short-term single-object (STSO) visual object tracking methods in thermal infrared images. The proposed dataset, LTIR (Linköping Thermal Infrared), was integrated in the VOT-TIR2015 challenge, introducing the first ever organized challenge on STSO tracking in thermal infrared video. Another contribution also related to benchmarking is a novel, recursive, method for semi-automatic annotation of multi-modal video sequences. Based on only a few initial annotations, a video object segmentation (VOS) method proposes segmentations for all remaining frames and difficult parts in need for additional manual annotation are automatically detected. The third contribution to the problem of visual object tracking is a template tracking method based on a non-parametric probability density model of the object's thermal radiation using channel representations. The second addressed problem is anomaly detection, i.e., detection of rare objects or events. The main contribution is a method for truly unsupervised anomaly detection based on Generative Adversarial Networks (GANs). The method employs joint training of the generator and an observation to latent space encoder, enabling stratification of the latent space and, thus, also separation of normal and anomalous samples. The second contribution is the previously unaddressed problem of obstacle detection in front of moving trains using a train-mounted thermal camera. Adaptive correlation filters are updated continuously and missed detections of background are treated as detections of anomalies, or obstacles. The third contribution to the problem of anomaly detection is a method for characterization and classification of automatically detected district heat leakages for the purpose of false alarm reduction. Finally, the thesis addresses the problem of modality transfer between thermal infrared and visual spectrum images, a previously unaddressed problem. The contribution is a method based on Convolutional Neural Networks (CNNs), enabling perceptually realistic transformations of thermal infrared to visual images. By careful design of the loss function the method becomes robust to image pair misalignments. The method exploits the lower acuity for color differences than for luminance possessed by the human visual system, separating the loss into a luminance and a chrominance part.