Principled Optimization of Dynamic Neural Networks

Author : Jared Graham Roesch
Publisher :
Total Pages : 103
Release : 2020
ISBN-10 : OCLC:1254094647
ISBN-13 :
Rating : 4/5

Book Synopsis: Principled Optimization of Dynamic Neural Networks by Jared Graham Roesch

Download or read book Principled Optimization of Dynamic Neural Networks, written by Jared Graham Roesch. This book was released in 2020 with a total of 103 pages. Available in PDF, EPUB and Kindle. Book excerpt: In the past decade deep learning has revolutionized many areas of computer science. Deep learning computations mostly consist of expensive linear algebra kernels defined over a mixture of large sparse and dense tensors. From the early days of deep learning framework development, researchers realized the potential for applying compiler optimizations to accelerate neural networks. As deep learning continues to grow in popularity, the diversity of models also continues to grow. Due to the early success of deep learning in computer vision, early deep learning systems were focused on static, feed-forward networks processing fixed-size images. First-generation deep learning compilers have been similarly overfit to static model compilation, with strong assumptions of static control flow, static tensor dimensions, and no complex data structures. This focus on static models has created challenges for deep learning practitioners, as dynamic models introduce input-dependent graph topology, violating key invariants of existing systems and invalidating optimizations designed for purely static dataflow graphs. The lack of support has manifested as a series of ad-hoc extensions to deep learning frameworks, runtimes, and compilers. Choosing to ignore dynamic behaviors has allowed deep learning compilers to make significant strides in optimizing common deep learning workloads, but existing techniques cannot increase generality without sacrificing performance. This dissertation focuses on an underserved yet important problem: the representation, optimization, differentiation, and execution of dynamic neural networks.

In this thesis I propose generalizing the overspecialized compilation techniques applied to static dataflow graphs, the predominant programming model of deep learning, to fully dynamic neural networks. These generalizations are powered by a simple insight: dynamic neural networks are just programs which manipulate tensors. The challenge is constructing a representation that captures this generality in a principled manner, while sacrificing neither state-of-the-art performance nor the programming model. In particular, the contributions include: an intermediate representation which can represent dynamic behaviors; a new automatic differentiation technique for dynamic neural networks; a set of general optimizations which work on all programs, as well as specialized dynamic optimizations; and an efficient runtime for dynamic neural networks. The work in this thesis now exists in Apache TVM, a deep learning compiler framework. Apache TVM is deployed in production at multiple leading companies, including Amazon, Facebook, and Microsoft, and is a critical piece of the technology stack at OctoML, a company I co-founded around the TVM project. One notable impact is its use in Amazon Alexa, Amazon's AI assistant, which executes on a variety of devices such as smart speakers with built-in digital assistants. Amazon engineers used Relay to optimize Alexa's wake word model, executed each time a user interacts with Alexa.
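The dynamic behaviors described in the synopsis, such as input-dependent tensor shapes, are handled in Apache TVM by the Relay intermediate representation and its virtual-machine runtime. Below is a minimal sketch, assuming a recent TVM release and using a toy dense layer purely as an illustrative stand-in for a real model: a dimension is marked dynamic with relay.Any() and the program is compiled for the Relay VM rather than the static graph executor.

# Sketch: compiling a dynamically shaped Relay program with Apache TVM.
# The tiny dense layer is an illustrative example, not the thesis's model.
import numpy as np
import tvm
from tvm import relay

# The batch dimension is unknown at compile time: relay.Any() marks it dynamic.
x = relay.var("x", shape=(relay.Any(), 8), dtype="float32")
w = relay.var("w", shape=(4, 8), dtype="float32")
y = relay.nn.relu(relay.nn.dense(x, w))
mod = tvm.IRModule.from_expr(relay.Function([x, w], y))

# Dynamic programs run on the Relay virtual machine instead of the static graph runtime.
vm_exec = relay.vm.compile(mod, target="llvm")
vm = tvm.runtime.vm.VirtualMachine(vm_exec, tvm.cpu())

# The same compiled artifact handles different batch sizes at run time.
for batch in (1, 5):
    data = np.random.rand(batch, 8).astype("float32")
    weight = np.random.rand(4, 8).astype("float32")
    out = vm.run(data, weight)
    print(out.numpy().shape)  # (1, 4) then (5, 4); use .asnumpy() on older TVM releases

The same compiled artifact serves inputs of any batch size, which is the kind of generality that a purely static dataflow graph with fixed tensor dimensions cannot express.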

