TVM, A Deep Learning Compiler Stack

TVM is an open-source deep learning compiler stack for CPUs, GPUs, and specialized accelerators. It aims to close the gap between productivity-focused deep learning frameworks and efficiency-oriented hardware backends.

TVM accepts models from frameworks such as TensorFlow, Keras, MXNet, and PyTorch, and lets us deploy them on backends such as LLVM, CUDA, OpenCL, and Metal.

TVM brings several advantages, such as:

  • You can replace a heavy runtime environment, such as the TensorFlow runtime, with TVM's lightweight runtime executing code compiled for a backend such as LLVM, which reduces the load on the host machine
  • Better performance than leading machine learning frameworks such as TensorFlow, thanks to the AutoTVM module, which auto-tunes optimizations for the host machine's backend

At the same time, TVM addresses several critical challenges, such as:

  • Difficulty switching from one framework to another
  • Framework developers need to maintain multiple backends to guarantee performance on various hardware
  • Chip vendors need to support multiple frameworks for every new chip they build
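
The last two challenges can be read as a counting problem. The sketch below is only an illustration, not TVM code: it assumes every framework would otherwise need its own integration with every backend, and compares that with routing everything through one shared compiler. The function names are hypothetical, and the framework and backend lists are simply the ones named earlier in this post.

```python
from itertools import product

# Names taken from the text above; the lists are illustrative, not exhaustive.
frameworks = ["TensorFlow", "Keras", "MXNet", "PyTorch"]
backends = ["LLVM", "CUDA", "OpenCL", "Metal"]


def direct_integrations(frameworks, backends):
    """Without a shared compiler: one integration per (framework, backend) pair."""
    return list(product(frameworks, backends))


def shared_compiler_integrations(frameworks, backends, compiler="TVM"):
    """With a shared compiler: one importer per framework, one code generator per backend."""
    importers = [(f, compiler) for f in frameworks]
    codegens = [(compiler, b) for b in backends]
    return importers + codegens


direct = direct_integrations(frameworks, backends)
shared = shared_compiler_integrations(frameworks, backends)

print(len(direct), "direct framework-to-backend integrations")
print(len(shared), "integrations through a shared compiler")
```

Under this simplified model, adding a new chip means writing one code generator instead of one integration per framework, and adding a new framework means writing one importer instead of one integration per backend.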
