Unlocking Peak Performance on Qualcomm NPU with LiteRT

To bring this NPU power to LiteRT, Google's high-performance on-device ML framework, we are thrilled to announce a significant leap forward: the LiteRT Qualcomm AI Engine Direct (QNN) Accelerator. Developed in close collaboration with Qualcomm, it replaces the previous TFLite QNN delegate.

This update introduces two major advantages for developers:

  1. A unified, simplified mobile deployment workflow that frees Android app developers from the biggest complexities of NPU acceleration.
  2. State-of-the-art on-device performance. The accelerator supports an extensive range of LiteRT ops, enabling maximum NPU usage and full model delegation, which is critical for achieving the best performance.