litert


Google's on-device AI framework for deploying ML and GenAI models on edge devices (successor to TensorFlow Lite). Use when working with on-device inference, .tflite models, mobile ML deployment, GPU/NPU acceleration, LiteRT-LM for LLMs, model conversion from PyTorch/TensorFlow/JAX, or migrating from TensorFlow Lite. Triggers on Android/iOS/Web ML inference, CompiledModel API, hardware acceleration, edge AI deployment, or running models like Gemma on device.

By farmhutsoftwareteam. Updated 1/29/2026


LiteRT: On-Device AI Framework

Overview

LiteRT (short for Lite Runtime) is Google's framework for deploying ML and generative AI models on edge devices. It is the successor to TensorFlow Lite and adds GPU and NPU acceleration that can deliver up to 100x faster inference than CPU execution, depending on the model and hardware.

Platform Support

| Platform | CPU | GPU | NPU |
|----------|-----|-----|-----|
| Android | Yes | OpenCL, OpenGL | Qualcomm, MediaTek |
| iOS | Yes | Metal | ANE (coming) |
| macOS | Yes | Metal, WebGPU | ANE (coming) |
| Windows | Yes | WebGPU | Intel (coming) |
| Linux | Yes | WebGPU | - |
| Web | Yes | WebGPU | Coming |
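
A minimal sketch of how the table could drive accelerator selection. It encodes the rows above as plain data and walks the NPU > GPU > CPU preference order from the Performance Tips section. The names `SUPPORT`, `PREFERENCE`, and `pick_accelerator` are hypothetical helpers, not part of LiteRT, and entries marked "coming" are treated as unavailable.

```python
# Platform support copied from the table above. "Coming" entries count as
# unavailable. This dict and the helper below are illustrative, not LiteRT API.
SUPPORT = {
    "Android": {"CPU": True, "GPU": True, "NPU": True},   # NPU: Qualcomm, MediaTek only
    "iOS":     {"CPU": True, "GPU": True, "NPU": False},  # ANE listed as coming
    "macOS":   {"CPU": True, "GPU": True, "NPU": False},  # ANE listed as coming
    "Windows": {"CPU": True, "GPU": True, "NPU": False},  # Intel listed as coming
    "Linux":   {"CPU": True, "GPU": True, "NPU": False},
    "Web":     {"CPU": True, "GPU": True, "NPU": False},  # NPU listed as coming
}

# Preference order taken from the Performance Tips section.
PREFERENCE = ("NPU", "GPU", "CPU")


def pick_accelerator(platform, exclude=()):
    """Return the most preferred accelerator the platform lists as supported.

    `exclude` lets the caller drop accelerators that failed at runtime,
    for example an Android device without a supported NPU.
    """
    row = SUPPORT[platform]
    for accelerator in PREFERENCE:
        if row[accelerator] and accelerator not in exclude:
            return accelerator
    raise ValueError(f"no usable accelerator for {platform}")


print(pick_accelerator("Android"))                   # NPU
print(pick_accelerator("Android", exclude={"NPU"}))  # GPU
print(pick_accelerator("iOS"))                       # GPU
```

On a real device the table is only a starting point: creating the model can still fail for a given accelerator, so a fallback through `exclude` is worth keeping.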

Quick Start

Android (Kotlin)

// Add dependency: implementation 'com.google.ai.edge.litert:litert:2.1.0'

val model = CompiledModel.create(
    context.assets,
    "model.tflite",
    CompiledModel.Options(Accelerator.GPU)  // or NPU, CPU
)

val inputBuffers = model.createInputBuffers()
val outputBuffers = model.createOutputBuffers()

inputBuffers[0].writeFloat(inputData)
model.run(inputBuffers, outputBuffers)
val result = outputBuffers[0].readFloat()

C++

#include "litert/cc/litert_compiled_model.h"
#include "litert/cc/litert_environment.h"

LITERT_ASSIGN_OR_RETURN(auto env, Environment::Create({}));
LITERT_ASSIGN_OR_RETURN(auto compiled_model,
    CompiledModel::Create(env, "model.tflite", kLiteRtHwAcceleratorGpu));

LITERT_ASSIGN_OR_RETURN(auto inputs, compiled_model.CreateInputBuffers());
LITERT_ASSIGN_OR_RETURN(auto outputs, compiled_model.CreateOutputBuffers());
compiled_model.Run(inputs, outputs);

Python

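from ai_edge_litert.interpreter import Interpreter

interpreter = Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

# Look up the tensor indices from the model's own metadata
input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']

# input_data: NumPy array matching the input tensor's shape and dtype
interpreter.set_tensor(input_index, input_data)
interpreter.invoke()
output = interpreter.get_tensor(output_index)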

APIs

CompiledModel API (Recommended)

  • Modern API for hardware acceleration
  • Supports GPU, NPU, CPU
  • Zero-copy buffer interop
  • Async execution

Interpreter API (Legacy)

  • TensorFlow Lite compatible
  • CPU-only in v2.x
  • Use for backward compatibility

Task Decision Tree

Running inference on device?

Deploying LLMs (Gemma, Phi, Qwen)?

Converting models to .tflite?

  • PyTorch: Use litert-torch package
  • TensorFlow: Use tf.lite.TFLiteConverter
  • JAX: Use jax2tf bridge
  • See model-conversion.md

Migrating from TensorFlow Lite?

Performance Tips

  1. Choose the right accelerator: NPU > GPU > CPU for most models
  2. Use zero-copy buffers: Pass camera/GPU buffers directly
  3. Enable async execution: Overlap CPU/GPU work
  4. Cache NPU compilation: Use CompilerCacheDir environment option
  5. Quantize models: INT8 weights take about a quarter of the space of float32 weights and usually run faster
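
The 4x figure in tip 5 follows from storage width: a float32 value takes 4 bytes and an int8 value takes 1. The sketch below illustrates this with generic affine quantization in plain Python. It is not the ai-edge-quantizer API, and `quantize_int8` and `dequantize` are hypothetical helper names; it only shows the arithmetic and the rounding error it introduces.

```python
import struct


def quantize_int8(values):
    """Affine-quantize floats to int8. Returns (quantized, scale, zero_point)."""
    lo = min(min(values), 0.0)  # the representable range must include 0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize_int8(weights)

float_bytes = len(struct.pack(f"<{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"<{len(q)}b", *q))
print(float_bytes, int8_bytes)  # 20 5

restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(max_error <= scale)  # True
```

The size saving is exact for the weights themselves; a real model file also stores scales, zero points, and graph metadata, so the overall reduction is usually a little under 4x.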

Dependencies

Android (Gradle)

implementation 'com.google.ai.edge.litert:litert:2.1.0'

Python

pip install ai-edge-litert           # Runtime
pip install litert-torch             # PyTorch conversion
pip install ai-edge-quantizer        # Quantization
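
To confirm which of these packages are present in the current environment, a small check with the standard library's `importlib.metadata` may help. The package names are the ones from the install commands above; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Distribution names as used in the pip commands above.
PACKAGES = ("ai-edge-litert", "litert-torch", "ai-edge-quantizer")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```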

Resources

Reference Files

Install via CLI
npx skills add https://github.com/farmhutsoftwareteam/litert-skill --skill litert