Raspberry Pi · Hardware

Hailo-8L NPU Setup on RPi 5

Nine-phase automated installation with HailoRT 5.3.0, PCIe Gen 3 optimization, and Ollama integration.

⏱ 45 min 📊 Advanced 📅 Updated April 2026

What You'll Need

Raspberry Pi 5 (8GB recommended), Hailo-8L AI Kit (M.2 module, B+M or M key), microSD card (64GB+), active cooler, Ubuntu Server 24.04 LTS or Raspberry Pi OS 64-bit.

Overview

This guide walks through a complete Hailo-8L NPU installation on Raspberry Pi 5, optimized for local AI inference. The setup includes PCIe Gen 3 optimization, HailoRT 5.3.0, and integration with Ollama for running models locally.

The entire process is automated via a nine-phase bash script. Each phase is documented below so you understand what's happening at every step.
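The runner pattern behind such a script can be sketched as follows. This is a minimal sketch with illustrative function names and stub phase bodies, not the real script from the repository:

```shell
# Sketch of the nine-phase runner: each phase is a function, run in order,
# and the script stops and names the phase that failed. Bodies are stubs here.
set -eu

phase_1_system_prep() { echo "phase 1: system prep"; }
phase_2_pcie_config() { echo "phase 2: pcie config"; }
phase_3_gen3_opt()    { echo "phase 3: gen 3 optimization"; }
# ...phases 4-9 follow the same shape...

run_phase() {
  fn=$1
  echo "== running $fn =="
  if ! "$fn"; then
    echo "!! $fn failed; aborting" >&2
    return 1
  fi
}

for p in phase_1_system_prep phase_2_pcie_config phase_3_gen3_opt; do
  run_phase "$p"
done
```

Keeping each phase in its own function is what makes per-phase error reporting and rollback straightforward.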

Prerequisites

Before starting, ensure your system is ready:

  • Fresh Ubuntu Server 24.04 LTS or Raspberry Pi OS (64-bit) installation
  • Active cooler installed (Hailo runs hot under load)
  • Network connection (wired preferred for initial setup)
  • At least 32GB free storage (models take space)
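Before kicking off Phase 1, a quick sanity check along these lines confirms the architecture and free space. Thresholds mirror the list above; the `df` flags assume GNU coreutils:

```shell
# Sanity check before Phase 1: 64-bit OS and at least 32GB free on /.
# Prints warnings instead of aborting so you can judge borderline cases.
arch=$(uname -m)
if [ "$arch" = "aarch64" ]; then
  echo "arch ok: $arch"
else
  echo "warning: expected aarch64, got $arch"
fi

free_gb=$(df --output=avail -BG / | tail -n 1 | tr -dc '0-9')
if [ "${free_gb:-0}" -ge 32 ]; then
  echo "storage ok: ${free_gb}G free"
else
  echo "warning: only ${free_gb:-0}G free (32G+ recommended)"
fi
```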

The Nine-Phase Installation

Phase 1: System Preparation

Updates packages and installs build dependencies required for HailoRT compilation.

Phase 1 — System Prep
$ sudo apt-get update && sudo apt-get upgrade -y
Reading package lists... Done
Building dependency tree... Done
$ sudo apt-get install -y build-essential cmake git python3-dev
Dependencies installed successfully

Phase 2: PCIe Configuration

Enables PCIe on the Raspberry Pi 5 and configures the M.2 slot for the Hailo card.

Phase 2 — PCIe Config
# Enable the external PCIe connector in config.txt
$ echo 'dtparam=pciex1' | sudo tee -a /boot/firmware/config.txt
dtparam=pciex1
# Reboot required after this phase
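Once the Pi is back up, it's worth confirming the card enumerated before continuing. A guarded check that reports rather than fails when no device is visible:

```shell
# Post-reboot check: the Hailo card should now appear on the PCIe bus.
lspci -nn 2>/dev/null | grep -i hailo \
  || echo "no Hailo device visible - check card seating and config.txt"
```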

Phase 3: PCIe Gen 3 Optimization

Enables PCIe Gen 3 speeds (8 GT/s, up from Gen 2's 5 GT/s) for maximum throughput, roughly doubling the bandwidth available to the Hailo card.

Phase 3 — Gen 3 Optimization
$ echo 'dtparam=pciex1_gen=3' | sudo tee -a /boot/firmware/config.txt
dtparam=pciex1_gen=3
⚠ Note: Gen 3 may cause instability on some cables. Revert to Gen 2 if issues occur.
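To see which link speed was actually negotiated after the reboot (8.0 GT/s means Gen 3, 5.0 GT/s means Gen 2), you can scan sysfs. A minimal sketch:

```shell
# Report the negotiated link speed of every PCI device.
# 8.0 GT/s = Gen 3, 5.0 GT/s = Gen 2; skips entries without the attribute.
for dev in /sys/bus/pci/devices/*; do
  [ -r "$dev/current_link_speed" ] || continue
  echo "$(basename "$dev"): $(cat "$dev/current_link_speed")"
done
```

If the Hailo device reports 5.0 GT/s despite the Gen 3 setting, the link trainer fell back, which usually points at the cable issue mentioned in the note above.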

Phase 4: Hailo Repository Setup

Adds the Hailo APT repository for package installation.

Phase 4 — Repository
$ curl -fsSL https://hailo.ai/deb/pubkey.gpg | sudo gpg --dearmor -o /usr/share/keyrings/hailo-archive-keyring.gpg
$ echo "deb [signed-by=/usr/share/keyrings/hailo-archive-keyring.gpg] https://hailo.ai/deb stable main" | sudo tee /etc/apt/sources.list.d/hailo.list
$ sudo apt-get update

Phase 5: HailoRT 5.3.0 Installation

Installs the Hailo runtime library and Python bindings.

Phase 5 — HailoRT Install
$ pip install hailort==5.3.0 --break-system-packages
Collecting hailort==5.3.0
Installing build dependencies... done
Successfully installed hailort-5.3.0
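A quick way to confirm the wheel landed is to query pip and try the import. The import name is assumed here to be hailo_platform, which may differ between HailoRT releases:

```shell
# Confirm the package installed; the Python import name (hailo_platform)
# is an assumption and may vary between HailoRT releases.
if python3 -m pip show hailort >/dev/null 2>&1; then
  echo "hailort installed: $(python3 -m pip show hailort | awk '/^Version:/{print $2}')"
else
  echo "hailort not found - rerun the pip install above"
fi
python3 -c "import hailo_platform" 2>/dev/null \
  && echo "bindings import OK" || echo "bindings not importable yet"
```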

Phase 6: Driver Installation

Installs the kernel driver for Hailo device communication.

Phase 6 — Driver
$ sudo apt-get install -y hailort-pcie-driver
Loading kernel module...
hailo_pci driver loaded
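To verify the driver actually loaded, check the module list and the device node. Each probe is guarded so it reports state rather than erroring on a machine without the card:

```shell
# Driver post-check: kernel module loaded and device node present.
lsmod 2>/dev/null | grep -i hailo || echo "hailo_pci module not loaded"
ls /dev/hailo* 2>/dev/null || echo "no /dev/hailo* node yet"
```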

Phase 7: Model Downloads

Downloads pre-compiled Hailo models for common AI tasks.

Phase 7 — Models
$ mkdir -p ~/hailo-models && cd ~/hailo-models
$ wget https://hailo-model-zoo.s3.eu-west-2.amazonaws.com/ModelZoo/hefs/resnet_v1_50.hef
resnet_v1_50.hef - 12.4 MB
# Additional models: yolov5, yolov8, efficientnet, etc.
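Fetching the other zoo models follows the same URL pattern. The .hef file names below are assumptions based on the comment above; verify them against the bucket listing before relying on them:

```shell
# Build the download list for additional models (names are assumptions -
# check the model zoo listing for the exact .hef file names).
base=https://hailo-model-zoo.s3.eu-west-2.amazonaws.com/ModelZoo/hefs
for model in resnet_v1_50 yolov5s yolov8s efficientnet_b0; do
  echo "$base/$model.hef"
done
# feed the list to: xargs -n 1 wget -nc
```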

Phase 8: Ollama Integration

Installs Ollama with Hailo backend support for running LLMs locally.

Phase 8 — Ollama
$ curl -fsSL https://ollama.com/install.sh | sh
Downloading ollama...
$ ollama pull qwen2.5-coder:7b
pulling qwen2.5-coder:7b
Model ready for inference
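Once the pull completes, a one-shot prompt makes a reasonable smoke test. Guarded so it degrades gracefully if ollama or its daemon is missing:

```shell
# Smoke test: one short prompt through the freshly pulled model.
if command -v ollama >/dev/null 2>&1; then
  ollama run qwen2.5-coder:7b "Write a one-line hello world in Python." \
    || echo "ollama run failed - is the ollama service running?"
else
  echo "ollama not on PATH - rerun the installer above"
fi
```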

Phase 9: Validation

Verifies the installation and confirms the Hailo device is working correctly.

Phase 9 — Validation
$ hailortcli fw-control identify
Hailo-8L found. Firmware: 4.19.0 ✓
$ hailortcli parse-hef ~/hailo-models/resnet_v1_50.hef
Network: resnet_v1_50
Input: 224x224x3 (float32)
Output: 1000 (float32)
✓ Installation complete
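Beyond identification, hailortcli also offers a benchmark subcommand in recent HailoRT releases; flag support varies by version, so treat this as a sketch and consult hailortcli --help on your install:

```shell
# Optional throughput measurement on the downloaded model; guarded so it
# reports rather than fails when no device or hef is present.
hailortcli benchmark ~/hailo-models/resnet_v1_50.hef 2>/dev/null \
  || echo "benchmark skipped (no device or hef present)"
```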

Performance Benchmarks

After installation, you can expect the following performance on Hailo-8L:

  • ResNet-50: ~450 FPS (224x224 input)
  • YOLOv5s: ~200 FPS (640x640 input)
  • YOLOv8s: ~180 FPS (640x640 input)
  • EfficientNet-B0: ~500 FPS (224x224 input)

Note: LLM inference (Qwen, Llama) runs on the RPi's CPU, with optional NPU acceleration for specific operations.
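The FPS figures above imply the following per-frame latencies (1000 ms divided by FPS), a handy sanity check when comparing against your own measurements:

```shell
# Per-frame latency implied by each benchmark FPS figure (1000 ms / FPS).
for entry in "ResNet-50:450" "YOLOv5s:200" "YOLOv8s:180" "EfficientNet-B0:500"; do
  name=${entry%%:*}
  fps=${entry##*:}
  awk -v n="$name" -v f="$fps" 'BEGIN { printf "%-16s %.1f ms/frame\n", n, 1000 / f }'
done
```

So ResNet-50 at ~450 FPS corresponds to roughly 2.2 ms per frame on the NPU.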

Automated Script

The complete installation script is available on GitHub. It handles all nine phases automatically with error checking and rollback support.

Troubleshooting

Common issues and solutions:

  • Device not found: Ensure PCIe is enabled in /boot/firmware/config.txt and the M.2 card is properly seated.
  • Gen 3 instability: Some ribbon cables can't handle Gen 3 speeds. Remove pciex1_gen=3 and use Gen 2.
  • Permission denied: Add your user to the 'hailo' group: sudo usermod -aG hailo $USER
  • Out of memory: The 4GB RPi may struggle with larger models. Use the 8GB model or reduce batch sizes.
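When none of the above resolves the issue, a one-shot diagnostics bundle like this (the hailo-diag.txt file name is arbitrary) collects the relevant state to share when asking for help:

```shell
# Collect the checks above into one report. Every probe is guarded,
# so this runs safely whatever state the install is in.
{
  echo "--- config.txt PCIe lines ---"
  grep 'pciex1' /boot/firmware/config.txt 2>/dev/null || echo "(none found)"
  echo "--- PCI devices ---"
  lspci 2>/dev/null | grep -i hailo || echo "(no hailo device)"
  echo "--- recent hailo kernel messages ---"
  dmesg 2>/dev/null | grep -i hailo | tail -n 5
  echo "--- group membership ---"
  id -nG
} > hailo-diag.txt
cat hailo-diag.txt
```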

Next Steps

After completing this setup: