WIOWIZ Technologies • February 2026 • 8 min read

We asked: Can we run 8 ADAS models with deterministic latency on a single chip? Not by optimizing algorithms — by eliminating the reload problem architecturally.

The Result

8 distilled ADAS inference graphs — lane, object, sign, light, pedestrian, depth, radar, decision — executed in 71 cycles total. 1-cycle bank switch. Verified in RTL simulation.

What VAIDAS Is

VAIDAS = VAI + ADAS — Virtual AI Inference for Advanced Driver Assistance Systems. It applies WioWiz's VAI principle (weights as ROM, 1-cycle model switch) to the ADAS domain.

Traditional ADAS chips process models sequentially: load lane detection weights, infer, unload, load object detection weights, infer, unload... The weight transfer overhead dominates latency.

VAIDAS eliminates this. 8 models stay resident in 8 weight banks. Switching is a mux select, not a memory reload.

VAIDAS targets control-critical inference, not perception pretraining or floating-point accuracy benchmarks.

[Figure: Camera input (30 FPS) → CNN feature extractor (64×64 systolic array, 4,096 MACs) → Bank 0 Lane, Bank 1 Object, Bank 2 Sign, Bank 3 Light, Bank 4 Pedestrian, Bank 5 Depth, Bank 6 Vehicle, Bank 7 WIDAR Radar → Decision fusion (GO / SLOW / STOP) → Vehicle control. Figure labels: CRITICAL, 1-CYCLE SWITCH.]

VAIDAS 8-Bank Architecture: Parallel processing with priority-based fusion

Why not GPUs / FP4?

GPUs optimize throughput. VAIDAS optimizes worst-case latency. In safety systems, p99 latency matters more than TOPS/W.

8 Weight Banks • 8 ADAS Models • 1-Cycle Switch • 71 Total Cycles

The 8-Bank Architecture

Each bank is optimized for a specific ADAS function. Bank 7 doesn't average results — it applies weighted priority fusion where pedestrian detection can override everything.

Bank   Function               Model Source       Priority
0      Lane Detection         TuSimple-derived   Medium
1      Object Detection       YOLO-derived       Medium
2      Traffic Sign           ResNet-derived     Low
3      Traffic Light          ResNet-derived     High
4      Pedestrian Detection   YOLO-derived       CRITICAL
5      Depth Estimation       MiDaS-derived      Medium
6      Vehicle Tracking       Custom MLP         High
7      Decision Fusion        Fusion MLP         Final
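To make the "pedestrian can override everything" idea concrete, here is a minimal hand-written sketch of priority fusion. It is not the Fusion MLP that runs in Bank 7, whose weights and structure the article does not give. The numeric weights, the thresholds, the hazard-score inputs, and the names `PRIORITY_WEIGHT`, `BANK_PRIORITY`, and `fuse` are all assumptions made for illustration.

```python
# Hypothetical numeric weights for the priority labels in the table above.
PRIORITY_WEIGHT = {"CRITICAL": 8, "High": 4, "Medium": 2, "Low": 1}

# Priority label per perception bank, copied from the table (banks 0-6).
BANK_PRIORITY = {
    "lane": "Medium",
    "object": "Medium",
    "sign": "Low",
    "light": "High",
    "pedestrian": "CRITICAL",
    "depth": "Medium",
    "vehicle": "High",
}


def fuse(risk, slow_at=0.3, stop_at=0.6, override_at=0.5):
    """Fuse per-bank hazard scores in [0, 1] into GO, SLOW, or STOP.

    All thresholds are illustrative assumptions, not values from VAIDAS.
    """
    if not risk:
        raise ValueError("at least one bank score is required")

    # Override path: a CRITICAL bank forces STOP without any averaging.
    for name, score in risk.items():
        if BANK_PRIORITY[name] == "CRITICAL" and score >= override_at:
            return "STOP"

    # Otherwise take a priority-weighted mean of the hazard scores.
    total_weight = sum(PRIORITY_WEIGHT[BANK_PRIORITY[n]] for n in risk)
    weighted = sum(PRIORITY_WEIGHT[BANK_PRIORITY[n]] * s for n, s in risk.items())
    level = weighted / total_weight

    if level >= stop_at:
        return "STOP"
    if level >= slow_at:
        return "SLOW"
    return "GO"


clear_road = {name: 0.0 for name in BANK_PRIORITY}
pedestrian_ahead = dict(clear_road, pedestrian=0.9)
red_light_and_car = dict(clear_road, light=1.0, vehicle=1.0)

print(fuse(clear_road))
print(fuse(pedestrian_ahead))
print(fuse(red_light_and_car))
```

With these assumed weights, a lone pedestrian score of 0.9 would only reach a weighted mean of about 0.31 (SLOW). The override path is what turns it into STOP, which is the behaviour the article describes as fusion that does not average.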

Why 8 Banks? Not 4, not 16. 8 is the sweet spot:

  • Covers all critical ADAS functions
  • Fits on a single chip with reasonable area
  • Matches weight buffer banking (8 SRAM banks for parallel loading)
  • Allows 1-cycle bank switching during inference

The Math: Why 71 Cycles, Not 800

Traditional NPU (Reload Every Switch)

Load Model A ....... 100 cycles
Compute .............. 10 cycles
Load Model B ....... 100 cycles
Compute .............. 10 cycles
... × 8 models ...
─────────────────────────────
Total: 800+ cycles

VAIDAS (VAI Architecture)

Load 8 Banks (boot) . ONE TIME
Compute Bank 0 ....... 8 cycles
Switch ............... 1 cycle
Compute Bank 1 ....... 8 cycles
... × 8 models ...
─────────────────────────────
Total: 71 cycles

At 100 MHz clock:
Traditional: 8+ μs for 8 models | VAIDAS: 710 ns for 8 models
11× architectural speedup — not from better algorithms, from eliminating the reload.
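The arithmetic behind these figures can be checked with a few lines of Python. This sketch assumes the reading that the 71 cycles are 8 computes of 8 cycles plus 7 switches between consecutive banks, and that the traditional case pays the 100-cycle load before every compute. The function names are illustrative.

```python
CLOCK_HZ = 100_000_000  # 100 MHz, as stated above
NUM_MODELS = 8


def reload_cycles(num_models, load=100, compute=10):
    """Traditional NPU: every model pays a weight load before it computes."""
    return num_models * (load + compute)


def resident_cycles(num_models, compute=8, switch=1):
    """Resident banks: compute per model, plus one switch between neighbours."""
    return num_models * compute + (num_models - 1) * switch


def cycles_to_ns(cycles, clock_hz=CLOCK_HZ):
    """Convert a cycle count to nanoseconds at the given clock."""
    return cycles * 1_000_000_000 / clock_hz


vaidas = resident_cycles(NUM_MODELS)      # 8 * 8 + 7 * 1 = 71
traditional = reload_cycles(NUM_MODELS)   # 8 * (100 + 10) = 880
load_only = NUM_MODELS * 100              # 800, the "800+" floor

print(f"VAIDAS:      {vaidas} cycles = {cycles_to_ns(vaidas):.0f} ns")
print(f"Traditional: {traditional} cycles = {cycles_to_ns(traditional) / 1000:.1f} us")
print(f"Speedup vs load-only floor: {load_only / vaidas:.1f}x")
print(f"Speedup vs load + compute:  {traditional / vaidas:.1f}x")
```

Counting only the weight loads (800 cycles) gives roughly 11.3×, which matches the 11× quoted above. Including the 10-cycle computes (880 cycles) gives roughly 12.4×.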

The SoC Architecture

[Figure: VAIDAS SoC block diagram. Ibex RISC-V CPU (RV32IMC) → AHB → WZ-NPU. The WZ-NPU contains the 8-Bank WROM (VAI) with banks B0 to B7 behind a 1-cycle mux select, feeding wz_systolic_8x8 (64 MACs) → Result. Peripherals: CAN 2.0B, UART, GPIO.]

VAIDAS SoC: Ibex RISC-V + WZ-NPU with 8-Bank VAI Architecture

The RTL Core: 1-Cycle Bank Switch

// vai_wrom_manager.v — The VAI Core
module vai_wrom_manager #(
    parameter NUM_BANKS  = 8,
    parameter BANK_DEPTH = 4096,
    parameter DATA_WIDTH = 8
)(
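    input  wire                          clk,        // unused in this excerpt: the read path is combinational
    input  wire [$clog2(NUM_BANKS)-1:0]  bank_sel,   // 3 bits → 8 banks
    input  wire [$clog2(BANK_DEPTH)-1:0] read_addr,  // 12 bits → 4,096 entries per bank
    output wire [DATA_WIDTH-1:0]         read_data   // ACTIVE IN 1 CYCLE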
);
    reg [DATA_WIDTH-1:0] wrom [0:NUM_BANKS-1][0:BANK_DEPTH-1];
    
    // No state machine. No handshake. No reload.
    // Just a mux.
    assign read_data = wrom[bank_sel][read_addr];
endmodule

This is the invention. Not a cache. Not prefetch. Direct addressing across banks with combinational selection.
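For readers who do not read Verilog, the same idea can be modelled behaviourally in Python. This is a sketch that mirrors the module's default parameters, not the verified RTL. The class name `BankedWeightRom`, the `load_bank` helper, and the fill values are assumptions for illustration.

```python
class BankedWeightRom:
    """Behavioural stand-in for vai_wrom_manager: 8 banks x 4,096 x 8-bit."""

    def __init__(self, num_banks=8, bank_depth=4096, data_width=8):
        self.num_banks = num_banks
        self.bank_depth = bank_depth
        self.mask = (1 << data_width) - 1
        self.wrom = [[0] * bank_depth for _ in range(num_banks)]

    def load_bank(self, bank, weights):
        """One-time (boot) load of a bank; values are truncated to data_width bits."""
        if len(weights) > self.bank_depth:
            raise ValueError("weights do not fit in one bank")
        for addr, value in enumerate(weights):
            self.wrom[bank][addr] = value & self.mask

    def read(self, bank_sel, read_addr):
        """Model of `assign read_data = wrom[bank_sel][read_addr]`.

        The modulo mimics the fixed-width select and address ports, which
        simply drop any higher bits (sizes are powers of two by default).
        """
        return self.wrom[bank_sel % self.num_banks][read_addr % self.bank_depth]


rom = BankedWeightRom()
for bank in range(8):
    # Hypothetical fill pattern so each bank is easy to tell apart.
    rom.load_bank(bank, [bank * 10 + 1] * 16)

# Changing bank_sel is the whole "model switch": no copy, no reload.
sweep = [rom.read(bank_sel, 5) for bank_sel in range(8)]
print(sweep)  # [1, 11, 21, 31, 41, 51, 61, 71]
```

Nothing is moved when `bank_sel` changes. The only cost is the index lookup, which is the software analogue of the mux select in the RTL.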

Verification: What We Proved

Test                           Method                    Result
wz_systolic_8x8 MAC accuracy   Golden model comparison   64/64 match ✓
8-bank 1-cycle switch          Verilator waveform        8 switches in 8 cycles ✓
Ibex → AHB → NPU bus           Handshake timing          rvalid/rready verified ✓
Full SoC integration           Behavioral simulation     PASS ✓
CARLA demo integration         Video pipeline            GO/SLOW/STOP decisions ✓

# Simulation command
$ cd ~/widar/verticals/auto/vaidas
$ make sim_npu

# Output: 8 PASS, 0 FAIL
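A golden-model comparison of the kind listed for wz_systolic_8x8 can be sketched in plain Python. This assumes that "64/64" refers to the 64 outputs of an 8×8 by 8×8 INT8 matrix multiply, and it uses a generic output-stationary schedule, since the article does not describe the actual dataflow of the array. The function names and the random test vectors are illustrative.

```python
import random

N = 8  # 8x8 array, 64 MACs


def golden_matmul(a, b):
    """Reference result: plain integer matrix multiply."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def systolic_matmul(a, b):
    """Cycle-stepped output-stationary model.

    With skewed inputs, PE (i, j) sees a[i][k] and b[k][j] at step
    t = i + j + k, so the last operand pair arrives at step 3 * (N - 1).
    """
    acc = [[0] * N for _ in range(N)]
    for t in range(3 * N - 2):
        for i in range(N):
            for j in range(N):
                k = t - i - j
                if 0 <= k < N:
                    acc[i][j] += a[i][k] * b[k][j]
    return acc


random.seed(0)  # fixed seed so the run is repeatable
a = [[random.randint(-128, 127) for _ in range(N)] for _ in range(N)]
b = [[random.randint(-128, 127) for _ in range(N)] for _ in range(N)]

expected = golden_matmul(a, b)
actual = systolic_matmul(a, b)
matches = sum(expected[i][j] == actual[i][j]
              for i in range(N) for j in range(N))
print(f"{matches}/{N * N} match")  # 64/64 match
```

In a real flow the `actual` matrix would come from the RTL simulation rather than from a second Python function. The comparison loop is the part that carries over.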

Demo: CARLA + VAIDAS Integration

VAIDAS was integrated with the CARLA simulator for an end-to-end demonstration. Real feature extraction (YOLOv8, MiDaS) → VAIDAS NPU decisions → Vehicle control.

Tesla Model 3 • YOLOv8 Detection • GO/SLOW/STOP Overlay

Pipeline:
CARLA 3D World → Camera Frame → Feature Extractors
     → 8×8 Feature Vectors → VAIDAS NPU
     → Decision (GO/SLOW/STOP) → Vehicle Control

Result:
Vehicle responds to VAIDAS decisions
Speed: 0 → 27 → 0 km/h based on detected obstacles

The Philosophy

Others optimize algorithms. We optimize silicon behavior.

Algorithms change. Latency must not.

VAIDAS is not an AI accelerator. It is deterministic math silicon that executes inference graphs. The weights are data. The compute is fixed. The latency is architectural.

If a model cannot be fixed-point, bounded-latency, and deterministic — it doesn't belong in safety silicon.

VAIDAS is to ADAS what hard real-time schedulers are to avionics.

What's Next

Indian Dashcam Dataset: Training VAIDAS on real Indian road conditions — chaotic traffic, pedestrians everywhere, unmarked lanes, two-wheelers cutting across. This is the ultimate stress test for ADAS.

  • Scale to 64×64 systolic array — 4,096 MACs for production-grade inference
  • Indian road dataset collection — Real dashcam videos from Bangalore, Delhi, Mumbai traffic
  • Lane detection for unmarked roads — Indian roads don't have clear lane markings
  • Two-wheeler priority detection — Most common road users in India
  • Multi-chip scaling — VAIDAS clusters for Level 4+ autonomy

⚠️ Technical Notes

  • Open models (YOLO, MiDaS, TuSimple, ResNet) used for research and benchmarking
  • Weights retrained, adapted, and quantized by WioWiz for INT8 fixed-point execution
  • VAIDAS executes distilled, fixed-point inference graphs — not full floating-point models
  • Architecture is model-agnostic — any conforming inference graph can be loaded
  • VAIDAS is pre-silicon; verification conducted via Verilator simulation
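As an illustration of what INT8 fixed-point weights look like, here is a minimal sketch of symmetric per-tensor quantization. The article does not say which scheme WioWiz uses, so the method, the function names, and the sample weights are assumptions.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.8, -1.27, 0.1, 0.0]  # hypothetical float weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The round-trip error per weight is bounded by half a quantization step.
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, worst_error)
```

Each code in `q` fits the 8-bit `DATA_WIDTH` of the weight ROM, and only the integer codes would need to be stored on chip under this scheme.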
#semiconductor #ADAS #NPU #autonomous #verification #RTL #VAI #India