neuroring-multifpga-snn - SKILL.md Agent Skill

name: neuroring-multifpga-snn
description: NeuroRing methodology for modular and scalable SNN accelerator based on multi-FPGA bidirectional ring topology and stream-dataflow architecture
version: 1.0.0
author: Hermes Agent
created: 2026-05-28
arxiv_id: 2604.28059
tags: [snn, fpga, neuromorphic, accelerator, scalability, ring-topology, stream-dataflow, hardware]
activation_keywords: [neuroring, snn accelerator, fpga snn, multi-fpga, ring topology, stream-dataflow, snn hardware, spiking neural network hardware]

NeuroRing: Multi-FPGA SNN Accelerator

Overview

NeuroRing is a modular, scalable Spiking Neural Network (SNN) accelerator built on a stream-dataflow architecture and a bidirectional ring topology, implemented with High-Level Synthesis (HLS) on FPGAs. It targets large-scale SNN execution, where sparse spike communication and synchronization dominate runtime.

Key Innovation

Bidirectional Ring Topology + Stream-Dataflow Architecture

Modular single- and multi-FPGA deployment
Compatible with existing SNN workflows (NEST simulator integration)
Addresses sparse spike communication bottlenecks in large-scale SNNs
Preserves key activity statistics of reference models

Technical Framework

Architecture Components

Ring Topology Design
- Bidirectional ring communication pattern
- Efficient spike routing between modules
- Reduced communication latency and synchronization overhead
Stream-Dataflow Processing
- Event-driven computation pipeline
- Sparse spike handling optimization
- Low-latency spike propagation
HLS Implementation
- High-Level Synthesis on FPGAs
- Hardware-efficient spike processing
- Programmable and flexible architecture
NEST Simulator Integration
- Compatible with existing neuroscience workflows
- Validates against reference simulations
- Preserves the key activity statistics of the reference model
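
One reason a bidirectional ring can reduce communication latency is hop count: with N nodes, a unidirectional ring needs up to N - 1 hops, while a bidirectional ring needs at most N / 2 (rounded down). The sketch below illustrates that arithmetic only. The function names and the tie-breaking rule are assumptions for illustration, and the routing policy NeuroRing actually uses is not described here.

```cpp
#include <cassert>

// Direction of travel on the ring; the names are illustrative.
enum class RingDir { Clockwise, CounterClockwise };

// Hops from node `src` to node `dst` when travelling clockwise
// (towards increasing node index, wrapping around).
int clockwise_hops(int src, int dst, int num_nodes) {
    assert(num_nodes > 0);
    assert(src >= 0 && src < num_nodes && dst >= 0 && dst < num_nodes);
    return (dst - src + num_nodes) % num_nodes;
}

// Pick the shorter direction; ties go clockwise (an arbitrary choice).
RingDir shortest_direction(int src, int dst, int num_nodes) {
    int cw = clockwise_hops(src, dst, num_nodes);
    int ccw = (num_nodes - cw) % num_nodes;
    return (cw <= ccw) ? RingDir::Clockwise : RingDir::CounterClockwise;
}

// Hop count along the shorter direction; never more than num_nodes / 2.
int shortest_hops(int src, int dst, int num_nodes) {
    int cw = clockwise_hops(src, dst, num_nodes);
    int ccw = (num_nodes - cw) % num_nodes;
    return (cw <= ccw) ? cw : ccw;
}
```

For a 4-node ring, node 0 reaches node 3 in one counter-clockwise hop instead of three clockwise hops.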

Performance Metrics

Cortical Microcircuit Benchmark:

Real-Time Factor (RTF): 0.83 (faster-than-real-time execution)
Preserves key activity statistics
Strong and weak scaling demonstrated
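
To make the RTF figure concrete, here is a minimal sketch, assuming the common convention that RTF is wall-clock time divided by simulated (biological) time. The function names and the sample timings in the usage note are illustrative, not measurements from the paper.

```cpp
#include <cassert>

// Real-time factor: wall-clock seconds spent per second of simulated time.
// Values below 1.0 mean the run finished faster than real time.
double real_time_factor(double wall_clock_s, double simulated_s) {
    assert(simulated_s > 0.0);
    return wall_clock_s / simulated_s;
}

// Convenience check for the "faster than real time" condition.
bool is_faster_than_real_time(double rtf) {
    return rtf < 1.0;
}
```

Under this convention, simulating 10 s of network activity in 8.3 s of wall-clock time gives an RTF of 0.83.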

Energy Efficiency:

Competitive efficiency on programmable FPGAs
Suitable for both neuroscience simulation and event-driven applications

Applications

Neuroscience Simulation

Large-scale brain network modeling
Real-time neural dynamics simulation
Cortical microcircuit experiments

Neuromorphic Computing

Event-driven computation platforms
Energy-efficient SNN deployment
Hardware-software co-design

Broader Event-Driven Applications

Constraint satisfaction problems
Sparse computation paradigms
Real-time decision systems

Implementation Considerations

Deployment Flexibility

Single FPGA: Basic deployment for small-scale SNNs
Multi-FPGA: Scalable deployment for large networks
Modular architecture: Easy scaling by adding ring nodes

Scalability Patterns

Strong Scaling: Fixed problem size, increasing number of FPGAs
Weak Scaling: Problem size grows in proportion to the number of FPGAs
Meaningful scaling efficiency demonstrated
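
The two scaling modes are usually summarized with the standard efficiency formulas sketched below. The function names are illustrative, and the source does not state which exact metric the paper reports.

```cpp
#include <cassert>

// Strong scaling: the same problem runs on n nodes.
// The ideal runtime is t1 / n, so efficiency is t1 / (n * tn).
double strong_scaling_efficiency(double t1, double tn, int n) {
    assert(t1 > 0.0 && tn > 0.0 && n > 0);
    return t1 / (n * tn);
}

// Weak scaling: the problem grows in proportion to n.
// The ideal runtime stays at t1, so efficiency is t1 / tn.
double weak_scaling_efficiency(double t1, double tn) {
    assert(t1 > 0.0 && tn > 0.0);
    return t1 / tn;
}
```

An efficiency of 1.0 is ideal in both cases; values below 1.0 indicate communication or synchronization overhead.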

Hardware Requirements

FPGA boards with HLS support
Bidirectional ring interconnection
Stream-dataflow compatible interfaces

Methodology Steps

Network Design
- Define SNN topology and connectivity
- Specify spike propagation requirements
- Identify critical communication paths
Ring Topology Mapping
- Map neurons to ring nodes
- Configure bidirectional communication channels
- Optimize spike routing patterns
Stream-Dataflow Configuration
- Set up event-driven processing pipeline
- Configure spike buffering and synchronization
- Tune latency vs throughput tradeoffs
NEST Integration
- Import network from NEST simulator
- Validate activity statistics preservation
- Benchmark against reference execution
Multi-FPGA Deployment
- Configure inter-FPGA communication
- Optimize ring topology across FPGAs
- Monitor scaling efficiency
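
The "map neurons to ring nodes" step can be sketched as a contiguous block partition. This is one simple placement under the assumption that neurons carry global indices from 0 to num_neurons - 1. The struct and function names are hypothetical, and the actual NeuroRing mapping may differ, for example by partitioning per population.

```cpp
#include <cassert>
#include <cstdint>

struct NeuronPlacement {
    std::uint32_t node;   // ring node that owns the neuron
    std::uint32_t local;  // index within that node's neuron array
};

// Contiguous block partition: the first (num_neurons % num_nodes) nodes
// hold one extra neuron, so node sizes differ by at most one.
NeuronPlacement place_neuron(std::uint32_t neuron,
                             std::uint32_t num_neurons,
                             std::uint32_t num_nodes) {
    assert(num_nodes > 0 && neuron < num_neurons);
    std::uint32_t base = num_neurons / num_nodes;
    std::uint32_t extra = num_neurons % num_nodes;
    std::uint32_t boundary = extra * (base + 1);  // neurons on the larger nodes
    if (neuron < boundary) {
        return {neuron / (base + 1), neuron % (base + 1)};
    }
    // Reached only when base > 0, so the divisions below are safe.
    std::uint32_t rest = neuron - boundary;
    return {extra + rest / base, rest % base};
}
```

With 10 neurons on 4 nodes, the node sizes are 3, 3, 2, 2, so neuron 6 is the first neuron on node 2.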

Pitfalls and Considerations

Spike Synchronization: Ensure consistent timing across ring nodes
Communication Latency: Balance spike propagation speed vs bandwidth
Activity Statistics: Verify that activity statistics match the NEST reference
Hardware Constraints: Consider FPGA resource limits and ring node capacity
Scaling Tradeoffs: Evaluate strong vs weak scaling for specific applications
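
The activity-statistics check can be sketched on the host side as follows. The source does not name the statistics used, so mean firing rate and the coefficient of variation (CV) of inter-spike intervals are assumed here as typical examples. The function names and the relative-tolerance comparison are also illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean firing rate in spikes/s for one neuron over a window of duration_s.
double firing_rate_hz(const std::vector<double> &spike_times_s, double duration_s) {
    assert(duration_s > 0.0);
    return static_cast<double>(spike_times_s.size()) / duration_s;
}

// Coefficient of variation of inter-spike intervals (std / mean).
// Expects spike times sorted in ascending order.
// Returns 0.0 when there are fewer than two intervals.
double isi_cv(const std::vector<double> &spike_times_s) {
    if (spike_times_s.size() < 3) return 0.0;
    std::vector<double> isi;
    for (std::size_t i = 1; i < spike_times_s.size(); ++i) {
        isi.push_back(spike_times_s[i] - spike_times_s[i - 1]);
    }
    double mean = 0.0;
    for (double d : isi) mean += d;
    mean /= static_cast<double>(isi.size());
    assert(mean > 0.0);
    double var = 0.0;
    for (double d : isi) var += (d - mean) * (d - mean);
    var /= static_cast<double>(isi.size());  // population variance
    return std::sqrt(var) / mean;
}

// True when `value` is within a relative tolerance of `reference`.
bool within_tolerance(double value, double reference, double rel_tol) {
    return std::fabs(value - reference) <= rel_tol * std::fabs(reference);
}
```

A typical use is to compute both statistics from the accelerator's spike recordings and from the NEST run, then compare them with `within_tolerance` using a tolerance chosen for the experiment.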

Code Examples

HLS Stream-Dataflow Pattern

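// NeuroRing-style spike processing in HLS (illustrative sketch)
#include <hls_stream.h>

// Assumed types: field names are illustrative, not taken from the paper
struct spike_event { unsigned source; unsigned target; unsigned timestamp; float weight; };
struct neuron_state { float membrane_potential; };

// Hypothetical helper: integrate one incoming spike into the target neuron
void update_neuron_state(neuron_state &neuron, const spike_event &spike) {
    neuron.membrane_potential += spike.weight;
}

void spike_processor(
    hls::stream<spike_event> &input_stream,
    hls::stream<spike_event> &output_stream,
    neuron_state *neuron_array,
    float threshold,        // firing threshold, passed in explicitly
    unsigned current_time   // current simulation step, passed in explicitly
) {
    // Event-driven processing: drain the spikes that are currently queued
    while (!input_stream.empty()) {
        spike_event spike = input_stream.read();
        neuron_state &target = neuron_array[spike.target];

        // Process incoming spike
        update_neuron_state(target, spike);

        // Generate an outgoing spike if the threshold is exceeded
        if (target.membrane_potential > threshold) {
            spike_event outgoing = {};  // remaining fields are left at zero
            outgoing.source = spike.target;
            outgoing.timestamp = current_time;
            output_stream.write(outgoing);
            target.membrane_potential = 0.0f;  // assumed reset after firing
        }
    }
}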

Bidirectional Ring Communication

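// Ring topology spike routing (illustrative sketch)
#include <hls_stream.h>

// Assumed event layout: field names are illustrative
struct spike_event { unsigned source; unsigned target; unsigned timestamp; };

// Hypothetical helper, not defined in this document: expected to deliver the
// spike to local_processor or forward it on one of the ring outputs
void route_spike(
    spike_event spike,
    hls::stream<spike_event> &local_processor,
    hls::stream<spike_event> &left_output,
    hls::stream<spike_event> &right_output
);

void ring_router(
    hls::stream<spike_event> &left_input,
    hls::stream<spike_event> &right_input,
    hls::stream<spike_event> &left_output,
    hls::stream<spike_event> &right_output,
    hls::stream<spike_event> &local_processor
) {
    // Bidirectional routing logic: at most one spike per input side per call
    if (!left_input.empty()) {
        route_spike(left_input.read(), local_processor, left_output, right_output);
    }
    if (!right_input.empty()) {
        route_spike(right_input.read(), local_processor, left_output, right_output);
    }
}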
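
The routing pattern above can be exercised on a desktop compiler with a small software model. This is a sketch under stated assumptions: `std::queue` stands in for `hls::stream`, and `SpikeEvent`, `RingNode`, `step_node`, `transfer`, and `deliver` are hypothetical names. The `route_spike` rule shown (consume at the target node, otherwise keep travelling in the same direction) is one plausible policy, not necessarily the one NeuroRing implements.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Minimal event: only the fields routing needs (names are illustrative).
struct SpikeEvent {
    int source_node;
    int target_node;
};

// One ring node with the same five ports as ring_router.
struct RingNode {
    std::queue<SpikeEvent> left_in, right_in, left_out, right_out, local;
};

// Hypothetical routing rule: consume the spike if it targets this node,
// otherwise keep it moving in the direction it was already travelling.
void route_spike(const SpikeEvent &s, int node_id, bool arrived_from_left, RingNode &n) {
    if (s.target_node == node_id) {
        n.local.push(s);
    } else if (arrived_from_left) {
        n.right_out.push(s);
    } else {
        n.left_out.push(s);
    }
}

// One routing step: at most one spike per input side, as in ring_router.
void step_node(RingNode &n, int node_id) {
    if (!n.left_in.empty()) {
        SpikeEvent s = n.left_in.front();
        n.left_in.pop();
        route_spike(s, node_id, true, n);
    }
    if (!n.right_in.empty()) {
        SpikeEvent s = n.right_in.front();
        n.right_in.pop();
        route_spike(s, node_id, false, n);
    }
}

// Ring wiring: right_out of node i feeds left_in of node i+1,
// left_out of node i feeds right_in of node i-1 (indices wrap around).
void transfer(std::vector<RingNode> &ring) {
    const int k = static_cast<int>(ring.size());
    for (int i = 0; i < k; ++i) {
        while (!ring[i].right_out.empty()) {
            ring[(i + 1) % k].left_in.push(ring[i].right_out.front());
            ring[i].right_out.pop();
        }
        while (!ring[i].left_out.empty()) {
            ring[(i + k - 1) % k].right_in.push(ring[i].left_out.front());
            ring[i].left_out.pop();
        }
    }
}

// Inject one spike at its source node and count ring hops until delivery.
// Returns -1 if the spike is not delivered within one full lap.
int deliver(std::vector<RingNode> &ring, const SpikeEvent &s, bool send_right) {
    const int k = static_cast<int>(ring.size());
    assert(k > 0);
    assert(s.source_node >= 0 && s.source_node < k);
    assert(s.target_node >= 0 && s.target_node < k);
    if (send_right) {
        ring[s.source_node].right_out.push(s);
    } else {
        ring[s.source_node].left_out.push(s);
    }
    for (int hops = 1; hops <= k; ++hops) {
        transfer(ring);
        for (int i = 0; i < k; ++i) step_node(ring[i], i);
        if (!ring[s.target_node].local.empty()) return hops;
    }
    return -1;
}
```

On a 4-node ring, sending a spike from node 0 to node 3 takes three hops to the right but only one hop to the left, which is the case a bidirectional ring is meant to exploit.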

References

arXiv:2604.28059 - NeuroRing: Scaling Spiking Neural Networks via Multi-FPGA Bidirectional Ring Topologies and Stream-Dataflow Architectures
Accepted at Euro-Par 2026
NEST Simulator: https://nest-simulator.org/

Related Skills

[[snn-hardware-accelerator]] - General SNN hardware acceleration patterns
[[neuromorphic-hardware]] - Neuromorphic computing platforms
[[fpga-neuromorphic-design]] - FPGA-based neuromorphic design patterns
[[stream-dataflow-architecture]] - Stream-dataflow computing patterns