lsformer-local-structure-aware-spiking-transformer

LSFormer: Local Structure-Aware Spiking Transformer. Replaces global self-attention with dilated local windows and spiking response pooling for energy-efficient SNNs. Keywords: spiking transformer, local attention, SNN, energy-efficient, LSFormer, spiking neural network, self-attention bottleneck, SPooling, LS-SSA, Tiny-ImageNet

By hiyenwong. Updated 6/3/2026

LSFormer — Local Structure-Aware Spiking Transformer

Paper: Breaking Global Self-Attention Bottlenecks in Transformer-based Spiking Neural Networks with Local Structure-Aware Self-Attention
Authors: Lingdong Li, Hangming Zhang, Qiang Yu
arXiv: 2605.13887 (cs.NE / cs.AI, 2026-05-12)

Problem

Transformer-based SNNs suffer from two limitations:

  1. Max pooling bottleneck: max pooling keeps only the strongest spike response in each region, so it fails to preserve representative regional features across time steps.
  2. Global self-attention bottleneck: its cost grows quadratically with the number of tokens, which conflicts with the sparse, energy-efficient nature of SNNs.
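
The first limitation can be illustrated with a few lines of NumPy. This is only a sketch: the helper `max_pool_1d` and the toy spike train are made up for illustration and are not taken from the paper. On binary spikes, max pooling reduces to a logical OR over the window, so a region with one spike and a region with four spikes become indistinguishable.

```python
import numpy as np

def max_pool_1d(spikes, window):
    """Non-overlapping max pooling over the last axis of a binary spike array."""
    n = spikes.shape[-1] // window * window
    blocks = spikes[..., :n].reshape(*spikes.shape[:-1], -1, window)
    return blocks.max(axis=-1)

# One time step, eight positions, pooled in windows of four.
# The left window holds a single spike, the right window holds four.
spikes = np.array([[1, 0, 0, 0, 1, 1, 1, 1]])
pooled = max_pool_1d(spikes, window=4)
counts = spikes.reshape(1, 2, 4).sum(axis=-1)

print(pooled.tolist())  # [[1, 1]]  both regions look identical after max pooling
print(counts.tolist())  # [[1, 4]]  the spike counts that max pooling discards
```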

Methodology

SPooling (Spiking Response Pooling)

  • Replaces max pooling with a spiking-aware aggregation mechanism.
  • Captures cumulative regional spike responses across time, not just the peak.
  • Preserves richer temporal-spatial feature representations for downstream attention.
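
A minimal sketch of what a cumulative, region-wise aggregation could look like is shown here. It assumes that spikes are summed over time and over each spatial patch and then compared with a threshold. The function name `spike_count_pool`, the `threshold` parameter, and the summation rule are assumptions for illustration, not the paper's exact SPooling definition.

```python
import numpy as np

def spike_count_pool(spikes, window=2, threshold=2.0):
    """Hypothetical SPooling-style aggregation (not the paper's exact rule).

    spikes: binary array of shape (T, H, W).
    Returns the spike count of every window x window patch, summed over
    all T time steps, plus a binary map of patches whose count reaches
    the threshold.
    """
    t, h, w = spikes.shape
    h2, w2 = h // window, w // window
    patches = spikes[:, :h2 * window, :w2 * window]
    patches = patches.reshape(t, h2, window, w2, window)
    counts = patches.sum(axis=(0, 2, 4))            # shape (h2, w2)
    fired = (counts >= threshold).astype(np.uint8)  # assumed firing rule
    return counts, fired

# Two time steps on a 2 x 4 map, pooled with 2 x 2 patches.
spikes = np.array([
    [[1, 0, 0, 0],
     [0, 0, 0, 0]],
    [[1, 1, 0, 0],
     [0, 0, 0, 1]],
])
counts, fired = spike_count_pool(spikes, window=2, threshold=2.0)
print(counts.tolist())  # [[3, 1]]
print(fired.tolist())   # [[1, 0]]
```

Max pooling over the same input would report a response in both patches. The count-based version separates the consistently active left patch from the right patch, which spiked only once.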

LS-SSA (Local Structure-Aware Spiking Self-Attention)

  • Introduces a local dilated window mechanism to limit attention scope.
  • Balances local detail capture with long-range dependency modeling.
  • Reduces computational complexity from O(N²) in the number of tokens N to near-linear (roughly O(N·w) when each query attends to a fixed window of w keys) while maintaining accuracy.
  • Maintains spike-compatible operations throughout.
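
The windowing idea can be sketched on a 1-D token sequence as follows. Several things here are assumptions rather than details from the paper: the softmax-free product of binary Q, K, V, the parameter names `radius`, `dilation`, and `scale`, and the 1-D layout (a vision model would use 2-D windows). A spiking neuron would normally convert the real-valued output back into spikes.

```python
import numpy as np

def dilated_local_attention(q, k, v, radius=1, dilation=2, scale=0.25):
    """Softmax-free attention restricted to a dilated 1-D window.

    q, k, v: binary arrays of shape (N, D).
    Query i attends only to keys i + m * dilation for m in
    [-radius, radius] that fall inside the sequence.
    """
    n, _ = q.shape
    out = np.zeros(v.shape, dtype=np.float64)
    for i in range(n):
        idx = [i + m * dilation for m in range(-radius, radius + 1)]
        idx = [j for j in idx if 0 <= j < n]   # clip at the sequence edges
        scores = k[idx] @ q[i]                 # spike-overlap counts per key
        out[i] = scale * (scores @ v[idx])     # weighted sum of value spikes
    return out

rng = np.random.default_rng(0)
q, k, v = ((rng.random((8, 4)) < 0.5).astype(np.int64) for _ in range(3))
out = dilated_local_attention(q, k, v, radius=1, dilation=2)
print(out.shape)  # (8, 4)
```

Each query scores at most `2 * radius + 1` keys, so the work grows linearly with the number of tokens, while `dilation` stretches the window without adding keys.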

Architecture

Input Spike Tensor → SPooling → LS-SSA Blocks → Classification Head

  • LS-SSA blocks replace standard global self-attention in Transformer layers.
  • Dilated windows expand receptive field without full quadratic attention.
  • Compatible with standard SNN training pipelines (direct training or ANN-SNN conversion).
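
The saving from windowed attention, and the effect of dilation on reach, can be checked with simple arithmetic. The helper names and the example sizes (196 tokens, radius 4) are chosen only for illustration and do not come from the paper.

```python
def window_span(radius, dilation):
    """Token positions covered from the first to the last reachable key."""
    return 2 * radius * dilation + 1

def attention_pairs(n_tokens, radius=None):
    """Query-key pairs scored per head.

    radius=None means global attention. Otherwise the value is an upper
    bound for a local window that ignores clipping at the sequence edges.
    """
    if radius is None:
        return n_tokens * n_tokens
    return n_tokens * (2 * radius + 1)

n = 196  # e.g. a 14 x 14 token grid, illustrative only
print(attention_pairs(n))                 # 38416
print(attention_pairs(n, radius=4))       # 1764
print(window_span(radius=4, dilation=1))  # 9
print(window_span(radius=4, dilation=3))  # 25
```

With nine keys per query, the dilated window reaches across 25 positions at the same cost as an undilated window that reaches across 9.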

Results

| Dataset | Improvement |
| --- | --- |
| Tiny-ImageNet | +4.3% top-1 accuracy (SOTA) |
| N-CALTECH101 | +8.6% accuracy (SOTA) |

Usage Guidance

Use this skill when:

  • Designing or optimizing spiking transformer architectures for vision tasks.
  • Seeking energy-efficient alternatives to global self-attention in SNNs.
  • Implementing local attention mechanisms with dilated receptive fields.
  • Replacing max pooling with temporal-spike-aware pooling in SNN pipelines.
  • Benchmarking SNNs on image classification (Tiny-ImageNet, or event-based datasets such as N-CALTECH101).

Key Implementation Notes

  • SPooling aggregates spike responses across the temporal dimension before attention.
  • LS-SSA window dilation factor controls the trade-off between locality and global context.
  • Maintain spike-compatible activation (threshold-and-fire) within attention computations.
  • Compatible with both rate-coded and temporal-coded SNN representations.
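
The threshold-and-fire note can be made concrete with a leaky integrate-and-fire (LIF) update, a common SNN neuron model. The notes above do not say which neuron model LSFormer uses, so the LIF form, the hard reset, and the parameter names `tau`, `v_th`, and `v_reset` are assumptions for this sketch.

```python
import numpy as np

def lif_step(v_mem, current, tau=2.0, v_th=1.0, v_reset=0.0):
    """One leaky integrate-and-fire update with hard reset.

    v_mem: membrane potential carried over from the previous time step.
    current: input at this step (for example, an attention output).
    Returns (spikes, updated membrane potential).
    """
    v = v_mem + (current - (v_mem - v_reset)) / tau  # leaky integration
    spikes = (v >= v_th).astype(np.float64)          # threshold-and-fire
    v = np.where(spikes > 0, v_reset, v)             # reset neurons that fired
    return spikes, v

# Two neurons driven by a weak and a strong constant input for two steps.
v = np.zeros(2)
x = np.array([0.5, 3.0])
for _ in range(2):
    s, v = lif_step(v, x)

print(s.tolist())  # [0.0, 1.0]
print(v.tolist())  # [0.375, 0.0]
```

Applying such a step to the attention output keeps the values passed to the next layer binary, which is what keeps the attention block spike-compatible.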

References

  • arXiv: 2605.13887
  • Related: Spiking Neural Networks, Vision Transformers, Efficient Attention

Install via CLI

npx skills add https://github.com/hiyenwong/ai_collection --skill lsformer-local-structure-aware-spiking-transformer