name: fastedgesgeometry-architecture description: Architecture and implementation details of FastEdgesGeometry. Use when modifying FastEdgesGeometry, understanding its optimization techniques, or debugging edge generation issues.
FastEdgesGeometry Architecture
Background
FastEdgesGeometry is a performance-optimized fork of Three.js EdgesGeometry. The original EdgesGeometry had performance issues with complex geometries, leading to the creation of this optimized version.
File: src/FastEdgesGeometry.ts
Evolution History
Stage 1: Three.js EdgesGeometry (Original)
Standard Three.js EdgesGeometry implementation:
| Aspect | Implementation |
|---|---|
| Edge identity | String concatenation: "x,y,z" format (4 decimal precision) — exact, collision-free |
| Edge storage | Object with string keys: "hash0_hash1" |
| Normal calculation | Triangle.getNormal() method |
| Vertex accumulation | Array.push() |
Problem: String key generation and object key lookup are slow
Stage 2: FastEdgesGeometry (v0.7.3, 2024-11)
Replaced string keys with 32-bit numeric hashes:
| Aspect | Implementation |
|---|---|
| Edge identity | Numeric hash via hybridtaus() (Tausworthe algorithm) — lossy, collision-prone |
| Edge storage | Map<number, {index0, index1, normal}> |
| Normal calculation | Triangle.getNormal() method (unchanged) |
| Vertex accumulation | Array.push() (unchanged) |
Improvement: Numeric hash speeds up edge lookup (~3-4x faster)
Trade-off introduced: Hash collisions could corrupt edge detection; seed / precisionPoints options served as workarounds
Stage 3: FastEdgesGeometry (v0.7.4+, 2026-01)
Applied micro-optimizations:
| Aspect | Implementation |
|---|---|
| Edge identity | Inline computeHash() with pre-transformed seed (still collision-prone) |
| Edge storage | Map<number, slot> + parallel Typed Arrays |
| Normal calculation | Direct cross product calculation |
| Vertex accumulation | Pre-allocated Float32Array with index tracking |
Improvement: Function inlining, Typed Arrays, object allocation avoidance (~30% faster)
Stage 4: FastEdgesGeometry (2026-06, PR #371)
Full rewrite with exact edge identity — eliminates the Stage 2 accuracy trade-off while getting faster:
| Aspect | Implementation |
|---|---|
| Edge identity | Exact integer pair of welded vertex ids. A weld pre-pass maps each vertex to a canonical id by exact comparison of quantized coordinates. Hashes only pick probe start positions in open-addressing tables — they never decide identity |
| Edge storage | Open-addressing typed-array tables (linear probing, load factor <= 0.5, -1 empty / -2 tombstone) + parallel slot arrays in insertion order |
| Normal calculation | Direct cross product, stored once per face (Float64Array); the stored side is rounded with Math.fround at comparison time |
| Hot loop | 3-edges-per-triangle loop fully unrolled with scalar locals |
| Buffers | Module-level grow-only scratch arena when indexCount and positionCount <= 2^18; fresh allocation above (gate bounds retained memory ~30MB, speed is parity above the gate) |
| Attribute access | Direct .array reads; interleaved/normalized attributes are materialized into a packed copy once |
Improvement: collision-free output (the seed option became a documented no-op), 1.26x-4.2x vs Stage 3 depending on scenario, bit-identical to EdgesGeometry including segment order on standard geometries.
Output Contract Invariants (Stage 4)
Violating any of these breaks bit-compatibility with EdgesGeometry / previous outputs:
- Dot-product precision convention: threshold test compares
float64 (current face normal) x fround(stored normal). Comparing float32 x float32 (or float64 x float64) flips threshold-boundary edges (~1 in 170k edges on a 1-degree-threshold mesh). - Pairing semantics: a directed edge
(a->b)pairs with the latest unmatched(b->a). A duplicate directed edge (non-manifold) overwrites the earlier one, which is never emitted. - Emission order: matched edges are emitted at the stream position of the second (current) face; unmatched edges are emitted afterwards in insertion order.
- Weld pass is O(positionCount): every vertex is welded once, including unreferenced ones (pathological geometries with positionCount >> indexCount pay extra).
- Thread safety: the shared scratch arena makes the class not thread-safe (same caveat as the Stage 3 static arrays).
Legacy Public API
static taus() / static hybridtaus() are retained for backward compatibility but are no longer used for edge identity. The Tausworthe-style bit-mixing (magic masks 0xfffffffe / 0xfffffff8 / 0xfffffff0, GPU Gems 3 Ch. 37) survives inline in the weld pass purely as a probe-start scrambler.
Performance Results
Stage 4 measurements (vitest bench, DevContainer Node 24; medians; details in PR #371):
| Scenario | vs Stage 3 | vs three.js EdgesGeometry |
|---|---|---|
| Box x1000 individual constructions (EdgeGeometryMerger per-source path) | 1.53x mean / 2.4x best | ~4.6x |
| MergedBoxes 12k / 120k tri | 4.23x / 1.94x | ~15-27x |
| TorusKnot 204.8k tri / Sphere 523k tri | 1.50x / 1.26x | ~16-20x |
| TorusKnot 204.8k tri non-indexed | 1.02x (parity) | ~14x |
Accuracy: Stage 3 emitted 1 spurious edge on a smooth 523k-triangle sphere (hash collision, matching the birthday estimate for 32-bit hashes at ~1.57M directed edges); Stage 4 emits 0.
Measurement Archive
The benchmark harness, candidates A-I/Mix, accuracy specs, threshold sweep, and the full RESULTS.md exist only in the history of PR #371 (merge commit be262e8):
021f9d3— round 1: harness, candidates A-D, first quantification of the collision errord418f62— round 2: candidates E-H (rejected with data: interleaved table cells, sqrt-free threshold test)c7cb9e1— round 3: threshold sweep, production-shaped CandidateMix8f1669d— removal of the scaffolding from the tree
Measurement caveat learned: microbenchmarks running many implementation classes in one process converge due to polymorphic IC pollution — use dedicated two-class runs for adoption decisions.
Optimization Principles Applied
- Never let hashes decide identity - Hash only picks the probe start; identity is exact comparison (collision-free at hash-level speed)
- Inline and unroll hot loops - Eliminate call stack overhead and scratch-array indexing
- Pre-allocate fixed arrays - Replace dynamic structures (
Map,push()) with typed-array tables and index tracking - Reuse buffers below a size gate - Allocation dominates small repeated constructions; explicit resets dominate huge ones
Trade-offs
- Readability: Code is long and repetitive (unrolled edge blocks); changes must be applied to all three blocks consistently
- Maintainability: Changes require understanding the Output Contract Invariants above
- Memory: The scratch arena retains the largest-seen buffers (bounded ~30MB by the 2^18 gate)
- Justification: Edge geometry generation directly impacts user experience (frame drops during scene initialization)
Future Optimization Considerations
If further optimization is needed (1M+ triangles or sub-50ms requirements):
- WASM SIMD port of the sort-based pairing layout (Candidate C in the PR #371 history — sequential typed-array passes, designed for portability)
- WebGPU Compute Shader; pays off only if the
webgpu/render path can consume results without readback - Web Workers were evaluated and rejected (startup + copy overhead offsets gains); a SharedArrayBuffer pool would impose COOP/COEP on library users