Transformer-Native Architecture
The CRETA ASIC represents a fundamental rethinking of how silicon should process transformer workloads. Rather than adapting a general-purpose architecture, we started with the mathematical operations at the heart of transformer inference and designed custom hardware to execute them with maximum efficiency. Our architecture natively supports multi-head attention, feed-forward networks, and layer normalization with dedicated hardware units that eliminate the overhead of software emulation.
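For reference, the three operations called out above reduce to a small amount of dense linear algebra per token. The sketch below is a minimal NumPy rendering of single-head attention, layer normalization, and the feed-forward path, intended only to make the compute pattern concrete; the function and parameter names are illustrative and are not part of any CRETA programming interface.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each token vector to zero mean / unit variance, then scale and shift."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def attention(q, k, v):
    """Scaled dot-product attention for one head: softmax(QK^T / sqrt(d)) V."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # stability shift (see the softmax sketch further down)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def feed_forward(x, w1, b1, w2, b2):
    """Position-wise feed-forward network with a tanh-approximated GELU nonlinearity."""
    h = x @ w1 + b1
    h = 0.5 * h * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (h + 0.044715 * h**3)))
    return h @ w2 + b2
```

In a multi-head configuration the attention step is simply repeated per head over slices of the channel dimension, which is what the configurable head counts in the first bullet below refer to.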
- Hardware-accelerated multi-head attention with configurable head counts
- Native support for rotary position embeddings (RoPE) and ALiBi (a reference RoPE formulation is sketched after this list)
- Optimized softmax units with numerical stability guarantees (see the stable-softmax sketch below)
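The rotary-embedding support in the second bullet can be summarized with the usual pairwise-rotation formulation. The sketch below assumes the split-half RoPE variant and the conventional base frequency of 10000; neither choice is a CRETA-specific detail.

```python
import numpy as np

def rotary_embedding(x, base=10000.0):
    """Apply rotary position embeddings (RoPE) to a (seq_len, dim) array.

    Each (x1, x2) channel pair is rotated by an angle that grows with token
    position, so relative position falls out of the attention dot products.
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)               # per-pair rotation frequency
    angles = np.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, half) rotation angles
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```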
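The numerical-stability guarantee in the last bullet refers to keeping softmax exponentials in range regardless of logit magnitude. The standard way to express that, sketched below, is the max-subtraction trick; the hardware implementation itself is not implied here.

```python
import numpy as np

def stable_softmax(scores, axis=-1):
    """Softmax with the max-subtraction trick.

    Subtracting the row maximum before exponentiating is mathematically a
    no-op but prevents exp() from overflowing for large attention logits.
    """
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)
```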