Transformer Model
A neural architecture based on self-attention that processes an entire sequence in parallel, capturing relationships between any two time steps directly.
Transformers replaced recurrence with self-attention, allowing a model to relate any two positions in a sequence directly rather than passing information step by step. This lets them capture both short- and long-range structure, and because every position is processed at once rather than in order, training is highly parallelisable.
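As a rough illustration of how self-attention relates every position to every other position in one step, here is a minimal sketch of single-head scaled dot-product attention in plain Python. It makes simplifying assumptions that a real Transformer does not: the inputs are used directly as queries, keys and values (no learned projections), there is no positional encoding, and the names `softmax`, `self_attention` and the toy sequence `x` are illustrative only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors.

    Returns the attended outputs and the attention weights, one row per
    query position.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this position against every position in the sequence.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy sequence of 4 time steps with 2 features each (assumed data).
# Using x as queries, keys and values is a simplification; a trained
# model would first apply learned linear projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
outputs, weights = self_attention(x, x, x)
```

Each row of `weights` sums to 1 and has a non-zero entry for every time step, so the first position draws on the last one directly rather than through intermediate steps. The loop over queries is written out for readability; each row is independent of the others, which is why the computation can be batched as matrix products in practice.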
endeavr.ai includes a Transformer encoder in its ensemble to complement the recurrent models, giving the stack a view of the full lookback window at once.