HanumaGPT

HanumaGPT is an enhanced nanoGPT architecture featuring:

  • Optimized key/query vector sizes
  • Sliding window attention
  • Register tokens
  • Advanced MLP layers
  • Alternative Softmax approximations

Together, these modifications make HanumaGPT a more memory- and compute-efficient variant of nanoGPT.

📌 GitHub Repository

🔗 HanumaGPT on GitHub

🔹 Features & Contributions

  • Optimized Key/Query Vector Sizes: Reduced the key/query head dimension to cut attention memory usage (see the attention sketch below).
  • Enhanced Attention Mechanisms: Restricted attention to a sliding window of recent context for better long-sequence processing (also in the attention sketch below).
  • Custom Softmax Approximations: Experimented with alternative Softmax formulations for improved efficiency (one candidate is sketched below).
  • Improved MLP Layers: Redesigned the feed-forward blocks, taking inspiration from LLaMA models, for better performance (see the MLP sketch below).
  • Register Tokens: Prepended learned register tokens to improve autoregressive decoding (see the final sketch below).
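
The reduced key/query dimension and the sliding attention window combine naturally in a single attention module. The following is a minimal PyTorch sketch under assumed names (`head_dim_kq`, `window`), not the repository's actual code:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlidingWindowAttention(nn.Module):
    """Causal self-attention with a reduced key/query head dimension and a
    fixed sliding window (illustrative sketch, not the repository's code)."""
    def __init__(self, n_embd: int = 768, n_head: int = 12,
                 head_dim_kq: int = 32, window: int = 256):
        super().__init__()
        self.n_head = n_head
        self.head_dim_v = n_embd // n_head    # values keep the full head width
        self.head_dim_kq = head_dim_kq        # smaller key/query heads save memory
        self.window = window
        self.q_proj = nn.Linear(n_embd, n_head * head_dim_kq, bias=False)
        self.k_proj = nn.Linear(n_embd, n_head * head_dim_kq, bias=False)
        self.v_proj = nn.Linear(n_embd, n_embd, bias=False)
        self.out_proj = nn.Linear(n_embd, n_embd, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q = self.q_proj(x).view(B, T, self.n_head, self.head_dim_kq).transpose(1, 2)
        k = self.k_proj(x).view(B, T, self.n_head, self.head_dim_kq).transpose(1, 2)
        v = self.v_proj(x).view(B, T, self.n_head, self.head_dim_v).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(self.head_dim_kq)
        # Causal mask limited to the last `window` tokens.
        idx = torch.arange(T, device=x.device)
        banned = (idx[None, :] > idx[:, None]) | (idx[:, None] - idx[None, :] >= self.window)
        att = att.masked_fill(banned, float('-inf'))
        att = F.softmax(att, dim=-1)
        y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.out_proj(y)
```

Shrinking only the key/query heads cuts the QKᵀ compute and the key-cache memory, while the value heads keep the full width so the residual stream loses no capacity.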
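
The section does not name which Softmax alternatives were tried. One commonly discussed variant, shown here purely as an example, is the "off-by-one" Softmax, which adds 1 to the denominator so an attention head can assign near-zero weight everywhere:

```python
import torch

def softmax_one(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Off-by-one softmax: exp(x_i) / (1 + sum_j exp(x_j)).
    # Shift by a clamped max for numerical stability; the algebra stays exact.
    m = x.max(dim=dim, keepdim=True).values.clamp(min=0)
    e = torch.exp(x - m)
    return e / (torch.exp(-m) + e.sum(dim=dim, keepdim=True))

# Drop-in replacement for F.softmax on the attention scores:
# att = softmax_one(att, dim=-1)
```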
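
LLaMA models use a SwiGLU-gated feed-forward block, so "inspired by LLaMA" most likely points in that direction; the sketch below assumes SwiGLU and an illustrative hidden width:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUMLP(nn.Module):
    """LLaMA-style gated MLP: down( SiLU(gate(x)) * up(x) )."""
    def __init__(self, n_embd: int = 768, hidden: int = 2048):
        super().__init__()
        self.gate = nn.Linear(n_embd, hidden, bias=False)
        self.up = nn.Linear(n_embd, hidden, bias=False)
        self.down = nn.Linear(hidden, n_embd, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Element-wise gating modulates features multiplicatively, a design
        # the LLaMA papers adopt in place of the plain GELU MLP.
        return self.down(F.silu(self.gate(x)) * self.up(x))
```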
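
Register tokens are typically learned embeddings prepended to the sequence: they participate in attention but are dropped from the output, giving the model scratch slots to park global information. A minimal sketch with hypothetical names:

```python
import torch
import torch.nn as nn

class RegisterTokens(nn.Module):
    """Learned register tokens, prepended before the transformer blocks and
    stripped afterwards (hypothetical helper, not the repository's API)."""
    def __init__(self, n_registers: int = 4, n_embd: int = 768):
        super().__init__()
        self.n_registers = n_registers
        self.registers = nn.Parameter(0.02 * torch.randn(1, n_registers, n_embd))

    def prepend(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, C) -> (B, n_registers + T, C)
        return torch.cat([self.registers.expand(x.size(0), -1, -1), x], dim=1)

    def strip(self, x: torch.Tensor) -> torch.Tensor:
        # Drop the register positions before the LM head.
        return x[:, self.n_registers:, :]
```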

🚀 Results & Impact

  • Improved computational efficiency over standard nanoGPT.
  • Reduced inference costs while maintaining performance.
  • Better handling of long-sequence generation, thanks to the sliding window attention.