The Tokens
Bonding through words... Manpreet & Renaira
Featured
Glorious AI Noise
Keep Your Sanity When AI Breaks the Speed Limit I often hear this question about AI, and it can feel overwhelming to read, learn, and adapt. The real challenge lies in knowing where to begin, where to pause, and how to navigate through the...
AI: A Slightly Silly Look into the Future
I have been actively working on AI since before ChatGPT, and it has been amazing to watch how quickly AI models are changing every discussion and conversation around us. On Saturday evening, I decided to pen down some of my thoughts. These...
Flash Attention
This write-up was not written by GPT in any form, and believe me, it feels good to have typos and grammatical mistakes to feel more human. I do not want to talk about how neural networks were inspired by brains, but I do...
GRPO At Its Best
The wild world of fine-tuning large language models is where we feed math problems to a 7-billion-parameter beast (Qwen2.5-7B-Instruct), run it on 8 fire-breathing A100 GPUs, and politely ask it to get smarter without throwing a tantrum. This writeup dives into GRPO, a Reinforcement Learning...
Watch My Models Learn
Fancy models have set the bar high, but guess what? My model is taking a different route by mastering the art of improvement on every forward and backward pass! Let’s explore the numbers that prove this learning leap. (P.S. If you’re new here, check...
AI Leadership
AI is becoming increasingly prevalent in technology, with many products and features being developed in prototype stages. However, the pressure to add “AI” to everything often doesn’t lead to meaningful impact. To effectively harness AI, it is essential to understand business needs and...
GPT & Me 🧠
This write-up is A Neural Network Love Story (Spoiler: It’s Complicated), one neuron at a time – while GPT pretends not to notice! It is my hands-on experience training a Generative Pretrained Transformer with 124 million parameters - powered by 8 massive NVIDIA A100 GPUs,...
Neural Networks and Coffee Breaks ☕
Step at a Time 📚 This writeup provides a beginner-friendly explanation of how to understand and train GPT-2. I started by implementing a transformer decoder. You can visit mini-autograd and mini-models for my older work, and now I am slowly graduating to setting up, training, and...