Ask HN: Build Your Own LLM?

5 points by retube 7 hours ago

The best way to really understand how something works is to build it yourself, so I am wondering if there are any good tutorials on building your own LLM from scratch, i.e. implementing tokenisation, embeddings, attention and so on. I am not suggesting one could replicate ChatGPT, but rather a toy model that implements the core features and is trained on a much smaller corpus.

2ro 7 hours ago

How about this?

https://mathstodon.xyz/@empty/115088095028020763

retube 6 hours ago

thanks

sfmz 6 hours ago

Andrej Karpathy: Let's build GPT: from scratch, in code, spelled out. https://www.youtube.com/watch?v=kCc8FmEb1nY

beardyw 5 hours ago

Andrej Karpathy's nanoGPT is reasonably accessible and easy to run.
https://github.com/karpathy/nanoGPT

pm2222 6 hours ago

https://www.amazon.com/Build-Large-Language-Model-Scratch/dp...

ryanchants an hour ago

I'd get it straight from Manning, save a few bucks, and cut out the middleman: https://www.manning.com/books/build-a-large-language-model-f...

retube 6 hours ago

Thanks, looks promising.