That is 100% my intention and hope, and I think we are very close to deleting all of that. Right now on master, I am already only using Python for the tokenization preprocessing. In principle, the requirements for llm.c should be extremely minimal. I think this is a few days of work that is high on my mind.
Biggest problem right now is finding a place that can host the 135GB of tokens for FineWeb100B. Will probably use S3 or something. Related, see: https://github.com/karpathy/llm.c/issues/482
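
For context on what "only Python for the tokenization preprocessing" means in practice, here is a minimal sketch of that kind of step: encode raw text with the GPT-2 tokenizer and write the token ids as a flat uint16 file that C code can read directly. This is illustrative only; the actual llm.c preprocessing scripts handle details such as file headers and sharding differently.

```python
# Minimal sketch of tokenization preprocessing: documents in, flat uint16 token file out.
# Illustrative only -- the real llm.c preprocessing also writes a header and shards output.
import numpy as np
import tiktoken

enc = tiktoken.get_encoding("gpt2")
eot = enc.eot_token  # end-of-text id, used here as a document delimiter

def tokenize_documents(docs, out_path):
    tokens = []
    for doc in docs:
        tokens.append(eot)                       # mark the start of each document
        tokens.extend(enc.encode_ordinary(doc))  # encode without special-token handling
    np.array(tokens, dtype=np.uint16).tofile(out_path)  # GPT-2 ids fit in uint16

if __name__ == "__main__":
    tokenize_documents(["Hello world.", "Another document."], "tokens.bin")
```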
I think that between Anna's Archive, FineWeb, and as many GitHub repos as you can scrape, you can get a pretty decent dataset.
I doubt Anna's Archive would produce a good model on its own, though.

It's not about learning. It's about owning. Exactly the reason OpenAI stopped being open. Having GPT-4-quality LLMs created by anyone with a gaming PC would be pretty radical.

And you won't get there. Those models are far too large for a 2024 GPU. Llama-3 70B is arguably close to GPT-4 but is still too large for gaming GPUs (and probably will be for many years of GPU updates).
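
A back-of-envelope weight-memory estimate makes the size argument concrete (weights only, ignoring KV cache and activations; figures are approximate):

```python
# Back-of-envelope: memory needed just to hold model weights, ignoring
# KV cache and activations. Numbers are approximate and for illustration.
def weight_memory_gb(params_billion, bits_per_param):
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

for name, params in [("Llama-3 8B", 8), ("Llama-3 70B", 70)]:
    for bits in (16, 8, 4):
        print(f"{name} @ {bits}-bit: {weight_memory_gb(params, bits):.0f} GB")

# Llama-3 70B needs ~140 GB at 16-bit and still ~35 GB at 4-bit quantization,
# versus ~24 GB of VRAM on a top consumer GPU such as an RTX 4090.
```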
I have a 24-core Intel CPU, and llama.cpp runs Llama 3 surprisingly fast in surprisingly little RAM. Yes, it becomes a space heater, but there's light at the end of the CUDA-free tunnel.
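
For anyone wanting to try the same CPU-only setup, one option is the llama-cpp-python bindings on a quantized GGUF model; the file name, quantization level, and thread count below are placeholders to adjust for your own machine:

```python
# Sketch of CPU-only inference via llama-cpp-python on a quantized GGUF model.
# The model path is a placeholder; pick a quantization and thread count that
# fit your RAM and core count.
from llama_cpp import Llama

llm = Llama(
    model_path="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",  # hypothetical local file
    n_ctx=4096,     # context window
    n_threads=24,   # match your physical core count
)

out = llm("Q: What does llm.c implement? A:", max_tokens=64, stop=["\n"])
print(out["choices"][0]["text"])
```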
You'd have to be deep into ML infrastructure to use C, probably via CUDA. Almost no one who develops or uses ML models touches C or even C++; tinygrad and llama.cpp are exceptions.

> Ultimately my interest in llm.c is to have a nice, clean, minimal, super dependency-light repo in direct C/CUDA implementation, which I find aesthetically pleasing.
Also, it's awesome that you spend your time on your passion.
Any plans on making a video series on llm.c? :D