A weekly roundup of some resources I've found helpful for learning more about AI and building AI systems:
- Large language models, explained with a minimum of math and jargon: The most accessible description of attention heads that I've read. I'd read about attention mechanisms in transformers before, but their importance really clicked after this article. The walkthrough of how the attention heads and the feed-forward layers work together to answer a question was particularly helpful.
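To make the attention-head idea concrete, here's a minimal pure-Python sketch of single-head scaled dot-product attention. The vectors are toy numbers I made up for illustration; a real transformer uses many heads, learned query/key/value projections, and much higher dimensions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for one head.

    Each token's query is scored against every key; the softmaxed
    scores decide how much of each value vector that token absorbs.
    """
    d = len(keys[0])  # key dimension, used for the 1/sqrt(d) scaling
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Output is a weighted average (convex combination) of the values.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Three toy tokens with 2-dimensional vectors (illustrative values only).
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(attention(q, k, v))
```

Because the weights sum to 1, each output row stays inside the range spanned by the value vectors — which is why attention is often described as each token "mixing in" information from the tokens it attends to.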
- Opportunities in AI: A great talk by Andrew Ng. AI is becoming more accessible, and I'm excited to see it applied to problems that previously lacked a large enough ROI to justify an AI solution.
- Let's build GPT: from scratch, in code, spelled out: I'm enjoying this video from Karpathy and still working through it. I'm coding along as I watch, and planning to apply some of the ideas to another from-scratch project. Still noodling on what that project will be...
- Introduction to PyTorch: Establishing a foundation in the PyTorch basics.