My first AI Study Sessions were posted on my blog. I'm going to keep posting them there, on CrackedEng and on Twitter to foster different kinds of discussions.

September 16, 2024

I've realized that the way I learn best is by exposing myself to the information at a high level in a non-intensive manner, chewing on it for a bit, and then going back through it more intensively later while taking notes. I watched Karpathy's micrograd video in one sitting one night while smoking a cigar and just marveled at his ability to teach while I tried to soak it up little by little, but over the last week I've been working back through it while taking notes.

This approach has been super helpful. As I take notes, I feel more like I'm reviewing information rather than seeing it for the first time, and I'm filling in the gaps that I didn't quite get at first. There is probably some cognitive science behind this that I haven't read up on, but regardless I think it's the way I'm going to go about it going forward.

Learnings

Karpathy is a brilliant teacher

He doesn't make me (a math moron) feel like an idiot for not understanding the chain rule off the rip, and he doesn't just toss it out as an idea you're expected to already understand. The math isn't 100% within my grasp yet, but the way he teaches it, relying on intuition rather than hardcore math formulas to memorize, is incredibly helpful for me.
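The intuition that stuck with me is that a derivative is just "nudge the input a tiny bit and see how much the output moves," and the chain rule says nested functions multiply their local slopes together. Here's a quick sanity check I can run myself (a throwaway snippet of my own, not code from the video):

```python
# Derivative as "nudge the input, measure the output" vs. the chain rule.
import math

def g(x):            # inner function
    return math.sin(x)

def f(u):            # outer function
    return u ** 2

x, h = 1.3, 1e-6

# slope measured numerically by nudging x a tiny bit
numeric = (f(g(x + h)) - f(g(x))) / h

# slope predicted by the chain rule: f'(g(x)) * g'(x) = 2*sin(x) * cos(x)
chain_rule = 2 * g(x) * math.cos(x)

print(numeric, chain_rule)   # these agree to several decimal places
```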

Neural Networks of this type are shockingly simple

Backpropagation as an idea is really simple: you start with a network of simple functions; these functions have weights, biases and inputs; the inputs are static but the weights and biases are not; you run a forward pass, applying the weights, biases and squashing functions to get an output, and compare that output to what you wanted in order to get a loss; then you go back and nudge the weights and biases in the direction that shrinks the loss, calculate a new loss, and repeat.
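To make that loop concrete for myself, here's a tiny toy version I put together (my own sketch, not code from the video): one neuron, one fixed input, a tanh squashing function, and the gradients worked out by hand with the chain rule.

```python
# One neuron, one input, trained by gradient descent with hand-derived gradients.
import math

x, target = 2.0, 0.8   # fixed input and the output we want
w, b = -0.3, 0.2       # weight and bias: the parameters we get to tweak
lr = 0.1               # learning rate: how big a nudge we take each step

for step in range(50):
    # forward pass: weighted input + bias, squashed through tanh
    z = w * x + b
    y = math.tanh(z)
    loss = (y - target) ** 2          # squared-error loss

    # backward pass: chain rule, worked out by hand for this tiny graph
    dloss_dy = 2 * (y - target)
    dy_dz = 1 - y ** 2                # derivative of tanh
    grad_w = dloss_dy * dy_dz * x
    grad_b = dloss_dy * dy_dz * 1.0

    # nudge the parameters opposite the gradient so the loss shrinks
    w -= lr * grad_w
    b -= lr * grad_b

print(f"final loss: {loss:.6f}, w={w:.3f}, b={b:.3f}")
```

As I understand it, micrograd's whole point is automating that backward-pass bookkeeping so you never have to derive the gradients by hand like this.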

I think this simplicity highlights how much of the mystique around machine learning is really simplicity hidden behind jargon. I understand the need for jargon and I don't think it's inherently a bad thing (we have to communicate, after all) but for a long time I let that jargon scare me off.

The fact that Karpathy was able to explain backprop with fairly minimal mathematical prerequisites in a roughly two-hour video shows you just how simple it can be.

Now, I know that ChatGPT isn't running on a simple algorithm like this and I understand that there is actual, legitimate complexity, but it's refreshing knowing I can make some serious progress on my learnings fairly easily.

I get why they're using Python

The NumPy library really does make things a lot easier compared to going without it, and the simplicity of Python does let you create NNs with very minimal setup/config code. I still don't like Python, but I get it.
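For example (another throwaway snippet of my own, not something from the video), a full forward pass through a tiny two-layer network is just a couple of matrix multiplications once NumPy handles the linear algebra and broadcasting:

```python
# Forward pass through a tiny 2-layer network on a small batch of inputs.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))                      # 4 samples, 3 features each
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)    # layer 1: 3 -> 5
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)    # layer 2: 5 -> 1

h = np.tanh(X @ W1 + b1)     # hidden activations for the whole batch at once
out = np.tanh(h @ W2 + b2)   # network outputs, shape (4, 1)
print(out)
```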

Frustrations

My only real frustration at this juncture is that I can't work on AI/ML stuff very often. Working a day job + building WhipsAI + now building a community is taking up a lot of time, so finding time to sit down and hardcore study is hard. This is a time management issue, though, so it's very solvable.

Next steps