Imagine an idea as a point on a graph: ideas that are similar sit close together, and ideas that are very different sit far apart. An LLM is a predictive model for this graph, just like a line of best fit is a predictive model for a simple linear graph. So in a way, the model predicts the information; it isn't stored directly or searched for.
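If it helps, here's a toy sketch of the "line of best fit" analogy in Python (just numpy, purely illustrative, not how an LLM actually works):

```python
import numpy as np

# "Training data": a handful of (x, y) points.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Fit the line of best fit: the "model" is now just two numbers.
slope, intercept = np.polyfit(x, y, deg=1)

# From here on you can throw the data away and still predict new points.
print(slope * 10.0 + intercept)  # prediction for an x the data never contained
```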
A locally running LLM is just one of these models shrunk down and running on your computer.
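As a concrete example, here's a minimal sketch of running a small language model locally, assuming you have the Hugging Face transformers library (and PyTorch) installed; gpt2 is used only because it's small enough to run on a laptop CPU:

```python
from transformers import pipeline

# Downloads the model weights once, then everything runs on your own machine.
generator = pipeline("text-generation", model="gpt2")

result = generator("The capital of France is", max_new_tokens=10)
print(result[0]["generated_text"])
```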
Edit: removed a point about embeddings that wasn't fully accurate
Thanks, that helps me understand things better. I'm guessing you need all the data initially to set up the graph (the model), and then you only need the model itself?
Yep, exactly. Every LLM has a 'cutoff date', which is the latest date covered by the data used to train the model.