I find that the associative nature of our memory can only be represented as a graph, but we definitely also need a vector store.
I like to pose this as an identity question rather than a memory question: colloquially, memory is associated only with fact extraction and retrieval, while identity extends that to traits, behavior, preferences, memories, and narrative and causal relations.
Essentially, a self-organization of symbols and memories into a coherent, singular entity.
I've written about it here:
https://saxenauts.io/blog/persona
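To make the graph-plus-vector-store point concrete, here is a minimal sketch of what I mean (the class and names are hypothetical, not from the linked post): each memory is a node carrying an embedding, retrieval is seeded by vector similarity and then expanded along associative edges.

    import numpy as np

    class MemoryGraph:
        """Toy hybrid store: each memory is a graph node with an embedding;
        edges carry associative relations (trait, preference, causal link, ...)."""

        def __init__(self, dim=384):
            self.dim = dim
            self.embeddings = {}   # node_id -> np.ndarray (the vector-store half)
            self.edges = {}        # node_id -> list of (relation, other_node_id)

        def add(self, node_id, embedding, relations=()):
            self.embeddings[node_id] = np.asarray(embedding, dtype=float)
            self.edges.setdefault(node_id, [])
            for relation, other in relations:
                self.edges[node_id].append((relation, other))
                self.edges.setdefault(other, []).append((relation, node_id))

        def recall(self, query_embedding, k=3, hops=1):
            """Vector similarity seeds the search; graph edges expand it."""
            q = np.asarray(query_embedding, dtype=float)
            scored = sorted(
                self.embeddings,
                key=lambda n: -float(q @ self.embeddings[n])
                / (np.linalg.norm(q) * np.linalg.norm(self.embeddings[n]) + 1e-9),
            )[:k]
            recalled = set(scored)
            for _ in range(hops):
                for node in list(recalled):
                    recalled.update(other for _, other in self.edges.get(node, []))
            return recalled

The point of the split is that similarity search alone misses the associative hops (this reminds me of that, which caused the other), while a bare graph has no way to find an entry point from a fuzzy query.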
While the general problem of limited context windows remains, the example given regarding novellas is badly outdated. Many open models now offer more than 128k tokens of context, which is enough to fit an entire novella in the window (as a rule of thumb, 128k tokens is roughly 100k words, or about 300 pages of English text).
Frontier models even support context in the millions of tokens, making it possible to keep an entire novella series inside the context window. I suspect most memory-related problems will simply disappear as context lengths increase: attempts at outsourcing LLM memory or producing theoretically infinite context have been going on for years and have not yet produced anything that can compete with raw scaling of transformers.
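The rule of thumb above is just arithmetic; a quick sketch, assuming roughly 0.75 words per token and 300 words per page (both rough averages for English text, not exact figures):

    # Back-of-the-envelope conversion for the rule of thumb above.
    WORDS_PER_TOKEN = 0.75   # rough average for English text
    WORDS_PER_PAGE = 300     # rough average for a printed page

    def context_capacity(tokens):
        words = tokens * WORDS_PER_TOKEN
        pages = words / WORDS_PER_PAGE
        return words, pages

    for tokens in (128_000, 1_000_000):
        words, pages = context_capacity(tokens)
        print(f"{tokens:>9,} tokens ~= {words:>9,.0f} words ~= {pages:>5,.0f} pages")
    #   128,000 tokens ~=    96,000 words ~=   320 pages  (a long novella)
    # 1,000,000 tokens ~=   750,000 words ~= 2,500 pages  (a whole series)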
Check out memelang.net, which aims to solve many of these issues.
I think Cortical Conductor Theory tries to resolve this.
I had to quit halfway through due to time constraints, but I'll comment on this:
>A more explicit strategy: after producing a new document, take every pair of documents in context and ask the model if they should probably be connected. Ask what future queries could be helped by having those documents connected. Are those queries likely? If so, make the connection.
Congratulations! You have invented attention!
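For what it's worth, the quoted strategy is easy to write down; a rough sketch, where ask_model is a hypothetical yes/no LLM call rather than any real API:

    from itertools import combinations

    def link_documents(documents, ask_model):
        """Sketch of the quoted strategy: after producing a new document,
        consider every pair in context and ask the model whether to connect them.
        ask_model(prompt) -> bool is a hypothetical LLM wrapper."""
        connections = []
        for a, b in combinations(documents, 2):
            should_link = ask_model(
                f"Should these documents be connected?\nA: {a}\nB: {b}\n"
                "What future queries would the connection help, and are they likely?"
            )
            if should_link:
                connections.append((a, b))
        return connections

The O(n^2) pass over all pairs in context is exactly the shape of attention, just with a discrete yes/no judgment in place of a learned similarity score.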