How does Zep handle evolving user information and fact invalidation over time?
Zep maintains a temporal knowledge graph that updates with every interaction. When a fact changes, the superseded fact is automatically marked invalid rather than deleted, so the agent always retrieves the most current and accurate context while the change history remains intact.
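Conceptually, temporal invalidation means each fact carries validity timestamps, and asserting a conflicting fact closes out the old one instead of overwriting it. The sketch below is a minimal illustration of that idea only; the class and field names are hypothetical and are not Zep's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical sketch of temporal fact tracking; not Zep's real data model.
@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means the fact is still current

class TemporalFactStore:
    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def assert_fact(self, subject: str, predicate: str, obj: str) -> None:
        now = datetime.now(timezone.utc)
        # A new fact about the same subject/predicate invalidates the old one,
        # but the old fact is kept with its validity window, not deleted.
        for f in self.facts:
            if f.subject == subject and f.predicate == predicate and f.invalid_at is None:
                f.invalid_at = now
        self.facts.append(Fact(subject, predicate, obj, valid_at=now))

    def current(self, subject: str, predicate: str) -> Optional[str]:
        # Only facts with no invalid_at are returned as current context.
        for f in reversed(self.facts):
            if f.subject == subject and f.predicate == predicate and f.invalid_at is None:
                return f.obj
        return None

store = TemporalFactStore()
store.assert_fact("user:42", "lives_in", "Berlin")
store.assert_fact("user:42", "lives_in", "Lisbon")  # Berlin fact is invalidated, not removed
```

The key design point is that invalidation preserves history: the agent sees only "Lisbon" as current, but the Berlin fact survives with timestamps showing when it was believed true.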
What types of data sources can Zep ingest to build its unified context graph?
Zep can ingest a variety of data sources including chat messages, JSON business data, CRM records, application events, and documents. It integrates these into a single, evolving context graph.
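One way to picture multi-source ingestion is a normalization step that converts chat messages, JSON business records, and raw documents into a common "episode" shape before they are merged into the graph. This is an illustrative sketch under that assumption; the function and field names are invented here and are not Zep's SDK.

```python
import json
from typing import Any

# Illustrative normalization of heterogeneous sources into uniform "episodes";
# the shape and names here are assumptions, not Zep's actual ingestion API.
def to_episode(source_type: str, payload: Any) -> dict:
    if source_type == "message":
        # Chat turns keep their speaker role alongside the text.
        return {"type": "message", "content": f"{payload['role']}: {payload['content']}"}
    if source_type == "json":
        # Business data (CRM records, application events) is serialized deterministically.
        return {"type": "json", "content": json.dumps(payload, sort_keys=True)}
    if source_type == "text":
        # Documents and tickets pass through as plain text.
        return {"type": "text", "content": payload}
    raise ValueError(f"unsupported source type: {source_type}")

episodes = [
    to_episode("message", {"role": "user", "content": "I just moved to Lisbon"}),
    to_episode("json", {"crm_id": 42, "plan": "enterprise"}),
    to_episode("text", "Support ticket: login fails on mobile"),
]
```

Once normalized like this, every source contributes entities and facts to the same evolving graph rather than living in separate silos.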
How does Zep ensure low latency for real-time applications like voice agents?
Zep is optimized for performance, with P95 retrieval latency under 200 ms. That speed is crucial for real-time applications such as voice agents, video agents, and live support, where any added retrieval delay is immediately noticeable to users.
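P95 latency means 95% of retrievals complete within the stated bound. If you want to verify this against your own deployment, a small timing harness suffices; the helper below is a generic sketch (nearest-rank percentile), not part of any Zep tooling.

```python
import math
import time

def p95(samples_ms: list[float]) -> float:
    # Nearest-rank P95: the value at or below which 95% of samples fall.
    s = sorted(samples_ms)
    idx = max(0, math.ceil(0.95 * len(s)) - 1)
    return s[idx]

def timed(fn, *args):
    # Wrap any retrieval call and report its wall-clock latency in milliseconds.
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

# Example: 100 simulated retrieval timings from 50 ms to 149 ms.
samples = [50.0 + i for i in range(100)]
```

In practice you would collect `elapsed_ms` from real retrieval calls over a representative traffic window and compare `p95(samples)` against your latency budget.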
Can I customize the context blocks that Zep provides to my LLM?
Yes, Zep allows for customizable context blocks. You can define templates using a simple API to format the retrieved context, including user profiles, relevant facts, and entities, optimized for your specific LLM needs.
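The idea behind a context-block template is simple: you declare the layout once, and retrieved facts and entities are slotted into it before being sent to the LLM. The sketch below illustrates that pattern with the standard library's `string.Template`; the template text and function are hypothetical, not Zep's template API.

```python
from string import Template

# Hypothetical context-block template; Zep's actual template syntax may differ.
CONTEXT_TEMPLATE = Template(
    """FACTS (most recent first):
$facts

ENTITIES:
$entities
"""
)

def render_context(facts: list[str], entities: dict[str, str]) -> str:
    # Fill the template with retrieved memory, one bullet per fact/entity.
    return CONTEXT_TEMPLATE.substitute(
        facts="\n".join(f"- {f}" for f in facts),
        entities="\n".join(f"- {k}: {v}" for k, v in entities.items()),
    )

block = render_context(
    ["User moved to Lisbon (valid since 2024-06-01)"],
    {"User": "premium subscriber"},
)
```

The rendered string is what gets prepended to the prompt, so the template is where you tune section order, labels, and verbosity for your particular model.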
How does Zep compare to other agent memory solutions in terms of accuracy and efficiency?
Zep leads the LoCoMo benchmark for single-shot retrieval, scoring up to 80.32% accuracy with 189 ms retrieval latency. It offers configurations that trade off accuracy, latency, and token efficiency, outperforming traditional chat-history memory and static RAG approaches.
Does Zep support defining custom graph ontologies for specific business domains?
Yes, Zep allows users to easily define custom graph ontologies to improve the precision and relevance of agent memory. This customization is done with familiar tools: Pydantic in Python and Zod in TypeScript.
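A custom ontology amounts to declaring the entity types your domain cares about as typed models. Here is a minimal Pydantic sketch for a sales domain; the class names, fields, and how they would be registered with Zep are illustrative assumptions, not Zep's exact API.

```python
from typing import Optional
from pydantic import BaseModel, Field

# Illustrative domain entity types for a sales use case.
# Field names and the registration step are assumptions, not Zep's exact API.
class Account(BaseModel):
    """A customer account tracked in the CRM."""
    industry: Optional[str] = Field(None, description="Primary industry of the account")
    plan: Optional[str] = Field(None, description="Subscription plan, e.g. 'enterprise'")

class Deal(BaseModel):
    """A sales opportunity associated with an account."""
    stage: Optional[str] = Field(None, description="Pipeline stage, e.g. 'negotiation'")
    amount_usd: Optional[float] = Field(None, description="Deal size in US dollars")

acct = Account(industry="logistics", plan="enterprise")
```

Declaring types like these constrains extraction to the entities and attributes that matter in your domain, which is what improves precision over a fully generic graph.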