
Tenure


Local-first LLM memory proxy for contextualized AI interactions without re-briefing.


TL;DR - Tenure

  • Provides local, private long-term memory for LLMs.
  • Automatically injects structured context into every session.
  • Works as a drop-in proxy for any OpenAI-compatible client; no configuration changes required.
Pricing: Free and open source
Best for: Developers, writers, and researchers who work with LLMs daily

Pros & Cons

Pros

  • Eliminates repetitive re-briefing of LLMs, saving time and improving efficiency.
  • Ensures privacy by keeping all user data and context local on the machine.
  • Works seamlessly with existing OpenAI-compatible clients without any configuration changes.
  • Provides a structured and editable memory, offering more control than raw chat history.
  • Supports various LLM providers, offering flexibility in backend choice.

Cons

  • Requires local setup and management.
  • As an open-source project, it may offer less commercial support than paid solutions.

Key Features

  • Local-first memory storage (no cloud, no tracking)
  • OpenAI API compatibility (drop-in proxy)
  • Structured belief system (Preferences, Decisions, Entities, Open Questions, Expertise)
  • Automatic context injection to eliminate re-briefing
  • Transparent to LLM clients (no plugins or custom integrations needed)
  • Instant import of existing knowledge (skills files, bios, notes)
  • Full control over beliefs (visible, editable, auditable via admin UI)
  • Configurable history and belief compaction

Pricing Plans

Free

$0 USD per month

  • Tenure is free and open source
  • Runs entirely on your own machine, so there are no hosted tiers or usage fees

What is Tenure?

Editorial review
Tenure is a local, privacy-first proxy designed to provide long-term memory for Large Language Models (LLMs). It acts as an intermediary between any OpenAI-compatible client (such as Open WebUI or LM Studio) and the LLM, automatically injecting relevant context from a structured "world model" of user preferences, decisions, entities, and expertise. This eliminates the need to repeatedly brief the LLM on past conversations or specific requirements: every new session starts already contextualized.

The tool is aimed at developers, writers, researchers, and anyone who interacts with LLMs frequently enough to be frustrated by models forgetting previous context. Because Tenure runs entirely on the user's machine, all memory stays local and private. It supports instant import of existing knowledge and gives users full control over stored beliefs, which can be viewed, edited, and audited. The goal is to make LLM interactions more efficient and personalized by maintaining a persistent, structured understanding of the user's work.

Mechanically, Tenure works by routing any OpenAI-compatible client to a local address (localhost:5757/v1). It intercepts prompts, enriches them with relevant information from its world model, and then forwards them to the chosen LLM provider, which can itself be local or cloud-based. The LLM client remains unaware of Tenure's presence, making it a drop-in solution that requires no custom integrations or plugins. Tenure also includes automatic history and belief compaction, and the ability to pause context extraction.
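As a concrete illustration, pointing the official openai Python SDK at Tenure is a one-line base-URL change. This is a minimal sketch: the model name and API key below are placeholders, since the real values depend on the backend Tenure is configured to forward to.

    from openai import OpenAI

    # Point a standard OpenAI client at Tenure's local proxy instead of the
    # cloud endpoint. The client is unaware that Tenure sits in the middle.
    client = OpenAI(
        base_url="http://localhost:5757/v1",
        api_key="placeholder",  # assumption: whatever your backend expects, if anything
    )

    # Tenure intercepts this request, enriches it with relevant beliefs from
    # its world model, and forwards it to the configured LLM provider.
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder: any model your configured backend serves
        messages=[{"role": "user", "content": "Pick up where we left off yesterday."}],
    )
    print(response.choices[0].message.content)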



Tenure FAQ

How does Tenure differentiate its context injection from standard RAG (Retrieval Augmented Generation) approaches?

Tenure goes beyond typical RAG by building a structured "world model" of user preferences, decisions, and expertise, rather than just performing similarity searches on raw text. It uses alias-weighted term matching to return exactly what was named, ensuring more precise and relevant context injection compared to just finding semantically related information.
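Tenure's retrieval internals aren't published on this page, so the following is only a toy sketch of the alias-weighted idea: each belief carries weighted aliases, and a query scores a belief by summing the weights of the aliases it explicitly names, rather than by loose semantic similarity. All records and weights here are invented for illustration.

    # Toy alias-weighted term matching: beliefs score by the weights of the
    # aliases the query actually names (exact terms, not semantic neighbors).
    beliefs = [
        {"text": "Prefers Rust for systems work",
         "aliases": {"rust": 1.0, "systems": 0.4}},
        {"text": "Project 'Atlas' ships quarterly",
         "aliases": {"atlas": 1.0, "release": 0.5}},
    ]

    def score(query: str, belief: dict) -> float:
        terms = set(query.lower().split())
        return sum(w for alias, w in belief["aliases"].items() if alias in terms)

    query = "what did we decide about the atlas release schedule"
    best = max(beliefs, key=lambda b: score(query, b))
    print(best["text"])  # -> Project 'Atlas' ships quarterly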

Can I use Tenure with a local LLM like those run through LM Studio, or is it only for cloud-based OpenAI models?

Yes, Tenure is designed to be provider-agnostic and routes to any OpenAI-compatible endpoint. This includes local LLMs run through tools like LM Studio or Open WebUI, as well as cloud-based models and services such as GPT-4o, Claude, or Bedrock, so you can use your preferred model while still benefiting from Tenure's memory.

What kind of information does Tenure store in its "structured beliefs" and how can I manage it?

Tenure organizes information into categories such as Preferences, Decisions, Entities, Open Questions, and Expertise. You can view, edit, and audit every belief through an admin UI accessible at /beliefs. This allows you to pin important information, correct inaccuracies, and maintain full control over the context provided to your LLM.
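The underlying schema isn't documented in this listing, but conceptually each belief pairs one of those categories with its text and some management metadata. A purely hypothetical record, sketched as a Python dict:

    # Hypothetical belief record -- the actual schema behind Tenure's
    # /beliefs admin UI may differ.
    belief = {
        "category": "Decision",   # Preference, Decision, Entity,
                                  # Open Question, or Expertise
        "text": "Use PostgreSQL for the analytics service",
        "pinned": True,           # pinning marks a belief as important (see above)
        "source": "chat session, 2025-05-12",  # invented provenance field
    }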

If I want to temporarily disable Tenure's context injection for a specific chat session, how can I do that?

You can pause context extraction globally from Tenure's Settings. For more granular control, you can also disable it on a per-session basis directly within your chat client by typing !extract off, allowing you to manage when Tenure intervenes without leaving your workflow.

How does Tenure handle the growth of stored history and beliefs to prevent performance degradation or excessive storage use?

Tenure includes automatic history and belief compaction features. These can be configured from the admin UI with different modes such as aggressive, conservative, or off, allowing you to tune how frequently and thoroughly old data is processed and condensed to maintain performance and manage storage.

Source: github.com