
Tenure


Local-first LLM memory proxy for contextualized AI interactions without re-briefing.


TL;DR - Tenure

  • Provides local, private long-term memory for LLMs.
  • Automatically injects structured context into every session.
  • Works as a drop-in proxy for any OpenAI-compatible client; no configuration changes required.
Pricing: Free and open source
Best for: Developers, writers, and researchers who work with LLMs daily

Pros & Cons

Pros

  • Eliminates repetitive re-briefing of LLMs, saving time and improving efficiency.
  • Ensures privacy by keeping all user data and context local on the machine.
  • Works seamlessly with existing OpenAI-compatible clients without any configuration changes.
  • Provides a structured and editable memory, offering more control than raw chat history.
  • Supports various LLM providers, offering flexibility in backend choice.

Cons

  • Requires local setup and management.
  • As an open-source project, it may offer less commercial support than paid solutions.

Key Features

  • Local-first memory storage (no cloud, no tracking)
  • OpenAI API compatibility (drop-in proxy)
  • Structured belief system (Preferences, Decisions, Entities, Open Questions, Expertise)
  • Automatic context injection to eliminate re-briefing
  • Transparent to LLM clients (no plugins or custom integrations needed)
  • Instant import of existing knowledge (skills files, bios, notes)
  • Full control over beliefs (visible, editable, auditable via admin UI)
  • Configurable history and belief compaction

Pricing Plans

Free

$0 USD per month

  • Tenure is free and open source
  • Runs entirely on your own machine, so there are no hosted tiers or usage fees

What is Tenure?

Editorial review
Tenure is a local, privacy-first proxy designed to provide long-term memory for Large Language Models (LLMs). It acts as an intermediary between any OpenAI-compatible client (such as Open WebUI or LM Studio) and the LLM, automatically injecting relevant context from a structured "world model" of user preferences, decisions, entities, and expertise. This eliminates the need to repeatedly brief the LLM on past conversations or specific requirements: every new session starts already contextualized.

The tool is aimed at developers, writers, researchers, and anyone who interacts with LLMs frequently enough to be frustrated by models forgetting previous context. Because Tenure runs entirely on the user's machine, all memory stays local and private. It supports instant import of existing knowledge and gives users full control over stored beliefs, which can be viewed, edited, and audited. The goal is to make LLM interactions more efficient and personalized by maintaining a persistent, structured understanding of the user's work.

Mechanically, Tenure works by routing any OpenAI-compatible client to a local address (localhost:5757/v1). It intercepts prompts, enriches them with relevant information from its world model, and then forwards them to the chosen LLM provider, which can itself be local or cloud-based. The LLM client remains unaware of Tenure's presence, making it a drop-in solution that requires no custom integrations or plugins. Tenure also includes automatic history and belief compaction, and the ability to pause context extraction.
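As a concrete illustration, pointing the official openai Python SDK at Tenure is a one-line base-URL change. This is a minimal sketch: the model name and API key below are placeholders, since the real values depend on the backend Tenure is configured to forward to.

    from openai import OpenAI

    # Point a standard OpenAI client at Tenure's local proxy instead of the
    # cloud endpoint. The client is unaware that Tenure sits in the middle.
    client = OpenAI(
        base_url="http://localhost:5757/v1",
        api_key="placeholder",  # assumption: whatever your backend expects, if anything
    )

    # Tenure intercepts this request, enriches it with relevant beliefs from
    # its world model, and forwards it to the configured LLM provider.
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder: any model your configured backend serves
        messages=[{"role": "user", "content": "Pick up where we left off yesterday."}],
    )
    print(response.choices[0].message.content)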



Tenure FAQ

How does Tenure differentiate its context injection from standard RAG (Retrieval Augmented Generation) approaches?

Tenure goes beyond typical RAG by building a structured "world model" of user preferences, decisions, and expertise, rather than just performing similarity searches on raw text. It uses alias-weighted term matching to return exactly what was named, ensuring more precise and relevant context injection compared to just finding semantically related information.
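Tenure's retrieval internals aren't published on this page, so the following is only a toy sketch of the alias-weighted idea: each belief carries weighted aliases, and a query scores a belief by summing the weights of the aliases it explicitly names, rather than by loose semantic similarity. All records and weights here are invented for illustration.

    # Toy alias-weighted term matching: beliefs score by the weights of the
    # aliases the query actually names (exact terms, not semantic neighbors).
    beliefs = [
        {"text": "Prefers Rust for systems work",
         "aliases": {"rust": 1.0, "systems": 0.4}},
        {"text": "Project 'Atlas' ships quarterly",
         "aliases": {"atlas": 1.0, "release": 0.5}},
    ]

    def score(query: str, belief: dict) -> float:
        terms = set(query.lower().split())
        return sum(w for alias, w in belief["aliases"].items() if alias in terms)

    query = "what did we decide about the atlas release schedule"
    best = max(beliefs, key=lambda b: score(query, b))
    print(best["text"])  # -> Project 'Atlas' ships quarterly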

Can I use Tenure with a local LLM like those run through LM Studio, or is it only for cloud-based OpenAI models?

Yes, Tenure is designed to be provider-agnostic and routes to any OpenAI-compatible endpoint. This includes local LLMs run through tools like LM Studio or Open WebUI, as well as cloud-based models and services such as GPT-4o, Claude, or Bedrock, so you can use your preferred model while still benefiting from Tenure's memory.

What kind of information does Tenure store in its "structured beliefs" and how can I manage it?

Tenure organizes information into categories such as Preferences, Decisions, Entities, Open Questions, and Expertise. You can view, edit, and audit every belief through an admin UI accessible at /beliefs. This allows you to pin important information, correct inaccuracies, and maintain full control over the context provided to your LLM.
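The underlying schema isn't documented in this listing, but conceptually each belief pairs one of those categories with its text and some management metadata. A purely hypothetical record, sketched as a Python dict:

    # Hypothetical belief record -- the actual schema behind Tenure's
    # /beliefs admin UI may differ.
    belief = {
        "category": "Decision",   # Preference, Decision, Entity,
                                  # Open Question, or Expertise
        "text": "Use PostgreSQL for the analytics service",
        "pinned": True,           # pinning marks a belief as important (see above)
        "source": "chat session, 2025-05-12",  # invented provenance field
    }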

If I want to temporarily disable Tenure's context injection for a specific chat session, how can I do that?

You can pause context extraction globally from Tenure's Settings. For more granular control, you can also disable it on a per-session basis directly within your chat client by typing !extract off, allowing you to manage when Tenure intervenes without leaving your workflow.

How does Tenure handle the growth of stored history and beliefs to prevent performance degradation or excessive storage use?

Tenure includes automatic history and belief compaction features. These can be configured from the admin UI with different modes such as aggressive, conservative, or off, allowing you to tune how frequently and thoroughly old data is processed and condensed to maintain performance and manage storage.

Source: github.com