Show HN: Persistent Mind Model (PMM) – Update: a model-agnostic "mind-layer"
A few weeks ago I shared the Persistent Mind Model (PMM) — a Python framework for giving an AI assistant a durable identity and memory across sessions, devices, and even model back-ends.
Since then, I’ve added some big updates:
- DevTaskManager — PMM can now autonomously open, track, and close its own development tasks, with an event-logged lifecycle (task_created, task_progress, task_closed).
- BehaviorEngine hook — scans replies for artifacts (e.g. Done: lines, PR links, file references) and auto-generates evidence events (rough sketch just below this list); commitments now close with confidence thresholds instead of vibes.
- Autonomy probes — new API endpoints (/autonomy/tasks, /autonomy/status) expose live metrics: open tasks, commitment close rates, reflection contract pass-rate, drift signals.
- Slow-burn evolution — identity and personality traits evolve steadily through reflections and “drift,” rather than resetting each session.
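To make the BehaviorEngine hook concrete, here's a rough sketch of the kind of artifact scan it runs. The pattern set and function name below are illustrative only, not PMM's actual code:

```python
import re

# Illustrative patterns only; the real BehaviorEngine's rules may differ.
ARTIFACT_PATTERNS = {
    "done_line": re.compile(r"^Done:\s*(.+)$", re.MULTILINE),
    "pr_link": re.compile(r"https://github\.com/\S+/pull/\d+"),
    "file_ref": re.compile(r"\b[\w./-]+\.(?:py|md|json|txt)\b"),
}

def scan_reply_for_evidence(reply_text: str) -> list[dict]:
    """Scan an assistant reply for artifacts and emit evidence events."""
    events = []
    for kind, pattern in ARTIFACT_PATTERNS.items():
        for match in pattern.findall(reply_text):
            events.append({"type": "evidence", "artifact": kind, "content": match})
    return events
```

Each evidence event gets linked back to whatever open commitment it supports, which is what lets commitments close on proof rather than on the model's say-so.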
Why this matters: Most agent frameworks feel impressive for a single run but collapse without continuity. PMM is different: it keeps an append-only event chain (SQLite hash-chained), a JSON self-model, and evidence-gated commitments. That means it can persist identity and behavior across LLMs — swap OpenAI for a local Ollama model and the “mind” stays intact.
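If you're wondering what "append-only and hash-chained" means in practice, here's a minimal sketch assuming a single SQLite table; the schema and function names are my illustration, not PMM's actual ones:

```python
import hashlib, json, sqlite3, time

def init_store(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS events (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        ts REAL NOT NULL,
        kind TEXT NOT NULL,        -- e.g. commit_open, task_created, evidence
        payload TEXT NOT NULL,     -- JSON blob
        prev_hash TEXT NOT NULL,
        hash TEXT NOT NULL)""")
    return conn

def append_event(conn: sqlite3.Connection, kind: str, payload: dict) -> str:
    """Append an event whose hash covers the previous event's hash."""
    row = conn.execute("SELECT hash FROM events ORDER BY id DESC LIMIT 1").fetchone()
    prev_hash = row[0] if row else "genesis"
    body = json.dumps({"kind": kind, "payload": payload, "prev": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(body.encode()).hexdigest()
    conn.execute(
        "INSERT INTO events (ts, kind, payload, prev_hash, hash) VALUES (?, ?, ?, ?, ?)",
        (time.time(), kind, json.dumps(payload), prev_hash, digest),
    )
    conn.commit()
    return digest
```

Because each row's hash covers the previous row's hash, you can't quietly rewrite history without breaking every hash that comes after it.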
In simple terms: PMM is an AI that remembers, stays consistent, and slowly develops a self-referential identity over time.
Right now the evolution of its "identity" is slow, for stability and testing reasons, but it works.
I’d love feedback on:
- What you’d want from an “AI mind-layer” like this.
- Whether the probes (metrics, pass-rate, evidence ratio) surface the right signals.
- How you’d imagine using something like this (personal assistant, embodied agent, research tool?).
I'm doing something similar, so some thoughts:
1. I really like the "commitment" concept. That solves a real conversational problem where the AI can be too easy to redirect, moving on too fluidly from previous conversational beats. And the AI will easily make commitments that it can't or won't keep, so tracking them is good.
2. Reflection is a good approach. I think this is generally in the zone of "memory", though a more neutral term like insight or observation can be better for setting expectations. There's a lot of systems that are using explicit memory management, with tools to save or load or search memories, and I don't think that's very good. I include both techniques in my work because sometimes the AI wants to KNOW that it has remembered something. But maybe the commitment idea is a better way to think about it. Reflection lets the memory be built from a larger context. And usually the peak moment when a memory would be explicitly stored isn't actually the final moment, and so a reflective memory will be more nuanced and correct.
3. It's good to create a model for personality. I should probably be more explicit in my own work, though I guess I focus mostly on behavioral aspects: how the AI should act toward the user, not what the AI's "identity" is. But generally I don't trust scores. A score implies a rubric already embedded in the model, and to the degree that even exists, the rubric is unstable, not portable between models, and changes can be arbitrary. Instead I like to use terms that imply the rubric. So if you take the Big Five, I'd create terms for each attribute and score and use those terms exclusively, ignoring numbers entirely. For instance, for neuroticism you might have Unflappable → Even-keeled → Sensitive → Reactive → Vulnerable (rough sketch of the mapping after this list).
4. I can't tell if Emergence Metrics are prescriptive or descriptive. I'm guessing it's actually unclear in the implementation as well. The AI can pretend to be all kinds of things, but I think you are trying to get past just pretend.
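To make point 3 concrete, here's the kind of mapping I mean. The tier names come from my neuroticism example; the 0-1 thresholds are arbitrary and only for illustration:

```python
# Illustrative tiers for one Big Five trait; thresholds are arbitrary.
NEUROTICISM_TIERS = [
    (0.2, "Unflappable"),
    (0.4, "Even-keeled"),
    (0.6, "Sensitive"),
    (0.8, "Reactive"),
    (1.0, "Vulnerable"),
]

def describe_neuroticism(score: float) -> str:
    """Map a raw 0-1 score onto a named tier."""
    for upper_bound, term in NEUROTICISM_TIERS:
        if score <= upper_bound:
            return term
    return NEUROTICISM_TIERS[-1][1]
```

The model only ever sees the term, so the rubric travels with the words instead of hiding in a number.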
Thanks for commenting! This is super helpful framing.
Here's where my thinking is going (I could be totally wrong, but this is new ground for me):
You nailed the problem on commitments. A lot of AIs will say “I’ll do X” and then immediately let the thread drift. PMM logs those as commit_open events and tracks them as promises. They don’t close unless there’s actual evidence (a file, a PR link, or at minimum a Done: marker that gets picked up by the BehaviorEngine).
That’s why my close rates look brutally low right now. I’d rather see a truthful 0.000% than a fake 100% “done.”
Over time, the evidence hooks should help close more loops, but always with proof. Or at least that's what I'm trying to nail down. lol
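Roughly, the close gate I'm aiming for looks like this. It's a sketch with made-up names and a made-up threshold, not the real implementation:

```python
# Sketch only; field names and the threshold value are placeholders.
CLOSE_CONFIDENCE_THRESHOLD = 0.8

def maybe_close_commitment(commitment: dict, evidence_events: list[dict]) -> bool:
    """Close a commitment only when linked evidence clears the threshold,
    never on the model's say-so alone."""
    linked = [e for e in evidence_events if e.get("commit_id") == commitment["id"]]
    if not linked:
        return False  # a truthful "still open" beats a fake "done"
    confidence = max(e.get("confidence", 0.0) for e in linked)
    return confidence >= CLOSE_CONFIDENCE_THRESHOLD
```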
I went with “reflection” because it emphasizes the recursive/self-referential aspect, but “insight” or “observation” might be clearer. Functionally, it’s closer to what you described, building memory from a broader context, rather than snap-shotting a single moment.
The personality scores are just a blunt tool at the moment. Right now I’m using IAS/GAS metrics as scaffolding, but I don’t think numbers are the endgame. I’m leaning toward descriptors, or tiers within each trait, as stable representations of its states. The question is, how far down do I nest?
The emergence metrics are supposed to be descriptive. I’m trying to measure what’s happening, not tell the model what it should become. In early runs, they’re mostly flat, but the hope is that with continuity and reflection, I'll see them drift in ways that track identity change over time.
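To give a sense of what “descriptive” means here, one toy drift signal (not PMM's actual metric) is just the distance between successive trait snapshots:

```python
import math

# Toy illustration only; not PMM's actual emergence metric.
def trait_drift(prev: dict[str, float], curr: dict[str, float]) -> float:
    """Cosine distance between two trait snapshots: 0.0 means no change,
    larger values mean the self-model has moved."""
    keys = sorted(set(prev) | set(curr))
    a = [prev.get(k, 0.0) for k in keys]
    b = [curr.get(k, 0.0) for k in keys]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 0.0
```

Tracked per reflection, a flat series means nothing is changing; a slow, steady climb is the kind of slow-burn drift I'm hoping to see.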
If I'm being completely honest, this is a thought experiment being fleshed out. How can I create a personal AI that's model-agnostic, portable, and develops in a way that stays aligned with and personalized to the person using it?
So far, things seem to be tracking in the right direction from what I can see. Either that, or I'm constructing the world's most amazing AI confabulation LARP machine. :)
Either way, I'm pulling my hair out in the process.