Discord Bot · Local AI · Persistent Memory

PROJECT MAKI

A local AI chatbot with persistent memory,
character depth, and a Discord home.

About the Project

What Is Maki?

Maki started as a learning project for local LLM inference via Ollama. What began as curiosity about running models without cloud dependencies evolved into a character-driven Discord bot with persistent per-user memory, a personality that genuinely develops over time, and two fully independent personas.

2 personas
Local inference only
Per-user persistent memory
Self-correcting responses
Node.js v22 · discord.js v14 · Ollama · Gemma 4 E4B · JSON Storage · RTX 4060

Core Systems

How It Works

🧠

Memory System

Every user gets their own JSON file. After each reply, two background LLM passes extract structured facts: one for what the user revealed, one for what Maki disclosed about herself. Each fact carries a weight tier. Core facts persist indefinitely, recent facts have a 30-day TTL, and facts past their TTL are marked stale so they no longer clutter the context.

{
  "userId": "182734...",
  "familiarity": 23,
  "lastSeen": "2025-11-14T22:47:00Z",
  "history": [ /* rolling 20-turn window */ ],
  "userFacts": [
    {
      "fact": "Works as a software engineer",
      "weight": "core",
      "addedAt": "2025-10-02"
    },
    {
      "fact": "Recently moved apartments",
      "weight": "recent",
      "ttl": "2025-12-14"
    },
    {
      "fact": "Mentioned a bad day at work",
      "weight": "stale",
      "expiredAt": "2025-10-18"
    }
  ],
  "selfFacts": [
    {
      "fact": "Mentioned disliking crowded spaces",
      "weight": "core"
    }
  ]
}
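
The source does not show the tier lifecycle in code. Below is a minimal sketch of how a recent fact could be demoted once its TTL passes and how stale facts could be kept out of the prompt. It assumes the field names from the sample file above; refreshFacts() and factsForContext() are hypothetical names, not functions from the project.

```javascript
// Hypothetical helpers: field names mirror the sample file above,
// but the project's real demotion logic is not shown in the source.
function refreshFacts(facts, now = new Date()) {
  const today = now.toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
  return facts.map((f) => {
    // ISO dates compare correctly as plain strings.
    if (f.weight === "recent" && f.ttl && f.ttl < today) {
      const { ttl, ...rest } = f;
      return { ...rest, weight: "stale", expiredAt: ttl };
    }
    return f;
  });
}

// Only core and recent facts are passed on to the system prompt.
function factsForContext(facts) {
  return facts.filter((f) => f.weight !== "stale").map((f) => f.fact);
}

const sampleFacts = [
  { fact: "Works as a software engineer", weight: "core", addedAt: "2025-10-02" },
  { fact: "Recently moved apartments", weight: "recent", ttl: "2025-12-14" },
];
const refreshed = refreshFacts(sampleFacts, new Date("2025-12-20T00:00:00Z"));
console.log(factsForContext(refreshed)); // only the core fact remains
```

Keeping expired facts in the file as stale entries, rather than deleting them, matches the sample above and leaves an audit trail of what was once known.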
📈

Familiarity System

A numeric score tracks relationship depth per user. Each exchange adds +1; discovering new personal facts adds +2. The score drives how Maki behaves across five tiers — from guarded and minimal at zero to genuinely open at 50+. Scores survive restarts and can be manually adjusted per user.

New (0–4): Polite but closed. Minimal self-disclosure.
Acquaintance (5–14): Slightly warmer. Occasional dry observation.
Comfortable (15–29): Real opinions surface. Subtle humor.
Genuine (30–49): Guards come down. References shared history.
Close (50+): Full character depth. Openly invested.
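
As an illustration, the tier lookup and score update could look like the sketch below. The boundaries come from the tiers listed above; tierFor() and bumpFamiliarity() are hypothetical names, and treating the +2 as a bonus on top of the per-exchange +1 is an assumption the source does not confirm.

```javascript
// Tier boundaries follow the list above; names are illustrative only.
const TIERS = [
  { min: 50, name: "Close" },
  { min: 30, name: "Genuine" },
  { min: 15, name: "Comfortable" },
  { min: 5, name: "Acquaintance" },
  { min: 0, name: "New" },
];

function tierFor(score) {
  // Clamp so a manually edited negative score still maps to "New".
  const s = Math.max(0, score);
  return TIERS.find((t) => s >= t.min).name;
}

// +1 per exchange, +2 when the extraction pass found new personal facts.
// Assumption: the +2 is added on top of the +1 rather than replacing it.
function bumpFamiliarity(score, foundNewFacts) {
  return score + 1 + (foundNewFacts ? 2 : 0);
}

console.log(tierFor(23)); // "Comfortable"
console.log(tierFor(bumpFamiliarity(29, true))); // 32 -> "Genuine"
```

Ordering the table from highest to lowest threshold lets a single find() call resolve the tier without range checks.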
🕰️

Time Context

Every system prompt receives the current time of day and how long it has been since the user last spoke. Maki's tone shifts across six mood windows: tired and less filtered before 6am, quiet and more honest after 10pm. Time-since-last-seen is injected in natural language: “it's been about three weeks.”

🌙 Before 6am: Tired, less filtered
🌅 6–9am: Alert, slightly curt
☀️ 9am–5pm: Default tone
🌆 5–8pm: Slightly relaxed
🌇 8–10pm: More conversational
🌃 After 10pm: Quiet, more honest
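
A rough sketch of the two values injected into the prompt is shown below. The window boundaries come from the list above; moodWindow() and describeGap() are hypothetical names, and the rounding rules (and the use of digits rather than spelled-out numbers) are assumptions.

```javascript
// Window boundaries follow the list above; function names are hypothetical.
function moodWindow(hour) {
  if (hour < 6) return "Tired, less filtered";
  if (hour < 9) return "Alert, slightly curt";
  if (hour < 17) return "Default tone";
  if (hour < 20) return "Slightly relaxed";
  if (hour < 22) return "More conversational";
  return "Quiet, more honest";
}

// Rough natural-language gap; the rounding rules are an assumption.
function describeGap(lastSeen, now = new Date()) {
  const minutes = Math.floor((now - new Date(lastSeen)) / 60000);
  if (minutes < 60) return "it's been less than an hour";
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `it's been about ${hours} hour${hours === 1 ? "" : "s"}`;
  const days = Math.floor(hours / 24);
  if (days < 14) return `it's been about ${days} day${days === 1 ? "" : "s"}`;
  return `it's been about ${Math.round(days / 7)} weeks`;
}

console.log(moodWindow(23)); // "Quiet, more honest"
console.log(describeGap("2025-11-14T22:47:00Z", new Date("2025-12-05T22:47:00Z")));
// "it's been about 3 weeks"
```

In a bot, the hour would typically come from new Date().getHours() in whichever timezone the persona is meant to live in.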
🔄

Loop Detection & Self-Correction

detectLoop() compares the latest assistant reply against the three before it and flags a loop when any pair is more than 80% similar. When a loop is detected, correctLoop() fires a second LLM pass with an explicit instruction to vary the reply. The user never sees the repeated reply; only the corrected response is committed to history.

// Compare the latest assistant reply with the previous three (>80% similarity).
// similarity() is assumed to return a score between 0 and 1.
function detectLoop(history, threshold = 0.8) {
  const recent = history
    .filter(m => m.role === "assistant")
    .slice(-4);

  for (let i = 0; i < recent.length - 1; i++) {
    const sim = similarity(
      recent[i].content,
      recent[recent.length - 1].content
    );
    if (sim > threshold) return true;
  }
  return false;
}

// Fire a correction pass — user never sees the looped reply
async function correctLoop(history, persona) {
  const result = await ollama.chat({
    ...persona.options,
    messages: [
      ...history,
      {
        role: "user",
        content: "[INTERNAL] Your last reply was too similar " +
          "to a recent one. Please vary your response."
      }
    ]
  });
  return result.message.content;
}
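
detectLoop() relies on a similarity() helper that the source does not show. One possible stand-in, assuming a simple word-overlap (Jaccard) metric, is sketched below; the project may well use a different measure, such as edit distance or character bigrams.

```javascript
// Stand-in for the similarity() helper that detectLoop() calls.
// Assumption: a Jaccard index over lowercased word tokens; the real
// project's metric is not shown in the source.
function similarity(a, b) {
  const tokens = (s) => new Set(s.toLowerCase().match(/[\p{L}\p{N}']+/gu) ?? []);
  const setA = tokens(a);
  const setB = tokens(b);
  if (setA.size === 0 && setB.size === 0) return 1; // two empty replies match
  let shared = 0;
  for (const t of setA) if (setB.has(t)) shared++;
  return shared / (setA.size + setB.size - shared); // 0 (disjoint) to 1 (same words)
}

console.log(similarity("I'm fine, thanks.", "I'm fine, thanks!")); // 1
console.log(similarity("a b c d", "a b c e")); // 0.6
```

A word-set metric ignores punctuation and word order, which suits catching replies that are reworded only trivially. It would be a poor fit for unsegmented Japanese text, where a character-level metric is the safer choice.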
👥

Dual Persona Architecture

Maki and Yuki are completely independent. Separate memory directories, separate fact stores, separate familiarity scores. Each persona lives in personalities/ with its own persona.js config. Per-channel routing via environment variables lets each Discord channel be assigned a different character.

Maki
  • 35, Tokyo → Seattle
  • Infrastructure engineer
  • Reserved, dry humor
  • Warms up slowly
  • maki/memory/
vs
Yuki
  • 26, Osaka → Tokyo
  • Graphic designer
  • Openly excited
  • Warm from message one
  • yuki/memory/
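
The routing itself is not shown in the source. A minimal sketch of per-channel persona selection follows, assuming one comma-separated environment variable per persona. The variable names MAKI_CHANNEL_IDS and YUKI_CHANNEL_IDS, the helper names, and the per-user file name are all hypothetical.

```javascript
// Hypothetical variable names: the source only says routing is driven
// by environment variables, not what they are called.
function buildChannelMap(env) {
  const map = new Map();
  for (const persona of ["maki", "yuki"]) {
    const raw = env[`${persona.toUpperCase()}_CHANNEL_IDS`] ?? "";
    for (const id of raw.split(",").map((s) => s.trim()).filter(Boolean)) {
      map.set(id, persona);
    }
  }
  return map;
}

function personaForChannel(map, channelId) {
  return map.get(channelId) ?? null; // null: no persona assigned to this channel
}

// Each persona reads and writes only its own directory.
// Assumption: one <userId>.json file per user under personalities/<persona>/memory/.
function memoryPath(persona, userId) {
  return `personalities/${persona}/memory/${userId}.json`;
}

const channels = buildChannelMap({ MAKI_CHANNEL_IDS: "111, 222", YUKI_CHANNEL_IDS: "333" });
console.log(personaForChannel(channels, "333")); // "yuki"
console.log(memoryPath("maki", "182734")); // "personalities/maki/memory/182734.json"
```

In the bot, buildChannelMap(process.env) would run once at startup, and each incoming message's channel ID would select both the persona config and the memory directory, so the two characters never share state.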

Development History

Dev Timeline

Nine phases from a Mistral 7B experiment to a production-grade character bot.

01

Project Kickoff

  • discord.js v14 with Mistral 7B as the initial LLM
  • Rolling conversation history with username injection into every prompt
  • Basic slash commands: /clear, /reset, /status
02

Personality Build

  • System prompt design with detailed speech style guide
  • System prompt injection fix — persona now active on every message
  • Initial character lore: Tokyo background, reserved demeanor
03

Model Problems

  • Mistral 7B repeated itself and looped on long contexts
  • Hallucinated user opinions and invented facts without prompting
  • Researched candidate models: LLaMA 3, Qwen3, Phi-3
04

Model Upgrade

  • Migrated to Qwen3 8B — significant quality improvement immediately
  • Discovered the think option for internal chain-of-thought reasoning
  • Fixed non-Latin script bleed affecting Japanese text output
05

Memory System

  • Per-user JSON files: rolling history, extracted facts, lastSeen, score
  • Two background LLM extraction passes per reply — user facts + self-facts
  • Fact weight tiers: core (stable), recent (30-day TTL), stale (expired)
06

Familiarity System

  • Numeric score: +1 per exchange, +2 when new personal facts found
  • Five relationship tiers that change how the persona behaves
  • Scores persist across sessions and are manually editable
07

Loop Detection

  • detectLoop() scans recent assistant turns for >80% similarity
  • correctLoop() fires a second LLM pass — user never sees the repeated reply
  • Added repeat_penalty parameter tuning to Ollama config
08

Character Depth

  • Full backstory: brother Naota, Seattle transplant, infrastructure career
  • Fixed continuity bug where Maki contradicted earlier self-disclosures
  • Addressed prose flattening — response length variance improved
09

Gemma 4 E4B + Yuki

  • Edge-optimised Gemma 4 E4B with internal reasoning layer
  • Built Yuki as a fully independent persona with separate memory directory
  • UI model switcher + per-channel persona routing via environment variables

Conversation Samples

Screenshots

Real conversation samples coming soon.


Explore the Source

Built with Node.js, discord.js v14, and Ollama on an RTX 4060.