From Tools to Orchestration: Building My Personal AI OS on Telegram

For a long time, the problem wasn’t a lack of tools, but an excess of them. Tasks in Notion, emails in Gmail, appointments in the calendar—personal context scattered across a thousand places. I built this Telegram assistant to reduce that fragmentation, turning the app into a single interface for operating my daily routines.

Instead of creating yet another dashboard, I used a channel that is already part of my daily life. The idea is simple: I send a natural message, the system understands the intent, executes the right tools, and returns a response with the proper context.

Why I Didn't Use OpenClaw

Many have asked why I didn't use existing frameworks like OpenClaw or similar alternatives. The answer comes down to total control and predictability.

As the project is for personal use and critical operations, I chose to build my own runtime from scratch to ensure:

  • Fine-Grained Token Management: Exact control over consumption in every interaction, limiting history length to avoid unnecessary costs.
  • LLM Orchestration: Full granularity over API calls, without the "black box" of third-party abstractions.
  • Bespoke Memory: Implementation of a custom memory architecture, separating short-term transactional history from long-term durable context.
  • Security and Isolation: Absolute control over authentication flows and credential storage.

What the Project Solves in Practice

The goal was never a generic chatbot, but a personal operator capable of acting across multiple domains. Today, it centralizes:

  • Organization: Capturing and prioritizing tasks and notes in Notion.
  • Google Workspace: Calendar management and searching, reading, or sending emails (Gmail).
  • Finance & Health: Logging and analyzing expenses, meals (with LLM-generated macro/calorie estimates), and exercises in Notion, as well as metabolism tracking.
  • Information: News summaries (via RSS) and contact management.
  • Multimodality: Text interactions, document reading, and automatic audio transcription via Whisper (gpt-4o-mini-transcribe).

The Architecture: Isolated Modules

The golden rule was to keep the interface, runtime, and external integrations strictly separated. Looking at the code, the core components are structured as follows:

  • run.py and telegram_bot.py: Entrypoint and Telegram interface.
  • assistant_connector/: The heart of the project, containing the agentic runtime, tool catalog, and SQLite persistence. Everything is guided by configurations in agents.json.
  • Domain Connectors: Isolated folders (notion_connector, gmail_connector, calendar_connector, openai_connector).
  • google_auth_server.py: A background HTTP server dedicated exclusively to processing Google OAuth2 callbacks.

Project Architecture

The Runtime Loop

When a message arrives, processing is handed over to runtime.py. The model doesn’t "reply" immediately; it coordinates operations. The loop works like this:

  1. Receive the message and build the context.
  2. Call the model via function calling.
  3. Execute the actual tools.
  4. Feed the results back to the model and repeat until the final action and response are reached.

Runtime flow and user response
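The steps above can be sketched as a minimal agentic loop. The model and tools are stubbed out here with hypothetical names; the real runtime.py drives an LLM function-calling API instead:

```python
# Minimal sketch of the agentic loop (stubbed model and tools; the real
# runtime calls an LLM's function-calling API and richer message types).
def run_loop(user_message: str, model, tools: dict, max_steps: int = 8) -> str:
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        action = model(messages)  # decides: call a tool, or answer
        if action["type"] == "final":
            return action["content"]
        # Execute the requested tool and feed the result back to the model.
        result = tools[action["name"]](**action["arguments"])
        messages.append(
            {"role": "tool", "name": action["name"], "content": str(result)}
        )
    return "Step limit reached."
```

The `max_steps` cap is the safety valve that keeps a confused model from looping forever.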

Tools and Extensibility

The project was designed to be extensible. The tool catalog has no hardcoded intents. To add a new capability, simply declare the metadata in JSON and implement the handler in Python. It is a simplified and focused version of the architecture we use today at Draiven.
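As a rough illustration of that pairing (the names and schema here are hypothetical, not the project's actual catalog format):

```python
import json

# Hypothetical tool declaration, as it might appear in a JSON catalog.
TOOL_SPEC = json.loads("""
{
  "name": "create_task",
  "description": "Create a task in Notion",
  "parameters": {
    "type": "object",
    "properties": {"title": {"type": "string"}},
    "required": ["title"]
  }
}
""")

# Matching Python handler; the runtime dispatches to it by tool name.
def create_task(title: str) -> str:
    return f"Task created: {title}"

HANDLERS = {TOOL_SPEC["name"]: create_task}
```

Because dispatch is driven by the declared name, adding a capability never touches the runtime loop itself.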

Hybrid Memory: SQLite vs. Markdown Persistence

One of the biggest differentiators of this project is the memory scheme. Instead of a single database, I divided the assistant's knowledge into two layers:

1. Transactional Memory (SQLite)

SQLite acts as the "short-term cortex." It is fast, structured, and ideal for:

  • Temporary History: Recent conversation contexts.
  • Scheduled Tasks: What needs to be triggered via systemd.
  • Security: API keys and credentials encrypted with Fernet.
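The encrypted-credential layer can be sketched roughly like this, assuming the cryptography package; the table and column names are illustrative, not the project's actual schema:

```python
import sqlite3
from cryptography.fernet import Fernet

# Hypothetical sketch: encrypt a credential with Fernet and store it
# per user in SQLite. In practice the key is loaded from a protected
# location, never generated per run.
key = Fernet.generate_key()
fernet = Fernet(key)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credentials (user_id INTEGER, name TEXT, value BLOB)")

def store_credential(user_id: int, name: str, secret: str) -> None:
    token = fernet.encrypt(secret.encode())
    conn.execute("INSERT INTO credentials VALUES (?, ?, ?)", (user_id, name, token))

def load_credential(user_id: int, name: str) -> str:
    (token,) = conn.execute(
        "SELECT value FROM credentials WHERE user_id = ? AND name = ?",
        (user_id, name),
    ).fetchone()
    return fernet.decrypt(token).decode()
```

The point is that plaintext secrets never touch disk; only Fernet tokens do.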

2. Persistent Memory (Markdown)

For long-term context, I use Markdown files. This is where the system truly "learns" about me.

  • Coherent Reading: The system doesn’t read all files all the time. It evaluates the intent of the conversation and decides, for example, to pull info from health.md only when the topic is health.
  • Autonomous Writing and Editing: The system decides on its own when new information is relevant enough to be stored permanently or when it should edit an existing memory record.

Building my own memory system was an incredible experience. Similar memory features already ship in advanced LLM products, but having control over how the bot manages its "life files" brings a level of transparency and security that complex vector databases often hide.

Based on the logic in _select_user_memory_context, the system intelligently selects only the most relevant knowledge to inject into the prompt. As shown in the snippet below, the function scores each memory file against the user’s message by tokenizing the request and checking for keyword overlaps. It applies a manual boost to foundational files like about-me.md, ranks the candidates, and selects only the top matches. This ensures memory retrieval remains lightweight and contextually accurate, preventing the LLM from being overwhelmed by the entire user profile in every interaction.

   def _select_user_memory_context(self, user_message: str, user_memories: dict[str, str]) -> str:
       if not user_memories:
           return ""

       query = str(user_message or "").lower()
       tokens = set(re.findall(r"[a-z0-9à-ÿ_]{3,}", query))
       scored = []
       for file_name, content in user_memories.items():
           sample = f"{file_name.lower()} {content[:1200].lower()}"
           score = 0
           if file_name.lower() in ("about-me.md", "about_me.md", "about-user.md", "about_user.md"):
               score += 1
           score += sum(1 for token in tokens if token in sample)
           if score > 0:
               scored.append((score, file_name, content))

       if not scored:
           first_name = next(iter(user_memories))
           scored = [(1, first_name, user_memories[first_name])]

       scored.sort(key=lambda item: item[0], reverse=True)
       selected = scored[:2]
       chunks = [
           f"### {file_name}\n{self._truncate_text(content, 1400)}"
           for _, file_name, content in selected
       ]
       return self._truncate_text("\n\n".join(chunks), self._max_user_memory_chars)

Security, Credentials, and Deployment

This is where the project stopped being a script and became infrastructure.

For credential management, I avoided the usual pattern of putting user keys in a .env file. The system supports multiple users (authorized by Telegram ID), and each user's credentials are encrypted with Fernet and stored in SQLite.

Setup happens via chat: I can use the /setup command to open an interactive panel or simply send a natural message like configure notion_api_key: secret_xxx.

For Google, the flow is zero-friction: 1. I type /google_auth on Telegram. 2. The bot returns an authorization link. 3. The local google_auth_server.py captures the callback, exchanges the code for the token, and saves it securely in the database.

Credentials and Auth Flow

To ensure the assistant operates safely in the real world, the test suite (with over 80% coverage in core modules) blocks external network calls by default. In production, the assistant runs as a systemd service on Ubuntu, which keeps it stable and restarts it on failure.
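For reference, a systemd unit for this kind of deployment could look like the fragment below. The paths, user, and unit name are purely illustrative, not the project's actual unit file:

```ini
# Hypothetical unit file; paths and names are illustrative.
[Unit]
Description=Personal Telegram assistant
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=assistant
WorkingDirectory=/opt/assistant
ExecStart=/opt/assistant/.venv/bin/python run.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
```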

Conclusion

The most interesting point of this project wasn't just putting a bot on Telegram. It was transforming it into a lightweight and robust personal runtime, with multi-layer memory, real-world task execution, and secure authentication.

In the end, the value isn't in "chatting with AI," but in reducing operational friction. When a quick voice note recorded in traffic turns into a structured task in Notion, a calorie estimate for my lunch, or a long email sent with my confirmation, the assistant stops being a proof of concept and becomes the infrastructure of my routine.