From Tools to Orchestration: Building My Personal AI OS on Telegram
For a long time, the problem wasn’t a lack of tools, but an excess of them. Tasks in Notion, e-mails in Gmail, appointments in the calendar—personal context scattered across a thousand places. I created this Telegram assistant to reduce that fragmentation, turning the app into a single interface to operate my daily routines.
Instead of creating yet another dashboard, I used a channel that is already part of my daily life. The idea is simple: I send a natural message, the system understands the intent, executes the right tools, and returns a response with the proper context.
Why I Didn't Use OpenClaw
Many have asked why I didn't use existing frameworks like OpenClaw or similar alternatives. The answer comes down to total control and predictability.
As the project is for personal use and critical operations, I chose to build my own runtime from scratch to ensure:
- Fine-Grained Token Management: Exact control over consumption in every interaction, limiting history length to avoid unnecessary costs.
- LLM Orchestration: Full granularity over API calls, without the "black box" of third-party abstractions.
- Bespoke Memory: Implementation of a custom memory architecture, separating short-term transactional history from long-term durable context.
- Security and Isolation: Absolute control over authentication flows and credential storage.
What the Project Solves in Practice
The goal was never a generic chatbot, but a personal operator capable of acting across multiple domains. Today, it centralizes:
- Organization: Capturing and prioritizing tasks and notes in Notion.
- Google Workspace: Calendar management and searching, reading, or sending emails (Gmail).
- Finance & Health: Logging and analyzing expenses, meals (with LLM-generated macro/calorie estimates), and exercises in Notion, as well as metabolism tracking.
- Information: News summaries (via RSS) and contact management.
- Multimodality: Text interactions, document reading, and automatic audio transcription via Whisper (gpt-4o-mini-transcribe).
The Architecture: Isolated Modules
The golden rule was to keep the interface, runtime, and external integrations strictly separated. Looking at the code, the core components are structured as follows:
- run.py and telegram_bot.py: Entrypoint and Telegram interface.
- assistant_connector/: The heart of the project, containing the agentic runtime, tool catalog, and SQLite persistence. Everything is guided by configurations in agents.json.
- Domain Connectors: Isolated folders (notion_connector, gmail_connector, calendar_connector, openai_connector).
- google_auth_server.py: A background HTTP server dedicated exclusively to processing Google OAuth2 callbacks.

The Runtime Loop
When a message arrives, processing is handed over to runtime.py. The model doesn’t "reply" immediately; it coordinates operations. The loop works like this:
1. Receives the message and understands the context.
2. Calls the model via function calling.
3. Executes the actual tools.
4. Feeds the results back to the model and repeats until it reaches the final action and response.
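The loop above can be sketched in a few lines of Python. Here, call_model stands in for the actual OpenAI function-calling request, and create_notion_task is an illustrative tool name rather than one confirmed from the real catalog:

```python
import json

# Illustrative tool registry: name -> Python handler.
# (In the real project, the catalog is declared in agents.json.)
TOOLS = {
    "create_notion_task": lambda args: {"status": "created", "title": args["title"]},
}

def run_agent_loop(call_model, user_message, max_steps=5):
    """Minimal agentic loop: call the model, execute any requested
    tool, feed the result back, and stop on a plain-text reply."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = call_model(messages)       # e.g. a chat completion with tools=[...]
        if "tool_call" not in reply:
            return reply["content"]        # final answer sent back to Telegram
        name = reply["tool_call"]["name"]
        args = json.loads(reply["tool_call"]["arguments"])
        result = TOOLS[name](args)         # execute the actual tool
        messages.append(
            {"role": "tool", "name": name, "content": json.dumps(result)}
        )
    return "Stopped: too many tool steps."
```

The hard cap on steps is the safety valve: the model coordinates, but the runtime decides when the loop ends.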

Tools and Extensibility
The project was designed to be extensible. The tool catalog has no hardcoded intentions. To add a new capability, simply declare the metadata in JSON and implement the handler in Python. It is a simplified and focused version of the architecture we use today at Draiven.
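The contract can be sketched like this. The JSON field names below mirror the OpenAI function-calling schema, and log_expense is a hypothetical tool; the real agents.json layout may differ:

```python
import json

# Hypothetical catalog entry, declared as data rather than code.
CATALOG_JSON = """
[{"name": "log_expense",
  "description": "Log an expense in Notion",
  "parameters": {"type": "object",
                 "properties": {"amount": {"type": "number"},
                                "category": {"type": "string"}},
                 "required": ["amount", "category"]}}]
"""

HANDLERS = {}

def tool(name):
    """Register a Python handler for a tool declared in the JSON catalog."""
    def wrap(fn):
        HANDLERS[name] = fn
        return fn
    return wrap

@tool("log_expense")
def log_expense(amount, category):
    return f"Logged {amount:.2f} under {category}"

# The declared metadata is what the model sees; the handler runs locally.
catalog = json.loads(CATALOG_JSON)
```

Because the metadata and the handler are the only two moving parts, adding a capability never touches the runtime loop itself.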
Hybrid Memory: SQLite vs. Markdown Persistence
One of the biggest differentiators of this project is the memory scheme. Instead of a single database, I divided the assistant's knowledge into two layers:
1. Transactional Memory (SQLite)
SQLite acts as the "short-term cortex." It is fast, structured, and ideal for:
- Temporary History: Recent conversation contexts.
- Scheduled Tasks: What needs to be triggered via systemd.
- Security: API keys and credentials encrypted with Fernet.
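A minimal sketch of the Fernet-over-SQLite pattern, assuming the cryptography package and an illustrative table layout (the real schema, and especially where the master key lives, will differ):

```python
import sqlite3
from cryptography.fernet import Fernet  # pip install cryptography

# One master key for the deployment; in practice it would be kept
# outside the database (e.g. a root-only file), never alongside it.
master_key = Fernet.generate_key()
fernet = Fernet(master_key)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE credentials (user_id INTEGER, name TEXT, value BLOB)")

def store_credential(user_id, name, secret):
    """Encrypt the secret before it ever touches the database."""
    db.execute("INSERT INTO credentials VALUES (?, ?, ?)",
               (user_id, name, fernet.encrypt(secret.encode())))

def load_credential(user_id, name):
    """Decrypt on read; a dump of the DB alone reveals nothing."""
    row = db.execute(
        "SELECT value FROM credentials WHERE user_id = ? AND name = ?",
        (user_id, name)).fetchone()
    return fernet.decrypt(row[0]).decode() if row else None

store_credential(123, "notion_api_key", "secret_xxx")
```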
2. Persistent Memory (Markdown)
For long-term context, I use Markdown files. This is where the system truly "learns" about me.
- Coherent Reading: The system doesn’t read all files all the time. It evaluates the intent of the conversation and decides, for example, to pull info from health.md only when the topic is health.
- Autonomous Writing and Editing: The system decides on its own when new information is relevant enough to be stored permanently or when it should edit an existing memory record.
Building my own memory system was an incredible experience. Built-in memory already exists in advanced LLM products, but having control over how the bot manages its "life files" brings a level of transparency and security that complex vector databases often hide.
Based on the logic in _select_user_memory_context, the system intelligently selects only the most relevant knowledge to inject into the prompt. As shown in the snippet below, the function scores each memory file against the user’s message by tokenizing the request and checking for keyword overlaps. It applies a manual boost to foundational files like about-me.md, ranks the candidates, and selects only the top matches. This ensures memory retrieval remains lightweight and contextually accurate, preventing the LLM from being overwhelmed by the entire user profile in every interaction.
```python
def _select_user_memory_context(self, user_message: str, user_memories: dict[str, str]) -> str:
    if not user_memories:
        return ""
    query = str(user_message or "").lower()
    tokens = set(re.findall(r"[a-z0-9à-ÿ_]{3,}", query))
    scored = []
    for file_name, content in user_memories.items():
        sample = f"{file_name.lower()} {content[:1200].lower()}"
        score = 0
        if file_name.lower() in ("about-me.md", "about_me.md", "about-user.md", "about_user.md"):
            score += 1
        score += sum(1 for token in tokens if token in sample)
        if score > 0:
            scored.append((score, file_name, content))
    if not scored:
        first_name = next(iter(user_memories))
        scored = [(1, first_name, user_memories[first_name])]
    scored.sort(key=lambda item: item[0], reverse=True)
    selected = scored[:2]
    chunks = [
        f"### {file_name}\n{self._truncate_text(content, 1400)}"
        for _, file_name, content in selected
    ]
    return self._truncate_text("\n\n".join(chunks), self._max_user_memory_chars)
```
Security, Credentials, and Deployment
This is where the project stopped being a script and became infrastructure.
For credential management, I avoided the common practice of putting user keys in a .env file. The system supports multiple users (authorized via Telegram ID), and credentials are encrypted with Fernet and stored in SQLite per user.
Setup happens via chat: I can use the /setup command to open an interactive panel or simply send a natural message like configure notion_api_key: secret_xxx.
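The natural-message path amounts to a small parser. The pattern below is a hypothetical sketch; the grammar the real bot accepts may well be looser:

```python
import re

# Matches chat messages like "configure notion_api_key: secret_xxx".
# (Hypothetical pattern; the real command grammar may differ.)
SETUP_RE = re.compile(r"^configure\s+([a-z0-9_]+)\s*:\s*(\S+)$", re.IGNORECASE)

def parse_setup_message(text):
    """Return (credential_name, value) or None if this isn't a setup message."""
    m = SETUP_RE.match(text.strip())
    return (m.group(1).lower(), m.group(2)) if m else None
```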
For Google, the flow is zero-friction:
1. I type /google_auth on Telegram.
2. The bot returns an authorization link.
3. The local google_auth_server.py captures the callback, exchanges the code for the token, and saves it securely in the database.
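Step 3 boils down to standard OAuth2 authorization-code handling. A sketch of the extraction half (the parameter names follow the OAuth2 spec itself, not necessarily google_auth_server.py verbatim):

```python
from urllib.parse import parse_qs, urlparse

def extract_auth_code(callback_url):
    """Pull the one-time ?code=... out of Google's redirect URL."""
    params = parse_qs(urlparse(callback_url).query)
    return params.get("code", [None])[0]

# The extracted code is then POSTed to https://oauth2.googleapis.com/token
# with client_id, client_secret, redirect_uri and
# grant_type=authorization_code; the returned tokens are what get
# encrypted and stored per user.
```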

To ensure the assistant operates safely in the real world, the test suite (with over 80% coverage in core modules) blocks external network calls by default. In production, it runs as an Ubuntu service (systemd), ensuring stability.
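One common way to enforce such a block is to monkeypatch socket connections, shown here as a plain sketch (the real suite presumably wires this up through a pytest fixture in conftest.py):

```python
import socket

# Capture the real method once so it can be restored.
_real_connect = socket.socket.connect

def block_network():
    """Make every outgoing connection attempt fail loudly."""
    def blocked(self, *args, **kwargs):
        raise RuntimeError("External network calls are disabled in tests")
    socket.socket.connect = blocked

def allow_network():
    """Restore normal networking (e.g. for an explicitly marked test)."""
    socket.socket.connect = _real_connect
```

A test that accidentally hits Gmail or Notion then fails immediately instead of silently touching real accounts.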
Conclusion
The most interesting point of this project wasn't just putting a bot on Telegram. It was transforming it into a lightweight and robust personal runtime, with multi-layer memory, real-world task execution, and secure authentication.
In the end, the value isn't in "chatting with AI," but in reducing operational friction. When a quick voice note recorded in traffic becomes a structured task in Notion, a calorie estimate for my lunch, or a long email sent with my confirmation, the assistant stops being a proof of concept and becomes the infrastructure of my routine.