Articles
-
Guides
How Does OpenClaw Work? A Guided Tour of the Lobster Assistant
OpenClaw (formerly Clawdbot, then Moltbot) is a personal AI assistant that runs on your devices. Here is the architecture: the Gateway control plane, channels, sessions, tools, skills, and the agent loop.
-
Guides
Shipping Safe Tooling: Schemas, Validation, and Failure Modes in Tool Calling
A production guide to tool calling safety: designing tight tool contracts, validating outputs, limiting agency, and handling retries, idempotency, and audit logs for tool-using agents.
-
Guides
The Return of RAG in 2026
RAG is back in 2026 because long context did not solve freshness, permissions, or reliability. Modern RAG looks like search engineering: hybrid retrieval, reranking, and tight evals.
-
Opinion
Why Frontier Models Are Getting More Restrictive
Moderation is no longer a thin filter on top of a chatbot. For frontier labs, it is becoming an end-to-end product and risk system shaped by capability jumps, regulation, and enterprise expectations.
-
Guides
LLM Evals for Chat and Tool-Using Agents: A Practical Guide to Test Suites and Graders
A production-first guide to evaluating chat assistants and tool-using agents with a small, reliable eval suite: datasets, grader types, flake reduction, and CI gates.
-
Guides
Voice Pipelines vs Speech-to-Speech Models: What to Ship for Voice Agents
A practical comparison of cascaded voice pipelines (ASR→LLM→TTS) versus speech-to-speech models, with examples, trade-offs, and code patterns.
-
Guides
OpenAI Codex CLI vs Claude Code: A Practical Harness Comparison for Real Repos
Claude Code is the safe bet. Codex CLI came in strong and is already close in UX. Here is the harness-level comparison: approvals, sandboxing, context, and extensibility.
-
Guides
The LLM Cost and Scaling Playbook: Cut Your Bill Without Killing Quality
A practical, production-first guide to reducing LLM spend with model routing, token discipline, caching, batching, and rate-limit-aware throughput.
-
Opinion
Stop Defaulting to Python for LLM Apps
If streaming is the default UX, TypeScript is the pragmatic default stack.