How I Use AI Today
After a brief conversation with my elderly parents, I thought it'd be useful to jot these techniques down.
All of the below is based on my current understanding, and subject to change.
Day-to-day chat
General privacy expectation: less privacy than web search (AI is more intrusive, more subject to data mining, monitoring, and censorship). Users are exposed to levels of pervasive, non-consensual tracking similar to using Facebook, Instagram, and YouTube.
- use AI embedded in search via Brave, Google
- free, anonymous chat with Z.ai via VPN: no memories, and impossible for Zhipu to connect usage with your IP address. Zhipu is outside the reach of US subpoenas, unlike almost all other hyperscalers and free internet service providers (including Brave). As a minnow in China, it cannot compete with the Chinese hyperscalers and must hope for commercial survival via international growth. Unlike Moonshot (the Kimi K family of models), Zhipu (the GLM family of models) offers endpoints on servers located inside China for Chinese users, and in Singapore for non-Chinese users
- Gemini via web interface, iPhone app, all behind a Google login (usage feeds everything Google)
- works really well
- Google is bigger than any single government, so it carries some tyranny resistance but also many avenues for governments and spies to influence it. Somewhat subject to market discipline on privacy, but the market expresses lowest-common-denominator consumer-treatment values
- ChatGPT and Anthropic feel creepy (in very different ways)
- personally, I'm happy to accept Google Takeout as the control plane for my personal information
Research
One magic phrase, used verbatim every time to get more impartial answers: "tldr - steelman - critique"
Research doesn't need orchestration of multiple agents; a single one has always been good enough for my uses
- NotebookLM
- Z.ai via VPN, never logging in
- keeping an eye on Grok as a crazy LLM developing muscles to protect itself from information poisoning and attempts to influence
Interactive / Manual Coding
Cursor - mostly use Composer 1.5 - a nice middle ground between Devin, Moonshot Kimi K2, Claude Sonnet, Codex, Gemini
I can't trust models to implement new things. I 99% trust them to write detailed specs after interactive discussion, specs that I barely read again. I trust individual stages of TDD red-green-refactor loops; the refactor stage is the haziest, a bit of a cargo cult. Unpopular opinion: asking one model to review another model is also a cargo cult
GLM Plan delivers 90% of a Claude subscription at 1/10th the cost; in my usage it has been difficult to hit rate limits, though I am inefficiently vigilant about my consumption
Claude Code - preferred tool for big planning + "Ralph loop". Often run it inside Cursor so I can use familiar methods to see what it is doing, but I push myself to use IDEs less. In the long run, I expect to use coding IDEs like we use disassemblers today. Not that worried about vendor lock-in for now, because Zhipu (and others!) offers a drop-in replacement plan
OpenCode - useful backup to Claude Code, that enforces generality and offsets vendor lock-in even more
skills / hooks / mcp's / frameworks:
- obra superpowers
- gsd
- inspired by OpenSpec
- custom skills with agent personalities called architect-staff-engineer, product-engineer, qa-process, product-frontend-designer
- ad hoc MCPs, built by Claude / GLM itself for specific purposes
- making my own MCPs/skills feels like an inefficient side quest, but I convince myself they can be useful in the future as building blocks / templates
- am paranoid about using others' skills
Agents orchestration
- self-hosted Spacebot + private Discord server
- a swarm of coding agents gets launched for projects broken down into sequential and parallel tasks, which I instruct via a Discord chat interface and a Kanban board
- must fight the tendency to make it about the method (todo lists / organization / Marie Kondo) rather than actually getting things done / achieving delight
- PA style chat (personal trainer, nutritionist, news curator)
- limits of trainer
- doesn't watch you to identify mistakes that might cause injury
- doesn't motivate you like a real person
- doesn't motivate you like a paid service (sunk cost)
- limits of nutritionist
- the point is to do something quick and cheap
- the LLM confidently guesses reasonable-appearing answers
- might be entirely wrong, in ways that a highly-reviewed $10/month iOS app is less likely to be
- limits of "remind me when..."
- agents regularly forget to act, or act based on a different random interpretation of their instructions than they have demonstrated in the past. Oops, sowwy, they say
- actual cron jobs, little bash scripts, little python projects, little bun scripts are deterministic AND basically free rather than pay-per-inference
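The determinism point above can be sketched as a tiny pure decision function, the kind of thing a cron job would call instead of asking an agent to remember (a hypothetical example; the 09:00 schedule and the reminder semantics are made up):

```typescript
// Deterministic "remind me when..." check, meant to be invoked from cron
// (e.g. an hourly crontab entry: 0 * * * * bun run remind.ts).
// Unlike an LLM agent, the same inputs always produce the same decision.
function shouldRemind(now: Date, lastFired: Date | null): boolean {
  const pastNineAM = now.getHours() >= 9;
  const alreadyFiredToday =
    lastFired !== null && lastFired.toDateString() === now.toDateString();
  return pastNineAM && !alreadyFiredToday;
}

// Fires after 09:00 local time, at most once per day:
console.log(shouldRemind(new Date("2026-02-03T10:00:00"), null)); // true
console.log(
  shouldRemind(
    new Date("2026-02-03T10:00:00"),
    new Date("2026-02-03T09:05:00"),
  ),
); // false: already fired today
```

The script itself decides nothing fuzzy; cron supplies the "when", and the function supplies the "whether", both reproducibly and for free.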
- sounding board chat (rabbi, motivational speaker)
- limits of rabbi
- Wizard of Oz veil experience
- is more beholden to the LLM's training than to any instructions you give it
- less likely to challenge you than a confident and experienced rabbi
- more likely to make up stuff (and do so convincingly)
- more likely to soothe than to force you to confront discomfort and grow
- limits of motivational speaker
- agent persona lives in uncanny valley
- like a gym membership, much of the value comes from the act of paying (a costly commitment)
- and the other value is community, which an agent does not provide
- feels similar to the value of watching episodes of Oprah
- private vs public - Ollama Cloud, Cloudflare,
Sensitive data - since context is grabbed by every major LLM provider
- generally feel more comfortable with Anthropic's 30-day retention policy than GPT or Google
- use RunPod serverless instances (also consider Vultr, DO), which scale to zero. These companies sell infrastructure rather than pure inference APIs, and offer useful templates
- pay for Fireworks private inference
- big, cutting-edge open-source models are available and cost-effective
- the company is set up to offer HIPAA, SOC 2, and various ISO certifications
- preferred over OpenRouter and OpenCode Zen, which anonymize but don't preserve privacy
- preferred over Ollama Cloud; Fireworks feels more professional. If Ollama goes down or gets hacked, you feel like: of course, it's a free, contributor-supported community. If Fireworks falls over, that feels like a more surprising outcome
- Venice.ai is in a weird trust-me-bro state. Its current offering sits between the bad state of privacy and the ideal one, and for principled reasons they lean against offering audits to earn trust.
- also, token pricing exposes the buyer to speculative price risk and variance. You may buy inference at a price that pays for itself over 3 years, or 3 months, or 10 years. Such different outcomes!
Data
- have stopped using Upstash Redis and Cloudflare KV for personal projects. Prefer self-hosted bun + sqlite, lmdb, and cloud Momento, which gives access to a sea of data rather than many individual freshwater wells
- still use Neon Postgres regularly, but in a limited way. I question the need for distributed Postgres. Don't love Supabase, but for multi-contributor vibe-coded projects it's probably better than Neon (?)
- bun + sqlite, Render sqlite, Cloudflare D1, self-hosted small postgres instances
- R2 / B2 / Tigris S3 for distributed storage
Other tools
Try again every so often: Codex (inside Cursor) and Gemini CLI, just to see what has changed
I'm about 10% Python, 70% TypeScript, 20% vanilla HTML + CSS. Astral uv and bun are awesome and highly agent-friendly. mypy is way more annoying to incorporate into an agentic workflow than tsc -b --noEmit