# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## What this project is

A Vanna 2.0 deployment that lets a user ask natural-language questions in Portuguese and get back SQL results from a ClickHouse Cloud database. **Not** a fork of Vanna: the upstream repo is cloned into `vanna/` and installed editable; the application code lives at the project root.

Wiring: `OpenAILlmService` (LLM) + `ChromaAgentMemory` (local vector store, persisted to `./chroma_db/`) + `RLSClickHouseRunner` (subclass of upstream `ClickHouseRunner` that injects ClickHouse `additional_table_filters` for tenant isolation against the `gold` database in ClickHouse Cloud).

Two ways to query the agent:

- CLI: `python ask.py "your question"` (uses `StaticUserResolver` with IDs from env/flags)
- Web: `uvicorn server:app --port 8765` then embed `<vanna-chat>` (uses `RequestContextUserResolver` reading IDs from query params)

## ⚠️ Vanna-first rule (READ BEFORE WRITING NEW CODE)

**Before adding new code, search `vanna/src/vanna/` and `vanna/frontends/` for an existing solution.** This project deliberately uses upstream Vanna primitives wherever possible; only the items in "Intentional custom code" below are project-specific.

When unsure:

1. `grep -r "the-thing-you-want" vanna/src/vanna/` first.
2. Check `vanna/src/vanna/servers/`, `vanna/src/vanna/integrations/`, `vanna/src/vanna/core/`, `vanna/frontends/webcomponent/`.
3. Read `vanna/src/vanna/examples/claude_sqlite_example.py`, the closest reference for assembly patterns.
4. Only if there's no built-in equivalent, write custom code in the project root and document why in this section.

### What we use from Vanna (do NOT reimplement)

- **Server**: `vanna.servers.fastapi.VannaFastAPIServer` (see `server.py`). Provides `/api/vanna/v2/chat_sse|chat_websocket|chat_poll`, healthcheck, CORS, and RequestContext extraction (cookies + headers + query_params + metadata).
- **Frontend**: `<vanna-chat>` Web Component from `vanna/frontends/webcomponent/`. Build artifact at `vanna/frontends/webcomponent/dist/vanna-components.js` (npm-built; gitignored). Floating button via `starting-state="minimized"`. Renders rich components including Plotly charts, sortable/searchable tables, code blocks, status cards, progress bars.
- **Tools**: `RunSqlTool` + `VisualizeDataTool` from `vanna.tools`. They share a `LocalFileSystem(working_directory="./data_storage")`: SQL writes CSV → viz reads CSV → emits a `chart` rich component that the web component renders via Plotly.
- **Memory tools**: `SearchSavedCorrectToolUsesTool` + `SaveQuestionToolArgsTool` + `SaveTextMemoryTool` from `vanna.tools`. These close the self-learning loop. `ChromaAgentMemory` is now also written by the LLM at runtime (question → `run_sql` args pairs after a successful run), not only by `train.py` (offline schema docs). Usage guidance lives in `system_prompt.py`, in the "Memória" section.
- **Memory**: `vanna.integrations.chromadb.ChromaAgentMemory` for the vector store; `agent_memory.save_text_memory(content, context)` is the canonical write API (used by `train.py`; the LLM uses the `save_text_memory` tool at runtime).
- **User model**: `vanna.User` has `ConfigDict(extra="allow")` (`vanna/src/vanna/core/user/models.py:25`), so we can attach `program_id`/`store_id` as ad-hoc fields without subclassing.
- **Resolver ABC**: `vanna.core.user.UserResolver`, the base class for both `StaticUserResolver` and `RequestContextUserResolver`.
- **RequestContext**: extracted automatically by the server (`vanna/src/vanna/servers/fastapi/routes.py:46-52`); read from `query_params`/`cookies`/`headers`/`metadata` in the resolver.
- **SQL runner base**: `vanna.integrations.clickhouse.ClickHouseRunner`. We subclass it and never modify upstream.
- **CLI server runner**: `vanna.servers.cli.server_runner` (the `vanna serve` command). We don't use it directly, but it's the reference for how to wire FastAPI + frontend bundle.
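
The **User model** and **RequestContext** items above combine in the resolver: it reads `program_id`/`store_id` from `query_params` and attaches them to the user. A minimal sketch of that idea, using stand-in dataclasses instead of the real `vanna.User` and `RequestContext`. The `*Stub` classes, the derived `id` format, and the fail-closed behavior on missing IDs are assumptions for illustration, not the code in `agent.py`.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RequestContextStub:
    """Stand-in for Vanna's RequestContext (hypothetical, query params only)."""
    query_params: Dict[str, str] = field(default_factory=dict)


@dataclass
class UserStub:
    """Stand-in for vanna.User with the two ad-hoc tenant fields attached."""
    id: str
    program_id: str
    store_id: str


def resolve_tenant_user(ctx: RequestContextStub) -> UserStub:
    """Build a tenant-scoped user from the request's query parameters."""
    program_id = ctx.query_params.get("program_id")
    store_id = ctx.query_params.get("store_id")
    # Assumption: refuse to build a user without both tenant IDs, so a
    # request can never reach the SQL runner without an RLS scope.
    if not program_id or not store_id:
        raise ValueError("program_id and store_id query params are required")
    # Assumption: the user id is derived from the tenant pair.
    return UserStub(id=f"{program_id}:{store_id}",
                    program_id=program_id, store_id=store_id)


if __name__ == "__main__":
    ctx = RequestContextStub(query_params={"program_id": "p1", "store_id": "s9"})
    print(resolve_tenant_user(ctx))
```

With the real classes, the same fields would be passed as extra keyword arguments to `vanna.User`, which the `extra="allow"` config permits.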

### Intentional custom code (Vanna has no equivalent)

- `TenantAwareChromaMemory` (`tenant_memory.py`): Vanna's `ChromaAgentMemory` is single-collection, with no scoping. It would leak questions/learnings between tenants (program × store) that share the same deployment. This class composes two instances: a shared collection `vanna_clickhouse_gold` for text memories (schema docs from `train.py`, common to all tenants), and a `vanna_clickhouse_gold__p__s` collection lazily created per (program_id, store_id) for tool-usage memories. Routing is by memory type: text → shared, tool → tenant. `train.py` is unchanged; it writes only text memories to the shared collection.
- `RLSClickHouseRunner` (`rls_runner.py`): Vanna's `ClickHouseRunner` doesn't support per-query settings. We override `run_sql` to inject `additional_table_filters` via `client.query(sql, settings=...)`. The `ToolRegistry.transform_args()` hook is Vanna's "official" RLS extension point but only allows arg rewriting / rejection. It can't reach `clickhouse_connect` settings, so we keep the runner subclass.
- `_FORBIDDEN_SCHEMA_RE` / `_INTROSPECTION_STMT_RE` (`rls_runner.py`): regex guards at the top of `run_sql` that reject SQL against `system.*` / `information_schema.*` and `SHOW`/`DESCRIBE`/`EXPLAIN` statements. ClickHouse Cloud does not enforce column-level REVOKE on `system.tables` (access is inherited from the default role), so we harden app-side. Without this, the LLM read `system.tables.create_table_query` and saw DDL containing revoked columns.
- `_format_table_filter_map` / `_quote` (`rls_runner.py`): `clickhouse_connect` serializes Python dicts as JSON (double quotes) but ClickHouse's Map literal needs single quotes with `''` escape. A hand-built literal is the workaround.
- `_round_decimal_columns` (`rls_runner.py`): after each query, walks the `pd.DataFrame` columns and rounds `Decimal`/`float` values to 2 decimal places (constant `_DISPLAY_DECIMALS`). Needed because ClickHouse returns `Decimal(N, 6)` by default and the dataframe rich component renders via `value.toLocaleString()` (`vanna/frontends/webcomponent/src/components/rich-component-system.ts:624`) with no rounding; without it, values such as `307427.030000` showed up in the table. Patching the upstream renderer would require an npm rebuild and the change would be lost on `git pull`; rounding app-side covers every query without depending on the LLM emitting `round(..., 2)` in the SELECT. Integer/string/datetime columns are left intact (checked via `pd.api.types.is_float_dtype` plus a sample of object-dtype columns). The CSV written by `RunSqlTool` inherits the rounded values; this is fine for `visualize_data` because charts already round for display.
- `StaticUserResolver` / `RequestContextUserResolver` (`agent.py`): Vanna ships only the `UserResolver` ABC, no concrete implementations.
- `system_prompt.py`: Vanna has `DefaultSystemPromptBuilder(base_prompt=...)` but no domain-specific prompt. The `SYSTEM_PROMPT` constant injects pt-BR rules, confidentiality, single-store scope, standardized metrics, and R$/L formatting. Edit the file directly to iterate; no rebuild, no re-train.
- `VisualizeDataToolPT` + `ClubPetroChartGenerator` (`viz_tool.py`): subclass of `VisualizeDataTool` with three customizations over upstream:
  1. A new optional `chart_type` arg (`line`/`bar`/`scatter`/`histogram`/`area`) in the schema (`VisualizeDataArgsPT`). When the LLM passes it, the type is forced via `_render_forced`; when omitted, the heuristic is used as a fallback. The value is threaded through `execute` via a `contextvars.ContextVar` to support concurrent executions of the singleton tool (Vanna runs async).
  2. A PT-BR `description` with triggers per question type (ranking → `chart_type='bar'`, series → `'line'`, etc.).
  3. `ClubPetroChartGenerator` replaces the upstream default `PlotlyChartGenerator`. It drops the "4+ columns → go.Table" fallback (`vanna/src/vanna/integrations/plotly/chart_generator.py:51-55`), which duplicated the dataframe rich component; applies `_coerce_datetime_columns` (ISO string → datetime64) on BOTH paths (<4 and >=4 columns), so time series are detected regardless of column count; and uses its own `_create_ranked_bar_chart` (instead of the upstream `_create_bar_chart`, which re-aggregates with `groupby` and loses the ranking's descending order).
- `RLS_INTERNAL_COLS` (`train.py`): `program_id`/`store_id` must be in the GRANT (RLS depends on them), but we hide them from the ChromaDB docs so the LLM is not tempted to use them manually in queries.
- `train.py`: Vanna has `AgentMemory.save_text_memory` but no schema crawler. It uses only `system.columns` (filtered by ClickHouse via column-level GRANT, with no dependency on `system.tables` or `SHOW CREATE TABLE`) and emits one text memory per table in `RLS_TABLES`. No raw DDL, to avoid leaking revoked columns.
- `ask.py`: Vanna has only the `vanna serve` CLI; no ad-hoc `vanna ask`. This is a thin async wrapper for terminal use.
- `local_request_context()` (`agent.py`): 1-line factory because Vanna has no default `RequestContext` constructor.
- `csv_cleanup.py`: upstream Vanna has no GC for the CSVs that `RunSqlTool` writes to `data_storage/`. `LocalFileSystem.write_file` only writes and never deletes; without cleanup, disk usage grows linearly with traffic. Standalone module with `sweep_once()` (synchronous, idempotent; deletes `query_results_*.csv` files whose mtime is older than `CSV_TTL_SECONDS`, default 1800s) and a periodic asyncio task (`CSV_SWEEP_INTERVAL_SECONDS`, default 600s). `server.py` hooks it in via `on_event("startup"/"shutdown")`: a sweep at boot collects leftovers from previous runs and starts the background loop. The CLI (`ask.py`) does not run cleanup in real time; it relies on the next server boot.
- `static/vanna-embed-bootstrap.js`: single source of the wiring JS required by every `<vanna-chat>` embed:
  - theme pierce (monkey-patches the `adoptedStyleSheets` setter to force `themeSheet` to always be the last item; Lit reassigns after `attachShadow` and would otherwise win the cascade);
  - PT-BR translator via a MutationObserver on every new shadow root;
  - markdown processor for chat bubbles (`escapeHtml` → code → bold/italic → links → **headers BEFORE lists**, because the list regex consumes the trailing `\n` and would break a following header → `\n` → `<br>` → cleanup of `<br>` adjacent to blocks);
  - font loading;
  - bundle injection.

  Served at `/vanna-embed-bootstrap.js` (explicit route in `server.py`). Exposes `window.VannaEmbed.ensureLoaded({ baseUrl, extraCss? }) -> Promise` (idempotent). Before this extraction, the React app (`clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts`) duplicated ~250 identical lines; every fix had to be made in two places, and they diverged (exactly what happened with the markdown headers fix: it only reached the demo, and the React app stayed broken). Now `embed-demo.html` and `vannaChatLoader.ts` are thin wrappers that inject `/vanna-embed-bootstrap.js`.

Fonts are loaded by the bootstrap (`loadFontsOnce`); the client no longer needs to include the Google Fonts `<link>` manually. The bundle (`vanna-components.js`) is also injected by the bootstrap (`injectBundle`).

### Chunk filtering + translation (chat_filter.py)

`FilteringChatHandler` (subclass of `vanna.servers.base.ChatHandler`) is injected into `VannaFastAPIServer.chat_handler` before `create_app()`. It intercepts the `ChatStreamChunk` stream and applies:

1. **`rich.type` whitelist**: drops chunks whose type is not in `ALLOWED_RICH_TYPES = {text, dataframe, chart, status_bar_update, chat_input_update}`. Extra types the agent emits (`status_card`, `task_tracker_update`, `notification`, `log_viewer`, `progress_display`, etc.) disappear before becoming SSE.
2. **PT translation**: hardcoded English strings emitted by `vanna/src/vanna/core/agent/agent.py` (`Response complete`, `Ready for next message`, `Processing your request...`, etc.) are replaced via the exact-match `TRANSLATIONS` table in the `message`, `detail`, and `placeholder` fields of `rich.data`. Strings outside the table (including dynamic ones such as `Running 3 tools`) pass through untouched; add entries as they show up in the chat.

Charts: `ClubPetroChartGenerator` (`viz_tool.py`) replaces the upstream default and drops the 4+ column `go.Table` fallback, so there is no visually duplicated dataframe. The LLM controls the type via the `chart_type` arg of `visualize_data` (line/bar/scatter/histogram/area); when omitted, a heuristic based on the CSV shape is used.

To allow more types or add translations, edit `chat_filter.py`. No bundle rebuild, no upstream changes.

### Web server (server.py)

`server.py` is ~60 lines: `VannaFastAPIServer(agent, config={cors:..., api_base_url:""}).create_app()`, mounts `/static/` to `vanna/frontends/webcomponent/dist/`, and adds static routes for files in the project-root `static/` directory:

- `/vanna-theme.css`: CSS theme adopted in every shadow root.
- `/vanna-embed-bootstrap.js`: the single wiring JS (theme pierce + translator + markdown + bundle loader). Consumed by `embed-demo.html` and by the React app.
- `/clubpetro-logo.{png,svg}`, `/dashboard-bg.png`: assets.
- `/embed-demo.html`: smoke-test page that replaces the `__PROGRAM_ID__` / `__STORE_ID__` / `__USER_ID__` placeholders with values from `.env` at runtime (read fresh on every request, so HTML changes need no restart).

All chat routes (`/api/vanna/v2/chat_sse|chat_websocket|chat_poll`) come from upstream. Do not redefine them.

The web component (`<vanna-chat>`) sends `program_id` / `store_id` as query params on the endpoint URL: `sse-endpoint="/api/vanna/v2/chat_sse?program_id=X&store_id=Y"`. The upstream server populates `RequestContext.query_params` from `dict(http_request.query_params)`, so `RequestContextUserResolver` picks them up.

### Training flow (train.py)

Vanna 2.0 has no separate "training" API. Schema knowledge is injected by saving **text memories** into the same `ChromaAgentMemory` instance the agent reads from. `DefaultLlmContextEnhancer` (auto-wired when no enhancer is passed) retrieves the top-k similar text memories and appends them to the system prompt on every turn.

`train.py` iterates over the tables in `RLS_TABLES`, reads `system.columns` (which ClickHouse already filters by column-level GRANT, so no privilege on `system.tables` is needed), removes the columns in `RLS_INTERNAL_COLS = {"program_id", "store_id"}` to hide them from the LLM context, and saves one text memory per table containing the column list + 3 sample rows (no DDL, no `SHOW CREATE TABLE`).

Re-run after:

- a GRANT change in ClickHouse (a new column appears automatically; a revoked column disappears).
- an edit to `RLS_TABLES`.
- an edit to `RLS_INTERNAL_COLS`.

Sequence: `rm -rf chroma_db/ && python train.py`.

When constructing a `ToolContext` manually (as in `train.py`), the `agent_memory=` field is required by Pydantic.
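
The column-hiding step in the training flow can be sketched as below. This is an illustration under assumptions, not the actual `train.py`: `build_table_memory`, the `(name, type)` column tuples, and the text layout are hypothetical. Only `RLS_INTERNAL_COLS` and the "column list + 3 sample rows, no DDL" shape come from this document.

```python
from typing import Dict, List, Sequence, Tuple

# Columns required by the GRANT for RLS but hidden from the LLM context.
RLS_INTERNAL_COLS = {"program_id", "store_id"}


def build_table_memory(
    table: str,
    columns: Sequence[Tuple[str, str]],   # (name, type) pairs, e.g. from system.columns
    sample_rows: Sequence[Dict[str, object]],
    max_samples: int = 3,
) -> str:
    """Render one text memory for a table, omitting RLS-internal columns."""
    visible = [(name, typ) for name, typ in columns
               if name not in RLS_INTERNAL_COLS]
    names = [name for name, _ in visible]

    lines: List[str] = [f"Table {table}", "Columns:"]
    lines += [f"- {name} ({typ})" for name, typ in visible]
    lines.append("Sample rows:")
    for row in sample_rows[:max_samples]:
        # Only visible columns are rendered, so tenant IDs never reach the doc.
        lines.append(", ".join(f"{n}={row[n]!r}" for n in names if n in row))
    return "\n".join(lines)


if __name__ == "__main__":
    doc = build_table_memory(
        "gold.sales",
        [("program_id", "String"), ("store_id", "String"), ("produto", "String")],
        [{"program_id": "p1", "store_id": "s9", "produto": "GASOLINA"}],
    )
    print(doc)
```

The resulting string is what would be handed to `agent_memory.save_text_memory(content, context)`.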

### CLI (ask.py)

`argparse` with `--program-id` / `--store-id` flags + positional question. Calls `build_agent(program_id, store_id)`.

`Agent.send_message` is an async generator yielding `UiComponent` objects. Each component has both a `rich_component` (structured for the web UI) and a `simple_component` (text fallback). The CLI prefers `rich_component.content` and falls back to `simple_component.text`.

## Non-obvious gotchas

- **`ToolRegistry` API**: use `register_local_tool(tool, access_groups=[])`. There is no plain `.register()` method despite what some upstream examples suggest.
- **Empty `access_groups=[]`** means the tool is accessible to all users; non-empty is a permission allowlist.
- **`additional_table_filters` serialization**: `clickhouse_connect` serializes Python `dict` settings as JSON (double quotes), which ClickHouse rejects with `CANNOT_PARSE_QUOTED_STRING (code 26)`. The Map value must be passed as a pre-formatted string literal with single quotes and `''` escape; see `_format_table_filter_map` in `rls_runner.py`. Don't pass a `dict` to `client.query(..., settings={"additional_table_filters": {...}})`.
- **ChromaDB first run** downloads a ~80 MB ONNX embedding model to `~/.cache/chroma/`. Cached afterwards. If running offline, pre-download or pass a custom `embedding_function`.
- **Editable extras on macOS**: `pip install -e './vanna[clickhouse]'` fails with `non-local file URIs are not supported`. Install the editable package and the extras separately.
- **Python 3.9** triggers deprecation warnings from `clickhouse-connect` and `urllib3`/LibreSSL. Not blocking. If upgrading the Python interpreter, recreate the venv.
- **`SELECT *` on `gold.sales` fails with ACCESS_DENIED**: `wren_ia` has a column-level GRANT, so `SELECT *` requires SELECT on every column (including the 9 revoked ones) and ClickHouse rejects it. List columns explicitly. (Before the column-level GRANT, `RunSqlTool` saved the full result to `./data_storage//query_results_*.csv`; that flow remains valid for queries with explicit columns.)
- **Column-level REVOKE on `system.*` is silently ignored in ClickHouse Cloud**: `REVOKE SELECT(create_table_query) ON system.tables FROM wren_ia` returns "succeeded" at the parser level, but at runtime the read is still allowed. The real defense is app-side, via the regex guards in `RLSClickHouseRunner`.
- **Upstream examples are not all current**: `openai_quickstart.py` imports `OpenAILlmService` from `vanna.integrations.anthropic` (a bug); `mock_sqlite_example.py` calls `agent.send_message(user=...)` but the real signature now takes `request_context=`. Trust `claude_sqlite_example.py` and `vanna/src/vanna/core/agent/agent.py` over the others.

## Database

ClickHouse Cloud, database `gold`, accessed over HTTPS port 8443 with `secure=true`.

**Trained table**: `gold.sales`, the analytical sales fact table (data: fuel/retail, `CLUBE ALE` / `POP FIDELIDADE`). Filtered by `program_id` + `store_id` per request via RLS.

**Column-level GRANT on `wren_ia`**: 20 of the 29 original columns:

- `program_id`, `store_id`: required for RLS to work (but hidden from the LLM via `RLS_INTERNAL_COLS`).
- `sale_id`, `cartao_do_cliente`, `nome_da_rede`, `nome_da_loja`, `tipo_loja`, `categoria_loja`, `nome_do_cliente`, `categoria_do_cliente`, `nome_do_atendente`, `produto`, `categoria_do_produto`, `quantidade_total_produto`, `valor_total_produto`, `desconto`, `pontuacao_produto`, `voucher_aplicado_venda`, `data_da_compra`, `fidelizada`, `e_combustivel`: visible to the LLM (18).

9 REVOKED columns (they do not appear in `system.columns` and raise `ACCESS_DENIED` if the LLM tries to use them): `customer_id`, `customer_category_id`, `sale_product_id`, `product_id`, `product_category_id`, `attendant_id`, `rfid_do_atendente`, `sync_date`.

Check current grants via `SHOW GRANTS FOR wren_ia` in the ClickHouse Cloud console (admin) or `SELECT * FROM system.grants WHERE user_name = 'wren_ia'`.

**Denied table**: `gold.vw_relatorios_exportaveis_analitico_vendas`, an exportable view; **denied** at runner level (`'0'` filter) and at ClickHouse grant level (no SELECT).

Credentials live in `.env` (gitignored). RLS values too: `RLS_PROGRAM_ID`, `RLS_STORE_ID` (CLI defaults). Optional: `OPENAI_TEMPERATURE` (default 1.0). Both `agent.py` and `train.py` call `load_dotenv()` at import.

## Where to look in upstream

Server / frontend:

- `vanna/src/vanna/servers/fastapi/app.py`: `VannaFastAPIServer` factory
- `vanna/src/vanna/servers/fastapi/routes.py`: chat SSE/WS/poll endpoints; auto-extracts `RequestContext` from cookies/headers/query_params (lines 46-52)
- `vanna/src/vanna/servers/base/templates.py`: index HTML used by `GET /` (login + `<vanna-chat>` embed)
- `vanna/frontends/webcomponent/src/components/vanna-chat.ts`: Web Component attributes (`api-base`, `sse-endpoint`, `starting-state`, `theme`, etc.)
- `vanna/frontends/webcomponent/src/services/api-client.ts`: confirms URL construction is a naive `${baseUrl}${endpoint}` concat (so query strings on `sse-endpoint` work)

Core:

- `vanna/src/vanna/core/agent/agent.py`: `Agent` class, `send_message` signature, required init params
- `vanna/src/vanna/core/registry.py`: `ToolRegistry`; the `transform_args()` hook (lines 113-142) is the "official" RLS extension point but only handles arg rewriting / `ToolRejection`, not query settings
- `vanna/src/vanna/core/system_prompt/default.py:47-48`: `DefaultSystemPromptBuilder.build_system_prompt`; when `base_prompt` is non-null it is returned directly, discarding Vanna's default prompt.
- `vanna/src/vanna/core/user/{models,resolver,request_context}.py`: `User` (with `extra="allow"`), `UserResolver` ABC, `RequestContext`
- `vanna/src/vanna/integrations/{openai,chromadb,clickhouse}/`: integrations this project uses
- `vanna/src/vanna/integrations/plotly/chart_generator.py:51`: the default generator's "4+ columns → go.Table" heuristic. We replace it via `ClubPetroChartGenerator` in `viz_tool.py`.
- `vanna/src/vanna/tools/run_sql.py`: `RunSqlTool` behavior (truncation, CSV side-output)
- `vanna/src/vanna/examples/claude_sqlite_example.py`: closest working reference for the assembly pattern
- `vanna/MIGRATION_GUIDE.md`: only relevant if migrating Vanna 0.x code (not used here)
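
As a closing reference for the `additional_table_filters` serialization gotcha: the Map value has to be a single-quoted literal with `''` escaping rather than a JSON-encoded dict. A minimal sketch of that formatting follows. It is not the code in `rls_runner.py`; the function names `quote_ch`, `format_table_filter_map`, and `build_tenant_filters`, as well as the exact filter expression, are assumptions for illustration. It also assumes the inputs contain no backslashes, which ClickHouse string literals treat as escape characters.

```python
from typing import Dict


def quote_ch(value: str) -> str:
    """Wrap a string in single quotes, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def format_table_filter_map(filters: Dict[str, str]) -> str:
    """Render {table: filter_expr} as a single-quoted Map literal string."""
    pairs = ", ".join(f"{quote_ch(table)}: {quote_ch(expr)}"
                      for table, expr in filters.items())
    return "{" + pairs + "}"


def build_tenant_filters(program_id: str, store_id: str) -> Dict[str, str]:
    """Hypothetical per-request filter for the trained table."""
    expr = f"program_id = {quote_ch(program_id)} AND store_id = {quote_ch(store_id)}"
    return {"gold.sales": expr}


if __name__ == "__main__":
    literal = format_table_filter_map(build_tenant_filters("p1", "s9"))
    # The literal is passed as a plain string, never as a dict:
    # client.query(sql, settings={"additional_table_filters": literal})
    print(literal)
```

Quoting the tenant IDs through the same helper keeps an ID containing a single quote from breaking out of the inner filter expression.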