vanna-clubpetro/CLAUDE.md
leonardosalazar-cp 1d152c0dce Initial commit: Vanna 2.0 deployment for ClubPetro
Wrapper application around upstream Vanna with:
- Tenant-aware ChromaDB memory (per program/store)
- ClickHouse RLS runner with introspection guards
- PT-BR system prompt and chat translations
- Custom Plotly chart generator (ranked bar, datetime coercion)
- Embed bootstrap (theme pierce + i18n + markdown) shared by demo and React app
- Event sink for chat turn observability
2026-04-29 17:22:05 -03:00

257 lines
30 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## What this project is
A Vanna 2.0 deployment that lets a user ask natural-language questions in Portuguese and get back SQL results from a ClickHouse Cloud database. **Not** a fork of Vanna — the upstream repo is cloned into `vanna/` and installed editable; the application code lives at the project root.
Wiring: `OpenAILlmService` (LLM) + `ChromaAgentMemory` (local vector store, persisted to `./chroma_db/`) + `RLSClickHouseRunner` (subclass of upstream `ClickHouseRunner` that injects ClickHouse `additional_table_filters` for tenant isolation against `gold` database in ClickHouse Cloud).
Two ways to query the agent:
- CLI: `python ask.py "your question"` (uses `StaticUserResolver` with IDs from env/flags)
- Web: `uvicorn server:app --port 8765` then embed `<vanna-chat>` (uses `RequestContextUserResolver` reading IDs from query params)
## ⚠️ Vanna-first rule (READ BEFORE WRITING NEW CODE)
**Before adding new code, search `vanna/src/vanna/` and `vanna/frontends/` for an existing solution.** This project deliberately uses upstream Vanna primitives wherever possible — only the items in "Intentional custom code" below are project-specific.
When unsure:
1. `grep -r "the-thing-you-want" vanna/src/vanna/` first.
2. Check `vanna/src/vanna/servers/`, `vanna/src/vanna/integrations/`, `vanna/src/vanna/core/`, `vanna/frontends/webcomponent/`.
3. Read `vanna/src/vanna/examples/claude_sqlite_example.py` — closest reference for assembly patterns.
4. Only if there's no built-in equivalent, write custom code in the project root and document why in this section.
### What we use from Vanna (do NOT reimplement)
- **Server**: `vanna.servers.fastapi.VannaFastAPIServer` — see `server.py`. Provides `/api/vanna/v2/chat_sse|chat_websocket|chat_poll`, healthcheck, CORS, RequestContext extraction (cookies+headers+query_params+metadata).
- **Frontend**: `<vanna-chat>` Web Component from `vanna/frontends/webcomponent/`. Build artifact at `vanna/frontends/webcomponent/dist/vanna-components.js` (npm-built; gitignored). Floating button via `starting-state="minimized"`. Renders rich components including Plotly charts, sortable/searchable tables, code blocks, status cards, progress bars.
- **Tools**: `RunSqlTool` + `VisualizeDataTool` from `vanna.tools` — share a `LocalFileSystem(working_directory="./data_storage")`; SQL writes CSV → viz reads CSV → emits `chart` rich component the web component renders via Plotly.
- **Memory tools**: `SearchSavedCorrectToolUsesTool` + `SaveQuestionToolArgsTool` + `SaveTextMemoryTool` from `vanna.tools` — fecham o loop self-learning. ChromaAgentMemory agora também é escrito pelo LLM em runtime (par pergunta→args do `run_sql` após sucesso), não só pelo `train.py` (schema docs offline). Orientação de uso vive em `system_prompt.py` na seção "Memória".
- **Memory**: `vanna.integrations.chromadb.ChromaAgentMemory` for vector store; `agent_memory.save_text_memory(content, context)` is the canonical write API (usada por `train.py`; o LLM usa o tool `save_text_memory` em runtime).
- **User model**: `vanna.User` — has `ConfigDict(extra="allow")` (`vanna/src/vanna/core/user/models.py:25`) so we can attach `program_id`/`store_id` as ad-hoc fields without subclassing.
- **Resolver ABC**: `vanna.core.user.UserResolver` — base class for both `StaticUserResolver` and `RequestContextUserResolver`.
- **RequestContext**: extracted automatically by the server (`vanna/src/vanna/servers/fastapi/routes.py:46-52`) — read from `query_params`/`cookies`/`headers`/`metadata` in the resolver.
- **SQL runner base**: `vanna.integrations.clickhouse.ClickHouseRunner` — we subclass it, never modify upstream.
- **CLI server runner**: `vanna.servers.cli.server_runner` (the `vanna serve` command) — we don't use it directly, but it's the reference for how to wire FastAPI + frontend bundle.
### Intentional custom code (Vanna has no equivalent)
- `TenantAwareChromaMemory` (`tenant_memory.py`) — Vanna's `ChromaAgentMemory` é single-collection, sem scoping. Vazaria perguntas/aprendizados entre tenants (program × store) que compartilham o mesmo deploy. Esta classe compõe duas instâncias: collection compartilhada `vanna_clickhouse_gold` pra text memories (schema docs do `train.py` — comum a todos), e collection `vanna_clickhouse_gold__p<program>__s<store>` lazy-criada por (program_id, store_id) pra tool-usage memories. Roteamento por tipo de memória: text → shared, tool → tenant. `train.py` não muda — escreve só text memories na shared.
- `RLSClickHouseRunner` (`rls_runner.py`) — Vanna's `ClickHouseRunner` doesn't support per-query settings. We override `run_sql` to inject `additional_table_filters` via `client.query(sql, settings=...)`. The `ToolRegistry.transform_args()` hook is Vanna's "official" RLS extension point but only allows arg rewriting / rejection — it can't reach `clickhouse_connect` settings, so we keep the runner subclass.
- `_FORBIDDEN_SCHEMA_RE` / `_INTROSPECTION_STMT_RE` (`rls_runner.py`) — regex guards no topo de `run_sql` que rejeitam SQL contra `system.*` / `information_schema.*` ou statements `SHOW`/`DESCRIBE`/`EXPLAIN`. ClickHouse Cloud não enforça REVOKE column-level em `system.tables` (acesso herdado de role default), então blindamos app-side. Sem isso, o LLM lia `system.tables.create_table_query` e via DDL com colunas revogadas.
- `_format_table_filter_map` / `_quote` (`rls_runner.py`) — `clickhouse_connect` serializes Python dicts as JSON (double quotes) but ClickHouse's Map literal needs single quotes with `''` escape. Hand-built literal is the workaround.
- `_round_decimal_columns` (`rls_runner.py`) — após cada query, percorre as colunas do `pd.DataFrame` e arredonda valores `Decimal`/`float` pra 2 casas (constante `_DISPLAY_DECIMALS`). Necessário porque ClickHouse retorna `Decimal(N, 6)` por padrão e o dataframe rich component renderiza via `value.toLocaleString()` (`vanna/frontends/webcomponent/src/components/rich-component-system.ts:624`) sem rounding — sem isso, `307427.030000` aparecia na tabela. Patchar o renderer upstream exigiria rebuild npm + perde no `git pull`; arredondar app-side cobre todas as queries sem depender do LLM emitir `round(..., 2)` no SELECT. Colunas integer/string/datetime ficam intactas (verificação via `pd.api.types.is_float_dtype` + sample do object dtype). CSV escrito pelo `RunSqlTool` herda os valores arredondados — visualize_data tudo bem porque charts já arredondam exibição.
- `StaticUserResolver` / `RequestContextUserResolver` (`agent.py`) — Vanna ships only the `UserResolver` ABC, no concrete implementations.
- `system_prompt.py` — Vanna tem `DefaultSystemPromptBuilder(base_prompt=...)` mas nenhum prompt domain-specific. Constante `SYSTEM_PROMPT` injeta regras pt-BR + confidencialidade + escopo de loja única + métricas padronizadas + formatação R$/L. Edita o arquivo direto pra iterar; sem rebuild, sem re-train.
- `VisualizeDataToolPT` + `ClubPetroChartGenerator` (`viz_tool.py`) — subclasse de `VisualizeDataTool` com três customizações sobre upstream: (1) novo arg opcional `chart_type` (`line`/`bar`/`scatter`/`histogram`/`area`) no schema (`VisualizeDataArgsPT`) — quando o LLM passa, força o tipo via `_render_forced`; quando omite, cai na heurística como fallback. Threadado via `contextvars.ContextVar` no `execute` pra suportar execuções concorrentes do tool singleton (Vanna roda async). (2) `description` PT-BR com gatilhos por tipo de pergunta (ranking → `chart_type='bar'`, série → `'line'`, etc.). (3) `ClubPetroChartGenerator` substitui o default `PlotlyChartGenerator` upstream — dropa o fallback "4+ colunas → go.Table" (`vanna/src/vanna/integrations/plotly/chart_generator.py:51-55`) que duplicava o dataframe rich; aplica `_coerce_datetime_columns` (string ISO → datetime64) nos DOIS caminhos (<4 e >=4 cols), pra que séries temporais sejam detectadas independente do nº de colunas; e usa `_create_ranked_bar_chart` próprio (em vez do `_create_bar_chart` upstream que re-agrega com `groupby` e perde a ordenação descendente do ranking).
- `RLS_INTERNAL_COLS` (`train.py`) — `program_id`/`store_id` precisam estar no GRANT (RLS depende deles), mas escondemos da doc do ChromaDB pra não tentar o LLM a usá-los manualmente nas queries.
- `train.py` — Vanna has `AgentMemory.save_text_memory` but no schema crawler. Usa apenas `system.columns` (filtrado pelo ClickHouse via GRANT column-level, sem dependência de `system.tables` nem `SHOW CREATE TABLE`) e emite uma memória de texto por tabela em `RLS_TABLES`. Sem DDL bruto pra evitar vazamento de colunas revogadas.
- `ask.py` — Vanna has only the `vanna serve` CLI; no ad-hoc `vanna ask`. This is a thin async wrapper for terminal use.
- `local_request_context()` (`agent.py`) — 1-line factory because Vanna has no default `RequestContext` constructor.
- `csv_cleanup.py` — Vanna upstream não tem GC pros CSVs que `RunSqlTool` escreve em `data_storage/`. O `LocalFileSystem.write_file` só escreve, nunca apaga; sem isso o disco cresce linearmente com o tráfego. Módulo standalone com `sweep_once()` (síncrono, idempotente — apaga `query_results_*.csv` com mtime > `CSV_TTL_SECONDS`, default 1800s) e tarefa asyncio periódica (`CSV_SWEEP_INTERVAL_SECONDS`, default 600s). `server.py` pluga via `on_event("startup"/"shutdown")` — sweep no boot pra recolher legado de runs anteriores e armar o loop de fundo. CLI (`ask.py`) não roda cleanup em tempo real; depende do próximo boot do servidor.
- `static/vanna-embed-bootstrap.js` — fonte única do JS de wiring exigido por todo embed do `<vanna-chat>`: theme pierce (monkey-patch do setter de `adoptedStyleSheets` pra forçar o `themeSheet` sempre como último item — Lit re-assina depois do `attachShadow` e venceria a cascade sem isso); tradutor PT-BR via MutationObserver em todo shadow root novo; markdown processor pros balões (`escapeHtml` → code → bold/italic → links → **headers ANTES de listas** porque a regex de lista consome o `\n` final e quebraria header subsequente → `\n``<br>` → cleanup `<br>` adjacente a blocos); load de fonts; injeção do bundle. Servido em `/vanna-embed-bootstrap.js` (rota explícita no `server.py`). Expõe `window.VannaEmbed.ensureLoaded({ baseUrl, extraCss? }) -> Promise<void>` (idempotente). Antes desta extração, a app React (`clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts`) duplicava ~250 linhas idênticas; toda correção tinha que ser feita em dois lugares e divergia (foi exatamente o que aconteceu com fix de markdown headers — só pegou o demo, app React continuou quebrada). Agora `embed-demo.html` e `vannaChatLoader.ts` são thin wrappers que injetam `<script src=...>` e chamam `ensureLoaded`. Edição do bootstrap reflete em ambos automaticamente após hard-refresh (sem rebuild npm).
- `EventSink` + `TurnRecord` + ContextVar `_current_turn` (`events_sink.py`) — grava 1 row por turno de chat em `events.vanna_ai` (ClickHouse). Vanna tem `observability_provider` (spans/metrics) mas não tem persistência estruturada de interação fim-a-fim. `chat_filter.py` cria o TurnRecord no início do `handle_stream`, captura `question` (de `request.message`), `program_id`/`store_id`/`user_id` (de `request_context.query_params` — frontend do web app deve passar os 3 como query string nos endpoints `sse-endpoint`/`ws-endpoint`/`poll-endpoint` do `<vanna-chat>`; `user_id` defaulta a string vazia se ausente, sem quebrar o chat), `response` (concatenando rich.type=='text' chunks), `status`/`error_message` (do try/except). Hooks downstream escrevem no mesmo TurnRecord via ContextVar: `EventCapturingToolRegistry.transform_args` (`agent.py`) registra cada tool call e captura `args.sql` quando `tool.name == "run_sql"`; `VisualizeDataToolPT.execute` registra (chart_type, title). Flush no `finally` do stream via `EventSink.flush` (`asyncio.to_thread` pro `client.insert` sync; try/except envolve tudo — falha de insert nunca quebra a resposta ao usuário). Cliente CH lazy, mesmas creds do RLS runner — `wren_ia` precisa `GRANT INSERT ON events.vanna_ai`.
## Commands
All commands assume the venv is active:
```
source venv/bin/activate
```
Common workflows:
```
python ask.py "your question" # CLI (uses .env RLS_PROGRAM_ID/RLS_STORE_ID)
python ask.py --program-id <id> --store-id <id> "..." # CLI with override
python train.py # re-extract schema (only RLS_TABLES) into ChromaDB
python test_clickhouse.py # raw connectivity smoke test (no LLM)
uvicorn server:app --host 127.0.0.1 --port 8765 # web server
```
Web component build (one-time, after `git pull` of upstream or after upstream version bump):
```
cd vanna/frontends/webcomponent
npm install
npm run build
# outputs vanna/frontends/webcomponent/dist/vanna-components.js (~7.5 MB)
```
Re-installing or pulling upstream Vanna changes:
```
cd vanna && git pull && cd ..
pip install -e ./vanna
```
Adding more LLM/DB/vector extras (Vanna defines them in `vanna/pyproject.toml`):
```
pip install <package> # extras shorthand `pip install -e './vanna[xxx]'` fails on macOS file URI
```
## Architecture
### Agent assembly (agent.py)
`build_agent(program_id=None, store_id=None, user_resolver=None)` is the single source of truth. Vanna 2.0's `Agent.__init__` requires **both** `agent_memory` and `user_resolver` — no defaults. Two resolver flavors:
- `StaticUserResolver` — fixed `User` from env/flags. CLI default.
- `RequestContextUserResolver` — reads `program_id`/`store_id` from `request_context.query_params` (with `metadata` fallback) per request. Validates with `_require_id` (regex `^[A-Za-z0-9_-]+$`); raises `PermissionError` on missing/invalid. Web default.
`ChromaAgentMemory(persist_directory="./chroma_db", collection_name="vanna_clickhouse_gold")` is shared by both.
Agent uses `temperature=float(os.environ.get("OPENAI_TEMPERATURE", "1.0"))`. Default 1.0 mantém compat com modelos de reasoning/`gpt-5*` que rejeitam outros valores. Pra modelos que aceitam ajuste (ex.: `gpt-4o`), set `OPENAI_TEMPERATURE=0.2` no `.env` pra mais determinismo na geração de SQL.
System prompt customizado vem de `system_prompt.SYSTEM_PROMPT` via `DefaultSystemPromptBuilder(base_prompt=...)`. Quando `base_prompt` é não-nulo (`vanna/src/vanna/core/system_prompt/default.py:47-48`), substitui o prompt default do Vanna; o `LlmContextEnhancer` ainda anexa as memórias retrievadas do ChromaDB depois.
### RLS (rls_runner.py)
`RLSClickHouseRunner` extends `ClickHouseRunner`. Em `run_sql` faz dois trabalhos:
**1. Bloqueio app-side de introspecção** (regex guards no topo da função, antes de tudo):
- `_FORBIDDEN_SCHEMA_RE` — rejeita SQL referenciando `system.*` ou `information_schema.*`. ClickHouse Cloud não enforça REVOKE column-level em `system.tables` (acesso vem de role default não-revogável), então é app-side ou nada.
- `_INTROSPECTION_STMT_RE` — rejeita statements `SHOW`/`DESCRIBE`/`EXPLAIN`. Sem isso, o LLM bypassava `system.*` via `SHOW TABLES FROM gold` ou `DESCRIBE TABLE gold.sales` e descobria colunas revogadas.
Ambos levantam `PermissionError` com mensagem orientando o LLM a usar o contexto treinado.
**2. Injeção de RLS** (`additional_table_filters`):
- `RLS_TABLES = ("gold.sales",)` — receives a `program_id = '...' AND store_id = '...'` filter built from `context.user.program_id` / `store_id`.
- `DENIED_TABLES = ("gold.vw_relatorios_exportaveis_analitico_vendas",)` — receives the literal `0` filter (zero rows). Defense in depth — the ClickHouse user `wren_ia` already lacks SELECT grant on the view.
`train.py` imports `RLS_TABLES` so the trained schema docs match the RLS scope (single source of truth).
The filter expression is built as a hand-formatted ClickHouse Map literal because `clickhouse_connect` JSON-serializes Python dicts. See `_format_table_filter_map`.
### Tools registered (agent.py)
Two tools sharing a `LocalFileSystem(working_directory="./data_storage")`:
- `RunSqlTool(sql_runner=RLSClickHouseRunner(...), file_system=fs)` — executes RLS-filtered SQL, dumps full result to CSV in `./data_storage/<user-hash>/query_results_*.csv`, returns truncated preview to the LLM.
- `VisualizeDataToolPT(file_system=fs)` — subclasse local de `VisualizeDataTool` (`viz_tool.py`) com `ClubPetroChartGenerator` injetado e arg `chart_type` opcional. Lê o CSV da rodada anterior, emite um `chart` rich component (Plotly figure JSON). Não usa o `PlotlyChartGenerator` default — ver "Intentional custom code" pros detalhes (drop de `go.Table`, coerção de datetime uniforme, ranked bar). The web component renders it; the CLI just reports "Created visualization from <file>".
Both registered via `tools.register_local_tool(tool, access_groups=[])` (Vanna's `register(tool)` shorthand exists in some upstream examples but isn't the current API). Empty `access_groups=[]` means accessible to all users.
### Theming (static/vanna-theme.css + adoptedStyleSheets pierce)
`<vanna-chat>` exposes ~50 CSS custom properties (`vanna/frontends/webcomponent/src/styles/vanna-design-tokens.ts`). Each internal custom element (`vanna-message`, `vanna-status-bar`, `vanna-progress-tracker`, `plotly-chart`, `rich-card`, `rich-task-list`, `rich-progress-bar`) re-imports `vannaDesignTokens` and re-declares the literals on its own `:host`. **A `<link rel="stylesheet">` in the document only retemas `vanna-chat` itself** — nested elements live in encapsulated shadow trees that document selectors can't reach, and their internal `:host` rules shadow any inherited custom property.
O pierce vive em `static/vanna-embed-bootstrap.js` (ver "Intentional custom code"): (a) cria um `CSSStyleSheet` construído, (b) monkey-patcha o **setter** de `ShadowRoot.prototype.adoptedStyleSheets` pra sempre mover esse sheet pro fim do array (necessário porque Lit re-assina o array depois do `attachShadow` e venceria a cascade sem isso), (c) `fetch()`-a `/vanna-theme.css` e chama `replaceSync()` pra popular. Adopted sheets cascateiam depois do `static styles` do próprio component, então com `:host` specificity igual a regra adoptada vence.
`static/vanna-theme.css` uses selector `:host, vanna-chat { ... }` so the same file works in both contexts:
- `<link>` in the document → `vanna-chat { ... }` matches the host element.
- `adoptedStyleSheets` inside each shadow root → `:host { ... }` matches the shadow host of that root.
**No JS rebuild needed** to change colors/fonts/spacing — edit `static/vanna-theme.css` and hard-refresh. Don't modify `vanna/frontends/webcomponent/src/styles/vanna-design-tokens.ts` (upstream, would be lost on `git pull`).
`@import` of Google Fonts inside `vanna-theme.css` doesn't work because constructed `CSSStyleSheet` (via `replaceSync`) silently strips `@import`. The bootstrap injeta `<link>` no `<head>` via `loadFontsOnce`.
A sidebar (`<vanna-progress-tracker>`) é escondida pelo lado consumer (não pelo bootstrap) — em `embed-demo.html` e em `VannaChat.tsx`, depois de `customElements.whenDefined("vanna-chat")` resolver, seta `el.showProgress = false`. Lit Boolean property attributes default-true não desligam via HTML attribute (qualquer presença = true).
#### Embed bootstrap (`static/vanna-embed-bootstrap.js`)
Bootstrap único servido pelo Vanna server em `/vanna-embed-bootstrap.js` (rota no `server.py`). Concentra todo o JS de wiring; consumido por:
- `static/embed-demo.html` (smoke test, mesma origem) — `<script src="/vanna-embed-bootstrap.js">` + chamada `VannaEmbed.ensureLoaded({ baseUrl: "" })`.
- `clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts` (app React, cross-origin) — injeta `<script src="${baseUrl}/vanna-embed-bootstrap.js">` dinamicamente, espera `window.VannaEmbed`, chama `ensureLoaded({ baseUrl, extraCss })`. `extraCss` carrega override do avatar logo (SVG mora no `/public/` do CRA, não no Vanna server).
Antes desta extração, a app React duplicava ~250 linhas idênticas — fix de markdown header rodou só no demo e a app React continuou quebrada. Lição: **toda mudança em theme/i18n/markdown agora vive só em `static/vanna-embed-bootstrap.js`**. Hard-refresh nas duas pontas pega automático.
Pra produção, a página do cliente carrega:
```html
<link rel="stylesheet" href="https://SEU-VANNA/vanna-theme.css">
<script src="https://SEU-VANNA/vanna-embed-bootstrap.js"></script>
<vanna-chat id="chat" ...></vanna-chat>
<script>
window.VannaEmbed.ensureLoaded({ baseUrl: "https://SEU-VANNA" }).then(() => {
document.getElementById("chat").showProgress = false;
});
</script>
```
Fonts são carregadas pelo bootstrap (`loadFontsOnce`) — o cliente não precisa mais incluir `<link>` de Google Fonts manualmente. Bundle (`vanna-components.js`) também é injetado pelo bootstrap (`injectBundle`).
### Filtragem + tradução de chunks (chat_filter.py)
`FilteringChatHandler` (subclasse de `vanna.servers.base.ChatHandler`) é injetado em `VannaFastAPIServer.chat_handler` antes de `create_app()`. Ele intercepta o stream de `ChatStreamChunk` e:
1. **Whitelist de `rich.type`** — drops chunks cujo tipo não está em `ALLOWED_RICH_TYPES = {text, dataframe, chart, status_bar_update, chat_input_update}`. Tipos extras que o agent emite (`status_card`, `task_tracker_update`, `notification`, `log_viewer`, `progress_display`, etc.) somem antes de virar SSE.
2. **Tradução PT** — strings hardcoded em inglês emitidas por `vanna/src/vanna/core/agent/agent.py` (`Response complete`, `Ready for next message`, `Processing your request...`, etc.) são substituídas via tabela `TRANSLATIONS` exact-match nos campos `message`, `detail`, `placeholder` do `rich.data`. Strings que caem fora da tabela (incluindo dinâmicas tipo `Running 3 tools`) passam intocadas — adicionar entradas conforme aparecerem no chat.
Charts: o `ClubPetroChartGenerator` (`viz_tool.py`) substitui o default upstream e dropa o `go.Table` fallback de 4+ colunas, então não tem dataframe duplicado visualmente. O LLM controla o tipo via arg `chart_type` no `visualize_data` (line/bar/scatter/histogram/area); quando omite, cai numa heurística por shape do CSV.
Pra adicionar mais tipos permitidos ou novas traduções: editar `chat_filter.py`. Sem rebuild de bundle, sem mexer em upstream.
### Web server (server.py)
`server.py` is ~60 lines: `VannaFastAPIServer(agent, config={cors:..., api_base_url:""}).create_app()`, mounts `/static/` to `vanna/frontends/webcomponent/dist/`, adiciona rotas estáticas pra arquivos do project-root `static/`:
- `/vanna-theme.css` — tema CSS adoptado em todo shadow root.
- `/vanna-embed-bootstrap.js` — JS único de wiring (theme pierce + tradutor + markdown + bundle loader). Consumido por `embed-demo.html` e pela app React.
- `/clubpetro-logo.{png,svg}`, `/dashboard-bg.png` — assets.
- `/embed-demo.html` — smoke test page que substitui `__PROGRAM_ID__` / `__STORE_ID__` / `__USER_ID__` placeholders com valores do `.env` em runtime (lido fresh a cada request — sem restart pra mudanças em HTML).
All chat routes (`/api/vanna/v2/chat_sse|chat_websocket|chat_poll`) come from upstream — do not redefine them.
The web component (`<vanna-chat>`) sends `program_id` / `store_id` as query params on the endpoint URL: `sse-endpoint="/api/vanna/v2/chat_sse?program_id=X&store_id=Y"`. The upstream server populates `RequestContext.query_params` from `dict(http_request.query_params)`, so `RequestContextUserResolver` picks them up.
### Training flow (train.py)
Vanna 2.0 has no separate "training" API. Schema knowledge is injected by saving **text memories** into the same `ChromaAgentMemory` instance the agent reads from. `DefaultLlmContextEnhancer` (auto-wired when no enhancer is passed) retrieves the top-k similar text memories and appends them to the system prompt on every turn.
`train.py` itera tabelas em `RLS_TABLES`, lê `system.columns` (que o ClickHouse já filtra por GRANT column-level — não precisa privilege em `system.tables`), filtra colunas em `RLS_INTERNAL_COLS = {"program_id", "store_id"}` pra escondê-las do contexto do LLM, e salva uma memória de texto por tabela com lista de colunas + 3 sample rows (sem DDL, sem `SHOW CREATE TABLE`).
Re-rodar após:
- mudança de GRANT no ClickHouse (coluna nova → aparece automaticamente; coluna revogada → some).
- edição de `RLS_TABLES`.
- edição de `RLS_INTERNAL_COLS`.
Sequência: `rm -rf chroma_db/ && python train.py`.
When constructing a `ToolContext` manually (as in `train.py`), the `agent_memory=` field is required by Pydantic.
### CLI (ask.py)
`argparse` with `--program-id` / `--store-id` flags + positional question. Calls `build_agent(program_id, store_id)`. `Agent.send_message` is an async generator yielding `UiComponent` objects. Each component has both a `rich_component` (structured for the web UI) and a `simple_component` (text fallback). The CLI prefers `rich_component.content`, falls back to `simple_component.text`.
## Non-obvious gotchas
- **`ToolRegistry` API**: use `register_local_tool(tool, access_groups=[])` — there is no plain `.register()` method despite what some upstream examples suggest.
- **Empty `access_groups=[]`** means the tool is accessible to all users; non-empty is a permission allowlist.
- **`additional_table_filters` serialization**: `clickhouse_connect` serializes Python `dict` settings as JSON (double quotes), which ClickHouse rejects with `CANNOT_PARSE_QUOTED_STRING (code 26)`. The Map value must be passed as a pre-formatted string literal with single quotes and `''` escape — see `_format_table_filter_map` in `rls_runner.py`. Don't pass a `dict` to `client.query(..., settings={"additional_table_filters": {...}})`.
- **ChromaDB first run** downloads ~80 MB ONNX embedding model to `~/.cache/chroma/`. Cached afterwards. If running offline, pre-download or pass a custom `embedding_function`.
- **Editable extras on macOS**: `pip install -e './vanna[clickhouse]'` fails with `non-local file URIs are not supported`. Install the editable package and the extras separately.
- **Python 3.9** triggers deprecation warnings from `clickhouse-connect` and `urllib3`/LibreSSL. Not blocking. If upgrading the Python interpreter, recreate the venv.
- **`SELECT *` on `gold.sales` falha com ACCESS_DENIED** — `wren_ia` tem column-level GRANT, então `SELECT *` exige SELECT em todas as colunas (inclusive as 9 revogadas) e ClickHouse rejeita. Listar colunas explicitamente. (Antes do GRANT column-level, `RunSqlTool` salvava o full result em `./data_storage/<user-hash>/query_results_*.csv` — esse fluxo continua válido para queries com colunas explícitas.)
- **REVOKE column-level em `system.*` é silenciosamente ignorado em ClickHouse Cloud** — `REVOKE SELECT(create_table_query) ON system.tables FROM wren_ia` retorna "succeeded" no parser mas runtime continua liberando a leitura. Defesa real é app-side via regex no `RLSClickHouseRunner`.
- **Upstream examples are not all current**: `openai_quickstart.py` imports `OpenAILlmService` from `vanna.integrations.anthropic` (a bug); `mock_sqlite_example.py` calls `agent.send_message(user=...)` but the real signature now takes `request_context=`. Trust `claude_sqlite_example.py` and `vanna/src/vanna/core/agent/agent.py` over the others.
## Database
ClickHouse Cloud, database `gold`, accessed over HTTPS port 8443 with `secure=true`.
**Tabela treinada**: `gold.sales` — analytical sales fact (data: fuel/retail, `CLUBE ALE` / `POP FIDELIDADE`). Filtrada por `program_id` + `store_id` por request via RLS.
**GRANT column-level em `wren_ia`** — 20 colunas das 29 originais:
- `program_id`, `store_id` — necessárias pro RLS funcionar (mas escondidas do LLM via `RLS_INTERNAL_COLS`).
- `sale_id`, `cartao_do_cliente`, `nome_da_rede`, `nome_da_loja`, `tipo_loja`, `categoria_loja`, `nome_do_cliente`, `categoria_do_cliente`, `nome_do_atendente`, `produto`, `categoria_do_produto`, `quantidade_total_produto`, `valor_total_produto`, `desconto`, `pontuacao_produto`, `voucher_aplicado_venda`, `data_da_compra`, `fidelizada`, `e_combustivel` — visíveis ao LLM (18).
9 colunas REVOGADAS (não aparecem em `system.columns`, geram `ACCESS_DENIED` se o LLM tentar): `customer_id`, `customer_category_id`, `sale_product_id`, `product_id`, `product_category_id`, `attendant_id`, `rfid_do_atendente`, `sync_date`. Conferir grants atuais via `SHOW GRANTS FOR wren_ia` no console ClickHouse Cloud (admin) ou `SELECT * FROM system.grants WHERE user_name = 'wren_ia'`.
**Tabela negada**: `gold.vw_relatorios_exportaveis_analitico_vendas` — exportable view; **denied** at runner level (`'0'` filter) and at ClickHouse grant level (sem SELECT).
Credentials live in `.env` (gitignored). RLS values too: `RLS_PROGRAM_ID`, `RLS_STORE_ID` (CLI defaults). Optional: `OPENAI_TEMPERATURE` (default 1.0). Both `agent.py` and `train.py` call `load_dotenv()` at import.
## Where to look in upstream
Server / frontend:
- `vanna/src/vanna/servers/fastapi/app.py``VannaFastAPIServer` factory
- `vanna/src/vanna/servers/fastapi/routes.py` — chat SSE/WS/poll endpoints; auto-extracts `RequestContext` from cookies/headers/query_params (line 46-52)
- `vanna/src/vanna/servers/base/templates.py` — index HTML used by `GET /` (login + `<vanna-chat>` embed)
- `vanna/frontends/webcomponent/src/components/vanna-chat.ts` — Web Component attributes (`api-base`, `sse-endpoint`, `starting-state`, `theme`, etc.)
- `vanna/frontends/webcomponent/src/services/api-client.ts` — confirms URL construction is naive `${baseUrl}${endpoint}` concat (so query strings on `sse-endpoint` work)
Core:
- `vanna/src/vanna/core/agent/agent.py``Agent` class, `send_message` signature, required init params
- `vanna/src/vanna/core/registry.py``ToolRegistry`; the `transform_args()` hook (line 113-142) is the "official" RLS extension point but only handles arg rewriting / `ToolRejection`, not query settings
- `vanna/src/vanna/core/system_prompt/default.py:47-48``DefaultSystemPromptBuilder.build_system_prompt`: quando `base_prompt` é não-nulo retorna ele direto, descartando o prompt default do Vanna.
- `vanna/src/vanna/core/user/{models,resolver,request_context}.py``User` (with `extra="allow"`), `UserResolver` ABC, `RequestContext`
- `vanna/src/vanna/integrations/{openai,chromadb,clickhouse}/` — integrations this project uses
- `vanna/src/vanna/integrations/plotly/chart_generator.py:51` — heurística "4+ colunas → go.Table" do default generator. Substituímos via `ClubPetroChartGenerator` em `viz_tool.py`.
- `vanna/src/vanna/tools/run_sql.py``RunSqlTool` behavior (truncation, CSV side-output)
- `vanna/src/vanna/examples/claude_sqlite_example.py` — closest working reference for the assembly pattern
- `vanna/MIGRATION_GUIDE.md` — only relevant if migrating Vanna 0.x code (not used here)