vanna-clubpetro/CLAUDE.md

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## What this project is

A Vanna 2.0 deployment that lets a user ask natural-language questions in Portuguese and get back SQL results from a ClickHouse Cloud database. **Not** a fork of Vanna — the upstream repo is cloned into `vanna/` and installed editable; the application code lives at the project root.

Wiring: `OpenAILlmService` (LLM) + `ChromaAgentMemory` (local vector store, persisted to `./chroma_db/`) + `RLSClickHouseRunner` (subclass of upstream `ClickHouseRunner` that injects ClickHouse `additional_table_filters` for tenant isolation against `gold` database in ClickHouse Cloud).

Two ways to query the agent:
- CLI: `python ask.py "your question"` (uses `StaticUserResolver` with IDs from env/flags)
- Web: `uvicorn server:app --port 8765` then embed `<vanna-chat>` (uses `RequestContextUserResolver` reading IDs from query params)

## ⚠️ Vanna-first rule (READ BEFORE WRITING NEW CODE)

**Before adding new code, search `vanna/src/vanna/` and `vanna/frontends/` for an existing solution.** This project deliberately uses upstream Vanna primitives wherever possible — only the items in "Intentional custom code" below are project-specific.

When unsure:
1. `grep -r "the-thing-you-want" vanna/src/vanna/` first.
2. Check `vanna/src/vanna/servers/`, `vanna/src/vanna/integrations/`, `vanna/src/vanna/core/`, `vanna/frontends/webcomponent/`.
3. Read `vanna/src/vanna/examples/claude_sqlite_example.py` — closest reference for assembly patterns.
4. Only if there's no built-in equivalent, write custom code in the project root and document why in this section.

### What we use from Vanna (do NOT reimplement)
- **Server**: `vanna.servers.fastapi.VannaFastAPIServer` — see `server.py`. Provides `/api/vanna/v2/chat_sse|chat_websocket|chat_poll`, healthcheck, CORS, RequestContext extraction (cookies+headers+query_params+metadata).
- **Frontend**: `<vanna-chat>` Web Component from `vanna/frontends/webcomponent/`. Build artifact at `vanna/frontends/webcomponent/dist/vanna-components.js` (npm-built; gitignored). Floating button via `starting-state="minimized"`. Renders rich components including Plotly charts, sortable/searchable tables, code blocks, status cards, progress bars.
- **Tools**: `RunSqlTool` + `VisualizeDataTool` from `vanna.tools` — share a `LocalFileSystem(working_directory="./data_storage")`; SQL writes CSV → viz reads CSV → emits `chart` rich component the web component renders via Plotly.
- **Memory tools**: `SearchSavedCorrectToolUsesTool` + `SaveQuestionToolArgsTool` + `SaveTextMemoryTool` from `vanna.tools` — fecham o loop self-learning. ChromaAgentMemory agora também é escrito pelo LLM em runtime (par pergunta→args do `run_sql` após sucesso), não só pelo `train.py` (schema docs offline). Orientação de uso vive em `system_prompt.py` na seção "Memória".
- **Memory**: `vanna.integrations.chromadb.ChromaAgentMemory` for vector store; `agent_memory.save_text_memory(content, context)` is the canonical write API (usada por `train.py`; o LLM usa o tool `save_text_memory` em runtime).
- **User model**: `vanna.User` — has `ConfigDict(extra="allow")` (`vanna/src/vanna/core/user/models.py:25`) so we can attach `program_id`/`store_id` as ad-hoc fields without subclassing.
- **Resolver ABC**: `vanna.core.user.UserResolver` — base class for both `StaticUserResolver` and `RequestContextUserResolver`.
- **RequestContext**: extracted automatically by the server (`vanna/src/vanna/servers/fastapi/routes.py:46-52`) — read from `query_params`/`cookies`/`headers`/`metadata` in the resolver.
- **SQL runner base**: `vanna.integrations.clickhouse.ClickHouseRunner` — we subclass it, never modify upstream.
- **CLI server runner**: `vanna.servers.cli.server_runner` (the `vanna serve` command) — we don't use it directly, but it's the reference for how to wire FastAPI + frontend bundle.

### Intentional custom code (Vanna has no equivalent)
- `TenantAwareChromaMemory` (`tenant_memory.py`) — Vanna's `ChromaAgentMemory` é single-collection, sem scoping. Vazaria perguntas/aprendizados entre tenants (program × store) que compartilham o mesmo deploy. Esta classe compõe duas instâncias: collection compartilhada `vanna_clickhouse_gold` pra text memories (schema docs do `train.py` — comum a todos), e collection `vanna_clickhouse_gold__p<program>__s<store>` lazy-criada por (program_id, store_id) pra tool-usage memories. Roteamento por tipo de memória: text → shared, tool → tenant. `train.py` não muda — escreve só text memories na shared.
- `RLSClickHouseRunner` (`rls_runner.py`) — Vanna's `ClickHouseRunner` doesn't support per-query settings. We override `run_sql` to inject `additional_table_filters` via `client.query(sql, settings=...)`. The `ToolRegistry.transform_args()` hook is Vanna's "official" RLS extension point but only allows arg rewriting / rejection — it can't reach `clickhouse_connect` settings, so we keep the runner subclass.
- `_FORBIDDEN_SCHEMA_RE` / `_INTROSPECTION_STMT_RE` (`rls_runner.py`) — regex guards no topo de `run_sql` que rejeitam SQL contra `system.*` / `information_schema.*` ou statements `SHOW`/`DESCRIBE`/`EXPLAIN`. ClickHouse Cloud não enforça REVOKE column-level em `system.tables` (acesso herdado de role default), então blindamos app-side. Sem isso, o LLM lia `system.tables.create_table_query` e via DDL com colunas revogadas.
- `_format_table_filter_map` / `_quote` (`rls_runner.py`) — `clickhouse_connect` serializes Python dicts as JSON (double quotes) but ClickHouse's Map literal needs single quotes with `''` escape. Hand-built literal is the workaround.
- `_round_decimal_columns` (`rls_runner.py`) — após cada query, percorre as colunas do `pd.DataFrame` e arredonda valores `Decimal`/`float` pra 2 casas (constante `_DISPLAY_DECIMALS`). Necessário porque ClickHouse retorna `Decimal(N, 6)` por padrão e o dataframe rich component renderiza via `value.toLocaleString()` (`vanna/frontends/webcomponent/src/components/rich-component-system.ts:624`) sem rounding — sem isso, `307427.030000` aparecia na tabela. Patchar o renderer upstream exigiria rebuild npm + perde no `git pull`; arredondar app-side cobre todas as queries sem depender do LLM emitir `round(..., 2)` no SELECT. Colunas integer/string/datetime ficam intactas (verificação via `pd.api.types.is_float_dtype` + sample do object dtype). CSV escrito pelo `RunSqlTool` herda os valores arredondados — visualize_data tudo bem porque charts já arredondam exibição.
- `StaticUserResolver` / `RequestContextUserResolver` (`agent.py`) — Vanna ships only the `UserResolver` ABC, no concrete implementations.
- `system_prompt.py` — Vanna tem `DefaultSystemPromptBuilder(base_prompt=...)` mas nenhum prompt domain-specific. Constante `SYSTEM_PROMPT` injeta regras pt-BR + confidencialidade + escopo de loja única + métricas padronizadas + formatação R$/L. Edita o arquivo direto pra iterar; sem rebuild, sem re-train.
- `VisualizeDataToolPT` + `ClubPetroChartGenerator` (`viz_tool.py`) — subclasse de `VisualizeDataTool` com três customizações sobre upstream: (1) novo arg opcional `chart_type` (`line`/`bar`/`scatter`/`histogram`/`area`) no schema (`VisualizeDataArgsPT`) — quando o LLM passa, força o tipo via `_render_forced`; quando omite, cai na heurística como fallback. Threadado via `contextvars.ContextVar` no `execute` pra suportar execuções concorrentes do tool singleton (Vanna roda async). (2) `description` PT-BR com gatilhos por tipo de pergunta (ranking → `chart_type='bar'`, série → `'line'`, etc.). (3) `ClubPetroChartGenerator` substitui o default `PlotlyChartGenerator` upstream — dropa o fallback "4+ colunas → go.Table" (`vanna/src/vanna/integrations/plotly/chart_generator.py:51-55`) que duplicava o dataframe rich; aplica `_coerce_datetime_columns` (string ISO → datetime64) nos DOIS caminhos (<4 e >=4 cols), pra que séries temporais sejam detectadas independente do nº de colunas; e usa `_create_ranked_bar_chart` próprio (em vez do `_create_bar_chart` upstream que re-agrega com `groupby` e perde a ordenação descendente do ranking).
- `RLS_INTERNAL_COLS` (`train.py`) — `program_id`/`store_id` precisam estar no GRANT (RLS depende deles), mas escondemos da doc do ChromaDB pra não tentar o LLM a usá-los manualmente nas queries.
- `train.py` — Vanna has `AgentMemory.save_text_memory` but no schema crawler. Usa apenas `system.columns` (filtrado pelo ClickHouse via GRANT column-level, sem dependência de `system.tables` nem `SHOW CREATE TABLE`) e emite uma memória de texto por tabela em `RLS_TABLES`. Sem DDL bruto pra evitar vazamento de colunas revogadas.
- `ask.py` — Vanna has only the `vanna serve` CLI; no ad-hoc `vanna ask`. This is a thin async wrapper for terminal use.
- `local_request_context()` (`agent.py`) — 1-line factory because Vanna has no default `RequestContext` constructor.
- `csv_cleanup.py` — Vanna upstream não tem GC pros CSVs que `RunSqlTool` escreve em `data_storage/`. O `LocalFileSystem.write_file` só escreve, nunca apaga; sem isso o disco cresce linearmente com o tráfego. Módulo standalone com `sweep_once()` (síncrono, idempotente — apaga `query_results_*.csv` com mtime > `CSV_TTL_SECONDS`, default 1800s) e tarefa asyncio periódica (`CSV_SWEEP_INTERVAL_SECONDS`, default 600s). `server.py` pluga via `on_event("startup"/"shutdown")` — sweep no boot pra recolher legado de runs anteriores e armar o loop de fundo. CLI (`ask.py`) não roda cleanup em tempo real; depende do próximo boot do servidor.
- `static/vanna-embed-bootstrap.js` — fonte única do JS de wiring exigido por todo embed do `<vanna-chat>`: theme pierce (monkey-patch do setter de `adoptedStyleSheets` pra forçar o `themeSheet` sempre como último item — Lit re-assina depois do `attachShadow` e venceria a cascade sem isso); tradutor PT-BR via MutationObserver em todo shadow root novo; markdown processor pros balões (`escapeHtml` → code → bold/italic → links → **headers ANTES de listas** porque a regex de lista consome o `\n` final e quebraria header subsequente → `\n` → `<br>` → cleanup `<br>` adjacente a blocos); load de fonts; injeção do bundle. Servido em `/vanna-embed-bootstrap.js` (rota explícita no `server.py`). Expõe `window.VannaEmbed.ensureLoaded({ baseUrl, extraCss? }) -> Promise<void>` (idempotente). Antes desta extração, a app React (`clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts`) duplicava ~250 linhas idênticas; toda correção tinha que ser feita em dois lugares e divergia (foi exatamente o que aconteceu com fix de markdown headers — só pegou o demo, app React continuou quebrada). Agora `embed-demo.html` e `vannaChatLoader.ts` são thin wrappers que injetam `<script src=...>` e chamam `ensureLoaded`. Edição do bootstrap reflete em ambos automaticamente após hard-refresh (sem rebuild npm).
- `EventSink` + `TurnRecord` + ContextVar `_current_turn` (`events_sink.py`) — grava 1 row por turno de chat em `events.vanna_ai` (ClickHouse). Vanna tem `observability_provider` (spans/metrics) mas não tem persistência estruturada de interação fim-a-fim. `chat_filter.py` cria o TurnRecord no início do `handle_stream`, captura `question` (de `request.message`), `program_id`/`store_id`/`user_id` (de `request_context.query_params` — frontend do web app deve passar os 3 como query string nos endpoints `sse-endpoint`/`ws-endpoint`/`poll-endpoint` do `<vanna-chat>`; `user_id` defaulta a string vazia se ausente, sem quebrar o chat), `response` (concatenando rich.type=='text' chunks), `status`/`error_message` (do try/except). Hooks downstream escrevem no mesmo TurnRecord via ContextVar: `EventCapturingToolRegistry.transform_args` (`agent.py`) registra cada tool call e captura `args.sql` quando `tool.name == "run_sql"`; `VisualizeDataToolPT.execute` registra (chart_type, title). Flush no `finally` do stream via `EventSink.flush` (`asyncio.to_thread` pro `client.insert` sync; try/except envolve tudo — falha de insert nunca quebra a resposta ao usuário). Cliente CH lazy, mesmas creds do RLS runner — `wren_ia` precisa `GRANT INSERT ON events.vanna_ai`.

## Commands

All commands assume the venv is active:
```
source venv/bin/activate
```

Common workflows:
```
python ask.py "your question"                                  # CLI (uses .env RLS_PROGRAM_ID/RLS_STORE_ID)
python ask.py --program-id <id> --store-id <id> "..."          # CLI with override
python train.py                                                # re-extract schema (only RLS_TABLES) into ChromaDB
python test_clickhouse.py                                      # raw connectivity smoke test (no LLM)
uvicorn server:app --host 127.0.0.1 --port 8765                # web server
```

Web component build (one-time, after `git pull` of upstream or after upstream version bump):
```
cd vanna/frontends/webcomponent
npm install
npm run build
# outputs vanna/frontends/webcomponent/dist/vanna-components.js (~7.5 MB)
```

Re-installing or pulling upstream Vanna changes:
```
cd vanna && git pull && cd ..
pip install -e ./vanna
```

Adding more LLM/DB/vector extras (Vanna defines them in `vanna/pyproject.toml`):
```
pip install <package>              # extras shorthand `pip install -e './vanna[xxx]'` fails on macOS file URI
```

## Architecture

### Agent assembly (agent.py)

`build_agent(program_id=None, store_id=None, user_resolver=None)` is the single source of truth. Vanna 2.0's `Agent.__init__` requires **both** `agent_memory` and `user_resolver` — no defaults. Two resolver flavors:
- `StaticUserResolver` — fixed `User` from env/flags. CLI default.
- `RequestContextUserResolver` — reads `program_id`/`store_id` from `request_context.query_params` (with `metadata` fallback) per request. Validates with `_require_id` (regex `^[A-Za-z0-9_-]+$`); raises `PermissionError` on missing/invalid. Web default.

`ChromaAgentMemory(persist_directory="./chroma_db", collection_name="vanna_clickhouse_gold")` is shared by both.

Agent uses `temperature=float(os.environ.get("OPENAI_TEMPERATURE", "1.0"))`. Default 1.0 mantém compat com modelos de reasoning/`gpt-5*` que rejeitam outros valores. Pra modelos que aceitam ajuste (ex.: `gpt-4o`), set `OPENAI_TEMPERATURE=0.2` no `.env` pra mais determinismo na geração de SQL.

System prompt customizado vem de `system_prompt.SYSTEM_PROMPT` via `DefaultSystemPromptBuilder(base_prompt=...)`. Quando `base_prompt` é não-nulo (`vanna/src/vanna/core/system_prompt/default.py:47-48`), substitui o prompt default do Vanna; o `LlmContextEnhancer` ainda anexa as memórias retrievadas do ChromaDB depois.

### RLS (rls_runner.py)

`RLSClickHouseRunner` extends `ClickHouseRunner`. Em `run_sql` faz dois trabalhos:

**1. Bloqueio app-side de introspecção** (regex guards no topo da função, antes de tudo):
- `_FORBIDDEN_SCHEMA_RE` — rejeita SQL referenciando `system.*` ou `information_schema.*`. ClickHouse Cloud não enforça REVOKE column-level em `system.tables` (acesso vem de role default não-revogável), então é app-side ou nada.
- `_INTROSPECTION_STMT_RE` — rejeita statements `SHOW`/`DESCRIBE`/`EXPLAIN`. Sem isso, o LLM bypassava `system.*` via `SHOW TABLES FROM gold` ou `DESCRIBE TABLE gold.sales` e descobria colunas revogadas.

Ambos levantam `PermissionError` com mensagem orientando o LLM a usar o contexto treinado.

**2. Injeção de RLS** (`additional_table_filters`):
- `RLS_TABLES = ("gold.sales",)` — receives a `program_id = '...' AND store_id = '...'` filter built from `context.user.program_id` / `store_id`.
- `DENIED_TABLES = ("gold.vw_relatorios_exportaveis_analitico_vendas",)` — receives the literal `0` filter (zero rows). Defense in depth — the ClickHouse user `wren_ia` already lacks SELECT grant on the view.

`train.py` imports `RLS_TABLES` so the trained schema docs match the RLS scope (single source of truth).

The filter expression is built as a hand-formatted ClickHouse Map literal because `clickhouse_connect` JSON-serializes Python dicts. See `_format_table_filter_map`.

### Tools registered (agent.py)

Two tools sharing a `LocalFileSystem(working_directory="./data_storage")`:
- `RunSqlTool(sql_runner=RLSClickHouseRunner(...), file_system=fs)` — executes RLS-filtered SQL, dumps full result to CSV in `./data_storage/<user-hash>/query_results_*.csv`, returns truncated preview to the LLM.
- `VisualizeDataToolPT(file_system=fs)` — subclasse local de `VisualizeDataTool` (`viz_tool.py`) com `ClubPetroChartGenerator` injetado e arg `chart_type` opcional. Lê o CSV da rodada anterior, emite um `chart` rich component (Plotly figure JSON). Não usa o `PlotlyChartGenerator` default — ver "Intentional custom code" pros detalhes (drop de `go.Table`, coerção de datetime uniforme, ranked bar). The web component renders it; the CLI just reports "Created visualization from <file>".

Both registered via `tools.register_local_tool(tool, access_groups=[])` (Vanna's `register(tool)` shorthand exists in some upstream examples but isn't the current API). Empty `access_groups=[]` means accessible to all users.

### Theming (static/vanna-theme.css + adoptedStyleSheets pierce)

`<vanna-chat>` exposes ~50 CSS custom properties (`vanna/frontends/webcomponent/src/styles/vanna-design-tokens.ts`). Each internal custom element (`vanna-message`, `vanna-status-bar`, `vanna-progress-tracker`, `plotly-chart`, `rich-card`, `rich-task-list`, `rich-progress-bar`) re-imports `vannaDesignTokens` and re-declares the literals on its own `:host`. **A `<link rel="stylesheet">` in the document only retemas `vanna-chat` itself** — nested elements live in encapsulated shadow trees that document selectors can't reach, and their internal `:host` rules shadow any inherited custom property.

O pierce vive em `static/vanna-embed-bootstrap.js` (ver "Intentional custom code"): (a) cria um `CSSStyleSheet` construído, (b) monkey-patcha o **setter** de `ShadowRoot.prototype.adoptedStyleSheets` pra sempre mover esse sheet pro fim do array (necessário porque Lit re-assina o array depois do `attachShadow` e venceria a cascade sem isso), (c) `fetch()`-a `/vanna-theme.css` e chama `replaceSync()` pra popular. Adopted sheets cascateiam depois do `static styles` do próprio component, então com `:host` specificity igual a regra adoptada vence.

`static/vanna-theme.css` uses selector `:host, vanna-chat { ... }` so the same file works in both contexts:
- `<link>` in the document → `vanna-chat { ... }` matches the host element.
- `adoptedStyleSheets` inside each shadow root → `:host { ... }` matches the shadow host of that root.

**No JS rebuild needed** to change colors/fonts/spacing — edit `static/vanna-theme.css` and hard-refresh. Don't modify `vanna/frontends/webcomponent/src/styles/vanna-design-tokens.ts` (upstream, would be lost on `git pull`).

`@import` of Google Fonts inside `vanna-theme.css` doesn't work because constructed `CSSStyleSheet` (via `replaceSync`) silently strips `@import`. The bootstrap injeta `<link>` no `<head>` via `loadFontsOnce`.

A sidebar (`<vanna-progress-tracker>`) é escondida pelo lado consumer (não pelo bootstrap) — em `embed-demo.html` e em `VannaChat.tsx`, depois de `customElements.whenDefined("vanna-chat")` resolver, seta `el.showProgress = false`. Lit Boolean property attributes default-true não desligam via HTML attribute (qualquer presença = true).

#### Embed bootstrap (`static/vanna-embed-bootstrap.js`)

Bootstrap único servido pelo Vanna server em `/vanna-embed-bootstrap.js` (rota no `server.py`). Concentra todo o JS de wiring; consumido por:
- `static/embed-demo.html` (smoke test, mesma origem) — `<script src="/vanna-embed-bootstrap.js">` + chamada `VannaEmbed.ensureLoaded({ baseUrl: "" })`.
- `clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts` (app React, cross-origin) — injeta `<script src="${baseUrl}/vanna-embed-bootstrap.js">` dinamicamente, espera `window.VannaEmbed`, chama `ensureLoaded({ baseUrl, extraCss })`. `extraCss` carrega override do avatar logo (SVG mora no `/public/` do CRA, não no Vanna server).

Antes desta extração, a app React duplicava ~250 linhas idênticas — fix de markdown header rodou só no demo e a app React continuou quebrada. Lição: **toda mudança em theme/i18n/markdown agora vive só em `static/vanna-embed-bootstrap.js`**. Hard-refresh nas duas pontas pega automático.

Pra produção, a página do cliente carrega:
```html
<link rel="stylesheet" href="https://SEU-VANNA/vanna-theme.css">
<script src="https://SEU-VANNA/vanna-embed-bootstrap.js"></script>

<vanna-chat id="chat" ...></vanna-chat>

<script>
  window.VannaEmbed.ensureLoaded({ baseUrl: "https://SEU-VANNA" }).then(() => {
    document.getElementById("chat").showProgress = false;
  });
</script>
```
Fonts são carregadas pelo bootstrap (`loadFontsOnce`) — o cliente não precisa mais incluir `<link>` de Google Fonts manualmente. Bundle (`vanna-components.js`) também é injetado pelo bootstrap (`injectBundle`).

### Filtragem + tradução de chunks (chat_filter.py)

`FilteringChatHandler` (subclasse de `vanna.servers.base.ChatHandler`) é injetado em `VannaFastAPIServer.chat_handler` antes de `create_app()`. Ele intercepta o stream de `ChatStreamChunk` e:

1. **Whitelist de `rich.type`** — drops chunks cujo tipo não está em `ALLOWED_RICH_TYPES = {text, dataframe, chart, status_bar_update, chat_input_update}`. Tipos extras que o agent emite (`status_card`, `task_tracker_update`, `notification`, `log_viewer`, `progress_display`, etc.) somem antes de virar SSE.
2. **Tradução PT** — strings hardcoded em inglês emitidas por `vanna/src/vanna/core/agent/agent.py` (`Response complete`, `Ready for next message`, `Processing your request...`, etc.) são substituídas via tabela `TRANSLATIONS` exact-match nos campos `message`, `detail`, `placeholder` do `rich.data`. Strings que caem fora da tabela (incluindo dinâmicas tipo `Running 3 tools`) passam intocadas — adicionar entradas conforme aparecerem no chat.

Charts: o `ClubPetroChartGenerator` (`viz_tool.py`) substitui o default upstream e dropa o `go.Table` fallback de 4+ colunas, então não tem dataframe duplicado visualmente. O LLM controla o tipo via arg `chart_type` no `visualize_data` (line/bar/scatter/histogram/area); quando omite, cai numa heurística por shape do CSV.

Pra adicionar mais tipos permitidos ou novas traduções: editar `chat_filter.py`. Sem rebuild de bundle, sem mexer em upstream.

### Web server (server.py)

`server.py` is ~60 lines: `VannaFastAPIServer(agent, config={cors:..., api_base_url:""}).create_app()`, mounts `/static/` to `vanna/frontends/webcomponent/dist/`, adiciona rotas estáticas pra arquivos do project-root `static/`:
- `/vanna-theme.css` — tema CSS adoptado em todo shadow root.
- `/vanna-embed-bootstrap.js` — JS único de wiring (theme pierce + tradutor + markdown + bundle loader). Consumido por `embed-demo.html` e pela app React.
- `/clubpetro-logo.{png,svg}`, `/dashboard-bg.png` — assets.
- `/embed-demo.html` — smoke test page que substitui `__PROGRAM_ID__` / `__STORE_ID__` / `__USER_ID__` placeholders com valores do `.env` em runtime (lido fresh a cada request — sem restart pra mudanças em HTML).

All chat routes (`/api/vanna/v2/chat_sse|chat_websocket|chat_poll`) come from upstream — do not redefine them.

The web component (`<vanna-chat>`) sends `program_id` / `store_id` as query params on the endpoint URL: `sse-endpoint="/api/vanna/v2/chat_sse?program_id=X&store_id=Y"`. The upstream server populates `RequestContext.query_params` from `dict(http_request.query_params)`, so `RequestContextUserResolver` picks them up.

### Training flow (train.py)

Vanna 2.0 has no separate "training" API. Schema knowledge is injected by saving **text memories** into the same `ChromaAgentMemory` instance the agent reads from. `DefaultLlmContextEnhancer` (auto-wired when no enhancer is passed) retrieves the top-k similar text memories and appends them to the system prompt on every turn.

`train.py` itera tabelas em `RLS_TABLES`, lê `system.columns` (que o ClickHouse já filtra por GRANT column-level — não precisa privilege em `system.tables`), filtra colunas em `RLS_INTERNAL_COLS = {"program_id", "store_id"}` pra escondê-las do contexto do LLM, e salva uma memória de texto por tabela com lista de colunas + 3 sample rows (sem DDL, sem `SHOW CREATE TABLE`).

Re-rodar após:
- mudança de GRANT no ClickHouse (coluna nova → aparece automaticamente; coluna revogada → some).
- edição de `RLS_TABLES`.
- edição de `RLS_INTERNAL_COLS`.

Sequência: `rm -rf chroma_db/ && python train.py`.

When constructing a `ToolContext` manually (as in `train.py`), the `agent_memory=` field is required by Pydantic.

### CLI (ask.py)

`argparse` with `--program-id` / `--store-id` flags + positional question. Calls `build_agent(program_id, store_id)`. `Agent.send_message` is an async generator yielding `UiComponent` objects. Each component has both a `rich_component` (structured for the web UI) and a `simple_component` (text fallback). The CLI prefers `rich_component.content`, falls back to `simple_component.text`.

## Non-obvious gotchas

- **`ToolRegistry` API**: use `register_local_tool(tool, access_groups=[])` — there is no plain `.register()` method despite what some upstream examples suggest.
- **Empty `access_groups=[]`** means the tool is accessible to all users; non-empty is a permission allowlist.
- **`additional_table_filters` serialization**: `clickhouse_connect` serializes Python `dict` settings as JSON (double quotes), which ClickHouse rejects with `CANNOT_PARSE_QUOTED_STRING (code 26)`. The Map value must be passed as a pre-formatted string literal with single quotes and `''` escape — see `_format_table_filter_map` in `rls_runner.py`. Don't pass a `dict` to `client.query(..., settings={"additional_table_filters": {...}})`.
- **ChromaDB first run** downloads ~80 MB ONNX embedding model to `~/.cache/chroma/`. Cached afterwards. If running offline, pre-download or pass a custom `embedding_function`.
- **Editable extras on macOS**: `pip install -e './vanna[clickhouse]'` fails with `non-local file URIs are not supported`. Install the editable package and the extras separately.
- **Python 3.9** triggers deprecation warnings from `clickhouse-connect` and `urllib3`/LibreSSL. Not blocking. If upgrading the Python interpreter, recreate the venv.
- **`SELECT *` on `gold.sales` falha com ACCESS_DENIED** — `wren_ia` tem column-level GRANT, então `SELECT *` exige SELECT em todas as colunas (inclusive as 9 revogadas) e ClickHouse rejeita. Listar colunas explicitamente. (Antes do GRANT column-level, `RunSqlTool` salvava o full result em `./data_storage/<user-hash>/query_results_*.csv` — esse fluxo continua válido para queries com colunas explícitas.)
- **REVOKE column-level em `system.*` é silenciosamente ignorado em ClickHouse Cloud** — `REVOKE SELECT(create_table_query) ON system.tables FROM wren_ia` retorna "succeeded" no parser mas runtime continua liberando a leitura. Defesa real é app-side via regex no `RLSClickHouseRunner`.
- **Upstream examples are not all current**: `openai_quickstart.py` imports `OpenAILlmService` from `vanna.integrations.anthropic` (a bug); `mock_sqlite_example.py` calls `agent.send_message(user=...)` but the real signature now takes `request_context=`. Trust `claude_sqlite_example.py` and `vanna/src/vanna/core/agent/agent.py` over the others.

## Database

ClickHouse Cloud, database `gold`, accessed over HTTPS port 8443 with `secure=true`.

**Tabela treinada**: `gold.sales` — analytical sales fact (data: fuel/retail, `CLUBE ALE` / `POP FIDELIDADE`). Filtrada por `program_id` + `store_id` por request via RLS.

**GRANT column-level em `wren_ia`** — 20 colunas das 29 originais:
- `program_id`, `store_id` — necessárias pro RLS funcionar (mas escondidas do LLM via `RLS_INTERNAL_COLS`).
- `sale_id`, `cartao_do_cliente`, `nome_da_rede`, `nome_da_loja`, `tipo_loja`, `categoria_loja`, `nome_do_cliente`, `categoria_do_cliente`, `nome_do_atendente`, `produto`, `categoria_do_produto`, `quantidade_total_produto`, `valor_total_produto`, `desconto`, `pontuacao_produto`, `voucher_aplicado_venda`, `data_da_compra`, `fidelizada`, `e_combustivel` — visíveis ao LLM (18).

9 colunas REVOGADAS (não aparecem em `system.columns`, geram `ACCESS_DENIED` se o LLM tentar): `customer_id`, `customer_category_id`, `sale_product_id`, `product_id`, `product_category_id`, `attendant_id`, `rfid_do_atendente`, `sync_date`. Conferir grants atuais via `SHOW GRANTS FOR wren_ia` no console ClickHouse Cloud (admin) ou `SELECT * FROM system.grants WHERE user_name = 'wren_ia'`.

**Tabela negada**: `gold.vw_relatorios_exportaveis_analitico_vendas` — exportable view; **denied** at runner level (`'0'` filter) and at ClickHouse grant level (sem SELECT).

Credentials live in `.env` (gitignored). RLS values too: `RLS_PROGRAM_ID`, `RLS_STORE_ID` (CLI defaults). Optional: `OPENAI_TEMPERATURE` (default 1.0). Both `agent.py` and `train.py` call `load_dotenv()` at import.

## Where to look in upstream

Server / frontend:
- `vanna/src/vanna/servers/fastapi/app.py` — `VannaFastAPIServer` factory
- `vanna/src/vanna/servers/fastapi/routes.py` — chat SSE/WS/poll endpoints; auto-extracts `RequestContext` from cookies/headers/query_params (line 46-52)
- `vanna/src/vanna/servers/base/templates.py` — index HTML used by `GET /` (login + `<vanna-chat>` embed)
- `vanna/frontends/webcomponent/src/components/vanna-chat.ts` — Web Component attributes (`api-base`, `sse-endpoint`, `starting-state`, `theme`, etc.)
- `vanna/frontends/webcomponent/src/services/api-client.ts` — confirms URL construction is naive `${baseUrl}${endpoint}` concat (so query strings on `sse-endpoint` work)

Core:
- `vanna/src/vanna/core/agent/agent.py` — `Agent` class, `send_message` signature, required init params
- `vanna/src/vanna/core/registry.py` — `ToolRegistry`; the `transform_args()` hook (line 113-142) is the "official" RLS extension point but only handles arg rewriting / `ToolRejection`, not query settings
- `vanna/src/vanna/core/system_prompt/default.py:47-48` — `DefaultSystemPromptBuilder.build_system_prompt`: quando `base_prompt` é não-nulo retorna ele direto, descartando o prompt default do Vanna.
- `vanna/src/vanna/core/user/{models,resolver,request_context}.py` — `User` (with `extra="allow"`), `UserResolver` ABC, `RequestContext`
- `vanna/src/vanna/integrations/{openai,chromadb,clickhouse}/` — integrations this project uses
- `vanna/src/vanna/integrations/plotly/chart_generator.py:51` — heurística "4+ colunas → go.Table" do default generator. Substituímos via `ClubPetroChartGenerator` em `viz_tool.py`.
- `vanna/src/vanna/tools/run_sql.py` — `RunSqlTool` behavior (truncation, CSV side-output)
- `vanna/src/vanna/examples/claude_sqlite_example.py` — closest working reference for the assembly pattern
- `vanna/MIGRATION_GUIDE.md` — only relevant if migrating Vanna 0.x code (not used here)