vanna-clubpetro/CLAUDE.md
leonardosalazar-cp 1d152c0dce Initial commit: Vanna 2.0 deployment for ClubPetro
Wrapper application around upstream Vanna with:
- Tenant-aware ChromaDB memory (per program/store)
- ClickHouse RLS runner with introspection guards
- PT-BR system prompt and chat translations
- Custom Plotly chart generator (ranked bar, datetime coercion)
- Embed bootstrap (theme pierce + i18n + markdown) shared by demo and React app
- Event sink for chat turn observability
2026-04-29 17:22:05 -03:00

30 KiB
Raw Blame History

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

What this project is

A Vanna 2.0 deployment that lets a user ask natural-language questions in Portuguese and get back SQL results from a ClickHouse Cloud database. Not a fork of Vanna — the upstream repo is cloned into vanna/ and installed editable; the application code lives at the project root.

Wiring: OpenAILlmService (LLM) + ChromaAgentMemory (local vector store, persisted to ./chroma_db/) + RLSClickHouseRunner (subclass of upstream ClickHouseRunner that injects ClickHouse additional_table_filters for tenant isolation against gold database in ClickHouse Cloud).

Two ways to query the agent:

  • CLI: python ask.py "your question" (uses StaticUserResolver with IDs from env/flags)
  • Web: uvicorn server:app --port 8765 then embed <vanna-chat> (uses RequestContextUserResolver reading IDs from query params)

⚠️ Vanna-first rule (READ BEFORE WRITING NEW CODE)

Before adding new code, search vanna/src/vanna/ and vanna/frontends/ for an existing solution. This project deliberately uses upstream Vanna primitives wherever possible — only the items in "Intentional custom code" below are project-specific.

When unsure:

  1. grep -r "the-thing-you-want" vanna/src/vanna/ first.
  2. Check vanna/src/vanna/servers/, vanna/src/vanna/integrations/, vanna/src/vanna/core/, vanna/frontends/webcomponent/.
  3. Read vanna/src/vanna/examples/claude_sqlite_example.py — closest reference for assembly patterns.
  4. Only if there's no built-in equivalent, write custom code in the project root and document why in this section.

What we use from Vanna (do NOT reimplement)

  • Server: vanna.servers.fastapi.VannaFastAPIServer — see server.py. Provides /api/vanna/v2/chat_sse|chat_websocket|chat_poll, healthcheck, CORS, RequestContext extraction (cookies+headers+query_params+metadata).
  • Frontend: <vanna-chat> Web Component from vanna/frontends/webcomponent/. Build artifact at vanna/frontends/webcomponent/dist/vanna-components.js (npm-built; gitignored). Floating button via starting-state="minimized". Renders rich components including Plotly charts, sortable/searchable tables, code blocks, status cards, progress bars.
  • Tools: RunSqlTool + VisualizeDataTool from vanna.tools — share a LocalFileSystem(working_directory="./data_storage"); SQL writes CSV → viz reads CSV → emits chart rich component the web component renders via Plotly.
  • Memory tools: SearchSavedCorrectToolUsesTool + SaveQuestionToolArgsTool + SaveTextMemoryTool from vanna.tools — fecham o loop self-learning. ChromaAgentMemory agora também é escrito pelo LLM em runtime (par pergunta→args do run_sql após sucesso), não só pelo train.py (schema docs offline). Orientação de uso vive em system_prompt.py na seção "Memória".
  • Memory: vanna.integrations.chromadb.ChromaAgentMemory for vector store; agent_memory.save_text_memory(content, context) is the canonical write API (usada por train.py; o LLM usa o tool save_text_memory em runtime).
  • User model: vanna.User — has ConfigDict(extra="allow") (vanna/src/vanna/core/user/models.py:25) so we can attach program_id/store_id as ad-hoc fields without subclassing.
  • Resolver ABC: vanna.core.user.UserResolver — base class for both StaticUserResolver and RequestContextUserResolver.
  • RequestContext: extracted automatically by the server (vanna/src/vanna/servers/fastapi/routes.py:46-52) — read from query_params/cookies/headers/metadata in the resolver.
  • SQL runner base: vanna.integrations.clickhouse.ClickHouseRunner — we subclass it, never modify upstream.
  • CLI server runner: vanna.servers.cli.server_runner (the vanna serve command) — we don't use it directly, but it's the reference for how to wire FastAPI + frontend bundle.

Intentional custom code (Vanna has no equivalent)

  • TenantAwareChromaMemory (tenant_memory.py) — Vanna's ChromaAgentMemory é single-collection, sem scoping. Vazaria perguntas/aprendizados entre tenants (program × store) que compartilham o mesmo deploy. Esta classe compõe duas instâncias: collection compartilhada vanna_clickhouse_gold pra text memories (schema docs do train.py — comum a todos), e collection vanna_clickhouse_gold__p<program>__s<store> lazy-criada por (program_id, store_id) pra tool-usage memories. Roteamento por tipo de memória: text → shared, tool → tenant. train.py não muda — escreve só text memories na shared.
  • RLSClickHouseRunner (rls_runner.py) — Vanna's ClickHouseRunner doesn't support per-query settings. We override run_sql to inject additional_table_filters via client.query(sql, settings=...). The ToolRegistry.transform_args() hook is Vanna's "official" RLS extension point but only allows arg rewriting / rejection — it can't reach clickhouse_connect settings, so we keep the runner subclass.
  • _FORBIDDEN_SCHEMA_RE / _INTROSPECTION_STMT_RE (rls_runner.py) — regex guards no topo de run_sql que rejeitam SQL contra system.* / information_schema.* ou statements SHOW/DESCRIBE/EXPLAIN. ClickHouse Cloud não enforça REVOKE column-level em system.tables (acesso herdado de role default), então blindamos app-side. Sem isso, o LLM lia system.tables.create_table_query e via DDL com colunas revogadas.
  • _format_table_filter_map / _quote (rls_runner.py) — clickhouse_connect serializes Python dicts as JSON (double quotes) but ClickHouse's Map literal needs single quotes with '' escape. Hand-built literal is the workaround.
  • _round_decimal_columns (rls_runner.py) — após cada query, percorre as colunas do pd.DataFrame e arredonda valores Decimal/float pra 2 casas (constante _DISPLAY_DECIMALS). Necessário porque ClickHouse retorna Decimal(N, 6) por padrão e o dataframe rich component renderiza via value.toLocaleString() (vanna/frontends/webcomponent/src/components/rich-component-system.ts:624) sem rounding — sem isso, 307427.030000 aparecia na tabela. Patchar o renderer upstream exigiria rebuild npm + perde no git pull; arredondar app-side cobre todas as queries sem depender do LLM emitir round(..., 2) no SELECT. Colunas integer/string/datetime ficam intactas (verificação via pd.api.types.is_float_dtype + sample do object dtype). CSV escrito pelo RunSqlTool herda os valores arredondados — visualize_data tudo bem porque charts já arredondam exibição.
  • StaticUserResolver / RequestContextUserResolver (agent.py) — Vanna ships only the UserResolver ABC, no concrete implementations.
  • system_prompt.py — Vanna tem DefaultSystemPromptBuilder(base_prompt=...) mas nenhum prompt domain-specific. Constante SYSTEM_PROMPT injeta regras pt-BR + confidencialidade + escopo de loja única + métricas padronizadas + formatação R$/L. Edita o arquivo direto pra iterar; sem rebuild, sem re-train.
  • VisualizeDataToolPT + ClubPetroChartGenerator (viz_tool.py) — subclasse de VisualizeDataTool com três customizações sobre upstream: (1) novo arg opcional chart_type (line/bar/scatter/histogram/area) no schema (VisualizeDataArgsPT) — quando o LLM passa, força o tipo via _render_forced; quando omite, cai na heurística como fallback. Threadado via contextvars.ContextVar no execute pra suportar execuções concorrentes do tool singleton (Vanna roda async). (2) description PT-BR com gatilhos por tipo de pergunta (ranking → chart_type='bar', série → 'line', etc.). (3) ClubPetroChartGenerator substitui o default PlotlyChartGenerator upstream — dropa o fallback "4+ colunas → go.Table" (vanna/src/vanna/integrations/plotly/chart_generator.py:51-55) que duplicava o dataframe rich; aplica _coerce_datetime_columns (string ISO → datetime64) nos DOIS caminhos (<4 e >=4 cols), pra que séries temporais sejam detectadas independente do nº de colunas; e usa _create_ranked_bar_chart próprio (em vez do _create_bar_chart upstream que re-agrega com groupby e perde a ordenação descendente do ranking).
  • RLS_INTERNAL_COLS (train.py) — program_id/store_id precisam estar no GRANT (RLS depende deles), mas escondemos da doc do ChromaDB pra não tentar o LLM a usá-los manualmente nas queries.
  • train.py — Vanna has AgentMemory.save_text_memory but no schema crawler. Usa apenas system.columns (filtrado pelo ClickHouse via GRANT column-level, sem dependência de system.tables nem SHOW CREATE TABLE) e emite uma memória de texto por tabela em RLS_TABLES. Sem DDL bruto pra evitar vazamento de colunas revogadas.
  • ask.py — Vanna has only the vanna serve CLI; no ad-hoc vanna ask. This is a thin async wrapper for terminal use.
  • local_request_context() (agent.py) — 1-line factory because Vanna has no default RequestContext constructor.
  • csv_cleanup.py — Vanna upstream não tem GC pros CSVs que RunSqlTool escreve em data_storage/. O LocalFileSystem.write_file só escreve, nunca apaga; sem isso o disco cresce linearmente com o tráfego. Módulo standalone com sweep_once() (síncrono, idempotente — apaga query_results_*.csv com mtime > CSV_TTL_SECONDS, default 1800s) e tarefa asyncio periódica (CSV_SWEEP_INTERVAL_SECONDS, default 600s). server.py pluga via on_event("startup"/"shutdown") — sweep no boot pra recolher legado de runs anteriores e armar o loop de fundo. CLI (ask.py) não roda cleanup em tempo real; depende do próximo boot do servidor.
  • static/vanna-embed-bootstrap.js — fonte única do JS de wiring exigido por todo embed do <vanna-chat>: theme pierce (monkey-patch do setter de adoptedStyleSheets pra forçar o themeSheet sempre como último item — Lit re-assina depois do attachShadow e venceria a cascade sem isso); tradutor PT-BR via MutationObserver em todo shadow root novo; markdown processor pros balões (escapeHtml → code → bold/italic → links → headers ANTES de listas porque a regex de lista consome o \n final e quebraria header subsequente → \n<br> → cleanup <br> adjacente a blocos); load de fonts; injeção do bundle. Servido em /vanna-embed-bootstrap.js (rota explícita no server.py). Expõe window.VannaEmbed.ensureLoaded({ baseUrl, extraCss? }) -> Promise<void> (idempotente). Antes desta extração, a app React (clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts) duplicava ~250 linhas idênticas; toda correção tinha que ser feita em dois lugares e divergia (foi exatamente o que aconteceu com fix de markdown headers — só pegou o demo, app React continuou quebrada). Agora embed-demo.html e vannaChatLoader.ts são thin wrappers que injetam <script src=...> e chamam ensureLoaded. Edição do bootstrap reflete em ambos automaticamente após hard-refresh (sem rebuild npm).
  • EventSink + TurnRecord + ContextVar _current_turn (events_sink.py) — grava 1 row por turno de chat em events.vanna_ai (ClickHouse). Vanna tem observability_provider (spans/metrics) mas não tem persistência estruturada de interação fim-a-fim. chat_filter.py cria o TurnRecord no início do handle_stream, captura question (de request.message), program_id/store_id/user_id (de request_context.query_params — frontend do web app deve passar os 3 como query string nos endpoints sse-endpoint/ws-endpoint/poll-endpoint do <vanna-chat>; user_id defaulta a string vazia se ausente, sem quebrar o chat), response (concatenando rich.type=='text' chunks), status/error_message (do try/except). Hooks downstream escrevem no mesmo TurnRecord via ContextVar: EventCapturingToolRegistry.transform_args (agent.py) registra cada tool call e captura args.sql quando tool.name == "run_sql"; VisualizeDataToolPT.execute registra (chart_type, title). Flush no finally do stream via EventSink.flush (asyncio.to_thread pro client.insert sync; try/except envolve tudo — falha de insert nunca quebra a resposta ao usuário). Cliente CH lazy, mesmas creds do RLS runner — wren_ia precisa GRANT INSERT ON events.vanna_ai.

Commands

All commands assume the venv is active:

source venv/bin/activate

Common workflows:

python ask.py "your question"                                  # CLI (uses .env RLS_PROGRAM_ID/RLS_STORE_ID)
python ask.py --program-id <id> --store-id <id> "..."          # CLI with override
python train.py                                                # re-extract schema (only RLS_TABLES) into ChromaDB
python test_clickhouse.py                                      # raw connectivity smoke test (no LLM)
uvicorn server:app --host 127.0.0.1 --port 8765                # web server

Web component build (one-time, after git pull of upstream or after upstream version bump):

cd vanna/frontends/webcomponent
npm install
npm run build
# outputs vanna/frontends/webcomponent/dist/vanna-components.js (~7.5 MB)

Re-installing or pulling upstream Vanna changes:

cd vanna && git pull && cd ..
pip install -e ./vanna

Adding more LLM/DB/vector extras (Vanna defines them in vanna/pyproject.toml):

pip install <package>              # extras shorthand `pip install -e './vanna[xxx]'` fails on macOS file URI

Architecture

Agent assembly (agent.py)

build_agent(program_id=None, store_id=None, user_resolver=None) is the single source of truth. Vanna 2.0's Agent.__init__ requires both agent_memory and user_resolver — no defaults. Two resolver flavors:

  • StaticUserResolver — fixed User from env/flags. CLI default.
  • RequestContextUserResolver — reads program_id/store_id from request_context.query_params (with metadata fallback) per request. Validates with _require_id (regex ^[A-Za-z0-9_-]+$); raises PermissionError on missing/invalid. Web default.

ChromaAgentMemory(persist_directory="./chroma_db", collection_name="vanna_clickhouse_gold") is shared by both.

Agent uses temperature=float(os.environ.get("OPENAI_TEMPERATURE", "1.0")). Default 1.0 mantém compat com modelos de reasoning/gpt-5* que rejeitam outros valores. Pra modelos que aceitam ajuste (ex.: gpt-4o), set OPENAI_TEMPERATURE=0.2 no .env pra mais determinismo na geração de SQL.

System prompt customizado vem de system_prompt.SYSTEM_PROMPT via DefaultSystemPromptBuilder(base_prompt=...). Quando base_prompt é não-nulo (vanna/src/vanna/core/system_prompt/default.py:47-48), substitui o prompt default do Vanna; o LlmContextEnhancer ainda anexa as memórias retrievadas do ChromaDB depois.

RLS (rls_runner.py)

RLSClickHouseRunner extends ClickHouseRunner. Em run_sql faz dois trabalhos:

1. Bloqueio app-side de introspecção (regex guards no topo da função, antes de tudo):

  • _FORBIDDEN_SCHEMA_RE — rejeita SQL referenciando system.* ou information_schema.*. ClickHouse Cloud não enforça REVOKE column-level em system.tables (acesso vem de role default não-revogável), então é app-side ou nada.
  • _INTROSPECTION_STMT_RE — rejeita statements SHOW/DESCRIBE/EXPLAIN. Sem isso, o LLM bypassava system.* via SHOW TABLES FROM gold ou DESCRIBE TABLE gold.sales e descobria colunas revogadas.

Ambos levantam PermissionError com mensagem orientando o LLM a usar o contexto treinado.

2. Injeção de RLS (additional_table_filters):

  • RLS_TABLES = ("gold.sales",) — receives a program_id = '...' AND store_id = '...' filter built from context.user.program_id / store_id.
  • DENIED_TABLES = ("gold.vw_relatorios_exportaveis_analitico_vendas",) — receives the literal 0 filter (zero rows). Defense in depth — the ClickHouse user wren_ia already lacks SELECT grant on the view.

train.py imports RLS_TABLES so the trained schema docs match the RLS scope (single source of truth).

The filter expression is built as a hand-formatted ClickHouse Map literal because clickhouse_connect JSON-serializes Python dicts. See _format_table_filter_map.

Tools registered (agent.py)

Two tools sharing a LocalFileSystem(working_directory="./data_storage"):

  • RunSqlTool(sql_runner=RLSClickHouseRunner(...), file_system=fs) — executes RLS-filtered SQL, dumps full result to CSV in ./data_storage/<user-hash>/query_results_*.csv, returns truncated preview to the LLM.
  • VisualizeDataToolPT(file_system=fs) — subclasse local de VisualizeDataTool (viz_tool.py) com ClubPetroChartGenerator injetado e arg chart_type opcional. Lê o CSV da rodada anterior, emite um chart rich component (Plotly figure JSON). Não usa o PlotlyChartGenerator default — ver "Intentional custom code" pros detalhes (drop de go.Table, coerção de datetime uniforme, ranked bar). The web component renders it; the CLI just reports "Created visualization from ".

Both registered via tools.register_local_tool(tool, access_groups=[]) (Vanna's register(tool) shorthand exists in some upstream examples but isn't the current API). Empty access_groups=[] means accessible to all users.

Theming (static/vanna-theme.css + adoptedStyleSheets pierce)

<vanna-chat> exposes ~50 CSS custom properties (vanna/frontends/webcomponent/src/styles/vanna-design-tokens.ts). Each internal custom element (vanna-message, vanna-status-bar, vanna-progress-tracker, plotly-chart, rich-card, rich-task-list, rich-progress-bar) re-imports vannaDesignTokens and re-declares the literals on its own :host. A <link rel="stylesheet"> in the document only retemas vanna-chat itself — nested elements live in encapsulated shadow trees that document selectors can't reach, and their internal :host rules shadow any inherited custom property.

O pierce vive em static/vanna-embed-bootstrap.js (ver "Intentional custom code"): (a) cria um CSSStyleSheet construído, (b) monkey-patcha o setter de ShadowRoot.prototype.adoptedStyleSheets pra sempre mover esse sheet pro fim do array (necessário porque Lit re-assina o array depois do attachShadow e venceria a cascade sem isso), (c) fetch()-a /vanna-theme.css e chama replaceSync() pra popular. Adopted sheets cascateiam depois do static styles do próprio component, então com :host specificity igual a regra adoptada vence.

static/vanna-theme.css uses selector :host, vanna-chat { ... } so the same file works in both contexts:

  • <link> in the document → vanna-chat { ... } matches the host element.
  • adoptedStyleSheets inside each shadow root → :host { ... } matches the shadow host of that root.

No JS rebuild needed to change colors/fonts/spacing — edit static/vanna-theme.css and hard-refresh. Don't modify vanna/frontends/webcomponent/src/styles/vanna-design-tokens.ts (upstream, would be lost on git pull).

@import of Google Fonts inside vanna-theme.css doesn't work because constructed CSSStyleSheet (via replaceSync) silently strips @import. The bootstrap injeta <link> no <head> via loadFontsOnce.

A sidebar (<vanna-progress-tracker>) é escondida pelo lado consumer (não pelo bootstrap) — em embed-demo.html e em VannaChat.tsx, depois de customElements.whenDefined("vanna-chat") resolver, seta el.showProgress = false. Lit Boolean property attributes default-true não desligam via HTML attribute (qualquer presença = true).

Embed bootstrap (static/vanna-embed-bootstrap.js)

Bootstrap único servido pelo Vanna server em /vanna-embed-bootstrap.js (rota no server.py). Concentra todo o JS de wiring; consumido por:

  • static/embed-demo.html (smoke test, mesma origem) — <script src="/vanna-embed-bootstrap.js"> + chamada VannaEmbed.ensureLoaded({ baseUrl: "" }).
  • clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts (app React, cross-origin) — injeta <script src="${baseUrl}/vanna-embed-bootstrap.js"> dinamicamente, espera window.VannaEmbed, chama ensureLoaded({ baseUrl, extraCss }). extraCss carrega override do avatar logo (SVG mora no /public/ do CRA, não no Vanna server).

Antes desta extração, a app React duplicava ~250 linhas idênticas — fix de markdown header rodou só no demo e a app React continuou quebrada. Lição: toda mudança em theme/i18n/markdown agora vive só em static/vanna-embed-bootstrap.js. Hard-refresh nas duas pontas pega automático.

Pra produção, a página do cliente carrega:

<link rel="stylesheet" href="https://SEU-VANNA/vanna-theme.css">
<script src="https://SEU-VANNA/vanna-embed-bootstrap.js"></script>

<vanna-chat id="chat" ...></vanna-chat>

<script>
  window.VannaEmbed.ensureLoaded({ baseUrl: "https://SEU-VANNA" }).then(() => {
    document.getElementById("chat").showProgress = false;
  });
</script>

Fonts são carregadas pelo bootstrap (loadFontsOnce) — o cliente não precisa mais incluir <link> de Google Fonts manualmente. Bundle (vanna-components.js) também é injetado pelo bootstrap (injectBundle).

Filtragem + tradução de chunks (chat_filter.py)

FilteringChatHandler (subclasse de vanna.servers.base.ChatHandler) é injetado em VannaFastAPIServer.chat_handler antes de create_app(). Ele intercepta o stream de ChatStreamChunk e:

  1. Whitelist de rich.type — drops chunks cujo tipo não está em ALLOWED_RICH_TYPES = {text, dataframe, chart, status_bar_update, chat_input_update}. Tipos extras que o agent emite (status_card, task_tracker_update, notification, log_viewer, progress_display, etc.) somem antes de virar SSE.
  2. Tradução PT — strings hardcoded em inglês emitidas por vanna/src/vanna/core/agent/agent.py (Response complete, Ready for next message, Processing your request..., etc.) são substituídas via tabela TRANSLATIONS exact-match nos campos message, detail, placeholder do rich.data. Strings que caem fora da tabela (incluindo dinâmicas tipo Running 3 tools) passam intocadas — adicionar entradas conforme aparecerem no chat.

Charts: o ClubPetroChartGenerator (viz_tool.py) substitui o default upstream e dropa o go.Table fallback de 4+ colunas, então não tem dataframe duplicado visualmente. O LLM controla o tipo via arg chart_type no visualize_data (line/bar/scatter/histogram/area); quando omite, cai numa heurística por shape do CSV.

Pra adicionar mais tipos permitidos ou novas traduções: editar chat_filter.py. Sem rebuild de bundle, sem mexer em upstream.

Web server (server.py)

server.py is ~60 lines: VannaFastAPIServer(agent, config={cors:..., api_base_url:""}).create_app(), mounts /static/ to vanna/frontends/webcomponent/dist/, adiciona rotas estáticas pra arquivos do project-root static/:

  • /vanna-theme.css — tema CSS adoptado em todo shadow root.
  • /vanna-embed-bootstrap.js — JS único de wiring (theme pierce + tradutor + markdown + bundle loader). Consumido por embed-demo.html e pela app React.
  • /clubpetro-logo.{png,svg}, /dashboard-bg.png — assets.
  • /embed-demo.html — smoke test page que substitui __PROGRAM_ID__ / __STORE_ID__ / __USER_ID__ placeholders com valores do .env em runtime (lido fresh a cada request — sem restart pra mudanças em HTML).

All chat routes (/api/vanna/v2/chat_sse|chat_websocket|chat_poll) come from upstream — do not redefine them.

The web component (<vanna-chat>) sends program_id / store_id as query params on the endpoint URL: sse-endpoint="/api/vanna/v2/chat_sse?program_id=X&store_id=Y". The upstream server populates RequestContext.query_params from dict(http_request.query_params), so RequestContextUserResolver picks them up.

Training flow (train.py)

Vanna 2.0 has no separate "training" API. Schema knowledge is injected by saving text memories into the same ChromaAgentMemory instance the agent reads from. DefaultLlmContextEnhancer (auto-wired when no enhancer is passed) retrieves the top-k similar text memories and appends them to the system prompt on every turn.

train.py itera tabelas em RLS_TABLES, lê system.columns (que o ClickHouse já filtra por GRANT column-level — não precisa privilege em system.tables), filtra colunas em RLS_INTERNAL_COLS = {"program_id", "store_id"} pra escondê-las do contexto do LLM, e salva uma memória de texto por tabela com lista de colunas + 3 sample rows (sem DDL, sem SHOW CREATE TABLE).

Re-rodar após:

  • mudança de GRANT no ClickHouse (coluna nova → aparece automaticamente; coluna revogada → some).
  • edição de RLS_TABLES.
  • edição de RLS_INTERNAL_COLS.

Sequência: rm -rf chroma_db/ && python train.py.

When constructing a ToolContext manually (as in train.py), the agent_memory= field is required by Pydantic.

CLI (ask.py)

argparse with --program-id / --store-id flags + positional question. Calls build_agent(program_id, store_id). Agent.send_message is an async generator yielding UiComponent objects. Each component has both a rich_component (structured for the web UI) and a simple_component (text fallback). The CLI prefers rich_component.content, falls back to simple_component.text.

Non-obvious gotchas

  • ToolRegistry API: use register_local_tool(tool, access_groups=[]) — there is no plain .register() method despite what some upstream examples suggest.
  • Empty access_groups=[] means the tool is accessible to all users; non-empty is a permission allowlist.
  • additional_table_filters serialization: clickhouse_connect serializes Python dict settings as JSON (double quotes), which ClickHouse rejects with CANNOT_PARSE_QUOTED_STRING (code 26). The Map value must be passed as a pre-formatted string literal with single quotes and '' escape — see _format_table_filter_map in rls_runner.py. Don't pass a dict to client.query(..., settings={"additional_table_filters": {...}}).
  • ChromaDB first run downloads ~80 MB ONNX embedding model to ~/.cache/chroma/. Cached afterwards. If running offline, pre-download or pass a custom embedding_function.
  • Editable extras on macOS: pip install -e './vanna[clickhouse]' fails with non-local file URIs are not supported. Install the editable package and the extras separately.
  • Python 3.9 triggers deprecation warnings from clickhouse-connect and urllib3/LibreSSL. Not blocking. If upgrading the Python interpreter, recreate the venv.
  • SELECT * on gold.sales falha com ACCESS_DENIEDwren_ia tem column-level GRANT, então SELECT * exige SELECT em todas as colunas (inclusive as 9 revogadas) e ClickHouse rejeita. Listar colunas explicitamente. (Antes do GRANT column-level, RunSqlTool salvava o full result em ./data_storage/<user-hash>/query_results_*.csv — esse fluxo continua válido para queries com colunas explícitas.)
  • REVOKE column-level em system.* é silenciosamente ignorado em ClickHouse CloudREVOKE SELECT(create_table_query) ON system.tables FROM wren_ia retorna "succeeded" no parser mas runtime continua liberando a leitura. Defesa real é app-side via regex no RLSClickHouseRunner.
  • Upstream examples are not all current: openai_quickstart.py imports OpenAILlmService from vanna.integrations.anthropic (a bug); mock_sqlite_example.py calls agent.send_message(user=...) but the real signature now takes request_context=. Trust claude_sqlite_example.py and vanna/src/vanna/core/agent/agent.py over the others.

Database

ClickHouse Cloud, database gold, accessed over HTTPS port 8443 with secure=true.

Tabela treinada: gold.sales — analytical sales fact (data: fuel/retail, CLUBE ALE / POP FIDELIDADE). Filtrada por program_id + store_id por request via RLS.

GRANT column-level em wren_ia — 20 colunas das 29 originais:

  • program_id, store_id — necessárias pro RLS funcionar (mas escondidas do LLM via RLS_INTERNAL_COLS).
  • sale_id, cartao_do_cliente, nome_da_rede, nome_da_loja, tipo_loja, categoria_loja, nome_do_cliente, categoria_do_cliente, nome_do_atendente, produto, categoria_do_produto, quantidade_total_produto, valor_total_produto, desconto, pontuacao_produto, voucher_aplicado_venda, data_da_compra, fidelizada, e_combustivel — visíveis ao LLM (18).

9 colunas REVOGADAS (não aparecem em system.columns, geram ACCESS_DENIED se o LLM tentar): customer_id, customer_category_id, sale_product_id, product_id, product_category_id, attendant_id, rfid_do_atendente, sync_date. Conferir grants atuais via SHOW GRANTS FOR wren_ia no console ClickHouse Cloud (admin) ou SELECT * FROM system.grants WHERE user_name = 'wren_ia'.

Tabela negada: gold.vw_relatorios_exportaveis_analitico_vendas — exportable view; denied at runner level ('0' filter) and at ClickHouse grant level (sem SELECT).

Credentials live in .env (gitignored). RLS values too: RLS_PROGRAM_ID, RLS_STORE_ID (CLI defaults). Optional: OPENAI_TEMPERATURE (default 1.0). Both agent.py and train.py call load_dotenv() at import.

Where to look in upstream

Server / frontend:

  • vanna/src/vanna/servers/fastapi/app.pyVannaFastAPIServer factory
  • vanna/src/vanna/servers/fastapi/routes.py — chat SSE/WS/poll endpoints; auto-extracts RequestContext from cookies/headers/query_params (line 46-52)
  • vanna/src/vanna/servers/base/templates.py — index HTML used by GET / (login + <vanna-chat> embed)
  • vanna/frontends/webcomponent/src/components/vanna-chat.ts — Web Component attributes (api-base, sse-endpoint, starting-state, theme, etc.)
  • vanna/frontends/webcomponent/src/services/api-client.ts — confirms URL construction is naive ${baseUrl}${endpoint} concat (so query strings on sse-endpoint work)

Core:

  • vanna/src/vanna/core/agent/agent.pyAgent class, send_message signature, required init params
  • vanna/src/vanna/core/registry.pyToolRegistry; the transform_args() hook (line 113-142) is the "official" RLS extension point but only handles arg rewriting / ToolRejection, not query settings
  • vanna/src/vanna/core/system_prompt/default.py:47-48DefaultSystemPromptBuilder.build_system_prompt: quando base_prompt é não-nulo retorna ele direto, descartando o prompt default do Vanna.
  • vanna/src/vanna/core/user/{models,resolver,request_context}.pyUser (with extra="allow"), UserResolver ABC, RequestContext
  • vanna/src/vanna/integrations/{openai,chromadb,clickhouse}/ — integrations this project uses
  • vanna/src/vanna/integrations/plotly/chart_generator.py:51 — heurística "4+ colunas → go.Table" do default generator. Substituímos via ClubPetroChartGenerator em viz_tool.py.
  • vanna/src/vanna/tools/run_sql.pyRunSqlTool behavior (truncation, CSV side-output)
  • vanna/src/vanna/examples/claude_sqlite_example.py — closest working reference for the assembly pattern
  • vanna/MIGRATION_GUIDE.md — only relevant if migrating Vanna 0.x code (not used here)