Wrapper application around upstream Vanna with:
- Tenant-aware ChromaDB memory (per program/store)
- ClickHouse RLS runner with introspection guards
- PT-BR system prompt and chat translations
- Custom Plotly chart generator (ranked bar, datetime coercion)
- Embed bootstrap (theme pierce + i18n + markdown) shared by demo and React app
- Event sink for chat turn observability
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
What this project is
A Vanna 2.0 deployment that lets a user ask natural-language questions in Portuguese and get back SQL results from a ClickHouse Cloud database. Not a fork of Vanna — the upstream repo is cloned into vanna/ and installed editable; the application code lives at the project root.
Wiring: OpenAILlmService (LLM) + ChromaAgentMemory (local vector store, persisted to ./chroma_db/) + RLSClickHouseRunner (subclass of upstream ClickHouseRunner that injects ClickHouse additional_table_filters for tenant isolation against gold database in ClickHouse Cloud).
Two ways to query the agent:
- CLI: python ask.py "your question" (uses StaticUserResolver with IDs from env/flags)
- Web: uvicorn server:app --port 8765, then embed <vanna-chat> (uses RequestContextUserResolver reading IDs from query params)
⚠️ Vanna-first rule (READ BEFORE WRITING NEW CODE)
Before adding new code, search vanna/src/vanna/ and vanna/frontends/ for an existing solution. This project deliberately uses upstream Vanna primitives wherever possible — only the items in "Intentional custom code" below are project-specific.
When unsure:
- Run grep -r "the-thing-you-want" vanna/src/vanna/ first.
- Check vanna/src/vanna/servers/, vanna/src/vanna/integrations/, vanna/src/vanna/core/, vanna/frontends/webcomponent/.
- Read vanna/src/vanna/examples/claude_sqlite_example.py, the closest reference for assembly patterns.
- Only if there's no built-in equivalent, write custom code in the project root and document why in this section.
What we use from Vanna (do NOT reimplement)
- Server: vanna.servers.fastapi.VannaFastAPIServer (see server.py). Provides /api/vanna/v2/chat_sse|chat_websocket|chat_poll, healthcheck, CORS, and RequestContext extraction (cookies + headers + query_params + metadata).
- Frontend: <vanna-chat> Web Component from vanna/frontends/webcomponent/. Build artifact at vanna/frontends/webcomponent/dist/vanna-components.js (npm-built; gitignored). Floating button via starting-state="minimized". Renders rich components including Plotly charts, sortable/searchable tables, code blocks, status cards, progress bars.
- Tools: RunSqlTool + VisualizeDataTool from vanna.tools. They share a LocalFileSystem(working_directory="./data_storage"); SQL writes CSV → viz reads CSV → emits a chart rich component that the web component renders via Plotly.
- Memory tools: SearchSavedCorrectToolUsesTool + SaveQuestionToolArgsTool + SaveTextMemoryTool from vanna.tools. They close the self-learning loop. ChromaAgentMemory is now also written by the LLM at runtime (a question → run_sql args pair after a success), not only by train.py (offline schema docs). Usage guidance lives in system_prompt.py, in the "Memória" section.
- Memory: vanna.integrations.chromadb.ChromaAgentMemory for the vector store; agent_memory.save_text_memory(content, context) is the canonical write API (used by train.py; the LLM uses the save_text_memory tool at runtime).
- User model: vanna.User has ConfigDict(extra="allow") (vanna/src/vanna/core/user/models.py:25), so we can attach program_id / store_id as ad-hoc fields without subclassing.
- Resolver ABC: vanna.core.user.UserResolver, the base class for both StaticUserResolver and RequestContextUserResolver.
- RequestContext: extracted automatically by the server (vanna/src/vanna/servers/fastapi/routes.py:46-52); read from query_params / cookies / headers / metadata in the resolver.
- SQL runner base: vanna.integrations.clickhouse.ClickHouseRunner. We subclass it and never modify upstream.
- CLI server runner: vanna.servers.cli.server_runner (the vanna serve command). We don't use it directly, but it's the reference for how to wire FastAPI + frontend bundle.
Intentional custom code (Vanna has no equivalent)
- TenantAwareChromaMemory (tenant_memory.py): Vanna's ChromaAgentMemory is single-collection, with no scoping. It would leak questions and learnings between tenants (program × store) that share the same deployment. This class composes two instances: a shared collection, vanna_clickhouse_gold, for text memories (schema docs from train.py, common to everyone), and a collection vanna_clickhouse_gold__p<program>__s<store>, lazily created per (program_id, store_id), for tool-usage memories. Routing is by memory type: text → shared, tool → tenant. train.py is unchanged: it writes only text memories to the shared collection.
- RLSClickHouseRunner (rls_runner.py): Vanna's ClickHouseRunner doesn't support per-query settings. We override run_sql to inject additional_table_filters via client.query(sql, settings=...). The ToolRegistry.transform_args() hook is Vanna's "official" RLS extension point but only allows arg rewriting / rejection; it can't reach clickhouse_connect settings, so we keep the runner subclass.
- _FORBIDDEN_SCHEMA_RE / _INTROSPECTION_STMT_RE (rls_runner.py): regex guards at the top of run_sql that reject SQL against system.* / information_schema.* or SHOW / DESCRIBE / EXPLAIN statements. ClickHouse Cloud does not enforce column-level REVOKE on system.tables (access is inherited from the default role), so we harden app-side. Without this, the LLM would read system.tables.create_table_query and see DDL with revoked columns.
- _format_table_filter_map / _quote (rls_runner.py): clickhouse_connect serializes Python dicts as JSON (double quotes), but ClickHouse's Map literal needs single quotes with '' escape. A hand-built literal is the workaround.
- _round_decimal_columns (rls_runner.py): after each query, walks the columns of the pd.DataFrame and rounds Decimal / float values to 2 places (constant _DISPLAY_DECIMALS). Needed because ClickHouse returns Decimal(N, 6) by default and the dataframe rich component renders via value.toLocaleString() (vanna/frontends/webcomponent/src/components/rich-component-system.ts:624) with no rounding; without this, 307427.030000 showed up in the table. Patching the upstream renderer would require an npm rebuild and would be lost on git pull; rounding app-side covers every query without depending on the LLM emitting round(..., 2) in the SELECT. Integer/string/datetime columns are left intact (checked via pd.api.types.is_float_dtype + a sample of the object dtype). The CSV written by RunSqlTool inherits the rounded values, which is fine for visualize_data because charts already round on display.
- StaticUserResolver / RequestContextUserResolver (agent.py): Vanna ships only the UserResolver ABC, no concrete implementations.
- system_prompt.py: Vanna has DefaultSystemPromptBuilder(base_prompt=...) but no domain-specific prompt. The SYSTEM_PROMPT constant injects pt-BR rules + confidentiality + single-store scope + standardized metrics + R$/L formatting. Edit the file directly to iterate; no rebuild, no re-train.
- VisualizeDataToolPT + ClubPetroChartGenerator (viz_tool.py): a subclass of VisualizeDataTool with three customizations over upstream. (1) A new optional chart_type arg (line/bar/scatter/histogram/area) in the schema (VisualizeDataArgsPT): when the LLM passes it, the type is forced via _render_forced; when omitted, the heuristic is the fallback. The value is threaded through a contextvars.ContextVar in execute to support concurrent executions of the singleton tool (Vanna runs async). (2) A PT-BR description with triggers per question type (ranking → chart_type='bar', series → 'line', etc.). (3) ClubPetroChartGenerator replaces the default upstream PlotlyChartGenerator: it drops the "4+ columns → go.Table" fallback (vanna/src/vanna/integrations/plotly/chart_generator.py:51-55), which duplicated the dataframe rich component; applies _coerce_datetime_columns (ISO string → datetime64) on BOTH paths (<4 and >=4 cols), so time series are detected regardless of the number of columns; and uses its own _create_ranked_bar_chart (instead of the upstream _create_bar_chart, which re-aggregates with groupby and loses the descending order of the ranking).
- RLS_INTERNAL_COLS (train.py): program_id / store_id must be in the GRANT (RLS depends on them), but we hide them from the ChromaDB docs so the LLM is not tempted to use them manually in queries.
- train.py: Vanna has AgentMemory.save_text_memory but no schema crawler. It uses only system.columns (filtered by ClickHouse via column-level GRANT, with no dependency on system.tables or SHOW CREATE TABLE) and emits one text memory per table in RLS_TABLES. No raw DDL, to avoid leaking revoked columns.
- ask.py: Vanna has only the vanna serve CLI; there is no ad-hoc vanna ask. This is a thin async wrapper for terminal use.
- local_request_context() (agent.py): a 1-line factory, because Vanna has no default RequestContext constructor.
- csv_cleanup.py: upstream Vanna has no GC for the CSVs that RunSqlTool writes to data_storage/. LocalFileSystem.write_file only writes and never deletes; without cleanup the disk grows linearly with traffic. It is a standalone module with sweep_once() (synchronous, idempotent; deletes query_results_*.csv files whose mtime is older than CSV_TTL_SECONDS, default 1800s) and a periodic asyncio task (CSV_SWEEP_INTERVAL_SECONDS, default 600s). server.py plugs it in via on_event("startup"/"shutdown"): a sweep at boot collects leftovers from previous runs and arms the background loop. The CLI (ask.py) does not run cleanup in real time; it depends on the next server boot.
- static/vanna-embed-bootstrap.js: the single source for the wiring JS required by every <vanna-chat> embed: theme pierce (monkey-patches the adoptedStyleSheets setter to force themeSheet to always be the last item; Lit reassigns the array after attachShadow and would otherwise win the cascade); PT-BR translator via a MutationObserver on every new shadow root; markdown processor for the chat bubbles (escapeHtml → code → bold/italic → links → headers, which run BEFORE lists because the list regex consumes the trailing \n and would break a following header → \n to <br> → cleanup of <br> adjacent to blocks); font loading; bundle injection. Served at /vanna-embed-bootstrap.js (explicit route in server.py). Exposes window.VannaEmbed.ensureLoaded({ baseUrl, extraCss? }) -> Promise<void> (idempotent). Before this extraction, the React app (clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts) duplicated ~250 identical lines; every fix had to be made in two places and the copies diverged (exactly what happened with the markdown headers fix: it reached only the demo and the React app stayed broken). Now embed-demo.html and vannaChatLoader.ts are thin wrappers that inject <script src=...> and call ensureLoaded. Edits to the bootstrap are reflected in both automatically after a hard-refresh (no npm rebuild).
- EventSink + TurnRecord + ContextVar _current_turn (events_sink.py): writes 1 row per chat turn to events.vanna_ai (ClickHouse). Vanna has observability_provider (spans/metrics) but no structured persistence of the end-to-end interaction. chat_filter.py creates the TurnRecord at the start of handle_stream and captures question (from request.message), program_id / store_id / user_id (from request_context.query_params; the web app frontend must pass all 3 as a query string on the <vanna-chat> sse-endpoint / ws-endpoint / poll-endpoint attributes; user_id defaults to an empty string if absent, without breaking the chat), response (by concatenating rich.type == 'text' chunks), and status / error_message (from the try/except). Downstream hooks write to the same TurnRecord via the ContextVar: EventCapturingToolRegistry.transform_args (agent.py) records each tool call and captures args.sql when tool.name == "run_sql"; VisualizeDataToolPT.execute records (chart_type, title). The flush happens in the stream's finally block via EventSink.flush (asyncio.to_thread for the sync client.insert; a try/except wraps everything, so an insert failure never breaks the response to the user). The CH client is lazy and uses the same credentials as the RLS runner; wren_ia needs GRANT INSERT ON events.vanna_ai.
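As an illustration of the csv_cleanup.py behavior described above, here is a minimal sketch of an idempotent sweep. Only the name sweep_once, the query_results_*.csv pattern, and the 1800s default come from this document; the parameters (root, ttl_seconds, now), the recursive search, and the environment lookup are assumptions, and the real module may differ.

```python
import os
import time
from pathlib import Path
from typing import Optional

# Default TTL from the description above; reading it from the environment is an assumption.
CSV_TTL_SECONDS = int(os.environ.get("CSV_TTL_SECONDS", "1800"))


def sweep_once(root: str = "./data_storage",
               ttl_seconds: int = CSV_TTL_SECONDS,
               now: Optional[float] = None) -> int:
    """Delete query_results_*.csv files older than ttl_seconds; return how many were removed."""
    now = time.time() if now is None else now
    base = Path(root)
    if not base.is_dir():
        return 0
    removed = 0
    # Results live in per-user subdirectories, so the search is recursive.
    for path in base.rglob("query_results_*.csv"):
        try:
            if now - path.stat().st_mtime > ttl_seconds:
                path.unlink()
                removed += 1
        except FileNotFoundError:
            # The file vanished between listing and deletion; skipping keeps the sweep idempotent.
            continue
    return removed
```

Calling the function twice in a row is safe: the second call finds nothing left to delete and returns 0.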
Commands
All commands assume the venv is active:
source venv/bin/activate
Common workflows:
python ask.py "your question" # CLI (uses .env RLS_PROGRAM_ID/RLS_STORE_ID)
python ask.py --program-id <id> --store-id <id> "..." # CLI with override
python train.py # re-extract schema (only RLS_TABLES) into ChromaDB
python test_clickhouse.py # raw connectivity smoke test (no LLM)
uvicorn server:app --host 127.0.0.1 --port 8765 # web server
Web component build (one-time, after git pull of upstream or after upstream version bump):
cd vanna/frontends/webcomponent
npm install
npm run build
# outputs vanna/frontends/webcomponent/dist/vanna-components.js (~7.5 MB)
Re-installing or pulling upstream Vanna changes:
cd vanna && git pull && cd ..
pip install -e ./vanna
Adding more LLM/DB/vector extras (Vanna defines them in vanna/pyproject.toml):
pip install <package> # extras shorthand `pip install -e './vanna[xxx]'` fails on macOS file URI
Architecture
Agent assembly (agent.py)
build_agent(program_id=None, store_id=None, user_resolver=None) is the single source of truth. Vanna 2.0's Agent.__init__ requires both agent_memory and user_resolver — no defaults. Two resolver flavors:
- StaticUserResolver: fixed User from env/flags. CLI default.
- RequestContextUserResolver: reads program_id / store_id from request_context.query_params (with metadata fallback) per request. Validates with _require_id (regex ^[A-Za-z0-9_-]+$); raises PermissionError on missing/invalid. Web default.
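The ID validation in RequestContextUserResolver can be sketched without any Vanna imports. This is a hedged illustration, not the project's code: require_id and resolve_tenant are hypothetical names, and plain mappings stand in for the RequestContext fields. Only the regex, the metadata fallback, and the PermissionError come from the description above.

```python
import re
from typing import Mapping, Optional, Tuple

# Same character class as the documented regex ^[A-Za-z0-9_-]+$.
_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")


def require_id(name: str, value: Optional[str]) -> str:
    """Return value if it is a safe identifier, else raise PermissionError."""
    # fullmatch is used so a trailing newline cannot slip past `$`.
    if value is None or not _ID_RE.fullmatch(value):
        raise PermissionError(f"missing or invalid {name}")
    return value


def resolve_tenant(query_params: Mapping[str, str],
                   metadata: Mapping[str, str]) -> Tuple[str, str]:
    """Read program_id / store_id from query params, falling back to metadata."""
    program_id = query_params.get("program_id") or metadata.get("program_id")
    store_id = query_params.get("store_id") or metadata.get("store_id")
    return require_id("program_id", program_id), require_id("store_id", store_id)
```

Rejecting anything outside the allowlist before the value reaches the RLS filter means a quote or space in a query param fails early instead of ending up inside a SQL expression.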
ChromaAgentMemory(persist_directory="./chroma_db", collection_name="vanna_clickhouse_gold") is shared by both.
Agent uses temperature=float(os.environ.get("OPENAI_TEMPERATURE", "1.0")). The default of 1.0 keeps compatibility with reasoning/gpt-5* models that reject other values. For models that accept tuning (e.g. gpt-4o), set OPENAI_TEMPERATURE=0.2 in .env for more deterministic SQL generation.
The custom system prompt comes from system_prompt.SYSTEM_PROMPT via DefaultSystemPromptBuilder(base_prompt=...). When base_prompt is non-null (vanna/src/vanna/core/system_prompt/default.py:47-48), it replaces Vanna's default prompt; the LlmContextEnhancer still appends the memories retrieved from ChromaDB afterwards.
RLS (rls_runner.py)
RLSClickHouseRunner extends ClickHouseRunner. In run_sql it does two jobs:
1. App-side introspection blocking (regex guards at the top of the function, before anything else):
- _FORBIDDEN_SCHEMA_RE: rejects SQL referencing system.* or information_schema.*. ClickHouse Cloud does not enforce column-level REVOKE on system.tables (access comes from a non-revocable default role), so it is app-side or nothing.
- _INTROSPECTION_STMT_RE: rejects SHOW / DESCRIBE / EXPLAIN statements. Without this, the LLM bypassed the system.* block via SHOW TABLES FROM gold or DESCRIBE TABLE gold.sales and discovered revoked columns.
Both raise PermissionError with a message steering the LLM to use the trained context.
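A rough sketch of what such guards could look like follows. The patterns and the check_sql_allowed helper are assumptions written for illustration; the actual expressions in rls_runner.py are not shown in this document and may be stricter.

```python
import re

# Hypothetical patterns; the real _FORBIDDEN_SCHEMA_RE / _INTROSPECTION_STMT_RE may differ.
FORBIDDEN_SCHEMA_RE = re.compile(
    r"[`\"]?\b(?:system|information_schema)\b[`\"]?\s*\.",
    re.IGNORECASE,
)
INTROSPECTION_STMT_RE = re.compile(
    r"^\s*(?:SHOW|DESCRIBE|DESC|EXPLAIN)\b",
    re.IGNORECASE,
)


def check_sql_allowed(sql: str) -> None:
    """Raise PermissionError for introspection statements or system-schema access."""
    if INTROSPECTION_STMT_RE.search(sql):
        raise PermissionError(
            "Introspection statements are blocked; use the trained schema context."
        )
    if FORBIDDEN_SCHEMA_RE.search(sql):
        raise PermissionError(
            "system.* and information_schema.* are blocked; use the trained schema context."
        )
```

The statement pattern is anchored at the start, so ORDER BY ... DESC in a normal query is not rejected. This sketch does not strip leading SQL comments, which a hardened guard would also need to handle.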
2. RLS injection (additional_table_filters):
- RLS_TABLES = ("gold.sales",): receives a program_id = '...' AND store_id = '...' filter built from context.user.program_id / store_id.
- DENIED_TABLES = ("gold.vw_relatorios_exportaveis_analitico_vendas",): receives the literal 0 filter (zero rows). Defense in depth: the ClickHouse user wren_ia already lacks SELECT grant on the view.
train.py imports RLS_TABLES so the trained schema docs match the RLS scope (single source of truth).
The filter expression is built as a hand-formatted ClickHouse Map literal because clickhouse_connect JSON-serializes Python dicts. See _format_table_filter_map.
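The shape of that literal is easier to see in code. The sketch below is a simplified stand-in for _format_table_filter_map and _quote: the function names without the leading underscore, build_filters, and the exact spacing of the output are assumptions, while the single-quote doubling and the per-table filters follow the description above.

```python
from typing import Dict

RLS_TABLES = ("gold.sales",)
DENIED_TABLES = ("gold.vw_relatorios_exportaveis_analitico_vendas",)


def quote(value: str) -> str:
    """Wrap value in single quotes, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_filters(program_id: str, store_id: str) -> Dict[str, str]:
    """Map each table to its row filter: tenant predicate or the literal 0."""
    tenant = f"program_id = {quote(program_id)} AND store_id = {quote(store_id)}"
    filters = {table: tenant for table in RLS_TABLES}
    filters.update({table: "0" for table in DENIED_TABLES})
    return filters


def format_table_filter_map(filters: Dict[str, str]) -> str:
    """Render the filters as a single-quoted ClickHouse Map literal string."""
    pairs = ", ".join(f"{quote(t)}: {quote(expr)}" for t, expr in filters.items())
    return "{" + pairs + "}"


# The resulting string, not a dict, is what goes into the query settings, e.g.
# settings={"additional_table_filters": format_table_filter_map(filters)}.
```

Note that the filter expression is quoted twice: once for the ID values inside the predicate and once more when the predicate becomes a Map value, which is why the IDs end up wrapped in doubled quotes.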
Tools registered (agent.py)
Two tools sharing a LocalFileSystem(working_directory="./data_storage"):
- RunSqlTool(sql_runner=RLSClickHouseRunner(...), file_system=fs): executes RLS-filtered SQL, dumps the full result to CSV in ./data_storage/<user-hash>/query_results_*.csv, and returns a truncated preview to the LLM.
- VisualizeDataToolPT(file_system=fs): local subclass of VisualizeDataTool (viz_tool.py) with ClubPetroChartGenerator injected and an optional chart_type arg. It reads the CSV from the previous round and emits a chart rich component (Plotly figure JSON). It does not use the default PlotlyChartGenerator; see "Intentional custom code" for details (dropping go.Table, uniform datetime coercion, ranked bar). The web component renders it; the CLI just reports "Created visualization from ".
Both registered via tools.register_local_tool(tool, access_groups=[]) (Vanna's register(tool) shorthand exists in some upstream examples but isn't the current API). Empty access_groups=[] means accessible to all users.
Theming (static/vanna-theme.css + adoptedStyleSheets pierce)
<vanna-chat> exposes ~50 CSS custom properties (vanna/frontends/webcomponent/src/styles/vanna-design-tokens.ts). Each internal custom element (vanna-message, vanna-status-bar, vanna-progress-tracker, plotly-chart, rich-card, rich-task-list, rich-progress-bar) re-imports vannaDesignTokens and re-declares the literals on its own :host. A <link rel="stylesheet"> in the document only re-themes vanna-chat itself; nested elements live in encapsulated shadow trees that document selectors can't reach, and their internal :host rules shadow any inherited custom property.
The pierce lives in static/vanna-embed-bootstrap.js (see "Intentional custom code"): (a) it creates a constructed CSSStyleSheet, (b) monkey-patches the ShadowRoot.prototype.adoptedStyleSheets setter to always move that sheet to the end of the array (needed because Lit reassigns the array after attachShadow and would otherwise win the cascade), and (c) fetch()es /vanna-theme.css and calls replaceSync() to populate it. Adopted sheets cascade after the component's own static styles, so with equal :host specificity the adopted rule wins.
static/vanna-theme.css uses selector :host, vanna-chat { ... } so the same file works in both contexts:
- <link> in the document → vanna-chat { ... } matches the host element.
- adoptedStyleSheets inside each shadow root → :host { ... } matches the shadow host of that root.
No JS rebuild needed to change colors/fonts/spacing — edit static/vanna-theme.css and hard-refresh. Don't modify vanna/frontends/webcomponent/src/styles/vanna-design-tokens.ts (upstream, would be lost on git pull).
@import of Google Fonts inside vanna-theme.css doesn't work because a constructed CSSStyleSheet (via replaceSync) silently strips @import. The bootstrap injects a <link> into the <head> via loadFontsOnce.
The sidebar (<vanna-progress-tracker>) is hidden by the consumer (not by the bootstrap): in embed-demo.html and in VannaChat.tsx, after customElements.whenDefined("vanna-chat") resolves, set el.showProgress = false. Lit Boolean property attributes that default to true cannot be turned off via an HTML attribute (any presence = true).
Embed bootstrap (static/vanna-embed-bootstrap.js)
A single bootstrap served by the Vanna server at /vanna-embed-bootstrap.js (route in server.py). It concentrates all the wiring JS and is consumed by:
- static/embed-demo.html (smoke test, same origin): <script src="/vanna-embed-bootstrap.js"> + a VannaEmbed.ensureLoaded({ baseUrl: "" }) call.
- clubpetro-frontend/src/components/VannaChat/vannaChatLoader.ts (React app, cross-origin): injects <script src="${baseUrl}/vanna-embed-bootstrap.js"> dynamically, waits for window.VannaEmbed, then calls ensureLoaded({ baseUrl, extraCss }). extraCss carries the avatar logo override (the SVG lives in the CRA's /public/, not on the Vanna server).
Before this extraction, the React app duplicated ~250 identical lines; the markdown header fix landed only in the demo and the React app stayed broken. Lesson: every theme/i18n/markdown change now lives only in static/vanna-embed-bootstrap.js. A hard-refresh on both ends picks it up automatically.
In production, the client page loads:
<link rel="stylesheet" href="https://SEU-VANNA/vanna-theme.css">
<script src="https://SEU-VANNA/vanna-embed-bootstrap.js"></script>
<vanna-chat id="chat" ...></vanna-chat>
<script>
window.VannaEmbed.ensureLoaded({ baseUrl: "https://SEU-VANNA" }).then(() => {
document.getElementById("chat").showProgress = false;
});
</script>
Fonts are loaded by the bootstrap (loadFontsOnce), so the client no longer needs to include a Google Fonts <link> manually. The bundle (vanna-components.js) is also injected by the bootstrap (injectBundle).
Chunk filtering + translation (chat_filter.py)
FilteringChatHandler (a subclass of vanna.servers.base.ChatHandler) is injected into VannaFastAPIServer.chat_handler before create_app(). It intercepts the ChatStreamChunk stream and applies:
- rich.type whitelist: drops chunks whose type is not in ALLOWED_RICH_TYPES = {text, dataframe, chart, status_bar_update, chat_input_update}. Extra types the agent emits (status_card, task_tracker_update, notification, log_viewer, progress_display, etc.) disappear before they become SSE.
- PT translation: hardcoded English strings emitted by vanna/src/vanna/core/agent/agent.py (Response complete, Ready for next message, Processing your request..., etc.) are replaced via the exact-match TRANSLATIONS table in the message, detail, and placeholder fields of rich.data. Strings outside the table (including dynamic ones like Running 3 tools) pass through untouched; add entries as they show up in the chat.
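The two steps can be sketched on plain dictionaries. This is an illustration under stated assumptions: the real handler works on ChatStreamChunk objects, filter_chunk is a hypothetical helper, and the Portuguese strings in TRANSLATIONS are example values rather than the project's actual table.

```python
from typing import Any, Dict, Optional

ALLOWED_RICH_TYPES = {"text", "dataframe", "chart", "status_bar_update", "chat_input_update"}

# Example entries only; the Portuguese wording here is an assumption.
TRANSLATIONS = {
    "Response complete": "Resposta concluída",
    "Ready for next message": "Pronto para a próxima mensagem",
    "Processing your request...": "Processando sua solicitação...",
}
TRANSLATED_FIELDS = ("message", "detail", "placeholder")


def filter_chunk(rich: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a translated copy of the chunk, or None if its type is not allowed."""
    if rich.get("type") not in ALLOWED_RICH_TYPES:
        return None
    data = dict(rich.get("data") or {})
    for field in TRANSLATED_FIELDS:
        value = data.get(field)
        # Exact-match lookup: dynamic strings that are not in the table pass through.
        if isinstance(value, str) and value in TRANSLATIONS:
            data[field] = TRANSLATIONS[value]
    return {**rich, "data": data}
```

Because the lookup is exact-match, a dynamic string such as "Running 3 tools" is left in English until a matching entry (or a pattern-based rule) is added.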
Charts: ClubPetroChartGenerator (viz_tool.py) replaces the upstream default and drops the 4+ column go.Table fallback, so there is no visually duplicated dataframe. The LLM controls the type via the chart_type arg of visualize_data (line/bar/scatter/histogram/area); when it is omitted, a heuristic based on the CSV shape takes over.
To add more allowed types or new translations, edit chat_filter.py. No bundle rebuild, no upstream changes.
Web server (server.py)
server.py is ~60 lines: VannaFastAPIServer(agent, config={cors:..., api_base_url:""}).create_app(), mounts /static/ to vanna/frontends/webcomponent/dist/, and adds static routes for files in the project-root static/ directory:
- /vanna-theme.css: CSS theme adopted in every shadow root.
- /vanna-embed-bootstrap.js: the single wiring JS (theme pierce + translator + markdown + bundle loader). Consumed by embed-demo.html and by the React app.
- /clubpetro-logo.{png,svg}, /dashboard-bg.png: assets.
- /embed-demo.html: smoke test page; the server replaces the __PROGRAM_ID__ / __STORE_ID__ / __USER_ID__ placeholders with values from .env at runtime (read fresh on every request, so HTML changes need no restart).
All chat routes (/api/vanna/v2/chat_sse|chat_websocket|chat_poll) come from upstream — do not redefine them.
The web component (<vanna-chat>) sends program_id / store_id as query params on the endpoint URL: sse-endpoint="/api/vanna/v2/chat_sse?program_id=X&store_id=Y". The upstream server populates RequestContext.query_params from dict(http_request.query_params), so RequestContextUserResolver picks them up.
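If the endpoint attribute is generated server-side, the query string can be built with the standard library. This is a small sketch; chat_endpoint is a hypothetical helper, not part of the project, and user_id is treated as optional.

```python
from urllib.parse import urlencode


def chat_endpoint(path: str, program_id: str, store_id: str, user_id: str = "") -> str:
    """Build an endpoint URL carrying the tenant IDs as query params."""
    params = {"program_id": program_id, "store_id": store_id}
    if user_id:
        # user_id is optional; it is only appended when provided.
        params["user_id"] = user_id
    return f"{path}?{urlencode(params)}"


print(chat_endpoint("/api/vanna/v2/chat_sse", "X", "Y"))
# prints /api/vanna/v2/chat_sse?program_id=X&store_id=Y
```

urlencode percent-encodes unsafe characters, although IDs that fail the resolver's allowlist are rejected server-side anyway.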
Training flow (train.py)
Vanna 2.0 has no separate "training" API. Schema knowledge is injected by saving text memories into the same ChromaAgentMemory instance the agent reads from. DefaultLlmContextEnhancer (auto-wired when no enhancer is passed) retrieves the top-k similar text memories and appends them to the system prompt on every turn.
train.py iterates over the tables in RLS_TABLES, reads system.columns (which ClickHouse already filters by column-level GRANT, so no privilege on system.tables is needed), filters out the columns in RLS_INTERNAL_COLS = {"program_id", "store_id"} to hide them from the LLM context, and saves one text memory per table with the column list + 3 sample rows (no DDL, no SHOW CREATE TABLE).
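The per-table document could be assembled along these lines. This is a sketch under assumptions: build_table_doc and the text layout are invented for illustration, and the column metadata and sample rows are passed in rather than fetched from ClickHouse.

```python
from typing import Iterable, List, Mapping, Sequence, Tuple

RLS_INTERNAL_COLS = {"program_id", "store_id"}


def build_table_doc(table: str,
                    columns: Sequence[Tuple[str, str]],
                    sample_rows: Iterable[Mapping[str, object]],
                    max_samples: int = 3) -> str:
    """Render one schema text memory, hiding the RLS-internal columns."""
    visible = [(name, ctype) for name, ctype in columns if name not in RLS_INTERNAL_COLS]
    names = [name for name, _ in visible]
    lines: List[str] = [f"Table {table}", "Columns:"]
    lines += [f"- {name} ({ctype})" for name, ctype in visible]
    lines.append("Sample rows:")
    for row in list(sample_rows)[:max_samples]:
        # Only visible columns are printed, so tenant IDs never reach the memory text.
        lines.append(", ".join(f"{name}={row.get(name)!r}" for name in names))
    return "\n".join(lines)
```

The returned string is what would then be handed to agent_memory.save_text_memory(content, context).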
Re-run after:
- a GRANT change in ClickHouse (new column → appears automatically; revoked column → disappears).
- an edit to RLS_TABLES.
- an edit to RLS_INTERNAL_COLS.
Sequence: rm -rf chroma_db/ && python train.py.
When constructing a ToolContext manually (as in train.py), the agent_memory= field is required by Pydantic.
CLI (ask.py)
argparse with --program-id / --store-id flags + positional question. Calls build_agent(program_id, store_id). Agent.send_message is an async generator yielding UiComponent objects. Each component has both a rich_component (structured for the web UI) and a simple_component (text fallback). The CLI prefers rich_component.content, falls back to simple_component.text.
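The rich-then-simple fallback can be shown with stand-in classes. The dataclasses below are stubs written for this sketch and are not Vanna's UiComponent types; only the attribute names rich_component.content and simple_component.text come from the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RichStub:
    content: Optional[str] = None


@dataclass
class SimpleStub:
    text: Optional[str] = None


@dataclass
class UiComponentStub:
    rich_component: Optional[RichStub] = None
    simple_component: Optional[SimpleStub] = None


def component_text(component: object) -> Optional[str]:
    """Prefer rich_component.content; fall back to simple_component.text."""
    rich = getattr(component, "rich_component", None)
    content = getattr(rich, "content", None)
    if content:
        return str(content)
    simple = getattr(component, "simple_component", None)
    text = getattr(simple, "text", None)
    return str(text) if text else None
```

Using getattr with defaults keeps the helper tolerant of components that lack a content attribute, such as charts or dataframes, which then fall through to the text fallback.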
Non-obvious gotchas
- ToolRegistry API: use register_local_tool(tool, access_groups=[]); there is no plain .register() method despite what some upstream examples suggest.
- Empty access_groups=[] means the tool is accessible to all users; non-empty is a permission allowlist.
- additional_table_filters serialization: clickhouse_connect serializes Python dict settings as JSON (double quotes), which ClickHouse rejects with CANNOT_PARSE_QUOTED_STRING (code 26). The Map value must be passed as a pre-formatted string literal with single quotes and '' escape; see _format_table_filter_map in rls_runner.py. Don't pass a dict to client.query(..., settings={"additional_table_filters": {...}}).
- ChromaDB's first run downloads a ~80 MB ONNX embedding model to ~/.cache/chroma/. It is cached afterwards. If running offline, pre-download it or pass a custom embedding_function.
- Editable extras on macOS: pip install -e './vanna[clickhouse]' fails with non-local file URIs are not supported. Install the editable package and the extras separately.
- Python 3.9 triggers deprecation warnings from clickhouse-connect and urllib3/LibreSSL. Not blocking. If upgrading the Python interpreter, recreate the venv.
- SELECT * on gold.sales fails with ACCESS_DENIED: wren_ia has a column-level GRANT, so SELECT * requires SELECT on every column (including the revoked ones) and ClickHouse rejects it. List columns explicitly. (Before the column-level GRANT, RunSqlTool saved the full result to ./data_storage/<user-hash>/query_results_*.csv; that flow remains valid for queries with explicit columns.)
- Column-level REVOKE on system.* is silently ignored in ClickHouse Cloud: REVOKE SELECT(create_table_query) ON system.tables FROM wren_ia returns "succeeded" in the parser, but the runtime keeps allowing the read. The real defense is app-side, via regex in RLSClickHouseRunner.
- Upstream examples are not all current: openai_quickstart.py imports OpenAILlmService from vanna.integrations.anthropic (a bug); mock_sqlite_example.py calls agent.send_message(user=...) but the real signature now takes request_context=. Trust claude_sqlite_example.py and vanna/src/vanna/core/agent/agent.py over the others.
Database
ClickHouse Cloud, database gold, accessed over HTTPS port 8443 with secure=true.
Trained table: gold.sales, the analytical sales fact table (data: fuel/retail, CLUBE ALE / POP FIDELIDADE). Filtered by program_id + store_id per request via RLS.
Column-level GRANT on wren_ia: 21 of the original 29 columns:
- program_id, store_id: required for RLS to work (but hidden from the LLM via RLS_INTERNAL_COLS).
- sale_id, cartao_do_cliente, nome_da_rede, nome_da_loja, tipo_loja, categoria_loja, nome_do_cliente, categoria_do_cliente, nome_do_atendente, produto, categoria_do_produto, quantidade_total_produto, valor_total_produto, desconto, pontuacao_produto, voucher_aplicado_venda, data_da_compra, fidelizada, e_combustivel: visible to the LLM (19).
8 REVOKED columns (they do not appear in system.columns and raise ACCESS_DENIED if the LLM tries them): customer_id, customer_category_id, sale_product_id, product_id, product_category_id, attendant_id, rfid_do_atendente, sync_date. Check the current grants via SHOW GRANTS FOR wren_ia in the ClickHouse Cloud console (admin) or SELECT * FROM system.grants WHERE user_name = 'wren_ia'.
Denied table: gold.vw_relatorios_exportaveis_analitico_vendas, an exportable view; denied at runner level ('0' filter) and at ClickHouse grant level (no SELECT).
Credentials live in .env (gitignored). RLS values too: RLS_PROGRAM_ID, RLS_STORE_ID (CLI defaults). Optional: OPENAI_TEMPERATURE (default 1.0). Both agent.py and train.py call load_dotenv() at import.
Where to look in upstream
Server / frontend:
- vanna/src/vanna/servers/fastapi/app.py: VannaFastAPIServer factory
- vanna/src/vanna/servers/fastapi/routes.py: chat SSE/WS/poll endpoints; auto-extracts RequestContext from cookies/headers/query_params (lines 46-52)
- vanna/src/vanna/servers/base/templates.py: index HTML used by GET / (login + <vanna-chat> embed)
- vanna/frontends/webcomponent/src/components/vanna-chat.ts: Web Component attributes (api-base, sse-endpoint, starting-state, theme, etc.)
- vanna/frontends/webcomponent/src/services/api-client.ts: confirms URL construction is a naive ${baseUrl}${endpoint} concat (so query strings on sse-endpoint work)
Core:
- vanna/src/vanna/core/agent/agent.py: Agent class, send_message signature, required init params
- vanna/src/vanna/core/registry.py: ToolRegistry; the transform_args() hook (lines 113-142) is the "official" RLS extension point but only handles arg rewriting / ToolRejection, not query settings
- vanna/src/vanna/core/system_prompt/default.py:47-48: DefaultSystemPromptBuilder.build_system_prompt; when base_prompt is non-null it is returned directly, discarding Vanna's default prompt.
- vanna/src/vanna/core/user/{models,resolver,request_context}.py: User (with extra="allow"), UserResolver ABC, RequestContext
- vanna/src/vanna/integrations/{openai,chromadb,clickhouse}/: integrations this project uses
- vanna/src/vanna/integrations/plotly/chart_generator.py:51: the "4+ columns → go.Table" heuristic of the default generator. We replace it via ClubPetroChartGenerator in viz_tool.py.
- vanna/src/vanna/tools/run_sql.py: RunSqlTool behavior (truncation, CSV side-output)
- vanna/src/vanna/examples/claude_sqlite_example.py: closest working reference for the assembly pattern
- vanna/MIGRATION_GUIDE.md: only relevant if migrating Vanna 0.x code (not used here)