
# Integrate Toucan into your own AI chat

{% hint style="info" %}
**Target Audience**: Developers building a custom AI chat experience.
{% endhint %}

### Goal

Extend an AI chat you already own with Toucan's data-aware assistant. Your host LLM stays in charge of the conversation; when the user asks a data question, it calls a single Toucan tool and renders the response inline.

This is the alternative to [Embed an AI Chat](/embed/embedding-overview/how-to/embed-an-ai-chat.md): instead of dropping a Toucan-branded chat into your product, you keep your own chat UI and brand and add Toucan as a capability behind it.

***

### How it works

Toucan exposes an [MCP](https://modelcontextprotocol.io) endpoint at `https://toucanai.cloud/api/mcp`. Your backend connects to it like any other MCP server, attaches the discovered tools to your LLM call, and the model decides when to invoke them.

```
┌──────────────┐  user msg   ┌─────────────────┐
│  Your chat   │ ──────────► │  Your backend   │
│   (UI)       │ ◄────────── │  + your LLM     │
└──────────────┘  envelope   └────────┬────────┘
                                      │  MCP
                                      ▼
                             ┌─────────────────┐
                             │   Toucan MCP    │
                             │ (orchestrator,  │
                             │ query, charts)  │
                             └─────────────────┘
```

Tool results are self-contained JSON envelopes with an `answer` field (markdown the LLM relays verbatim) and a `visualizations` array (chart payloads your UI renders with the `<tc-result-renderer>` web component). No Toucan auth token is required on the browser side — visualizations carry their own data.

***

### Prerequisites

* At least one [connected and active database](/build/data-connections/how-to/add-a-database.md).
* A [valid API key](/embed/authentication/how-to/generate-an-api-key.md) for token generation.
* (Recommended): [Enriched metadata](/build/analyze-your-database-with-ai/how-to/analyze-your-database-with-ai.md) to improve the AI assistant's accuracy.
* (Recommended): [Row-Level Security (RLS) configured](/embed/permissions-and-row-level-security/how-to/apply-rls-to-your-database.md) for multi-tenant data isolation.
* An LLM SDK on your backend that speaks MCP. The examples below use the [Vercel AI SDK](https://ai-sdk.dev/) (`@ai-sdk/mcp`), but any MCP-capable client works.

***

### Steps

#### 1. Generate a Toucan embed token on your backend

The MCP endpoint is authenticated with the same embed tokens you use for dashboard or chat embedding. Mint a token server-side, scoped to the end user's attributes so RLS applies to every query the assistant runs. How you cache it (per request, per session, per browser tab) is up to your auth model — just keep it server-side.

See [Generate a token via API](/embed/authentication/how-to/generate-a-token-via-api.md) for the full token-generation flow.
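
If you decide to cache tokens per user or session, the sketch below shows one possible shape. It rests on assumptions that are not part of Toucan's API: `mint` stands in for your own token-generation call, the cache key is a plain user ID (use whatever identifies the RLS attributes in your auth model), and `ttlMs` is a value you pick so that it stays shorter than the token's real lifetime.

```ts
// Hypothetical signature: replace with your real server-side token-minting call.
type MintToken = (userId: string) => Promise<string>;

interface CachedToken {
  token: Promise<string>;
  expiresAt: number; // epoch milliseconds
}

function createTokenCache(
  mint: MintToken,
  ttlMs: number,
  now: () => number = Date.now, // injectable clock, handy in tests
) {
  const entries = new Map<string, CachedToken>();

  return {
    get(userId: string): Promise<string> {
      const hit = entries.get(userId);
      if (hit && hit.expiresAt > now()) return hit.token;

      // Cache the promise itself so concurrent requests share one mint call.
      const token = mint(userId);
      const entry: CachedToken = { token, expiresAt: now() + ttlMs };
      entries.set(userId, entry);

      // Drop failed mints so the next call retries instead of reusing a rejection.
      token.catch(() => {
        if (entries.get(userId) === entry) entries.delete(userId);
      });
      return token;
    },
  };
}
```

Keep the cache in backend memory (or a server-side store) only, so the token never reaches the browser.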

#### 2. Connect to the Toucan MCP endpoint

From your chat backend (typically the route that proxies to your LLM), open an MCP session, then fetch both the tools list **and** the `tool-usage` prompt. The prompt carries the system-prompt addendum your LLM needs to relay Toucan's `answer` verbatim:

```ts
import { createMCPClient } from "@ai-sdk/mcp";

const client = await createMCPClient({
  transport: {
    type: "http",
    url: "https://toucanai.cloud/api/mcp",
    headers: { Authorization: `Bearer ${embedToken}` },
  },
});

const [tools, toolUsage] = await Promise.all([
  client.tools(),
  client.experimental_getPrompt({ name: "tool-usage" }),
]);
const systemAddendum = toolUsage.messages
  .map((m) => (m.content.type === "text" ? m.content.text : ""))
  .join(" ");
// Remember to call client.close() when the request finishes.
```

Both the tool list and the prompt are discovered at runtime via MCP, so any new capabilities or guidance Toucan ships will surface to your LLM automatically without code changes on your side.

#### 3. Attach the tool to your LLM call

Pass the tools to your existing chat completion and append `systemAddendum` to your own system prompt. The addendum tells the model to forward Toucan's natural-language `answer` instead of paraphrasing it; without it the model tends to summarize.

```ts
import { streamText } from "ai";

const result = streamText({
  model: yourModel,
  system: `${YOUR_SYSTEM_PROMPT} ${systemAddendum}`,
  messages,
  tools,
  onFinish: () => client.close(),
  onError:  () => client.close(),
});
```
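
The call above closes the MCP client from both `onFinish` and `onError`. If you would rather guarantee that `close()` runs at most once, however many callbacks fire, a small wrapper is enough. This is only a sketch: `once` is a hypothetical helper, not part of the AI SDK or of Toucan.

```ts
// Wraps an async function so that it runs at most once.
// Later calls return the promise created by the first call.
function once<T>(fn: () => Promise<T>): () => Promise<T> {
  let result: Promise<T> | undefined;
  return () => (result ??= fn());
}

// Usage with the client from step 2:
//   const closeClient = once(() => client.close());
//   streamText({ ..., onFinish: closeClient, onError: closeClient });
```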

#### 4. Render visualizations in your UI

Load the Toucan embed script once in your app's root layout so the `<tc-result-renderer>` custom element registers itself globally:

```html
<script type="module" src="https://toucanai.cloud/embed/embed.js"></script>
```

That single script tag is the only browser-side dependency.

Then, in your message-rendering loop, find the Toucan tool result, parse its text content as JSON, and assign each `visualization` to a `<tc-result-renderer>`. The renderer takes its input via the `payload` DOM property (not an attribute), so you assign it programmatically once the element is in the DOM.

Add a small helper to recognise a Toucan envelope by its `schema` field, so any other tools you've attached pass through untouched. The current wire-format tag is `"toucan-mcp-v1"`; new major versions will publish a new tag and this page will be updated.

```js
function parseToucanEnvelope(text) {
  try {
    const parsed = JSON.parse(text);
    if (parsed?.schema === "toucan-mcp-v1") return parsed;
  } catch {}
  return null;
}
```

**Vanilla JS / any framework**

```js
function renderToucanResult(toolResult, container) {
  const envelope = parseToucanEnvelope(toolResult.content?.[0]?.text);
  if (!envelope) return; // not a Toucan tool result: ignore
  for (const viz of envelope.visualizations) {
    const el = document.createElement("tc-result-renderer");
    el.style.cssText = "display: block; height: 360px";
    el.payload = viz;
    container.appendChild(el);
  }
}
```

**React example**

```tsx
// In your message-part loop, alongside your text branch:
if (part.type.startsWith("tool-") || part.type === "dynamic-tool") {
  const envelope = parseToucanEnvelope(part.output?.[0]?.text);
  if (!envelope) return null; // not a Toucan tool result: ignore
  return envelope.visualizations.map((viz, i) => (
    <tc-result-renderer
      key={i}
      ref={(el) => { if (el) el.payload = viz; }}
      style={{ display: "block", height: 360 }}
    />
  ));
}
```

Each visualization payload includes both the chart configuration and its data rows, so no further Toucan API calls or auth tokens are needed in the browser.

#### 5. Test

Send a non-data message ("hi") and verify your existing chat behaves as before. Then ask a data question ("Top 5 customers by revenue this quarter") and verify the answer streams in along with one or more rendered charts underneath.

***

### Response envelope reference

Every Toucan tool result is a single text content block whose text is JSON of the following shape. You don't need to import this type — it's reproduced here for reference; just `JSON.parse` the text and read the fields.

```ts
type ToucanMCPEnvelope = {
  schema: "toucan-mcp-v1";
  answer: string;                  // markdown — surface verbatim
  visualizations: Visualization[]; // empty when no chart was produced
  threadId: string;                // pass back on follow-up questions to retain context
};

type Visualization =
  | { type: "chart"; title: string; narrative?: string; config: ...; data: Row[] }
  | { type: "error"; title?: string; message: string }
  | { type: "text";  text: string };
```

The `schema` field is the version tag; renderers should branch on it and ignore unknown `type` values rather than throwing, so the contract stays forward-compatible.
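
One way to apply that rule is to filter the envelope before rendering. The sketch below is illustrative rather than an official helper: `knownVisualizations` is a hypothetical name, `Row` is assumed to be a plain record, and `config` is typed as `unknown` because the reference above does not spell out its shape.

```ts
type Row = Record<string, unknown>; // assumption: rows are plain objects

type KnownVisualization =
  | { type: "chart"; title: string; narrative?: string; config: unknown; data: Row[] }
  | { type: "error"; title?: string; message: string }
  | { type: "text"; text: string };

const KNOWN_TYPES = new Set<string>(["chart", "error", "text"]);

// Returns only the visualizations this client understands.
// Unknown schema tags and unknown `type` values are skipped, never thrown on.
function knownVisualizations(envelope: unknown): KnownVisualization[] {
  if (typeof envelope !== "object" || envelope === null) return [];
  const { schema, visualizations } = envelope as {
    schema?: unknown;
    visualizations?: unknown;
  };
  if (schema !== "toucan-mcp-v1" || !Array.isArray(visualizations)) return [];
  return visualizations.filter(
    (v): v is KnownVisualization =>
      typeof v === "object" &&
      v !== null &&
      KNOWN_TYPES.has((v as { type?: unknown }).type as string),
  );
}
```

The function only checks the `type` tag, not every field, so treat it as a first line of defence rather than full validation.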

***

### Current scope and limits

* **Conversational memory via `threadId`**. Every envelope includes a `threadId`. Pass it back in the next `ask_toucan_assistant` call to continue the same conversation thread, so the assistant remembers previous queries and charts; omit it to begin a new thread. The host LLM decides when to continue a thread and when to start fresh, and the `tool-usage` system-prompt addendum (step 2) already instructs it how. Threads are ephemeral: they expire after a period of inactivity.
* **Browser does not need a Toucan token**. Only your backend authenticates; visualizations are self-contained.
* **RLS still applies**. Every query the assistant runs is scoped to the attributes you put on the embed token in step 1.
* **In-progress notifications**. Toucan's tools push progress updates as MCP `notifications/progress` messages. Make sure your MCP client library handles them, and wire them into your existing status indicator so users see Toucan's actual progress.
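
For the `threadId` item in the list above: the host LLM is the one that passes the ID back, but your backend may still want to know which thread is active, for example for logging or for storing it with the conversation. The sketch below assumes you have already collected the raw text of each tool result in order; `latestThreadId` is a hypothetical helper, not a Toucan API.

```ts
// Returns the threadId of the most recent Toucan envelope, if any.
// Results from other tools (non-JSON or a different schema) are skipped.
function latestThreadId(toolResultTexts: string[]): string | undefined {
  // Walk from newest to oldest so the latest envelope wins.
  for (const text of [...toolResultTexts].reverse()) {
    try {
      const parsed: unknown = JSON.parse(text);
      if (
        typeof parsed === "object" &&
        parsed !== null &&
        (parsed as { schema?: unknown }).schema === "toucan-mcp-v1" &&
        typeof (parsed as { threadId?: unknown }).threadId === "string"
      ) {
        return (parsed as { threadId: string }).threadId;
      }
    } catch {
      // Not JSON: this result came from another tool.
    }
  }
  return undefined;
}
```

Because threads expire after a period of inactivity, treat a stored ID as a hint rather than a guarantee that the thread still exists.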

***

### Conclusion

Your chat now answers data questions through Toucan without leaving the host UI. Users keep your branding and conversational experience while gaining self-service analytics over your governed data layer.