
PII & personal data

Target Audience: Developers and privacy stakeholders integrating Toucan AI, especially in embed and AI chat scenarios.

TL;DR

  • Toucan AI stores account and session data for users of the Toucan platform (for example email, name, session metadata).

  • In embed mode, the end-user data that reaches Toucan AI is mostly under your control: it is what you put in embed tokens and what is entered in chat.

  • AI conversations can persist what users type, so treat chat as potentially storing personal data.

  • Minimize what you send: use pseudonymous IDs, include only the attributes needed for security, and keep unnecessary context out of AI context clues.


When to use this

Use this page before going to production with embed or AI features, or when answering privacy questionnaires about where personal data can appear.


Where personal data can appear

1) Toucan platform accounts

When users sign in to Toucan AI directly, the platform stores typical account information such as:

  • Email address and display name

  • Profile fields you provide at signup

  • Session metadata (for example IP address or browser information, when collected)

This data supports authentication, billing, and product operation.

2) Embed integrations (your application)

In embed mode, you define what identity and attributes are sent to Toucan AI inside the embed token, for example:

  • A stable user identifier (distinctId)

  • Optional attributes used for row-level security (RLS), such as region, department, or role

  • Optional free-text context for the AI (aiContextClues)

Depending on what you send, these fields may be personal data. Toucan AI does not require email or legal name in embed tokens.
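As a rough sketch of this minimization idea, a backend could pass the user record through an allowlist before building the token payload. Only distinctId and aiContextClues are names taken from this page; the `attributes` key, the `build_embed_claims` helper, the allowlist, and the overall payload layout are assumptions for illustration. Signing and issuing the token are out of scope here.

```python
from typing import Any, Mapping

# Hypothetical allowlist: only the attributes your row-level security rules read.
RLS_ATTRIBUTES = frozenset({"region", "department", "role"})


def build_embed_claims(
    distinct_id: str,
    user_record: Mapping[str, Any],
    context_clues: str = "",
) -> dict[str, Any]:
    """Return a minimal claims dict (the payload layout is an assumption)."""
    # Drop every field that is not explicitly allowlisted (email, name, ...).
    attributes = {
        key: value for key, value in user_record.items() if key in RLS_ATTRIBUTES
    }
    claims: dict[str, Any] = {"distinctId": distinct_id, "attributes": attributes}
    # Only include free-text AI context when there is something to send.
    if context_clues:
        claims["aiContextClues"] = context_clues
    return claims


record = {
    "email": "jane@example.com",
    "name": "Jane Doe",
    "region": "EMEA",
    "role": "analyst",
}
claims = build_embed_claims("u_8f3a1c", record, "Works with quarterly sales dashboards")
```

With this approach, adding a new field to your user record never leaks it into the token by accident; someone has to add it to the allowlist on purpose.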

3) AI assistant chat

Users may type personal data directly into chat. That content can be stored as part of the conversation history, which is needed to run the conversation because it is the context passed to the LLM to generate each answer.

Query results shown in AI workflows may also appear in conversation state (for example summarized or sampled rows). See Data storage & retention.


Data you control vs. data Toucan AI controls

| Area | Who controls content | Guidance |
| --- | --- | --- |
| Embed token attributes | Your backend | Send only what RLS or product features require |
| distinctId | Your backend | Use a technical ID, not email or name |
| aiContextClues | Your backend | Avoid personal data; prefer non-identifying context |
| Chat messages | End user | Train users; clear history when needed |
| Connected database rows | Your database | Use RLS; Toucan queries on demand, does not bulk-copy your DB |
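For the aiContextClues guidance, one possible safeguard is a redaction pass in your backend before the text goes into the token. The sketch below is a heuristic under stated assumptions: `redact_context_clues` and both regular expressions are illustrative, not part of any Toucan AI API, and pattern matching will not catch names or other free-form identifiers.

```python
import re

# Heuristic patterns (assumptions): they catch common email and phone shapes only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def redact_context_clues(text: str) -> str:
    """Replace email-like and phone-like substrings with placeholders."""
    text = EMAIL_RE.sub("[redacted-email]", text)
    return PHONE_RE.sub("[redacted-phone]", text)


sample = "Account manager jane.doe@example.com, call +33 6 12 34 56 78, focus on EMEA churn"
cleaned = redact_context_clues(sample)
print(cleaned)
# Account manager [redacted-email], call [redacted-phone], focus on EMEA churn
```

Writing context clues from non-identifying templates in the first place is safer than relying on redaction alone.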


Best practices (minimization)

  • Use a pseudonymous distinctId (never email, phone, or legal name).

  • Limit embed attributes to the minimum required for access control.

  • Do not pass personal data in AI context clues unless strictly necessary.

  • Apply row-level security on all sensitive tables.

  • Clear AI conversation history when your product workflow requires it.

  • Align your integration with your privacy policy and DPA, including the subprocessors listed in Third-party subprocessors.
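One way to obtain a pseudonymous distinctId is to derive it from your internal user ID with a keyed hash, so the same user always maps to the same value while the token itself carries no email, phone number, or name. This is a minimal sketch: the `pseudonymous_id` helper, the `u_` prefix, and the truncation length are assumptions, and the key shown is a placeholder that would come from your secret store in practice.

```python
import hashlib
import hmac


def pseudonymous_id(internal_user_id: str, secret_key: bytes) -> str:
    """Derive a stable, non-reversible identifier with HMAC-SHA256."""
    digest = hmac.new(
        secret_key, internal_user_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Truncate for readability; 32 hex characters keep 128 bits of the digest.
    return "u_" + digest[:32]


# Placeholder key for illustration only; load the real key from a secret manager.
DEMO_KEY = b"replace-with-a-long-random-secret"

distinct_id = pseudonymous_id("customer-42", DEMO_KEY)
same_again = pseudonymous_id("customer-42", DEMO_KEY)
other_user = pseudonymous_id("customer-43", DEMO_KEY)
```

A keyed hash is used here rather than a plain hash so that someone who knows a user's internal ID cannot recompute the identifier without the key. Note that a pseudonymous ID can still be personal data if you can link it back to a person.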

