Personalization API Documentation

This guide walks you through setting up your Fastino Personalization API workspace, generating API keys, and running your first end-to-end test using sandbox mode.

Overview

Our personalization API provides a comprehensive solution for building AI agents with deep user understanding. Unlike traditional memory systems, we offer:

  1. Agentic Search Over Noisy Data - A self-evolving tree of knowledge about each user, continuously updated on our servers, that goes beyond standard vector embeddings

  2. Powerful Query Endpoint - A /query endpoint wrappable as a tool call for complex questions that require higher intelligence, not just simple chunk retrieval

  3. Context-Aware Chunks - A /chunks endpoint that accepts message history (not just questions) for more contextually relevant results

  4. Deterministic Profile Summaries - Natural language summaries perfect for system prompts, so your agent starts with a good understanding of the user

  5. Flexible Ingestion - Accepts any data format and automatically extracts memories

  6. Privacy-Focused - All data is anonymized using GLiNER-2 before storage

Base URL: https://api.fastino.ai

Before You Start

Requirements

Before integrating with the Personalization API, ensure you have:

  • API access token - Obtain from the Fastino Developer Portal

  • JSON-capable HTTP client - curl, Postman, Python requests, or similar

  • ISO 8601 UTC timestamps - For all time-based fields (e.g., 2025-11-11T14:30:00Z); see the snippet after this list

  • Internet access - To https://api.fastino.ai
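
For the timestamp requirement, one way to produce a compliant value in Python:

from datetime import datetime, timezone

# ISO 8601 UTC with a trailing "Z", e.g. "2025-11-11T14:30:00Z"
timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")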

Creating Your Account

Step 1: Sign Up for API Access

Visit the Fastino Developer Portal and create your workspace.

Provide:

  • Email address and organization name

  • Intended use case (e.g., calendar assistant, productivity agent, sales automation)

  • Default region and timezone

Step 2: Generate API Key

Once your account is created:

  1. Navigate to API Keys in your dashboard

  2. Click "Create New API Key"

  3. Give it a descriptive name (e.g., "Production API", "Development")

  4. Copy and securely store your API key (it starts with pio_sk_...)

⚠️ Security Note: Keep your API keys private — never expose them in frontend code or public repositories.
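
One common way to keep the key out of source code is an environment variable (a sketch; FASTINO_API_KEY is an illustrative name, not one the platform requires):

import os

# Read the key from the environment; raises KeyError if it isn't set,
# which is safer than falling back to a hard-coded secret
API_KEY = os.environ["FASTINO_API_KEY"]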

Authentication

Most endpoints support flexible authentication:

  • JWT tokens - For user-authenticated requests

  • API keys - For server-to-server integration (recommended)

  • System auth - For internal services


API Key Authentication (Recommended)

Include your API key in the x-api-key header on every request:

-H "x-api-key: <your_api_key_here>"
-H "Content-Type: application/json"

Example with curl:

curl -X POST https://api.fastino.ai/register \
  -H "x-api-key: <your_api_key_here>" \
  -H "Content-Type: application/json" \
  -d '{"email": "me@pioneer.ai", "traits": {"name": "John Doe"}}'

Quick Start

1. Register a User

First, register a user to initialize their personalization profile. This triggers our multi-stage personalization workflow:

Endpoint: POST /register

Request:

{
  "email": "me@pioneer.ai",
  "purpose": "This will be used in an AI SDR agent to help understand the user's communication style and priorities",
  "traits": {
    "name": "Ash Lewis",
    "locale": "en-US",
    "timezone": "America/Los_Angeles",
    "linkedin_url": "https://www.linkedin.com/in/ashlewis",
    "twitter_url": "https://twitter.com/ashlewis",
    "website": "https://ash.example.com",
    "notes": "Founder/engineer; prefers concise communications"
  }
}

Response:

{
  "user_id": "usr_42af7c",
  "created_at": "2025-11-11T16:05:00Z",
  "status": "active"
}

Key Parameters:

  • email (required) - Unique email for the user

  • purpose (optional) - Helps fine-tune the world model for specific use cases (e.g., "meeting scheduling app", "sales automation")

  • traits (optional) - Social URLs, timezone, preferences, etc.

Purpose-Driven Personalization: When you provide a purpose, our system focuses on relevant aspects. For example:

  • Meeting scheduling app → Focus on timezone, meeting preferences, calendar patterns

  • Sales automation → Focus on communication style, relationships, decision-making patterns

  • Content creation → Focus on writing style, interests, past content

Notes:

  • If the user already exists, traits are merged (idempotent)

  • Stage 1 & 2 run during registration (social profile scraping + summary generation)

  • Stage 3 (agentic search) runs automatically at memory thresholds (5, 10, 20, 40, 80...)
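
As a minimal sketch with the requests library (field values are the example ones above):

import requests

resp = requests.post(
    "https://api.fastino.ai/register",
    headers={"x-api-key": "<your_api_key_here>",
             "Content-Type": "application/json"},
    json={
        "email": "me@pioneer.ai",
        "purpose": "AI SDR agent",
        "traits": {"name": "Ash Lewis", "timezone": "America/Los_Angeles"},
    },
)
resp.raise_for_status()
user_id = resp.json()["user_id"]  # e.g. "usr_42af7c" - store this for all later calls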


2. Ingest User Data

Feed any user data into the system - conversations, documents, emails, notes, etc. Pass the user_id returned by /register; our system automatically extracts and stores relevant memories.

Endpoint: POST /ingest

Request:

{
  "user_id": "usr_42af7c", // returned from /register route.
  "source": "gmail",
  "message_history": [
    {
      "role": "user",
      "content": "I love sushi and usually eat late dinners around 9pm",
      "timestamp": "2025-11-01T14:30:00Z"
    },
    {
      "role": "assistant",
      "content": "Got it! I'll keep that in mind for restaurant recommendations.",
      "timestamp": "2025-11-01T14:30:15Z"
    }
  ],
  "documents": [
    {
      "content": "Q4 2025 Goals: Launch new product, hire 3 engineers, reach $1M ARR",
      "title": "Quarterly Goals Q4",
      "document_type": "document",
      "doc_id": "doc-q4-goals",
      "created_at": "2025-10-01T00:00:00Z"
    }
  ],
  "options": {
    "dedupe": true
  }
}

Response:

{
  "saved": {
    "documents": 1,
    "message_history": 1
  },
  "skipped": [],
  "processing_triggered": {
    "documents": true,
    "message_history": true
  },
  "inserted_document_ids": ["doc_123", "msg_456"],
  "received_at": "2025-11-11T16:10:00Z"
}

Key Features:

  • Automatic Anonymization - All PII is redacted using GLiNER-2 before storage

  • Flexible Format - Send conversations, documents, or both

  • Deduplication - Set dedupe: true to prevent duplicate ingestion

  • Background Processing - Memory extraction happens asynchronously

  • Automatic Stage 3 Triggers - When memory count hits thresholds (5, 10, 20...), our agentic search automatically runs to build deeper understanding

Document Types:

  • document - General documents, notes, articles

  • message_history - Conversation threads

  • email - Email content

  • calendar_event - Calendar entries

  • Custom types as needed
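
Using the ingest_data wrapper sketched in the Authentication section, a one-call ingestion of an email-type document might look like this (the content and doc_id are illustrative):

result = ingest_data(
    user_id="usr_42af7c",
    source="gmail",
    documents=[{
        "content": "Re: Q4 roadmap - let's lock scope by Friday.",
        "title": "Re: Q4 roadmap",
        "document_type": "email",
        "doc_id": "doc-roadmap-reply",
        "created_at": "2025-11-05T09:00:00Z",
    }],
    options={"dedupe": True},
)
print(result["saved"], result["skipped"])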


3. Get Profile Summary

Retrieve a natural language summary of the user - perfect for system prompts.

Endpoint: GET /summary

Request:

GET /summary?user_id=usr_42af7c&max_chars=1000

Response:

{
  "user_id": "usr_42af7c",
  "generated_at": "2025-11-11T16:05:00Z",
  "purpose": null,
  "summary": "Ash Lewis is a founder and engineer based in San Francisco (PST timezone). Ash prefers concise communications and values efficiency. Work style: Deep focus blocks from 9am-12pm, prefers meetings after 1pm. Currently focused on launching a new product and scaling the engineering team. Enjoys sushi and typically has late dinners around 9pm.",
  "cached": true
}

Usage:

# Add to your agent's system prompt
summary = get_profile_summary(user_id)
system_prompt = f"""You are a helpful assistant.

User profile: {summary['summary']}

Keep their preferences and work style in mind when responding."""

Key Parameters:

  • user_id (required)

  • max_chars (optional, default 1000) - Truncate summary to this length

  • purpose (optional) - Request a purpose-specific summary (not yet implemented; currently returns the general summary)

Notes:

  • Deterministic output for the same input (unless underlying data changes)

  • Low latency - perfect for every agent session

  • Generated from Stage 2 data


4. Retrieve Relevant Chunks

Get contextually relevant memory chunks based on the current conversation. Use this at every agent turn to ground responses in user-specific context.

Endpoint: POST /chunks

Request:

{
  "user_id": "usr_42af7c",
  "history": [
    {"role": "system", "content": "You help with restaurant recommendations"},
    {"role": "user", "content": "I'm hungry, what should I eat tonight?"}
  ],
  "k": 6,
  "similarity_threshold": 0.25
}

Response:

{
  "chunks": [
    {
      "id": "mem_789",
      "text": "User loves sushi and usually eats late dinners around 9pm",
      "score": 0.82,
      "source": "memory",
      "created_at": "2025-11-01T14:30:00Z",
      "updated_at": "2025-11-01T14:30:00Z"
    },
    {
      "id": "stage3_5",
      "text": "Q: What are the user's food preferences?\nA: The user enjoys sushi, Italian cuisine, and prefers restaurants in the Mission district. Typically dines late (8-9pm) and values quick service.",
      "score": 0.78,
      "source": "stage3",
      "created_at": "2025-10-15T10:00:00Z",
      "updated_at": "2025-10-15T10:00:00Z",
      "question_index": 5,
      "question": "What are the user's food preferences?",
      "answer": "The user enjoys sushi, Italian cuisine, and prefers restaurants in the Mission district. Typically dines late (8-9pm) and values quick service."
    }
  ],
  "used_query": "hungry eat tonight food restaurant",
  "debug": {
    "stage3_count": 1,
    "memory_count": 1,
    "total_candidates": 8,
    "threshold": 0.25,
    "embedding_time_ms": 45,
    "search_time_ms": 23,
    "total_time_ms": 68
  }
}

Key Features:

  • Message History Input - Pass conversation context, not just a question

  • Unified Retrieval - Searches both Stage-3 Q&A and conversation memories

  • Source Attribution - Know whether chunks come from memories or agentic search

  • Low Latency - Fast vector search, no LLM calls

Usage Pattern:

# At every agent turn
result = get_relevant_chunks(user_id, conversation_history)
if result["chunks"]:
    context = "\n\n".join(c["text"] for c in result["chunks"])
    user_message += f"\n\nRelevant context:\n{context}"

Key Parameters:

  • user_id (required)

  • history (required) - Recent conversation turns

  • k (optional, default 6) - Number of chunks to return

  • max_context_turns (optional, default 4) - How many recent turns to consider

  • similarity_threshold (optional, default 0.25) - Minimum similarity score (0.25-0.50 typical)

  • exclude_chunk_ids (optional) - Skip specific chunks
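
For example, to retrieve more precise matches while skipping chunks already shown this session (seen_ids is app-side bookkeeping, not an API field; uses the get_relevant_chunks wrapper from the Authentication section):

seen_ids = ["mem_789"]  # chunk IDs already surfaced this session
result = get_relevant_chunks(
    user_id="usr_42af7c",
    history=[{"role": "user", "content": "Book a table for Friday"}],
    k=4,
    similarity_threshold=0.4,        # precise end of the typical 0.25-0.50 range
    exclude_chunk_ids=seen_ids,
)
seen_ids += [c["id"] for c in result["chunks"]]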


5. Query User Profile

Ask complex natural-language questions about the user. This is a higher-latency, higher-intelligence endpoint that can run agentic search when needed.

Endpoint: POST /query

Request:

{
  "user_id": "usr_42af7c",
  "question": "Who are the most important people in the user's professional network? For each person, describe their relationship, communication cadence, and what they typically discuss.",
  "use_cache": true
}

Response:

{
  "user_id": "usr_42af7c",
  "question": "Who are the most important people in the user's professional network?...",
  "answer": "Based on email analysis, here are the key relationships:\n\n1. **David Kimball (Recruiter)** - Long-term trusted career advisor. Communication: 2-3x per month with spikes during job searches. Topics: AI/ML opportunities, market intelligence, career strategy.\n\n2. **George Maloney (Fastino Co-Founder)** - Current employer relationship. Communication: Daily during onboarding, now sporadic milestone-driven. Topics: Strategic decisions, company vision.\n\n3. **Sarah Chen (Technical Co-founder)** - Close collaborator. Communication: Multiple times daily. Topics: Architecture decisions, code reviews, product strategy...",
  "cached": false
}

Key Features:

  • Cache-First - Attempts to answer from existing data (Stage 2 + Stage 3 + memories)

  • Agentic Fallback - If cache can't answer, runs full document search agent

  • Automatic Persistence - New answers are saved to Stage 3 for future caching

  • Complex Queries - Can handle multi-part questions requiring synthesis

Usage as Tool Call:

tools = [
    {
        "name": "query_user_profile",
        "description": "Ask detailed questions about the user's preferences, relationships, work style, or any personal information",
        "parameters": {
            "question": "The natural language question to ask"
        }
    }
]

# Agent decides when to call this tool
# Example: User asks "Schedule a meeting with my most important contacts"
# Agent calls: query_user_profile("Who are the user's most important professional contacts?")

Key Parameters:

  • user_id (required)

  • question (required, non-empty) - Natural language question

  • use_cache (optional, default true) - Set to false to force fresh agent run

Performance:

  • Cached response: ~100-500ms

  • Agent run: ~5-30 seconds (depending on complexity)
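
Because an uncached run can take tens of seconds, give your HTTP client a generous timeout when forcing a fresh agent run; a sketch with raw requests:

import requests

resp = requests.post(
    "https://api.fastino.ai/query",
    headers={"x-api-key": "<your_api_key_here>",
             "Content-Type": "application/json"},
    json={
        "user_id": "usr_42af7c",
        "question": "What does the user usually discuss with their co-founder?",
        "use_cache": False,  # skip the cache and force a fresh agent run
    },
    timeout=180,  # agent runs are slow; cached answers return in well under a second
)
print(resp.json()["answer"])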

Integration Patterns

Pattern 1: System Prompt Enhancement

def create_agent_prompt(user_id):
    # Get deterministic profile summary
    summary = get_profile_summary(user_id)
    
    return f"""You are a helpful AI assistant.

User Context:
{summary['summary']}

Keep the user's preferences, work style, and context in mind when responding.
"""

Pattern 2: Contextual Grounding (Every Turn)

def process_message(user_id, conversation_history, new_message):
    # Add new message to history
    conversation_history.append({
        "role": "user",
        "content": new_message
    })
    
    # Get relevant chunks
    chunks_response = get_relevant_chunks(
        user_id=user_id,
        history=conversation_history,
        k=6
    )
    
    # Append context if relevant chunks found
    if chunks_response['chunks']:
        context = "\n".join([
            f"- {chunk['text']}" 
            for chunk in chunks_response['chunks']
        ])
        enhanced_message = f"{new_message}\n\n[Relevant context:\n{context}]"
    else:
        enhanced_message = new_message
    
    # Send to LLM
    return llm.chat(enhanced_message)

Pattern 3: Tool-Augmented Agent

tools = [
    {
        "name": "query_user_profile",
        "description": "Ask detailed questions about the user when you need specific information not in the current context",
        "parameters": {
            "type": "object",
            "properties": {
                "question": {
                    "type": "string",
                    "description": "Natural language question about the user"
                }
            },
            "required": ["question"]
        }
    }
]

def query_user_profile_tool(question: str):
    response = query_user_profile(
        user_id=current_user_id,
        question=question,
        use_cache=True
    )
    return response['answer']

# Agent automatically calls this when it needs more context
# Example: User says "Book dinner with my team"
# Agent calls: query_user_profile("Who is on the user's team and what are their dietary preferences?")

Pattern 4: Continuous Learning

def after_conversation(user_id, messages):
    # Ingest conversation for learning
    ingest_response = ingest_data(
        user_id=user_id,
        message_history=[
            {
                "role": msg['role'],
                "content": msg['content'],
                "timestamp": msg['timestamp']
            }
            for msg in messages
        ],
        options={"dedupe": True}
    )
    
    # System automatically:
    # 1. Extracts facts and updates memories
    # 2. Triggers Stage 3 at memory thresholds
    # 3. Evolves user understanding over time

Complete Example: Restaurant Recommendation Agent

import requests
from datetime import datetime, timezone

BASE_URL = "https://api.fastino.ai"
API_KEY = "<your_api_key>"

def get_headers():
    return {
        "x-api-key": f"{API_KEY}",
        "Content-Type": "application/json"
    }

# 1. Initialize user (one-time)
def register_user(email):
    response = requests.post(
        f"{BASE_URL}/register",
        headers=get_headers(),
        json={
            "email": email,
            "purpose": "Restaurant recommendation agent that suggests dining options based on user preferences, dietary restrictions, and past experiences",
            "traits": {
                "name": "User Name",
                "timezone": "America/Los_Angeles"
            }
        }
    )
    return response.json()

# 2. Get profile summary for system prompt
def get_system_prompt(user_id):
    response = requests.get(
        f"{BASE_URL}/summary",
        headers=get_headers(),
        params={"user_id": user_id, "max_chars": 500}
    )
    summary = response.json()['summary']
    
    return f"""You are a restaurant recommendation assistant.

User Profile:
{summary}

Provide personalized restaurant suggestions based on the user's preferences, dietary restrictions, and past experiences.
"""

# 3. Process user message with context
def process_message(user_id, conversation_history, new_message):
    # Get relevant context
    chunks_response = requests.post(
        f"{BASE_URL}/chunks",
        headers=get_headers(),
        json={
            "user_id": user_id,
            "history": conversation_history + [
                {"role": "user", "content": new_message}
            ],
            "k": 5
        }
    )
    
    chunks = chunks_response.json()['chunks']
    
    # Build enhanced message
    if chunks:
        context = "\n".join([f"- {c['text']}" for c in chunks])
        enhanced = f"{new_message}\n\n[Context: {context}]"
    else:
        enhanced = new_message
    
    # Send to your LLM
    # ... (your LLM call here)
    
    return enhanced

# 4. Learn from conversation
def save_conversation(user_id, messages):
    requests.post(
        f"{BASE_URL}/ingest",
        headers=get_headers(),
        json={
            "user_id": user_id,
            "source": "restaurant_agent",
            "message_history": [
                {
                    "role": msg['role'],
                    "content": msg['content'],
                    "timestamp": msg.get('timestamp', datetime.utcnow().isoformat() + 'Z')
                }
                for msg in messages
            ],
            "options": {"dedupe": True}
        }
    )

# Usage

# First-time setup: /register returns the user_id used by every other endpoint
registration = register_user("user@example.com")
user_id = registration["user_id"]

# Get system prompt
system_prompt = get_system_prompt(user_id)

# Chat loop
conversation = []
while True:
    user_input = input("You: ")
    if user_input.lower() == 'quit':
        break
    
    # Process with context
    enhanced_message = process_message(user_id, conversation, user_input)
    
    # Get response from your LLM
    # assistant_response = your_llm_call(system_prompt, enhanced_message)
    
    # Update conversation
    conversation.append({"role": "user", "content": user_input})
    # conversation.append({"role": "assistant", "content": assistant_response})

# Save conversation for learning
save_conversation(user_id, conversation)

API Reference Summary

Endpoint     Method   Auth       Purpose                   Latency
/register    POST     Required   Initialize user profile   ~100ms
/ingest      POST     Required   Add user data             ~100-500ms
/summary     GET      Required   Get profile summary       ~50-200ms
/chunks      POST     None       Get relevant context      ~50-150ms
/query       POST     None       Ask complex questions     ~5s to 3 minutes (agent runs when cache fails)

Best Practices

1. Registration

  • Try to provide a purpose to get better personalization

  • Include social URLs (LinkedIn, Twitter) for richer Stage 1 data

  • Register users as soon as they sign up

2. Ingestion

  • Ingest data continuously as it becomes available

  • Use dedupe: true to prevent duplicate processing

  • Include timestamps for all messages

  • Batch related documents together

3. Retrieval

  • Use /chunks at every agent turn for grounding

  • Use /summary once per session for system prompts

  • Use /query as a tool call for complex questions

  • Set appropriate similarity_threshold (0.25-0.35 for broad, 0.4-0.5 for precise)

4. Performance

  • Cache profile summaries per session

  • Use use_cache: true for queries (default)

  • Consider excluding already-used chunks with exclude_chunk_ids
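
A per-session summary cache can be as small as functools.lru_cache (a sketch; assumes the get_profile_summary wrapper from the Authentication section):

from functools import lru_cache

@lru_cache(maxsize=256)
def cached_summary(user_id: str) -> str:
    # One /summary call per user per process; clear or restart per session
    return get_profile_summary(user_id)["summary"]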

5. Privacy

  • All data is automatically anonymized with GLiNER-2

  • PII is redacted before storage

  • User data is isolated by user_id
