Personalization Use Cases

Personalized Retrieval

You can allow agents to fetch and rank the most relevant user-specific snippets from Fastino’s memory — enabling grounded reasoning, context continuity, and truly personal responses.

Using Fastino’s /chunks endpoint, agents can access top-k excerpts from user events and documents to support personalized chat, decision assistance, or proactive reasoning.

Overview

Personalized retrieval gives agents long-term memory recall — not just static summaries.
By retrieving specific, semantically relevant chunks of user data, agents can:

  • Reference prior conversations, notes, or emails directly.

  • Ground responses in authentic user context.

  • Build dynamic memory graphs across sessions and tools.

  • Avoid repeating questions or redundant suggestions.

This makes assistants contextually aware, coherent, and adaptive — across every interaction.

Key Components

  • Chunk Store: Fastino’s internal memory of embedded user data. Example: “doc_7#p1 → Typical focus blocks 9–12 PT.”

  • Semantic Retrieval: top-k search using adaptive user embeddings. Example: finds notes, decisions, or events relevant to the query.

  • Context Fusion: combines retrieved snippets with the active conversation context. Example: blends work style with recent meeting notes.

  • RAG Interface: returns excerpts for use in model prompts or reasoning. Example: provides chunk_id, excerpt, and score for each result.

Example: Retrieving User Context for Grounding

POST /chunks
{
  "user_id": "usr_42af7c",
  "conversation": [
    {"role": "system", "content": "You are a helpful assistant that schedules meetings."},
    {"role": "user", "content": "Can you move next week’s sync?"}
  ],
  "top_k": 5
}

Response

{
  "results": [
    {
      "chunk_id": "doc_7#p1",
      "excerpt": "Typical focus blocks 9–12 PT; meetings after 1 PM preferred.",
      "score": 0.84
    },
    {
      "chunk_id": "evt_1#s1",
      "excerpt": "Let’s move stand-up to 2 PM starting next week.",
      "score": 0.81
    }
  ]
}

These retrieved snippets can then be fed directly into your model’s reasoning prompt.

Example: Prompt Grounding with Retrieved Snippets
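As one illustration (the helper name and prompt layout here are our own, not an SDK feature), the snippets returned by /chunks can be folded into a system prompt before the model call:

```python
def build_grounded_prompt(snippets, user_message):
    """Fold retrieved memory excerpts into a system prompt for the model.

    `snippets` is the `results` list returned by POST /chunks.
    """
    context_lines = "\n".join(f"- {s['excerpt']}" for s in snippets)
    system = (
        "You are a helpful assistant. Ground your answer in the user's "
        "known context below.\n\nUser context:\n" + context_lines
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_message},
    ]

# Using the snippets from the response above:
snippets = [
    {"chunk_id": "doc_7#p1",
     "excerpt": "Typical focus blocks 9–12 PT; meetings after 1 PM preferred.",
     "score": 0.84},
    {"chunk_id": "evt_1#s1",
     "excerpt": "Let’s move stand-up to 2 PM starting next week.",
     "score": 0.81},
]
messages = build_grounded_prompt(snippets, "Can you move next week’s sync?")
```

The resulting messages list can be passed to any chat-style model API.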


This ensures the model’s response is grounded in authentic, user-specific information.

Example: Combining Retrieval with Summaries

Agents can combine retrieval results with deterministic summaries for hybrid reasoning:

POST /chunks
{
  "user_id": "usr_42af7c",
  "conversation": [{"role": "user", "content": "Can you propose new meeting times for next week?"}],
  "top_k": 3
}

By merging the high-level summary (“Ash prefers meetings after 1 PM”) with specific snippets, your agent gets both abstraction and precision.
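A minimal sketch of that merge (the summary text, function name, and score threshold are illustrative assumptions):

```python
def fuse_summary_and_snippets(summary, snippets, min_score=0.75):
    """Combine a deterministic /summary result with high-scoring /chunks
    excerpts into a single context block for the reasoning model."""
    kept = [s["excerpt"] for s in snippets if s["score"] >= min_score]
    details = "\n".join(f"- {e}" for e in kept)
    return summary + "\n\nSupporting details:\n" + details

summary = "Ash prefers meetings after 1 PM."
snippets = [
    {"chunk_id": "doc_7#p1",
     "excerpt": "Typical focus blocks 9–12 PT; meetings after 1 PM preferred.",
     "score": 0.84},
    {"chunk_id": "evt_9#s2", "excerpt": "Old note.", "score": 0.40},  # filtered out
]
context = fuse_summary_and_snippets(summary, snippets)
```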

Example: Retrieval in Multi-Agent Environments

Different agents (e.g., calendar, email, and planning assistants) can all use Fastino’s retrieval layer to share context:

  • Calendar Agent: asks “When does Ash prefer meetings?” → retrieves “Typical focus blocks 9–12 PT.”

  • Email Agent: asks “How does Ash respond to scheduling changes?” → retrieves “Prefers async replies, concise updates.”

  • Chat Agent: asks “What should I remind Ash about today?” → retrieves “Pending review of design sprint doc.”

All agents read from the same memory base, ensuring cross-tool coherence.
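One way to wire this up (a sketch; `retrieve` stands in for any wrapper around POST /chunks, such as the Python implementation in the next section) is to give every agent the same retrieval entry point and user_id, varying only the query:

```python
# Each agent shares one user_id and one retrieval entry point, differing
# only in the query it sends to the common memory base.
AGENT_QUERIES = {
    "calendar": "When does Ash prefer meetings?",
    "email": "How does Ash respond to scheduling changes?",
    "chat": "What should I remind Ash about today?",
}

def gather_shared_context(retrieve, user_id):
    """Run every agent's query against the same user memory."""
    return {
        agent: retrieve(user_id, [{"role": "user", "content": query}])
        for agent, query in AGENT_QUERIES.items()
    }

# With a stub retriever, every agent reads from the same store:
stub = lambda user_id, conversation: [
    {"excerpt": conversation[0]["content"], "score": 1.0}
]
shared = gather_shared_context(stub, "usr_42af7c")
```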

Example Implementation (Python)

import requests

BASE_URL = "https://api.fastino.ai"
HEADERS = {"x-api-key": "sk_live_456", "Content-Type": "application/json"}

def retrieve_user_context(user_id, conversation, top_k=5):
    payload = {"user_id": user_id, "conversation": conversation, "top_k": top_k}
    r = requests.post(f"{BASE_URL}/chunks", json=payload, headers=HEADERS)
    r.raise_for_status()  # surface HTTP errors early
    return r.json()["results"]

# Example usage
conversation = [
    {"role": "system", "content": "You are a scheduling assistant."},
    {"role": "user", "content": "Can you find the best time for the product review meeting?"}
]

results = retrieve_user_context("usr_42af7c", conversation)
for r in results:
    print(f"{r['excerpt']} (score={r['score']})")

This returns the top-ranked user memory snippets for grounding.

Architecture

Retrieval and Grounding Flow


Each user’s memory is stored as vectorized chunks, indexed for fast semantic retrieval.

Example: Cross-Tool Retrieval

Since Fastino indexes all user data under the same user_id, retrieval can cross tool boundaries:

POST /chunks
{
  "user_id": "usr_42af7c",
  "conversation": [
    {"role": "user", "content": "Summarize my recent product launch discussions."}
  ],
  "top_k": 5
}

Response

{
  "results": [
    {"chunk_id": "doc_not_1#p2", "excerpt": "Launch prep meeting scheduled with design and marketing.", "score": 0.82},
    {"chunk_id": "evt_slack_4#p1", "excerpt": "Agreed to finalize pitch deck by Friday.", "score": 0.79}
  ]
}

Here, a single call retrieves memory that originated in Notion and Slack; Gmail and any other connected sources are searched the same way.

Confidence Scoring and Deduplication

Each retrieved snippet includes a relevance score (0–1).
You can filter or rerank snippets by score threshold:

MIN_SCORE = 0.75  # tune per application

relevant = [chunk for chunk in results if chunk["score"] >= MIN_SCORE]

Fastino automatically deduplicates by chunk ID and source to ensure clean, non-redundant context.

Combining Retrieval with Reasoning

Use Fastino’s retrieved context as structured input for any reasoning model or chain:

  1. Retrieve user snippets via /chunks.

  2. Summarize or compress via /summary.

  3. Feed the results to your reasoning model (your agent / LLM context).

  4. Log new insights via /ingest.

This enables iterative, context-aware reasoning loops that stay grounded over time.
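The four steps above can be sketched as a loop with injected callables (a pattern of our own, not an SDK feature; each callable would wrap the endpoint listed above):

```python
def reasoning_step(user_id, conversation, retrieve, summarize, reason, ingest):
    """One grounded reasoning iteration: retrieve → summarize → reason → log.

    `retrieve`, `summarize`, `reason`, and `ingest` are caller-supplied
    functions wrapping /chunks, /summary, your LLM, and /ingest respectively.
    """
    snippets = retrieve(user_id, conversation)                # 1. /chunks
    summary = summarize(user_id)                              # 2. /summary
    answer = reason(conversation, summary, snippets)          # 3. agent / LLM
    ingest(user_id, {"type": "insight", "content": answer})   # 4. /ingest
    return answer

# Stub wiring for illustration:
log = []
answer = reasoning_step(
    "usr_42af7c",
    [{"role": "user", "content": "Plan my week."}],
    retrieve=lambda uid, conv: [{"excerpt": "Focus blocks 9–12 PT.", "score": 0.9}],
    summarize=lambda uid: "Ash prefers meetings after 1 PM.",
    reason=lambda conv, s, snips: f"{s} {snips[0]['excerpt']}",
    ingest=lambda uid, event: log.append(event),
)
```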

Integration with Other Use Cases

  • Cross-tool Reasoning: retrieve unified context across multiple tools.

  • Context Handoff: transfer relevant snippets between devices or sessions.

  • Proactive Alignment: fetch relevant data before taking proactive actions.

  • Decision Prediction: retrieve past choices to guide new recommendations.

Use Cases

  • Conversational Agents: retrieve relevant user memories to ground replies.

  • Knowledge Assistants: fetch context-specific notes and documents.

  • Scheduling Systems: retrieve historical meeting and routine data.

  • RAG-Powered Copilots: combine Fastino retrieval with model reasoning pipelines.

  • Adaptive Writing Agents: reference prior tone, phrasing, or writing samples dynamically.

Best Practices

  • Tune top_k (default: 5) to balance retrieval precision against context size.

  • Cache results briefly in session memory for speed.

  • Filter by source or content type if needed (options.source_filter).

  • Pair retrieval with deterministic summaries for grounding.

  • Use feedback ingestion to refine embedding accuracy over time.

  • Never expose raw chunk IDs — treat as internal references.
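For instance, the source filter mentioned above might be passed in a POST /chunks request like this (the payload shape and the filter values are assumptions extrapolated from this page’s other examples):

```json
{
  "user_id": "usr_42af7c",
  "conversation": [{"role": "user", "content": "What did we decide about the launch?"}],
  "top_k": 5,
  "options": {"source_filter": ["slack", "notion"]}
}
```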

Example: Deterministic Retrieval Summary
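A POST /summary request along these lines (payload shape assumed from this page’s other endpoint examples) could yield the response below:

```json
{
  "user_id": "usr_42af7c"
}
```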

Response

{
  "summary": "Ash’s top retrieval sources include Gmail and Notion. Relevant context involves product launches, scheduling patterns, and recent design updates."
}

This provides a lightweight overview before performing a full retrieval operation.

Summary

The Personalized Retrieval use case gives your agents true user-level memory recall, enabling them to ground reasoning and output in authentic, contextual data.

By combining vectorized retrieval (/chunks) with deterministic summaries (/summary), Fastino turns every assistant into a personalized, contextually aware reasoning system — grounded, adaptive, and continuously learning.

Next, continue to Personalization Use Cases → Knowledge Graph Reasoning to explore how Fastino connects relationships and reasoning paths across user data.
