New Models
February 27, 2025

Fastino PII: A lightweight architecture for personally identifiable information redaction

ABSTRACT

Urchade Zaratiana, Ashley Lewis, Dan Iter, Julia White, Oliver Boyd, Riley Carlson @ Fastino AI

We introduce Fastino PII, a high-performance model for redacting Personally Identifiable Information (PII) from unstructured text. The model identifies and redacts multiple types of PII, including names, phone numbers, addresses, emails, IP addresses, usernames, and other sensitive data. Leveraging the Fastino architecture, Fastino PII sets a new state of the art in F1 score, outperforming both larger and smaller LLMs, including GPT-4o. This report outlines the task format and evaluation of Fastino PII.
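To make the redaction task concrete, here is a minimal sketch of the final step: replacing detected spans with label placeholders. It assumes the detector returns character offsets with a label for each entity. The `redact` function, the `Span` tuple layout, and the `[LABEL]` placeholder style are illustrative assumptions, not the Fastino PII interface.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end, label), with `end` exclusive.
Span = Tuple[int, int, str]


def redact(text: str, spans: List[Span]) -> str:
    """Replace each labeled span in `text` with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip a span that overlaps one already redacted.
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def find_span(text: str, value: str, label: str) -> Span:
    """Helper for the example only: locate `value` and build its span."""
    start = text.index(value)
    return (start, start + len(value), label)


text = "Contact Jane Doe at jane@example.com."
spans = [
    find_span(text, "Jane Doe", "NAME"),
    find_span(text, "jane@example.com", "EMAIL"),
]
redacted = redact(text, spans)
print(redacted)  # Contact [NAME] at [EMAIL].
```

In a real pipeline the spans would come from the model rather than from `find_span`, which exists here only to keep the example self-contained.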

MODEL DESIGN

The Fastino PII model uses the same lightweight architecture as Fastino, optimized for low-latency, high-efficiency processing.

Key features include:

• Training: Trained on diverse datasets containing various forms of PII across multiple domains and languages.

• Lightweight Inference: Optimized for CPU environments to enable cost-effective deployment.

RESULTS

The performance of Fastino PII was evaluated against leading models, including GPT-4o-mini, GPT-4o, and Gemini-1.5-Flash, on two metrics: F1 score (the harmonic mean of precision and recall in identifying personally identifiable information) and latency (processing time).
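For readers unfamiliar with the metric, the sketch below computes precision, recall, and F1 over sets of predicted and gold entities. It assumes exact-match scoring on `(start, end, label)` tuples; this report does not specify the matching scheme used in the evaluation, so treat the function and the sample data as illustrative only.

```python
from typing import Set, Tuple

Entity = Tuple[int, int, str]  # hypothetical (start, end, label) format


def precision_recall_f1(predicted: Set[Entity], gold: Set[Entity]):
    """Exact-match precision, recall, and F1 over entity sets."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: one correct entity, one spurious, one missed.
gold = {(8, 16, "NAME"), (20, 36, "EMAIL")}
predicted = {(8, 16, "NAME"), (40, 52, "PHONE")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(precision, recall, f1)  # 0.5 0.5 0.5
```

Because F1 is a harmonic mean, it drops sharply if either precision or recall is low, which is why it is a common single-number summary for entity detection.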

Among all models tested, Fastino PII achieved the highest F1 score of 96.94, outperforming GPT-4o (96.30), Gemini-1.5-Flash (95.11), and GPT-4o-mini (89.50). In addition to its higher F1 score, Fastino PII demonstrated a substantial advantage in processing speed, with a latency of just 257 milliseconds when running on a CPU. In contrast, the other models exhibited latencies roughly 9.5 to 11 times higher: GPT-4o-mini (2812 ms), GPT-4o (2855 ms), and Gemini-1.5-Flash (2450 ms).
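The relative latencies follow directly from the figures reported above. The short calculation below uses only those numbers; the variable names are ours.

```python
# Latencies in milliseconds, as reported in this section.
latency_ms = {
    "Fastino PII": 257,
    "GPT-4o-mini": 2812,
    "GPT-4o": 2855,
    "Gemini-1.5-Flash": 2450,
}

baseline = latency_ms["Fastino PII"]

# Ratio of each model's latency to Fastino PII's latency.
slowdown = {name: ms / baseline for name, ms in latency_ms.items()}

for name, ratio in slowdown.items():
    print(f"{name}: {ratio:.1f}x")
# Fastino PII: 1.0x
# GPT-4o-mini: 10.9x
# GPT-4o: 11.1x
# Gemini-1.5-Flash: 9.5x
```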

This combination of high accuracy and low latency makes Fastino PII an efficient solution for PII redaction. Because F1 is high only when both recall and precision are high, the score indicates that few sensitive entities are overlooked and that false positives are rare. As a result, Fastino PII provides fast and reliable PII detection even when deployed on limited computational resources.