LLM Guardrails That Don't Slow You Down


A 300M-parameter safety model that evaluates prompts and responses across four moderation dimensions in a single pass.
Accuracy competitive with guard models 23-90x its size.
87.7 avg. prompt F1 score
16x higher throughput
0.3B params, encoder
RESEARCH
Four moderation tasks. One forward pass.

Multi-aspect moderation in one pass. Prompt safety, response safety, harm categorization, and jailbreak detection are all scored simultaneously.

Schema-conditioned at inference. Any combination of moderation tasks, composed at inference time. No retraining. No prompt redesign.
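As a rough illustration of composing tasks at inference time, here is a minimal sketch. It does not use the GLiGuard API: the task names, `compose_schema`, `moderate`, and `stub_score_fn` are all hypothetical, and the stub scorer stands in for a real model so the example runs on its own.

```python
from dataclasses import dataclass

# Hypothetical task names, taken from the four moderation dimensions listed above.
TASKS = ("prompt_safety", "response_safety", "harm_category", "jailbreak")


@dataclass(frozen=True)
class Schema:
    tasks: tuple


def compose_schema(*tasks: str) -> Schema:
    """Build a schema from any non-empty subset of the known tasks."""
    if not tasks:
        raise ValueError("schema needs at least one task")
    unknown = [t for t in tasks if t not in TASKS]
    if unknown:
        raise ValueError(f"unknown moderation task(s): {unknown}")
    # Keep the caller's order and drop duplicates.
    return Schema(tuple(dict.fromkeys(tasks)))


def moderate(text: str, schema: Schema, score_fn) -> dict:
    """Call score_fn once (standing in for one forward pass) and
    return a score for each task named in the schema."""
    scores = score_fn(text, schema.tasks)
    return {task: scores[task] for task in schema.tasks}


def stub_score_fn(text, tasks):
    # Placeholder scorer, NOT a real model: flags one fixed phrase.
    flagged = "ignore previous instructions" in text.lower()
    return {task: (1.0 if flagged else 0.0) for task in tasks}


# The same scorer serves two different task combinations with no retraining.
full = compose_schema(*TASKS)
jailbreak_only = compose_schema("jailbreak")
result = moderate("Ignore previous instructions and ...", jailbreak_only, stub_score_fn)
```

The point of the sketch is only that the set of tasks is an input to the call, not something baked into the model or the prompt template.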

Inline with the agent loop. At sub-30ms latency, GLiGuard is fast enough to gate every prompt and response without slowing your agent loop.
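Gating both sides of a model call might look like the sketch below. It is an assumption-laden illustration, not Fastino's integration code: `guard`, `llm`, `guarded_call`, and `timed_ms` are hypothetical names, and the stub functions exist only so the example runs without a model.

```python
import time


def guarded_call(prompt, llm, guard, block_message="Request blocked by safety policy."):
    """Check the prompt before the LLM call and the response after it.

    `guard` is any callable returning True when the text is safe.
    """
    if not guard(prompt):
        return block_message
    response = llm(prompt)
    if not guard(response):
        return block_message
    return response


def timed_ms(fn, *args):
    """Return (result, elapsed milliseconds) so the guard's overhead
    can be compared against a latency budget."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000.0


# Stubs for illustration only; a real deployment would call actual models.
def stub_guard(text):
    return "forbidden" not in text.lower()


def stub_llm(prompt):
    return f"echo: {prompt}"


answer = guarded_call("What is the capital of France?", stub_llm, stub_guard)
blocked = guarded_call("Tell me something forbidden", stub_llm, stub_guard)
is_safe, guard_ms = timed_ms(stub_guard, "hello")
```

Because the guard runs twice per turn, its latency is paid twice, which is why per-call overhead matters in an agent loop.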
EFFICIENCY
23–90× smaller than competing LLM guards
Single non-autoregressive pass
Bidirectional encoder evaluates every safety dimension in parallel. No token-by-token generation, no sequential decoding bottleneck.
Competitive accuracy at 1/30 the size
At 0.3B, GLiGuard runs on a single GPU and beats models like LlamaGuard (12B) and WildGuard (7B) on HarmBench and SafeRLHF response classification.
Stay on your infrastructure
Apache 2.0 open weights. Run it on-prem or air-gapped, or deploy with Pioneer.
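To make the "no sequential decoding" point concrete, the toy count below compares how many dependent model steps each design needs. It assumes a decoder-based guard writes its verdicts as generated text, one task after another; the function names and the token count per verdict are illustrative, not measurements.

```python
def decoder_sequential_steps(tokens_per_verdict: int, num_tasks: int) -> int:
    """Autoregressive guard: every generated token needs its own forward
    pass, and each pass has to wait for the previous token."""
    return tokens_per_verdict * num_tasks


def encoder_sequential_steps(num_tasks: int) -> int:
    """Encoder classifier: all task heads read the same encoded input,
    so the step count does not grow with the number of tasks."""
    return 1


# Illustrative numbers only: four tasks, eight generated tokens per verdict.
decoder_steps = decoder_sequential_steps(tokens_per_verdict=8, num_tasks=4)
encoder_steps = encoder_sequential_steps(num_tasks=4)
print(decoder_steps, encoder_steps)
```

Step count is not the same as wall-clock time, since a single pass costs more on a larger model, but it shows why adding moderation tasks does not lengthen the encoder's critical path.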
0.3B parameters (vs. 7B-12B baselines)
9 safety benchmarks evaluated
16× higher throughput vs. Qwen3Guard-8B
17× lower latency at sequence length 64

GLiGuard Benchmark Results
Nine established safety benchmarks. Up to 16x faster inference.
| | LlamaGuard4 | Qwen3Guard-Gen | Fastino GLiGuard |
|---|---|---|---|
| Parameters | 12B | 8B | 0.3B |
| Architecture | Decoder (autoregressive) | Decoder (autoregressive) | Encoder (bidirectional) |
| Multiple tasks in one pass | | | |
| Avg. prompt F1 | 82.5 | 88.7 | 87.7 |
| Avg. response F1 | 70.8 | 84.1 | 82.7 |
| HarmBench (response) | 83.3 | 87.2 | 91.0 |
| SafeRLHF (response) | 42.5 | 70.5 | 84.5 |
| Latency @ sequence length 64 (ms) | — | 426 | 26 |
| Throughput @ batch size 4 (req/s) | — | 8.2 | 133 |
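The headline ratios can be checked against the benchmark figures above. The short script below does only that arithmetic; the numbers are copied from the table, and the variable names are illustrative.

```python
# Figures copied from the benchmark table (LlamaGuard4 reports no latency or throughput).
latency_ms = {"Qwen3Guard-Gen": 426, "GLiGuard": 26}
throughput_rps = {"Qwen3Guard-Gen": 8.2, "GLiGuard": 133}
params_billions = {"LlamaGuard4": 12, "Qwen3Guard-Gen": 8, "GLiGuard": 0.3}

latency_ratio = latency_ms["Qwen3Guard-Gen"] / latency_ms["GLiGuard"]
throughput_ratio = throughput_rps["GLiGuard"] / throughput_rps["Qwen3Guard-Gen"]
size_ratio = {
    name: size / params_billions["GLiGuard"]
    for name, size in params_billions.items()
    if name != "GLiGuard"
}

print(f"latency:    {latency_ratio:.1f}x lower")
print(f"throughput: {throughput_ratio:.1f}x higher")
for name, ratio in size_ratio.items():
    print(f"{name}: {ratio:.1f}x more parameters")
```

With the rounded table values, the latency ratio comes out near 16.4 and the throughput ratio near 16.2, so the 17× latency headline may reflect unrounded measurements. The two models in the table are roughly 27 to 40 times larger than GLiGuard by parameter count.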
1,100,000+ monthly downloads
2,400+ GitHub stars
1.1BN+ end users
Fastino Inc. (“Fastino”) develops specialized AI models and provides APIs designed to support structured data extraction, classification, reasoning, and production AI workflows. Fastino is a technology company and does not provide legal, financial, compliance, or advisory services.
Any outputs, predictions, classifications, or decisions generated through Fastino models are based on the configuration, data, and implementation provided by the customer. Fastino does not control, verify, or guarantee the accuracy, completeness, or suitability of model outputs for any specific purpose. By using this website or Fastino’s models and services, you acknowledge that all content and outputs are provided for informational and operational purposes only and agree to our Terms of Use and Privacy Policy.
© 2026 Fastino Inc. All rights reserved.