LLM Guardrails That Don't Slow You Down

A 300M-parameter safety model that evaluates prompts and responses across four moderation dimensions in a single pass, with accuracy competitive with guard models 23-90x its size.

87.7 avg. prompt F1 score

16x higher throughput

0.3B-parameter encoder

RESEARCH

Four moderation tasks.
One forward pass.

Multi-aspect moderation in one pass. Prompt safety, response safety, harm categorization, and jailbreak detection are all scored simultaneously.

Schema-conditioned at inference. Any combination of moderation tasks, composed at inference time. No retraining. No prompt redesign.
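As a rough sketch of what composing tasks at inference time could look like, the snippet below builds a moderation schema from any subset of the four tasks named above. Everything in it is hypothetical: `TASK_LABELS`, `compose_schema`, and the label names (including the placeholder harm categories) are illustrative assumptions, not GLiGuard's actual API or taxonomy.

```python
from typing import Dict, List

# Hypothetical task catalogue. The four task names mirror the page; the
# label sets are placeholders, not GLiGuard's real taxonomy.
TASK_LABELS: Dict[str, List[str]] = {
    "prompt_safety": ["safe", "unsafe"],
    "response_safety": ["safe", "unsafe"],
    "harm_category": ["category_a", "category_b"],  # placeholder names
    "jailbreak": ["benign", "jailbreak"],
}


def compose_schema(*tasks: str) -> Dict[str, List[str]]:
    """Return a schema containing only the requested tasks, in request order."""
    unknown = [t for t in tasks if t not in TASK_LABELS]
    if unknown:
        raise ValueError(f"unknown moderation task(s): {unknown}")
    return {t: list(TASK_LABELS[t]) for t in tasks}


# Two deployments pick different task combinations at call time; nothing
# about the (hypothetical) model would need to change between them.
chat_schema = compose_schema("prompt_safety", "jailbreak")
audit_schema = compose_schema(*TASK_LABELS)  # all four tasks
```

The point of the design is that the set of tasks is plain data passed alongside the input, so changing what gets moderated is a call-site decision rather than a training or prompt-engineering step.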

Inline with the agent loop. At sub-30ms latency, GLiGuard is fast enough to gate every prompt and response without slowing your agent loop.
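To make "gate every prompt and response" concrete, here is a minimal sketch of a guarded agent step. It assumes a generic `classify(prompt, response)` callable that returns a safe/unsafe verdict; `Verdict`, `guarded_step`, `toy_classify`, and the refusal text are hypothetical names introduced for illustration, and the keyword check merely stands in for a real guard model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    safe: bool
    reason: str = ""


# A classifier takes the prompt and, optionally, the response to judge.
Classify = Callable[[str, Optional[str]], Verdict]

REFUSAL = "Request blocked by safety policy."


def guarded_step(
    prompt: str,
    generate: Callable[[str], str],
    classify: Classify,
    refusal: str = REFUSAL,
) -> str:
    """Run one agent step with a safety check before and after generation."""
    if not classify(prompt, None).safe:
        return refusal  # unsafe prompt: the generator is never called
    response = generate(prompt)
    if not classify(prompt, response).safe:
        return refusal  # unsafe response: it is never returned
    return response


def toy_classify(prompt: str, response: Optional[str] = None) -> Verdict:
    """Stand-in classifier: flags any text containing the word 'forbidden'."""
    text = response if response is not None else prompt
    if "forbidden" in text.lower():
        return Verdict(False, "matched toy keyword")
    return Verdict(True)
```

Because the guard runs twice per step, its latency is paid on every turn, which is why a fast classifier matters for this pattern.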

EFFICIENCY

23-90× smaller than competing LLM guards

Single non-autoregressive pass

A bidirectional encoder evaluates every safety dimension in parallel. No token-by-token generation, no sequential decoding bottleneck.
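The toy example below illustrates the general idea of scoring several labels in parallel from one shared text encoding, in contrast to generating a verdict token by token. It is a sketch under stated assumptions, not GLiGuard's architecture: the vectors are made up, and `score_all_labels` is a hypothetical helper that applies an independent sigmoid to a dot product for each label.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def score_all_labels(
    text_vec: List[float], label_vecs: Dict[str, List[float]]
) -> Dict[str, float]:
    """Score every label against the same text encoding, independently."""
    return {
        label: sigmoid(sum(t * w for t, w in zip(text_vec, vec)))
        for label, vec in label_vecs.items()
    }


# Made-up 3-dimensional vectors, for illustration only.
text_vec = [0.9, -0.2, 0.4]
label_vecs = {
    "prompt_unsafe": [2.0, 0.0, 1.0],
    "jailbreak": [-1.0, 1.0, 0.0],
}

scores = score_all_labels(text_vec, label_vecs)
flagged = [label for label, s in scores.items() if s > 0.5]
```

Each label's score depends only on the shared encoding, so adding another label adds one more dot product rather than another decoding step.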

Competitive accuracy at 1/30 the size

At 0.3B, GLiGuard runs on a single GPU and beats models like LlamaGuard4 (12B) and WildGuard (7B) on HarmBench and SafeRLHF response classification.

Stay on your infrastructure

Apache 2.0 open weights. Run on-prem or air-gapped, or deploy with Pioneer.

0.3B parameters (vs. 7B-12B baselines)

9 safety benchmarks evaluated

16× higher throughput vs. Qwen3Guard-8B

17× lower latency at sequence length 64

GLiGuard Benchmark Results

Nine established safety benchmarks. Up to 16x faster inference.

| | LlamaGuard4 | Qwen3Guard-Gen | FASTINO GLiGuard |
|---|---|---|---|
| Parameters | 12B | 8B | 0.3B |
| Architecture | Decoder (autoregressive) | Decoder (autoregressive) | Encoder (bidirectional) |
| Multiple tasks in one pass | | | |
| Avg. prompt F1 | 82.5 | 88.7 | 87.7 |
| Avg. response F1 | 70.8 | 84.1 | 82.7 |
| HarmBench (response) | 83.3 | 87.2 | 91.0 |
| SafeRLHF (response) | 42.5 | 70.5 | 84.5 |
| Latency @ SL 64 (ms) | | 426 | 26 |
| Throughput @ BS 4 (req/s) | | 8.2 | 133 |
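The headline ratios can be sanity-checked from the benchmark figures above. The short calculation below assumes that the two throughput values (8.2 and 133 req/s) belong to Qwen3Guard-Gen and GLiGuard respectively, which matches the "vs. Qwen3Guard-8B" comparison stated earlier; `ratio` is just a helper name chosen for this sketch.

```python
def ratio(larger: float, smaller: float) -> float:
    """Return larger / smaller, rounded to one decimal place."""
    return round(larger / smaller, 1)


GLIGUARD_PARAMS_B = 0.3  # billions of parameters, from the table

size_vs_llamaguard4 = ratio(12, GLIGUARD_PARAMS_B)  # 40.0
size_vs_qwen3guard = ratio(8, GLIGUARD_PARAMS_B)    # 26.7
throughput_gain = ratio(133, 8.2)                   # 16.2

print(f"LlamaGuard4 is {size_vs_llamaguard4}x larger")
print(f"Qwen3Guard-Gen is {size_vs_qwen3guard}x larger")
print(f"GLiGuard throughput is {throughput_gain}x higher at batch size 4")
```

The 16.2x result lines up with the quoted 16× throughput figure, and both size ratios fall inside the quoted 23-90× range.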

1,100,000+ monthly downloads

2,400+ GitHub stars

1.1B+ end users

Fastino Inc. (“Fastino”) develops specialized AI models and provides APIs designed to support structured data extraction, classification, reasoning, and production AI workflows. Fastino is a technology company and does not provide legal, financial, compliance, or advisory services.

Any outputs, predictions, classifications, or decisions generated through Fastino models are based on the configuration, data, and implementation provided by the customer. Fastino does not control, verify, or guarantee the accuracy, completeness, or suitability of model outputs for any specific purpose. By using this website or Fastino’s models and services, you acknowledge that all content and outputs are provided for informational and operational purposes only and agree to our Terms of Use and Privacy Policy.

© 2026 Fastino Inc. All rights reserved.