Falcon Perception

Falcon Perception is a 0.6B-parameter multimodal model built for open-vocabulary grounding and instance segmentation rather than general visual chat. It processes image patches and text tokens together in a single Transformer, using early fusion and a hybrid attention scheme, and then generates structured outputs describing each object's coordinates, size, and segmentation mask. It targets dense grounding tasks such as natural-language object selection, promptable segmentation, and crowded-scene localization. The model card reports a Macro F1 of 68.0 on SA-Co and describes pixel-accurate, full-resolution mask generation from language-guided queries.
New Multimodal Gen 3
Released: March 31, 2026

Overview

Falcon Perception is TII UAE’s 0.6B early-fusion vision-language model for open-vocabulary grounding and instance segmentation. Given an image and a natural-language query, it can return zero, one, or many matching objects, each with a pixel-accurate mask, which makes it well suited to promptable object selection and dense visual localization tasks.
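
To make the "zero, one, or many matching objects" output concrete, here is a minimal Python sketch of how a caller might represent and consume such results. It is not the model's actual API or output schema: the `Instance` class, its field names, the normalized center-and-size convention, and the `select_largest` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instance:
    """One matched object (hypothetical schema, not the model's real format)."""
    cx: float  # box center x, assumed normalized to [0, 1]
    cy: float  # box center y, assumed normalized to [0, 1]
    w: float   # box width, assumed normalized to [0, 1]
    h: float   # box height, assumed normalized to [0, 1]
    mask: List[List[int]]  # binary mask as rows of 0/1 values

    def bbox(self, image_w: int, image_h: int) -> Tuple[float, float, float, float]:
        """Convert center/size to pixel corners (x0, y0, x1, y1)."""
        x0 = (self.cx - self.w / 2) * image_w
        y0 = (self.cy - self.h / 2) * image_h
        x1 = (self.cx + self.w / 2) * image_w
        y1 = (self.cy + self.h / 2) * image_h
        return (x0, y0, x1, y1)

    def mask_area(self) -> int:
        """Count the foreground pixels in the mask."""
        return sum(sum(row) for row in self.mask)


def select_largest(instances: List[Instance]) -> Optional[Instance]:
    """Pick the instance with the largest mask; None when the query matched nothing."""
    return max(instances, key=lambda inst: inst.mask_area(), default=None)


# Example with two made-up instances for a single query.
small = Instance(0.25, 0.25, 0.2, 0.2, [[1, 0], [0, 0]])
large = Instance(0.5, 0.5, 0.5, 0.25, [[1, 1], [1, 0]])
best = select_largest([small, large])
```

The design point is that the result is a list, so an empty list (no match) needs no special case beyond the `default=None` in `select_largest`.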

About Technology Innovation Institute

TII is a leading global research center dedicated to pushing the frontiers of knowledge.

Industry: Research
Company Size: 1027
Location: Abu Dhabi, AE
Website: tii.ae

Tools using Falcon Perception

No tools found for this model yet.

Last updated: April 1, 2026