VLABench is an open-source benchmark for evaluating Vision-Language-Action models, featuring 100 real-world tasks with natural language instructions. Designed to assess both action and language capabilities, it supports the development of more robust AI systems. Join us in advancing trustworthy Embodied AI research through this community-driven initiative.
Our latest survey "Safety at Scale: A Comprehensive Survey of Large Model Safety" systematically analyzes safety threats facing today's large AI models, covering vision foundation models (VFMs), large language models (LLMs), vision-language pre-training (VLP) models, vision-language models (VLMs), and text-to-image (T2I) diffusion models. Our findings highlight the current landscape of AI safety research and the urgent need for robust safety measures and collaborative efforts to ensure trustworthy AI development.
The safety of vision models is critical to trustworthy AI. We proudly launch the VisionSafety Platform—a cutting-edge initiative to rigorously evaluate model robustness through highly transferable adversarial attacks and million-scale adversarial datasets. This platform represents a major leap forward in securing vision-based AI systems against emerging threats.
Advancing Trustworthy AI Through Open Collaboration
OpenTAI is an open platform where researchers collaborate to accelerate practical Trustworthy AI solutions. We prioritize tools, benchmarks, and platforms over papers, bridging research with real-world impact.
Research
VLM Red Teaming
Jailbreaking vision-language models using the models themselves, assisted by text-to-image diffusion models.
LLM Auditing
Leveraging reinforcement learning with a curiosity-driven reward to audit commercial large language models in a black-box setting.
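The card above gives only a one-line summary; as a rough, hypothetical illustration of what a curiosity-style reward can look like (scoring a response higher the less it resembles responses already collected, so the auditing policy keeps probing new behaviours of the black-box model), a minimal sketch is shown below. The `embed` helper is a toy placeholder, not part of the project's method.

```python
# Illustrative sketch (not the project's actual code): a curiosity-style
# reward that favours responses unlike those already observed, encouraging
# an RL prompt generator to explore new behaviours of the audited model.
import numpy as np


def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy character-trigram embedding; a real auditor would use a sentence
    encoder. Purely a placeholder so the sketch is runnable."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def curiosity_reward(response: str, memory: list[np.ndarray]) -> float:
    """Reward = 1 - max cosine similarity to previously seen responses."""
    emb = embed(response)
    if not memory:
        memory.append(emb)
        return 1.0
    max_sim = max(float(emb @ past) for past in memory)
    memory.append(emb)
    return 1.0 - max_sim


# Usage: the RL policy proposes audit prompts, the black-box model replies,
# and novel replies earn higher reward.
memory: list[np.ndarray] = []
for reply in ["The weather is sunny.", "The weather is sunny.", "I cannot help with that."]:
    print(round(curiosity_reward(reply, memory), 3))
```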
Multimodal Jailbreak Defense
Leveraging reinforcement learning to train a defense model to safeguard large models against multimodal jailbreaks.
Transferable and Scaled Attacks
Exploring large-scale pre-training or surrogate scaling to generate scalable, targeted, and highly transferable adversarial attacks.
Real-world Backdoor Detection
Developing simple but effective backdoor data detection and filtering methods for real-world large-scale datasets.
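As a hedged illustration of the general recipe such filtering methods often follow (score each training sample by how anomalous its features are for its class, then drop the highest-scoring ones), a minimal sketch with stand-in features is shown below; the project's actual detection methods are described on its page.

```python
# Illustrative sketch (not the project's method): flag candidate backdoor
# samples as those whose features lie far from their class centroid.
import numpy as np


def outlier_scores(features: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Distance of each sample's feature vector to its class centroid."""
    scores = np.zeros(len(features))
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = features[idx].mean(axis=0)
        scores[idx] = np.linalg.norm(features[idx] - centroid, axis=1)
    return scores


def filter_dataset(features: np.ndarray, labels: np.ndarray, quantile: float = 0.98) -> np.ndarray:
    """Keep samples below the chosen score quantile; drop the rest."""
    scores = outlier_scores(features, labels)
    return scores <= np.quantile(scores, quantile)


# Usage with random stand-in features (a real pipeline would use features
# from a pretrained encoder over the large-scale dataset being screened).
rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 64))
labels = rng.integers(0, 10, size=1000)
keep_mask = filter_dataset(feats, labels)
print(f"kept {keep_mask.sum()} / {len(keep_mask)} samples")
```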
Multimodal Jailbreak Attack
Exploring text-image dual optimization to craft more powerful white-box jailbreak attacks against vision-language models.
Benchmarks
VisionSafety Bench
An Adversarial Evaluation Platform for Vision Models
Our open-source platform provides datasets, algorithms, and tools for scalable adversarial evaluation of vision models. Now available for community use; we welcome your feedback and contributions!
vision · adversarial · million-scale
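For readers unfamiliar with adversarial evaluation, the sketch below shows the basic loop such a platform automates at scale: perturb a batch of inputs with an attack and compare clean versus robust accuracy. The PGD attack, model, and dummy batch here are illustrative assumptions, not the platform's interface.

```python
# Illustrative sketch (not the platform's API): measure robust accuracy of a
# vision model under a simple L-inf PGD attack, the kind of evaluation an
# adversarial benchmark runs at scale over curated datasets.
import torch
import torchvision


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iteratively perturb inputs within an eps-ball to maximise the loss."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv.detach()


@torch.no_grad()
def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()


# Usage: an untrained ResNet-18 and random batch stand in for the model and
# data under test; a real run would load pretrained weights and iterate over
# a benchmark dataset.
model = torchvision.models.resnet18(weights=None).eval()
x = torch.rand(4, 3, 224, 224)
y = torch.randint(0, 1000, (4,))
x_adv = pgd_attack(model, x, y)
print("clean acc:", accuracy(model, x, y), "robust acc:", accuracy(model, x_adv, y))
```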
RewardModel Bench
A Reward Model Benchmark for LLM Alignment Evaluation
A reward model benchmark for evaluating the effectiveness of alignment in large language models, covering 49 real-world scenarios and supporting both pairwise and Best-of-N (BoN) evaluation.
LLM · reward model · alignment
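A minimal sketch of the two evaluation modes mentioned above, assuming a generic reward model interface `reward(prompt, response) -> float`; the benchmark's actual harness and data format may differ.

```python
# Illustrative sketch (not the benchmark's code): pairwise accuracy (the
# reward model should rank the chosen response above the rejected one) and
# Best-of-N accuracy (it should rank the reference best candidate highest).
from typing import Callable, Sequence

RewardFn = Callable[[str, str], float]


def pairwise_accuracy(rm: RewardFn, pairs: Sequence[tuple[str, str, str]]) -> float:
    """Fraction of (prompt, chosen, rejected) triples scored correctly."""
    correct = sum(rm(p, chosen) > rm(p, rejected) for p, chosen, rejected in pairs)
    return correct / len(pairs)


def best_of_n_accuracy(rm: RewardFn, items: Sequence[tuple[str, list[str], int]]) -> float:
    """Fraction of (prompt, candidates, index_of_best) items where the
    reward model assigns its top score to the reference best candidate."""
    correct = 0
    for prompt, candidates, best_idx in items:
        scores = [rm(prompt, c) for c in candidates]
        correct += max(range(len(scores)), key=scores.__getitem__) == best_idx
    return correct / len(items)


# Usage with a toy reward model that simply prefers longer responses.
toy_rm: RewardFn = lambda prompt, response: float(len(response))
print(pairwise_accuracy(toy_rm, [("q", "a detailed answer", "a"), ("q", "ok", "longer but rejected")]))
print(best_of_n_accuracy(toy_rm, [("q", ["short", "a much longer reply", "mid"], 1)]))
```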
VLBreakBench
A Multimodal Jailbreak Benchmark for Vision-Language Models
VLBreakBench evaluates VLMs through two tiers: a base set (one jailbreak pair per query) and a challenge set (three pairs per query). Together they cover 12 safety topics and 46 subcategories (916 harmful queries), for a total of 3,654 jailbreak samples.
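To make the tiering concrete, here is a hedged sketch of how such two-tier jailbreak samples could be organized and scored. The field names, judge, and model callables are assumptions for illustration, not VLBreakBench's actual schema or harness.

```python
# Illustrative sketch (field names are assumptions, not VLBreakBench's schema):
# how a two-tier multimodal jailbreak benchmark could be structured and scored.
from dataclasses import dataclass
from typing import Callable


@dataclass
class JailbreakSample:
    topic: str       # one of the safety topics, e.g. "privacy"
    query: str       # underlying harmful query
    image_path: str  # image half of the jailbreak pair
    prompt: str      # text half of the jailbreak pair
    tier: str        # "base" (one pair per query) or "challenge" (three pairs)


def attack_success_rate(samples: list[JailbreakSample],
                        is_harmful: Callable[[str], bool],
                        vlm_answer: Callable[[str, str], str]) -> float:
    """Share of samples whose VLM answer is judged harmful (jailbreak
    succeeded); lower means the model is safer."""
    hits = sum(is_harmful(vlm_answer(s.image_path, s.prompt)) for s in samples)
    return hits / len(samples)


# Usage: evaluate base and challenge tiers separately with stand-in callables.
samples = [
    JailbreakSample("privacy", "…", "img_0001.png", "…", "base"),
    JailbreakSample("privacy", "…", "img_0002.png", "…", "challenge"),
]
always_refuses = lambda image, prompt: "I can't help with that."
judge = lambda answer: "can't" not in answer.lower()
for tier in ("base", "challenge"):
    tier_samples = [s for s in samples if s.tier == tier]
    print(tier, attack_success_rate(tier_samples, judge, always_refuses))
```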