VLABench is an open-source benchmark for evaluating Vision-Language-Action models, featuring 100 real-world tasks with natural language instructions. Designed to assess both action and language capabilities, it supports the development of more robust AI systems. Join us in advancing trustworthy Embodied AI research through this community-driven initiative.
Our latest survey "Safety at Scale: A Comprehensive Survey of Large Model Safety" systematically analyzes safety threats facing today's large AI models, covering vision foundation models (VFMs), large language models (LLMs), vision-language pre-training (VLP) models, vision-language models (VLMs), and text-to-image (T2I) diffusion models. Our findings highlight the current landscape of AI safety research and the urgent need for robust safety measures and collaborative efforts to ensure trustworthy AI development.
The safety of vision models is critical to trustworthy AI. We proudly launch the VisionSafety Platform—a cutting-edge initiative to rigorously evaluate model robustness through highly transferable adversarial attacks and million-scale adversarial datasets. This platform represents a major leap forward in securing vision-based AI systems against emerging threats.
Advancing Trustworthy AI Through Open Collaboration
OpenTAI is an open platform where researchers collaborate to accelerate practical Trustworthy AI solutions. We prioritize tools, benchmarks, and platforms over papers, bridging research with real-world impact.
Research
VLM Red Teaming
Jailbreaking vision-language models using the models themselves, assisted by text-to-image diffusion models.
LLM Auditing
Leveraging reinforcement learning with a curiosity-driven reward to audit commercial large language models in a black-box setting.
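The card above gives only a one-line summary; as a rough, hypothetical illustration of what a curiosity-style reward can look like (scoring a response higher the less it resembles responses already collected, so the auditing policy keeps probing new behaviours of the black-box model), a minimal sketch is shown below. The `embed` helper is a toy placeholder, not part of the project's method.

```python
# Illustrative sketch (not the project's actual code): a curiosity-style
# reward that favours responses unlike those already observed, encouraging
# an RL prompt generator to explore new behaviours of the audited model.
import numpy as np


def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy character-trigram embedding; a real auditor would use a sentence
    encoder. Purely a placeholder so the sketch is runnable."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def curiosity_reward(response: str, memory: list[np.ndarray]) -> float:
    """Reward = 1 - max cosine similarity to previously seen responses."""
    emb = embed(response)
    if not memory:
        memory.append(emb)
        return 1.0
    max_sim = max(float(emb @ past) for past in memory)
    memory.append(emb)
    return 1.0 - max_sim


# Usage: the RL policy proposes audit prompts, the black-box model replies,
# and novel replies earn higher reward.
memory: list[np.ndarray] = []
for reply in ["The weather is sunny.", "The weather is sunny.", "I cannot help with that."]:
    print(round(curiosity_reward(reply, memory), 3))
```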
Multimodal Jailbreak Defense
Leveraging reinforcement learning to train a defense model to safeguard large models against multimodal jailbreaks.
Transferable and Scaled Attacks
Exploring large-scale pre-training or surrogate scaling to generate scalable, targeted, and highly transferable adversarial attacks.
Real-world Backdoor Detection
Developing simple but effective backdoor data detection and filtering methods for real-world large-scale datasets.
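As a hedged illustration of the general recipe such filtering methods often follow (score each training sample by how anomalous its features are for its class, then drop the highest-scoring ones), a minimal sketch with stand-in features is shown below; the project's actual detection methods are described on its page.

```python
# Illustrative sketch (not the project's method): flag candidate backdoor
# samples as those whose features lie far from their class centroid.
import numpy as np


def outlier_scores(features: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Distance of each sample's feature vector to its class centroid."""
    scores = np.zeros(len(features))
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = features[idx].mean(axis=0)
        scores[idx] = np.linalg.norm(features[idx] - centroid, axis=1)
    return scores


def filter_dataset(features: np.ndarray, labels: np.ndarray, quantile: float = 0.98) -> np.ndarray:
    """Keep samples below the chosen score quantile; drop the rest."""
    scores = outlier_scores(features, labels)
    return scores <= np.quantile(scores, quantile)


# Usage with random stand-in features (a real pipeline would use features
# from a pretrained encoder over the large-scale dataset being screened).
rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 64))
labels = rng.integers(0, 10, size=1000)
keep_mask = filter_dataset(feats, labels)
print(f"kept {keep_mask.sum()} / {len(keep_mask)} samples")
```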
Multimodal Jailbreak Attack
Exploring text-image dual optimization to craft more powerful white-box jailbreak attacks against vision-language models.
Benchmarks
VisionSafety Bench
An Adversarial Evaluation Platform for Vision Models
Our open-source platform provides datasets, algorithms, and tools for scalable adversarial evaluation of vision models. Now available for community use; we welcome your feedback and contributions!
vision · adversarial · million-scale
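For readers unfamiliar with adversarial evaluation, the sketch below shows the basic loop such a platform automates at scale: perturb a batch of inputs with an attack and compare clean versus robust accuracy. The PGD attack, model, and dummy batch here are illustrative assumptions, not the platform's interface.

```python
# Illustrative sketch (not the platform's API): measure robust accuracy of a
# vision model under a simple L-inf PGD attack, the kind of evaluation an
# adversarial benchmark runs at scale over curated datasets.
import torch
import torchvision


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iteratively perturb inputs within an eps-ball to maximise the loss."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv.detach()


@torch.no_grad()
def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()


# Usage: an untrained ResNet-18 and random batch stand in for the model and
# data under test; a real run would load pretrained weights and iterate over
# a benchmark dataset.
model = torchvision.models.resnet18(weights=None).eval()
x = torch.rand(4, 3, 224, 224)
y = torch.randint(0, 1000, (4,))
x_adv = pgd_attack(model, x, y)
print("clean acc:", accuracy(model, x, y), "robust acc:", accuracy(model, x_adv, y))
```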
RewardModel Bench
A Reward Model Benchmark for LLM Alignment Evaluation
A reward model benchmark for evaluating the effectiveness of alignment in large language models, covering 49 real-world scenarios and supporting both pairwise and Best-of-N (BoN) evaluation.
LLM · reward model · alignment
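A minimal sketch of the two evaluation modes mentioned above, assuming a generic reward model interface `reward(prompt, response) -> float`; the benchmark's actual harness and data format may differ.

```python
# Illustrative sketch (not the benchmark's code): pairwise accuracy (the
# reward model should rank the chosen response above the rejected one) and
# Best-of-N accuracy (it should rank the reference best candidate highest).
from typing import Callable, Sequence

RewardFn = Callable[[str, str], float]


def pairwise_accuracy(rm: RewardFn, pairs: Sequence[tuple[str, str, str]]) -> float:
    """Fraction of (prompt, chosen, rejected) triples scored correctly."""
    correct = sum(rm(p, chosen) > rm(p, rejected) for p, chosen, rejected in pairs)
    return correct / len(pairs)


def best_of_n_accuracy(rm: RewardFn, items: Sequence[tuple[str, list[str], int]]) -> float:
    """Fraction of (prompt, candidates, index_of_best) items where the
    reward model assigns its top score to the reference best candidate."""
    correct = 0
    for prompt, candidates, best_idx in items:
        scores = [rm(prompt, c) for c in candidates]
        correct += max(range(len(scores)), key=scores.__getitem__) == best_idx
    return correct / len(items)


# Usage with a toy reward model that simply prefers longer responses.
toy_rm: RewardFn = lambda prompt, response: float(len(response))
print(pairwise_accuracy(toy_rm, [("q", "a detailed answer", "a"), ("q", "ok", "longer but rejected")]))
print(best_of_n_accuracy(toy_rm, [("q", ["short", "a much longer reply", "mid"], 1)]))
```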
VLBreakBench
A Multimodal Jailbreak Benchmark for Vision-Language Models
VLBreakBench evaluates VLMs through two tiers: a base set (one jailbreak pair per query) and a challenge set (three pairs per query). Together they cover 12 safety topics and 46 subcategories (916 harmful queries), for a total of 3,654 jailbreak samples.
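To make the tiering concrete, here is a hedged sketch of how such two-tier jailbreak samples could be organized and scored. The field names, judge, and model callables are assumptions for illustration, not VLBreakBench's actual schema or harness.

```python
# Illustrative sketch (field names are assumptions, not VLBreakBench's schema):
# how a two-tier multimodal jailbreak benchmark could be structured and scored.
from dataclasses import dataclass
from typing import Callable


@dataclass
class JailbreakSample:
    topic: str       # one of the safety topics, e.g. "privacy"
    query: str       # underlying harmful query
    image_path: str  # image half of the jailbreak pair
    prompt: str      # text half of the jailbreak pair
    tier: str        # "base" (one pair per query) or "challenge" (three pairs)


def attack_success_rate(samples: list[JailbreakSample],
                        is_harmful: Callable[[str], bool],
                        vlm_answer: Callable[[str, str], str]) -> float:
    """Share of samples whose VLM answer is judged harmful (jailbreak
    succeeded); lower means the model is safer."""
    hits = sum(is_harmful(vlm_answer(s.image_path, s.prompt)) for s in samples)
    return hits / len(samples)


# Usage: evaluate base and challenge tiers separately with stand-in callables.
samples = [
    JailbreakSample("privacy", "…", "img_0001.png", "…", "base"),
    JailbreakSample("privacy", "…", "img_0002.png", "…", "challenge"),
]
always_refuses = lambda image, prompt: "I can't help with that."
judge = lambda answer: "can't" not in answer.lower()
for tier in ("base", "challenge"):
    tier_samples = [s for s in samples if s.tier == tier]
    print(tier, attack_success_rate(tier_samples, judge, always_refuses))
```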