Developing Multimodal Visual AI to Comprehensively Enhance Visual Perception
This multimodal visual AI covers multiple visual capabilities, including object detection, keypoint detection, and image captioning, and supports text prompts, visual prompts, and prompt-free operation. Its end-to-end API services make visual perception smarter and more efficient.