Chinese-CLIP — Released!
The M6 Team from DAMO Academy proposed Chinese-CLIP, a contrastive vision-language pretrained representation model for Chinese. The model is trained on a large-scale Chinese image-text pair dataset (~200M pairs), and we hope it helps users achieve easy-to-use image representation generation, cross-modal retrieval (such as the MUGE retrieval task), and zero-shot image classification on Chinese data. Multiple model scales are provided. Recently, Chinese-CLIP was also integrated into DAMO Academy's ModelScope platform, as well as the Huggingface transformers 🤗 codebase. For more detailed information, you can play with Chinese-CLIP via:
Paper: https://arxiv.org/abs/2211.01335
Github: https://github.com/OFA-Sys/Chinese-CLIP (welcome star! 🔥🔥)
ModelScope: https://www.modelscope.cn/models?name=clip&page=1&tasks=image-text-retrieval
Retrieval Demo: https://www.modelscope.cn/studios/damo/chinese_clip_applications/summary
Zero-shot Classification Demo: https://huggingface.co/spaces/OFA-Sys/chinese-clip-zero-shot-image-classification
Enjoy and stay tuned!
The M6 Team from the Intelligent Computing Lab of DAMO Academy recently released Chinese-CLIP, a Chinese pretrained vision-language representation model (Paper: https://arxiv.org/abs/2211.01335 ; Github: https://github.com/OFA-Sys/Chinese-CLIP ). The project is a Chinese version of OpenAI's CLIP model, pretrained on about 200 million Chinese image-text pairs, with multiple model scales open-sourced. The codebase currently supports quick-start image/text feature extraction, image-text retrieval (such as the MUGE retrieval task), and zero-shot image classification, all in just a few lines of code. Recently, Chinese-CLIP was also integrated into DAMO Academy's ModelScope platform and the Huggingface transformers 🤗 codebase. We hope you give it a try in the competition & star us on Github!
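The CLIP-style zero-shot classification mentioned above can be sketched as follows. This is a minimal illustration of the scoring step only, not Chinese-CLIP's actual API: it assumes you already have an image embedding and one text embedding per candidate label (e.g. from the model's encoders), and the embedding values and labels below are made up for demonstration.

```python
import numpy as np

def zero_shot_scores(image_emb, text_embs):
    """CLIP-style zero-shot scoring: L2-normalize the embeddings,
    take dot products as similarity logits, then softmax over labels."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = text_embs @ image_emb          # cosine similarities
    exp = np.exp(logits - logits.max())     # numerically stable softmax
    return exp / exp.sum()

# Toy embeddings standing in for real encoder outputs (hypothetical values).
labels = ["猫", "狗", "汽车"]
image_emb = np.array([0.9, 0.1, 0.0])
text_embs = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
probs = zero_shot_scores(image_emb, text_embs)
print(labels[int(probs.argmax())])  # → 猫
```

The predicted label is simply the one whose text embedding is most similar to the image embedding; in the real model the embeddings come from the pretrained image and text towers.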
March 07, 2022 — OFA Released!
The M6 Team from the Intelligent Computing Lab, DAMO Academy proposed OFA (One For All), a unified framework for multimodal pretraining. OFA unifies architectures, tasks, and modalities within a single seq2seq framework, and achieves SOTA performance on a number of multimodal downstream tasks. For more detailed information, you can play with OFA via:
Paper: https://arxiv.org/abs/2202.03052
Github: https://github.com/OFA-Sys/OFA
Demo: https://huggingface.co/ofa-sys
Colab: https://github.com/OFA-Sys/OFA/blob/main/colab.md
Enjoy and stay tuned!
The M6 Team from the Intelligent Computing Lab of DAMO Academy presents its latest work, OFA (One For All), a unified multimodal pretraining model. The paper is publicly available on arxiv and the code is open-sourced on Github (Paper: https://arxiv.org/abs/2202.03052 ; Github: https://github.com/OFA-Sys/OFA ). OFA unifies multimodal and unimodal understanding and generation tasks in a seq2seq generative framework, and has already achieved SOTA performance on multiple tasks. Online demos (https://huggingface.co/ofa-sys) and Colab notebooks (https://github.com/OFA-Sys/OFA/blob/main/colab.md) are also provided. Stay tuned!
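The "one seq2seq interface for all tasks" idea can be illustrated schematically. This is not OFA's actual API: the `seq2seq_generate` stub and its outputs below are hypothetical, showing only how different tasks reduce to the same (instruction, inputs) → text signature.

```python
# Schematic of a unified seq2seq multimodal interface: captioning, VQA,
# and text classification are all phrased as a natural-language instruction
# and answered as generated text. The model here is a stub, not OFA itself.

def seq2seq_generate(instruction: str, image=None) -> str:
    """Stand-in for a seq2seq model's generation call (hypothetical)."""
    # A real model would encode the instruction (and image) and decode text.
    return f"<generated answer for: {instruction!r}>"

# Three different tasks, one interface:
caption = seq2seq_generate("what does the image describe?", image="img.jpg")
answer  = seq2seq_generate("how many dogs are in the picture?", image="img.jpg")
label   = seq2seq_generate("is the review positive or negative?")

for out in (caption, answer, label):
    print(out)
```

Because every task shares this single text-generation interface, one set of weights can serve all of them without task-specific heads.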
January 14, 2022 — Text-to-Image Generation Task Baseline Released!
The baseline for the Text-to-Image Generation task has been released at https://github.com/MUGE-2021/image-generation-baseline .
January 11, 2022 — Multimodal Retrieval Task Baseline Released!
The baseline for the Multimodal Retrieval task has been released at https://github.com/MUGE-2021/image-retrieval-baseline .
December 21, 2021 — MUGE is recommended by CCF-CV (the Technical Committee on Computer Vision of the China Computer Federation).
September 16, 2021 — Image Caption Task Baseline Released!
The baseline for the Image Caption task has been released at https://github.com/MUGE-2021/image-caption-baseline .
August 16, 2021 — Multimodal Retrieval Dataset Added!
A new dataset for multimodal retrieval has been added to MUGE:
August 1, 2021 — Initial Release!
The initial benchmark contains the following datasets: