BAAI:BGE-M3
The BGE-M3 model is a multi-functional, multi-lingual, and multi-granularity text embedding framework that integrates three retrieval methods: dense retrieval, multi-vector retrieval, and sparse retrieval. Capable of processing over 100 languages and inputs ranging from short sentences to lengthy documents (up to 8,192 tokens), it delivers state-of-the-art performance in multilingual and cross-lingual tasks, topping benchmarks like MIRACL and MKQA. Additionally, BGE-M3 excels in long-document retrieval, achieving strong results on datasets such as MLDR and NarrativeQA, solidifying its versatility across diverse text granularities and applications.
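The three retrieval methods BGE-M3 unifies can be illustrated with a small self-contained sketch. This is not the model's actual API: the vectors, token weights, and fusion weights below are toy values chosen for illustration, standing in for what the model would produce (a pooled dense embedding, per-token sparse weights, and per-token multi-vector embeddings).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dense_score(q_vec, d_vec):
    # Dense retrieval: cosine similarity between single pooled embeddings.
    return dot(q_vec, d_vec) / (math.sqrt(dot(q_vec, q_vec)) * math.sqrt(dot(d_vec, d_vec)))

def sparse_score(q_weights, d_weights):
    # Sparse (lexical) retrieval: sum of weight products over tokens
    # that appear in both the query and the document.
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)

def colbert_score(q_vecs, d_vecs):
    # Multi-vector (late-interaction) retrieval: each query token vector
    # takes its max similarity over document token vectors; average the maxima.
    return sum(max(dot(qv, dv) for dv in d_vecs) for qv in q_vecs) / len(q_vecs)

def hybrid_score(q, d, w=(0.4, 0.2, 0.4)):
    # Weighted fusion of the three signals; the weights here are illustrative,
    # not the ones BGE-M3 ships with.
    return (w[0] * dense_score(q["dense"], d["dense"])
            + w[1] * sparse_score(q["sparse"], d["sparse"])
            + w[2] * colbert_score(q["multi"], d["multi"]))

# Toy query and two candidate documents (hypothetical model outputs).
query = {"dense": [1.0, 0.0], "sparse": {"cat": 1.0}, "multi": [[1.0, 0.0]]}
relevant = {"dense": [1.0, 0.0], "sparse": {"cat": 0.8}, "multi": [[1.0, 0.0], [0.0, 1.0]]}
irrelevant = {"dense": [0.0, 1.0], "sparse": {"dog": 0.9}, "multi": [[0.0, 1.0]]}

print(hybrid_score(query, relevant) > hybrid_score(query, irrelevant))  # → True
```

In a real deployment each signal would come from one `encode` call on the model, and the fusion step is what lets a single index serve lexical, semantic, and fine-grained matching at once.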
baai/bge-reranker-v2-m3
baai/bge-reranker-v2-m3 is a lightweight multilingual re-ranking model built on bge-m3. It combines strong multilingual capability with easy deployment and fast inference. The model takes a query and a document as input and directly outputs a similarity score rather than an embedding vector, which makes it well suited to multilingual scenarios, with particularly strong results on Chinese and English text.
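The reranking workflow this describes can be sketched as follows. The `overlap_score` function below is a deliberately crude placeholder for the model: the real reranker scores each (query, document) pair jointly with a cross-encoder, whereas this toy version just measures term overlap so the example runs without model weights.

```python
def overlap_score(query, doc):
    # Toy stand-in for the reranker: fraction of query terms found in the
    # document. The real model would score the pair with a transformer.
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

def rerank(query, docs, score_fn=overlap_score):
    # Score every (query, doc) pair and return documents best-first.
    # In production, score_fn would wrap a call to bge-reranker-v2-m3.
    return sorted(docs, key=lambda d: score_fn(query, d), reverse=True)

docs = [
    "Berlin is the capital of Germany",
    "Paris is the capital of France",
]
print(rerank("capital of France", docs)[0])  # → "Paris is the capital of France"
```

This pair-scoring design is why rerankers are typically applied only to a short candidate list retrieved by an embedding model: every document requires a full forward pass with the query, which is more accurate but far more expensive than comparing precomputed vectors.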