
From Embeddings to ANN: Practical Performance on MS MARCO Web Search
1 Jul 2025
Dive into the practical evaluation of embedding models and ANN algorithms on MS MARCO Web Search, revealing insights into real-world search system behavior.

Measuring Search Excellence: Result Quality and System Performance
1 Jul 2025
Explore the robust evaluation framework for MS MARCO Web Search baselines, covering both result quality and system performance under resource constraints.

Establishing Baselines: MS MARCO Web Search's Foundational Methods
1 Jul 2025
Explore the cutting-edge embedding models and disk-based ANN algorithms selected as initial baselines for the new MS MARCO search benchmark.

MS MARCO Web Search: Unveiling Initial Benchmark Results
1 Jul 2025
Explore the foundational benchmark results on the MS MARCO Web Search 100M dataset, featuring state-of-the-art embedding models

Unlocking Web Search AI: MS MARCO's Three Grand Challenges
1 Jul 2025
Discover how MS MARCO Web Search sparks new research, posing formidable challenges in large-scale embedding model generalization

Deep Dive into MS MARCO Web Search: Unpacking Dataset Characteristics
29 Jun 2025
Explore a comprehensive analysis of the MS MARCO Web Search dataset, detailing its multilingual distribution and significant data skew

Crafting Real-World Queries: MS MARCO Web Search's Authentic Data
29 Jun 2025
Discover how MS MARCO Web Search meticulously selects and labels millions of real queries from Bing search logs

Introducing MS MARCO Web Search: A New Era for LLM and IR Data
28 Jun 2025
Witness the arrival of MS MARCO Web Search, the first large-scale, authentic, and information-rich web dataset with millions of clicked query-document labels

Why New Datasets are Needed for Deep Learning-Enhanced IR
28 Jun 2025
This section critiques existing information retrieval benchmarks, noting their lack of web-scale data and highly skewed multilingual queries