From Embeddings to ANN: Practical Performance on MS MARCO Web Search

1 Jul 2025

Dive into the practical evaluation of embedding models and ANN algorithms on MS MARCO Web Search, revealing insights into real-world search system behavior.

Measuring Search Excellence: Result Quality and System Performance

1 Jul 2025

Explore the robust evaluation framework for MS MARCO Web Search baselines, covering both result quality and system performance under resource constraints.

Establishing Baselines: MS MARCO Web Search's Foundational Methods

1 Jul 2025

Explore the cutting-edge embedding models and disk-based ANN algorithms selected as initial baselines for the new MS MARCO search benchmark.

MS MARCO Web Search: Unveiling Initial Benchmark Results

1 Jul 2025

Explore the foundational benchmark results on the MS MARCO Web Search 100M dataset, featuring state-of-the-art embedding models

Unlocking Web Search AI: MS MARCO's Three Grand Challenges

1 Jul 2025

Discover how MS MARCO Web Search sparks new research, posing formidable challenges in large-scale embedding model generalization

Deep Dive into MS MARCO Web Search: Unpacking Dataset Characteristics

29 Jun 2025

Explore a comprehensive analysis of the MS MARCO Web Search dataset, detailing its multilingual distribution and significant data skew

Crafting Real-World Queries: MS MARCO Web Search's Authentic Data

29 Jun 2025

Discover how MS MARCO Web Search meticulously selects and labels millions of real queries from Bing search logs

Introducing MS MARCO Web Search: A New Era for LLM and IR Data

28 Jun 2025

Witness the arrival of MS MARCO Web Search, the first colossal, authentic, and information-rich web dataset with millions of clicked query-document labels

Why New Datasets are Needed for Deep Learning-Enhanced IR

28 Jun 2025

This section critiques existing information retrieval benchmarks, noting their lack of web-scale data and highly skewed multilingual queries