Storage-Compute Disaggregation for Vector Databases with Adaptive Prefetching and Remote Memory Throttling

Authors

  • Sanjeewa Wickramathilaka, Lanka Institute of Information Sciences, Computer Science Department, Pelawatta Road, Anuradhapura 50000, Sri Lanka
  • Pubudu Rathnayake, Southern School of Computing, Computer Science Department, Weliwatta Road, Matara 81000, Sri Lanka

Abstract

Vector databases increasingly serve latency-sensitive similarity search workloads whose working sets exceed the DRAM capacity of any single machine. Storage-compute disaggregation, in which stateless query executors access remote index and vector payloads over high-speed fabrics, offers elasticity and operational simplicity but introduces a new bottleneck: remote memory traffic can dominate tail latency and exacerbate congestion collapse under fan-out-heavy approximate nearest neighbor search. This paper studies disaggregated vector database execution with a focus on two coupled mechanisms: adaptive prefetching, which anticipates future remote reads during graph- and centroid-based search, and remote memory throttling, which shapes demand to preserve predictable service under contention. We develop a cost model that distinguishes control-path metadata fetches from data-path vector payload fetches, incorporates caching and compression, and captures the sensitivity of recall to partial execution under budgets. Building on this model, we propose a state-conditioned prefetcher that learns a probability of use for candidate remote blocks from the evolving search frontier, and a throttling controller that enforces per-tenant and per-query rate constraints via a multi-objective Lagrangian balancing latency, bandwidth, and energy. We analyze complexity, provide error bounds for approximate execution and sketch-based pruning, and show that several prefetch scheduling problems are NP-hard, which motivates practical heuristics with bounded regret under stationarity assumptions. Finally, we describe implementation details for RDMA-like transports, storage internals for compressed vector pages, and a reproducible evaluation methodology that emphasizes tail metrics and interference patterns.

Published

2020-09-04