Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions libdd-trace-utils/src/span/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
pub mod trace_utils;
pub mod v04;
pub mod v05;
pub mod vec_map;

use crate::msgpack_decoder::decode::buffer::read_string_ref_nomut;
use crate::msgpack_decoder::decode::error::DecodeError;
Expand Down
260 changes: 260 additions & 0 deletions libdd-trace-utils/src/span/vec_map.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,260 @@
// Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github doesn't support thread comments so putting my thoughts here to keep it clear.

Overall this seems like it could be very beneficial to perf, have you considered:

  1. Using a dependency that does it (ex: https://github.com/p-avital/vec-map-rs)
  2. Storing keys and values in separate Vecs which as explained in vec-map-rs's readme could make the key lookup even faster by packing them close together.
  3. Adding micro benchmarks to compare HashMap/BTreeMap/VecMap(/vec-map-rs) operations on different number of keys, size of keys and size of values ?
  4. Having a higher level hybrid struct that underneath uses a VecMap until a certain threshold in the number of keys is reached then switches to HashMap ? (might be overkill)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could make the key lookup even faster by packing them close together.

Attribute lists are usually pretty small (less than 10s of elements), and we need to chase one pointer when iterating anyway to read the string, and we would need to do two allocations instead of one so I have doubts on the perf gain.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are all very good points:

  1. There's always a balance when pulling in a new dependency. For libdatadog, the cons are size explosion, and supply chain attack, most notably. Since the code is really simple (it's mostly boilerplate to be honest), I felt like having a bespoke datastructure here was preferable. But I think both approaches are reasonable. Another thing with having our own is that we can customize it to our needs, typically the tombstone-like removal (well it could probably easily be implemented as a wrapper of vec-map-rs, so maybe not a great example). Meta and metrics are very special cases of maps, and we might be able to take advantage of that even further?
  2. As @paullegranddc mentioned, I don't think it's worth the complexity or even that it would be a win. We do not care that much for get already. But to keep in mind for potential evolutions?
  3. We plan to bench this in dd-trace-rs and other real usage. For now this code is dead, so it doesn't impact anything that exists already. But maybe having a bunch of micro benches doesn't hurt, and can be easily generated by an LLM. I'll look into it!
  4. Without thinking too much into it, I think this might be overkill for now. Simplicity, beside being beneficial for maintenance, usually wins in terms of performance in a surprisingly vast number of use-cases: less branching, no pointer chasing and cache misses, etc. We should bench real tracers and only think about a more complex scheme if it turns out there are largish maps that show unacceptable perf degradation.

// SPDX-License-Identifier: Apache-2.0

//! This module defines a associative map datastructure for spans data (meta, metrics, etc.) backed
//! by a vector. Spans are mostly allocated and constructed, and more rarely read or mutated.
//! [VecMap] is thus optimized for insertion (which is just `Vec::push`), without any hashing
//! involved. Fetching and removing a value is, on the other hand, linear time in the size of the
//! map.

use serde::ser::{Serialize, SerializeMap, Serializer};
use std::borrow::Borrow;
use std::collections::HashSet;
use std::hash::Hash;

/// A Vec-backed map that provides HashMap-like lookup by key.
///
/// # Duplicates
///
/// Duplicates are tolerated: [VecMap::insert] always appends, and [VecMap::get]/[VecMap::get_mut]
/// return the *last* matching entry so that later writes shadow earlier ones. This optimizes for
/// fast insert and construction (that might happen on the client's application hot path), avoiding
/// a linear scan on each insert (or a potential costly full re-hashing with a hashmap).
/// Additionally, while overriding a metric or a meta definitively happens, it's assumed to be rare
/// enough so such that the size penalty of duplication is expected to be reasonable.
///
/// **Important**: note that only [VecMap::get] and [VecMap::get_mut] are duplicate-aware, so to
/// speak. [Vec::len], [Vec::iter], and others just delegates to the underlying `Vec`, and won't
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// speak. [Vec::len], [Vec::iter], and others just delegates to the underlying `Vec`, and won't
/// speak. [VecMap::len], [VecMap::iter], and others just delegates to the underlying `Vec`, and won't

/// deduplicate.
///
/// Explicit deduplication is currently being done automatically and on-the-fly during
/// serialization. If needed, in the future, we might trigger deduplication on other events, for
/// example at insertion if the size is bigger than a threshold.
#[derive(Clone, Debug, PartialEq, Default)]
pub struct VecMap<K, V>(Vec<(K, V)>);

impl<K, V> VecMap<K, V> {
#[must_use]
#[inline]
pub fn new() -> Self {
VecMap(Vec::new())
}

#[must_use]
#[inline]
pub fn with_capacity(capacity: usize) -> Self {
VecMap(Vec::with_capacity(capacity))
}

#[inline]
pub fn insert(&mut self, key: K, value: V) {
self.0.push((key, value));
}

#[inline]
pub fn get<Q>(&self, key: &Q) -> Option<&V>
where
K: Borrow<Q>,
Q: ?Sized + PartialEq,
{
self.0
.iter()
.rev()
.find(|(k, _)| k.borrow() == key)
.map(|(_, v)| v)
}

#[inline]
pub fn get_mut<Q>(&mut self, key: &Q) -> Option<&mut V>
where
K: Borrow<Q>,
Q: ?Sized + PartialEq,
{
self.0
.iter_mut()
.rev()
.find(|(k, _)| (*k).borrow() == key)
.map(|(_, v)| v)
}

#[inline]
pub fn contains_key<Q>(&self, key: &Q) -> bool
where
K: Borrow<Q>,
Q: ?Sized + PartialEq,
{
self.0.iter().any(|(k, _)| k.borrow() == key)
}

/// Remove all entries matching this key from the map. This method uses [Vec::retain], and is
/// thus potentially costly (like any removal in a vector-like datastructure).
// Note: we might implement a tombstone or option-based deletion later, if removal is a bit too
// costly.
#[inline]
pub fn remove_slow<Q>(&mut self, key: &Q)
where
K: Borrow<Q>,
Q: ?Sized + PartialEq,
{
self.0.retain(|(k, _)| k.borrow() != key);
}

/// Iterate over the element, including duplicate entries.
#[inline]
pub fn iter(&self) -> std::slice::Iter<'_, (K, V)> {
self.0.iter()
}

/// Iterate mutably over the elements, including duplicate entries.
#[inline]
pub fn iter_mut(&mut self) -> std::slice::IterMut<'_, (K, V)> {
self.0.iter_mut()
}

/// Return the length of the underlying vector, thus including duplicate entries.
#[inline]
pub fn len(&self) -> usize {
self.0.len()
}

#[inline]
pub fn is_empty(&self) -> bool {
self.0.is_empty()
}
}

impl<K, V> From<Vec<(K, V)>> for VecMap<K, V> {
fn from(vec: Vec<(K, V)>) -> Self {
VecMap(vec)
}
}

impl<K, V> FromIterator<(K, V)> for VecMap<K, V> {
fn from_iter<I: IntoIterator<Item = (K, V)>>(iter: I) -> Self {
VecMap(iter.into_iter().collect())
}
}

impl<K, V> IntoIterator for VecMap<K, V> {
type Item = (K, V);
type IntoIter = std::vec::IntoIter<(K, V)>;

fn into_iter(self) -> Self::IntoIter {
self.0.into_iter()
}
}

impl<'a, K, V> IntoIterator for &'a VecMap<K, V> {
type Item = &'a (K, V);
type IntoIter = std::slice::Iter<'a, (K, V)>;

fn into_iter(self) -> Self::IntoIter {
self.0.iter()
}
}

impl<'a, K, V> IntoIterator for &'a mut VecMap<K, V> {
type Item = &'a mut (K, V);
type IntoIter = std::slice::IterMut<'a, (K, V)>;

fn into_iter(self) -> Self::IntoIter {
self.0.iter_mut()
}
}

impl<K, V> Extend<(K, V)> for VecMap<K, V> {
fn extend<I: IntoIterator<Item = (K, V)>>(&mut self, iter: I) {
self.0.extend(iter);
}
}

impl<K: Serialize + Eq + Hash, V: Serialize> Serialize for VecMap<K, V> {
fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
let mut map = serializer.serialize_map(None)?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not great as rpm_serde will allocate an intermediary buffer and copy it to the main buffer once it contains the length https://docs.rs/rmp-serde/latest/src/rmp_serde/encode.rs.html#474-494

I think we should deduplicate before serialization/rely on the agent semantic that last duplicate key wins

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a point where I hesitated. At some point I had a DedupedVecMap wrapper, and only implemented serialize on it. However, it's surprisingly annoying to avoid any additional allocation when doing it in-place. You either need to clone the keys (maybe that's ok?), because all the nice iterators or retain helpers take a closure that holds key references that don't survive long enough.

One possibility is to clone keys to put them in the set. I think we could technically avoid that but that would mean re-implementing retain manually, basically... Probably clone is fine.

let mut seen = HashSet::with_capacity(self.len());
for (k, v) in self.0.iter().rev() {
if seen.insert(k) {
map.serialize_entry(k, v)?;
}
}
Comment on lines +173 to +179
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about doing this then? It allocates a bit more for because we use a hashmap with the index instead of a set but does the same amount of work

Suggested change
let mut map = serializer.serialize_map(None)?;
let mut seen = HashSet::with_capacity(self.len());
for (k, v) in self.0.iter().rev() {
if seen.insert(k) {
map.serialize_entry(k, v)?;
}
}
let mut seen = HashMap::with_capacity(self.len());
for (i, (k, _)) in self.0.iter().rev().enumerate() {
seen.entry(k.as_ref()).or_insert(i);
}
let mut map = serializer.serialize_map(Some(seen.len()))?;
for (k, v) in seen.values().map(|i| self.0.get(i)) {
map.serialize_entry(k, v)?;
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, moving everything to a hashmap is another possibility, but as you said it does allocate a bit more. If we do this I wonder if we shouldn't bite the bullet and just build a hashmap out of the vec and iterate over it. If the value are 64 bits, this get rid of fetching back stuff from the vec, and waste no more. If values can be locally large though, maybe not.

map.end()
}
Comment on lines +172 to +181
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will make everything reverse order, not just the duplicates. If serialized we don't care about ordering this is fine. Otherwise a map with upsert without reversing would achieve the same effect without reversing outputs for non-dedup entries but is slightly slower.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also building this first would give us the dedup size and so we wouldn't need to have the serializer.serialize_map(None).

Am I missing something ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think reverse order is fine. I should make that explicit, but VecMap doesn't provide any guarantees about the iteration order of stuff with different keys, as most maps (HashMap in particular, which has a random iteration order by default).

The problem of building the deduped map first is that it causes an additional intermediate allocation. Or we could just count in a first loop, but then we'd have to clear the HashSet and to start again hash everything twice. None of those are really a no-go, but I could avoid them this way.

}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn get_returns_last_inserted() {
let mut m = VecMap::new();
m.insert("a", 1);
m.insert("a", 2);
assert_eq!(m.get("a"), Some(&2));
}

#[test]
fn get_mut_returns_last_inserted() {
let mut m = VecMap::new();
m.insert("a", 1);
m.insert("a", 2);
*m.get_mut("a").unwrap() = 42;
assert_eq!(m.get("a"), Some(&42));
// First entry unchanged
assert_eq!(m.iter().next().unwrap().1, 1);
}

#[test]
fn remove_removes_all_occurrences() {
let mut m = VecMap::new();
m.insert("a", 1);
m.insert("b", 2);
m.insert("a", 3);
m.remove_slow("a");
assert_eq!(m.get("a"), None);
assert!(!m.contains_key("a"));
assert_eq!(m.len(), 1);
}

#[test]
fn contains_key_works() {
let mut m = VecMap::new();
assert!(!m.contains_key("x"));
m.insert("x", 10);
assert!(m.contains_key("x"));
}

#[test]
fn from_iterator() {
let m: VecMap<&str, i32> = vec![("a", 1), ("b", 2)].into_iter().collect();
assert_eq!(m.len(), 2);
assert_eq!(m.get("b"), Some(&2));
}

#[test]
fn into_iter_consuming() {
let mut m = VecMap::new();
m.insert("a", 1);
m.insert("b", 2);
let pairs: Vec<_> = m.into_iter().collect();
assert_eq!(pairs, vec![("a", 1), ("b", 2)]);
}

#[test]
fn serialize_deduplicates_keeping_last() {
let mut m = VecMap::new();
m.insert("a", 0);
m.insert("b", 0);
m.insert("b", 1);
m.insert("a", 1);
m.insert("a", 3);
m.insert("b", 2);

let serialized: serde_json::Value = serde_json::to_value(&m).unwrap();
let obj = serialized.as_object().unwrap();

assert_eq!(obj.len(), 2);
assert_eq!(obj.get("a").unwrap(), 3);
assert_eq!(obj.get("b").unwrap(), 2);
}
}
Loading