The Hoard Memory Allocator

The Hoard memory allocator is a fast, scalable, and memory-efficient memory allocator that works on a range of platforms, including Linux, Mac OS X, and Windows.

Hoard is a drop-in replacement for malloc that can dramatically improve application performance, especially for multithreaded programs running on multiprocessors and multicore CPUs. No source code changes necessary: just link it in or set one environment variable (see Building Hoard, below).

Press

"If you'll be running on multiprocessor machines, ... use Emery Berger's excellent Hoard multiprocessor memory management code. It's a drop-in replacement for the C and C++ memory routines and is very fast on multiprocessor machines."
- Debugging Applications for Microsoft .NET and Microsoft Windows, Microsoft Press
"(To improve scalability), consider an open source alternative such as the Hoard Memory Manager..."
- Windows System Programming, Addison-Wesley
"Hoard dramatically improves program performance through its more efficient use of memory. Moreover, Hoard has provably bounded memory blowup and low synchronization costs."
- Principles of Parallel Programming, Addison-Wesley

Users

Companies using Hoard in their products and servers include AOL, British Telecom, Blue Vector, Business Objects (formerly Crystal Decisions), Cisco, Credit Suisse, Entrust, InfoVista, Kamakura, Novell, Oktal SE, OpenText, OpenWave Systems (for their Typhoon and Twister servers), Pervasive Software, Plath GmbH, Quest Software, Reuters, Royal Bank of Canada, SAP, Sonus Networks, Tata Communications, and Verite Group.

Open source projects using Hoard include the Asterisk Open Source Telephony Project, Bayonne GNU telephony server, the Cilk parallel programming language, the GNU Common C++ system, the OpenFOAM computational fluid dynamics toolkit, and the SafeSquid web proxy.

Hoard is now a standard compiler option for the Standard Performance Evaluation Corporation's CPU2006 benchmark suite for the Intel and Open64 compilers.

Licensing

Hoard is now released under the widely used and permissive Apache License, version 2.0.

Why Hoard?

There are a number of problems with existing memory allocators that make Hoard a better choice.

Contention

Multithreaded programs often fail to scale because the heap is a bottleneck. When multiple threads allocate or deallocate memory at the same time, a conventional allocator serializes them, so programs that use the allocator intensively can actually slow down as the number of processors increases. Your program may be allocation-intensive without your realizing it, for example if it makes many calls to the C++ Standard Template Library (STL). Hoard eliminates this bottleneck.
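
One rough way to check whether the allocator limits scaling is to time the same allocation loop on one thread and then on several. The sketch below is an illustration only and is not one of Hoard's benchmarks; `alloc_free_loop` and `run_threads` are hypothetical names, and the 64-byte block size is an arbitrary assumption.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical helper: performs `iters` malloc/free pairs of `size` bytes
// and returns how many pairs completed.
std::size_t alloc_free_loop(std::size_t iters, std::size_t size) {
    std::size_t done = 0;
    for (std::size_t i = 0; i < iters; ++i) {
        void* p = std::malloc(size);
        if (p == nullptr) break;
        // Touch the block so the allocation cannot be optimized away.
        *static_cast<volatile char*>(p) = 1;
        std::free(p);
        ++done;
    }
    return done;
}

// Runs the loop on `nthreads` threads at once. Returns the total number of
// pairs completed and stores the elapsed wall-clock time in `seconds`.
std::size_t run_threads(unsigned nthreads, std::size_t iters, double& seconds) {
    std::atomic<std::size_t> total{0};
    std::vector<std::thread> workers;
    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < nthreads; ++t) {
        workers.emplace_back([&total, iters] {
            total += alloc_free_loop(iters, 64);  // 64 bytes: arbitrary size
        });
    }
    for (auto& w : workers) w.join();
    const auto stop = std::chrono::steady_clock::now();
    seconds = std::chrono::duration<double>(stop - start).count();
    return total.load();
}
```

Calling `run_threads` with 1 thread and then with N threads, using the same `iters`, gives a crude scaling picture: if the time grows roughly in proportion to N, the threads are probably being serialized. Running the same binary with and without Hoard preloaded compares allocators without recompiling.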

False Sharing

System-provided memory allocators can cause insidious problems for multithreaded code. They can lead to a phenomenon known as "false sharing": threads on different CPUs can end up with memory in the same cache line, or chunk of memory. Accessing these falsely-shared cache lines is hundreds of times slower than accessing unshared cache lines. Hoard is designed to prevent false sharing.
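
To make "same cache line" concrete, the following sketch tests whether two addresses fall in one line and then applies that test to two small allocations. It assumes a 64-byte cache line, which is common but not universal, and the function names are hypothetical. It only inspects where the allocator placed the blocks; it does not measure the slowdown itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed cache-line size in bytes; adjust for the target CPU.
constexpr std::uintptr_t kCacheLine = 64;

// True if the two addresses fall within the same cache line.
bool same_cache_line(const void* a, const void* b) {
    const auto x = reinterpret_cast<std::uintptr_t>(a) / kCacheLine;
    const auto y = reinterpret_cast<std::uintptr_t>(b) / kCacheLine;
    return x == y;
}

// Allocates two small objects back to back and reports whether the
// allocator placed them in one cache line. If each block were then used by
// a different thread, writes to one would invalidate the other's line.
bool small_allocations_share_line(std::size_t size) {
    void* p = std::malloc(size);
    void* q = std::malloc(size);
    const bool shared =
        (p != nullptr) && (q != nullptr) && same_cache_line(p, q);
    std::free(p);
    std::free(q);
    return shared;
}
```

Both allocations here come from a single thread, so a `true` result shows only that neighbouring small blocks can share a line. False sharing arises when such blocks end up in the hands of different threads.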

Blowup

Multithreaded programs can also cause the allocator to blow up memory consumption. This effect can multiply the amount of memory needed to run your application by the number of CPUs on your machine: four CPUs could mean that you need four times as much memory. Hoard is guaranteed (provably!) to bound memory consumption.
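
A pattern commonly associated with blowup is producer-consumer, where one thread allocates memory and a different thread frees it. The sketch below is a hypothetical driver for that pattern, not code from Hoard; for simplicity it starts fresh threads in every round rather than keeping two long-lived threads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical driver: in each round one thread allocates `batch` blocks of
// `size` bytes and a different thread frees them. Returns the number of
// blocks handed from producer to consumer.
std::size_t producer_consumer_rounds(std::size_t rounds, std::size_t batch,
                                     std::size_t size) {
    std::size_t handed_off = 0;
    for (std::size_t r = 0; r < rounds; ++r) {
        std::vector<void*> blocks;
        blocks.reserve(batch);

        // Producer: allocates every block in this round.
        std::thread producer([&] {
            for (std::size_t i = 0; i < batch; ++i) {
                void* p = std::malloc(size);
                if (p != nullptr) blocks.push_back(p);
            }
        });
        producer.join();  // join() orders these writes before the consumer

        // Consumer: a different thread frees what the producer allocated.
        std::thread consumer([&] {
            for (void* p : blocks) std::free(p);
        });
        consumer.join();

        handed_off += blocks.size();
    }
    return handed_off;
}
```

At most `batch` blocks are live at any moment, so the program's real memory requirement is constant. Watching the process's resident memory over many rounds, under different allocators, shows whether freed blocks are being reused or accumulated.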

Installation

Homebrew (Mac OS X)

You can use Homebrew to install the current version of Hoard as follows:

    brew tap emeryberger/hoard
    brew install --HEAD emeryberger/hoard/libhoard

This not only installs the Hoard library, but also creates a hoard command that you can use to run any program with Hoard from the command line:

    hoard myprogram-goes-here

Building Hoard from source (Mac OS X, Linux, and Windows WSL2)

On Linux, you may need to first install the appropriate version of libstdc++-dev (e.g., libstdc++-12-dev):

    sudo apt install libstdc++-dev

Now, to build Hoard from source, do the following:

    git clone https://github.com/emeryberger/Hoard
    cd Hoard
    mkdir build && cd build
    cmake ..
    make

You can then use Hoard by linking it with your executable, or by setting the LD_PRELOAD environment variable, as in

    export LD_PRELOAD=/path/to/libhoard.so

or, on Mac OS X:

    export DYLD_INSERT_LIBRARIES=/path/to/libhoard.dylib
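
When debugging a setup, it can help to have the program report whether one of these variables was set in its environment. The sketch below is a hypothetical diagnostic and not part of Hoard: it only inspects the environment for the substring `libhoard`, so it cannot confirm that the loader actually honored the request.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// True if `value` (the contents of a preload variable) mentions libhoard.
bool mentions_libhoard(const std::string& value) {
    return value.find("libhoard") != std::string::npos;
}

// Reads an environment variable, returning "" if it is unset.
std::string env_or_empty(const char* name) {
    const char* v = std::getenv(name);
    return v != nullptr ? std::string(v) : std::string();
}

// Checks both variables named above: LD_PRELOAD (Linux) and
// DYLD_INSERT_LIBRARIES (Mac OS X).
bool hoard_preload_requested() {
    return mentions_libhoard(env_or_empty("LD_PRELOAD")) ||
           mentions_libhoard(env_or_empty("DYLD_INSERT_LIBRARIES"));
}
```

A program could call `hoard_preload_requested()` at startup and log the result, which makes it easier to spot a misspelled path or a variable that was not exported to the process.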

Building Hoard (Windows)

Hoard uses Microsoft Detours for function interposition on Windows. Detours is automatically downloaded and built by CMake.

    git clone https://github.com/emeryberger/Hoard
    cd Hoard
    mkdir build && cd build
    cmake ..
    cmake --build . --config Release

This produces build\Release\hoard.dll along with the withdll.exe and setdll.exe tools. The Windows build supports the x86, x64, ARM, and ARM64 architectures.

Using Hoard on Windows

Important: Programs must be compiled with /MD (dynamic C runtime) for Hoard to intercept allocations. Programs compiled with /MT (static C runtime) have allocation functions embedded directly in the executable, which Hoard cannot intercept.

With unmodified executables (recommended):

Use withdll.exe (built automatically) to inject Hoard into any program at runtime, similar to LD_PRELOAD on Linux:

    build\Release\withdll.exe /d:build\Release\hoard.dll yourapp.exe [args...]

Permanent modification:

Use setdll.exe (built automatically) to modify an executable's import table:

    # Add Hoard to executable (creates backup as .exe~)
    build\Release\setdll.exe /d:build\Release\hoard.dll yourapp.exe

    # Remove Hoard from executable
    build\Release\setdll.exe /r:hoard.dll yourapp.exe

Linking at build time:

You can also link Hoard directly into your application:

    cl /Ox /MD yourapp.cpp /link hoard.lib

Benchmarks

The directory benchmarks/ contains a number of benchmarks used to evaluate and tune Hoard.

All benchmarks were run on a 192-core, 2-node NUMA system (AMD EPYC). Graphs are normalized to Hoard (1.0 = Hoard, shown as a green line). Values above the line are worse than Hoard.

Summary

Key findings:

Hoard achieves 1.3-1.5x higher throughput than mimalloc, jemalloc, and glibc on server workloads (Larson)
Hoard is 2-5x faster on realloc-heavy workloads (Phong)
Hoard uses less memory than mimalloc and jemalloc at high thread counts
On NUMA systems, Hoard is up to 1.6x faster due to NUMA-aware memory management

Larson (server workload simulation)

Simulates a multithreaded server handling many short-lived allocations with object passing between threads.

Take-home: Hoard achieves 1.3-1.5x higher throughput than all other allocators across all thread counts. This benchmark is representative of real server workloads.

threadtest (malloc/free throughput)

Measures raw allocation/deallocation throughput with minimal work between operations.

Take-home: Hoard is fastest at low-medium thread counts (8-32 threads) and matches mimalloc at 256 threads. Hoard uses significantly less memory than jemalloc at high thread counts.

Phong (realloc-heavy workload)

Tests realloc performance with repeated grow/shrink patterns.

Take-home: Hoard is 2-5x faster than all other allocators at low-medium thread counts (4-64) due to its optimized in-place realloc implementation.
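
To illustrate what a grow/shrink realloc pattern looks like, here is a minimal sketch. It is not the Phong benchmark source; the function name, the sizes, and the fill byte are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical workload: alternately grows a buffer to `hi_size` bytes and
// shrinks it back to `lo_size` bytes with realloc, checking that the first
// `lo_size` bytes survive every resize. Returns false on invalid arguments,
// on allocation failure, or if the contents change.
bool realloc_grow_shrink(std::size_t cycles, std::size_t lo_size,
                         std::size_t hi_size) {
    if (lo_size == 0 || hi_size < lo_size) return false;
    unsigned char* buf = static_cast<unsigned char*>(std::malloc(lo_size));
    if (buf == nullptr) return false;
    std::memset(buf, 0xAB, lo_size);

    for (std::size_t c = 0; c < cycles; ++c) {
        // Even cycles grow the buffer, odd cycles shrink it.
        const std::size_t target = (c % 2 == 0) ? hi_size : lo_size;
        void* next = std::realloc(buf, target);
        if (next == nullptr) {
            std::free(buf);  // realloc leaves the old block valid on failure
            return false;
        }
        buf = static_cast<unsigned char*>(next);
        // realloc preserves contents up to the smaller of old and new sizes.
        for (std::size_t i = 0; i < lo_size; ++i) {
            if (buf[i] != 0xAB) {
                std::free(buf);
                return false;
            }
        }
    }
    std::free(buf);
    return true;
}
```

An allocator that can resize a block in place avoids copying on each cycle, whereas one that always allocates a new block and copies pays for every resize. Timing this loop under different allocators gives a rough sense of that difference.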

linux-scalability

Pure malloc/free pairs with no work between operations. Tests raw allocator scalability.

Take-home: jemalloc excels here; this workload is adversarial for Hoard's superblock design. However, jemalloc uses significantly more memory.

NUMA Performance

On NUMA systems, memory locality matters. Hoard's NUMA-aware sharding keeps allocations on the same NUMA node as the allocating thread, reducing cross-node memory traffic.

Take-home: At 128 threads on a 2-node NUMA system, Hoard is 1.4x faster than mimalloc, 1.4x faster than jemalloc, and 1.6x faster than glibc. The advantage grows with thread count.

Technical Information

Hoard has changed quite a bit over the years, but for technical details of the first version of Hoard, read "Hoard: A Scalable Memory Allocator for Multithreaded Applications," by Emery D. Berger, Kathryn S. McKinley, Robert D. Blumofe, and Paul R. Wilson. The Ninth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX). Cambridge, MA, November 2000.
