SpikeInterface · Julie-Fabre · Jan 7, 2026
diff --git a/.gitignore b/.gitignore
@@ -141,6 +141,7 @@ examples/modules_gallery/**/*.zarr
 
 # Files and folders generated during tests
 test_folder/
+*trained_pipeline/
 
 # Mac OS
 .DS_Store

diff --git a/doc/api.rst b/doc/api.rst
@@ -373,6 +373,7 @@ spikeinterface.curation
     .. autofunction:: remove_redundant_units
     .. autofunction:: remove_duplicated_spikes
     .. autofunction:: remove_excess_spikes
+    .. autofunction:: threshold_metrics_label_units
     .. autofunction:: model_based_label_units
     .. autofunction:: load_model
     .. autofunction:: train_model

diff --git a/doc/how_to/auto_label_units.rst b/doc/how_to/auto_label_units.rst
@@ -0,0 +1,274 @@
+Automatic labeling of units after spike sorting
+===============================================
+
+This example shows how to automatically label units after spike sorting,
+using three different approaches:
+
+1. Simple filter based on quality metrics
+2. Bombcell: heuristic approach to label units based on quality and
+   template metrics [Fabre]_
+3. UnitRefine: pre-trained classifiers to label units as noise or
+   SUA/MUA [Jain]_
+
+.. code:: ipython3
+
+    import numpy as np
+
+    import spikeinterface as si
+    import spikeinterface.curation as sc
+    import spikeinterface.widgets as sw
+
+    from pprint import pprint
+
+.. code:: ipython3
+
+    %matplotlib inline
+
+.. code:: ipython3
+
+    analyzer_path = "/ssd980/working/analyzer_np2_single_shank.zarr"
+
+.. code:: ipython3
+
+    sorting_analyzer = si.load(analyzer_path)
+
+
+.. code:: ipython3
+
+    sorting_analyzer
+
+
+.. parsed-literal::
+
+    SortingAnalyzer: 96 channels - 142 units - 1 segments - zarr - sparse - has recording
+    Loaded 14 extensions: amplitude_scalings, correlograms, isi_histograms, noise_levels, principal_components, quality_metrics, random_spikes, spike_amplitudes, spike_locations, template_metrics, template_similarity, templates, unit_locations, waveforms
+
+
+
+The ``SortingAnalyzer`` includes several metrics that we can use for
+curation:
+
+.. code:: ipython3
+
+    sorting_analyzer.get_metrics_extension_data().columns
+
+
+
+
+.. parsed-literal::
+
+    Index(['amplitude_cutoff', 'amplitude_cv_median', 'amplitude_cv_range',
+           'amplitude_median', 'd_prime', 'drift_mad', 'drift_ptp', 'drift_std',
+           'firing_range', 'firing_rate', 'isi_violations_count',
+           'isi_violations_ratio', 'isolation_distance', 'l_ratio', 'nn_hit_rate',
+           'nn_miss_rate', 'noise_cutoff', 'noise_ratio', 'num_spikes',
+           'presence_ratio', 'rp_contamination', 'rp_violations', 'sd_ratio',
+           'silhouette', 'sliding_rp_violation', 'snr', 'sync_spike_2',
+           'sync_spike_4', 'sync_spike_8', 'exp_decay', 'half_width',
+           'main_peak_to_trough_ratio', 'main_to_next_extremum_duration',
+           'num_negative_peaks', 'num_positive_peaks',
+           'peak_after_to_trough_ratio', 'peak_after_width',
+           'peak_before_to_peak_after_ratio', 'peak_before_to_trough_ratio',
+           'peak_before_width', 'peak_to_trough_duration', 'recovery_slope',
+           'repolarization_slope', 'spread', 'trough_width', 'velocity_above',
+           'velocity_below', 'waveform_baseline_flatness'],
+          dtype='object')
+
+
+
+1. Quality-metrics based curation
+---------------------------------
+
+A simple solution is to use a filter based on quality metrics. To do so,
+we can use the ``spikeinterface.curation.threshold_metrics_label_units``
+function and provide a set of thresholds.
+
+.. code:: ipython3
+
+    qm_thresholds = {
+        "snr": {"min": 5},
+        "firing_rate": {"min": 0.1, "max": 200},
+        "rp_contamination": {"max": 0.5}
+    }
+
+.. code:: ipython3
+
+    qm_labels = sc.threshold_metrics_label_units(sorting_analyzer, thresholds=qm_thresholds)
+
+.. code:: ipython3
+
+    qm_labels["label"].value_counts()
+
+
+
+
+.. parsed-literal::
+
+    label
+    noise    115
+    good      27
+    Name: count, dtype: int64
+
+
+
+.. code:: ipython3
+
+    w = sw.plot_unit_labels(sorting_analyzer, qm_labels["label"], ylims=(-300, 100))
+    w.figure.suptitle("Quality-metrics labeling")
+
+
+
+.. image:: auto_label_units_files/auto_label_units_12_1.png
+
+
+Only 27 units are labeled as *good*, and we can see from the plots that
+some “noisy” waveforms are not properly flagged and some visually good
+waveforms are labeled as noise. Let’s take a look at more powerful
+methods.
+
+2. Bombcell
+-----------
+
+**Bombcell** ([Fabre]_) is another threshold-based method. It combines
+quality metrics and template metrics in a more refined way: it can label
+units as ``noise``, ``mua``, or ``good``, and can additionally flag
+``non_soma`` units. It comes with default thresholds, but user-defined
+thresholds can be provided as a dictionary or a JSON
+file.
+
+.. code:: ipython3
+
+    bombcell_default_thresholds = sc.bombcell_get_default_thresholds()
+    pprint(bombcell_default_thresholds)
+
+
+.. parsed-literal::
+
+    {'amplitude_cutoff': {'max': 0.2, 'min': None},
+     'amplitude_median': {'max': None, 'min': 40},
+     'drift_ptp': {'max': 100, 'min': None},
+     'exp_decay': {'max': 0.1, 'min': 0.01},
+     'main_peak_to_trough_ratio': {'max': 0.8, 'min': None},
+     'num_negative_peaks': {'max': 1, 'min': None},
+     'num_positive_peaks': {'max': 2, 'min': None},
+     'num_spikes': {'max': None, 'min': 300},
+     'peak_after_to_trough_ratio': {'max': 0.8, 'min': None},
+     'peak_before_to_peak_after_ratio': {'max': 3, 'min': None},
+     'peak_before_to_trough_ratio': {'max': 3, 'min': None},
+     'peak_before_width': {'max': None, 'min': 0.00015},
+     'peak_to_trough_duration': {'max': 0.00115, 'min': 0.0001},
+     'presence_ratio': {'max': None, 'min': 0.7},
+     'rp_contamination': {'max': 0.1, 'min': None},
+     'snr_baseline': {'max': None, 'min': 5},
+     'trough_width': {'max': None, 'min': 0.0002},
+     'waveform_baseline_flatness': {'max': 0.5, 'min': None}}
+
+
+.. code:: ipython3
+
+    bombcell_labels = sc.bombcell_label_units(sorting_analyzer, thresholds=bombcell_default_thresholds)
+
+.. code:: ipython3
+
+    bombcell_labels["label"].value_counts()
+
+
+
+
+.. parsed-literal::
+
+    label
+    good        58
+    noise       50
+    mua         33
+    non_soma     1
+    Name: count, dtype: int64
+
+
+
+.. code:: ipython3
+
+    w = sw.plot_unit_labels(sorting_analyzer, bombcell_labels["label"], ylims=(-300, 100))
+    w.figure.suptitle("Bombcell labeling")
+
+
+
+.. image:: auto_label_units_files/auto_label_units_18_1.png
+
+
+3. UnitRefine
+-------------
+
+**UnitRefine** ([Jain]_) also uses quality and template metrics, but in
+a different way: it relies on classifiers pre-trained on
+hand-curated data. By default, the classification is performed in two
+steps: first a *noise*/*neural* classifier is applied, followed by a
+*sua*/*mua* classifier. Several models are available on the
+`SpikeInterface HuggingFace
+page <https://huggingface.co/SpikeInterface>`__.
+
+.. code:: ipython3
+
+    unitrefine_labels = sc.unitrefine_label_units(
+        sorting_analyzer,
+        noise_neural_classifier="SpikeInterface/UnitRefine_noise_neural_classifier",
+        sua_mua_classifier="SpikeInterface/UnitRefine_sua_mua_classifier",
+    )
+
+.. code:: ipython3
+
+    unitrefine_labels["label"].value_counts()
+
+
+
+
+.. parsed-literal::
+
+    label
+    sua      62
+    noise    47
+    mua      33
+    Name: count, dtype: int64
+
+
+
+.. code:: ipython3
+
+    w = sw.plot_unit_labels(sorting_analyzer, unitrefine_labels["label"], ylims=(-300, 100))
+    w.figure.suptitle("UnitRefine labeling")
+
+
+
+.. image:: auto_label_units_files/auto_label_units_22_1.png
+
+
+.. note::
+
+    If you want to train your own models, see the `UnitRefine
+    repo <https://github.com/anoushkajain/UnitRefine>`__ for
+    instructions!
+
+This “How To” demonstrated how to automatically label units after spike
+sorting with different strategies. We recommend running **Bombcell** and
+**UnitRefine** as part of your pipeline. These methods will facilitate
+further curation and make downstream analysis cleaner.
+
+To remove units from your ``SortingAnalyzer``, you can simply use the
+``select_units`` function:
+
+.. code:: ipython3
+
+    non_noisy_units = bombcell_labels["label"] != "noise"
+    sorting_analyzer_clean = sorting_analyzer.select_units(sorting_analyzer.unit_ids[non_noisy_units])
+
+.. code:: ipython3
+
+    sorting_analyzer_clean
+
+
+
+
+.. parsed-literal::
+
+    SortingAnalyzer: 96 channels - 92 units - 1 segments - memory - sparse - has recording
+    Loaded 14 extensions: random_spikes, waveforms, templates, amplitude_scalings, correlograms, isi_histograms, noise_levels, principal_components, spike_locations, spike_amplitudes, quality_metrics, template_metrics, template_similarity, unit_locations
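
The quality-metrics section of ``auto_label_units.rst`` passes a dictionary of ``min``/``max`` bounds to ``threshold_metrics_label_units``. The following is a minimal, dependency-free sketch of what such a filter could look like. It is not SpikeInterface's implementation: the helper name ``label_by_thresholds``, the inclusive bounds, the treatment of missing values as failures, and the toy metric values are all assumptions made for illustration.

```python
import math


def label_by_thresholds(metrics, thresholds, pass_label="good", fail_label="noise"):
    """Label each unit by checking its metrics against min/max bounds.

    Hypothetical helper: bounds are treated as inclusive, and a missing
    or NaN metric counts as a failure (both are assumptions).
    """
    labels = {}
    for unit_id, unit_metrics in metrics.items():
        passes = True
        for name, bounds in thresholds.items():
            value = unit_metrics.get(name)
            if value is None or math.isnan(value):
                passes = False
                break
            lower = bounds.get("min")
            upper = bounds.get("max")
            if lower is not None and value < lower:
                passes = False
                break
            if upper is not None and value > upper:
                passes = False
                break
        labels[unit_id] = pass_label if passes else fail_label
    return labels


# Same thresholds as in the how-to
qm_thresholds = {
    "snr": {"min": 5},
    "firing_rate": {"min": 0.1, "max": 200},
    "rp_contamination": {"max": 0.5},
}

# Toy metric values, invented for this example
toy_metrics = {
    "u0": {"snr": 8.0, "firing_rate": 3.2, "rp_contamination": 0.05},
    "u1": {"snr": 3.1, "firing_rate": 12.0, "rp_contamination": 0.02},
    "u2": {"snr": 6.5, "firing_rate": 0.05, "rp_contamination": 0.10},
    "u3": {"snr": 9.0, "firing_rate": 20.0, "rp_contamination": 0.90},
}

labels = label_by_thresholds(toy_metrics, qm_thresholds)
print(labels)
```

A unit is labeled *good* only if every listed metric falls inside its bounds, which is why a single strict threshold can push many units into *noise*, as in the 27/115 split reported in the how-to.
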
diff --git a/doc/how_to/auto_label_units_files/auto_label_units_12_1.png b/doc/how_to/auto_label_units_files/auto_label_units_12_1.png
diff --git a/doc/how_to/auto_label_units_files/auto_label_units_18_1.png b/doc/how_to/auto_label_units_files/auto_label_units_18_1.png
diff --git a/doc/how_to/auto_label_units_files/auto_label_units_22_1.png b/doc/how_to/auto_label_units_files/auto_label_units_22_1.png
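
The how-to in ``auto_label_units.rst`` recommends running both Bombcell and UnitRefine. One way to use the two label sets together is to cross-tabulate them and keep only units that neither method calls ``noise``. The sketch below uses plain Python lists with invented labels; the function names and the "keep if neither says noise" rule are assumptions for illustration, not something the patch prescribes.

```python
from collections import Counter


def cross_tabulate(labels_a, labels_b):
    """Count how often each (label_a, label_b) pair occurs across units."""
    return Counter(zip(labels_a, labels_b))


def consensus_keep_mask(labels_a, labels_b, reject_label="noise"):
    """True for units that neither method labels as `reject_label`."""
    return [a != reject_label and b != reject_label for a, b in zip(labels_a, labels_b)]


# Invented labels for six units, using the label names shown in the how-to
bombcell = ["good", "mua", "noise", "good", "non_soma", "noise"]
unitrefine = ["sua", "mua", "noise", "mua", "noise", "noise"]

table = cross_tabulate(bombcell, unitrefine)
keep = consensus_keep_mask(bombcell, unitrefine)

for (label_a, label_b), count in sorted(table.items()):
    print(f"bombcell={label_a:<9} unitrefine={label_b:<6} n={count}")
print(keep)
```

With real data, the two label columns would come from ``bombcell_labels["label"]`` and ``unitrefine_labels["label"]``, and the resulting mask could be used with ``select_units`` in the same way as the single-method example in the how-to.
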
diff --git a/doc/how_to/import_kilosort_data.rst b/doc/how_to/import_kilosort_data.rst
@@ -49,7 +49,7 @@ If you'd like to store the information you've computed, you can save the analyze
     )
 
 You now have a fully functional ``SortingAnalyzer`` - congrats! You can now use `spikeinterface-gui <https://github.com/SpikeInterface/spikeinterface-gui/>`__. to view the results
-interactively, or start manually labelling your units to `create an automated curation model <https://spikeinterface.readthedocs.io/en/stable/tutorials_custom_index.html#automated-curation-tutorials>`__.
+interactively, or start manually labeling your units to `create an automated curation model <https://spikeinterface.readthedocs.io/en/stable/tutorials_custom_index.html#automated-curation-tutorials>`__.
 
 Note that if you have access to the raw recording, you can attach it to the analyzer, and re-compute extensions from the raw data. E.g.
 

diff --git a/doc/how_to/index.rst b/doc/how_to/index.rst
@@ -21,6 +21,7 @@ Guides on how to solve specific, short problems in SpikeInterface. Learn how to.
     load_matlab_data
     load_your_data_into_sorting
     benchmark_with_hybrid_recordings
+    auto_label_units
     auto_curation_training
     auto_curation_prediction
     import_kilosort_data
diff --git a/doc/images/template_metrics.png b/doc/images/template_metrics.png
diff --git a/doc/modules/metrics.rst b/doc/modules/metrics.rst
@@ -28,14 +28,32 @@ metric information. For example, you can get the list of available metrics using
 .. code-block::
 
     Available metric columns:
-    ['peak_to_valley', 'peak_trough_ratio', 'half_width', 'repolarization_slope',
-     'recovery_slope', 'num_positive_peaks', 'num_negative_peaks', 'velocity_above',
-     'velocity_below', 'exp_decay', 'spread']
+    [
+        'peak_to_trough_duration',
+        'half_width',
+        'repolarization_slope',
+        'recovery_slope',
+        'num_positive_peaks',
+        'num_negative_peaks',
+        'main_to_next_peak_duration',
+        'peak_before_to_trough_ratio',
+        'peak_after_to_trough_ratio',
+        'peak_before_to_peak_after_ratio',
+        'main_peak_to_trough_ratio',
+        'trough_width',
+        'peak_before_width',
+        'peak_after_width',
+        'waveform_baseline_flatness',
+        'velocity_above',
+        'velocity_below',
+        'exp_decay',
+        'spread'
+    ]
 
 
 .. code-block:: python
 
-    metric_descriptions = ComputeTemplateMetrics.get_metric_descriptions()
+    metric_descriptions = ComputeTemplateMetrics.get_metric_column_descriptions()
     print("Metric descriptions: ")
     print(metric_descriptions)
 
@@ -44,21 +62,30 @@ metric information. For example, you can get the list of available metrics using
 
     Metric descriptions:
     {
-        'peak_to_valley': 'Duration in s between the trough (minimum) and the peak (maximum) of the spike waveform.',
-        'peak_trough_ratio': 'Ratio of the amplitude of the peak (maximum) to the trough (minimum) of the spike waveform.',
+        'peak_to_trough_duration': 'Duration in seconds between the trough (minimum) and the peak (maximum) of the spike waveform.',
         'half_width': 'Duration in s at half the amplitude of the trough (minimum) of the spike waveform.',
         'repolarization_slope': 'Slope of the repolarization phase of the spike waveform, between the trough (minimum) and return to baseline in uV/s.',
         'recovery_slope': 'Slope of the recovery phase of the spike waveform, after the peak (maximum) returning to baseline in uV/s.',
         'num_positive_peaks': 'Number of positive peaks in the template',
-        'num_negative_peaks': 'Number of negative peaks in the template',
+        'num_negative_peaks': 'Number of negative peaks (troughs) in the template',
+        'main_to_next_peak_duration': 'Duration in seconds from main extremum to next extremum.',
+        'peak_before_to_trough_ratio': 'Ratio of peak before amplitude to trough amplitude',
+        'peak_after_to_trough_ratio': 'Ratio of peak after amplitude to trough amplitude',
+        'peak_before_to_peak_after_ratio': 'Ratio of peak before amplitude to peak after amplitude',
+        'main_peak_to_trough_ratio': 'Ratio of main peak amplitude to trough amplitude',
+        'trough_width': 'Width of the main trough in seconds',
+        'peak_before_width': 'Width of the main peak before trough in seconds',
+        'peak_after_width': 'Width of the main peak after trough in seconds',
+        'waveform_baseline_flatness': 'Ratio of max baseline amplitude to max waveform amplitude. Lower = flatter baseline.',
         'velocity_above': 'Velocity of the spike propagation above the max channel in um/ms',
         'velocity_below': 'Velocity of the spike propagation below the max channel in um/ms',
-        'exp_decay': 'Exponential decay of the template amplitude over distance from the extremum channel (1/um).',
+        'exp_decay': 'Spatial decay of the template amplitude over distance from the extremum channel (1/um). Uses exponential or linear fit based on linear_fit parameter.',
         'spread': 'Spread of the template amplitude in um, calculated as the distance between channels whose templates exceed the spread_threshold.'
     }
 
 
 
+
 .. toctree::
     :caption: Metrics submodules
     :maxdepth: 1
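
Two of the template-metric descriptions in ``metrics.rst`` read almost like formulas: ``peak_to_trough_duration`` ("duration in seconds between the trough (minimum) and the peak (maximum)") and ``waveform_baseline_flatness`` ("ratio of max baseline amplitude to max waveform amplitude"). The sketch below works them out on a toy single-channel template. It is an illustration of the descriptions only, not the library's code: taking the peak as the maximum after the trough, and the baseline as the first ``n_baseline_samples`` samples, are assumptions.

```python
def peak_to_trough_duration(template, sampling_frequency):
    """Seconds between the trough (global minimum) and the largest
    sample after it (assumed definition of "the peak")."""
    trough_idx = min(range(len(template)), key=template.__getitem__)
    after_trough = template[trough_idx:]
    peak_offset = max(range(len(after_trough)), key=after_trough.__getitem__)
    return peak_offset / sampling_frequency


def waveform_baseline_flatness(template, n_baseline_samples):
    """Max absolute baseline amplitude divided by max absolute waveform
    amplitude. The baseline window is an assumed parameter."""
    max_baseline = max(abs(v) for v in template[:n_baseline_samples])
    max_waveform = max(abs(v) for v in template)
    return max_baseline / max_waveform


# Toy template in uV, sampled at 30 kHz (values invented for this example)
template = [0.0, 1.0, -2.0, 5.0, -40.0, -10.0, 8.0, 12.0, 6.0, 1.0]
sampling_frequency = 30000.0

duration = peak_to_trough_duration(template, sampling_frequency)
flatness = waveform_baseline_flatness(template, n_baseline_samples=3)

print(f"peak_to_trough_duration: {duration:.6f} s")
print(f"waveform_baseline_flatness: {flatness:.3f}")
```

Here the trough is at sample 4 and the following peak at sample 7, so the duration is 3 samples at 30 kHz; the baseline window peaks at 2 uV against a 40 uV trough, giving a low flatness ratio, which the description associates with a flatter baseline.
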