settylab · settylab-dotto-bot · May 29, 2026 · May 30, 2026 · Jun 2, 2026
@@ -4,10 +4,22 @@ All notable changes to this project will be documented in this file.
 
 ## [Unreleased]
 
+## [0.8.0] - 2026-05-28
+
+### Behavior changes
+
+These three changes correct discrepancies between the implementation and the manuscript that describes Kompot's statistics. Two are numerical scale shifts that preserve relative rankings; the third is a default-value harmonization. Re-tune any absolute thresholds calibrated against 0.7.0.
+
+ - **Mahalanobis denominator now sums covariances**: the gene-wise Mahalanobis distance used by `DifferentialExpression.predict(compute_mahalanobis=True)` now computes the posterior combined covariance as `Σ_a + Σ_b` (the variance of the difference of two independent posterior estimators) instead of `(Σ_a + Σ_b) / 2`. Matches the manuscript definition `D(a,b) = sqrt((μ_a − μ_b)^T (Σ_a + Σ_b)^(-1) (μ_a − μ_b))`. **Effect**: absolute Mahalanobis distances in the GP-only regime contract by a factor of `√2` (and `D²` by 2). Relative rankings of genes are unchanged because the same scale factor applies everywhere, and FDR thresholds re-calibrate against the null. The sample-variance branch was already correctly summed.
+ - **Differential-abundance posterior tail probability is now one-sided**: `DifferentialAbundance.predict()` returns `PTP = Φ(−|z|)` (one-sided), matching the manuscript definition `PTP(x_i) = Φ(−|Δ(x_i)|/√(σ_a² + σ_b²))`. Previous releases returned `2·Φ(−|z|)` (two-sided). **Effect**: numeric PTP values are halved relative to 0.7.0; equivalently, `neg_log10_fold_change_ptp` increases by `log10(2) ≈ 0.301`. The threshold `PTP < 1e-3` previously corresponded to `|z| ≥ 3.29` and now corresponds to `|z| ≥ 3.09`. Re-tune any hard-coded `ptp_threshold` chosen against 0.7.0 if you want to preserve the old call-rate.
+ - **`use_empirical_variance` default is now `False` everywhere**: harmonized across `kompot.de()` (already False), `DifferentialExpression.__init__`, `ExpressionModel.__init__`, `kompot.smooth_expression()`, the deprecated `compute_differential_expression()` and `compute_smoothed_expression()` wrappers, and the CLI `smooth_config_template.yaml`. Previously these four entry points defaulted to `True`, inconsistent with both the recommended `kompot.de()` path and the manuscript's "empirical variance is disabled by default" statement. Code that relies on empirical variance must now pass `use_empirical_variance=True` explicitly.
+ - **Differential-expression posterior tail probability is now stored in log space**: `DifferentialExpression.predict(compute_mahalanobis=True)` and `kompot.de(..., store_additional_stats=True)` now compute the Mahalanobis posterior tail probability with `scipy.stats.chi2.logsf` in float64 and store `-log10(PTP)` in a renamed field `<result_key>_<a>_to_<b>_neg_log10_ptp` (previously the linear tail probability `chi2.sf` in `<result_key>_<a>_to_<b>_ptp`). The PTP is a strictly monotone transform of the Mahalanobis distance, but for an embedding with df on the order of tens the linear `chi2.sf` evaluates to values numerically indistinguishable from `1.0` for the majority of genes (every gene with `D²` below the chi-squared mean), saturating the stored statistic and destroying gene-ranking resolution at the head of the distribution. Computing `logsf` directly (never forming `1 − cdf`) keeps every value distinct and recovers the Mahalanobis ranking exactly. This mirrors the differential-abundance path, whose `neg_log10_lfc_ptp` field is already stored this way. **Effect**: the stored values change scale and orientation — larger now means more significant (a probability `p` becomes `−log10(p) ∈ [0, ∞)`), and the field is renamed. `volcano_de(y_axis_type="ptp")` reads the column directly with no additional `-log10` transform, and its `significance_threshold` is still supplied as a probability. Update any code that reads the old `_ptp` column.
+
 ### New features
 
  - **`--dry-run` flag for `kompot de` CLI**: estimates memory, disk, and output field requirements without running the analysis. Outputs machine-parseable JSON to stdout and a human-readable report to stderr. Exit code reflects feasibility.
  - **`kompot.configure_logging(stream)`**: reconfigure the kompot logger output stream. The CLI now logs to stderr by default, keeping stdout clean for machine-parseable output (dry-run JSON, table output).
+ - **`kompot.plot.lollipop`**: ax-embeddable gene-set-enrichment lollipop plot. One row per enriched term; a stem runs to a dot whose x-position encodes significance (`-log10(FDR)` by default, or any score column such as StringDB `signal` / enrichr `Combined Score`) and whose area encodes the matched-gene count, with a dashed `FDR = 0.05` guide and an in-axes aesthetic key. The headline feature is input flexibility: pass a `kompot.plot.StringDBReport` (its `get_functional_enrichment()` is called for you), the `signal`-sorted DataFrame that method returns, **or** a generic enrichment table from another tool (gseapy/enrichr, GOATOOLS, clusterProfiler). Column-name mapping params (`term_col`, `score_col`, `count_col`, `fdr_col`) with case-insensitive autodetection — including the gseapy `"k/K"` `Overlap`-string parser — bridge the schema differences. The fig-3 manuscript specifics (direction-red accent, reserved title band, GO-Process category) are now parameters with manuscript-matching defaults. Like `dotplot`, it composes into an externally-provided `ax=` instead of building its own `GridSpec`.
  - **`kompot.plot.dotplot`**: ax-embeddable fold-change-per-group dotplot. Color = mean of a per-cell LFC layer within each `groupby` category; size = fraction of cells expressing. Gene selection is either an explicit list or auto-picked top-N by Mahalanobis from run history (with optional `filter_key`, e.g. restricting to `is_de=True`). Pass `axes=(main, cbar, size_legend)` to compose into a larger figure, or leave `axes=None` for a standalone figure. Unlike `scanpy.pl.DotPlot`, this function does not build its own `GridSpec` and does not fight externally-provided axes, which is the whole reason it exists. Shares gene-selection, layer-fetch, and colormap-normalization primitives with `kompot.plot.heatmap` via the existing `heatmap.utils` helpers.
 
 ### Improvements

@@ -15,6 +15,16 @@ Expression Plots
 
 .. autofunction:: kompot.plot.plot_gene_expression
 
+Dotplots
+--------
+
+.. autofunction:: kompot.plot.dotplot
+
+Enrichment Lollipop
+-------------------
+
+.. autofunction:: kompot.plot.lollipop
+
 Heatmaps
 --------
 

@@ -771,8 +771,10 @@ def _compute_fdr(
             "mahalanobis": internal_null_mahalanobis,
             "mean_lfc": expression_results["mean_log_fold_change"][n_real:],
         }
-        if "ptp" in expression_results:
-            null_table_data["ptp"] = expression_results["ptp"][n_real:]
+        if "neg_log10_ptp" in expression_results:
+            null_table_data["neg_log10_ptp"] = expression_results["neg_log10_ptp"][
+                n_real:
+            ]
         null_data["table"] = pd.DataFrame(
             null_table_data,
             index=null_gene_names,
@@ -823,8 +825,10 @@ def _compute_fdr(
 
     if "mahalanobis_distances" in expression_results:
         expression_results["mahalanobis_distances"] = real_mahalanobis
-    if "ptp" in expression_results:
-        expression_results["ptp"] = expression_results["ptp"][:n_real]
+    if "neg_log10_ptp" in expression_results:
+        expression_results["neg_log10_ptp"] = expression_results["neg_log10_ptp"][
+            :n_real
+        ]
 
     return fdr_results
 
@@ -909,17 +913,21 @@ def _store_de_results(
             adata,
         )
 
-    if compute_mahalanobis and "ptp" in expression_results and store_additional_stats:
-        ptp = _ensure_1d(
-            expression_results["ptp"],
-            "ptp",
+    if (
+        compute_mahalanobis
+        and "neg_log10_ptp" in expression_results
+        and store_additional_stats
+    ):
+        neg_log10_ptp = _ensure_1d(
+            expression_results["neg_log10_ptp"],
+            "neg_log10_ptp",
             n_selected,
             logger,
         )
         _add_var_column(
             new_var_columns,
             field_names["ptp_key"],
-            ptp,
+            neg_log10_ptp,
             selected_genes,
             adata,
         )
@@ -1426,9 +1434,13 @@ def _compute_group_results(
                         f"significantly DE at FDR < {fdr_threshold}"
                     )
 
-        # Group-wise ptp
-        if compute_mahalanobis and store_additional_stats and "ptp" in subset_results:
-            sub_ptp = subset_results["ptp"]
+        # Group-wise neg_log10_ptp
+        if (
+            compute_mahalanobis
+            and store_additional_stats
+            and "neg_log10_ptp" in subset_results
+        ):
+            sub_ptp = subset_results["neg_log10_ptp"]
             if len(sub_ptp) == len(expanded_genes):
                 sub_ptp = sub_ptp[:n_real]
             elif len(sub_ptp) != n_real:
@@ -1641,7 +1653,8 @@ def _build_field_mapping(
                 "location": "var",
                 "type": "ptp",
                 "description": (
-                    "Posterior tail probability from chi-squared distribution"
+                    "Negative log10 posterior tail probability (-log10 PTP) from "
+                    "the chi-squared distribution, computed in log space"
                 ),
             }
 
@@ -1728,7 +1741,10 @@ def _add_group_field_mapping(
             field_mapping[ptp_k] = {
                 "location": "varm",
                 "type": "ptp",
-                "description": "Peak-to-peak values for all subsets",
+                "description": (
+                    "Negative log10 posterior tail probability (-log10 PTP) for "
+                    "all subsets"
+                ),
                 "contains_subsets": subset_names,
             }
 

@@ -89,7 +89,7 @@ def cleanup(
 
     - ``'mean_log_fold_change'``: Mean log fold change values
     - ``'mahalanobis'``: Mahalanobis distances
-    - ``'ptp'``: Posterior tail probability
+    - ``'ptp'``: Negative log10 posterior tail probability (-log10 PTP)
     - ``'mahalanobis_pvalue'``: P-values from empirical null
     - ``'mahalanobis_local_fdr'``: Local FDR values
     - ``'mahalanobis_tail_fdr'``: Tail-based FDR values

@@ -530,8 +530,8 @@ def de(
 
     if compute_mahalanobis and "mahalanobis_distances" in expression_results:
         results_data["mahalanobis"] = expression_results["mahalanobis_distances"]
-    if "ptp" in expression_results:
-        results_data["ptp"] = expression_results["ptp"]
+    if "neg_log10_ptp" in expression_results:
+        results_data["neg_log10_ptp"] = expression_results["neg_log10_ptp"]
 
     result_dict["table"] = pd.DataFrame(results_data, index=selected_genes)
     result_dict["underrepresentation"] = underrep
@@ -751,7 +751,7 @@ def compute_differential_expression(
     return_full_results: bool = False,
     store_posterior_covariance: bool = False,
     allow_single_condition_variance: bool = False,
-    use_empirical_variance: bool = True,
+    use_empirical_variance: bool = False,
     progress: bool = True,
     null_genes="auto",
     null_seed=42,

@@ -109,7 +109,7 @@ def smooth_expression(
     ls = gp.ls if gp is not None else None
     ls_factor = gp.ls_factor if gp is not None else 10.0
     n_landmarks = gp.n_landmarks if gp is not None else 5000
-    use_empirical_variance = gp.use_empirical_variance if gp is not None else True
+    use_empirical_variance = gp.use_empirical_variance if gp is not None else False
     eps = gp.eps if gp is not None else 1e-8
     random_state = gp.random_state if gp is not None else None
     batch_size = gp.batch_size if gp is not None else 500
@@ -393,7 +393,7 @@ def compute_smoothed_expression(
     sigma: float = 1.0,
     ls: Optional[float] = None,
     ls_factor: float = 10.0,
-    use_empirical_variance: bool = True,
+    use_empirical_variance: bool = False,
     eps: float = 1e-8,
     random_state: Optional[int] = None,
     batch_size: int = 500,

@@ -226,7 +226,7 @@ def generate_output_field_names(
         field_names.update(
             {
                 "mahalanobis_key": f"{result_key}_{cond1_safe}_to_{cond2_safe}_mahalanobis{suffix}",
-                "ptp_key": f"{result_key}_{cond1_safe}_to_{cond2_safe}_ptp{suffix}",
+                "ptp_key": f"{result_key}_{cond1_safe}_to_{cond2_safe}_neg_log10_ptp{suffix}",
                 "mean_lfc_key": f"{result_key}_{cond1_safe}_to_{cond2_safe}_mean_lfc",
                 "smoothed_key_1": f"{result_key}_{cond1_safe}_smoothed",
                 "smoothed_key_2": f"{result_key}_{cond2_safe}_smoothed",

@@ -26,7 +26,7 @@ n_landmarks: 5000              # Number of landmarks for Nystrom approximation
 sample_col: null               # Column in adata.obs with sample IDs
 
 # Empirical variance (heteroscedastic noise):
-use_empirical_variance: true   # Estimate per-gene noise from GP residuals
+use_empirical_variance: false  # Estimate per-gene noise from GP residuals
 
 # GP kernel parameters:
 sigma: 1.0                     # Noise level for function estimator

@@ -603,11 +603,12 @@ def compute_sample_variance2(X_batch):
         sd = np.sqrt(log_fold_change_uncertainty + self.eps)
         log_fold_change_zscore = log_fold_change / sd
 
-        # Compute PTP (Posterior Tail Probability) in natural log (base e)
+        # Compute PTP (Posterior Tail Probability) in natural log (base e).
+        # One-sided per manuscript: PTP = Φ(−|z|) = min(Φ(z), Φ(−z)) for real z.
         ln_ptp = np.minimum(
             normal.logcdf(log_fold_change_zscore),
             normal.logcdf(-log_fold_change_zscore),
-        ) + np.log(2)
+        )
 
         # Convert from natural log to negative log10 (for better volcano plot visualization)
         # ln_ptp is a log of a small value (typically < 1), so it's negative

@@ -2,8 +2,7 @@
 
 import numpy as np
 import jax
-import jax.numpy as jnp
-import jax.scipy.stats as jax_stats
+from scipy.stats import chi2 as scipy_chi2
 from typing import Optional, Dict, Any
 import logging
 from mellon.parameters import compute_landmarks
@@ -42,7 +41,7 @@ def __init__(
         self,
         n_landmarks: Optional[int] = None,
         use_sample_variance: Optional[bool] = None,
-        use_empirical_variance: bool = True,
+        use_empirical_variance: bool = False,
         eps: float = 1e-8,  # Increased default epsilon for better numerical stability
         jit_compile: bool = False,
         function_predictor1: Optional[Any] = None,
@@ -625,8 +624,10 @@ def compute_mahalanobis_distances(
             # Points for sample variance computation
             variance_points = X
 
-        # Average the covariance matrices
-        combined_cov = (cov1 + cov2) / 2
+        # Sum the covariance matrices: Σ_a + Σ_b is the variance of the
+        # difference of independent posterior estimators, matching the
+        # Mahalanobis denominator defined in the manuscript.
+        combined_cov = cov1 + cov2
         del cov1, cov2
 
         # For sample variance, use diag=False to get full covariance matrices
@@ -976,13 +977,38 @@ def predict(
 
             if hasattr(self, "_last_mahalanobis_dof"):
                 logger.debug(
-                    f"Computing ptp with {self._last_mahalanobis_dof} degrees of freedom..."
+                    f"Computing neg_log10_ptp with {self._last_mahalanobis_dof} "
+                    "degrees of freedom..."
                 )
-                mahalanobis_squared = jnp.array(mahalanobis_distances) ** 2
-                ptp = jax_stats.chi2.sf(
+                # Posterior tail probability (PTP) of the Mahalanobis distance under
+                # the chi-squared null. Computed in LOG space and stored as
+                # -log10(PTP), mirroring the DA path's neg_log10_lfc_ptp convention.
+                #
+                # The PTP is chi2.sf(D^2, df). For an embedding with df on the order
+                # of tens, the vast majority of genes have D^2 well below the chi2
+                # mean, where the linear sf evaluates to values numerically
+                # indistinguishable from 1.0 in float64 (1 - epsilon rounds to 1.0).
+                # Storing the linear sf therefore collapses most genes onto a single
+                # saturated value and destroys gene-ranking resolution at the head of
+                # the distribution. chi2.logsf returns the log of the same quantity
+                # directly (never forming 1 - cdf), so log(1 - epsilon) ~= -epsilon
+                # remains representable and every gene keeps a distinct value.
+                #
+                # scipy in float64 is used deliberately: jax runs in float32 unless
+                # x64 is explicitly enabled, and float32 logsf re-collapses the
+                # dynamic range (the precision is in the mantissa we are trying to
+                # preserve).
+                mahalanobis_squared = np.asarray(
+                    mahalanobis_distances, dtype=np.float64
+                ) ** 2
+                ln_ptp = scipy_chi2.logsf(
                     mahalanobis_squared, df=self._last_mahalanobis_dof
                 )
-                result["ptp"] = np.array(ptp)
+                # Convert natural-log tail probability to -log10(PTP): positive and
+                # larger for more significant genes, matching the DA convention.
+                result["neg_log10_ptp"] = np.asarray(
+                    -(ln_ptp / np.log(10)), dtype=np.float64
+                )
 
         return result
 

@@ -112,6 +112,7 @@ class ExpressionModel:
         Number of landmarks for Nystrom approximation.
     use_empirical_variance : bool
         Whether to estimate per-gene empirical variance from GP residuals.
+        By default False.
     eps : float
         Small constant for numerical stability.
     random_state : int, optional
@@ -135,7 +136,7 @@ class ExpressionModel:
     def __init__(
         self,
         n_landmarks: Optional[int] = None,
-        use_empirical_variance: bool = True,
+        use_empirical_variance: bool = False,
         eps: float = 1e-8,
         random_state: Optional[int] = None,
         batch_size: int = 500,

@@ -166,6 +166,20 @@ def dotplot(*args, **kwargs):
         )
 
 
+try:
+    from .lollipop import lollipop
+
+    __all__.append("lollipop")
+except ImportError as e:
+    logger.warning(f"Could not import lollipop function due to: {e}")
+
+    def lollipop(*args, **kwargs):
+        raise ImportError(
+            "Lollipop plot unavailable due to missing dependencies. "
+            "matplotlib is required."
+        )
+
+
 # Import StringDB report class
 try:
     from .stringdb import StringDBReport

@@ -230,7 +230,7 @@ def _fallback_field_inference(
         "direction_key": ["direction"],
         "mean_lfc_key": ["mean_lfc", "lfc", "log_fold_change", "fold_change"],
         "mahalanobis_key": ["mahalanobis", "score"],
-        "ptp_key": ["ptp"],
+        "ptp_key": ["neg_log10_ptp", "ptp"],
         "is_de_key": ["is_de", "significant"],
         "zscore_key": ["zscore", "z_score"],
         "density_key_1": ["log_density"],