Bumps [pytest](https://github.com/pytest-dev/pytest) from 3.1.3 to 9.0.3.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@3.1.3...9.0.3)

---
updated-dependencies:
- dependency-name: pytest
  dependency-version: 9.0.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Bump pytest from 3.1.3 to 9.0.3
- Drop version+build pins on ucsc-bedgraphtobigwig, ucsc-bedtobigbed, ucsc-bigwigmerge, ucsc-stringify in requirements-conda.yml; the pinned 377 builds required openssl 1.1.1 and conflicted with the env's openssl 3.x. Letting the solver pick a compatible build resolves the install. (#321)
- Add r-argparser to requirements-conda.yml; tools/PEPATAC_summarizer.R needs it but it was not in the conda env. (#228)
- Add r-r.utils to requirements-conda.yml and R.utils to PEPATACr Imports; required by data.table::fread for .bed.gz peak coverage files. (#229)
- Pin GenomicDistributions (>= 1.4.6) and GenomicDistributionsData (>= 1.0.0) in PEPATACr Imports to avoid the chromSizes_hg38 / TSS_hg38 namespace mismatch when one side is upgraded without the other. (#230)
- Fix curl-into-variable bug in checkinstall at three sites: the REQS / PIPELINE fallback branches stored file contents in a variable that was used as a path downstream. Switched to mktemp + curl -o for the URL fallback, assigned the path directly for the local branch. (#226)
- Drop dead run-container.md link from docs/install.md and refresh intro now that containers were removed in 0.12.0.

Closes #228, #229, #230, #321, #226.
…-dedup

- _align() referenced an undefined in the single-end + bwa branch paired chain (minus filter_pair) for both --keep and no-keep paths: pm.run([cmd1, cmd2, cmd3, cmd4], <target>). (#299)
- Fix missing pm.fail_pipeline in the unmap_fq1 branch of the filter_paired_fq.pl handle check; previously a stuck filter on R1 set an error string that was never raised. Reworked the error message into a shared template that points at the underlying psutil introspection issue and recommends both --keep and --noFIFO as workarounds. (#234)
- Add --skip-dedup flag for protocols where duplicates are biologically meaningful (CUT&Tag, CUT&RUN). When set: copy mapping_genome_bam to _sort_dedup.bam so downstream peak calling finds the expected path; report Duplicate_reads=0 and pass through Dedup_aligned_reads/Dedup_alignment_rate/Dedup_total_efficiency from the pre-dedup metrics. Plumbed through sample_pipeline_interface.yaml so it can be set per-sample. (#249)
- Drop redundant Time/Success keys from pepatac_output_schema.yaml (both samples: and project: blocks). These are pipestat's auto-tracked status fields and the duplicate declaration triggered "SchemaError: Overlap between project- and sample-level keys" on newer pipestat. (#322, #305)
- Fix _LOGGER NameError in tools/bamQC.py and bamSitesToWig.py: the variable was only defined inside , so pararead workers re-importing the module under multiprocessing 'spawn' (macOS default) hit NameError when class methods logged. Added a module-level fallback logger above each class definition. (#266)
- Fix peakCounts() ref-peaks ignored when *_peaks_coverage.bed.gz coexists with *_ref_peaks_coverage.bed: the shared variable preferred .bed.gz from the regular peaks file and then looked for a non-existent _ref_peaks_coverage.bed.gz, falling through to the "not derived from a singular reference peak set" warning. Detect ref vs regular extensions independently. (#218, #219)
- Guard refgenie[sample.genome] lookups in sample_pipeline_interface.yaml with , so projects with non-refgenie genomes (e.g. galGal6, bosTau9) no longer crash the Jinja template with an attmap AttributeError; instead they fall through to the per-sample paths or error cleanly from pepatac.py. (#231)
- Fix plotAnno() empty-input fallback path bug: was constructing file.path(<output_file>, "<sample>_partition_dist.pdf") (treating the output pdf as a directory) and quit()ing the R session. Replaced with return(ggplot()), matching the function's other empty-data branches; the caller writes a clean blank placeholder at the expected target. (#232)

Closes #299, #234, #249, #322, #305, #266, #218, #219, #231, #232.
…ps3dp/tools/refgenie_config.yaml workaround

- faq.md: expand the TSSE entry to name the refgene_anno asset / UCSC RefGene as the source of TSS coords and note that the cutoff-of-6 threshold is hg38-tuned and empirical. Point at ENCODE ATAC-seq data standards for per-assembly reference numbers. (#235)
- assets.md: add a "Using a custom adapter file" subsection documenting the adapters resource override in pipelines/pepatac.yaml. (#252)
- assets.md: document the /home/jps3dp/tools/refgenie_config.yaml-required-even-with-manual-paths quirk and the empty-refgenie-config workaround. The proper fix is in the in-progress refgenie 1.0 migration (PR #327). (#251)
- count_table.md: make the per-sample PEPATAC_completed.flag handling explicit in the consensus-peak-set count table workflow. Two paths: delete the flag files (one-liner with find -delete) or pass --ignore-flags to looper run. (#215)
- assets.md: troubleshooting subsection for "TypeError: 'NoneType' object is not iterable", root-caused to incomplete refgenie assets (commonly missing prealignment FASTA), with diagnostic and fix commands. The error itself is upstream refgenconf behavior; replaced by the refgenie 1.0 migration (PR #327). (#216)
- glossary.md: document column formats for _peaks_coverage.bed (8 columns) and _ref_peaks_coverage.bed (15 columns; narrowPeak coordinates + bedtools coverage stats + normalized count). (#233)
- assets.md: "Running a non-refgenie genome through looper" subsection: sample_modifiers/imply pattern with chrom_sizes, genome_index, etc. set per-sample. (#231 docs portion)

Closes #235, #252, #251, #215, #216, #233.
Two distinct breakages in the integration-test runner that both surface
immediately when running ./tests/scripts/test-integration.sh against a
fresh bulker install:

1. Default crate name didn't match what bulker actually caches.

   The scripts defaulted PEPATAC_TEST_BULKER_CRATE to local/bulker_manifest,
   but bulker caches tests/bulker_manifest.yaml as bulker/pepatac:1.1.1 --
   it auto-namespaces the manifest's name: pepatac field under bulker/.
   bulker crate list | grep -q local/bulker_manifest therefore always
   failed, even immediately after a successful
   bulker crate install tests/bulker_manifest.yaml, leaving the runner
   wedged in a "crate not cached" loop. services.sh's usage banner and
   tests/README.md's env-var table both advertised yet a *third* name
   (databio/pepatac), used nowhere in the code -- doc drift.

   Realigned all three to bulker/pepatac to match what bulker caches.

2. PATH extraction via bulker activate --echo was format-fragile and
   broke on bulker 0.0.15 (the Rust rewrite).

   test-integration.sh and services.sh both invoked
   bulker activate --echo | grep '^export PATH=' | sed ... | cut ...
   to fish the crate's shim directory out of bulker's activation output.
   On bulker 0.0.15, --echo errored ("argument --echo cannot be used
   multiple times" -- likely a clap quirk in this version), and even when
   it worked the parse was brittle to quoting/formatting changes between
   bulker releases.

   Switched to bulker exec <crate> -- <cmd>, which lets bulker manage
   PATH for the duration of one command. The pytest invocation in
   test-integration.sh and the per-tool which check in services.sh
   now both run inside bulker exec, with no PATH scraping at all.

Files:
- tests/scripts/test-integration.sh: default BULKER_CRATE to bulker/pepatac;
drop the activate-and-extract-PATH block; run pytest via bulker exec.
- tests/scripts/services.sh: default BULKER_CRATE to bulker/pepatac; fix
usage banner; rewrite check_tools to probe each tool via
bulker exec ... -- which <tool>.
- tests/README.md: update env-var table to bulker/pepatac.
- tests/integration/{conftest,test_looper_run,test_end_to_end}.py:
refresh docstring prereq lines.
Regression from f20c354 (clean up integration tests).
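
The per-tool probe described above can be sketched in Python, under the assumption that only the bulker exec <crate> -- <cmd> form quoted in this commit message is available. The function names and the injectable run_ok hook are hypothetical; the real check_tools lives in services.sh and is written in shell.

```python
import subprocess
from typing import Callable, Iterable, List


def bulker_exec_argv(crate: str, cmd: Iterable[str]) -> List[str]:
    """Build the argv for `bulker exec <crate> -- <cmd>`."""
    return ["bulker", "exec", crate, "--", *cmd]


def _run_ok(argv: List[str]) -> bool:
    """Return True if the command could be started and exited with status 0."""
    try:
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        # bulker itself is missing or not executable.
        return False
    return result.returncode == 0


def missing_tools(
    crate: str,
    tools: Iterable[str],
    run_ok: Callable[[List[str]], bool] = _run_ok,
) -> List[str]:
    """Probe each tool with `which` inside the crate; return those not found."""
    return [
        tool
        for tool in tools
        if not run_ok(bulker_exec_argv(crate, ["which", tool]))
    ]
```

Because bulker sets up PATH only for the child command, nothing here parses activation output; the exit status of which is the whole signal.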
Adds testthat coverage for the two PEPATACr bugs fixed in 8bda2e4 so they can't be silently reintroduced.

- Hoist peakCounts()'s local detect_ext closure to a package-internal helper .detectPeakCoverageExt(suffix, results_subdir, sample_names, genomes). Same behavior, no functional change; just makes the ext-detection logic unit-testable without reconstructing the full peakCounts() pipeline (which would need valid peak data, chrom sizes, and full sample_table setup).
- tests/testthat/test-peakCounts-ext.R: seven scenarios exercising the helper directly, including the exact mixed-state bug from #218/#219 (*_peaks_coverage.bed.gz from the initial sample run alongside *_ref_peaks_coverage.bed from the --frip-ref-peaks re-run), the inverse mixed state, multi-genome sample tables, and the no-match / warning fall-through path.
- tests/testthat/test-plotAnno.R: three scenarios for the empty-input fallback from #232: missing input file, empty (size-0) input file, and the full caller pattern (pdf() / print(plotAnno(...)) / dev.off()) producing a non-empty placeholder pdf at the expected target path. Asserts the old bug's spurious file.path(output_pdf, ...) touch target is NOT created, so the fix can't silently regress.
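
The mixed-state scenario from #218/#219 can be sketched in Python to show what detecting extensions independently means. This is an illustration only: the real helper is the R function .detectPeakCoverageExt, and the flat <sample><suffix><ext> file layout used here is an assumption.

```python
from pathlib import Path
from typing import Optional


def detect_coverage_ext(results_dir: str, sample_name: str, suffix: str) -> Optional[str]:
    """Return the extension found on disk for one suffix, or None.

    Each suffix ("_peaks_coverage", "_ref_peaks_coverage") is probed on its
    own, so a .bed.gz regular-peaks file cannot force the ref-peaks lookup
    to expect .bed.gz as well.
    """
    base = Path(results_dir) / f"{sample_name}{suffix}"
    for ext in (".bed.gz", ".bed"):
        if Path(str(base) + ext).exists():
            return ext
    return None
```

Called once per suffix, the helper reports .bed.gz for the regular peaks and .bed for the reference peaks in the mixed state, instead of sharing one answer between them.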
Two distinct breakages in the integration-test runner that both surface immediately when running ./tests/scripts/test-integration.sh against a fresh bulker install:

1. Default crate name didn't match what bulker actually caches. PEPATAC_TEST_BULKER_CRATE defaulted to 'local/bulker_manifest', but bulker auto-namespaces the manifest's name: field and caches as 'bulker/pepatac:1.1.1'. The bulker crate list | grep -q check therefore always failed even right after a successful bulker crate install.
2. PATH extraction via bulker activate --echo was format-fragile and broke on bulker 0.0.15. Switched to bulker exec <crate> -- <cmd>, which lets bulker manage PATH for the duration of one command, with no output parsing at all.

Realigned both scripts to default to 'bulker/pepatac'; rewrote test-integration.sh's pytest invocation and services.sh's check_tools to use bulker exec. Refreshed the README env-var table, services.sh usage banner, and the docstring prereq lines in conftest.py / test_looper_run.py / test_end_to_end.py for consistency.

Regression from f20c354 (clean up integration tests).
Both test-integration.sh's "install if not cached" gate and
services.sh's check_crate_cached probed for crate availability via:
    bulker crate list 2>/dev/null | grep -q
But bulker crate list prints crate name and tag in *separate*
whitespace-delimited columns, so a literal bulker/pepatac:1.1.1 grep
never matches even when the crate is freshly cached. test-integration.sh
silently fell through to "install from local manifest" every run
(harmless, just slow). services.sh hard-errored with "Bulker crate ... is
not cached" immediately after a successful install printed
"Cached: bulker/pepatac:1.1.1", which was the user-visible wedge.
Replaced both grep checks with ,
which directly tests the operation we actually care about. The hint
text in services.sh's error path now points at (the manifest file path that bulker can
actually load) rather than echoing back the cache key, which bulker
would 404 on against hub.bulker.io.
Surfaced after 28edeec (Fix integration-test bulker crate default to
include tag) tightened the default to the full identifier --
that change exposed the previously-masked format mismatch in the
list-grep check.
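
Why the literal grep could never match is easy to show with a small sketch. The sample listing below is hypothetical: it only assumes, as stated above, that name and tag appear in separate whitespace-delimited columns. This illustrates the failure mode; it is not the check the commit adopted, which stopped parsing the listing altogether.

```python
# Hypothetical listing text: name and tag in separate columns.
SAMPLE_LISTING = "bulker/pepatac    1.1.1\nbulker/demo       default\n"


def crate_listed(listing: str, crate_id: str) -> bool:
    """Column-aware match of 'name:tag' (or bare 'name') against a listing."""
    name, _, tag = crate_id.partition(":")
    for line in listing.splitlines():
        fields = line.split()
        if not fields or fields[0] != name:
            continue
        # A bare name matches any tag; otherwise the second column must agree.
        if not tag or (len(fields) > 1 and fields[1] == tag):
            return True
    return False


# The joined identifier never appears verbatim in the listing,
# which is the substring test that grep -q performs.
literal_hit = "bulker/pepatac:1.1.1" in SAMPLE_LISTING
column_hit = crate_listed(SAMPLE_LISTING, "bulker/pepatac:1.1.1")
```

Any parser like this stays coupled to the listing format, which is the same fragility that motivated testing the operation directly instead.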
…lved one

After bulker-side fixes landed (4c56b39, 28edeec, 9123d8c), the integration runner finally reached pytest -- only to die immediately with "No module named pytest". The python3 inside bulker exec resolved to the host's miniforge base install rather than the caller's active conda env (pepatac-env in this case), which is the one that actually has pytest + pypiper + the rest of pepatac's requirements installed.

Cause: python3 is declared as a host_command in tests/bulker_manifest.yaml, so bulker forwards it unmodified to the host. But bulker's PATH ordering inside bulker exec walks the system PATH in whatever order and picks the first python3 it finds -- which on a typical HPC node is miniforge base, not the user's currently activated conda env.

Fix: capture the result of command -v python3 BEFORE entering bulker exec, then pass that absolute path through to the exec'd command. Python locates its own site-packages from sys.prefix based on executable path, so the conda-env python keeps its installed packages regardless of how it's invoked.

Also adds an early-exit ERROR if no python3 is on PATH at all (otherwise the failure mode is a confusing "No module named pytest" inside bulker exec several seconds later), and prints the resolved python path in the runner banner so it's clear which interpreter the tests are running under.
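
The claim that an interpreter's environment follows its executable path can be checked with a short sketch. The helper name prefix_of is hypothetical; it simply asks a given interpreter, invoked by absolute path, to report its own sys.prefix.

```python
import subprocess
import sys


def prefix_of(python_path: str) -> str:
    """Ask the interpreter at python_path which prefix it resolved for itself."""
    completed = subprocess.run(
        [python_path, "-c", "import sys; print(sys.prefix)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# Example: the interpreter running this snippet, addressed by absolute path.
current_prefix = prefix_of(sys.executable)
```

Running prefix_of on a conda env's bin/python3 and on a base install should print two different prefixes, which is why passing the absolute path into bulker exec is enough to keep the env's packages visible.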
Commit 598aa3b captured command -v python3 before entering bulker exec, on the assumption that the caller's active env would be first on PATH. That assumption holds on developer machines where conda activate is the last PATH-mutating step. On HPC nodes where a module load miniforge fires AFTER conda activate pepatac-env, the module's PATH prepend buries the conda env's bin/ behind miniforge base -- command -v python3 returns miniforge (no pytest installed) even though the prompt says (pepatac-env) and CONDA_PREFIX points at the env.

Rather than picking a single env var or PATH lookup as authoritative, walk a small candidate list and pick the FIRST python that actually imports pytest:

    PYTHON_CANDIDATES=()
    [ -n /home/jps3dp/anaconda3 ] && PYTHON_CANDIDATES+=(/home/jps3dp/anaconda3/bin/python3)
    [ -n ] && PYTHON_CANDIDATES+=(/bin/python3)
    PYTHON_CANDIDATES+=(/home/jps3dp/anaconda3/bin/python3)
    PYTHON_CANDIDATES+=(/home/jps3dp/anaconda3/bin/python)
    for candidate in "${PYTHON_CANDIDATES[@]}"; do
        [ -x "$candidate" ] || continue
        if "$candidate" -c "import pytest" >/dev/null 2>&1; then
            ACTIVE_PYTHON="$candidate"
            break
        fi
    done

If nothing in the list imports pytest, the script errors out with the exact list of candidates that were tried -- a clearer failure than the previous "No module named pytest" buried inside bulker exec output.
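
The same candidate walk can be sketched in Python, which makes the selection rule easy to test in isolation. The function name and the generic module parameter are assumptions for illustration; the runner itself implements this in shell and checks for pytest specifically.

```python
import os
import subprocess
from typing import Iterable, Optional


def first_python_importing(candidates: Iterable[str], module: str) -> Optional[str]:
    """Return the first executable candidate that can import `module`, else None."""
    for candidate in candidates:
        # Skip empty entries and paths that are missing or not executable.
        if not candidate or not os.access(candidate, os.X_OK):
            continue
        try:
            result = subprocess.run(
                [candidate, "-c", f"import {module}"],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # For example, the path is a directory or not a valid executable.
            continue
        if result.returncode == 0:
            return candidate
    return None
```

Testing the import, rather than trusting PATH order or a single environment variable, picks an interpreter by the property the runner actually needs.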
- tools/pepatac_summarizer/: Python package with CLI, consensus peak calling via gtars, peak counts, and summary plots
- pipelines/pepatac_collator.py: --summarizer python|R dispatch, defaults to python
- Remove obsolete PEPATACr R tests
- tests/test_summarizer.py: unit tests
- tests/test_summarizer_integration.py: integration tests
- Add plot_tss_distance using TssIndex.from_regionset.calc_tss_distances; wire into pepatac.py anno block (replaces R placeholder)
- Fix plot_frif to sum read counts from bedtools coverage outputs
- Reorder plot_partition_distribution to horizontal stacked bar with inline percent labels; add natural chrom sort + canonical chrom filter
- Add fragment-distribution median; add chrom/tssdist/part/frif CLI subcommands
- Align PartitionList.from_gtf defaults to R's GenomicDistributions: core_prom=100, prox_prom=2000 (was 2000/10000)
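
A natural chromosome sort with a canonical filter might look like the sketch below. The canonical pattern (chr followed by a number, X, Y, or M) and the X, Y, M ordering are assumptions for illustration; the commit does not state the exact rules used in pepatac_summarizer.

```python
import re
from typing import List, Tuple

# Assumed definition of "canonical": chr1..chrN, chrX, chrY, chrM.
_CANONICAL = re.compile(r"^chr(\d+|X|Y|M)$")
_NON_NUMERIC_ORDER = {"X": 0, "Y": 1, "M": 2}


def is_canonical(chrom: str) -> bool:
    """True for primary chromosomes, False for scaffolds, alts, and randoms."""
    return bool(_CANONICAL.match(chrom))


def natural_chrom_key(chrom: str) -> Tuple[int, int, str]:
    """Sort key: numbered chromosomes by value, then X, Y, M, then the rest."""
    name = chrom[3:] if chrom.startswith("chr") else chrom
    if name.isdigit():
        return (0, int(name), "")
    return (1, _NON_NUMERIC_ORDER.get(name, len(_NON_NUMERIC_ORDER)), name)


def canonical_sorted(chroms: List[str]) -> List[str]:
    """Drop non-canonical names, then sort naturally."""
    return sorted((c for c in chroms if is_canonical(c)), key=natural_chrom_key)
```

A plain string sort would place chr10 before chr2; converting the numeric part to an integer is what gives the natural order.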
…olves during runp.
…bulker syntax; checkinstall fixes
…original hard-coded for redundancy and backwards compatibility
Release 0.14.0. See docs/changelog.md for full Added/Changed/Fixed/Removed entries.