
Out of memory with high number of genes #41

@Nathaniel-github


Copy-pasting our discussion from Slack:

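When I remove that initial filtering step, the pf2_nd function errors out in the init step with an out-of-memory error. I'll copy-paste the full trace, but the allocation behind it is cov_matrix = cp.zeros((n_genes, n_genes), dtype=cp.float64), which makes sense (34k x 34k). The traceback itself stops one step later, at the in-place update cov_matrix += total_rows * cp.outer(means, means).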
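As a rough check on the 34k x 34k estimate, the arithmetic below uses the gene count from the log (33,538) and float64 storage. The 512-byte rounding is an assumption about how the CuPy memory pool sizes its allocations; with it, the result matches the byte count in the error message.

```python
# Gene count taken from "Original data shape: (63103, 33538)" in the log.
n_genes = 33_538
bytes_per_float64 = 8

# One dense (n_genes, n_genes) float64 array.
dense_bytes = n_genes * n_genes * bytes_per_float64

# Assumption: the memory pool rounds each request up to a multiple of 512 bytes.
pool_granularity = 512
rounded_bytes = -(-dense_bytes // pool_granularity) * pool_granularity

print(f"{dense_bytes:,} bytes raw")            # 8,998,379,552 bytes raw
print(f"{rounded_bytes:,} bytes after rounding")  # 8,998,380,032 bytes after rounding
print(f"{rounded_bytes / 2**30:.2f} GiB per array")
```

If that reading is right, cov_matrix, the cp.outer(means, means) result, and the scaled product are each about 9 GB, so the update expression would want roughly 27 GB of full-size arrays at once on a 24 GB card.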

(.venv) (cellcommunicationpf2) nthomas@aretha:~/cellcommunication-Pf2$ nvidia-smi
Wed Jul 23 12:41:14 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08              Driver Version: 575.57.08      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:02:00.0 Off |                  Off |
|  0%   36C    P8             11W /  450W |       1MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA TITAN Xp                Off |   00000000:21:00.0  On |                  N/A |
| 23%   35C    P8             10W /  250W |      11MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
(.venv) (cellcommunicationpf2) nthomas@aretha:~/cellcommunication-Pf2$ make output/figure1.svg
rye run fbuild 1
Importing data...
Loading BALF COVID data (preserving sparse format)...
Data loaded as sparse matrix: csr_matrix
Non-zero elements: 115,815,526 (5.47%)
Zero elements: 2,000,532,888 (94.53%)
Sparsity: 94.53%
Loading ligand-receptor pairs data...
Cached 2005 ligand-receptor pairs
Original data shape: (63103, 33538)
Filtering data...
Filtered data shape: (63103, 33538)
Running CC-PF2 with rank=10 and cp_rank=10...
Traceback (most recent call last):
  File "/home/nthomas/cellcommunication-Pf2/.venv/bin/fbuild", line 8, in <module>
    sys.exit(genFigure())
             ^^^^^^^^^^^
  File "/home/nthomas/cellcommunication-Pf2/cellcommunicationpf2/figures/common.py", line 76, in genFigure
    ff = makeFigure()
         ^^^^^^^^^^^^
  File "/home/nthomas/cellcommunication-Pf2/cellcommunicationpf2/figures/figure1.py", line 66, in makeFigure
    adata_filtered, r2x = run_cc_pf2_workflow(
                          ^^^^^^^^^^^^^^^^^^^^
  File "/home/nthomas/cellcommunication-Pf2/cellcommunicationpf2/utils.py", line 114, in run_cc_pf2_workflow
    results, r2x = cc_pf2(
                   ^^^^^^^
  File "/home/nthomas/cellcommunication-Pf2/cellcommunicationpf2/cc_pf2.py", line 115, in cc_pf2
    pf2_output, pf2_r2x = parafac2_nd(
                          ^^^^^^^^^^^^
  File "/home/nthomas/cellcommunication-Pf2/.venv/lib/python3.11/site-packages/parafac2/parafac2.py", line 112, in parafac2_nd
    factors, norm_tensor = parafac2_init(X_list, means, rank, random_state)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nthomas/cellcommunication-Pf2/.venv/lib/python3.11/site-packages/parafac2/parafac2.py", line 71, in parafac2_init
    cov_matrix += total_rows * cp.outer(means, means)
                  ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
  File "cupy/_core/core.pyx", line 1334, in cupy._core.core._ndarray_base.__mul__
  File "cupy/_core/_kernel.pyx", line 1349, in cupy._core._kernel.ufunc.__call__
  File "cupy/_core/_kernel.pyx", line 645, in cupy._core._kernel._get_out_args_from_optionals
  File "cupy/_core/core.pyx", line 2884, in cupy._core.core._ndarray_init
  File "cupy/_core/core.pyx", line 257, in cupy._core.core._ndarray_base._init_fast
  File "cupy/cuda/memory.pyx", line 738, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1424, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1445, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1116, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1137, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 1382, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
  File "cupy/cuda/memory.pyx", line 1385, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 8,998,380,032 bytes (allocated so far: 24,350,057,984 bytes).
make: *** [makefile:10: output/figure1.svg] Error 1
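
One possible way to shrink the peak, sketched here under assumptions rather than taken from the parafac2 package: apply the rank-one update from the traceback one row block at a time, so the temporaries are (block_rows, n_genes) instead of (n_genes, n_genes). The names add_scaled_outer_inplace and block_rows are hypothetical. NumPy stands in for CuPy so the sketch runs without a GPU; the calls used here (outer, slicing, in-place operators) also exist in CuPy, so swapping np for cp should behave the same way, though that has not been tested on this dataset.

```python
import numpy as np  # stand-in for CuPy in this sketch


def add_scaled_outer_inplace(cov_matrix, means, scale, block_rows=1024):
    """Add scale * outer(means, means) to cov_matrix in place, block by block.

    Hypothetical helper: peak extra memory is one (block_rows, n_genes)
    temporary instead of full (n_genes, n_genes) temporaries.
    """
    n_genes = means.shape[0]
    for start in range(0, n_genes, block_rows):
        stop = min(start + block_rows, n_genes)
        # Rows start..stop of the rank-one term, shape (stop - start, n_genes).
        block = np.outer(means[start:stop], means)
        block *= scale                    # scale in place, no second temporary
        cov_matrix[start:stop] += block   # in-place add into a view of cov_matrix
    return cov_matrix


# Small self-check against the one-shot expression from the traceback.
rng = np.random.default_rng(0)
means = rng.random(50)
total_rows = 63_103
expected = np.zeros((50, 50)) + total_rows * np.outer(means, means)
result = add_scaled_outer_inplace(np.zeros((50, 50)), means, total_rows, block_rows=16)
assert np.allclose(result, expected)
```

This only removes the full-size temporaries. The dense cov_matrix itself would still take about 9 GB at 33,538 genes, so whether it fits depends on what else is resident on the GPU at that point.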

Labels: bug (Something isn't working)
