Description
I am processing StereoSeq nanopore raw data with Spl-IsoQuant v2.2.0 and ran into severe memory exhaustion during the barcode calling step. Barcode indexing completes, but the process then consumes a massive amount of memory (800 GB so far and still increasing), which makes it impossible to proceed with my current computing resources.
Key Details (Logs + Environment)
Critical runtime logs (process stuck after barcode indexing):
2025-12-16 06:00:48,194 - INFO - Running Spl-IsoQuant version 2.2.0
2025-12-16 06:00:48,196 - WARNING - Output folder already exists, some files may be overwritten.
2025-12-16 06:00:48,274 - INFO - Novel unspliced transcripts will not be reported, set --report_novel_unspliced true to discover them
2025-12-16 06:00:48,274 - INFO - === Spl-IsoQuant pipeline started ===
2025-12-16 06:00:48,274 - INFO - Python version: 3.11.14 | packaged by conda-forge | (main, Oct 22 2025, 22:46:25) [GCC 14.3.0]
2025-12-16 06:00:48,274 - INFO - gffutils version: 0.13
2025-12-16 06:00:48,274 - INFO - pysam version: 0.23.3
2025-12-16 06:00:48,274 - INFO - pyfaidx version: 0.9.0.3
2025-12-16 06:00:48,298 - INFO - Detecting barcodes
2025-12-16 06:00:48,326 - INFO - Processing /xxxx/sample1.fq.gz
2025-12-16 06:00:48,340 - INFO - Using barcodes from /xxxx/barcodelist
2025-12-16 10:50:24,536 - INFO - Indexed 428882153 barcodes
2025-12-16 10:50:33,129 - INFO - Barcode caller created
After the "Barcode caller created" log line, the process does not proceed further, and memory usage climbs continuously past 800 GB (no OOM error yet, but my node's memory is nearly exhausted).
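For anyone who wants to reproduce the memory curve, here is a minimal sketch of how the resident memory of the running process could be sampled on Linux. It is not part of Spl-IsoQuant; the helper names `parse_vmrss_kib` and `read_rss_gib` are made up for illustration, and it assumes a Linux node where `/proc/<pid>/status` exposes a `VmRSS` line.

```python
import re
from typing import Optional


def parse_vmrss_kib(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_gib(pid: int) -> Optional[float]:
    """Return the resident set size of a process in GiB (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        kib = parse_vmrss_kib(fh.read())
    # 1 GiB = 2**20 KiB
    return None if kib is None else kib / 2**20
```

Calling `read_rss_gib(pid)` in a loop with `time.sleep(60)` between samples, where `pid` is the Spl-IsoQuant process ID, would give a timestamped record of how fast memory grows after the "Barcode caller created" line.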
Questions:
Are there Spl-IsoQuant-specific parameters to optimize memory usage for massive barcode sets (e.g., chunked barcode processing, lightweight indexing, or barcode filtering)?
Is there a way to preprocess the 428M barcode list to reduce memory load without losing valid barcodes for StereoSeq data?
What is the recommended memory configuration for running Spl-IsoQuant with a ~400M barcode library (StereoSeq full-chip data)?
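To give a sense of scale for the questions above, here is a back-of-envelope sketch comparing the raw storage cost of the barcode list as Python strings against a 2-bit-per-base packed integer form. It says nothing about how Spl-IsoQuant actually stores or indexes barcodes internally. The 25-nt barcode length is an assumption about my StereoSeq barcodes, the example sequence is made up, and `encode_2bit` and `estimate_gib` are hypothetical helper names.

```python
import sys

N_BARCODES = 428_882_153  # from the "Indexed 428882153 barcodes" log line
BARCODE_LEN = 25          # assumption: 25-nt barcodes

_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_2bit(barcode: str) -> int:
    """Pack an ACGT barcode into an integer using 2 bits per base."""
    value = 0
    for base in barcode:
        value = (value << 2) | _CODE[base]
    return value


def estimate_gib(n_items: int, bytes_per_item: int) -> float:
    """Raw payload size in GiB, ignoring container overhead."""
    return n_items * bytes_per_item / 2**30


# Hypothetical 25-nt barcode, used only to measure per-object size.
example = ("ACGT" * 7)[:BARCODE_LEN]

# Size of one Python str object; the exact value depends on the CPython version.
str_bytes = sys.getsizeof(example)
# A 25-nt barcode needs 50 bits, so it fits in one 64-bit integer.
packed_bytes = 8

print(f"as Python str : {estimate_gib(N_BARCODES, str_bytes):.1f} GiB (objects only)")
print(f"as packed u64 : {estimate_gib(N_BARCODES, packed_bytes):.1f} GiB")
```

The packed form comes to roughly 3.2 GiB for the whole list, and the string form is several times larger before any set or dict overhead. Neither comes close to 800 GB, which suggests the growth is in whatever index is built on top of the barcodes rather than in the barcode list itself.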