HPC
Note: As of January 2023, the PMACS HPC is generally only used for burst compute in special circumstances. Please contact Yoseph before using this system.
Access HPC
I am still working on this page; it is not complete yet, but a lot of HPC questions have been coming up, so I wanted to at least start the effort.
Log Into Submit Host
The submit host is consign.pmacs.upenn.edu (SSH only; see Transferring Files below for the full host list).
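A minimal login sketch, assuming a standard PMACS account (replace <username> with your own):
ssh <username>@consign.pmacs.upenn.edu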
Example MAJIQ Job
A minimal sketch, assuming MAJIQ is installed (see the install steps below) and that a GFF3 annotation and a MAJIQ settings file already exist; the majiq build invocation is illustrative, so check the flags against your installed version.
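majiq build annotation.gff3 -c settings.ini -j 7 -o majiq_build   # hypothetical input files and output directory
Save that line as, say, majiq_job.sh and submit it with the bsub pattern from the Running section:
bsub -J majiq_build -e err -o out -n 7 -M 5000 -q normal -R "span[hosts=1] rusage[mem=5000]" bash majiq_job.sh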
Non-Release MAJIQ
Install
ssh into mercury.pmacs.upenn.edu
run: module add git python/3.6.1
run: pip3 install --user cython numpy pysam -U
run: pip3 install --user git+https://bitbucket.org/biociphers/majiq.git#majiq -U
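pip's --user install puts executables in ~/.local/bin, which may not be on your PATH by default. Assuming the package registers itself under the name majiq, something like this confirms the install:
export PATH=$HOME/.local/bin:$PATH   # default location for pip --user scripts
pip3 show majiq                      # prints version and install location if the build succeeded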
Running
bsub -J <job_name> -e <err_file> -o <stdout_file> -n <cores> -M <memory_mb> -q <queue> -R "span[hosts=1] rusage[mem=<memory_mb>]" bash <your_script.sh>
(Note: in LSF, -J sets the job name; -N only requests an email report when the job finishes.)
E.g. `bsub -J calebsjob -e err -o out -n 7 -M 5000 -q normal -R "span[hosts=1] rusage[mem=5000]" bash caleb.sh`
Static Directory
tbd
Transferring Files
consign.pmacs.upenn.edu - SSH only. Submit host. Check and submit jobs.
mercury.pmacs.upenn.edu - SSH, with outbound network access allowed. Used to download data onto the HPC (wget, subversion, git, etc.)
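For example, to copy files between your machine and the cluster, a standard scp invocation against either host should work (file names here are illustrative):
scp data.tar.gz <username>@mercury.pmacs.upenn.edu:~/     # local -> HPC
scp <username>@consign.pmacs.upenn.edu:~/results.txt .    # HPC -> local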
Common Queue Commands
Check the LSF documentation for further information.
bjobs - Check job statuses
ibash - log onto a worker node
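For example (the bjobs flags are standard LSF options):
bjobs          # your own pending and running jobs
bjobs -u all   # jobs from all users
ibash          # open an interactive shell on a worker node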
Questions or Concerns?
Submit a ticket at https://helpdesk.pmacs.upenn.edu under the "Systems" queue or email PMACS-SYS-SCI@lists.upenn.edu with all your LPC questions.
Storage
A Warning
Storage is expensive; keep the total data owned by your user account at or below 1TB.
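To check how much data your account currently owns, a plain du works (it can be slow on large directory trees):
du -sh ~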
Where to Store Data (AKA The Capitalistic model)
Home Directory - /home/<username> or ~/
For small or sensitive jobs, default to your home directory; it is the only space where you alone control the files.
Non-Backed-Up Directories (15TB each) - /project/yosephblab/
General use. Note that files here are not backed up.
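To see how much of the shared space is in use, df on the project directory is the quickest check (assuming it is its own mount):
df -h /project/yosephblab/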
Finding Data
tbd
Available Barash Lab Hosts
tbd
Useful Commands
To help mitigate issues where single users take over whole nodes, we'll need to include the following arguments in every job submission:
bsub -n 7 -R "span[ptile=7]" < job.sh
For example, if you are running STAR with 7 threads, the `-n 7` in the bsub command ensures that you get 7 cores for the job, and `-R "span[ptile=7]"` ensures that those 7 cores are on the same machine. In most cases (for example, running STAR or MAJIQ) you do need the cores on the same machine. The general rule is that the number of threads required == the number of cores requested. Each machine on the LPC has 15 cores, so don't request more than 7 cores unless you absolutely need to.
To increase memory you can use the `-M` option. For example, the following option in your bsub command asks LSF for 10 GB of RAM (specified in MB) for the job:
-M 10240
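Putting the pieces together, a submission asking for 7 cores on one machine and 10 GB of RAM might look like this (job.sh is a placeholder for your own script):
bsub -n 7 -M 10240 -R "span[ptile=7] rusage[mem=10240]" -o out -e err < job.sh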
Interactive mode on specific machine:
bsub -Is -m "mamut" -q barash_interactive 'bash'
Don't use the interactive bash if you need to run computationally heavy jobs; use the normal bsub command instead. LSF doesn't keep track of the resources used by an ibash session, so you can cause the machine to crash or other jobs to fail.
Check host statuses:
bhosts barash_exe
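Two related standard LSF commands that can help when choosing where to run:
lsload   # per-host load averages, memory, and swap
bhosts   # status and job slots for all hosts you can see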