LPC
Access LPC
To gain access, send an email to PMACS-SYS-SCI@lists.upenn.edu asking the admins to grant you access to the queue. Include in the email that the queue is "barash" and the PMACS username that should be given access. Yoseph will need to confirm authorization, so make sure to CC him on this email.
Log Into Submit Host
Once access has been given, ssh into scisub.pmacs.upenn.edu. This host is publicly reachable and doesn't require a VPN connection.
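For example, from your own machine (replace <pmacs_username> with your PMACS username):
ssh <pmacs_username>@scisub.pmacs.upenn.edu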
Example MAJIQ Job
There is a simple MAJIQ job example attached to this wiki. It uses the data from the workshop example. It only runs a build, but this should be enough to get you started.
Download the "majiq_test.sh" script attached to this wiki and move it to scisub.
Prepare the data needed to complete the job, as described at the beginning of the script.
Submit the job by running the following command: bsub < majiq_test.sh
To check the status, run bjobs. Once the job has finished, you should find the majiq_build directory along with the "out" and "error" files.
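To give a rough idea of what such an LSF submission script looks like, here is a minimal sketch. The attached majiq_test.sh is the authoritative version; the annotation, settings file, and build options below are placeholders and the exact majiq options depend on the installed version.
#!/bin/bash
#BSUB -q barash                # submit to the barash queue
#BSUB -n 4                     # request 4 cores
#BSUB -R "span[ptile=4]"       # keep the cores on one machine
#BSUB -M 10240                 # ask for 10 GB of RAM
#BSUB -o out                   # stdout file ("out" from the steps above)
#BSUB -e error                 # stderr file ("error" from the steps above)

module add python/3.6.1

# Placeholder build command; the real inputs and options are defined in majiq_test.sh.
~/.local/bin/majiq build annotation.gff3 -c settings.ini -j 4 -o majiq_build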
Non-Release MAJIQ
Install
ssh into sciget.pmacs.upenn.edu
run: module add git python/3.6.1
run: pip3 install --user cython numpy pysam -U
run: pip3 install --user git+https://bitbucket.org/biociphers/majiq.git#majiq -U
Running
run: module add python/3.6.1
run: ~/.local/bin/majiq -v
Scratch Directory
Each execute node will have a scratch (/scratch) directory that is available for large jobs. We've found that it is more efficient to move your files to the scratch directory, work from there, and then remove them when done. It might be a good idea to write your output files to this directory and then move them to their final location once the job has finished.
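A rough sketch of that pattern inside a job script (the project paths and file names below are placeholders):
# Stage inputs into a per-job scratch area ($LSB_JOBID is set by LSF).
mkdir -p /scratch/$USER/$LSB_JOBID
cp /project/barash_hdr1/my_data/sample.bam /scratch/$USER/$LSB_JOBID/
cd /scratch/$USER/$LSB_JOBID

# ... run the job here, writing output into the scratch directory ...

# Move results to their final location and clean up scratch.
mv results /project/barash_hdr1/my_data/
rm -rf /scratch/$USER/$LSB_JOBID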
Transferring Files
scisub.pmacs.upenn.edu - SSH only. Submit host. Check and submit jobs.
transfer.pmacs.upenn.edu - SFTP/RSYNC/SCP only. No ssh available.
sciget.pmacs.upenn.edu - SSH but also has outbound network allowed. Used to download to the LPC (wget, subversion, git, etc.)
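For example, to copy a directory from your local machine to project storage (the username and destination path are placeholders):
rsync -avP my_data/ <pmacs_username>@transfer.pmacs.upenn.edu:/project/barash_hdr1/my_data/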
Common Queue Commands
Check the LSF documentation for further information.
bjobs - Check job statuses
ibash - Log onto a worker node
Questions or Concerns?
Submit a ticket at https://helpdesk.pmacs.upenn.edu under the "Systems" queue or email PMACS-SYS-SCI@lists.upenn.edu with all your LPC questions.
Storage
A Warning
There is limited space. I know it seems like a lot, but our tape isn't infinite. Know how much space is available within the directory you're working in.
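For example, to check how much space is left on a project filesystem and how much your own directory is using (paths are placeholders):
df -h /project/barash_hdr1
du -sh /project/barash_hdr1/<your_dir>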
Where to Store Data (AKA The Capitalistic model)
Home Directory - /home/<username> or ~/
For small or sensitive jobs, default to using your home directory. This will be the only space where you have control over the files.
Unbacked Up Directories (15TB each) - /project/barash_hdr{1,2,3}/
General Use.
Backed Up Directories (7TB each) - /project/barash_hdb{1,2}/
For files we don't want to lose because we can't easily recover them (like FASTQ files from GEO), data that is shared, and data that is important for reproducing papers.
Finding Data
If you can't remember where you put your work, here's a helpful command to find your way back:
find /project/barash_hd* -name <file_name>
Available Barash Lab Hosts
Run the following to see which hosts are available for Barash Lab:
bhosts barash_exe
As of Oct 22, 2019, available machines include:
bilbringi, chimaera, eradicator, mamut
Useful Commands
To help mitigate the issue of single users monopolizing nodes, include the following arguments in every job submission:
bsub -n 7 -R "span[ptile=7]" < job.sh
For example, if you are running STAR with 7 threads, the -n 7 in the bsub command ensures that you get 7 cores for the job, and -R "span[ptile=7]" ensures that those 7 cores are on the same machine. In most cases (for example, when running STAR or MAJIQ) you do need the cores on the same machine. The general rule is that the number of threads required should equal the number of cores requested. Each machine on the LPC has 15 cores, so don't request more than 7 unless you absolutely need to.
To increase memory, use the -M option. For example, the following option in your bsub command asks LSF for 10 GB of RAM for the job:
-M 10240
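These options can also go in the job script itself as #BSUB directives and are picked up when you run bsub < job.sh. A minimal sketch combining them (the STAR command, paths, and file names are placeholders):
#!/bin/bash
#BSUB -q barash
#BSUB -n 7
#BSUB -R "span[ptile=7]"
#BSUB -M 10240
#BSUB -o star.out
#BSUB -e star.err

# Placeholder STAR command using the same number of threads as cores requested.
STAR --runThreadN 7 --genomeDir /project/barash_hdr1/genome_index \
     --readFilesIn sample_R1.fastq.gz sample_R2.fastq.gz \
     --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate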
Interactive mode on a specific machine:
bsub -Is -m "mamut" -q barash_interactive 'bash'
Don't use the interactive bash if you need to run computationally heavy jobs; use the normal bsub command instead. LSF doesn't keep track of the resources used by an ibash session, so you can cause the machine to crash or other jobs to fail.
Check hosts statuses:
bhosts barash_exe
Additional Reading
https://wiki.pmacs.upenn.edu/pub/LPC
https://www.ibm.com/support/knowledgecenter/en/SSETD4_9.1.3/lsf_kc_cmd_ref.html