Network / SSH
You will often find that a system you are using blocks ports you need to access. This is common on HPC systems, cloud systems, and systems behind VPNs, where the system administrator has blocked ports that your work depends on. You are unlikely to get a timely fix (if any) from the administrator. Luckily, a few tricks can work around the problem.
Accessing blocked port 22 for repositories
On some systems, outbound port 22 (SSH) is blocked. This port is used for our Bitbucket repositories as well as other hosts (e.g. GitHub and GitLab), so the block prevents you from cloning from and pushing to them. Luckily, these services often provide alternate hosts that accept SSH on port 443 instead of port 22. As 443 is the HTTPS (web) port, it is almost never blocked.
Bitbucket (https://confluence.atlassian.com/bbkb/port-22-is-blocked-on-local-network-1168865232.html):
git clone ssh://git@altssh.bitbucket.org:443/<Workspace>/<repo_name>/
GitHub (https://docs.github.com/en/authentication/troubleshooting-ssh/using-ssh-over-the-https-port):
git clone ssh://git@ssh.github.com:443/YOUR-USERNAME/YOUR-REPOSITORY.git
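If a repository was already cloned over port 22, its remote can be pointed at the port-443 host instead of cloning again. The following is a minimal sketch, assuming the existing remote uses the scp-style form (git@github.com:OWNER/REPO.git). The helper name to_port_443_url is hypothetical; the host names are the ones from the two pages linked above.

```shell
# Hypothetical helper: map an scp-style SSH remote to its port-443 equivalent.
# Remotes that match neither host are printed unchanged.
to_port_443_url() {
    case "$1" in
        git@github.com:*)
            printf 'ssh://git@ssh.github.com:443/%s\n' "${1#git@github.com:}" ;;
        git@bitbucket.org:*)
            printf 'ssh://git@altssh.bitbucket.org:443/%s\n' "${1#git@bitbucket.org:}" ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}

# Usage, from inside an existing clone whose remote is named "origin":
#   git remote set-url origin "$(to_port_443_url "$(git remote get-url origin)")"
```

After the remote is changed, git pull and git push should go over port 443 without further changes.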
SSH Tunnels
In some cases, remote systems are locked down and block every port outside of a whitelist. For example, when accessing the Jordi system from some locations or networks, only SSH (port 22) may be allowed. Meanwhile, you may want to access a service on another port: you might want to run a Jupyter notebook to work with through your browser, or an instance of the voila tool to view the results of your RNAseq analysis. By default, these tools choose a port themselves and display it when you launch them; you can also set it explicitly. For example, if you launch voila on jordi1 with the command below, it will run on port 5005, so navigating to http://jordi1:5005 in your local web browser would show voila.
voila view my_build_files -p 5005 --host 0.0.0.0
(Note that the switch --host 0.0.0.0 is needed for the voila instance to listen for requests from outside the machine on which it is running; here 0.0.0.0 effectively means "listen on all addresses".)
But what if port 5005 is blocked? When this happens, the website will appear to hang, and you will need an SSH tunnel to work around this.
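Before setting up a tunnel, it can help to confirm that the port really is unreachable, rather than the service simply not running. The following is a minimal sketch, assuming bash and the coreutils timeout command are available on your local machine. The helper name port_open is hypothetical.

```shell
# Hypothetical helper: succeed only if a TCP connection to HOST:PORT opens within 5 seconds.
# Relies on bash's /dev/tcp redirection and on coreutils `timeout` (both assumed installed).
port_open() {
    timeout 5 bash -c 'exec 3<>/dev/tcp/$1/$2' _ "$1" "$2" 2>/dev/null
}

# Usage, from your local machine:
#   port_open jordi1 5005 && echo "reachable" || echo "blocked or nothing listening"
```

A failure here does not tell a firewall apart from a service that is not running, so it is worth checking on the remote machine that the tool actually started.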
An SSH tunnel is a feature of the SSH protocol that lets you forward local and remote ports to re-route network traffic. In the situation above, you can use an SSH tunnel to capture port 5005 traffic on Jordi, carry it over SSH (port 22), and expose it on a port on your local machine. A command like the following sets up such a tunnel:
ssh -N -L 5005:jordi1:5005 username@jordi1
This command, if working correctly, will keep running and display nothing. Now, to see voila, instead of going to http://jordi1:5005 in your browser, go to http://localhost:5005, as the tunnel effectively makes your local machine the web server.
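If you would rather not keep a terminal occupied, the same tunnel can be sent to the background. The following is a sketch under the assumption that you use OpenSSH. The helper name open_tunnel is hypothetical, and username is a placeholder as in the command above.

```shell
# Hypothetical helper: forward local PORT to PORT on HOST, then return to the prompt.
#   -f  go to the background once the connection is authenticated
#   -N  do not run a remote command, only forward ports
#   -o ExitOnForwardFailure=yes  fail right away if the local port is already in use
open_tunnel() {
    port="$1"; host="$2"; user="$3"
    ssh -f -N -o ExitOnForwardFailure=yes -L "${port}:${host}:${port}" "${user}@${host}"
}

# Usage:
#   open_tunnel 5005 jordi1 username
#   then browse to http://localhost:5005
# Close the tunnel later with:
#   pkill -f "5005:jordi1:5005"
```

With ExitOnForwardFailure set, a port clash shows up as an immediate error instead of a tunnel that silently forwards nothing.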
Multiple SSH tunnels when both machines are behind firewalls
The following guide was originally written by Joseph Aicher. In brief, here is how it is done from Jordi to CHOP:
On jordi:
ssh -N -R 5005:localhost:22 transfer@transfer.biociphers.org -o ServerAliveInterval=40
On chop:
ssh -N -L 5005:localhost:5005 transfer@transfer.biociphers.org -o ServerAliveInterval=40
then
rsync -az --info=progress2 -e 'ssh -p 5005' /path/on/chop <your username on jordi>@localhost:/path/on/jordi
or
rsync -az --info=progress2 -e 'ssh -p 5005' <your username on jordi>@localhost:/path/on/jordi /path/on/chop
(You will of course need to change your username, and possibly the port number from 5005 to something else if it is already in use. If you are using private key authentication (the default) on jordi, you will need to copy your private key to chop so that you can authenticate to jordi from chop.)
The two tunnel commands (ssh -R and ssh -L) can be left running indefinitely, though after a very long time they may lose connection and need to be run again. Keep them alive in screen sessions.
In order to use the transfer@transfer.biociphers.org server, you will need the password, which the system administrator can provide.
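Because the tunnels can drop after a long time, one option is to wrap each tunnel command in a small retry loop inside its screen session. This is only a sketch: the helper name run_with_retries and the RETRY_DELAY variable are hypothetical, and with password authentication each reconnect will prompt for the password again.

```shell
# Hypothetical helper: run a command, re-running it up to MAX times if it exits non-zero.
# RETRY_DELAY (seconds to wait between attempts) defaults to 10.
run_with_retries() {
    max="$1"; shift
    attempt=1
    while :; do
        "$@" && return 0
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-10}"
    done
}

# Usage on jordi, inside a screen session:
#   run_with_retries 100 ssh -N -o ServerAliveInterval=40 -R 5005:localhost:22 transfer@transfer.biociphers.org
```

The attempt limit keeps a permanently failing tunnel (for example, a changed password) from retrying forever.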
Full guide by Joseph:
We often need to transfer files from one server (or cluster) to another. Sometimes, these servers are inaccessible to each other (e.g. one is only available on the PMACS network, the other is only available on the CHOP network).
In these cases, it is recommended that you talk with IT to arrange transfer of the data following all appropriate protocols, rather than circumventing their firewalls with the following procedure (if applicable):
Scenario:
hostA (e.g. CHOP cluster) has data we need to move/copy to ...
hostB (e.g. jordi cluster) is where we want to move/copy the data
hostA / hostB are inaccessible from each other
hostShared (e.g. PMACS LPC, oracle HPC) is accessible from both hostA and hostB
i.e. from hostA and from hostB, we can connect to hostShared
generally, we cannot connect from hostShared to hostA or to hostB
It is possible to set up forward and reverse port forwarding through hostShared to enable SSH from hostA to hostB:
Set up reverse tunnel from hostB (port 22, the default for ssh) to hostShared (port $TUNNEL_PORT):
# *ON hostB, in a terminal multiplexer that can be detached, e.g. tmux/screen*
# -N: no shell for commands, just listen for port forwarding
# -v: verbose, to see information about connections being opened/closed instead of no output
# -R {remote_port}:{host}:{host_port} : reverse tunnel; connections to remote_port on hostShared are forwarded to host:host_port as seen from hostB
ssh -Nv -R ${TUNNEL_PORT}:localhost:22 hostShared
Set up forward tunnel from hostA (port $TUNNEL_PORT) to hostShared (port $TUNNEL_PORT):
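# *ON hostA, in a terminal multiplexer that can be detached, e.g. tmux/screen*
# -Nv as before
# -L {local_port}:{host}:{host_port} : forward tunnel; connections to local_port on hostA are forwarded to host:host_port as seen from hostShared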
ssh -Nv -L ${TUNNEL_PORT}:localhost:${TUNNEL_PORT} hostShared
While these commands are active/haven't timed-out, it is possible to ssh from hostA to hostB:
# *ON hostA*
# -p {port} : use this port for SSH connection
ssh -p ${TUNNEL_PORT} localhost
# one can configure alternative ports in scp and rsync (see man pages or edit .ssh/config)
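As a sketch of the .ssh/config route mentioned above, the stanza below gives the tunnel its own host alias so that ssh, scp, and rsync can all use it by name. The helper name tunnel_host_stanza, the alias hostB-tunnel, and the username are hypothetical placeholders; HostName, Port, User, and HostKeyAlias are standard ssh_config options.

```shell
# Hypothetical helper: print an ssh_config stanza for reaching hostB through the tunnel.
# HostKeyAlias files hostB's key under the alias in known_hosts instead of under
# "localhost", which avoids host-key clashes between different tunnels.
tunnel_host_stanza() {
    alias_name="$1"; tunnel_port="$2"; remote_user="$3"
    printf 'Host %s\n' "$alias_name"
    printf '    HostName localhost\n'
    printf '    Port %s\n' "$tunnel_port"
    printf '    User %s\n' "$remote_user"
    printf '    HostKeyAlias %s\n' "$alias_name"
}

# Usage on hostA (review the output before appending it):
#   tunnel_host_stanza hostB-tunnel "$TUNNEL_PORT" your_hostB_username >> ~/.ssh/config
#   ssh hostB-tunnel
#   rsync -az /path/on/hostA/ hostB-tunnel:/path/on/hostB/
```

The alias only works while both tunnel commands are running.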
So, while the first two commands are running in the background on hostB and hostA, local SSH connections to port $TUNNEL_PORT on hostA will be forwarded past any firewalls to hostB. Transfer speed depends on the bandwidth of both connections. So, hypothetically, tunneling through PMACS LPC, which has a fast connection to the firewalled clusters, would give very different transfer speeds from tunneling through Oracle HPC, which does not.
But, to be clear, this is a post for educational purposes about port forwarding rather than an endorsement of working around access restrictions that you agreed to in order to gain access to the firewalled servers.