Quick Tutorial

Brian High edited this page May 19, 2020 · 15 revisions

This quick tutorial aims to provide an overview of the complete process, from connecting to Brain to having an application running on it. For this tutorial, we'll be using R and R Studio, as they are commonly used applications among Brain's users.

Requirements

For this tutorial you'll need the following:

  • An account on Brain
  • X2Go Client installed on your computer

Connecting

For connecting via X2Go, please see the Connecting page.

Once connected, you should be presented with an IceWM desktop environment. IceWM provides an interface that is similar to Windows 98 or Windows 2000. If you don't see a desktop environment, please double-check your connection settings.

Transferring Data

For this tutorial, we won't go in depth on transferring data to and from Brain. With Brain you have several options for moving your data, depending largely on where it currently resides. If your data resides in an online resource like Google Drive, you can use a web browser (Firefox) on Brain to access your Google account. However, if your data is currently located on a DEOHS system, you would need to use SFTP to transfer the files from Vector.

Starting an Interactive Session

When working on Brain, it's important to consider the size of your dataset, the operations you'll be performing on it, and the number of computing cores needed. Unlike on other systems, on Brain you have to ask the scheduler (Grid Engine) for an allocation of resources. The scheduler uses that allocation request to find a computing node with the available capacity to meet your need. For this demonstration, we'll assume the dataset is about 4 GB in size, that we'll be generating multiple subsets of the data (which increases the memory required), and that our needs are limited to a single core of computing power.

NOTE: A single core of computing power is equivalent to 2 hyperthreads, or 2 vCPUs. On Brain, this will be referred to as 2 slots of computing capacity (a slot is 1 hyperthread, and approximately 8GB RAM).
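
To turn a memory estimate into a slot request, the arithmetic can be sketched as follows. This is only a rough sketch, assuming the approximate 8 GB per slot figure from the note above. The function name `slots_for_memory` and the guess of four working copies of the data are illustrative assumptions, not part of Brain's tooling.

```shell
#!/bin/sh
# Estimate how many slots to request, assuming roughly 8 GB of RAM per slot.
# slots_for_memory is a hypothetical helper written for this tutorial.

# Usage: slots_for_memory PEAK_GB [GB_PER_SLOT]
# Prints the smallest whole number of slots whose combined RAM covers PEAK_GB.
slots_for_memory() {
    peak_gb=$1
    gb_per_slot=${2:-8}
    # Integer ceiling division: (a + b - 1) / b
    echo $(( (peak_gb + gb_per_slot - 1) / gb_per_slot ))
}

# Example: a 4 GB dataset held in memory four times over
# (the original plus three subsets; adjust this guess to your own workflow).
dataset_gb=4
copies=4
peak_gb=$(( dataset_gb * copies ))
slots=$(slots_for_memory "$peak_gb")
echo "Peak memory ~${peak_gb} GB -> request ${slots} slots"
```

With these assumed numbers the estimate comes out at 2 slots, which is the same as one full core.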

  1. From within the IceWM desktop environment, click on the application menu (lower left corner, has a CentOS logo)

  2. Navigate the menu to Programs > System Tools > Terminal

  3. You should see the familiar Linux terminal or shell environment. For this demonstration, I'll be using the group "demo", along with resource queue "demo.q". Replace those values with the group and queue you have access to. At the shell prompt, request an interactive session:

     newgrp demo
     qlogin -q demo.q -pe smp 2
    
  4. If everything went well, you should receive the message "Entering SGE interactive session..."
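
If you want to double-check that your shell really is inside a scheduler allocation, one option is to look at the environment variables Grid Engine normally sets for a job (`JOB_ID`, `QUEUE` and `NSLOTS`). The sketch below assumes these variables are exported to `qlogin` sessions on Brain, which depends on how the scheduler is configured; the function name `describe_session` is made up for this example.

```shell
#!/bin/sh
# Report whether this shell appears to be running inside a Grid Engine job.
# Assumption: JOB_ID, QUEUE and NSLOTS are exported to interactive sessions.
describe_session() {
    if [ -n "${JOB_ID:-}" ]; then
        echo "In Grid Engine job ${JOB_ID} (queue: ${QUEUE:-unknown}, slots: ${NSLOTS:-unknown})"
    else
        echo "No Grid Engine job detected; you may still be on the head node"
    fi
}

describe_session
```

If the variables are missing even though you saw the "Entering SGE interactive session..." message, the session may still be fine; the check is only a hint.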

Launching R and R Studio

Brain uses Environment Modules to swap software versions in and out of your system path. This makes it easy to find what software is available, along with the specific versions offered. A side benefit of modules is their ability to automatically load any other modules the software depends on.
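
To see what is available before loading anything, the standard `module avail` and `module list` subcommands can be used. The snippet below is a small sketch; the wrapper function `show_r_modules` is a hypothetical name, and the exact listing you get depends on what is installed on Brain.

```shell
#!/bin/sh
# List the R-related modules on offer, then show what is currently loaded.
# show_r_modules is an illustrative wrapper, not a Brain-provided command.
show_r_modules() {
    module avail R 2>&1   # some versions print listings to stderr
    module list 2>&1      # modules loaded in this shell right now
}

# Only call it where the module command actually exists.
if command -v module >/dev/null 2>&1; then
    show_r_modules
else
    echo "The 'module' command is not available in this shell." >&2
fi
```

Running `module list` again after the `module load` commands in the steps below should show any dependencies that were loaded automatically.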

  1. At your interactive session's shell prompt, enter the following to load R 3.6.0 and R Studio 1.2.1335 into your path.

     module load R/R-3.6.0
     module load R/rstudio-1.2.1335
    
  2. Finally, launch R Studio from the prompt:

     rstudio
    

NOTE: You can't install R packages from the compute nodes, which leaves you two options. You could launch R and R Studio on the cluster head node (frontend) and install your packages there. Alternatively, you can use R's built-in interactive environment on the head node to install your packages.

Installing R Packages

By using R's built-in interactive environment on the head node, you can install packages while R Studio is running on a compute node. While this process may seem a bit uncomfortable at first, it's fairly typical of high performance computing environments. In HPC environments, compute nodes rarely have compilers or software installation tools installed, so users have to install via a head node or compiler node.

  1. Launch a new Terminal (app) window or tab (not the Terminal tab in RStudio)

  2. At the familiar Linux shell prompt, load the R environment module:

     module load R/R-3.6.0
    
  3. Next, launch R's interactive environment:

     R
    
  4. For this demo, I'll be installing the 'tidyverse' package:

     install.packages('tidyverse')
    
  5. After a short delay, you will be prompted to select a CRAN mirror. Choose "0-Cloud", which should be at the top. Then, click "OK".
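
If you would rather skip the mirror prompt, the same install can be run non-interactively from the head node's shell with `Rscript`, passing the mirror explicitly. This is a sketch, not the procedure this tutorial relies on: it assumes the R module is already loaded, that `https://cloud.r-project.org` is the address behind the "0-Cloud" entry, and that your personal R library directory already exists (R only offers to create it in an interactive session). The function name `install_r_package` is made up for this example.

```shell
#!/bin/sh
# Install one R package from the shell without the CRAN mirror prompt.
# install_r_package is a hypothetical helper; run it on the head node
# after "module load R/R-3.6.0".
install_r_package() {
    pkg=$1
    Rscript -e "install.packages('${pkg}', repos = 'https://cloud.r-project.org')"
}

# Example call (commented out so that sourcing this file installs nothing):
# install_r_package tidyverse
```

The interactive steps above remain the safer choice the first time you install a package, since R can then ask before creating a personal library.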

Ending your session

Once you have finished working on Brain, it's important to properly end your interactive session. Failing to end it will tie up resources that could be used by other users of your group's resource queue.

  1. Make sure you have saved any code and output data from R Studio to your home or group directory.

  2. Exit R Studio (File > Quit Session)

  3. At your Linux prompt, exit the interactive session:

     exit
    
  4. Click on the application menu (lower left, has a CentOS logo), and choose "Logout".
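
Before logging out in step 4, you may want to confirm that no session of yours is still holding slots. Grid Engine's `qstat -u` lists the jobs owned by a given user. The sketch below wraps it in a function whose name, `list_my_jobs`, is made up for this example; it assumes `qstat` is on your path on the head node.

```shell
#!/bin/sh
# List any Grid Engine jobs still owned by the current user.
# No output from qstat normally means nothing of yours is still running.
list_my_jobs() {
    qstat -u "${USER:-$(id -un)}"
}

if command -v qstat >/dev/null 2>&1; then
    list_my_jobs
else
    echo "qstat not found; run this check on Brain." >&2
fi

# A leftover session could then be removed with: qdel <job-id>
```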