fpsync_ARCH

This is a wrapper script for the fpsync utility that comes with the fpart tool. It simplifies the most common fpsync tasks.
It can be used to copy large datasets made up of very small files. Please refer to the fpsync documentation for more options.

Prerequisites

The fpart tool (which provides fpsync) and rsync must be installed on the machine running the script.

Installation

  1. Download the fpsync_ARCH.sh file and place it in your $PATH

  2. Make sure it is executable

        chmod +x fpsync_ARCH.sh
    

Usage:

fpsync_ARCH.sh takes the following 3 optional options, followed by the source and destination paths. If an option is not provided, its default (shown in [ ]) is used.

    -T: Number of rsync threads     [15]
    -S: Size (In GB) per thread     [6]
    -F: Number of files per thread  [2500]
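
The options above could be handled along the following lines. This is a minimal sketch, not the actual contents of fpsync_ARCH.sh: the function name `parse_args` is hypothetical, and the mapping onto fpsync's `-n` (jobs), `-f` (files per job) and `-s` (bytes per job) options, as well as the 1 GB = 1024^3 bytes conversion, are assumptions about how the wrapper behaves. The sketch only prints the fpsync command it would run.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the wrapper's option handling; not taken from fpsync_ARCH.sh.

parse_args() {
    # Defaults as listed in the usage section.
    local threads=15 size_gb=6 files=2500
    local opt OPTIND=1

    while getopts "T:S:F:" opt; do
        case "$opt" in
            T) threads=$OPTARG ;;
            S) size_gb=$OPTARG ;;
            F) files=$OPTARG ;;
            *) echo "usage: fpsync_ARCH.sh [-T n] [-S gb] [-F n] <src> <dst>" >&2
               return 1 ;;
        esac
    done
    shift $((OPTIND - 1))

    if [ "$#" -ne 2 ]; then
        echo "expected a source and a destination path" >&2
        return 1
    fi

    # Assumption: -S is given in GB and converted to bytes for fpsync -s.
    local size_bytes=$((size_gb * 1024 * 1024 * 1024))

    # Print the command instead of running it, so the sketch is safe to try.
    echo fpsync -n "$threads" -f "$files" -s "$size_bytes" "$1" "$2"
}

parse_args -T 25 -S 10 -F 5000 /data/src /data/dst
# prints: fpsync -n 25 -f 5000 -s 10737418240 /data/src /data/dst
```

Calling `parse_args /data/src /data/dst` with no options falls back to the defaults of 15 threads, 6 GB and 2500 files.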

Example:

    fpsync_ARCH.sh <src directory> <destination directory>

Note that the name of the source directory being copied must also be specified at the end of the destination path:

    fpsync_ARCH.sh /home/users1/<username>/<flowcell> /home/users2/<username>/<flowcell>

Override defaults using:

    fpsync_ARCH.sh -T 25 -S 10 -F 5000 /home/users1/<username>/directory1 /home/users2/<username>/directory1

    - 25 concurrent threads (override using the -T option)
    - copying 10 GB of data per thread (override using the -S option in GB)
    - maximum of 5000 files per thread (override using the -F option)

Generally speaking, when copying lots of small files, it's best to reduce the amount of data per thread (-S), increase the number of concurrent rsync threads (-T), and limit the number of files per thread (-F), so that enough data is being copied concurrently to fill your bandwidth. The bottleneck will likely be your disk I/O.
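
To get a feel for how -S and -F interact with -T, the sketch below estimates a lower bound on the number of sync jobs a dataset is split into. It assumes a job is closed as soon as either the file limit or the size limit is reached, so the job count is at least the larger of the two ceilings. The function name `estimate_jobs` and the sample numbers are hypothetical. If the estimate is smaller than the -T value, some threads would have nothing to do.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rough lower bound on the number of sync jobs.
# Usage: estimate_jobs TOTAL_FILES TOTAL_GB FILES_PER_JOB GB_PER_JOB

estimate_jobs() {
    local files=$1 total_gb=$2 per_files=$3 per_gb=$4

    # Ceiling division: jobs needed if only the file limit applied.
    local by_files=$(( (files + per_files - 1) / per_files ))
    # Ceiling division: jobs needed if only the size limit applied.
    local by_size=$(( (total_gb + per_gb - 1) / per_gb ))

    # Whichever limit is hit first determines the split, so take the larger.
    if [ "$by_files" -gt "$by_size" ]; then
        echo "$by_files"
    else
        echo "$by_size"
    fi
}

# 1,000,000 files totalling 200 GB with the defaults -F 2500 -S 6:
# 400 jobs by file count, 34 by size.
estimate_jobs 1000000 200 2500 6
# prints: 400
```

In this example the file limit dominates, which is typical for datasets of very small files: each job carries far less than 6 GB, so lowering -S has little effect and -F and -T are the settings worth tuning.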

fpsync_ARCH.sh is fastest when copying from local disk to an NFS-mounted path, but it can also be used over SSH:

    fpsync_ARCH.sh -T 25 -S 10 -F 5000 /home/users1/<username>/<flowcell> username@hostname:/home/users2/<username>/<flowcell>

This option is considerably slower due to encryption overhead.

Refer to the fpsync man page for more granular options.

Logs for all rsync threads in the above example are written under /tmp/fpart-log/<username>-<flowcell>