Skip to content

lh3/tabtk

Folders and files

NameName
Last commit message
Last commit date

Latest commit

538366f · Oct 10, 2024

History

21 Commits
Sep 7, 2014
Sep 21, 2014
Jul 30, 2016
Sep 7, 2014
Sep 7, 2014
Sep 7, 2014
Sep 7, 2014
Jul 30, 2016
Sep 7, 2014
Sep 7, 2014
Oct 10, 2024
Sep 13, 2014
Sep 7, 2014
Sep 21, 2014

Repository files navigation

Introduction

Tabtk is a fast and lightweight tool for processing TAB-delimited formats.

Tabtk Examples

  • Basic Unix cut (duplicated columns ignored):

      tabtk cut -f 5,1-3,6,6- file.txt
    
  • Reorder columns:

      tabtk cut -rf 5,1-3,6 file.txt
    
  • Duplicate columns (duplicated columns not ignored with option -r):

      tabtk cut -rf 1,1,1 file.txt
    
  • Use both SPACE and TAB as the delimitor:

      tabtk cut -d space -f 1-3 file.txt
    
  • Cut a CSV file:

      tabtk cut -d csv -f 2-4 file.csv
    

    Commas can appear in double-quotation marks.

  • Print lines in streamed.txt that matching loaded.txt on the first column:

      tabtk isct loaded.txt streamed.txt
    
  • Print lines matching the first two columns:

      tabtk isct -1 1,2 loaded.txt streamed.txt
    
  • Fixed-width view of a TAB delimited file and truncate long fields to 20

      tabtk view -l 20 tab.txt | less -S
    

    This by default loads 16MB data to RAM, not the whole file.

  • Grep a pattern in specified columns:

      tabtk grep -f 2 "^rs[0-9]+" file.vcf
    
  • Compute the mean, min and max of a numeric column:

      tabtk num -c 2 file.txt
    
  • Compute the standard deviation and quartile:

      tabtk num -Qc2 file.txt
    

About

Toolkit for processing TAB-delimited format

Resources

Stars

Watchers

Forks

Packages

No packages published