Skip to content

Latest commit

 

History

History
86 lines (68 loc) · 3.53 KB

programming_and_version_control_class.md

File metadata and controls

86 lines (68 loc) · 3.53 KB

Programming and Version Control

Brian High
CC BY-SA 4.0

Common Programming Issues

  • Poorly formatted, not following "best practices" style guides
    • Long sections of poorly commented code with obscure vairable names
    • Not enough blank lines separating code sections
    • No spaces between list items, function parameters
  • Too much copy and paste
    • Difficult to read, debug and maintain
  • Fail to check for problems
    • Packages, utilities, files or folders missing
    • Invalid data entry
  • Inefficient
    • Redoing work already done (installing, downloading, etc.)
    • Processing more data than needed, usinfg more RAM
    • Looping more times than needed
    • Not taking advantage of language features

Common Programming Solutions

  • Code reuse, modular design (functions, loops)
function task:
    steps 1, 2, 3, 4, and 5

mylist = [A, B, C, D]
foreach item in mylist: do task to item
  • Consider dependencies, anticipate problems
  • Validate input (with e.g., regular expressions)
  • Use resources wisely (RAM, CPU, storage)
  • Use language "idiomatically" - as designed to be used
  • Use clear comments, descriptive and consistent variable naming scheme

Programming Examples

  • Web programming: Panopto Link Maker (HTML/JavaScript/Regex)
    • Input validation with Regular Expressions
    • Completely client-side programming: completely portable
  • API Programming: Amazon Search (Python/R)
    • Embedding non-R system commands in Rmd
    • Small "remove duplicates" Python script to use from shell with pipe
  • Functions: Reddit Wordclouds (Python/R)
    • Loops versus *apply functions
    • Example of Rmd ioslides presentation with license info and icon
  • Regular expressions filtering, streaming: Filter-fastq (Python/Perl/Regex)
  • Validation: Get-logo-png (Bash/Regex)
    • Checking requirements, command-line arguments
    • Input validation with Regular Expressions
    • Automation using CLI tools, pipes

Version Control

Different systems: an evolution of version control systems (VCS)

  • RCS: Revision Conrtrol System (1982, single user)
  • CVS: Concurrent Versions System (1990, multi user, central server)
  • SVN: Subversion (2000, updated, improved, CVS-like, central server)
  • various distributed VCS: Git (2005), Mercurial (2005), Bazarr (2005)

Demo basic "git" workflow:

  • fork, clone, pull, edit, commit, push, pull-request
    • From RStudio GUI
    • From command line
    • From other GUI or web browser (GitHub site)

Exercise: Practice Git

Try to fork, clone, commit, push, and send a "pull request" with these projects.

Pick a project (from course modules page, GitHub):