Coursera-Getting-and-Cleaning-Data

This repository contains the assignment for the Coursera R course 'Getting and Cleaning Data with R'.

The assignment was to create a script that tidies a dataset. For a description of the dataset and its variables, check out the file codebook.md.

This file describes what the script does.

======== Transformations of the R script tidy.R =========================================================

The script:

Takes the observations of the two subsets "train" and "test".
Adds the column names to the observations.
Adds the subject and the activity code to each row of the subsets "train" and "test".
Removes all columns from the two subsets "train" and "test" whose names do not contain the string "std" or "mean", except for the newly attached columns "subject" and "activity".
Combines the two subsets "train" and "test" into one data frame "total".
Replaces the activity codes with the activity names.
Reorganizes the data frame "total" so that the columns "subject" and "activity" come first and the rows are grouped by these two columns.
Creates a second data frame "total_av" from "total", containing the mean of all observations per subject per activity.
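
The steps above can be illustrated on a toy example. This is only a sketch and not the code of tidy.R: it uses base R instead of dplyr so that it runs without extra packages, and the feature names, activity labels, and values are made up for illustration (the real dataset has many more columns and rows).

```r
# Hypothetical feature names and activity labels (not the real dataset).
feature_names <- c("acc_mean_x", "acc_std_x", "acc_max_x")
activity_labels <- data.frame(code = 1:2,
                              name = c("WALKING", "SITTING"),
                              stringsAsFactors = FALSE)

# Builds one subset: names the columns, attaches subject and activity,
# and keeps only the "mean"/"std" columns plus the two attached columns.
make_subset <- function(values, subject, activity) {
  obs <- as.data.frame(matrix(values, ncol = length(feature_names)))
  names(obs) <- feature_names
  obs$subject <- subject
  obs$activity <- activity
  keep <- grepl("mean|std", names(obs)) |
    names(obs) %in% c("subject", "activity")
  obs[, keep, drop = FALSE]
}

# Two rows each; matrix() fills column by column.
train <- make_subset(c(1, 3, 0.1, 0.3, 9, 9), subject = c(1, 1), activity = c(1, 1))
test  <- make_subset(c(5, 7, 0.5, 0.7, 9, 9), subject = c(2, 2), activity = c(2, 2))

# Combine both subsets into one data frame.
total <- rbind(train, test)

# Replace the activity codes with the activity names.
total$activity <- activity_labels$name[match(total$activity, activity_labels$code)]

# Put "subject" and "activity" first and sort the rows by them.
measure_cols <- setdiff(names(total), c("subject", "activity"))
total <- total[order(total$subject, total$activity),
               c("subject", "activity", measure_cols)]

# Mean of every measurement per subject per activity.
total_av <- aggregate(total[measure_cols],
                      by = list(subject = total$subject,
                                activity = total$activity),
                      FUN = mean)
```

In this sketch "total" has four rows and "total_av" has one row per subject and activity combination, which mirrors the relationship between the two data frames described above.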

======== How to use the R script =========================================================

Download the raw data from https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
Rename the zip-file "getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip" to phone.zip.
Unzip the file.
Copy the R script tidy.R to the folder above the folder "\phone".
Make sure you have installed the dplyr package.
Open the script in RStudio and source it.
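
The same steps could also be run from the R console. The helper below is a hypothetical sketch and not part of this repository: the function name prepare_and_run is made up, it saves the download directly as phone.zip instead of renaming it afterwards, and it assumes that unzipping into a folder named "phone" in the working directory gives the layout tidy.R expects. Check the paths used in the script before relying on it.

```r
# Hypothetical helper (not part of tidy.R): download, unzip, and source.
prepare_and_run <- function(
    url = "https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip",
    zip_name = "phone.zip",
    script = "tidy.R") {
  # The script needs dplyr; stop early with a hint if it is missing.
  if (!requireNamespace("dplyr", quietly = TRUE)) {
    stop("Install dplyr first: install.packages(\"dplyr\")")
  }
  # Download the raw data once, in binary mode so the zip is not corrupted.
  if (!file.exists(zip_name)) {
    download.file(url, destfile = zip_name, mode = "wb")
  }
  # Assumption: the script reads from a folder "phone" next to it.
  unzip(zip_name, exdir = "phone")
  # Run the tidying script from the current working directory.
  source(script)
}
```

Calling prepare_and_run() from the folder that contains tidy.R would then perform the download, the unzip, and the sourcing in one go.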
