Getting and Cleaning Data Course Project

`run_analysis.R` Explained

Assumption: The Project Data File has been extracted to the working directory.

Requirement #1 - Merges the training and the test sets to create one data set

Reads the test and train data sets into separate data frames using read.table.
Combines the two data frames into one using rbind, which appends the rows of one to the other.
Reads the variable names from features.txt and assigns them as the column names of the combined data frame.
Reads the subject IDs for the test and train data sets from their respective txt files into separate data frames.
Combines the test and train subject IDs into one data frame using rbind.
Names the subject ID column "Subject_ID".
Joins the subject ID data frame to the combined data frame column-wise using cbind.
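
The steps above can be sketched roughly as follows. This is a minimal illustration, not the script itself: the small data frames stand in for the tables the script reads with read.table, and the object names (`test_data`, `train_data`, `feature_names`, and so on) are hypothetical.

```r
# Toy stand-ins for the test and train measurement tables
# (the real script reads these from txt files with read.table).
test_data  <- data.frame(V1 = c(0.1, 0.2), V2 = c(1.1, 1.2))
train_data <- data.frame(V1 = c(0.3, 0.4, 0.5), V2 = c(1.3, 1.4, 1.5))

# Append the rows of one data frame to the other.
combined <- rbind(test_data, train_data)

# Stand-in for the variable names read from features.txt.
feature_names <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X")
names(combined) <- feature_names

# Toy subject IDs for the test and train sets, stacked in the same order.
test_subjects  <- data.frame(V1 = c(2, 4))
train_subjects <- data.frame(V1 = c(1, 3, 5))
subjects <- rbind(test_subjects, train_subjects)
names(subjects) <- "Subject_ID"

# Join subject IDs and measurements column-wise.
combined <- cbind(subjects, combined)
```

The row order of the subject IDs has to match the row order of the measurements, which is why both are stacked test first, then train.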

Requirement #2 - Extracts only the measurements on the mean and standard deviation for each measurement

Extracts the mean, standard deviation, and Subject ID columns from the combined data frame by subsetting on the column names matched with grep.
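
A minimal sketch of this kind of grep-based subsetting is shown below. The regular expression is an assumption made for illustration; the pattern in `run_analysis.R` may differ, for example in whether columns such as meanFreq() are kept. The toy data frame and its column names are also hypothetical.

```r
# Toy combined data frame; check.names = FALSE keeps the original
# feature-style names instead of converting them to syntactic R names.
combined <- data.frame(
  Subject_ID = c(1, 2),
  "tBodyAcc-mean()-X" = c(0.1, 0.2),
  "tBodyAcc-std()-X"  = c(0.3, 0.4),
  "tBodyAcc-max()-X"  = c(0.5, 0.6),
  check.names = FALSE
)

# Assumed pattern: keep Subject_ID plus names containing mean() or std().
keep <- grep("Subject_ID|mean\\(\\)|std\\(\\)", names(combined))

# Subset the data frame to the matched column positions.
filtered <- combined[, keep]
```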

Requirement #3 - Uses descriptive activity names to name the activities in the data set

Reads the test and train activity IDs from their respective txt files into separate data frames.
Combines the test and train activity IDs row-wise using rbind and names the resulting column "Activity_ID".
Attaches the combined activity IDs to the filtered data frame containing the mean and standard deviation measurements.
Reads the activity labels from the txt file and sets the column names for that data frame.
Merges the activity labels into the combined test and train data frame. Note that merging reorders the rows.
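
The sketch below illustrates the merge and the reordering it causes, using toy data. The column name `Activity_Name`, the label values, and the use of cbind to attach the IDs are assumptions made for the example.

```r
# Toy filtered measurements and activity IDs, in the same row order.
filtered     <- data.frame(Subject_ID = c(1, 1, 2), mean_x = c(0.1, 0.2, 0.3))
activity_ids <- data.frame(Activity_ID = c(2, 1, 2))
with_ids     <- cbind(activity_ids, filtered)

# Stand-in for the activity labels table, with column names set.
activity_labels <- data.frame(
  Activity_ID   = c(1, 2),
  Activity_Name = c("WALKING", "WALKING_UPSTAIRS"),
  stringsAsFactors = FALSE
)

# merge() sorts the result by the key column by default (sort = TRUE),
# so the rows no longer follow their original order.
labelled <- merge(with_ids, activity_labels, by = "Activity_ID")
```

Because of this reordering, the activity IDs need to be attached to the measurements before the merge rather than after it.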

Requirement #4 - Appropriately labels the data set with descriptive variable names

For each column that holds a measurement variable from the test and train data:
Parses the column name using grepl to determine whether it meets certain criteria, and assigns each criterion match to a variable.
For each criterion match, uses a series of paste operations to build up a more descriptive string for that variable.
Appends the label for each column to a character vector of descriptive variable names.
Applies the character vector of descriptive variable names to the data frame, along with the labels for the other columns.
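
One way this grepl-and-paste approach might look is sketched below. The criteria, the wording of the labels, and the helper name `describe_variable` are all hypothetical; the script's actual criteria and label text are not reproduced here.

```r
# Hypothetical helper: builds a descriptive label from a raw column name.
describe_variable <- function(name) {
  # Each criterion match is stored in its own variable.
  is_time  <- grepl("^t", name)
  is_freq  <- grepl("^f", name)
  is_mean  <- grepl("mean()", name, fixed = TRUE)
  is_std   <- grepl("std()", name, fixed = TRUE)
  is_accel <- grepl("Acc", name, fixed = TRUE)
  is_gyro  <- grepl("Gyro", name, fixed = TRUE)
  has_axis <- grepl("-[XYZ]$", name)

  # Build the label piece by piece with paste.
  label <- ""
  if (is_mean)  label <- paste(label, "Mean of", sep = "")
  if (is_std)   label <- paste(label, "Standard deviation of", sep = "")
  if (is_time)  label <- paste(label, "time-domain")
  if (is_freq)  label <- paste(label, "frequency-domain")
  if (is_accel) label <- paste(label, "accelerometer signal")
  if (is_gyro)  label <- paste(label, "gyroscope signal")
  if (has_axis) label <- paste(label, "along", substring(name, nchar(name)), "axis")
  label
}

# Append one label per measurement column to a character vector.
raw_names   <- c("tBodyAcc-mean()-X", "fBodyGyro-std()-Z")
descriptive <- character(0)
for (n in raw_names) {
  descriptive <- c(descriptive, describe_variable(n))
}
```

The resulting vector can then be combined with the names of the non-measurement columns and assigned with `names()`.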

Requirement #5 - Creates a second, independent tidy data set with the average of each variable for each activity and each subject.

Melts the working data frame containing the merged test and train data, using Activity Name and Subject ID as the ID variables and deliberately dropping Activity_ID with a subset.
Applies dcast to the melted data frame by Activity Name and Subject ID, aggregating with the mean.
Writes the final data set to a file named getdata-course-project-tidy-data.txt in the working directory.
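
The melt and dcast steps can be sketched as below, assuming the reshape2 package is installed (melt and dcast are taken here to be the reshape2 functions). The toy data, the column names, and the output file name are hypothetical, and the example writes to a temporary directory rather than the working directory.

```r
library(reshape2)

# Toy working data frame with two measurement columns.
working <- data.frame(
  Activity_ID   = c(1, 1, 2, 2),
  Activity_Name = c("WALKING", "WALKING", "WALKING_UPSTAIRS", "WALKING_UPSTAIRS"),
  Subject_ID    = c(1, 1, 1, 1),
  mean_x        = c(0.2, 0.4, 1.0, 3.0),
  std_x         = c(0.1, 0.3, 0.5, 0.7),
  stringsAsFactors = FALSE
)

# Drop Activity_ID so it is not melted as if it were a measurement.
without_id <- working[, names(working) != "Activity_ID"]

# Long format: one row per (activity, subject, variable, value).
molten <- melt(without_id, id.vars = c("Activity_Name", "Subject_ID"))

# Wide format again, averaging each variable per activity and subject.
tidy <- dcast(molten, Activity_Name + Subject_ID ~ variable, fun.aggregate = mean)

# Write the result as a plain text table.
write.table(tidy, file.path(tempdir(), "tidy-example.txt"), row.names = FALSE)
```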
