Assumption: The Project Data File has been extracted to the working directory.
- Reads the test and train data sets each into their own data frame using `read.table`.
- Combines the test and train data sets into one data frame using `rbind` to append the rows from each data frame together.
- Reads the variable names in from `features.txt` and sets the names of the variables in the combined data frame.
- Reads the subject IDs for the test and train data sets into separate variables from their respective txt files.
- Combines the test and train subject ID data frame rows together by using `rbind` to append the rows from each data frame.
- Applies the label "Subject_ID" to the IDs column.
- Combines the combined variable data frame with the subject ID data frame by using `cbind`.
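The read-and-combine steps above can be sketched roughly as follows. This is a minimal illustration, not the project script itself: it writes tiny stand-in files to a temporary directory so that it runs without the Project Data File, and every file name other than `features.txt` (as well as the toy values and feature names) is an assumption.

```r
# Tiny stand-in files in a temporary directory (hypothetical names and values).
tmp <- tempfile("har_")
dir.create(tmp)
write.table(matrix(1:6, nrow = 2), file.path(tmp, "X_test.txt"),
            row.names = FALSE, col.names = FALSE)
write.table(matrix(7:15, nrow = 3), file.path(tmp, "X_train.txt"),
            row.names = FALSE, col.names = FALSE)
writeLines(c("1 tBodyAcc-mean()-X", "2 tBodyAcc-std()-X", "3 tBodyAcc-energy()-X"),
           file.path(tmp, "features.txt"))
writeLines(c("2", "2"), file.path(tmp, "subject_test.txt"))
writeLines(c("1", "1", "3"), file.path(tmp, "subject_train.txt"))

# Read the test and train measurements into separate data frames.
test_data  <- read.table(file.path(tmp, "X_test.txt"))
train_data <- read.table(file.path(tmp, "X_train.txt"))

# Append the rows of one data frame to the other.
measurements <- rbind(test_data, train_data)

# Use the second column of features.txt as the variable names.
features <- read.table(file.path(tmp, "features.txt"), stringsAsFactors = FALSE)
names(measurements) <- features$V2

# Read and stack the subject IDs in the same test-then-train order.
subjects <- rbind(read.table(file.path(tmp, "subject_test.txt")),
                  read.table(file.path(tmp, "subject_train.txt")))
names(subjects) <- "Subject_ID"

# Attach the subject IDs as an extra column.
combined <- cbind(measurements, subjects)
```

The test and train rows must be stacked in the same order for the measurements and the subject IDs, since `cbind` pairs rows purely by position.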
Requirement #2 - Extracts only the measurements on the mean and standard deviation for each measurement
- Extracts the mean, standard deviation (and Subject ID) columns from the combined data frame using subsetting and `grep`.
- Reads in the test and train activity IDs from their respective txt files as independent data frames.
- Combines the test and train activity IDs using `rbind` to combine their rows vertically, and names the column for those IDs "Activity_ID".
- Combines those test and train activity IDs with the filtered variable data frame containing the mean and standard deviation measurements.
- Reads in the activity labels from the txt file, and sets the column names for that data frame.
- Merges the activity labels into the test and train variable data frame. Note that this reorders the rows of the data.
- For each column that is a measurement / variable for the test and train data:
  - Parses the name of the column using `grepl` to determine if it meets certain criteria, and assigns each criteria match to a variable.
  - For each criteria match, uses a series of `paste` operations to build up a more descriptive string for that variable.
  - Appends the label for each column to a character vector of descriptive variable names.
- Applies the character vector of descriptive variable names to the data frame, along with the labels for the other columns.
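The filtering, activity labelling, and renaming steps above can be sketched on a toy data frame as follows. This is an illustration under stated assumptions rather than the project script: the `grep` pattern, the `Activity_Name` column name, the helper `describe_variable`, and the wording of the descriptive labels are all hypothetical.

```r
# Toy stand-in for the combined data frame (hypothetical values).
combined <- data.frame(c(0.1, 0.2, 0.3), c(1.1, 1.2, 1.3), c(9, 8, 7), c(2L, 2L, 1L))
names(combined) <- c("tBodyAcc-mean()-X", "fBodyGyro-std()-Z",
                     "tBodyAcc-energy()-X", "Subject_ID")

# Keep the mean, standard deviation and Subject_ID columns.
# Assumed pattern: matches only names containing mean() or std().
keep <- grep("mean\\(\\)|std\\(\\)|^Subject_ID$", names(combined))
filtered <- combined[, keep]

# Stack the test and train activity IDs (toy values) and attach them.
activity_ids <- rbind(data.frame(V1 = c(2L, 1L)), data.frame(V1 = 2L))
names(activity_ids) <- "Activity_ID"
filtered <- cbind(filtered, activity_ids)

# Activity labels; merge() sorts by the key column by default,
# which is one way the row order can change.
activity_labels <- data.frame(V1 = 1:2, V2 = c("WALKING", "WALKING_UPSTAIRS"),
                              stringsAsFactors = FALSE)
names(activity_labels) <- c("Activity_ID", "Activity_Name")
labelled <- merge(filtered, activity_labels, by = "Activity_ID")

# Hypothetical helper: test the raw name with grepl, then build a label with paste.
describe_variable <- function(name) {
  is_mean <- grepl("mean()", name, fixed = TRUE)
  is_time <- grepl("^t", name)
  is_acc  <- grepl("Acc", name, fixed = TRUE)
  label <- if (is_mean) "Mean" else "StdDev"
  label <- paste(label, if (is_time) "Time" else "Frequency", sep = "_")
  label <- paste(label, if (is_acc) "Accelerometer" else "Gyroscope", sep = "_")
  for (axis in c("X", "Y", "Z")) {
    if (grepl(paste0("-", axis, "$"), name)) {
      label <- paste(label, axis, "Axis", sep = "_")
    }
  }
  label
}

# Build the vector of names column by column, leaving the ID columns as they are.
id_cols <- c("Activity_ID", "Subject_ID", "Activity_Name")
new_names <- character(0)
for (col in names(labelled)) {
  new_names <- c(new_names, if (col %in% id_cols) col else describe_variable(col))
}
names(labelled) <- new_names
```

Because `cbind` pairs rows by position, the activity IDs have to be attached before any step that reorders rows, such as the merge.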
Requirement #5 - Creates a second, independent tidy data set with the average of each variable for each activity and each subject.
- Applies `melt` to the working data frame with the merged test and train data, by Activity Name and Subject ID, purposely eliminating Activity_ID with a subset.
- Applies `dcast` to the melted data frame by Activity Name and Subject ID, aggregating using the mean operation.
- Writes the final data set to a file named `getdata-course-project-tidy-data.txt` in the working directory.
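A minimal sketch of the melt / cast steps above, assuming `melt` and `dcast` come from the reshape2 package (the text does not name the package). The toy data frame, its column names, and the use of a temporary directory for the output file are assumptions made so the example runs on its own; the script described above writes to the working directory instead.

```r
library(reshape2)

# Toy stand-in for the labelled working data frame (hypothetical values).
labelled <- data.frame(
  Activity_ID   = c(1L, 1L, 2L, 2L),
  Activity_Name = c("WALKING", "WALKING", "SITTING", "SITTING"),
  Subject_ID    = c(1L, 1L, 1L, 1L),
  Mean_X        = c(1, 3, 5, 7),
  Std_X         = c(2, 4, 6, 8)
)

# Drop Activity_ID with a subset, then melt to one row per
# (activity, subject, variable) observation.
molten <- melt(labelled[, names(labelled) != "Activity_ID"],
               id.vars = c("Activity_Name", "Subject_ID"))

# Cast back to wide form, averaging each variable per activity and subject.
tidy <- dcast(molten, Activity_Name + Subject_ID ~ variable, fun.aggregate = mean)

# Write the result; a temporary directory is used here instead of the
# working directory so the sketch leaves no files behind.
out_file <- file.path(tempdir(), "getdata-course-project-tidy-data.txt")
write.table(tidy, out_file, row.names = FALSE)
```

The result has one row per activity and subject combination and one column per averaged variable.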