added script to generate data tables for the ameriflux data by areevesman · Pull Request #6 · NCEAS/data-processing

areevesman · 2018-03-19T18:46:23Z

@dmullen17 I just added the script for the data tables!

dmullen17

Overall really solid work!

If you read through my comments, you'll find that you can generalize a lot of this code so you aren't repeating yourself. Take your time implementing these changes.

When you start making changes, follow these steps so that you're able to commit (basically save) your work: https://github.com/NCEAS/data-processing#making-changes-to-your-contribution

dmullen17 · 2018-03-20T20:02:29Z

R/Ameriflux_data_tables.R

+definitions <- read.csv('/home/reevesman/Ameriflux/attribute_function/definitions.csv',
+                        stringsAsFactors = F)
+
+data1 <- read.csv('/home/reevesman/Ameriflux/AMF_US-Ivo/AMF_US-Ivo_BASE_HH_2-1.csv',


Rather than reading all of these in one at a time, you can have your function do it for you.

It would read something like:

attribute_definitions <- function(data_path, definitions) {
  data <- read.csv(data_path, skip = 2, stringsAsFactors = FALSE)
  # (the rest of your code)
}
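Once the function takes a path, the repeated read.csv calls can collapse into a single lapply over a vector of paths. The following is only a sketch of that idea: read_base_file and make_demo_file are hypothetical helper names, and the stand-in files (with made-up contents) exist only so the example runs without the real AmeriFlux data.

```r
# Hypothetical helper: read one data file, skipping the two lines above the header row.
read_base_file <- function(data_path) {
  read.csv(data_path, skip = 2, stringsAsFactors = FALSE)
}

# Write small stand-in files so the sketch runs anywhere (contents are made up).
make_demo_file <- function(site) {
  path <- tempfile(pattern = paste0("AMF_", site, "_"), fileext = ".csv")
  writeLines(c(
    paste("# Site:", site),
    "# demo file",
    "TIMESTAMP_START,CO2_1,TA_PI_F",
    "201801010000,400.1,-12.5"
  ), path)
  path
}

data_paths <- vapply(c("US-Ivo", "US-ICt"), make_demo_file, character(1))

# One call reads every file; the result is a list of data frames named by site.
tables <- lapply(data_paths, read_base_file)
str(tables[["US-Ivo"]])
```

With the real files, data_paths would simply be a character vector of the CSV paths, and adding a new site would mean adding one path rather than another read.csv block.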

dmullen17 · 2018-03-20T20:04:14Z

R/Ameriflux_data_tables.R

+
+data2 <- read.csv('/home/reevesman/Ameriflux/AMF_US-ICt/AMF_US-ICt_BASE_HH_2-1.csv',
+                  skip = 2,
+                  stringsAsFactors = F)


R is weird in that it lets you abbreviate TRUE and FALSE as T/F. It's generally considered best practice to spell these out to increase readability.

When you're making changes, you can split up your commits. For instance, in one commit you can change all the T/Fs to TRUE/FALSE and give that commit a message like "updated T and F syntax".
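Readability is one reason to spell these out; another is that T and F are ordinary variables, while TRUE and FALSE are reserved words. A small sketch of the difference (the variable names shadowed_value and assignment_failed are just for illustration):

```r
# T is an ordinary variable that happens to hold TRUE, so it can be reassigned.
T <- FALSE
shadowed_value <- T   # now FALSE, with no warning
rm(T)                 # drop the shadowing binding; T resolves to base::T again

# TRUE is a reserved word, so the same assignment is rejected when evaluated.
assignment_failed <- tryCatch({
  eval(parse(text = "TRUE <- FALSE"))
  FALSE
}, error = function(e) TRUE)

print(shadowed_value)
print(assignment_failed)
```

A script that uses stringsAsFactors = F therefore depends on nobody having assigned to F earlier in the session, whereas FALSE cannot be changed.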

dmullen17 · 2018-03-20T20:06:46Z

R/Ameriflux_data_tables.R

+
+    if (str_sub(att,-5,-1) == '_PI_F'){
+      x <- str_sub(att,-5,-1)
+      extra <- paste(definitions[which(definitions$uniqueAttributeLabel == '_PI'), 'uniqueAttributeDefinition'],


This is something I mentioned to sharis as well. At the beginning of your function you could have a line that sets new column names for your attributes. The column names in the attributes csv are pretty bad and make the R code harder to read.

colnames(attributes) <- c("category", "name", "definition", "units", "SI_units")
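To illustrate how much shorter the lookups become, here is a sketch using a two-column stand-in for the definitions table. Only the two original column names come from the script; the rows are placeholders, and the real csv has more columns whose order would need to be checked before renaming.

```r
# Stand-in for the definitions table (placeholder rows, not real definitions).
definitions <- data.frame(
  uniqueAttributeLabel = c("CO2", "_PI"),
  uniqueAttributeDefinition = c("placeholder definition for CO2",
                                "placeholder definition for _PI"),
  stringsAsFactors = FALSE
)

# Rename once, at the top of the function.
colnames(definitions) <- c("name", "definition")

# The lookup from the script, rewritten with the short names.
pi_definition <- definitions[definitions$name == "_PI", "definition"]
print(pi_definition)
```

Logical indexing also makes the which() call in the original lookup unnecessary.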

dmullen17 · 2018-03-20T20:26:01Z

R/Ameriflux_data_tables.R

+
+##############################################################################
+
+attribute_units <- function(data, definitions){


It looks like this function searches for qualifiers in variable names and deletes them. You could probably simplify this by using the gsub function and a regular expression. Take a look at what this code does and you should be able to simplify this quite a bit. | stands for "or" in R regular expressions.

names <- c("var_1", "var_F_2")
gsub("_1|_2|_F", "", names)

https://www.rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf
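Building on the gsub example, one possible pattern anchors the match to the end of the name with $ so that a qualifier-like fragment in the middle of a variable name is left alone. This is a sketch under assumptions: the attribute names are illustrative, and the qualifier list is taken from the ones that appear in the script (_PI, _QC, _IU, _SD, _F, and integer suffixes).

```r
# Illustrative attribute names, with and without trailing qualifiers.
att_names <- c("CO2_1", "TA_PI_F", "SW_IN_F_2", "FC")

# One or more qualifiers, but only at the end of the name.
qualifier_pattern <- "(_PI|_QC|_IU|_SD|_F|_[0-9]+)+$"

base_names <- gsub(qualifier_pattern, "", att_names)
print(base_names)
# "CO2"   "TA"    "SW_IN" "FC"
```

The `+` after the group strips stacked qualifiers such as _PI_F in one pass, and `_[0-9]+` covers every integer qualifier without listing them.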

dmullen17 · 2018-03-20T20:49:07Z

R/Ameriflux_data_tables.R

+      QUALIFIERS_EXIST <- TRUE
+    }
+
+    else if (str_sub(att,-3,-1) %in% c('_PI','_QC','_IU','_SD')){


It looks like you're looking for every possible iteration of qualifiers and treating each one as a unique case. While this is totally acceptable, it's a good idea to think about how you might scale this if there were too many to write out by hand.

You can reverse the %in% statement so that you check whether each qualifier appears in an attribute. Then you can run the rest of the commands that you have after an else-if statement to get the extra definition.

See if you can use this example to simplify your code. You should only need to do this twice, once for the integer qualifiers and once for the character ones. The logic here is a bit tricky, so don't hesitate to ask me about it.

att <- "CO2_F_1"
int_qualifiers <- c('_1','_2','_3','_4','_5','_6','_7','_8','_9')
# this applies grepl to each value in int_qualifiers, with the additional argument x = att
sapply(int_qualifiers, grepl, x = att)
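The same pattern can be sketched for the character qualifiers. Because sapply returns a named logical vector, the names of the TRUE entries are the qualifiers that were found. The qualifier list here is assumed from the ones in the script, and fixed = TRUE treats each qualifier as a literal string rather than a regular expression.

```r
att <- "CO2_PI_F"
char_qualifiers <- c("_PI", "_QC", "_IU", "_SD", "_F")

# TRUE for each qualifier that occurs somewhere in att.
present <- sapply(char_qualifiers, grepl, x = att, fixed = TRUE)

# Keep only the qualifiers that matched.
found <- names(present)[present]
print(found)
# "_PI" "_F"
```

From there, one loop over found could look up each extra definition, replacing the separate branch per qualifier combination. Note that this is a substring test, so "_F" would also match a name containing "_FOO"; anchoring with a regular expression would be stricter.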

areevesman · 2018-03-20T22:14:22Z

Thanks @dmullen17! I will definitely work on all of these. I really enjoyed this project and your feedback! Please let me know about any similar projects!

dmullen17 · 2018-03-22T16:40:20Z

@areevesman glad to hear it! If you want to focus on making these changes, I'll think of a similar task for you to work on by the time you're done.


areevesman · 2018-04-04T20:38:51Z

Hey @dmullen17, I made all of the edits that I was planning to based on your comments! Thanks for reviewing for me! Jesse said that you've got an intense job interview today, good luck!

added script to generate data tables for the ameriflux data

ffcaf8a

dmullen17 reviewed Mar 20, 2018


areevesman added 3 commits April 3, 2018 12:35

spelled out T/F and changed args of functions to paths instead of dataframes

64cd932

simplified attribute_definition function, made it more scale-able

63d05cd

simplified attribute_units function, got rid of need for stringr library in script

88ebd8d

areevesman wants to merge 4 commits into NCEAS:master from areevesman:master
