Tibble Exercises

Create a new tibble

Create a new tibble called tbl1 with 5 observations and the following variables:

name
age
height
weight
smoker (TRUE or FALSE)
Male (TRUE or FALSE)

Once you have created tbl1, add a new dummy variable, agecat, that categorizes age by decade (i.e., agecat = 0 when age < 10, agecat = 1 when 10 <= age < 20, ...).
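One possible way to derive a decade category is integer division. The following is a minimal sketch on a hypothetical vector of ages (the name `ages` and its values are made up for illustration and are not part of the exercise data):

```r
# Hypothetical ages, used only to illustrate the arithmetic
ages <- c(4, 15, 29, 30, 61)

# Integer division by 10 maps 0-9 to 0, 10-19 to 1, 20-29 to 2, and so on
agecat <- ages %/% 10
agecat
```

The same expression could be used when adding a column to a tibble, although other approaches (such as cut()) would work as well.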

Conversion of a data.frame to a tibble

Convert the iris data set to a tibble.

data(iris)
str(iris)

## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

How can you verify that your modified iris data set is formatted as a tibble?
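As a hedged illustration of the kind of check involved, the sketch below converts a different built-in data set (mtcars, chosen here only as a stand-in) and inspects the result. It assumes the tibble package is installed.

```r
library(tibble)

# mtcars is a stand-in data set; cars_tbl is a hypothetical name
cars_tbl <- as_tibble(mtcars)

# is_tibble() returns TRUE for tibbles and FALSE for plain data frames
is_tibble(mtcars)
is_tibble(cars_tbl)

# The class vector of a tibble includes "tbl_df"
class(cars_tbl)
```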

NEISS data

The dataset that we will be using for this section comes from the
National Electronic Injury Surveillance System (NEISS): https://www.cpsc.gov/research--statistics/neiss-injury-data
Here is a short description of the data file from the NEISS.

"Each record (case) is separated by a carriage return/line feed, and the fields (parameters and narrative) are separated by a tab character, which you can specify as the delimiter when importing into a spreadsheet or database."

Read the data file nss15.tsv from the Data sub-folder/directory and store it in a variable called nss15.

Hint: You could use RStudio's File --> Import Dataset option to read the file. The data is in a tab-separated format.

Hint: Watch out for the data type choices that are suggested to you and be sure to select the appropriate ones.

Hint: Please watch out for any warnings or issues while R/RStudio is reading the file. If you spot any errors, think about how to fix them. After fixing the problems (if any), read the file again.
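For readers who prefer code to the import dialog, here is a minimal sketch of reading a tab-separated file with explicit column types using the readr package. The file contents and the column names (case_id, age, narrative) are hypothetical and written to a temporary file for illustration; they are not the actual NEISS layout.

```r
library(readr)

# Hypothetical three-row file written to a temporary location
tmp <- tempfile(fileext = ".tsv")
writeLines(c("case_id\tage\tnarrative",
             "1001\t34\tFELL OFF LADDER",
             "1002\t7\tCUT FINGER",
             "1003\t210\tBURN ON HAND"), tmp)

# Explicit column types: keep the identifier as text, parse age as integer
demo <- read_tsv(tmp, col_types = cols(
  case_id   = col_character(),
  age       = col_integer(),
  narrative = col_character()
))

# problems() lists values that failed to parse (zero rows if there were none)
problems(demo)
nrow(demo)
```

Declaring the types up front is one way to avoid surprises from type guessing; checking problems() afterwards shows whether any values were dropped or coerced.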

### Setup
# this will create look-up variables so you can make better sense of the codes in the data set
source('Data/NEISSlabels.R')

# for example, if you want to know what body parts these codes indicate...
codes <- c(30, 30, 77, 35)
get_label(codes, body_part_lab)

## [1] "SHOULDER" "SHOULDER" "EYEBALL"  "KNEE"

### Import data code goes here


### mutate code converting codes into human readable labels goes here


### further exploration to answer questions goes here

Answer the following questions.

How many cases are reported in this dataset?
How many covariates does this dataset have?
Access CPSC Case # 150620565 and report the following things:
What is the age of the patient?
What are the race, weight, stratum, sex, and diagnosis?
How many patients are more than 100 years old?
From the reported cases, get the CPSC case number and age for the 20th entry.

Date/Time Exercises

Answer the following questions using the NEISS data set that you imported above.

Report the number of cases for the month of May.
- How many cases were reported for May 13 - May 16, 2015?
- Use this information to answer the following questions.
How many were children (< 5 years)?
What were the proportions of male and female patients?
What was the race distribution?
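Questions like these come down to parsing text into Date values and comparing them. The sketch below uses base R on a hypothetical vector of date strings; the month/day/year format is an assumption and may not match how dates are stored in the NEISS file.

```r
# Hypothetical date strings; the format is assumed, not taken from the data
trmt <- c("05/12/2015", "05/13/2015", "05/16/2015", "06/01/2015")
dates <- as.Date(trmt, format = "%m/%d/%Y")

# Count the dates that fall in May
n_may <- sum(format(dates, "%m") == "05")

# Flag the dates inside an inclusive range
in_range <- dates >= as.Date("2015-05-13") & dates <= as.Date("2015-05-16")
n_range <- sum(in_range)

n_may
n_range
```

The logical vector in_range could then be used to subset the rows before computing further summaries.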

Regular Expressions Exercises

regexpr() returns the position of the first match of the pattern in each string (-1 if there is no match), along with the length of the matched text. Use regexpr() to locate all species names with the pattern 'sa'.
Identify all species that end with the letter 's'.

species <- c("Arabidopsis_thaliana", "Bos_taurus", "Caenorhabditis_elegans", "Danio_rerio", 
             "Dictyostelium_discoideum", "Drosophila_melanogaster", "Escherichia_coli",
             "Homo_sapiens", "Mus_musculus", "Mycoplasma_pneumoniae",
             "Oryza_sativa","Plasmodium_falciparum","Pneumocystis_carinii","Rattus_norvegicus",
             "Saccharomyces_cerevisiae","Schizosaccharomyces_pombe","Takifugu_rubripes","Xenopus_laevis",
             "Zea_mays", "Hepatitis_C_Virus")

# your regexpr code goes here
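To make the return value of regexpr() concrete, here is a small sketch on a hypothetical vector (fruits is made up and deliberately unrelated to the species names):

```r
# Hypothetical strings, used only to show what regexpr() returns
fruits <- c("banana", "cherry", "ananas")

m <- regexpr("an", fruits)

# Start position of the first match in each string, -1 where there is none
pos <- as.integer(m)

# Length of each match is stored in the "match.length" attribute
len <- attr(m, "match.length")

# A trailing $ anchors the pattern to the end of the string
ends_in_a <- grepl("a$", fruits)

pos
len
ends_in_a
```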

Find out what regexec() and gregexpr() do. How are they different from the other regular expression functions we have covered?
