Bulletin

Course evaluation is now open! It will be closed on Tuesday, June 21 at 11:59pm. If more than 80% of you complete the evaluation, everyone will receive a bonus point!

Check your emails or visit duke.evaluationkit.com to complete the evaluation.

Getting Started

Introduction

Refer to STA199: Final Project - Tips for general guidelines on formatting your report.

Text

To make bullet points, leave a blank line and start each item with a hyphen (-).

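  • *italic*: wrap the text in single asterisks
  • **bold**: wrap the text in double asterisks
  • `function or variable`: wrap the text in backticks
  • [links](URL): put the link text in square brackets, followed by the URL in parentheses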
    • tab once for a sub-list
    • Q - Knit the document to PDF and see how the link appears.
    • If you know your PDF will be printed, add links-as-notes: TRUE to the YAML header to display URLs as footnotes in the knitted PDF.
  • LaTeX equations
    • inline: $y = \beta_0 + \beta_1x_1 + \epsilon$
    • display (on its own line): $$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1x_1$$
  • The default font size is 10pt. You can increase it to 11pt or 12pt. To change the font size to 11pt, add fontsize: 11pt to the YAML header.

Citation

Your report will include citations about the data source, previous research, and other sources as needed. At a minimum, you should have a citation for the data source.

Follow the steps below to easily include citations in your report:

  1. Prepare a .bib file. All of your bibliography entries will be stored in it. Let’s take a look at references.bib.

Q - Add a bibliography entry for the book R Markdown: The Definitive Guide to references.bib.

  • Search for the title in Google Scholar.
  • Click “Cite”.
  • Click “BibTeX”.
  • Copy the entire entry and paste it into references.bib.
  2. Include bibliography: references.bib in the YAML.

  3. At the end of the report, include ## References. This will list all of the references at the end of the document under the section “References”.

  4. If you want to include an Appendix after References, include the additional code shown at the end of this document.
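Entries for R itself do not have to be typed by hand. The sketch below uses base R's citation() and toBibtex() to produce a BibTeX entry that can be pasted into references.bib. The key rcore is a made-up name for illustration; the generated entry usually has an empty key, so one is filled in here.

```r
# Build a BibTeX entry for R itself using base R only.
entry <- as.character(toBibtex(citation()))

# The first line normally reads "@Manual{," with no key.
# Insert a hypothetical key ("rcore") so the entry can be cited as @rcore.
entry[1] <- sub("{,", "{rcore,", entry[1], fixed = TRUE)

# Print the entry; copy it into references.bib.
cat(entry, sep = "\n")
```

With that entry in references.bib, writing @rcore or [@rcore] in the text cites R in the same way as any other source.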

Here are examples of different in-text citations. Writing @key cites the authors as part of the sentence, while [@key] places the whole citation in parentheses:

  • In @gorman2014structural, the authors focus on the analysis of Adélie penguins.
  • Studies have examined whether environmental variability in the form of winter sea ice is associated with differences in male and female pre-breeding foraging niche [@gorman2014structural].

Q - Add a citation for R Markdown: The Definitive Guide in the sentence below.

The following code chunk is adapted from ______.

Options for Code Chunks

Code chunk options customize how code and output are displayed in the knitted R Markdown document. There are two ways to set code chunk options:

  • In the header of an individual code chunk
  • As a global setting to apply to all code chunks

A few options to change what we show / hide in the knitted document:

  • message = FALSE to hide messages.
  • warning = FALSE to hide warnings (with caution!)
    • While you conduct analyses, you should know what warning messages you receive and decide if it is okay to ignore them. For the final knitting, you may use warning = FALSE.
  • echo = FALSE to hide code.
    • For the project, you will set the option echo = FALSE to hide all code in your final report.
  • include = FALSE to run code but hide code, output, messages, warnings, etc.
    • Avoid using this option as a global setting.
  • eval = FALSE to not run code.
    • Useful when your code chunk is incomplete. You should not have any incomplete code chunks in your final report!
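As a minimal sketch of the global route, the call below sets the options recommended above for a final report. It assumes the call sits in the first (setup) chunk of the document; the values are one reasonable choice, not a required configuration.

```r
library(knitr)

# Global defaults for every chunk that follows: hide code, messages, and warnings.
opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)

# Inspect the current global value of a single option.
opts_chunk$get("echo")

# A single chunk can still override a global default in its own header,
# for example a header of the form {r show-this-code, echo = TRUE}.
```

Setting options globally keeps the report consistent, and the per-chunk header remains available for the occasional exception.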

Q - Clean up the mess below.

knitr::opts_chunk$set(message = TRUE, 
                      warning = TRUE, 
                      echo = TRUE,
                      fig.width = 6, # width of figure
                      fig.asp = .618, # aspect ratio of figure
                      out.width = "70%", # width relative to text
                      fig.align = "center" # figure alignment
                      )
library(knitr)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
theme_set(theme_bw()) # set global theme 

Q - Try echo = FALSE and take a look at the updated PDF.

Theme for Plotting

For ggplots, you can set a global theme (see the code chunk above). Browse the themes available in ggplot2 and choose your favorite. If your PDF may be printed, avoid gray or dark plot backgrounds.

You can also define a color palette of your own. In my_palette, you can give each color any name you like.

Q - Add “dukeblue” to my_palette. Hint: Look up its hex code.
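A minimal sketch of what my_palette might look like, assuming it is a named character vector of hex codes. The names and colors below are placeholders chosen for illustration, not the course's palette.

```r
# Hypothetical palette: names and hex codes are placeholders.
my_palette <- c(
  teal   = "#1B9E77",
  orange = "#D95F02",
  purple = "#7570B3"
)

# Look up a color by the name you gave it.
my_palette["orange"]

# Add another named color later by appending to the vector.
my_palette <- c(my_palette, gold = "#E6AB02")
```

Because the vector is named, a color can be referred to by a memorable label instead of its hex code.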

Customizing Plots

mpg %>% 
  ggplot(aes(y = manufacturer, fill = drv)) + 
  geom_bar()

We can do better than this.

Q - Incorporate the following recommended changes in the plot.

  • Axis labels begin with a capital letter.
  • The title of the fill legend is “The type of drive train”.
  • The levels of the fill legend are “front-wheel” for “f”, “rear-wheel” for “r”, and “four-wheel” for “4”, in that order.
  • Use fill colors from my_palette.
  • Manufacturer names begin with a capital letter.
  • Shorter bars appear at the top.
  • Add a caption “Counts of the type of drive train by manufacturer”.
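One possible way to make the changes listed above is sketched here. It assumes my_palette is a named vector of hex codes; the three colors are placeholders. The intermediate name mpg_clean and the axis label "Count" are also choices made for this example.

```r
library(tidyverse)

# Hypothetical palette: names and hex codes are placeholders.
my_palette <- c(teal = "#1B9E77", orange = "#D95F02", purple = "#7570B3")

mpg_clean <- mpg %>%
  mutate(
    # Capitalize manufacturer names, then order levels by frequency so the
    # most common manufacturer sits at the bottom and shorter bars end up on top.
    manufacturer = fct_infreq(str_to_title(manufacturer)),
    # Relabel the drive train codes in the requested order: f, r, 4.
    drv = factor(drv,
                 levels = c("f", "r", "4"),
                 labels = c("front-wheel", "rear-wheel", "four-wheel"))
  )

mpg_clean %>%
  ggplot(aes(y = manufacturer, fill = drv)) +
  geom_bar() +
  # unname() so colors are matched by position, not by palette name.
  scale_fill_manual(values = unname(my_palette)) +
  labs(
    x = "Count",
    y = "Manufacturer",
    fill = "The type of drive train",
    caption = "Counts of the type of drive train by manufacturer"
  )
```

Doing the relabeling in mutate() before plotting keeps the ggplot call short and makes the cleaned data reusable for other figures.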

Customizing Tables

Calculate the mean, median, and standard deviation of cty.

mpg %>% 
  summarize(mean_cty = mean(cty), median_cty = median(cty), sd_cty = sd(cty))
## # A tibble: 1 × 3
##   mean_cty median_cty sd_cty
##      <dbl>      <dbl>  <dbl>
## 1     16.9         17   4.26

We can also do better than this.

Q - Incorporate the following recommended changes in the table.

  • Print it as a nice table.
  • Results are rounded to 3 decimal places.
  • Column names are “mean mpg”, “median mpg”, “sd of mpg”.
  • Add a caption “Summary statistics of city miles per gallon (mpg) in city”.
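One way to apply these changes is with knitr::kable(), sketched below. The object name cty_table is chosen for this example; digits, col.names, and caption are standard kable() arguments.

```r
library(tidyverse)
library(knitr)

cty_table <- mpg %>%
  summarize(mean_cty = mean(cty), median_cty = median(cty), sd_cty = sd(cty)) %>%
  kable(
    digits = 3,  # round results to 3 decimal places
    col.names = c("mean mpg", "median mpg", "sd of mpg"),
    caption = "Summary statistics of city miles per gallon (mpg) in city"
  )

cty_table
```

Rounding inside kable() changes only the display, so the underlying summary values keep their full precision.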

Acknowledgements

These notes were adapted from the following: