%>%
media_background ggplot(aes(x=od)) +
geom_histogram(binwidth = 0.01) +
facet_wrap(~condition+conc)
R workshop
Learn how to use RStudio and the
tidyverse
for data exploration
Slides: Intro to RStudio, Quarto, and R
Download the data for the workshop
There are five files to download and place in your R project folder.
Each row of this table is a well on a plate that was measured. This has the bacterial isolate that was tested, the growth condition (long chain fatty acid) and the concentration of the long chain fatty acid.
plate_id
– an identifier of which plate was measured
row
– row of the plate (A-H)
column
– column of the plate (1-12)
bug
– the isolate/species that was tested in this well condition
- which long chain fatty acid (LCFA) was added conc
- what was the concentration of the LCFA in this well?
plate_id
– an identifier of which plate was measured
row
– row of the plate (A-H)
column
– column of the plate (1-12)
od
- the optical density that was measured in this well. This is a measure of bacterial growth.
Read in data
Data is often in tables, and the easiest way to store tabular data is in csv
or tsv
format.
csv
- comma separated values
tsv
- tab separated values
to read in data stored this way use read_csv(filename)
or read_tsv(filename)
<- read_csv("data/metadata.csv") metadata
bind
ing tables together
We have data from 4 different plates in separate csv
files. Use bind_rows
to make a single table with all the data.
<- read_csv("data/plate_1_data.csv")
plate_1 <- read_csv("data/plate_2_data.csv")
plate_2 <- read_csv("data/plate_3_data.csv")
plate_3 <- read_csv("data/plate_4_data.csv")
plate_4
<- bind_rows(plate_1, plate_2, plate_3, plate_4) all_plate_data
join
ing metadata to the data
Connect the metadata to the plate reader data using left_join
<- all_plate_data %>%
joined_data left_join(metadata)
Data visualization activity
On excalidraw.com or pen and paper, create a visualization to answer the questions below. Check out the data visualization cheatsheet for inspiration. Think about values you’d have to compute, or if you have everything you need already in our joined_data
table.
Option 1: What is the spread of values you typically see in the no media controls for different concentrations of LCFA? Bonus: are there any effects of where the well is on the plate with the values measured?
Option 2: What is the typical growth in the media of the different bugs without any LCFA?
Option 3: How do the different LCFAs tested impact the growth of L. crispatus?
Option 4: Are there any batch effects between plates? (What conditions are measured on multiple plates, and what conditions are only on one plate?)
Example answers for data analysis questions:
<- joined_data %>%
media_background filter(bug == "CTRL")
<- media_background %>%
media_background_medians group_by(plate_id, condition, conc) %>%
summarise(no_bacteria_control = median(od))
<- joined_data %>%
bug_no_lcfa_control filter(condition == "CTRL") %>%
filter(bug != "CTRL") %>%
left_join(media_background_averages) %>%
mutate(background_subtracted_od = od - no_bacteria_control)
<- bug_no_lcfa_control %>%
bug_no_lcfa_control_medians group_by(plate_id, bug) %>%
summarise(no_lcfa_control = median(background_subtracted_od)) %>%
ungroup()
<- joined_data %>%
relative_growths filter(condition != "CTRL", bug !="CTRL") %>%
left_join(media_background_medians) %>%
mutate(background_subtracted_od = od - no_bacteria_control) %>%
left_join(bug_no_lcfa_control_medians) %>%
mutate(relative_growth = background_subtracted_od/ no_lcfa_control)
%>%
bug_no_lcfa_control ggplot(aes(x=background_subtracted_od)) +
geom_histogram(binwidth = 0.05) +
facet_wrap(~bug)
%>%
relative_growths mutate(conc = as.factor(conc)) %>%
ggplot(aes(x=conc, y=relative_growth)) +
geom_boxplot(outlier.shape = NA) +
::geom_quasirandom() +
ggbeeswarmscale_y_continuous(trans=scales::log2_trans()) +
facet_wrap(~condition+bug, ncol=6)