EXAM CHECKPOINT | ~50 MIN | 65 QUESTIONS

Midterm 1 Checkpoint

Q1

What will this code output? sum(baseball$position == infield)

sum(baseball$position == infield)

Q2 SELECT ALL

Which lines of code correctly return the average BAT_AVG for the baseball dataset, ignoring NA values? Select ALL that apply.

Q3 FILL IN

Complete the argument to ignore NAs when calculating mean PIT_AVG:

mean(baseball$PIT_AVG, _____ = TRUE)

Q4

A student wants to plot the number of players at each position as a bar chart. They already have the data (baseball has a 'position' column with one row per player). Which geom should they use?

Q5 SELECT ALL

In the code below, which are VARIABLE aesthetics (mapped to data)? (Select all that apply)

ggplot(mendota, aes(x = year, y = duration)) +
  geom_point(aes(color = century, shape = century), size = 2)

Q6 FILL IN

Fill in the correct geom to add a horizontal reference line at the mean duration:

ggplot(mendota, aes(x = year, y = duration)) +
  geom_point() +
  _____(yintercept = mean(mendota$duration), color = "red")

Q7

The following code produces an error on geom_smooth(). What is the fix?

states %>%
  ggplot() +
  geom_point(aes(x = Income, y = Illiteracy)) +
  geom_smooth()

Q8 SELECT ALL

Which filter() conditions correctly find non-pitchers with more than 70 double plays AND fewer than 10 errors? Select ALL that apply.

Q9 FILL IN

Complete the join to return only players from baseball who have NOT previously won an award (from past_awardees, joined on id):

_____(baseball, past_awardees, by = "id")

Q10

Given the children height/weight data in LONG format (columns: name, measurement, value), complete case_when to convert height to cm and weight to kg:

data %>%
  mutate(metric = case_when(
    measurement == "height" ~ value * _____,
    measurement == "weight" ~ value * _____ ))

Q11 SELECT ALL

After running pivot_wider(data, names_from=measurement, values_from=value) on the children long-format data, which statements are TRUE? Select ALL that apply.

Q12

You want to join the 'Major' and 'Language' tables so that ONLY the 15 students present in BOTH tables are in the result. Which join is correct?

Q13 FILL IN

Fill in the correct pivot function to go from WIDE format (one column per language) to LONG format (one row per student-language pair):

Language %>%
  select(Name, English:Spanish) %>%
  _____(English:Spanish, names_to = "Language", values_to = "Fluent")

Q14 SELECT ALL

Which of the following scenarios should use geom_bar() (rather than geom_col(), geom_histogram(), or another geom)? Select ALL that apply.

Q15

What is the class() of the following vector? c(1, TRUE, "banana")

x <- c(1, TRUE, "banana")

Q16

Consider: rent <- 1200; utilities_adj <- utilities + '10'. Assuming utilities is not defined, what happens when this code runs?

Q17

What does class(TRUE) return?

Q18

Which of these is an ILLEGAL variable name in R?

Q19

Run these lines: x <- 5; x + 10. What is the value of x after both lines execute?

Q20

What is the result of 0 / 0 in R?

Q21

What does the $ operator do in R?

Q22

What does nchar('statistics') return?

Q23 SELECT ALL

Given x <- NA, which of the following lines of code will return NA? Select ALL that apply.

x <- NA
(A) x + 5
(B) is.na(x)
(C) mean(c(1, NA, 3))
(D) x == NA

Q24 FILL IN

Complete the code to combine 'Hello' and 'World' into a single string with a space between them:

Q25 SELECT ALL

Which of the following are VALID variable names in R? Select ALL that apply.

Q26

Given the starwars dataset (87 rows, 14 columns), what does starwars[1,] return?

Q27

What does starwars[,4] return?

Q28

What does starwars$hair_color return?

Q29

What does sum(c(TRUE, FALSE, TRUE, TRUE)) return?

Q30

What does c(2, TRUE, 'banana') produce in R?

Q31

Which code correctly checks if nothing is an NA value, where nothing <- NA?

Q32

What is the result of (height > 72) | (height + 6 < 70) when height = 60?

Q33 SELECT ALL

Given x <- c(10, 20, NA, 40), which of the following expressions evaluate to TRUE? (Select all that apply)

x <- c(10, 20, NA, 40)
# Which of these return TRUE (not NA)?

Q34 FILL IN

Fill in the correct function to check if a value is missing. Using == will NOT work:

Q35 SELECT ALL

Given x <- c(2, TRUE, "banana"), which statements are TRUE? Select ALL that apply.

Q36

You have the noble gases data with columns Gas and num_isotopes. You want bars showing the number of isotopes for each gas. Which geom should you use?

Q37

An airport wants to count the number of flights per carrier from a raw flights dataset. Which geom is best?

Q38

In ggplot(mtcars, aes(y = mpg)) + geom_boxplot(aes(fill = as.factor(cyl))), is fill global or local? Is it variable or constant?

Q39

Consider this code: ggplot() + geom_point(aes(x = Illiteracy, y = Income), size = 3). What does size = 3 do?

Q40

Which code adds a horizontal red line at the mean duration of the mendota dataset?

Q41

A zoologist wants a scatter plot of body length vs height with color varying by species, all points semi-transparent. What does alpha control?

Q42

The code ggplot() + geom_point(aes(x = Income, y = Illiteracy)) + geom_smooth() returns an error. What is the fix?

Q43 SELECT ALL

In the code below, which are VARIABLE aesthetics (mapped to data)? (Select all that apply)

ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point(size = 3, alpha = 0.7)

Q44 FILL IN

We want to color the INTERIOR of bars by a variable. Fill in the aesthetic name (not color, which colors the border):

Q45 SELECT ALL

Which of the following statements about geom_bar() and geom_col() are TRUE? Select ALL that apply.

Q46

You want to find pitchers with at least one shutout game (PIT_SHO > 0). Which filter condition is correct?

Q47

You want non-pitcher players with more than 70 double plays AND fewer than 10 errors. Which filter is correct?

Q48

You want to add a column metric_value where height (inches) is converted to cm (x 2.54) and weight (lbs) to kg (x 0.45). Which case_when is correct?

Q49

What does birthwt %>% group_by(smoke) %>% summarize(n = n()) compute?

Q50

What is the difference between summarize() and mutate() when used after group_by()?

Q51

What does slice_max(body_mass_g, n = 3) return?

Q52

What does drop_na() (no arguments) do to a dataframe?

Q53 SELECT ALL

The baseball dataset has a position column ('infield', 'pitcher', 'outfield'). Which filter() inputs correctly find pitchers with at least one shutout (PIT_SHO > 0)? Select ALL that apply.

Q54 FILL IN

Fill in the dplyr function that collapses grouped rows into summary statistics:

Q55 SELECT ALL

Which of the following dplyr operations keep ALL original rows of the dataframe? Select ALL that apply.

Q56

A hiring agency wants to interview students whose language fluency data IS available. The Major table has 20 students; Language has 50 students; 15 are in both. Which join gives only the 15 common students?

Q57

You want all players from the baseball dataset who have NOT previously won an award (stored in past_awardees). Which join do you use?

Q58

The long-format children dataset has 6 rows (3 kids x height/weight). After pivot_wider(names_from = measurement, values_from = value), what is true about wide_data?

Q59

The Language dataset has columns: Name, Mom_speaks, Dad_speaks, English, Chinese, French, Arabic, Spanish. You want only Name and the language columns (English through Spanish) in long format. Which code is correct?

Q60

Major has column Student_Name; Language has column Name. How do you join on these different key names?

Q61

What does left_join(A, B) do when A has a row with no match in B?

Q62

After pivot_wider on the children data (6 rows, 3 columns: name, measurement, value), which is TRUE about the resulting wide_data?

Q63 SELECT ALL

The 'students' table has 100 rows and 'grades' has 80 rows (only students who submitted work), joined on student_id. Which statements are TRUE? Select ALL that apply.

Q64 FILL IN

The 'Major' table uses 'Student_Name' but the 'Language' table uses 'Name'. Complete the join_by() to match them:

inner_join(Major, Language,
  by = join_by(Student_Name == _____))

Q65 SELECT ALL

Given a wide dataset with columns: Name, English, Chinese, French, Arabic, Spanish — which code correctly pivots to long format with columns Name, Language, Fluent? Select ALL that apply.
