MEDS Class of 2023

Program Learning Outcome (PLO) #1 Assessment - Core Knowledge

Author

Sam Csik

Published

June 9, 2023

Summary

This post-program PLO assessment was administered to the MEDS class of 2023 (response rate = 29 / 31 students) on June 7, 2023.

Heads-up! No pre-program PLO assessment data for the Class of 2023

This assessment of MEDS PLO #1 (Core Knowledge) was developed by the MEDS Curriculum Cohesion Committee in winter/spring 2023, so it was not administered to the MEDS Class of 2023 before they began the program in August 2022 (i.e. no pre-program assessment was conducted). As a result, no benchmark data are available for the Class of 2023.

The survey consists of 31 questions (28 multiple choice, 3 short free-response) and takes ~30 minutes to complete. Questions 1 - 15 ask respondents to rank how often they use certain data science tools or workflows, or to rate their familiarity/comfort with particular topics (see Part 1 through Part 4, below). Questions 16 - 31 assess respondents’ familiarity with, and application of, domain-specific knowledge/tools taught during the MEDS program (see Part 5 onward, below). Many of these questions are multi-part and begin with a question phrased as:

“How familiar/comfortable are you with X” (rank from 1 (never heard of it) to 5 (very familiar))
“How often have you done/implemented Y” (rank from 1 (never) to 5 (all the time))

If a respondent chooses a level of 2 or greater, they proceed to the remaining part(s) of the question, which test their knowledge/understanding of that topic. If a respondent chooses level 1, they skip to the next question (to prevent respondents from guessing at answers). A perfect score is 14 points. See the distribution of scores, below:

Individual Questions

Note

Questions that have a correct answer are color-coded green.

Part 1: OS and data/document storage

Part 2: How often do you currently use the following?

Part 3: Workflow satisfaction

Part 4: Rank the following from 1 (strongly disagree) to 5 (strongly agree)

Part 5: Stats

All students answered all parts of question 17

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to answer all parts of question 17.

Below is a chunk of code showing a simple linear regression relating the number of pieces of microplastics to the number of days per year with rainfall.

Question 17b raw responses

Some respondents recorded their answers in sentence form, while others did not round their answers to the nearest integer. Cleaned responses are shown in the plot, above. Responses as they were recorded are included in the table, below:

	Free Response Answer to Q17b
1	47
2	26
3	47
4	47
5	47
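
The cleaning step described above can be sketched roughly as follows. This is a hypothetical Python illustration, not the report's actual cleaning code (which is not shown here); the helper name `clean_numeric_response` and the sample sentence-form inputs are assumptions made up for the example.

```python
import math
import re
from typing import Optional


def clean_numeric_response(raw: str) -> Optional[int]:
    """Pull the first number out of a free-text answer and round it.

    Hypothetical helper for illustration; returns None if no number is found.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", raw)
    if match is None:
        return None
    # Round half up to the nearest integer (Python's round() rounds halves to even)
    return math.floor(float(match.group()) + 0.5)


# Made-up examples of the response styles described above
examples = ["47", "about 46.8 pieces", "I'm not sure"]
cleaned = [clean_numeric_response(r) for r in examples]
print(cleaned)  # [47, 47, None]
```

Returning `None` for answers without a number keeps unparseable responses visible as missing values rather than silently dropping them.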

All students answered all parts of question 18

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to answer all parts of question 18.

62% of respondents correctly answered question 18b (i.e. chose exactly the following options: normal, uniform, bimodal, symmetric)
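
"Chose exactly the following options" implies all-or-nothing scoring for select-all-that-apply questions: a response counts as correct only if it contains every correct option and no incorrect ones. A minimal sketch of that check, assuming each response is stored as a list of selected options (the function name `is_fully_correct` is hypothetical, and "variable" is used here as an example of an incorrect option):

```python
def is_fully_correct(selected, correct):
    """Return True only when the selected options match the correct set exactly."""
    return set(selected) == set(correct)


# Correct options for question 18b, as listed in the report
CORRECT_18B = {"normal", "uniform", "bimodal", "symmetric"}

# All four correct options, nothing extra
print(is_fully_correct(["normal", "uniform", "bimodal", "symmetric"], CORRECT_18B))  # True

# Missing two correct options
print(is_fully_correct(["normal", "uniform"], CORRECT_18B))  # False

# All correct options plus one incorrect option
print(is_fully_correct(["normal", "uniform", "bimodal", "symmetric", "variable"], CORRECT_18B))  # False
```

Comparing sets rather than lists makes the check independent of the order in which a respondent ticked the options.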

Part 6: Programming 1

All students advanced to answer question 19b

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to answer part b of question 19.

All students answered all parts of question 19

29 / 29 student respondents (100%) chose a comfort level of 2 or greater, and therefore were directed to part c of question 19.

The following code (in R) defines a function:

compute_turbine_power <- function(height, flowrate, efficiency, maxheight){
  
  if (height < maxheight) {
    
    power = height * flowrate * efficiency
    
  } else {
    
    power = maxheight * flowrate * efficiency
    
  }
  
  return(power)
  
}

This R code applies this function to data:

flowrate = 2
maxheight = 20
power_turbine_a <- compute_turbine_power(10, flowrate, 0.5, maxheight)
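
To trace the call above by hand: `height` (10) is less than `maxheight` (20), so the first branch runs and the result is 10 × 2 × 0.5 = 10. The sketch below is a Python transliteration of the R function, included only to illustrate the branching; it is not part of the assessment, and the second call is a hypothetical input chosen to exercise the `else` branch.

```python
def compute_turbine_power(height, flowrate, efficiency, maxheight):
    """Python transliteration of the R function above (illustration only)."""
    # Heights at or above maxheight are capped at maxheight
    effective_height = height if height < maxheight else maxheight
    return effective_height * flowrate * efficiency


flowrate = 2
maxheight = 20

# height = 10 is below maxheight = 20, so the first branch applies: 10 * 2 * 0.5
power_turbine_a = compute_turbine_power(10, flowrate, 0.5, maxheight)
print(power_turbine_a)  # 10.0

# Hypothetical second call: height = 30 exceeds maxheight, so 20 * 2 * 0.5 is used
print(compute_turbine_power(30, flowrate, 0.5, maxheight))  # 20.0
```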

Part 7: Environmental Modeling

Part 8: Geospatial Analysis & Remote Sensing

All students answered all parts of question 21

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to answer all parts of question 21.

100% of respondents correctly answered question 21b (i.e. chose exactly the following options: raster, vector)

All students answered all parts of question 22

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to answer all parts of question 22.

28 / 29 respondents advanced to Question 23b

28 / 29 student respondents (96.6%) chose a familiarity level of 2 or greater; these respondents were directed to answer Question 23b. One respondent selected 1 (never worked with it before) in response to Question 23a and skipped directly to Question 24.

27 / 29 respondents advanced to Question 24b

27 / 29 student respondents (93.1%) chose a familiarity level of 2 or greater; these respondents were directed to answer Question 24b. Two respondents selected 1 (never heard of it) in response to Question 24a and skipped directly to Question 25.
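
The skip logic and the percentages reported in these callouts can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the list of levels is reconstructed from the counts above (27 respondents at level 2 or higher, 2 at level 1) rather than taken from the raw survey data.

```python
def advances(level: int) -> bool:
    """Respondents who choose level 2 or higher see the follow-up part(s)."""
    return level >= 2


def percent_advancing(levels) -> float:
    """Share of respondents who advance, as a percentage rounded to one decimal."""
    n_advancing = sum(advances(level) for level in levels)
    return round(100 * n_advancing / len(levels), 1)


# Reconstructed from the counts for Question 24a; the specific level 3 is a placeholder
levels_q24a = [3] * 27 + [1] * 2

print(len(levels_q24a))                # 29
print(percent_advancing(levels_q24a))  # 93.1
```

The same calculation with 28 of 29 respondents advancing gives 96.6%, matching Question 23.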

Part 9: Machine Learning

All respondents advanced to Question 25b

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to Question 25b.

All respondents advanced to Question 25c

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to Question 25c.

All respondents advanced to Question 26b

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to Question 26b.

All respondents advanced to Question 26c

29 / 29 student respondents (100%) chose a familiarity level of 2 or greater, and therefore were directed to Question 26c.

75.9% of respondents correctly answered question 26c (i.e. chose exactly the following options: My model is unlikely to perform well when applied to new data, My model is overfitting the training set)

Part 10: Environmental Justice

Part 11: Data Viz & Communication

Identify 4 areas for improvement in the following data visualization that shows information about Michigan counties with highest college attendance.

Wordcloud of most frequently occurring words used to describe suggested improvements to the above data visualization (Question 30)
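
A word cloud like this is driven by word frequencies across the free responses. The sketch below shows one way such counts could be computed; it is an assumption-laden illustration rather than the report's actual method, the stop-word list is a small made-up sample, and the three strings are short excerpts of recorded Question 30 responses.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer)
STOP_WORDS = {"the", "a", "and", "don't", "more"}

# Short excerpts of recorded responses to Question 30
responses = [
    "Reduce the clutter, change the colors, add a title",
    "The colors don't mean anything",
    "color pallette, theme, axis values more concise",
]

word_counts = Counter()
for response in responses:
    # Lowercase, split into words, and drop stop words before counting
    words = re.findall(r"[a-z']+", response.lower())
    word_counts.update(w for w in words if w not in STOP_WORDS)

print(word_counts.most_common(1))  # [('colors', 2)]
```

Note that "colors" and "color" are counted separately here; stemming or lemmatizing the words first would merge them.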

Question 30 raw responses

Free responses as they were recorded are included in the table, below:


	Free Response Answer to Q30
1	y-axis is too populated, x axis labels and legend can be lowercased, labels on bars are redundant/distracting, color palette is distracting and doesnt convey new information, bars can be ordered from smallest to largest, gridlines can be removed
2	Reduce the clutter, change the colors, make the y axis have only some of the number (intervals of 10 maybe), improve axis label, add a title
3	Numbers on top of each column are hard to read. Too many colors that do not mean anything. Not in decending/acending order or easier visualization on the eyes, too many tick marks on the y axis, unnecessary legend as the names of the counties are already under each column, unhelpful axis titles
4	color pallette, theme, axis values more concise, tittle, no need for a legend
5	The colors don't mean anything. The order could be changed so it's in descending order. The numbers should be removed or placed higher. The x axis labels should be smaller and maybe tilted to be fully vertical. Apply theme minimal pls.


Part 12: Programming 2

# define function
def convert_F_to_C(temp_F):
  temp_C = (temp_F-32)*5/9
  return temp_C

# use function
convert_F_to_C(32)
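
As a quick check of what this snippet does, the call can be evaluated by hand: (32 − 32) × 5 / 9 = 0, so 32 °F maps to 0 °C. The sketch below repeats the function with its result printed; the second call (212 °F, the boiling point of water) is an added example, not part of the assessment.

```python
# define function
def convert_F_to_C(temp_F):
    temp_C = (temp_F - 32) * 5 / 9
    return temp_C


# freezing point of water: (32 - 32) * 5 / 9
print(convert_F_to_C(32))   # 0.0

# boiling point of water: (212 - 32) * 5 / 9
print(convert_F_to_C(212))  # 100.0
```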

End MEDS Class of 2023 PLO Assessment Report


--- title: "MEDS Class of 2023" subtitle: "Program Learning Outcome (PLO) #1 Assessment - Core Knowledge" author: "Sam Csik" date: June 9, 2023 format: html: toc: true toc-location: left code-tools: source: true toggle: false theme: - styles.scss mainfont: Nunito execute: eval: true echo: false message: false warning: false editor_options: chunk_output_type: console --- ```{r} #..........................load packages......................... library(googlesheets4) library(tidyverse) library(janitor) library(showtext) library(ggtext) library(DT) library(tidytext) library(wordcloud) library(scales) #........................import functions........................ source("functions.R") #..........................import data........................... medsJune2023 <- read_sheet("https://docs.google.com/spreadsheets/d/1Sq1rOmBP-g6iOCBS7NoexCj4K-j3OpeErMJ2Hxd3AHo/edit?usp=sharing") #...........................clean data........................... medsJune2023_clean <- clean_PLO_data(medsJune2023) #......................import Google fonts....................... 
sysfonts::font_add_google(name = "Sanchez", family = "sanchez") sysfonts::font_add_google(name = "Nunito", family = "nunito") # automatically use showtext to render text for future devices ---- showtext::showtext_auto() ``` # **Summary** {{< include /sections/class2023_after/summary.qmd >}} ```{r} #| fig-align: center scores <- medsJune2023_clean |> select(sc0) mean_score <- mean(scores$sc0) median_score <- median(scores$sc0) ggplot(scores, aes(x = sc0)) + geom_histogram(binwidth = 1, color = "white", fill = "#047C91") + geom_vline(xintercept = median_score, linetype = "dashed", color = "black") + annotate(geom = "label", x = 10.5, y = 8, label = paste0("Median Score = ", median_score), hjust = "right") + annotate(geom = "segment", x = 10.5, y = 8, xend = median_score, yend = 7.5, arrow = arrow(length = unit(3, "mm"))) + scale_x_continuous(breaks = seq(1, 14, 1)) + labs(x = "Score", y = "Number of MEDS studnets", title = "Distribution of scores", caption = "Out of 14 available points") + meds_theme() ``` # **Individual Questions** ::: {.callout-note} Questions that have a correct answer are color-coded green. ::: ## **Part 1: OS and data/document storage** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 1: What OS? ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ q1_os_data <- clean_q1_os_pre(medsJune2023_clean) plot_q1_os_pre(q1_os_data) ``` ```{r} #| fig-cap: "NOTE: Percentages will not sum to 100%" ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 2: Where do you store data? 
---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # combine written-in "server" option ---- server <- c("Server", "Taylor server", "external server") # wrangle ---- q2_store_data <- clean_q2_store_data_pre(medsJune2023_clean) # plot ---- plot_q2_store_data_pre(q2_store_data, survey = "Pre") ``` ## **Part 2: How often do you currently use the following?** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 3: GUI ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q3_gui_data <- clean_q3_gui_pre(medsJune2023_clean) # plot ---- plot_frequency_use_pre(data = q3_gui_data, title = "A specialized software with a point-and-click graphical user\ninterface (e.g., for statistical analysis: SPSS, SAS...;for Geospatial\nanalysis: ArcGIS, QGIS...; for Genomics analysis: Geneious, …)", caption = "Question 3", survey = "Pre") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 4: Programming Languages ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q4_prog_lang_data <- clean_q4_prog_lang_pre(medsJune2023_clean) # plot ---- plot_frequency_use_pre(data = q4_prog_lang_data, title = "Programming languages (R, Python, etc.)", caption = "Question 4", survey = "Pre") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 5: Databases ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q5_databases_data <- clean_q5_databases_pre(medsJune2023_clean) # plot ---- plot_frequency_use_pre(data = q5_databases_data, title = "Databases (SQL, Access, etc.)", caption = "Question 5", survey = "Pre") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 6: Version Control ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q6_version_control_data <- clean_q6_version_control_pre(medsJune2023_clean) # plot ---- plot_frequency_use_pre(data = q6_version_control_data, title = "Version control software (Git, Subversion (SVN), Mercurial, etc.)", caption = "Question 6", survey = "Pre") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 7: 
Command Shell ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q7_command_shell_data <- clean_q7_command_shell_pre(medsJune2023_clean) # plot ---- plot_frequency_use_pre(data = q7_command_shell_data, title = "A command shell (usually accessed through Terminal on macOS or\nPowerShell on Windows)", caption = "Question 7", survey = "Pre") ``` ## **Part 3: Workflow satisfaction** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 8: Workflow Satisfaction ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q8_workflow_satisfaction_data <- clean_q8_workflow_satisfaction_pre(medsJune2023_clean) # plot ---- plot_q8_workflow_satisfaction_pre(q8_workflow_satisfaction_data) ``` ## **Part 4: Rank the following from 1 (strongly disagree) to 5 (strongly agree)** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 9: Raw Data ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q9_raw_data_data <- clean_q9_raw_data_pre(medsJune2023_clean) # plot ---- plot_rank_data_pre(data = q9_raw_data_data, title = "Having access to the original, raw data is important to be able to\nrepeat an analysis", caption = "Question 9") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 10: Small Program ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q10_small_program_data <- clean_q10_small_program(medsJune2023_clean) # plot ---- plot_rank_data_pre(data = q10_small_program_data, title = "I can write a small program, script, or macro to address a problem\nin my own work", caption = "Question 10") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 11: Find Help Online ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q11_find_help_online_data <- clean_q11_find_help_online(medsJune2023_clean) # plot ---- plot_rank_data_pre(data = q11_find_help_online_data , title = "I know how to search for answers to my technical questions online", caption = "Question 11") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
## ~ Question 12: Overcoming Problems ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q12_overcoming_problems_data <- clean_q12_overcoming_problems(medsJune2023_clean) # plot ---- plot_rank_data_pre(data = q12_overcoming_problems_data, title = "While working on a programming project, if I get stuck, I can find\nways of overcoming the problem", caption = "Question 12") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 13: Confidence ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q13_confidence_data <- clean_q13_confidence(medsJune2023_clean) # plot ---- plot_rank_data_pre(data = q13_confidence_data, title = "I am confident in my ability to make use of programming software\nto work with data", caption = "Question 13") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 14: Easier Analyses ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q14_easier_analysis_data <- clean_q14_easier_analysis(medsJune2023_clean) # plot ---- plot_rank_data_pre(data = q14_easier_analysis_data, title = "Using a programming language (like R or Python) can make my\nanalyses easier to reproduce", caption = "Question 14") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 15: Increase Efficiency ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q15_increase_efficiency_data <- clean_q15_increase_efficiency(medsJune2023_clean) # plot ---- plot_rank_data_pre(data = q15_increase_efficiency_data, title = "Using a programming language (like R or Python) can make me\nmore efficient at working with data", caption = "Question 15") ``` ## **Part 5: Stats**  ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 16a: Median ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q16a_median_data <- clean_q16a_median_pre(medsJune2023_clean) # plot ---- pal <- c("14" = "#7ECD7A", "0" = "#047C91", "10" = "#047C91") plot_q16a_median_pre(q16a_median_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 16b: 
Mode ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q16b_mode_data <- clean_q16b_mode_pre(medsJune2023_clean) # plot ---- pal <- c("14" = "#7ECD7A") plot_q16b_mode_pre(q16b_mode_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 17a: Linear Regression Familiarity ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q17a_familiarity_lr_data <- clean_q17a_familiar_lr_pre(medsJune2023_clean) # plot ---- plot_q17a_familiar_lr_pre(q17a_familiarity_lr_data) ``` {{< include /sections/class2023_after/q17-callout-inline-code.qmd >}}  Below is a chunk of code showing a simple linear regression relating the number of pieces of microplastics to the number of days per year with rainfall. ```{r} #| fig-align: center knitr::include_graphics("images/17-microplastics-lm.png") ```  ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 17b: Microplastics ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q17b_microplastics_data <- clean_q17b_microplastics_pre(medsJune2023_clean) # plot ---- pal <- c("47" = "#7ECD7A", "27" = "#047C91", "900" = "#047C91", "26" = "#047C91", "956" = "#047C91","43" = "#047C91", "911" = "#047C91", "46" = "#047C91") plot_q17b_microplastics_pre(q17b_microplastics_data) ``` ::: {.callout-note} ## Question 17b raw responses Some respondents recorded their answers in sentence form, while others did not round their answers to the nearest integer. Cleaned responses are shown in the plot, above. Responses as they were recorded are included in the table, below: ```{r} #............................wrangle............................. 
q17_microplastics_dt <- medsJune2023_clean |> # select necessary cols ---- select(microplastics_lr) DT::datatable(q17_microplastics_dt, colnames = c("Free Response Answer to Q17b"), options = list(autoWidth = TRUE, pageLength = 5, lengthMenu = c(5, 10, 20, 30), dom = 'ltp') ) ``` ::: ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 18a: Probability Distribution Familiarity ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q18a_familiar_prob_dist_data <- clean_q18a_familiar_prob_dist_pre(medsJune2023_clean) # plot ---- plot_q18a_familiar_prob_dist_pre(q18a_familiar_prob_dist_data) ``` {{< include /sections/class2023_after/q18-callout-inline-code.qmd >}} ```{r} #| column: margin #| fig-cap: "62% of respondents correctly answered question 18b (i.e. chose exactly the following options: normal, uniform, bimodal, symmedtric)" ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 18b: Probabilty Distribution Terms ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # for calculating percentages in wrangling step below ---- q18b_num_answers <- medsJune2023_clean |> select(prob_dist_terms) |> count() |> pull() # wrangle (FULLY CORRECT; margin plot) ---- q18b_prob_dist_FULLY_CORRECT_data <- clean_q18b_FULLY_CORRECT_pre(medsJune2023_clean) # plot (FULLY CORRECT; margin plot) ---- pal <- c("yes" = "#7ECD7A", "no" = "#047C91") plot_q18b_FULLY_CORRECT_pre(q18b_prob_dist_FULLY_CORRECT_data) ``` ```{r} # wrangle (INDIV RESPONSES) ---- q18b_prob_dist_terms_data <- clean_q18b_prob_dist_terms_pre(medsJune2023_clean) # plot (INDIV RESPONSES) ---- pal <- c("normal" = "#7ECD7A", "uniform" = "#7ECD7A", "bimodal" = "#7ECD7A", "symmetric" = "#7ECD7A", "variable" = "#047C91") plot_q18b_prob_dist_terms_pre(q18b_prob_dist_terms_data) ``` ## **Part 6: Programming 1** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 19a: Familiarity with Functions ---- 
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # for calculating percentages in wrangling step below ---- q19_num_answers <- medsJune2023_clean |> select(term_function) |> count() |> pull() # wrangle ---- q19a_familiar_functions_data <- clean_q19a_familiar_functions_pre(medsJune2023_clean) # plot ---- plot_q19a_familiar_functions_pre(q19a_familiar_functions_data) ``` {{< include /sections/class2023_after/q19b-callout-inline-code.qmd >}} ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 19b: Writing Functions ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q19b_writing_functions_data <- clean_q19b_writing_functions_pre(medsJune2023_clean) # plot ---- plot_q19b_writing_functions_pre(q19b_writing_functions_data) ``` {{< include /sections/class2023_after/q19c-callout-inline-code.qmd >}} {{< include /sections/all_classes/q19c-turbine-function.qmd >}} ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 19c: Function Output ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q19c_fxn_output_data <- clean_q19c_fxn_output_pre(medsJune2023_clean) # plot ---- pal <- c("10" = "#7ECD7A") plot_q19c_fxn_output_pre(q19c_fxn_output_data) ``` ## **Part 7: Environmental Modeling** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 20a: Run Environmental Model ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q20a_run_env_mod_data <- clean_q20a_run_env_mod_pre(medsJune2023_clean) # plot ---- plot_q20a_run_env_mod_pre(q20a_run_env_mod_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 20b: Sensitivity Analysis ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q20b_sa_data <- clean_q20b_sa_pre(medsJune2023_clean) # plot ---- plot_q20b_sa_pre(q20b_sa_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 20c: Parameter Interactions ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q20c_param_int_data <- 
clean_q20c_param_int_pre(medsJune2023_clean) # plot ---- pal <- c("a global sensitivity analysis" = "#7ECD7A", "a local sensitivity analysis" = "#047C91", "I'm not sure" = "#047C91", "No response" = "#047C91") plot_q20c_param_int_pre(q20c_param_int_data) ``` ## **Part 8: Geospatial Analysis & Remote Sensing** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 21a: Comfort with Spatial Data ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # for calculating percentages in wrangling step below ---- q21a_num_answers <- medsJune2023_clean |> select(spatial_data) |> count() |> pull() # wrangle ---- q21a_comfort_spatial_data <- clean_q21a_comfort_spatial_pre(medsJune2023_clean) # plot ---- plot_q21a_comfort_spatial_pre(q21a_comfort_spatial_data) ``` {{< include /sections/class2023_after/q21-callout-inline-code.qmd >}} ```{r} #| column: margin #| fig-cap: "100% of respondents correctly answered question 21b (i.e. chose exactly the following options: raster, vector)" ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 21b: Representing Spatial Data ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # for calculating percentages in wrangling step below ---- q21b_num_answers <- medsJune2023_clean |> select(rep_spatial_data) |> count() |> pull() # wrangle (FULLY CORRECT; margin) ---- q21b_FULLY_CORRECT_data <- clean_q21b_FULLY_CORRECT_pre(medsJune2023_clean) # plot (FULLY CORRECT; margin) ---- pal <- c("yes" = "#7ECD7A", "no" = "#047C91") plot_q21b_FULLY_CORRECT_pre(q21b_FULLY_CORRECT_data) ``` ```{r} # wrangle (INDIV RESPONSES) ---- q21b_rep_spatial_data <- clean_q21b_rep_spatial_pre(medsJune2023_clean) # plot (INDIV RESPONSES) ---- pal <- c("vector" = "#7ECD7A", "raster" = "#7ECD7A", "tabular" = "#047C91", "relational" = "#047C91") plot_q21b_rep_spatial_pre(q21b_rep_spatial_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 21c: Vector vs. 
Raster ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q21c_vec_ras_data <- clean_q21c_vec_ras_pre(medsJune2023_clean) # plot ---- pal <- c("vector" = "#7ECD7A", "raster" = "#047C91", "I'm not sure" = "#047C91") plot_q21c_vec_ras_pre(q21c_vec_ras_data) ``` ```{r} #| fig-align: center knitr::include_graphics("images/21c-vector.png") ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 22a: Comfort with Remote Sensing ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q22a_comfort_rs_data <- clean_q22a_comfort_rs_pre(medsJune2023_clean) # plot ---- plot_q22a_comfort_rs_pre(q22a_comfort_rs_data) ``` {{< include /sections/class2023_after/q22-callout-inline-code.qmd >}} ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 22b: Reflected Radiation ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q22b_rs_sun_data <- clean_q22b_rs_sun_pre(medsJune2023_clean) # plot ---- pal <- c("passive" = "#7ECD7A", "active" = "#047C91", "I'm not sure" = "#047C91") plot_q22b_rs_sun_pre(q22b_rs_sun_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 23a: Map Projections ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # for calculating percentages in wrangling step below ---- q23a_num_answers <- medsJune2023_clean |> select(map_proj_comfort) |> count() |> pull() # wrangle ---- q23a_comfort_map_proj_data <- clean_q23a_comfort_map_proj_pre(medsJune2023_clean) # plot ---- plot_q23a_comfort_map_proj_pre(q23a_comfort_map_proj_data) ``` {{< include /sections/class2023_after/q23-callout-inline-code.qmd >}} ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 23b: Reprojection ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q23b_reproj_data <- clean_q23b_reproj_pre(medsJune2023_clean) # plot ---- pal <- c("3D to 2D" = "#7ECD7A", "Imprecise locations to precise locations" = "#047C91", "Meters to latitude/longitude" = "#047C91", "No response" = "#047C91") 
plot_q23b_reproj_pre(q23b_reproj_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 24a: Familiarity with Reflectance Spectra ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q24a_familiarity_rs_data <- clean_q24a_familiarity_rs_pre(medsJune2023_clean) # plot ---- plot_q24a_familiarity_rs_pre(q24a_familiarity_rs_data) ``` {{< include /sections/class2023_after/q24-callout-inline-code.qmd >}} ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 24b: Vegetation Wavelength ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q24b_veg_wave_data <- clean_q24b_veg_wave_pre(medsJune2023_clean) # plot ---- pal <- c("green" = "#7ECD7A", "red" = "#047C91", "blue" = "#047C91", "blue and thermal" = "#047C91", "I'm not sure" = "#047C91", "No response" = "#047C91") plot_q24b_veg_wave_pre(q24b_veg_wave_data) ``` ## **Part 9: Machine Learning** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 25a: Familiarity with ML ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q25a_familiar_ml_data <- clean_q25a_familiar_ml_pre(medsJune2023_clean) # plot ---- plot_q25a_familiar_ml_pre(q25a_familiar_ml_data) ``` {{< include /sections/class2023_after/q25a-callout-inline-code.qmd >}} ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 25b: Unsupervised Learning Algorithm ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q25b_unsup_alg_data <- clean_q25b_unsup_alg_pre(medsJune2023_clean) # plot ---- plot_q25b_unsup_alg_pre(q25b_unsup_alg_data) ``` {{< include /sections/class2023_after/q25b-callout-inline-code.qmd >}} ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 25c: Kmeans ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q25c_kmeans_data <- clean_q25c_kmeans_pre(medsJune2023_clean) # plot ---- pal <- c("unsupervised, does not require expert labeling of data" = "#7ECD7A", "unsupervised, 
requires expert labeling of data" = "#047C91", "supervised, does not require expert labeling of data" = "#047C91", "supervised and requires expert labeling of data" = "#047C91", "I'm not sure" = "#047C91") plot_q25c_kmeans_pre(q25c_kmeans_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 26a: Dividing Data ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q26a_div_data <- clean_q26a_div_data_pre(medsJune2023_clean) # plot ---- plot_q26a_div_data_pre(q26a_div_data) ``` {{< include /sections/class2023_after/q26a-callout-inline-code.qmd >}} ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 26b: Train, Validate, Split ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q26b_tvs_data <- clean_q26b_tvs_pre(medsJune2023_clean) # plot ---- plot_q26b_tvs_pre(q26b_tvs_data) ``` {{< include /sections/class2023_after/q26b-callout-inline-code.qmd >}} ```{r} #| column: margin #| fig-cap: "75.9% of respondents correctly answered question 26c (i.e. 
chose exactly the following options: My model is likely to perform very well when applied to new data, My test set has data entry errors in it)" ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 26c: Model Performance ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # for calculating percentages in wrangling step below ---- q26c_num_answers <- medsJune2023_clean |> select(learning_from_model) |> count() |> pull() # wrangle (FULLY CORRECT) ---- q26c_FULLY_CORRECT_data <- clean_q26c_FULLY_CORRECT_pre(medsJune2023_clean) # plot (FULLY CORRECT) ---- pal <- c("yes" = "#7ECD7A", "no" = "#047C91") plot_q26c_FULLY_CORRECT_pre(q26c_FULLY_CORRECT_data) ``` ```{r} # wrangle (INDIV RESPONSES) ---- q26c_mod_perf_data <- clean_q26c_mod_perf_pre(medsJune2023_clean) # plot (INDIV RESPONSES) ---- pal <- c("My model is unlikely to perform well when applied to new data" = "#7ECD7A", "My model is overfitting the training set" = "#7ECD7A", "My model is likely to perform very well when applied to new data" = "#047C91", "My test set has data entry errors in it" = "#047C91") plot_q26c_mod_perf_pre(q26c_mod_perf_data) ``` ## **Part 10: Environmental Justice** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 27: Data Justice ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q27_data_justice_data <- clean_q27_data_justice_pre(medsJune2023_clean) # plot ---- plot_q27_data_justice_pre(q27_data_justice_data) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 28: Bias ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q28_bias_data <- clean_q28_bias_pre(medsJune2023_clean) # plot ---- plot_q28_bias_pre(q28_bias_data) ``` ## **Part 11: Data Viz & Communication** ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 29: Create Data Viz ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q29_create_viz_data <- clean_q29_create_viz_pre(medsJune2023_clean) # plot ---- plot_q29_create_viz_pre(q29_create_viz_data) ``` #### Identify 4 areas for improvement 
in the following data visualization that shows information about Michigan counties with highest college attendance. ```{r} #| fig-align: center knitr::include_graphics("images/30-plot.png") ``` ```{r} #| fig-width: 3 #| fig-align: center #| fig-cap: "Wordcloud of most frequently occurring words used to describe suggested improvements to the above data visualization (Question 30)" #| fig-cap-location: top ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 30: Improve Data Viz ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q30_improve_dv <- clean_q30_improve_dv_pre(medsJune2023_clean) # plot ---- plot_q30_improve_dv_pre(q30_improve_dv) ``` ::: {.callout-note} ## Question 30 raw responses Free responses as they were recorded are included in the table, below: ```{r} # wrangle ---- q30_improve_dv <- medsJune2023_clean |> # select necessary cols ---- select(improve_data_viz) # create table ---- DT::datatable(q30_improve_dv, colnames = c("Free Response Answer to Q30"), options = list(autoWidth = TRUE, pageLength = 5, lengthMenu = c(5, 10, 20, 30), dom = 'ltp') ) ``` ::: ## **Part 12: Programming 2** ```{r} #| eval: false #| echo: true # define function def convert_F_to_C(temp_F): temp_C = (temp_F-32)*5/9 return temp_C # use function convert_F_to_C(32) ``` ```{r} ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## ~ Question 31: What Language ---- ##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # wrangle ---- q31_lang_data <- clean_q31_lang_pre(medsJune2023_clean) # plot ---- pal <- c("Python" = "#7ECD7A", "R" = "#047C91", "SQL" = "#047C91") plot_q31_lang_pre(q31_lang_data) ``` <br> ::: {.center-text} ***End MEDS Class of 2023 PLO Assessment Report*** ::: <br> ::: {.center-text} *Return to [main page](index.html)* :::