
Teach Me How to Google

The case for debugging & search skills in the age of AI + tips on how to do so effectively

Published: October 11, 2021

Last updated: Jul 28, 2025

Sam Shanny-Csik |
Lecturer & Data Training Coordinator

Master of Environmental Data Science |
Bren School of Environmental Science & Management

Let’s address the elephant in the room . . .

Image source: Wikipedia

Generative AI tools (e.g. ChatGPT) are everywhere now (and maybe you’re already using them!)
Even Google provides an AI summary with each query
Does it even pay to “Google,” in the traditional sense, anymore?

We argue, YES!

Evidence suggests that overreliance on ChatGPT can erode critical thinking skills

Group 1: could use ChatGPT

low brain engagement
“soulless,” lacked originality
copy / pasting by 3rd essay

Group 2: could use Google

also high levels of brain activity and satisfaction!

Group 3: only their brains!

high neural connectivity
engaged / curious
claimed ownership & expressed higher satisfaction

Evidence suggests that overreliance on ChatGPT can erode critical thinking skills

After 3 essays, everyone was asked to re-write one of their previous essays, but Group 1 could no longer use ChatGPT, while Group 2 & Group 3 could now use ChatGPT

Group 1: difficulty remembering, weaker alpha & theta brain waves (creative ideation & memory load); suggests that they didn’t integrate work into their memory networks

Group 2: performed well, significant increase in brain connectivity across all bands; suggests that if used properly, AI can enhance learning as opposed to diminishing it

“The LLM undeniably reduced the friction involved in answering participants’ questions compared to the Search Engine. However, this convenience came at a cognitive cost, diminishing users’ inclination to critically evaluate the LLM’s output or “opinions” (probabilistic answers based on the training datasets). This highlights a concerning evolution of the ‘echo chamber’ effect: rather than disappearing, it has adapted to shape user exposure through algorithmically curated content. What is ranked as “top” is ultimately influenced by the priorities of the LLM’s shareholders.”

Other motivating findings

“Moreover, while GenAI can improve worker efficiency, it can inhibit critical engagement with work and can potentially lead to long-term overreliance on the tool and diminished skill for independent problem-solving. Higher confidence in GenAI’s ability to perform a task is related to less critical thinking effort. When using GenAI tools, the effort invested in critical thinking shifts from information gathering to information verification; from problem-solving to AI response integration; and from task execution to task stewardship. Knowledge workers face new challenges in critical thinking as they incorporate GenAI into their knowledge workflows.”

Other motivating findings

“Students who substitute some of their learning activities with LLMs (e.g., by generating solutions to exercises) increase the volume of topics they can learn about but decrease their understanding of each topic. Students who complement their learning activities with LLMs (e.g., by asking for explanations) do not increase topic volume but do increase their understanding. We also observe that LLMs widen the gap between students with low and high prior knowledge.”

And one final good analogy

“Using LLMs requires a good baseline knowledge of R [or other languages] to actually be useful. A good analogy for this is with recipes. ChatGPT is really confident at spitting out plausible-looking recipes. A few months ago, for fun, I asked it to give me a cookie recipe. I got back something with flour, eggs, sugar, and all other standard-looking ingredients, but it also said to include 3/4 cup of baking powder. That’s wild and obviously wrong, but I only knew that because I’ve made cookies before.”

GenAI in the MEDS calendar

Term	Incorporation of GenAI
SUMMER	Establish context: student use is discouraged
FALL	Critical interrogation: instructors demonstrate examples of use and discuss pros / cons
WINTER	Guided use: workshops early in quarter; instructors model use
SPRING	Supported use: instructors model use

You’re here because you want to learn! ChatGPT (and related tools) will certainly become a part of your workflow, but in this early stage of MEDS, we want you to focus on core competencies and critical thinking skills, including an understanding of how to properly use tools, design workflows, write and organize code, and troubleshoot problems.

To do that most effectively, you need to commit to active learning processes and approaches.

Welcome to data science, where questions are aplenty!

You will become increasingly more comfortable with not immediately knowing the answers to all your coding problems (even when using GenAI tools). It’s all part of the job.

Googling can be difficult, and it is a skill that requires practice. But you can and will get better at it over time.

-Me, every time I sit down to program

It doesn’t mean you won’t still feel like this at times:

-Me still, about half the times I sit down to program

But the goal is to be a bit more at peace with that feeling…and have the confidence that you can find your way

Artwork by Allison Horst

I typically find myself turning to Google because:

I got an error and need help fixing it

I know what I want my code to do, but I have no idea how to actually pull it off

Sometimes, it’s both of these things happening at the same time

We’ve all been here before:

Artwork by Allison Horst

Pause, exhale, narrow down your potential Google search

Restart R

Check the easy stuff

Read that error message

Try to isolate the problem

Double-check the documentation

Talk about it out loud

Restart R

“Restart R often, especially when things get weird…We install and update packages from R, which is a little bit like working on your airplane engine while you’re flying.”

-Jenny Bryan, in her 2020 rstudio::conf keynote, Object of type ‘closure’ is not subsettable

Similarly, going to sleep and trying again tomorrow is a legitimate (and often impactful) strategy – think of it as restarting your own internal computer (i.e. your brain).

Check the easy stuff

Read that error message

# load packages ----
library(tidyverse) # a collection of data wrangling & visualization packages
library(palmerpenguins) # contains the 'penguins' data set

# print out the first three rows of the penguins data frame ----
head(penguins, 3)

# A tibble: 3 × 8
  species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
  <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
1 Adelie  Torgersen           39.1          18.7               181        3750
2 Adelie  Torgersen           39.5          17.4               186        3800
3 Adelie  Torgersen           40.3          18                 195        3250
# ℹ 2 more variables: sex <fct>, year <int>

# what unique values are in the species column of the penguins data frame? ----
unique(penguins$species)

[1] Adelie    Gentoo    Chinstrap
Levels: Adelie Chinstrap Gentoo

Read that error message

# create a new data frame with just rows (observations) containing "Gentoo" penguins ----
gentoo <- penguins |> 
  filter(species = "Gentoo")

Error in `filter()`:
! We detected a named input.
ℹ This usually means that you've used `=` instead of `==`.
ℹ Did you mean `species == "Gentoo"`?

Returns a helpful error message with a potential fix!
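
To see why the distinction matters, here is a minimal base R sketch. The birds data frame is a made-up stand-in for penguins (hypothetical values), so no packages are needed. The idea it tries to show: `==` compares values and returns TRUE / FALSE, while `=` assigns a value to a name.

```r
# a small, made-up stand-in for the penguins data (hypothetical values) ----
birds <- data.frame(
  species = c("Adelie", "Gentoo", "Chinstrap", "Gentoo"),
  body_mass_g = c(3750, 5000, 3700, 5400)
)

# `==` compares each value and returns a logical vector ----
is_gentoo <- birds$species == "Gentoo"
is_gentoo

# use that logical vector to keep only the matching rows ----
gentoo <- birds[is_gentoo, ]
gentoo

# `=` does something different: it assigns a value to a name ----
species = "Gentoo"
species
```

Inside filter(), a single `=` looks like a named argument rather than a comparison, which is why the error message suggests `==`.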

Read that error message

# create data object, named 'dat' ----
dat <- data.frame(x = 1, y = 2)
dat

  x y
1 1 2

# extract column 'x' from your data object (oops, we forgot we named it 'dat' and not 'df') ----
df$x

Error in df$x: object of type 'closure' is not subsettable

“Your first “object of type ‘closure’ is not subsettable” error message is a big milestone for an R user. Congratulations, if there was any lingering doubt, you now know that you are officially programming!”

-Jenny Bryan, in her 2020 rstudio::conf keynote, Object of type ‘closure’ is not subsettable

This error often arises when you attempt to subset a function (i.e. treat a function in a way that it shouldn’t be; a “closure” is a type of function in R). Here, we forgot that we called our object dat, and not df. df() also happens to be a function that gives you the density of the ‘F’ distribution and we are attempting to subset (i.e. extract) a column (x) from it.
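
If you want to check this for yourself, the following is a minimal sketch using only base R. It assumes a fresh session in which you have not created your own object named df. tryCatch() is used here just to capture the error message as text rather than stopping the script.

```r
# create data object, named 'dat' ----
dat <- data.frame(x = 1, y = 2)

# what is `df`, if we never created it? ----
is.function(df) # is it a function?
typeof(df)      # R's internal type for most functions is "closure"

# capture the error message instead of stopping ----
msg <- tryCatch(df$x, error = function(e) conditionMessage(e))
msg

# subsetting the object we actually created works ----
dat$x
```

Asking "what is this object, really?" with is.function(), typeof(), or class() is often a quick first check when an error mentions an unexpected type.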

Read that error message

Error messages provide helpful context and information, even if they seem unhelpful on the surface!

You’ll become more familiar with common error messages the more time you spend coding, but it can be helpful to explore some resources for deciphering the big ones:

Artwork by Allison Horst

Try to isolate the problem

It can be overwhelming to figure out where an error or issue is occurring in a large chunk of code. A small example:

# load libraries ----
library(dplyr)
library(palmerpenguins)

# wrangle data ----
penguins_new <- penguins |> 
  select(species, sex, bill_length_mm) |> 
  filter(species == "Adelie") |>
  reorder(bill_length_mm)

Error: object 'bill_length_mm' not found

Running all lines together can make it difficult to tell which line(s) is responsible for this error (and imagine dealing with much longer, more complex code chunks!).

Instead, run line-by-line to isolate where the problem is occurring so that you can begin investigating from there.

Try to isolate the problem

Run line-by-line until you hit the error:

penguins_new <- penguins |> 
  select(species, sex, bill_length_mm) # |> 
  # filter(species == "Adelie") |> 
  # reorder(bill_length_mm)

Works!

penguins_new <- penguins |> 
  select(species, sex, bill_length_mm) |> 
  filter(species == "Adelie") # |> 
  # reorder(bill_length_mm)

Works!

penguins_new <- penguins |> 
  select(species, sex, bill_length_mm) |> 
  filter(species == "Adelie") |> 
  reorder(bill_length_mm)

Error: object 'bill_length_mm' not found

Doesn’t work… let’s look into what reorder() is / does…
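
One way to start looking into an unfamiliar function from the console is to ask R whether it exists and where it lives. This is a minimal sketch using only base R. In a fresh session, these calls should point to the {stats} package, which would mean that reorder() exists but is not a {dplyr} verb.

```r
# is reorder() a function that R can find? ----
is.function(reorder)

# which attached package(s) does it come from? ----
find("reorder")

# the same question, asked of the function itself ----
environmentName(environment(reorder))
```

Knowing which package a function belongs to tells you which documentation to read next.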

Try to isolate the problem

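Searching for reorder() (either by looking up documentation – more on that in a moment – or Googling it) reveals that it is a function from the {stats} package for reordering the levels of a factor, not a {dplyr} function for sorting the rows of a data frame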

Googling, “R tidyverse reorder values high to low,” leads us to the {dplyr} documentation for the arrange() function, which allows us to sort values in descending order when coupled with desc():

penguins_new <- penguins |> 
  select(species, sex, bill_length_mm) |>
  filter(species == "Adelie") |>
  arrange(desc(bill_length_mm))

head(penguins_new, 4)

# A tibble: 4 × 3
  species sex   bill_length_mm
  <fct>   <fct>          <dbl>
1 Adelie  male            46  
2 Adelie  male            45.8
3 Adelie  male            45.6
4 Adelie  male            44.1

Double-check the documentation

Documentation provides critical info for understanding how to correctly use a package or function

written by the people who actually developed the tools you’re using
describes inputs, outputs, how a function can be modified to achieve a particular outcome
demonstrates standards
often includes reproducible examples

Pull up documentation for a loaded function by typing ?function_name in your console. E.g.

?dplyr::filter

Open up RStudio and practice pulling up the documentation for filter()
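
If you only need a quick reminder of a function's arguments, you can also peek at them from the console. This is a minimal sketch that uses base R's sample() as a stand-in, chosen only because it needs no packages. The same calls should work for other loaded functions.

```r
# print a function's arguments and their defaults ----
args(sample)

# or pull out just the argument names ----
arg_names <- names(formals(sample))
arg_names

# these two lines open the same help page (uncomment to try them) ----
# ?sample
# help("sample")
```

This does not replace reading the help page, which explains what each argument actually does.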

Double-check the documentation

With the person(s) next to you, explore the documentation and consider the following (and be prepared to share out):

What is the filter() function used for? Where did you locate this information?
What do the .data and ... arguments do? How easy or difficult of a time did you have understanding the descriptions?
Try running a few of the Examples in your console. How do these help (or not help) you better understand how the filter() function works?

Double-check the documentation

Vignettes are long-form guides / tutorials for R packages. These can offer helpful (and often less jargony) examples and explanations for how to use various functions. Check for a vignette by typing vignette("package_name") in your console. E.g. if we want to learn more about how to use filter(), which comes from the {dplyr} package:

vignette("dplyr")

Talk about it out loud

This is often referred to as rubber duck debugging, and it goes something like this:

1. Beg, borrow, steal, buy, fabricate or otherwise obtain a rubber duck (bathtub variety).

2. Place rubber duck on desk and inform it you are just going to go over some code with it, if that’s all right.

3. Explain to the duck what your code is supposed to do, and then go into detail and explain your code line by line.

4. At some point you will tell the duck what you are doing next and then realise that that is not in fact what you are actually doing. The duck will sit there serenely, happy in the knowledge that it has helped you on your way.

-rubberduckdebugging.com with original credit to Andy from lists.ethernal.org

Image source: Wikipedia

Talk about it out loud

But how is rubber duck debugging different than using ChatGPT?

Talking through your problem out loud forces you to systematically think through your logic step-by-step. This process often helps to identify previously overlooked details, errors, or logical inconsistencies.

In contrast, tools like ChatGPT tend to deliver complete (and sometimes incorrect) answers or solutions, without requiring you to engage in the same level of critical thinking.

Talk about it out loud

With careful prompting, ChatGPT can act as a rubber ducky (though we encourage talking to one another for now!):

“through carefully crafted prompts and easily accessible platforms, rubber duck LLMs can assist learners with specific questions while also situating those questions alongside larger computer science concepts and computational thinking practices.”

-Gonzales et al. 2025

An example ChatGPT prompt:

“I am getting an error in my R code. Instead of giving me the answer, can you help me logically think through my code line-by-line so that I might identify the problem on my own? Here is my code:

[code here]

My error message is: [error message here]”

Check out this example conversation using a code example from earlier in the slides!

Still stumped? Create a reprex!

Create a reprex i.e. a minimal reproducible example. Strip away the cruft and keep only what is required to reproduce your issue. More often than not, this will help you solve your own problem. And if you still haven’t figured it out, you can bring your reprex to a friend, colleague, or online community – making it easier for others to help you should always be the goal!
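
A rough sketch of what the pieces of a reprex might look like, using only base R. The toy data frame and its values are made up for illustration, and the failing line reuses the reorder() mistake from earlier in the slides. dput() is one common way to share a small data object as code that others can paste.

```r
# 1. a tiny, made-up data set that still shows the issue (hypothetical values) ----
toy <- data.frame(
  species = c("Adelie", "Gentoo", "Adelie"),
  bill_length_mm = c(39.1, 46.1, 40.3)
)

# 2. dput() prints R code that recreates the object, so others can paste it ----
toy_code <- paste(capture.output(dput(toy)), collapse = "\n")
cat(toy_code)

# 3. check that the pasted code rebuilds the same data ----
toy_rebuilt <- eval(parse(text = toy_code))
all.equal(toy, toy_rebuilt)

# 4. only the line(s) needed to trigger the problem (error captured as text) ----
result <- tryCatch(
  reorder(toy, bill_length_mm),
  error = function(e) conditionMessage(e)
)
result
```

If you have the {reprex} package installed, you can copy a snippet like this and run reprex::reprex() to render the code together with its output for sharing.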

Artwork by Allison Horst

Still haven’t figured it out after lots of debugging? Enter Google.

General Googling tips:

[r] + error message
error message + function or package name
sometimes even just the error message alone (especially if hyperspecific) will suffice

Check the date of online solutions:

past solutions can become outdated quickly – target recent posts and responses (may also be an issue when using GenAI!)

Read multiple search results:

sometimes one explanation will make sense in a way that another explanation will not

Isolate the relevant part(s) of an answer:

not every part of an online response will be relevant to your problem – take care when copy / pasting entire “solutions”

Use search operators to get more specific results

For example, compare these two Google queries: r dplyr join vs. site:stackoverflow.com R “dplyr join”
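
The tips above can be sketched in a few lines of base R: capture the error message as text, then build a search string from it. This is only an illustration. The failing call, log("a"), is a made-up example, and the exact wording of the message depends on your R version and language settings.

```r
# capture an error message as text (log() of a character string fails) ----
err <- tryCatch(log("a"), error = function(e) conditionMessage(e))

# tip: tag the language, then add the error message ----
query_basic <- paste("[r]", err)

# tip: quote the message and limit results to one site with `site:` ----
query_site <- paste0("site:stackoverflow.com r \"", err, "\"")

query_basic
query_site
```

Before searching, remove anything specific to your own project (file paths, object names) from the message, since those details are unlikely to appear in anyone else's post.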

So where on Google should I actually look?

There are many excellent resources online, but the following are great places to start:

Q&A forums

GitHub

Documentation

Personal Blogs

Q&A forums

Stack Overflow has historically¹ been the question-and-answer website for programmers. Refine your query using search operators, and look for accepted answers with many upvotes:

I return to this particular Stack Overflow post on undoing a git commit and this post on checking out a remote git branch all the time (probably at least 1x/month!)

Q&A forums

Stack Overflow initially banned ChatGPT-written responses, but now partners with a number of AI chatbot developers, including OpenAI – this means that tools like ChatGPT are trained on Stack Overflow content (among other sources).

¹Stack Overflow has seen a sharp decline in user activity since ChatGPT was released as users increasingly turn to GenAI tools.

The implications of this decline remain to be fully seen, though research suggests that replacing human content creation with AI-generated responses may make it more difficult to train future AI models. This change also represents a shift of knowledge sharing from the public to private domains (del Rio-Chanona et al. 2024).

Q&A forums

Does this mean I should stop using community-based Q&A forums? No! They offer a unique and complementary value to AI chatbots, particularly when developing your core competencies and critical thinking skills, including:

diverse viewpoints and nuanced perspectives (especially regarding edge cases)
debate over and refinement of solutions
an understanding of how others diagnose and approach problems
an introduction to important terminology
learning the etiquette and norms of your programming community(ies)
contributing to a collective knowledge base

AI is trained on past knowledge; forums offer living knowledge

Also check out:

forum.posit.co: for questions related to Posit’s tools (e.g. {tidyverse}, RStudio and more)
discuss.python.org: the official Python Community forum

GitHub

The source code for most (if not all) of your favorite data science software (e.g. packages / libraries) lives on GitHub. Check for known issues or fixes underway by exploring open issues and pull requests. Some repositories may also include additional documentation or examples using wikis (e.g. see the {xaringan} wiki).

Open issues for {ggplot2}

Open pull requests for {ggplot2}

Documentation

We already saw how to access documentation directly from RStudio, but looking at the official documentation online may reveal additional helpful articles, FAQs, and resources. Documentation sites built with pkgdown are particularly easy to navigate (they all are structured the same) and approachable. For example, the {ggplot2} pkgdown documentation:

Personal blogs

I love a good blog post – and the data science field is full of incredible bloggers who make complex computational tasks more approachable.

A still-very-relevant tweet by David Robinson

Blogs are an excellent place to find tutorials interwoven with engaging narratives (at least more engaging than documentation), less jargon (or jargon that’s explained), and often include example code, outputs, and other visuals.

Personal blogs

How do I find folks to follow? Here are some bloggers that I’ve personally enjoyed, but you might also consider joining Bluesky (a Twitter replacement that’s gaining popularity with the data science community) and joining some relevant “starter packs” (you can find recommendations on the EDS 240 course website):

Nick Tierney
Isabella Velásquez
Danielle Navarro
Andrew Heiss
Maya Gans
Jadey Ryan
rostrum.blog (by Matt Dray and Adriana De Palma)

What else should I be doing?

Read books! R for Data Science (2e) is the primer for all things R for data science (also see Python for Data Analysis (3e)). Reading introduces you to important terminology and concepts that make Googling for help a whole lot easier
Create a curated vocab or function list. There’s always a handful of MEDS students each year who keep a running list of important functions / methods to refer back to while they’re learning. Abandoning this once you feel you’ve committed things to memory (or have easier ways to find what you’re looking for) is totally great.

Closing thoughts

When you take the time to troubleshoot your own code, either independently or with help from Google, you’re doing more than just fixing errors. You are actively developing the core skills that make great data scientists:

Applying critical thinking and logical reasoning to break complex problems into manageable pieces
Reflecting on what your code should do versus what it’s actually doing
Developing your intuition around recognizing bugs, patterns, and solutions
Practicing persistence, resilience & resourcefulness, all qualities that stand out to employers!

By building these foundational technical and problem-solving skills now, you’re setting yourself up for long-term success.

Get Googling!