MEDS README Guidelines
When and where to include a README, and what to talk about when you do
What is a README?
A README is a text file that lives in a project directory (often the root directory) and provides essential information about the software / product to users, developers, and / or contributors.
The exact origin of the README isn’t totally clear, but READMEs date back to at least 1974, when one accompanied software for the PDP-10, an early mainframe computer, with information on how to operate some of its programs.
The ASCII (American Standard Code for Information Interchange) system sorts uppercase letters before lowercase ones. As a result, running `ls` (to list files) in the command line will typically list the README near the top (the exact order can depend on your system’s locale settings), making it one of the first files a user will encounter (ideal for a file that’s meant to contain critical information about how to use / engage with directory content!).
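The ordering described above can be checked with a short Python sketch. The file names are made up for illustration, and Python’s `sorted()` is used as a stand-in for a plain ASCII sort: it compares strings by code point, which matches ASCII order for names like these.

```python
# Hypothetical file names, chosen only to illustrate the ordering.
files = ["analysis.R", "data", "README.md", "scripts", "LICENSE"]

# sorted() compares strings by code point, so uppercase letters
# (65-90 in ASCII) sort before lowercase letters (97-122).
for name in sorted(files):
    print(name)
```

The two all-caps names (`LICENSE`, `README.md`) are printed before the lowercase ones.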
Platforms like GitHub heavily encourage the use of README files (you can just check a box to add one when creating a new repository on GitHub). The contents of a README that lives in a repo’s root directory are automatically displayed on the repo’s landing page (but note that you can include READMEs in subdirectories as well). GitHub supports the use of `README.md` files, which are written in Markdown.
Repository-level READMEs
You should always include a README in the root directory of your GitHub repositories. A README is typically the first item that visitors will see when they arrive at your repository. This makes it the perfect place to tell people what your project is, why it’s useful, and how to get started using it. What you include in a given README will look different depending on the project, but here are some guidelines for getting started:
There are a number of different ways to add a README to a repository:
From GitHub:
- Check the Add a README file box when creating a repository from GitHub, or
- Click the green Add a README button that appears below your files on a GitHub repository that does not yet have a README, or
- Click the Add file button at the top of a repository’s landing page (next to the green Code button) > select + Create new file > name it `README.md` and add content > click Commit changes…
From RStudio (GUI):
- Use the Files pane to click Blank File > select Text File > type `README.md`
From the command line:
- Navigate to your repository’s root directory using `cd` (use `pwd` to verify / check your location) > type `touch README.md`
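If you prefer to script this step, the following is a minimal Python sketch of the same idea as `touch README.md`. The function name `create_readme` is hypothetical, and the demo runs in a temporary directory that stands in for your repository’s root directory.

```python
import tempfile
from pathlib import Path


def create_readme(repo_dir: str) -> Path:
    """Create an empty README.md in repo_dir if one does not already exist."""
    readme = Path(repo_dir) / "README.md"
    # touch() creates the file; with exist_ok=True the contents of an
    # existing file are left untouched, much like the `touch` command.
    readme.touch(exist_ok=True)
    return readme


# Demo: a temporary directory stands in for a repository root.
with tempfile.TemporaryDirectory() as tmp:
    path = create_readme(tmp)
    print(path.name, path.exists())
```

Calling `create_readme(".")` from inside your repository’s root directory would create the file there instead.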
Things you should always include in your GitHub repository READMEs:
These core elements are required for all MEDS-related homework assignments and projects.
- Title
  - A README’s title is set to the repository name by default - change this!
- Description of the project
  - Paragraphs or a bulleted list are both acceptable options
  - You may include an image or logo that represents the project
- Repository contents
  - This includes information about the repository structure or file organization
- Data access
  - Any necessary information on where data lives (e.g. is it housed in the repo, on a server, in a library / package, etc.) and how to access it in order to run the code
- Authors and contributors
  - Consider hyperlinking collaborators’ GitHub profiles or other professional profiles
- References and acknowledgements
  - In an appropriate, consistent format, including links
  - Provide reference to any other individuals or sources that supported the development of the repository. For example, did you fork an existing repository? Did the work have any funding sources? Were there individuals you consulted with or were inspired by?
  - Don’t forget to add references for data sets too
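One way to start from the core elements above is to generate an empty skeleton and fill it in by hand. The sketch below is only an illustration: the function name `readme_skeleton` is hypothetical, and the section names are assumptions that paraphrase the list above rather than required headings.

```python
def readme_skeleton(title: str, sections: list[str]) -> str:
    """Return Markdown text with a top-level title and one empty section per name."""
    lines = [f"# {title}", ""]
    for name in sections:
        # Each section gets a second-level heading and a placeholder comment.
        lines += [f"## {name}", "", "<!-- TODO: fill in -->", ""]
    return "\n".join(lines)


# Assumed section names; adjust them to suit your project.
CORE_SECTIONS = [
    "Description",
    "Repository contents",
    "Data access",
    "Authors and contributors",
    "References and acknowledgements",
]

print(readme_skeleton("My Project Title", CORE_SECTIONS))
```

Saving the printed text as `README.md` gives you a starting point; remember to replace the placeholder title with something more informative than the repository name.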
Things you should consider including, which may vary depending on the goals and complexity of the project:
- Installation instructions
  - Does your repository contain software that users will need to download or install? Do users access your software via a web browser? Do they need to install any dependencies? Do users need to clone your repository? etc.
- Usage instructions
  - Related to the first item, above. It’s important to keep this concise! You may include images. Any long-form instructional documentation is best moved to the repository’s wiki
- Support
  - Make mention of GitHub issues and what information a user should include in an issue
- Contributing
  - Do you welcome contributions from others? If so, it’s important to explain how one might contribute (e.g. fork & pull request, open an issue, both?)
- License
  - Important for allowing others to reuse your work (which is copyrighted, by default; read about what it means when no license is available)
  - What license you choose depends on what type of work you are trying to license. There are different licenses used for code / software, content, and data. Some helpful resources for getting started:
    - Licensing code / software: check out this page, Choose an open source license, by GitHub. A couple of popular options for software include MIT License and GNU GPLv3
    - Licensing content (i.e. non-software): check out this page, About CC Licenses, by Creative Commons. A few good CC options for non-software content include CC BY, CC BY-SA, and CC BY-NC.
    - Licensing data: You’ll chat more about this in EDS 213 (Databases and Data Management)!
While it may be tempting to pack as much information as you possibly can into your README, it might not necessarily be the right home for everything (see the Wikis section, below). I find this advice from Kira Oakley in her article, Art of README, to be a helpful reminder:
“The lack of a README is a powerful red flag, but even a lengthy README is not indicative of there being high quality. The ideal README is as short as it can be without being any shorter. Detailed documentation is good – make separate pages for it! – but keep your README succinct.”
Example GitHub repository READMEs:
Each project is different and so is its README. As you browse through different repositories you will see that not all of them have the same sections. However, they all offer a clear starting point for a newcomer to understand what the project is about. Here are some GitHub repositories with READMEs we like:
- strava-dashboard, by Samantha Csik - code for a Shiny dashboard
- EDS-240-data-viz, by Samantha Csik - code for a course website
- thomas-fire, by Anna Ramji - a MEDS student project
- xarray - a Python package for working with multidimensional arrays and datasets
- palmerpenguins - an R package that contains teaching data
- metajam - an R package for downloading and reading in metadata from repositories in the DataONE network
- awesome-readme, by Matias Singers - a curated list of awesome READMEs
To the right of every GitHub repository lives an “About” section, where visitors can find some brief but helpful information about the project. For example, take a look at Allison Horst’s palmerpenguins repository:
Click on the gear icon to update your repo’s About section. You should always include a short description of the project. It can be super helpful to also add relevant links (e.g. package documentation, a report, a hosted GitHub Page, etc.)
Wikis
While READMEs are used to provide a quick overview of what your project is / does, wikis should be used to provide additional documentation. From GitHub Docs:
“You can use your repository’s wiki to share long-form content about your project, such as how to use it, how you designed it, or its core principles”
A great way to streamline a repository’s README is to move any documentation-style information to a wiki, and then link to the appropriate wiki page from your README. Each wiki page should focus on a single topic.
- Navigate to your repository’s landing page > click Wiki (top menu)
- Click the green Create the first page button to create your wiki’s landing page > add content > click Save page
- To add additional pages, click the green New page button (top right corner when you’re on your wiki’s home page) > provide a title / content > click Save page
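Once a wiki page exists, you can link to it from your README. The sketch below builds such a Markdown link, assuming GitHub’s usual wiki URL pattern, in which spaces in the page title become hyphens. The function name `wiki_link` and the owner / repository names are hypothetical, and titles with special characters may need extra URL encoding that this sketch does not handle.

```python
def wiki_link(owner: str, repo: str, page_title: str) -> str:
    """Build a Markdown link to a GitHub wiki page."""
    # Wiki URLs use the page title with spaces replaced by hyphens.
    slug = page_title.strip().replace(" ", "-")
    url = f"https://github.com/{owner}/{repo}/wiki/{slug}"
    return f"[{page_title}]({url})"


# "my-org" and "my-repo" are placeholder names.
print(wiki_link("my-org", "my-repo", "Installation Guide"))
```

Pasting the printed link into the relevant section of your README keeps the README short while still pointing readers to the longer documentation.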
Example GitHub repository wikis:
- Openscapes website wiki, by Openscapes (a user-guide of sorts; notes and conventions for Openscapes website maintainers and contributors)
- NCEAS Roundtable (August 2023) workshop materials wiki, by Samantha Csik (setup instructions for workshop participants)
- `{xaringan}` wiki, by Yihui Xie (includes tips and further customizations that aren’t covered in the official R package documentation)
Organization-level READMEs
GitHub organizations are shared accounts comprising members (each with their own personal GitHub account) who can collaborate across many projects at once.
You have the option to add both public (visible to anyone) and private (visible only to organization members) profile READMEs to separately serve each of those communities. The content you choose to include in an organization profile README is quite flexible, but it’s often valuable to add the purpose of the organization, any high-level summary information, and links to important websites / external resources / specific repositories within the organization.
README that’s visible to the public:
- Create a public repository within your organization and name it `.github`
- Add a folder named `profile` to your `.github` repository, then add a `README.md` file inside `profile/` (i.e. `.github/profile/README.md`)
README that’s visible to members only:
- Create a private repository within your organization and name it `.github-private`
- Add a folder named `profile` to your `.github-private` repository, then add a `README.md` file inside `profile/` (i.e. `.github-private/profile/README.md`)
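If you have cloned the `.github` (or `.github-private`) repository locally, the folder layout described above can be set up with a few lines of Python. This is a sketch under assumptions: the function name `scaffold_profile_readme` and the README text are hypothetical, and the demo uses a temporary directory in place of a real clone.

```python
import tempfile
from pathlib import Path


def scaffold_profile_readme(repo_root: Path, text: str) -> Path:
    """Create profile/README.md inside a local clone of the profile repository."""
    readme = repo_root / "profile" / "README.md"
    # Create the profile/ folder (and any missing parents) if needed.
    readme.parent.mkdir(parents=True, exist_ok=True)
    if not readme.exists():  # never overwrite an existing README
        readme.write_text(text, encoding="utf-8")
    return readme


# Demo: a temporary directory stands in for a local clone of `.github`.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / ".github"
    path = scaffold_profile_readme(repo, "# Welcome to our organization\n")
    print(path.relative_to(tmp).as_posix())
```

You would still need to commit and push the new file for it to appear on your organization’s landing page.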
Alternatively (or in addition), you can update your organization’s profile information, including a title and description, as well as relevant links – these will appear across the top of your organization’s landing page.
Click on the Settings tab (top menu) to update your organization’s profile picture, description, add URLs, etc.
For example, see LTER’s GitHub organization, which includes both updated profile information and a public README:
In EDS 411 (Capstone Project), you’ll create a GitHub organization to house all your Capstone-related repositories and code. You’ll also be required to include specific information in your organization’s README (you’ll talk much more about the required checklist in EDS 411). In the meantime, check out a few example GitHub organizations and their associated README information.
Example GitHub organizations with added profile information:
- UCSB MEDS, by the Master of Environmental Data Science program (a place where MEDS admin organize teaching materials and other related content)
- The Nature Conservancy, by TNC (scripts and apps from TNC scientists and geologists)
- NCAR, by the NSF National Center for Atmospheric Research (a home for NCAR software and projects)
Example (public) GitHub organization READMEs:
- Outdoor Equity, by Halina Do-Linh & Clarissa Boyajian (MEDS 2022 Capstone project)
- CASAschools, by Liane Chen, Charlie Curtin, Kristina Glass & Hazel Vaquero (MEDS 2024 Capstone project)
- NMFS Open Science, by the National Marine Fisheries Service (contains work which supports open science and open data literacy across NOAA fisheries)
- NASA Goddard Institute for Space Studies, by NASA (laboratory in the Earth Sciences Division of NASA’s Goddard Space Flight Center)
- GitHub, by GitHub (yes, GitHub is built on GitHub!)