Introduction to Data Analysis and Visualization with R

MLA Course
Listing Archived: Thursday, April 20, 2017

Primary contact information...
National Institutes of Health Library
Bldg 10, Rm 1L21, MSC 1150, 10 Center Drive
Bethesda, MD 20892
United States
Lisa Federer is the primary contact.
Phone: 301-594-6283
Region: Mid-Atlantic

Description: So you’ve heard about R and how it helps with statistical analyses, creates beautiful graphs, and makes science more reproducible, but you’ve never written a line of code in your life. Don’t be scared! In this course, designed especially for non-programmers, you’ll learn the basics you need to get started using R. These skills are useful for librarians who work with students and researchers, as well as those who have their own data to analyze. We’ll use RStudio, a user-friendly interface for R, and cover topics including:
• Key terminology and concepts
• Essentials of data processing
• Basic statistical analysis
• Creating simple graphics

Experience Level: Intermediate
Continuing Education Experience: Basic computer skills
CE Contact Hours: 4
Professional Competencies: Health Sciences Information Services; Information Systems and Technology; Research, Analysis, and Interpretation
Subject: Health Care Informatics, Research, Technology/Systems
Course Type: Face to Face, Hands-on

Educational Objective: After participating in this course, learners will understand how to use R, an open source programming language, to work with a variety of different types of data, including library-related data, scientific research data, and clinical data. Specifically, learners will be able to use R to:
• organize and “wrangle” (or clean) data
• conduct basic data and statistical analyses
• create exploratory and publication-ready visualizations
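
To give a feel for the three skills named in the objective, here is a minimal sketch in base R. It is not taken from the course materials: the `workshops` data frame and all of its values are invented for illustration, and base R functions are used only so that no extra packages need to be installed. The course itself may use different tools and datasets.

```r
# Hypothetical example data (invented values): monthly workshop attendance
# with inconsistent topic labels and one missing count.
workshops <- data.frame(
  month    = c("Jan", "Feb", "Mar", "Apr", "May", "Jun"),
  topic    = c("R basics", "r basics ", "Visualization",
               "R basics", "visualization", "Visualization"),
  attended = c(18, 22, NA, 25, 15, 20),
  stringsAsFactors = FALSE
)

# 1. Wrangle: trim stray whitespace and standardize capitalization,
#    then drop rows where attendance was not recorded.
workshops$topic <- tolower(trimws(workshops$topic))
clean <- workshops[!is.na(workshops$attended), ]

# 2. Analyze: overall mean attendance and mean attendance per topic.
mean_attended <- mean(clean$attended)
by_topic <- aggregate(attended ~ topic, data = clean, FUN = mean)

# 3. Visualize: a simple exploratory bar chart of the per-topic means.
barplot(by_topic$attended,
        names.arg = by_topic$topic,
        ylab = "Mean attendance",
        main = "Workshop attendance by topic (example data)")
```

Running the script in RStudio shows the chart in the Plots pane; typing `by_topic` in the console prints the summary table behind it.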

Agenda:

-	Why R? Applications and uses: 5 minutes
-	Key terminology, concepts, and syntax: 10 minutes
-	Data processing and analysis: 1.5 hours
o	Understanding data structures
o	Data “wrangling”: organizing and cleaning messy data
o	Basic statistical analysis
-	Break: 15 minutes
-	Data visualization: 1.5 hours
o	The Grammar of Graphics: a framework for building data visualizations
o	Exploratory data visualization
o	Creating publication-ready visualizations
o	Customizing color and appearance
-	Questions and next steps for learning more: 30 minutes

Need for This Course: Librarians increasingly need data literacy skills to succeed in today’s data-intensive information environment. This is especially true for medical librarians, who work closely with clinicians and biomedical researchers whose work tends to be highly data-driven. Many in the scientific community are turning to R for their data analysis, organization, and visualization needs. R is a free and open source scripting language that lets users work with data in highly customized ways and accomplish tasks that would be difficult or impossible with point-and-click software like Excel. R is approachable even for those who have never written a line of code, and it is a good way for librarians to start working with their own data, including bibliometric data, library statistics, or budget data, or to assist patrons with their data needs. Librarians who support research groups that have adopted R will find it useful to be conversant in the language, and those who master R and provide training for their patrons will likely find their skills much sought-after. The instructor for this CE course offers monthly R workshops and has trained hundreds of researchers at her institution; even after a year of monthly workshops, classes continue to have waitlists of dozens of researchers.

The instructional methods used include Lecture, Demonstration, Discussion, and Hands-on Exercises.

Participant Materials: Handouts with detailed, step-by-step instructions and screenshots for completing the hands-on exercises, plus example datasets for use with the exercises.

Facility Requirements: Computers (preferably with internet access) with R and RStudio software (free and open source, but both require installation), one for each attendee and one for the instructor.