Data Collection and Management for Clinical Research (EPI 218)

Summer 2022 (2 units)

All research studies must securely store data, query it, and prepare it for statistical analysis. Many research studies will also collect new data. Spreadsheet applications, like Microsoft Excel, are commonly used for these tasks but are inadequate. This course explains why researchers should nearly always choose true database applications for collecting and managing their research data. It covers the theory of relational database management systems (RDBMSs), which organize data according to a model developed more than 50 years ago and used by virtually all major commercial database systems (e.g., Oracle, Microsoft SQL Server, MySQL, and Salesforce). Students will get hands-on experience developing systems for data collection and management using REDCap (a web-based data capture system) and Microsoft Access (the world's most popular desktop RDBMS), as well as moving and converting data between systems. The course will prepare students to begin their own research data collection and management, including the creation of data management plans both for grant applications and ongoing projects.

Online Syllabus


The objectives for this course are for participants to:

  • Comprehend the basics of the Relational Database Model, including key concepts such as tables, records, fields, data types, relationships, and primary/foreign keys.
  • Develop simple but useful data collection systems using REDCap.
  • Execute the basics of querying a multi-table, relational database using SQL.
  • Prepare (and budget) for data management in a research study.
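As a rough illustration of the relational concepts and multi-table SQL querying named in the objectives above, the following sketch uses Python's built-in sqlite3 module rather than REDCap or Microsoft Access. The subject and visit tables, their fields, and the sample rows are hypothetical and chosen only for demonstration; they are not taken from the course materials.

```python
import sqlite3

# Hypothetical two-table study database (names and data are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE subject (
    subject_id INTEGER PRIMARY KEY,          -- primary key: one row per subject
    birth_date TEXT NOT NULL
);
CREATE TABLE visit (
    visit_id   INTEGER PRIMARY KEY,          -- primary key: one row per visit
    subject_id INTEGER NOT NULL
               REFERENCES subject(subject_id),  -- foreign key to subject
    visit_date TEXT NOT NULL,
    weight_kg  REAL
);
""")

conn.executemany(
    "INSERT INTO subject VALUES (?, ?)",
    [(1, "1980-02-14"), (2, "1975-09-30")],
)
conn.executemany(
    "INSERT INTO visit VALUES (?, ?, ?, ?)",
    [
        (10, 1, "2022-07-21", 70.5),
        (11, 1, "2022-08-18", 69.8),
        (12, 2, "2022-07-28", 82.0),
    ],
)

# Multi-table query: join visits to subjects and summarize per subject.
rows = conn.execute("""
SELECT s.subject_id,
       COUNT(v.visit_id) AS n_visits,
       AVG(v.weight_kg)  AS mean_weight_kg
FROM subject AS s
JOIN visit AS v ON v.subject_id = s.subject_id
GROUP BY s.subject_id
ORDER BY s.subject_id
""").fetchall()

# The foreign key rejects a visit that points to a nonexistent subject.
try:
    conn.execute("INSERT INTO visit VALUES (13, 99, '2022-08-01', 60.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

for row in rows:
    print(row)
print("orphan visit rejected:", orphan_rejected)
```

Storing each subject once and linking visits through subject_id is what allows repeated measurements without duplicating subject-level fields, which is difficult to enforce in a single spreadsheet.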




Course Director: Michael Kohn, MD, MPP

Professor of Epidemiology & Biostatistics
email: [email protected]
Course Co-Director: Josh Senyak

email:  [email protected]


Each week, for seven weeks, curricular material is introduced with a lecture and an assignment in which students apply the material. Weekly computer lab sessions give students the opportunity to practice, ask questions, and interact with course faculty.

Lectures: Thursdays, 8:30 AM to 9:20 AM, July 21 through September 1, 2022.
In-person lectures, providing an overview of the curricular material for the week, are given 8:30 AM to 9:20 AM every Thursday at Mission Hall, beginning Thursday, July 21. A recording of the lecture is available on the course's syllabus and can be viewed by students who are unable to attend in person.

Computer Labs: Thursdays, 9:30 AM to 10:30 AM, July 21 through September 1, 2022.
The computer lab is held from 9:30 AM to 10:30 AM every Thursday at Mission Hall, beginning July 21. Most students use the lab session to complete the weekly assignment, either individually or in groups. Course faculty are available during the lab session to address questions on the assignment or on any aspect of the curriculum. Although lab attendance is highly recommended, students may complete the weekly assignments without attending the corresponding lab session.

All course materials and handouts will be posted on the course's online syllabus.


"Chapter 16: Data Management" by MA Kohn and TB Newman in Designing Clinical Research (5th ed.) Browner, WS et al. Lippincott Williams & Wilkins. 2022.

Books may be purchased either through the publisher or a variety of commercial venues (e.g.,

Microsoft Access

Microsoft Access will be used for several assignments. MS Access is not available for the Mac, but version 2016 or higher can be used on the PC. For students who do not already own MS Access, it can be used via the UCSF RAE (Research Analysis Environment, formerly called MyResearch), which is a secure data hosting service for UCSF researchers. In addition to providing secure, HIPAA-compliant storage for research study data, RAE provides remote-desktop access to several applications including Microsoft Access. We will provide enrolled EPI 218 students with a streamlined process for obtaining an RAE account. By the first computer lab in the course, you should have tested your RAE account and ensured that you can log in.


REDCap is a web-based research data collection system developed by an academic consortium based at Vanderbilt University. REDCap enables researchers to build browser-based data entry forms, surveys, and surveys with attached private data entry forms. The survey builder is similar to SurveyMonkey or Qualtrics. REDCap is typically only available through institutions, and this course will use the instance of REDCap maintained by UCSF Academic Research Systems. We will provide students enrolled in EPI 218 with instructions for obtaining a REDCap account along with their RAE account. You must have a functional REDCap log-in prior to the second session of the course.



Weekly Assignments
The course has six weekly assignments. As students work through the instructions, they complete an online quiz or create files for upload to the course syllabus site. The six assignments are worth 60% of the final course grade. A seventh, optional extra-credit assignment uses R.

Final Project

The final project has two parts: generation of a research study data collection and management system that you have built, and creation of a brief data collection and management plan. The final project is worth 40% of the course grade.

Students not in full-year TICR Programs who satisfactorily pass all course requirements will, upon request, receive a Certificate of Course Completion.

UCSF Graduate Division Policy on Disabilities

To Enroll

ATCR and MAS students use the Student Portal

Students taking individual courses:

Course Fees will be available May 1, 2023
How to pay (please read before applying)
Summer Course Schedule will be available May 1, 2023

Apply by July 10, 2023 for summer quarter.  Application available by May 1, 2023
Only one application needs to be completed for all courses desired during the quarter.