Preface

This book was written largely during 2021–2022, a period of the pandemic when SARS-CoV-2 brought massive changes to our lives and our work, and when the desire to share and use pathogen genomic data for public health action greatly accelerated. As we worked – not just using sequence data to understand SARS-CoV-2 epidemiology but also teaching public health officials how to interpret epidemiological dynamics from those data – we identified a significant gap in the genomic epidemiology literature. There are excellent review papers providing short, accessible summaries of how pathogen genomics can support aims in public health¹. On the other end of the spectrum are scientific papers describing the methods and findings of genomic epidemiological studies in technical detail. We felt that something was missing from the middle: a resource that introduces the basic theory and utility of genomic epidemiology, with a focus on how to use those data in public health practice, and provides step-by-step guidance for using pathogen genomic data in epidemiologic investigations. This book is our attempt to fill that gap.

It is our intention that after reading this handbook, you should be able to:

  • Understand how genomic epidemiology can support certain investigations;

  • Design genomic surveillance data collection to meet your needs; and

  • Apply genomic epidemiology to routine investigations in public health practice.

Given this book’s narrow focus on using pathogen genomic data in applied epidemiology, there are many things that this book will not be. It is not a review of the primary literature in genomic epidemiology, nor do we aim to provide an exhaustive description of all the questions that scientists can investigate with genomic epidemiology. We will not present every method for genomic epidemiological analysis, nor provide information on the entire suite of available analytic tools. Rather, this handbook is meant as a practical guide to applied genomic epidemiology. As such, we focus on the questions that we see public health practitioners encounter most frequently, and present analytical methods and tools that are easily used within public health departments and other applied epidemiology settings.

For whom is this handbook written?

This book aims to help you understand how to use pathogen genomic data for public health surveillance, outbreak response, and public health decision-making. This book is for you if you are already involved in, or want to develop a program for, the collection, analysis, or interpretation of pathogen genomic data, and/or policy evaluation in public health. For example:

  • Public health microbiologists or lab directors developing a genomic surveillance program.

  • Bioinformaticians working in public health who want to increase their familiarity with the goals, theory, and approaches specific to genomic epidemiology.

  • Epidemiologists who typically work with surveillance data, but who want to integrate molecular information into the investigations they conduct.

  • Health officers or other policy makers who want to understand more about pathogen genomic data as a source of epidemiological information.

  • Academics collaborating with public health institutions who want to learn more about genomic epidemiology and how these methods support the standard questions we ask in public health.

How should you read this handbook?

You can think of this resource as funnel-shaped, moving from concepts that are essential for all readers towards more specific implementation information that is most pertinent for readers directly involved in data collection and analysis. We recommend that all readers read the first two chapters, which introduce the utility of pathogen genomics in public health and the fundamental theory underlying genomic epidemiology. These two chapters will help you understand how genomic data enrich public health investigations and the basic mechanics behind genomic epidemiology. They will also introduce the common language you’ll encounter when discussing genomic epidemiology studies.

Readers involved in designing or implementing genomic surveillance and epidemiology programs within their agencies should also read the chapters introducing sampling strategies and outlining the broad use cases for genomic epidemiology. These chapters describe when you can use pathogen genomic data and how you should approach collecting them.

From there, readers who wish to see investigations falling under these different use cases in action should peruse the provided case studies. Our intent with the case studies is to show step-by-step why we initiated an investigation, how we framed our question of interest, and how we investigated it, including quality control, evaluating competing hypotheses, and weighing uncertainty. While narratives presented in the published literature are by design cohesive and smooth, with these case studies we aim to show exactly how an investigation occurred, including bumps and questions along the way.

To help guide readers into hands-on analysis of pathogen genomic data, Chapter 6 introduces “Tools and Methods for Applied Genomic Epidemiological Analysis”. It sits towards the end of this handbook because it is primarily pertinent to readers actively involved in genomic epidemiological analysis. We are well aware that this chapter will go out of date the most quickly; however, we feel that the tangible support it provides for getting started warrants its inclusion.

Finally, this book concludes with a deeper dive into theory and analysis, touching on the greater genomic complexity that exists for both viruses and bacteria beyond what we introduced in the first theory chapter. Some concepts in these chapters may veer slightly towards the academic. However, we’ve encountered enough situations where this additional knowledge was useful that we wanted it to be available to readers. While you might skip these chapters initially, you will likely come back to them if you dig more deeply into genomic epidemiology.

We’ve read enough textbooks to know that reading textbooks generally isn’t “fun”. But we do sincerely hope that this resource helps you begin to integrate pathogen genomic data into your public health work, wherever you’re starting from. And we hope that the fun will come from getting to explore this new territory.