---
title: "Lab02: Design a Genome Project"
bibliography: references.bib
csl: nature.csl
lang: en-US
---
## Lab 2 {#sec-lab02}
In Chapter 1, we introduced three broad perspectives on bioinformatics and genomics (cell, organism, tree of life) [@pevsner2015]. Later in the textbook, Pevsner expands these three perspectives into five basic perspectives on genome sequencing.
For this lab, you will use these five perspectives, along with your new understanding of sequencing technologies from Chapter 2, to sketch a 1–2 page narrative for a hypothetical genome project [@pevsner2015; @vandijk2018; @vandijk2023].
The five perspectives on genomics discussed in Pevsner Chapter 15 can be summarized as:
- **Catalog genomic information**: basic genome features (size, chromosomes, GC content, repeats, gene counts), based on sequencing, assembly, and annotation.
- **Catalog comparative genomic information**: whole‑genome comparisons to related species, orthologs, divergence times, lateral gene transfer, using whole‑genome alignments and genome browsers.
- **Biological principles**: how genome structure and variation underlie development, metabolism, behavior, and evolutionary processes (e.g., genome size evolution, polyploidization, gene birth/death).
- **Human disease relevance**: how genomes relate to disease in humans or plants, including SNPs, linkage and association studies, and host–pathogen interactions.
- **Bioinformatics aspects**: databases, software, and visualization tools that make genome analysis possible.
These map naturally onto the three perspectives in Chapter 1: cell‑scale questions, organism‑level questions, and tree‑of‑life/comparative questions [@pevsner2015].
### The Assignment
Write a 1–2 page narrative of the genome analysis project you have designed. To guide you, begin by reflecting on the following questions:
- _If you could sequence the genomes of 100 individuals from any species, which species would you choose?_
- _What hypotheses would you test, how would you perform data analyses, and what resources would you require in terms of hardware, software, and collaborators?_
- _What ethical issues might arise in sequencing these genomes?_
Specifically, think about:
- Which sequencing platforms and coverage would you choose (Chapter 2)?
- Which file formats and QC steps would your workflow rely on (Chapter 3)? [@pevsner2015; @vandijk2018; @vandijk2023]
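
When choosing a platform and coverage, a quick back-of-the-envelope calculation can help you judge whether a design is realistic. The sketch below uses the standard relationship coverage = (number of reads × read length) / genome size. The genome size, read length, and target coverage are hypothetical placeholders, not recommendations; substitute values appropriate for your species and platform.

```python
def reads_needed(genome_size_bp, target_coverage, read_length_bp):
    """Smallest number of reads such that
    reads * read_length / genome_size >= target_coverage."""
    total = genome_size_bp * target_coverage
    # Integer ceiling division avoids floating-point rounding on large values.
    return -(-total // read_length_bp)


def total_bases(genome_size_bp, target_coverage, n_samples):
    """Total sequenced bases across all samples in the project."""
    return genome_size_bp * target_coverage * n_samples


# Hypothetical design (assumed values for illustration only):
# 100 individuals, 1 Gb genome, 30x coverage, 150 bp reads.
genome_size = 1_000_000_000
coverage = 30
read_length = 150
n_samples = 100

per_sample_reads = reads_needed(genome_size, coverage, read_length)
project_bases = total_bases(genome_size, coverage, n_samples)

print(f"Reads per sample: {per_sample_reads:,}")
print(f"Total bases for the project: {project_bases:,}")
```

With these assumed numbers, each sample needs 200 million reads and the project totals 3 trillion bases, which gives you a starting point for discussing storage and compute resources in your narrative.
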
For your design narrative:
- Choose a species and justify why 100 genomes from this species would be informative.
- State one or two main hypotheses, or guiding questions if discovery-based.
- Identify which of the five perspectives (and which of the three Chapter 1 perspectives) your project emphasizes.
- Outline your sequencing strategy (platforms, approximate coverage, sample types) using concepts from Chapter 2.
- Sketch the main analysis steps (from raw FASTQ through alignment/assembly and downstream analyses) using ideas from Chapters 1 and 3.
- Briefly note any major ethical issues if humans or other sensitive species are involved.
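
To make the "raw FASTQ" starting point of the analysis steps concrete, here is a minimal sketch of a first QC pass. It assumes four-line FASTQ records with Phred+33 quality encoding, and the function names and the mean-quality threshold of 20 are illustrative choices, not part of the assigned readings. A real workflow would rely on established QC tools rather than hand-written code.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a FASTQ stream.
    Assumes exactly four lines per record."""
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of input
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        yield header[1:], seq, qual  # drop the leading '@'


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def qc_filter(handle, min_mean_quality=20):
    """Keep reads whose mean quality meets the (hypothetical) threshold."""
    return [
        (read_id, seq)
        for read_id, seq, qual in read_fastq(handle)
        if qual and mean_phred(qual) >= min_mean_quality
    ]


# Two toy reads: read1 has high quality ('I' = Q40), read2 has Q0 ('!').
example = io.StringIO(
    "@read1\nGATTACAG\n+\nIIIIIIII\n"
    "@read2\nGGCCTTAA\n+\n!!!!!!!!\n"
)
kept = qc_filter(example)
for read_id, seq in kept:
    print(read_id, seq, f"GC={gc_fraction(seq):.3f}")
```

Only `read1` passes the filter here. In your narrative, you only need to name the QC steps and file formats; code like this is not required.
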
The focus of your design narrative should be on **motivation and approach**. You may refer to assigned readings (e.g., @hotaling2021; @marks2021; @bogan2025) in your narrative, but you do not need a formal reference list.
## References
```{=latex}
\printbibliography[heading=none]
```