Hands-on activity: Central dogma of molecular biology in our own cells

Part 1. Human genome, transcriptome, proteome

Several resources that provide information about the human genome are listed towards the bottom of this page. For example, you can download annotations of the human genome, transcriptome, and proteome from RefSeq or from GENCODE.

Using these resources, answer the following questions.

  1. How many genes are there in the human genome?

  2. How many transcripts are there in the human transcriptome?

  3. What are the different kinds/classes of RNA present in the human transcriptome?

  4. How many protein-coding genes are there in the human genome?

  5. How many proteins are there in the human proteome?

  6. Plot the distribution of gene lengths.

  7. Plot the distribution of exon lengths.

  8. Plot the distribution of intron lengths.
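
As one possible starting point for the counting and length questions above, here is a minimal sketch, not a reference solution. It assumes a GENCODE-style GTF file, in which coordinates are 1-based and inclusive and each gene record carries a `gene_type` attribute (Ensembl GTFs call this `gene_biotype`, and RefSeq GFF3 files use a different attribute syntax, so the parser would need adapting). The names `parse_gtf`, `summarize`, `_row`, and the records in `TOY_GTF` are invented for illustration and are not real annotations.

```python
import re
from collections import Counter, defaultdict

# Matches key "value" pairs in the GTF attribute column (9th field).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_gtf(lines):
    """Yield (feature, start, end, attributes) for each GTF record."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        yield fields[2], int(fields[3]), int(fields[4]), dict(ATTR_RE.findall(fields[8]))


def summarize(lines):
    """Count genes and transcripts and collect gene, exon, and intron lengths."""
    gene_types = Counter()
    n_transcripts = 0
    gene_lengths, exon_lengths, intron_lengths = [], [], []
    exons_by_transcript = defaultdict(list)

    for feature, start, end, attrs in parse_gtf(lines):
        length = end - start + 1  # GTF coordinates are 1-based, inclusive
        if feature == "gene":
            gene_types[attrs.get("gene_type", "unknown")] += 1
            gene_lengths.append(length)
        elif feature == "transcript":
            n_transcripts += 1
        elif feature == "exon":
            exon_lengths.append(length)
            exons_by_transcript[attrs["transcript_id"]].append((start, end))

    # Introns are the gaps between consecutive exons of the same transcript.
    for exons in exons_by_transcript.values():
        exons.sort()
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
            intron_lengths.append(next_start - prev_end - 1)

    return {
        "n_genes": sum(gene_types.values()),
        "gene_types": dict(gene_types),
        "n_transcripts": n_transcripts,
        "gene_lengths": gene_lengths,
        "exon_lengths": exon_lengths,
        "intron_lengths": intron_lengths,
    }


def _row(feature, start, end, attrs):
    """Build one tab-separated toy GTF line (hypothetical data)."""
    return "\t".join(["chr1", "TOY", feature, str(start), str(end), ".", "+", ".", attrs])


# Invented records: one three-exon protein-coding gene, one single-exon lncRNA.
TOY_GTF = [
    _row("gene", 100, 1000, 'gene_id "G1"; gene_type "protein_coding";'),
    _row("transcript", 100, 1000, 'gene_id "G1"; transcript_id "T1";'),
    _row("exon", 100, 200, 'gene_id "G1"; transcript_id "T1";'),
    _row("exon", 400, 500, 'gene_id "G1"; transcript_id "T1";'),
    _row("exon", 900, 1000, 'gene_id "G1"; transcript_id "T1";'),
    _row("gene", 2000, 2500, 'gene_id "G2"; gene_type "lncRNA";'),
    _row("transcript", 2000, 2500, 'gene_id "G2"; transcript_id "T2";'),
    _row("exon", 2000, 2500, 'gene_id "G2"; transcript_id "T2";'),
]

summary = summarize(TOY_GTF)
print(summary["n_genes"], summary["n_transcripts"], summary["gene_types"])
```

For a real annotation, the same function could be given an open file handle instead of the toy list, for example one returned by `gzip.open(path, "rt")` for a gzipped download. The returned length lists can then be passed to a plotting function such as matplotlib's `hist`. Because these lengths span several orders of magnitude, a logarithmic x-axis is likely to be easier to read.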

Part 2. Greatest hits of the human genome

Although it may be somewhat outdated, [Dolgin, 2017] compiled a list of the 10 most studied genes in the human genome. Choose 3 of the 10 and, for each gene, answer the following questions.

Questions

  1. Where is the gene located in the genome, in hg38 reference genome coordinates?

  2. In which human tissues is the gene known to be expressed?

  3. How many transcript isoforms (alternatively spliced mRNAs) are known for the gene?

  4. What are the average exon length and the average intron length across all isoforms?

  5. How many protein isoforms are known for the gene?

  6. What functions of the gene and its protein products are known?
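
The isoform count and average exon and intron lengths for a single gene can be sketched as follows. This is a minimal illustration under stated assumptions: exon coordinates are 1-based and inclusive (as in GTF files), and the exon coordinates of each isoform have already been extracted from an annotation. The function name `exon_intron_stats`, its `unique` flag, and the coordinates in `toy_isoforms` are all hypothetical and do not describe a real gene.

```python
from statistics import mean


def exon_intron_stats(isoforms, unique=True):
    """Summarize exon and intron lengths for one gene.

    isoforms: dict mapping transcript ID -> list of (start, end) exon
              coordinates, 1-based and inclusive.
    unique:   if True, an exon or intron shared by several isoforms is
              counted once; if False, it is counted once per isoform.
    """
    exons, introns = [], []
    for coords in isoforms.values():
        coords = sorted(coords)
        exons.extend(coords)
        # An intron spans the gap between two consecutive exons.
        introns.extend(
            (prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(coords, coords[1:])
        )
    if unique:
        exons, introns = set(exons), set(introns)

    exon_lengths = [end - start + 1 for start, end in exons]
    intron_lengths = [end - start + 1 for start, end in introns]
    return {
        "n_isoforms": len(isoforms),
        "mean_exon_length": mean(exon_lengths) if exon_lengths else None,
        "mean_intron_length": mean(intron_lengths) if intron_lengths else None,
    }


# Invented coordinates: two isoforms that share their first exon and first intron.
toy_isoforms = {
    "T1": [(100, 200), (400, 500), (900, 1000)],
    "T2": [(100, 200), (400, 600)],
}

stats = exon_intron_stats(toy_isoforms)
print(stats)
```

Whether shared exons should be counted once or once per isoform is a choice the question leaves open. The `unique` flag makes that choice explicit, and it is worth stating which convention was used when reporting the averages.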

Useful databases and resources

  • UCSC Genome Browser

  • UniProt

  • GENCODE

  • RefSeq

  • Ensembl

  • GTEx

  • gffutils (a GFF parser)

References

[Dol17]

Elie Dolgin. The most popular genes in the human genome. Nature, 551(7681):427–431, November 2017.