Homework #7

Due 11:59 pm EST, Friday April 22nd, 2022.

Email your solutions (both .ipynb and .html files) to: compscbio@gmail.com.

Background:

There is extensive, dynamic cell-cell communication (CCC) during development. In the cell fate engineering (CFE) world, we frequently ignore this and make the convenient assumption that our manipulation of signaling pathways via exposure to exogenous ligands and inhibitors will drown out any cell-cell communication that impacts differentiation. In this homework, you will infer the cell-cell communication that is occurring during … wait for it … directed differentiation of mESCs. You will use a superset of the data from HW2, which was a timecourse of directed differentiation. Recall that this data also included some residual fibroblasts. Unfortunately, there are not many tools for performing CCC analysis available in Python.

The data and ligand-receptor database:

  1. mESC differentiation day0-day4. This is the raw counts data. It is a superset of the data from HW2, which was published in Emily Su’s Epoch paper. There are other lineages beyond mesoderm. We have also included Leiden labels, timepoints, and treatments. The cluster labels and timepoints are all that you will need.

  2. Mouse ligand-receptor pairs as defined by CellTalkDB. The first three columns are most useful: l-r pair, ligand gene name, receptor gene name. Feel free to explore CellTalkDB more.
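
Before scoring anything, it may help to restrict the database to pairs whose ligand and receptor are both measured in the expression data. The sketch below is one possible way to do that, not a required step. The function name `usable_pairs`, the toy table, and its column headers (including the placeholder `evidence` column) are assumptions for illustration; the first three columns are taken by position, so check them against the actual file.

```python
import pandas as pd

def usable_pairs(lr_db, measured_genes):
    """Keep ligand-receptor pairs whose two genes are both measured."""
    # Take the first three columns by position (pair, ligand, receptor),
    # because the real header names are an assumption in this sketch.
    pairs = lr_db.iloc[:, :3].copy()
    pairs.columns = ["lr_pair", "ligand", "receptor"]
    measured = set(measured_genes)
    keep = pairs["ligand"].isin(measured) & pairs["receptor"].isin(measured)
    return pairs[keep].reset_index(drop=True)

# Toy stand-in for the database; headers and rows are illustrative only.
lr_db = pd.DataFrame({
    "lr_pair": ["Fgf8_Fgfr1", "Wnt3_Fzd1"],
    "ligand_gene_symbol": ["Fgf8", "Wnt3"],
    "receptor_gene_symbol": ["Fgfr1", "Fzd1"],
    "evidence": ["placeholder", "placeholder"],
})

# Fzd1 is not in the measured gene list, so the second pair is dropped.
pairs = usable_pairs(lr_db, ["Fgf8", "Fgfr1", "Wnt3"])
print(pairs)
```

With the real data, `measured_genes` would be the gene names of the counts matrix.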

Your mission:

1.) Write a function in Python that will quantify potential cell-cell interactions based on expression of ligands and their cognate receptors. The input will be the expression data with cell labels (you can use the Leiden clusters that we provide), as well as the ligand-receptor database. The output should quantify the likely interaction between each pair of clusters for each ligand-receptor pair. Feel free to implement any of the methods that we discussed in class, or devise your own method. Please see the review paper that we posted to Blackboard for a more comprehensive discussion of the topic.

2.) Apply your method to cells from each timepoint of the mESC differentiation data.

3.) Devise an efficient way to visualize the most prominent cell-cell interactions that characterize this data and use it.
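
To make the expected input and output shapes concrete, here is a minimal sketch of one very simple scoring scheme: the mean ligand expression in the sender cluster multiplied by the mean receptor expression in the receiver cluster, computed separately per timepoint. It is only an illustration under stated assumptions, not the intended solution or necessarily one of the methods from class. The function names, the `ligand`/`receptor` column names, and the toy values are all hypothetical, and the expression table is assumed to be a cells x genes DataFrame that has already been normalized.

```python
import numpy as np
import pandas as pd

def score_interactions(expr, labels, lr_pairs):
    """Score each (ligand, receptor, sender, receiver) combination.

    expr: cells x genes DataFrame (assumed normalized).
    labels: Series of cluster labels sharing expr's index.
    lr_pairs: DataFrame with 'ligand' and 'receptor' columns (assumed names).
    """
    genes = set(expr.columns)
    # Skip pairs for which either gene is not measured.
    lr = lr_pairs[lr_pairs["ligand"].isin(genes) & lr_pairs["receptor"].isin(genes)]
    # Mean expression per cluster (clusters x genes).
    means = expr.groupby(labels).mean()
    rows = []
    for lig, rec in zip(lr["ligand"], lr["receptor"]):
        # Outer product: rows are sender clusters, columns are receiver clusters.
        scores = np.outer(means[lig].to_numpy(), means[rec].to_numpy())
        for i, sender in enumerate(means.index):
            for j, receiver in enumerate(means.index):
                rows.append((lig, rec, sender, receiver, scores[i, j]))
    cols = ["ligand", "receptor", "sender", "receiver", "score"]
    return pd.DataFrame(rows, columns=cols)

def score_by_timepoint(expr, labels, timepoints, lr_pairs):
    """Run score_interactions separately on the cells of each timepoint."""
    results = []
    for tp in pd.unique(timepoints):
        mask = timepoints == tp
        res = score_interactions(expr.loc[mask], labels[mask], lr_pairs)
        res.insert(0, "timepoint", tp)
        results.append(res)
    return pd.concat(results, ignore_index=True)

# Toy data (illustrative values only): four cells, two clusters, two timepoints.
expr = pd.DataFrame(
    {"Fgf8": [4.0, 2.0, 0.0, 0.0], "Fgfr1": [0.0, 0.0, 3.0, 1.0]},
    index=["c1", "c2", "c3", "c4"],
)
labels = pd.Series(["A", "A", "B", "B"], index=expr.index)
timepoints = pd.Series(["day0", "day2", "day0", "day2"], index=expr.index)
lr_pairs = pd.DataFrame({"ligand": ["Fgf8", "Wnt3"], "receptor": ["Fgfr1", "Fzd1"]})

toy_scores = score_interactions(expr, labels, lr_pairs)
toy_by_time = score_by_timepoint(expr, labels, timepoints, lr_pairs)
# Ranking by score gives a quick table of the most prominent interactions.
print(toy_by_time.sort_values("score", ascending=False).head())
```

The long-format output (one row per timepoint, pair, sender, and receiver) is a deliberate choice: it can be sorted to list top interactions or pivoted into a sender x receiver matrix for a heatmap or dot plot. A product of means ignores dropout and cluster size, so a real solution would likely want a significance test or a more robust statistic.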