The protein and domain definitions are taken directly from SwissPfam, where proteins are identified by their SwissProt/TrEMBL ids. The SGD ids for yeast genes are downloaded from the gene name table in the SGD database and mapped to their corresponding SwissProt/TrEMBL ids. Using this mapping between the yeast proteins and their SwissProt ids, we then map SwissPfam 6.5 domains to the yeast proteins. The final version of the protein-domain relationship is given below.
Text file (585K byte)
Compressed file (232K byte--Recommended)
2. Protein-protein interaction data.
There are two sources of yeast two-hybrid (Y2H) protein-protein interactions. One is from Fields' group, released with their Nature paper (Uetz et al. 2000). The other is from Ito's group, released with their PNAS paper (Ito et al. 2001). The combined data contain 5719 protein-protein interactions.
Uetz+Ito, Text file (157K byte)
Uetz+Ito, Compressed file (47K byte----Recommended)
MIPS, Text file (171K byte)
MIPS, Compressed file (39K byte----Recommended)
We also use gene expression data to verify our predictions. The expression data contain 2465 genes with 79 time points (the original set has 2467 genes, two of which are duplicates):
Text file (1.4M byte)
Compressed file (347K byte----Recommended)
We compute domain-domain interaction probabilities from the Y2H protein-protein interactions, and then use these probabilities to compute the interaction probability for every pair of proteins. The prediction results with a false positive rate fp=2.5E-4 and a false negative rate fn=0.80 are listed below.
Text file (452K byte)
Compressed file (95K byte---Recommended)
Text file (5.07M byte)
Compressed file (866K byte---Recommended)
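The step from domain-domain probabilities to protein-protein probabilities can be sketched as follows. This is a minimal illustration, not the code used to produce the files above. It assumes that two proteins interact if at least one of their domain pairs interacts, that domain pairs interact independently, and that an experiment reports a true interaction with probability 1-fn and a non-interaction with probability fp. The function names, the dictionary layout, and the domain ids in the example are hypothetical.

```python
from itertools import product


def protein_interaction_prob(domains_a, domains_b, lam):
    """Probability that two proteins interact (assumed model).

    lam maps an unordered domain pair, stored as a sorted tuple, to its
    interaction probability. Pairs missing from lam count as 0.
    """
    p_none = 1.0  # probability that no domain pair interacts
    for m, n in product(domains_a, domains_b):
        key = (m, n) if m <= n else (n, m)
        p_none *= 1.0 - lam.get(key, 0.0)
    return 1.0 - p_none


def observed_prob(p_interact, fp=2.5e-4, fn=0.80):
    """Probability that a noisy experiment reports the interaction."""
    return p_interact * (1.0 - fn) + (1.0 - p_interact) * fp


# Hypothetical example: protein A has two domains, protein B has one.
lam = {("D1", "D2"): 0.5, ("D2", "D3"): 0.2}
p = protein_interaction_prob(["D1", "D3"], ["D2"], lam)
print(p, observed_prob(p))
```

With a high fn such as 0.80, even a pair that is likely to interact has a low chance of being observed, which is why the observed Y2H data are expected to miss many true interactions.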
We compute the specificity and sensitivity of our predictions for different values of fn (with fp fixed) and compare them with those of the association method.
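As a rough sketch of such a comparison, the snippet below computes both quantities for one set of predictions. The definitions are an assumption made for illustration: specificity is taken as the fraction of predicted pairs found in the reference interactions, and sensitivity as the fraction of reference interactions that are predicted. The function names and protein ids are hypothetical.

```python
def normalize(pairs):
    """Treat (A, B) and (B, A) as the same unordered pair."""
    return {frozenset(p) for p in pairs}


def specificity_sensitivity(predicted, observed):
    """Return (specificity, sensitivity) under the assumed definitions."""
    pred, obs = normalize(predicted), normalize(observed)
    matched = len(pred & obs)
    spec = matched / len(pred) if pred else 0.0
    sens = matched / len(obs) if obs else 0.0
    return spec, sens


# Hypothetical example: one of three predicted pairs is in the reference set.
predicted = [("A", "B"), ("C", "D"), ("E", "F"), ("B", "A")]
observed = [("B", "A"), ("G", "H")]
spec, sens = specificity_sensitivity(predicted, observed)
print(spec, sens)
```

Repeating the call for the prediction set obtained at each fn value gives one (specificity, sensitivity) point per setting.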
We count the matches between our predictions and the MIPS data and compare them with the number of matches expected at random (fold numbers). The results are listed in the following tables.
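One way to obtain a fold number is sketched below. It assumes that the random baseline is the expected number of matches when the same number of pairs is drawn uniformly from all possible protein pairs. How the baseline was actually computed is not stated here, so the formula and the function name are illustrative only.

```python
def fold_number(n_matches, n_predicted, n_reference, n_total_pairs):
    """Observed matches divided by the matches expected at random.

    Assumes n_predicted pairs are drawn uniformly from n_total_pairs
    possible pairs, n_reference of which are reference interactions.
    """
    expected = n_predicted * n_reference / n_total_pairs
    return n_matches / expected


# Hypothetical counts, not taken from the tables on this page.
fold = fold_number(n_matches=50, n_predicted=1000,
                   n_reference=2000, n_total_pairs=1_000_000)
print(fold)
```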
We compute the gene expression correlation for each predicted interacting protein pair and compare the predictions against randomly chosen pairs. The statistics are given in the following table.
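The comparison can be sketched as below, assuming the Pearson correlation coefficient is the correlation measure and that every gene has a complete expression profile with non-zero variance. Missing values, which are common in real expression data, are not handled. The helper names and the toy profiles are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_pair_correlation(pairs, expr):
    """Average correlation over pairs with expression data for both genes."""
    vals = [pearson(expr[a], expr[b])
            for a, b in pairs if a in expr and b in expr]
    return sum(vals) / len(vals)


def random_pairs(genes, k, seed=0):
    """Draw k random pairs of distinct genes as a baseline."""
    rng = random.Random(seed)
    return [tuple(rng.sample(genes, 2)) for _ in range(k)]


# Toy profiles with 4 time points instead of 79.
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
predicted_mean = mean_pair_correlation([("g1", "g2")], expr)
baseline_mean = mean_pair_correlation(random_pairs(list(expr), 5), expr)
print(predicted_mean, baseline_mean)
```

A predicted set whose mean correlation is clearly higher than that of the random pairs supports the predictions, since interacting proteins tend to be co-expressed.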
We repeat the above procedure on the MIPS data with fp=0.00 and fn=0.80:
Text file (207K byte)
Compressed file (44K byte---Recommended)
Text file (3.03M byte)
Compressed file (507K byte---Recommended)
We compute the specificity and sensitivity of our predictions for different values of fn (with fp fixed to 0.0) and compare them with those of the association method.
Similarly, the following tables list all the pairwise matches among our predictions, the MIPS data, and the Y2H data.
We compute the expression correlation for each predicted interacting protein pair and compare the predictions against randomly chosen pairs. The statistics are given in the following table.
Tim Chen
Last modified: Wednesday Jun 12, 2002.