Figure S2.
Cancer Functional Events on Cancer Cell Lines, Related to Figure 2
(A) Status of 1,273 Cancer Functional Events (CFEs) identified from primary tumor data in 1,001 cancer cell lines. Each column is a cell line, colors at the top indicate cancer types, and each row is a CFE. The heatmap is divided horizontally into three parts: (i) high-confidence cancer driver genes, (ii) focal recurrently aberrant copy number segments, and (iii) informative CpG islands. White space denotes absence of a functional event, whereas presence is indicated using the color schemes in the adjacent legends.
(B) Number of cancer-specific CFEs occurring in at least one cell line from the corresponding tissue, across the three molecular data types. Box plots on the right show the frequency of the missing CFEs in the primary tumors for each cancer type. Percentages of missing cancer genes for each cancer type are grouped by confidence (i.e., A = more than two signals of positive selection, B = two signals of positive selection, C = one signal of positive selection).
(C) Example of a CFE frequency scatter plot for COAD/READ. Each circle is a CFE whose occurrence frequencies across cell lines and primary tumors are given by its x and y coordinates, respectively. Different CFE types are indicated by color, and the corresponding correlation scores are reported in the inset.
(D) Nearest neighbor analysis of similarities among cell lines and primary tumors based on frequency profiles accounting for all the CFEs. The proximity of two points is proportional to the correlation between the two corresponding CFE frequency profiles. A line connects each point to its closest neighbor (indicated by the small black dot).
(E) Performance of a k-nearest-neighbor classifier based on a comprehensive correlation distance between cell lines and primary tumors, accounting for all the CFEs.
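
The comparisons in (C) to (E) all rest on correlating CFE frequency profiles. The sketch below illustrates that idea under explicit assumptions: it uses Pearson correlation and defines the distance as 1 minus the correlation, neither of which is specified in this legend. The function names, the labels (`tissue_1`, etc.), and the frequency values are hypothetical toy inputs, not data from the study.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length frequency profiles.

    Assumption: the legend does not state which correlation measure was used.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("constant profile: correlation is undefined")
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Assumed distance: 1 - correlation (0 = identical trend, 2 = opposite)."""
    return 1.0 - pearson(x, y)


def knn_predict(query, references, k=1):
    """Majority vote among the k reference profiles closest to `query`.

    `references` is a list of (label, profile) pairs.
    """
    ranked = sorted(references,
                    key=lambda item: correlation_distance(query, item[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy profiles: frequency of four CFEs in three tumor types.
tumor_profiles = [
    ("tissue_1", [0.60, 0.45, 0.05, 0.10]),
    ("tissue_2", [0.05, 0.10, 0.35, 0.30]),
    ("tissue_3", [0.10, 0.50, 0.05, 0.45]),
]

# Hypothetical CFE frequencies observed across cell lines of one tissue.
cell_line_profile = [0.55, 0.50, 0.10, 0.05]

predicted = knn_predict(cell_line_profile, tumor_profiles, k=1)
print(predicted)
```

With k = 1 this reduces to the nearest neighbor lookup described in (D); larger k values give the majority-vote behavior of a k-nearest-neighbor classifier as in (E).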