number of focal copy number changes, a lower incidence of homozygous deletions (e.g., tumor suppressor NF1 is rarely hit in genome-doubled cases) and a higher cancer recurrence rate.
As ABSOLUTE and similar methods are refined in the future, it is worth considering why they cannot deduce tumor purity and ploidy in all cases. In the pan-cancer data set, ABSOLUTE judged 9.1% of samples to be nonaberrant. These cancers have genomes with perfectly normal copy number profiles (at least at the resolution of the assay), so it is impossible to determine sample purity from copy-number data alone, although this could theoretically be resolved using point mutation data. Another reason for failure is insufficient purity (observed in 7.3% of cases), in which an excess of normal cells contaminates the sample. This is inherently a sample problem, and will affect some tumor types more than others. Finally, 6.9% of samples failed because they were too complex to interpret. For instance, when every region in the genome is subject to subclonal copy-number changes, it may become mathematically impossible to calculate purity and ploidy. However, these may be very interesting cases biologically because of their ongoing evolution and the competition between subclones. Improved methods, possibly combining copy number and point mutation data, may be able to handle this complexity.
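The first two failure modes can be illustrated with a minimal sketch. It assumes a simplified tumor/normal mixture model in which normal cells carry two copies of every locus and the measured copy ratio is the locus average divided by the genome-wide average. The function name and parameters are hypothetical and chosen for illustration; this is not the ABSOLUTE implementation.

```python
def expected_copy_ratio(tumor_copies, purity, tumor_ploidy):
    """Expected relative copy ratio of one segment in a tumor/normal mixture.

    Assumption (illustrative model): normal cells have 2 copies everywhere,
    and the assay reports copies at the locus relative to the genome average.
    """
    # Average copies per cell at this locus across tumor and normal cells.
    locus = purity * tumor_copies + (1.0 - purity) * 2.0
    # Average copies per cell across the whole genome (normalisation term).
    genome = purity * tumor_ploidy + (1.0 - purity) * 2.0
    return locus / genome


purities = (0.1, 0.5, 0.9)

# Nonaberrant tumor: 2 copies everywhere, ploidy 2.
# The ratio does not depend on purity, so purity cannot be inferred.
flat = [expected_copy_ratio(2, p, 2.0) for p in purities]

# Single-copy loss in an otherwise near-diploid tumor (ploidy taken as 2).
# The deviation from 1 shrinks as purity falls.
loss = [expected_copy_ratio(1, p, 2.0) for p in purities]

print(flat)
print(loss)
```

Under these assumptions the nonaberrant profile yields the same ratio at every purity, whereas the single-copy loss moves toward 1 as the normal fraction grows, which is why low-purity samples leave little signal to fit.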
Tens of thousands of cancer genomes will be sequenced over the next few years. Methods such as ABSOLUTE will undoubtedly contribute toward their analysis, generating new insights into oncogenesis, tumor evolution and subclonal diversification. Future methods will likely be run directly on sequencing data, eliminating the cost and extra sample requirements of SNP arrays, which are now almost routinely performed alongside massively parallel sequencing experiments. Methods that integrate across different ‘views’ of the data and across different classes of mutation will become increasingly important to understand the complexity of cancer genomes. The work of Carter et al.1 exemplifies such a strategy, which combines multiple sources of data to analyze cancer genomes, revealing considerable biological insights.