Moreover, there was inverse correlation between H3K27me3 levels and expression (Fig. will yield new insights into cancer not anticipated by existing knowledge.

Multiple myeloma (MM) is an incurable malignancy of mature B-lymphoid cells, and its pathogenesis is only partially understood. About 40% of cases harbor chromosome translocations resulting in over-expression of genes (including and their juxtaposition to the immunoglobulin heavy chain (IgH) locus1. Other cases exhibit hyperdiploidy. However, these abnormalities are likely insufficient for malignant transformation because they are also observed in the pre-malignant syndrome known as monoclonal gammopathy of undetermined significance (MGUS). Malignant progression events include activation of and and activation of the NF-κB pathway1-3. More recently, loss-of-function mutations in the histone demethylase have also been reported4.

A powerful way to understand the molecular basis of cancer is to sequence either the entire genome or the protein-coding exome, comparing tumor to normal tissue from the same patient in order to identify the acquired somatic mutations. Recent reports have described the sequencing of whole genomes from a single patient5-9. While such single-patient studies are useful, we hypothesized that a larger number of cases would permit the identification of biologically relevant patterns that would not otherwise be evident.

Landscape of MM mutations

We studied 38 MM patients (Supplementary Table 1), performing whole-genome sequencing (WGS) for 23 patients and whole-exome sequencing (WES; assessing 164,687 exons) for 16 patients, with one patient analyzed by both approaches (Supplementary Information). WES is a cost-effective strategy for identifying protein-coding mutations, but it cannot detect non-coding mutations and rearrangements.
We identified tumor-specific mutations by comparing each tumor to its corresponding normal sample, using a series of algorithms designed to detect point mutations, small insertions/deletions (indels) and other rearrangements (Supplementary Fig. 1). Based on WGS, the frequency of tumor-specific point mutations was 2.9 per million bases, corresponding to approximately 7,450 point mutations per sample across the genome, including an average of 35 amino acid-changing point mutations plus 21 chromosomal rearrangements disrupting protein-coding regions (Supplementary Tables 2 and 3). The mutation-calling algorithm was found to be highly accurate, with a true positive rate of 95% for point mutations (Supplementary text, Supplementary Tables 4 and 5, and Supplementary Fig. 2).

The mutation rate across the genome varied greatly depending on base composition, with mutations at CpG dinucleotides occurring 4-fold more commonly than mutations at A or T bases (Supplementary Fig. 3a). In addition, even after correction for base composition, the mutation frequency in coding regions was lower than that observed in intronic and intergenic regions (p < 1 × 10⁻¹⁶; Supplementary Fig. 3b), potentially owing to negative selective pressure against mutations disrupting coding sequences. There was also a lower mutation rate in intronic regions compared to intergenic regions (p < 1 × 10⁻¹⁶), which may reflect transcription-coupled repair, as previously suggested10, 11. Consistent with this explanation, we observed a lower mutation rate in introns of genes expressed in MM compared to those not expressed (Fig. 1a).

Figure 1. Evidence for transcription-coupled repair and functional importance (FI) of statistically significant mutations. (a) Intronic mutation rates subdivided by gene expression rates in MM. Rates of gene expression were estimated by the proportion of Affymetrix Present (P) calls in 304 primary MM samples.
Error bars indicate standard deviation. (b) FI scores were generated for all point mutations and divided into distributions for non-significant mutations (upper histogram) and significant mutations (lower histogram). Distributions were compared using the Kolmogorov-Smirnov statistic.

Frequently mutated genes

We next focused on the distribution of somatic, non-silent protein-coding mutations. We estimated statistical significance by comparison to the background distribution of mutations (Supplementary Information). Ten genes showed statistically significant rates of protein-altering mutations (significantly mutated genes) at a False Discovery Rate (FDR) of 0.10 (Table 1). To investigate their functional importance, we compared the predicted consequence of their mutations (based on evolutionary conservation and the nature of the amino acid change) to the distribution for all coding mutations. This analysis showed a dramatic skewing of functional importance (FI) scores12 for the 10 significantly mutated genes (p = 7.6 × 10⁻¹⁴; Fig. 1b), supporting their biological relevance. Even after RAS and p53 mutations were excluded from the analysis, the skewing remained significant (p < 0.01).

Table 1. Statistically significant protein-coding mutations in MM. Territory (N) refers to total covered territory.
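
The comparison of FI-score distributions for Fig. 1b is described as using the Kolmogorov-Smirnov statistic. As a rough illustration of what that statistic measures, the following Python sketch computes the two-sample version as the largest gap between two empirical cumulative distributions. The function name `ks_statistic` and the score lists are hypothetical and chosen only for illustration; they are not the study's code or data, and the sketch does not compute a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the empirical CDFs of the two samples."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values,
    # so evaluating the gap at those values is sufficient.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical FI scores (illustrative values only, not data from the study).
background_scores = [0.2, 0.5, 0.9, 1.1, 1.4, 1.8]
significant_scores = [1.6, 2.1, 2.4, 2.9, 3.3]

if __name__ == "__main__":
    print(ks_statistic(background_scores, significant_scores))
```

A value near 0 means the two score distributions largely overlap, while a value near 1 means they are almost completely separated. In practice an established routine that also reports a p-value would normally be preferred over a hand-written one.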