Integrative omics for health and disease
Advances in omics technologies — such as genomics, transcriptomics, proteomics and metabolomics — have begun to enable personalized medicine at an extraordinarily detailed molecular level. Individually, these technologies have contributed medical advances that have begun to enter clinical practice. However, each technology individually cannot capture the entire biological complexity of most human diseases. Integration of multiple technologies has emerged as an approach to provide a more comprehensive view of biology and disease. In this Review, we discuss the potential for combining diverse types of data and the utility of this approach in human health and disease. We provide examples of data integration to understand, diagnose and inform treatment of diseases, including rare and common diseases as well as cancer and transplant biology. Finally, we discuss technical and other challenges to clinical implementation of integrative omics.
Long-read sequence and assembly of segmental duplications
Mitchell R. Vollger et al.
We have developed a computational method based on polyploid phasing of long sequence reads to resolve collapsed regions of segmental duplications within genome assemblies. Segmental Duplication Assembler (SDA; https://github.com/mvollger/SDA) constructs graphs in which paralogous sequence variants define the nodes and long-read sequences provide attraction and repulsion edges, enabling the partition and assembly of long reads corresponding to distinct paralogs. We apply it to single-molecule, real-time sequence data from three human genomes and recover 33–79 megabase pairs (Mb) of duplications in which approximately half of the loci are diverged (<99.8%) compared to the reference genome. We show that the corresponding sequence is highly accurate (>99.9%) and that the diverged sequence corresponds to copy-number-variable paralogs that are absent from the human reference genome. Our method can be applied to other complex genomes to resolve the last gene-rich gaps, improve duplicate gene annotation, and better understand copy-number-variant genetic diversity at the base-pair level.
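The graph idea in this abstract can be illustrated with a toy sketch. It is not SDA's implementation: the input format (per-read boolean PSV calls), the unit edge weights, and the clustering step (connected components over net-attractive edges) are simplifying assumptions made here for illustration, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_psv_graph(reads):
    """Score PSV pairs from reads (hypothetical input format).

    reads maps a read name to {psv_id: bool}; True means the read carries
    the paralog-specific allele, False means it spans the site without it.
    A read carrying both PSVs adds an attraction (+1); a read carrying
    exactly one of the two adds a repulsion (-1).
    """
    weights = defaultdict(int)
    for calls in reads.values():
        for a, b in combinations(sorted(calls), 2):
            if calls[a] and calls[b]:
                weights[(a, b)] += 1
            elif calls[a] != calls[b]:
                weights[(a, b)] -= 1
    return weights


def cluster_psvs(weights, psvs):
    """Group PSVs joined by net-positive edges (union-find).

    This is a stand-in for a real graph-partitioning step.
    """
    parent = {p: p for p in psvs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), w in weights.items():
        if w > 0:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in psvs:
        groups[find(p)].add(p)
    return sorted(groups.values(), key=sorted)


def partition_reads(reads, clusters):
    """Assign each read to the PSV cluster it shares the most alleles with."""
    assignment = {}
    for name, calls in reads.items():
        carried = {p for p, present in calls.items() if present}
        overlaps = [len(carried & c) for c in clusters]
        best = max(range(len(clusters)), key=overlaps.__getitem__)
        assignment[name] = best if overlaps[best] > 0 else None
    return assignment


# Toy data: two paralogs, distinguished by PSVs {p1, p2} and {p3, p4}.
toy_reads = {
    "r1": {"p1": True, "p2": True, "p3": False},
    "r2": {"p1": True, "p2": True, "p3": False},
    "r3": {"p1": False, "p2": False, "p3": True, "p4": True},
    "r4": {"p3": True, "p4": True},
}
toy_psvs = sorted({p for calls in toy_reads.values() for p in calls})
toy_clusters = cluster_psvs(build_psv_graph(toy_reads), toy_psvs)
toy_assignment = partition_reads(toy_reads, toy_clusters)
```

In this toy case the reads split into two groups, one per paralog, and each group could then be assembled separately. Real data would need error-tolerant PSV calling and a partitioning method that weighs repulsion edges rather than ignoring them.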
De novo assembly of haplotype-resolved genomes with trio binning
Sergey Koren et al.
Complex allelic variation hampers the assembly of haplotype-resolved sequences from diploid genomes. We developed trio binning, an approach that simplifies haplotype assembly by resolving allelic variation before assembly. In contrast with prior approaches, the effectiveness of our method improved with increasing heterozygosity. Trio binning uses short reads from two parental genomes to first partition long reads from an offspring into haplotype-specific sets. Each haplotype is then assembled independently, resulting in a complete diploid reconstruction. We used trio binning to recover both haplotypes of a diploid human genome and identified complex structural variants missed by alternative approaches. We sequenced an F1 cross between the cattle subspecies Bos taurus taurus and Bos taurus indicus and completely assembled both parental haplotypes with NG50 haplotig sizes of >20 Mb and 99.998% accuracy, surpassing the quality of current cattle reference genomes. We suggest that trio binning improves diploid genome assembly and will facilitate new studies of haplotype variation and inheritance.
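The partitioning step described in this abstract can be sketched in a few lines. The abstract does not say how parental reads are compared with offspring reads; the sketch below assumes a k-mer approach, ignores reverse complements and sequencing errors, and uses hypothetical function names and a toy k of 4.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets from parental reads."""
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    return mat - pat, pat - mat


def bin_reads(offspring_reads, mat_only, pat_only, k):
    """Sort offspring reads into haplotype bins by parent-specific k-mer hits.

    Reads with equal hit counts (including zero) stay unassigned.
    """
    bins = {"maternal": [], "paternal": [], "unassigned": []}
    for read in offspring_reads:
        read_kmers = kmers(read, k)
        m = len(read_kmers & mat_only)
        p = len(read_kmers & pat_only)
        if m > p:
            bins["maternal"].append(read)
        elif p > m:
            bins["paternal"].append(read)
        else:
            bins["unassigned"].append(read)
    return bins


# Toy data: the parents differ at a single base (A vs T at position 5).
K = 4
mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], K)
toy_bins = bin_reads(["GTACGT", "GTTCGT", "ACGT"], mat_only, pat_only, K)
```

Each bin would then be passed to an assembler on its own. The sketch also suggests why higher heterozygosity helps: more differences between the parents yield more parent-specific k-mers, so fewer offspring reads end up unassigned.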
A Tunable Mechanism Determines the Duration of the Transgenerational Small RNA Inheritance in C. elegans
Lea Houri-Ze’evi et al.
In C. elegans, small RNAs enable transmission of epigenetic responses across multiple generations. While RNAi inheritance mechanisms that enable “memorization” of ancestral responses are being elucidated, the mechanisms that determine the duration of inherited silencing and the ability to forget the inherited epigenetic effects are not known. We now show that exposure to dsRNA activates a feedback loop whereby gene-specific RNAi responses dictate the transgenerational duration of RNAi responses mounted against unrelated genes, elicited separately in previous generations. RNA-sequencing analysis reveals that, aside from silencing of genes with complementary sequences, dsRNA-induced RNAi affects the production of heritable endogenous small RNAs, which regulate the expression of RNAi factors. Manipulating genes in this feedback pathway changes the duration of heritable silencing. Such active control of transgenerational effects could be adaptive, since ancestral responses would be detrimental if the environments of the progeny and the ancestors were different.