Tuesday, February 24, 2015

Fwd: NxTrim: optimized trimming of Illumina mate pair reads

Fwd: please follow footer link

NxTrim: optimized trimming of Illumina mate pair reads: "

Motivation: Mate pair protocols add to the utility of paired-end sequencing by boosting the genomic distance spanned by each pair of reads, potentially allowing larger repeats to be bridged and resolved. The Illumina Nextera Mate Pair (NMP) protocol employs a circularisation-based strategy that leaves behind 38bp adapter sequences which must be computationally removed from the data. While 'adapter trimming' is a well-studied area of bioinformatics, existing tools do not fully exploit the particular properties of NMP data and discard more data than is necessary.

Results: We present NxTrim, a tool that strives to discard as little sequence as possible from NMP reads. NxTrim makes full use of the sequence on both sides of the adapter site to build 'virtual libraries' of mate pairs, paired-end reads and single-ended reads. For bacterial data, we show that aggregating these datasets allows a single NMP library to yield an assembly whose quality compares favourably to that obtained from regular paired-end reads.

Availability: The source code is available at https://github.com/sequencing/NxTrim

Contact: acox@illumina.com

"

(Via Bioinformatics - Advance Access.)
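
The idea of keeping the sequence on both sides of the adapter, rather than discarding everything after it, can be illustrated with a minimal Python sketch. This is not NxTrim's algorithm: it uses exact string matching only, the function name `split_at_adapter` and the `min_len` threshold are hypothetical, and `TOY_ADAPTER` is a short stand-in because the abstract does not give the real 38bp adapter sequence.

```python
# Toy stand-in adapter (assumption); the abstract does not list the real sequence.
TOY_ADAPTER = "CTGTCTCT"


def split_at_adapter(read, adapter=TOY_ADAPTER, min_len=4):
    """Return the (left, right) segments flanking the first exact adapter hit.

    Segments shorter than min_len are returned as empty strings.
    A read without the adapter is returned whole as the left segment.
    """
    hit = read.find(adapter)
    if hit == -1:
        return read, ""
    left = read[:hit]
    right = read[hit + len(adapter):]
    # Drop fragments too short to be useful downstream.
    if len(left) < min_len:
        left = ""
    if len(right) < min_len:
        right = ""
    return left, right


if __name__ == "__main__":
    print(split_at_adapter("AAAACCCC" + TOY_ADAPTER + "GGGGTTTT"))
```

A real trimmer would also need to tolerate sequencing errors and partial adapters at the read ends, which this sketch ignores.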

Fwd: rbamtools: an R interface to samtools enabling fast accumulative tabulation of splicing events over multiple RNA-seq samples

rbamtools: an R interface to samtools enabling fast accumulative tabulation of splicing events over multiple RNA-seq samples: "

Summary: The open source environment R is the most widely used software to statistically explore biological data sets including sequence alignments. BAM is the de facto standard file format for sequence alignment. With rbamtools, we now provide a full spectrum of accessibility to BAM for R users such as reading, writing, extraction of subsets and plotting of alignment depth where the script syntax closely follows the SAM/BAM format. Additionally, rbamtools enables fast accumulative tabulation of splicing events over multiple BAM files.

Availability and implementation: rbamtools is available on CRAN and on R-Forge.

Contact: kaisers@med.uni-duesseldorf.de

Supplementary information: Supplementary data are available at Bioinformatics online.

"

(Via Bioinformatics - Advance Access.)
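
What "accumulative tabulation of splicing events" means can be shown with a small Python sketch. It is not the rbamtools API (which is an R package); the function names `junctions` and `tabulate` are hypothetical, and alignments are assumed to be already available as `(chrom, pos, cigar)` tuples with 1-based positions, as in SAM. A splice junction is taken to be any `N` operation in the CIGAR string.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def junctions(chrom, pos, cigar):
    """Return (chrom, intron_start, intron_end) for each N gap in one alignment."""
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            found.append((chrom, pos, pos + length - 1))
        if op in REF_CONSUMING:
            pos += length
    return found


def tabulate(samples):
    """Accumulate junction counts over several samples (lists of alignments)."""
    counts = Counter()
    for alignments in samples:
        for chrom, pos, cigar in alignments:
            counts.update(junctions(chrom, pos, cigar))
    return counts


if __name__ == "__main__":
    sample_a = [("chr1", 100, "50M200N50M")]
    sample_b = [("chr1", 100, "50M200N50M"), ("chr1", 120, "30M200N70M")]
    print(tabulate([sample_a, sample_b]))
```

In this toy input all three alignments support the same intron, so a single junction accumulates the count across both samples.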

Monday, February 23, 2015

Fwd: Vanno: A Visualization-Aided Variant Annotation Tool

Vanno: A Visualization-Aided Variant Annotation Tool: "

ABSTRACT

Next-generation sequencing (NGS) technologies have revolutionized the field of genetics and are trending toward clinical diagnostics. Exome and targeted sequencing in a disease context represent a major NGS clinical application, considering its utility and cost-effectiveness. With the ongoing discovery of disease-associated genes, various gene panels have been launched for both basic research and diagnostic tests. However, the fundamental inconsistencies among the diverse annotation sources, software packages, and data formats have complicated the subsequent analysis. To manage disease-associated NGS data, we developed Vanno, a Web-based application for in-depth analysis and rapid evaluation of disease-causative genome sequence alterations. Vanno integrates information from biomedical databases, functional predictions from available evaluation models, and mutation landscapes from TCGA cancer types. A highly integrated framework that incorporates filtering, sorting, clustering, and visual analytic modules is provided to facilitate exploration of oncogenomics datasets at different levels, such as gene, variant, protein domain, or three-dimensional structure. Such design is crucial for the extraction of knowledge from sequence alterations and translating biological insights into clinical applications. Taken together, Vanno supports almost all disease-associated gene tests and exome sequencing panels designed for NGS, providing a complete solution for targeted and exome sequencing analysis. Vanno is freely available at http://cgts.cgu.edu.tw/vanno.

Vanno supports almost all disease-associated gene tests and exome sequencing panels designed for NGS, providing a complete solution for targeted and exome sequencing analysis. Vanno also integrates information from biomedical databases, functional predictions from available evaluation models, and mutation landscapes from TCGA cancer types. A highly integrated framework that incorporates filtering, sorting, clustering, and visual analytic modules is provided to facilitate exploration of oncogenomics datasets at different levels, such as gene, variant, protein domain, or three-dimensional structure.

"

(Via human mutation.)

Fwd: mit-o-matic: A comprehensive computational pipeline for clinical evaluation of mitochondrial variations from next-generation sequencing datasets

mit-o-matic: A comprehensive computational pipeline for clinical evaluation of mitochondrial variations from next-generation sequencing datasets: "

ABSTRACT

The human mitochondrial genome has been reported to have a very high mutation rate as compared with the nuclear genome. A large number of mitochondrial mutations show significant phenotypic association and are involved in a broad spectrum of diseases. In recent years, there has been remarkable progress in the understanding of mitochondrial genetics. The availability of Next Generation Sequencing technologies has not only reduced sequencing cost by orders of magnitude but has also provided us good quality mitochondrial genome sequences with high coverage, thereby enabling decoding of a number of human mitochondrial diseases. In this study, we report a computational and experimental pipeline to decipher the human mitochondrial DNA (mtDNA) variations and examine them for their clinical correlation. As a proof of principle, we also present a clinical study of a patient with Leigh disease and confirmed maternal inheritance of the causative allele. The pipeline is made available as a user-friendly online tool to annotate variants and find haplogroup, disease association and heteroplasmic sites. To the best of our knowledge, this is the first and the most comprehensive tool for clinical evaluation of mitochondrial genomic variations from Next Generation Sequencing datasets. The tool is freely available at http://genome.igib.res.in/mitomatic/.

This article is protected by copyright. All rights reserved

"

(Via human mutation.)

Fwd: Oncotator: Cancer Variant Annotation Tool

Oncotator: Cancer Variant Annotation Tool: "

ABSTRACT

Oncotator is a tool for annotating genomic point mutations and short nucleotide insertions/deletions (indels) with variant- and gene-centric information relevant to cancer researchers. This information is drawn from 14 different publicly available resources that have been pooled and indexed, and we provide an extensible framework to add additional data sources. Annotations linked to variants range from basic information, such as gene names and functional classification (e.g. missense), to cancer-specific data from resources such as the Catalogue of Somatic Mutations in Cancer (COSMIC), the Cancer Gene Census, and The Cancer Genome Atlas (TCGA). For local use, Oncotator is freely available as a Python module hosted on GitHub (https://github.com/broadinstitute/oncotator). Furthermore, Oncotator is also available as a web service and web application at http://www.broadinstitute.org/oncotator/. ©2015 Wiley-Liss, Inc.

"

(Via human mutation.)

Monday, January 26, 2015

BLAST: Create Custom Databases for Web BLAST - YouTube

Create a subset database on the fly to speed up your homology search. You can restrict the subset to:

* a set of organisms
* a range of transcript lengths
* a range of protein lengths
* a range of protein MW
* any other Entrez query
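
As a rough illustration, the restrictions listed above can be combined into a single Entrez query string for the Web BLAST "Entrez Query" box. The helper below is a hypothetical sketch, not an NCBI tool; the field tags `[Organism]`, `[SLEN]` (sequence length) and `[MLWT]` (molecular weight) are assumed from Entrez search syntax and should be checked against the NCBI help pages before use.

```python
def entrez_filter(organism=None, length_range=None, mw_range=None, extra=None):
    """Build an Entrez query string that ANDs together the given restrictions."""
    parts = []
    if organism:
        parts.append(f"{organism}[Organism]")
    if length_range:
        low, high = length_range
        parts.append(f"{low}:{high}[SLEN]")  # sequence length range
    if mw_range:
        low, high = mw_range
        parts.append(f"{low}:{high}[MLWT]")  # protein molecular weight range
    if extra:
        parts.append(f"({extra})")  # any other Entrez query, kept in parentheses
    return " AND ".join(parts)


if __name__ == "__main__":
    print(entrez_filter(organism="Homo sapiens", length_range=(200, 500)))
```

The resulting string is pasted into the search form; BLAST then searches only the sequences that match it.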

Several very useful webcasts are available on YouTube:

Create Custom Databases for Web BLAST

Also watch these:
BLAST Results: Expect Values, part 1
BLAST Results: Expect Values, part 2


Have a nice Blast!

Monday, January 19, 2015

Fwd: Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data

Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data: "

The current version of the Human Disease Ontology (DO) (http://www.disease-ontology.org) database expands the utility of the ontology for the examination and comparison of genetic variation, phenotype, protein, drug and epitope data through the lens of human disease. DO is a biomedical resource of standardized common and rare disease concepts with stable identifiers organized by disease etiology. The content of DO has had 192 revisions since 2012, including the addition of 760 terms. Thirty-two percent of all terms now include definitions. DO has expanded the number and diversity of research communities and community members by 50+ during the past two years. These community members actively submit term requests, coordinate biomedical resource disease representation and provide expert curation guidance. Since the DO 2012 NAR paper, there have been hundreds of term requests and a steady increase in the number of DO listserv members, Twitter followers and DO website usage. DO is moving to a multi-editor model utilizing Protégé to curate DO in Web Ontology Language (OWL). This will enable closer collaboration with the Human Phenotype Ontology, EBI's Ontology Working Group, Mouse Genome Informatics and the Monarch Initiative among others, and enhance DO's current asserted view and multiple inferred views through reasoning.

"

(Via NAR.)