Consiglio Nazionale delle Ricerche

Product type: Contribution in conference proceedings
Title: Statistical analysis of electrophoresis time series for improving basecalling in DNA sequencing
Publication year: 2006
Format: Electronic
Author(s): Tonazzini A., Bedini L.
Author affiliations: CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy
CNR authors and affiliations
  • ANNA TONAZZINI
Language(s)
  • English
Abstract: In automated DNA sequencing, the final algorithmic phase, referred to as basecalling, translates four time signals, each a sequence of peaks (together, the electropherogram), into the corresponding sequence of bases. Commercial basecallers detect the peaks using heuristics, and are very efficient when the peaks are distinct and regular in spread, amplitude and spacing. Unfortunately, in practice the signals are subject to several degradations, among which peak superposition and peak merging are the most frequent. In these cases the experiment must be repeated and human intervention is required. Recently, there have been attempts to provide methodological foundations for the problem and to use statistical models to solve it. In this paper, we exploit a priori information and Bayesian estimation to remove degradations and recover the signals in an impulsive form, which makes basecalling straightforward.
Abstract language: English
Other abstract: -
Other abstract language: -
Pages from: -
Pages to: -
Total pages: 10
Journal: -
Journal volume number: -
Series: -
Volume title: -
Series volume number: -
Volume editor(s): -
ISBN: -
DOI: -
Publisher: -
Peer reviewed: Yes (international)
Publication status: Published version
Indexing (in controlled databases): -
Keywords: DNA sequencing
Link (URL, URI): -
Conference title: ICDM 2006, Workshop on Mass-Data Analysis of Images and Signals in Medicine, Biotechnology and Chemistry MDA´2006
Conference location: Leipzig
Conference date(s): 13/07/2006
Relevance: International
Presentation type: Contributed
Parallel title: -
Notes/Other information: Puma code: cnr.isti/2006-A2-32
CNR units
  • ISTI — Istituto di scienza e tecnologie dell'informazione "Alessandro Faedo"
CNR modules/activities/subprojects
  • INT.P02.003.004: Models, methods and algorithms for the analysis of genomic data and biological processes
European projects: -
Attachments
Statistical analysis of electrophoresis time series for improving basecalling in DNA sequencing (private document)
Document type: application/pdf

Historical data
Historical data cannot be modified; they were inherited from other systems (e.g. Gestione Istituti, PUMA, ...) and have historical value only.
Disciplinary area: Computer Science & Engineering
Notes: In: ICDM 2006, Workshop on Mass-Data Analysis of Images and Signals in Medicine, Biotechnology and Chemistry MDA´2006 (Leipzig, 13 July 2006). Proceedings, pp. 10-. 2006.