Approximate Bayesian Approach to Mapping Paired-End DNA Reads to a Reference Genome has been published in Bioinformatics!
2013-01: Finding Protein-Coding Genes through Human Polymorphisms has been published in PLoS ONE!
I am a computational biologist working at the CBRC, which is part of AIST. The CBRC is on the island of Odaiba, a futuristic entertainment district near central Tokyo.
|Age||3 thousand years||3 billion years|
|Length||4 million letters||3 billion letters|
Genomes are palimpsests of unthinkable antiquity, which hold the secrets to technology more advanced than any achievement of human civilization. We live in exciting times: genomes have been sequenced only recently, and we have barely begun to decipher them.
I have two research styles: developing software tools for analyzing biological data, and investigating biological questions computationally. Recently I have been sucked into tool development, but I'd like to return to biological questions sometime.
|LAST is a general-purpose, high-throughput sequence aligner. It can compare multi-gigabase datasets to each other, use sequence quality data in a rigorous fashion, align DNA to proteins with frameshifts, and estimate the reliability of each aligned column.||Genome Research, 2011|
|tantan masks low-complexity regions in biological sequences. It aims to prevent spurious alignments when searching for homologs (evolutionarily-related sequences). It does so much more reliably than previous methods.||Nucleic Acids Research, 2011|
|seg-suite provides tools for manipulating segments and alignments. It can compose alignments, find intersections, etc.||Unpublished|
|DNemulator is a package for simulating DNA sequencing errors, polymorphisms, cytosine methylation and bisulfite conversion.||Nucleic Acids Research, 2012|
|Paraclu is a method for finding clusters in data attached to sequences, for example transcription start counts in genome sequences. It imposes minimal prior assumptions, and it typically finds a hierarchy of clusters within clusters.||Genome Research, 2008|
|GLAM2 is a method for discovering motifs (re-occurring sequence patterns) in sequences. It allows motif instances to vary by insertions and deletions. It is part of the MEME Suite.||PLoS Computational Biology, 2008|
|Clover tests whether known sequence motifs are over-represented in a set of DNA sequences.||Nucleic Acids Research, 2004|
|Cluster-Buster finds clusters of pre-specified motifs in DNA sequences.||Nucleic Acids Research, 2003|
You can find most of my published articles by searching PubMed for Frith MC. (A few are just Frith M.)
I welcome postdocs and visitors to come and work with me, but you would probably need your own funding. Likely sources include JSPS and HFSP, and there will be others depending on your nationality and other circumstances. Here is a funding guide for Europeans (pdf). Strong quantitative skills are desirable, e.g. from a background in physics or mathematics. Knowledge of biology is not essential, but willingness to learn about and deal with messy biological details is. Here are some project ideas, although original projects are especially welcome. Knowledge of Japanese is not necessary. 日本人も歓迎です (Japanese applicants are also welcome). You can apply to work at the CBRC here.
Email: martin followed by @ followed by cbrc.jp. This may change periodically to avoid spam. For personal email, please use my gmail.com address.
Address: AIST Tokyo Waterfront Bio-IT Research Building, 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan. Access.
Tel: I prefer email. Fax: +81-3-3599-8081
Last modified 2013-04-25