Tuesday, May 3, 2011

Treebase trees from R

UPDATE: See Carl Boettiger's functions/package on GitHub for searching Treebase: https://github.com/cboettig/treeBASE



Treebase is a great resource for phylogenetic trees, and it has a nice interface for searching for particular types of trees. However, if you simply want to download a lot of trees for analysis (like that in Davies et al.), you need to be able to access trees in bulk (I believe the Treebase folks are working on an API, though). I wrote some simple code for extracting trees from Treebase.org.

It reads an XML file of URLs, one for each tree (in this case consensus trees), parses the XML, makes a vector of URLs, reads the NEXUS files with error checking, removes the trees that gave errors, and then makes a simple plot of some metrics of the trees.
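As a rough illustration of those steps (a minimal sketch, not the original script), something like the following could work. It assumes the XML and ape packages are installed, a hypothetical XML layout in which each `<tree>` element carries a `url` attribute, and tip and node counts as the plotted metrics. The function names are made up for the example, and the demo uses local temporary files in place of real Treebase URLs.

```r
library(XML)  # XML parsing
library(ape)  # reading NEXUS files and basic tree metrics

# Parse the XML and collect the tree URLs into a character vector.
# ASSUMPTION: URLs live in a "url" attribute on <tree> elements;
# adjust the XPath and attribute name to match the real file.
extract_urls <- function(xml_text) {
  doc <- xmlParse(xml_text, asText = TRUE)
  as.character(unlist(xpathSApply(doc, "//tree", xmlGetAttr, "url")))
}

# Read one NEXUS file, returning NULL instead of stopping on an error.
read_tree_safely <- function(url) {
  tryCatch(suppressWarnings(read.nexus(url)), error = function(e) NULL)
}

# Read every URL and drop the ones that failed. Only single trees
# (class "phylo") are kept, so files holding several trees are also dropped.
fetch_trees <- function(urls) {
  trees <- lapply(urls, read_tree_safely)
  names(trees) <- urls
  Filter(function(x) inherits(x, "phylo"), trees)
}

# Simple plot of two tree metrics (an assumed choice of metrics).
plot_tree_metrics <- function(trees) {
  ntips  <- sapply(trees, Ntip)
  nnodes <- sapply(trees, Nnode)
  plot(ntips, nnodes, xlab = "Number of tips", ylab = "Number of internal nodes")
}

# Demo with local files standing in for Treebase URLs.
good <- tempfile(fileext = ".nex")
write.nexus(read.tree(text = "((A,B),(C,D));"), file = good)
bad <- tempfile(fileext = ".nex")  # never created, so reading it fails

xml_text <- sprintf('<trees><tree url="%s"/><tree url="%s"/></trees>', good, bad)
urls  <- extract_urls(xml_text)
trees <- fetch_trees(urls)  # the unreadable entry is silently removed
plot_tree_metrics(trees)
```

Returning `NULL` from the error handler keeps one bad download from stopping the whole loop, and the failed entries are filtered out afterwards in a single pass.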

Is there an easier way to do this?




2 comments:

  1. Hi Scott,

    Very nice example. I've been curious about this too; I've just put together a little R implementation of the Treebase PhyloWS API to help make this easier. If you'd like to look at the package, it's here: https://github.com/cboettig/treeBASE. The examples in the demos directory should give a good place to start. Let me know what you think or if you have any feedback.

  2. Hi Carl,

    Your search_treebase function is nice. And you can actually search instead of just downloading all trees. Nope, I can't think of any feedback at the moment at least.

    Scott
