Tuesday, June 28, 2011

rbold: An R Interface for Bold Systems barcode repository

Have you ever wanted to search and fetch barcode data from Bold Systems?

I am developing functions to interface with Bold from R. I just started, but hopefully folks will find it useful.

The code is at Github here. The two functions are still very buggy, so please bring up issues below, or in the Issues area on Github. For example, some searches work and other similar searches don't. Apologies in advance for the bugs.

Below is a screenshot of an example query using the getsampleids function to get barcode identifiers for specimens. You can then use the getseqs function to grab barcode data for one specimen or many.
[Screenshot: example getsampleids query]

Wednesday, June 22, 2011

iEvoBio 2011 Synopsis

We just wrapped up the 2011 iEvoBio meeting. It was awesome! If you didn't go this year or last year, definitely think about going next year.


Here is a list of the cool projects that were discussed at the meeting (apologies if I left some out):
  1. Vistrails: workflow tool, awesome project by Claudio Silva
  2. Commplish: meant to be used via APIs, not with the web UI
  3. Phylopic: a database of life-form silhouettes, including an API for remote access, sweet!
  4. Gloome
  5. MappingLife: awesome geographic/etc data visualization interface on the web
  6. SuiteSMA: visualizing multiple alignments
  7. treeBASE: R interface to treebase, by Carl Boettiger
  8. VertNet: database for vertebrate natural history collections
  9. RevBayes: revamp of MrBayes, with GUI, etc. 
  10. Phenoscape Knowledge Base
    • Peter Midford lightning talk: talked about matching taxonomic and genetic data
  11. BiSciCol: biological science collections tracker
  12. Ontogrator 
  13. TNRS: taxonomic name resolution service
  14. Barcode of Life data systems, and remote access
  15. Moorea Biocode Project
  16. Microbial LTER's data
  17. BirdVis: interactive bird data visualization (Claudio Silva in collaboration with Cornell Lab of Ornithology)
  18. Crowdlabs: I think the site is down right now, another project by Claudio Silva
  19. Phycas: Bayesian phylogenetics, can you just call this from R?
  20. RIP MrBayes!!!! replaced by RevBayes (see 9 above)
  21. Slides of presentations will be at Slideshare (not all presentations up yet)          
  22. A birds of a feather group I was involved in proposed an idea (TOL-o-matic) like Phylomatic, but of broader scope, for easy access and submission of trees, and perhaps even social (think just pushing a 'SHARE' button within PAUP, RevBayes, or other phylogenetics software)! 
  23. Synopses of Birds of a Feather discussion groups: http://piratepad.net/iEvoBio11-BoF-reportouts

Tuesday, June 21, 2011

PLoS journals API from R: "rplos"

The Public Library of Science (PLoS) has an API so that developers can create cool tools to access their data (including full text papers!!).

Carl Boettiger at UC Davis and I are working on R functions that use the PLoS API. See our code on Github here. See the wiki at the Github page for examples of use. We hope to deploy rplos as a package someday soon. Please feel free to suggest changes/additions to rplos in the comments below or on the Github/rplos site.

Get your own API key here.

Friday, June 10, 2011

OpenStates from R via API: watch your elected representatives

I am writing some functions to acquire data from the OpenStates project, via their API. They have a great support community at Google Groups as well.

On its face this post is not obviously about ecology or evolution, but well, our elected representatives do, so to speak, hold our environment in a noose, ready to let the Earth hang any day.

Code I am developing is over at Github.

Here is an example of its use in R, using the Bill Search option (billsearch.R on my Github site). In this case you do not provide your API key in the function call; instead you put it in your .Rprofile file, which is run when you open R. We are searching here for the term 'agriculture' in Texas ('tx'), in the 'upper' chamber.

> temp <- billsearch('agriculture', state = 'tx', chamber = 'upper')
 
> length(temp)
[1] 21
 
> temp[[1]]
$title
[1] "Congratulating John C. Padalino of El Paso for being appointed to the United States Department of Agriculture."
 
$created_at
[1] "2010-08-11 07:59:46"
 
$updated_at
[1] "2010-09-02 03:34:39"
 
$chamber
[1] "upper"
 
$state
[1] "tx"
 
$session
[1] "81"
 
$type
$type[[1]]
[1] "resolution"
 
 
$subjects
$subjects[[1]]
[1] "Resolutions"
 
$subjects[[2]]
[1] "Other"
 
 
$bill_id
[1] "SR 1042"


Apparently, the first bill that came up (SR 1042, see $bill_id at the bottom of the list output) was to congratulate John C. Padalino for being appointed to the USDA.
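Since the result is a list with one element per bill, a few fields can be pulled out across all bills at once. The following is only a sketch: it uses a mock list shaped like the output above rather than a live API call, and the second entry is invented purely for illustration.

```r
# Mock result shaped like the billsearch output above.
# Only the first entry comes from the post; the second is hypothetical.
temp <- list(
  list(
    title = "Congratulating John C. Padalino of El Paso for being appointed to the United States Department of Agriculture.",
    chamber = "upper", state = "tx", session = "81",
    type = list("resolution"),
    subjects = list("Resolutions", "Other"),
    bill_id = "SR 1042"
  ),
  list(
    title = "Hypothetical agriculture bill",
    chamber = "upper", state = "tx", session = "81",
    type = list("bill"),
    subjects = list("Agriculture"),
    bill_id = "SB 0000"
  )
)

# Pull one field out of every bill; vapply checks each value is a single string
bill_ids <- vapply(temp, function(bill) bill$bill_id, character(1))

# Collect a few fields into a data frame for easier browsing
bills <- data.frame(
  bill_id = bill_ids,
  type    = vapply(temp, function(bill) bill$type[[1]], character(1)),
  title   = vapply(temp, function(bill) bill$title, character(1)),
  stringsAsFactors = FALSE
)
print(bills[, c("bill_id", "type")])
```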

The other function I have ready is getting basic metadata on a state, called statemetasearch.

I plan to develop more functions for all the possible API calls to the OpenStates project.
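For the .Rprofile approach to the API key mentioned above, one possible setup is sketched here. The option name OpenStatesKey and the helper get_openstates_key are assumptions for illustration, not necessarily what the functions on Github use.

```r
# This line would go in ~/.Rprofile so it runs whenever R starts.
# The option name "OpenStatesKey" is hypothetical.
options(OpenStatesKey = "your-api-key-here")

# Hypothetical helper: look up the key, failing clearly if it was never set
get_openstates_key <- function(key = getOption("OpenStatesKey")) {
  if (is.null(key)) {
    stop("No API key found; set options(OpenStatesKey = ...) in your .Rprofile")
  }
  key
}

key <- get_openstates_key()
```

A function that calls the API could then use get_openstates_key() as the default for its key argument, so the key never has to appear in scripts.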

Tuesday, June 7, 2011

How to fit power laws

A new paper out in Ecology by Xiao and colleagues (in press, here) compares the use of log-transformation to non-linear regression for analyzing power laws.

They suggest that the error distribution should determine which method performs better. When your errors are additive, homoscedastic, and normally distributed, they propose using non-linear regression. When errors are multiplicative, heteroscedastic, and lognormally distributed, they suggest using linear regression on log-transformed data. The two methods make different assumptions about the error structure, so both cannot be correct for a single dataset.

They will provide the R code for their methods on Ecological Archives once it is posted (it wasn't up there at the time of this post).
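To make the two options concrete, here is a minimal sketch on simulated data. This is not the authors' code; the true parameter values, noise levels, and starting guesses are all made up for illustration.

```r
# Simulated data only; a_true, b_true, and the noise levels are arbitrary choices
set.seed(42)
x <- seq(1, 50, length.out = 200)
a_true <- 2
b_true <- 0.75

# Case 1: multiplicative, lognormal error
# -> linear regression on log-transformed data
y_mult <- a_true * x^b_true * exp(rnorm(length(x), mean = 0, sd = 0.3))
fit_lr <- lm(log(y_mult) ~ log(x))
a_lr <- exp(coef(fit_lr)[[1]])  # intercept is log(a), so back-transform
b_lr <- coef(fit_lr)[[2]]       # slope is the exponent b

# Case 2: additive, normal, homoscedastic error
# -> non-linear regression on the untransformed data
y_add <- a_true * x^b_true + rnorm(length(x), mean = 0, sd = 0.5)
fit_nlr <- nls(y_add ~ a * x^b, start = list(a = 1, b = 1))
a_nlr <- coef(fit_nlr)[["a"]]
b_nlr <- coef(fit_nlr)[["b"]]

# Compare the estimated exponents from the two approaches
print(c(loglinear = b_lr, nonlinear = b_nlr))
```

Here each method is applied to data simulated under its own error assumption; swapping them (for example, fitting nls to y_mult) is a quick way to see how the estimates change when the assumption is wrong.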

Friday, June 3, 2011

searching ITIS and fetching Phylomatic trees

I am writing a set of functions to search ITIS for taxonomic information (more databases to come) and functions to fetch plant phylogenetic trees from Phylomatic. Code at github.

Also, see the examples in the demos folder on the Github site above.