The worldwide explosion of genetic data means that genetics researchers are encountering increasingly large data sets and must rely on computational methods when performing analyses. Although the availability of shared computing resources has increased correspondingly, many genetics researchers are not computing experts and are therefore unable to access high-performance computing (HPC) resources easily, despite the best efforts of those who administer such systems.
The interface presents the largest challenge: most biological researchers are far more comfortable with graphical user interfaces than with the command line, where the majority of powerful analytic tools reside. Some “HPC-savvy” users are able to perform computationally intensive analyses by scripting the submission of jobs to high-performance computing resources, but where does this leave the “non-savvy” researcher?
Galaxy is an open-source, web-based framework for biological researchers. It overcomes the challenge of using command-line tools by letting researchers run them from a web browser, and it is easily extensible to support additional applications.
In this talk we will describe work being undertaken at the University of Otago (funded by New Zealand eScience Infrastructure and the Virtual Institute of Statistical Genetics) to allow genetics researchers to run complex computational analyses across collections of local and remote HPC resources from the convenience of the Galaxy interface.
We will also discuss our research into adding resource and performance monitoring probes to the Galaxy framework. Despite the spread of multicore computing, many of the tools used within our Galaxy workflows do not take full advantage of the computing resources available to them. In some cases, small software changes led to significant increases in the performance of the analyses being run.