
Computational challenges and performance optimizations in NGS data analyses

  • This event has passed.

Next-generation sequencing datasets are continuing to increase in size, and even small genomic projects now generate terabytes of data. The sheer scale of these datasets now poses a great computational challenge: how can we improve software pipelines to analyse data more efficiently?

Designed jointly by Intel and the Francis Crick Institute, this workshop will tackle some of these challenges and train participants in the principles and practicalities of optimizing NGS analysis pipelines.   


Nicholas Luscombe (UCL and Cancer Research UK London Research Institute)

Robert Maskell (Intel)

Gabriella Rustici (EMBL-European Bioinformatics Institute)


Participation fee (refundable deposit): £100

The registration fee will be invoiced to your sponsor in advance of the course and is payable prior to course start date. It is our intention to refund the registration fee after the course has completed to those who attend and complete the course.

Number of participants: 30

Participants will be selected based on applicants’ experience and relevance of their current work to the objectives of the course.

Application deadline: 15 July 2013 (preference will be given to applicants from Francis Crick Institute partners who apply before 1 July 2013).

Please apply through the registration page.


Course description:

Is this course right for me?

The aim of this course is to familiarize participants with high-performance computing (HPC) methodologies and to provide hands-on training on how to optimize a next-generation sequencing (NGS) analysis pipeline. The workshop is aimed at bioinformaticians who are actively involved in NGS data analysis projects and want to learn how to use HPC solutions to run their analytical pipelines in an efficient and reproducible manner. DNA and RNA sequencing analysis workflows will be used to explore bottlenecks and demonstrate solutions.

What will I learn?

Lectures will outline the computational challenges and bottlenecks associated with the analysis of NGS data and present HPC optimization approaches to overcome such challenges. Practicals will consist of computer exercises that will enable the participants to compare optimized vs. non-optimized software code for the analysis of NGS data, under the guidance of the lecturers and teaching assistants.

Prerequisites: A high degree of familiarity with Linux/UNIX operating systems and knowledge of the R programming language. Applicants will also need to demonstrate their current involvement in high-throughput sequencing data analysis projects.

What will it cover?

Topics will include:

  • How to optimize NGS analysis workflows through HPC best practices
  • Optimal use of software tools for short read alignment, with emphasis on Bowtie2 and TopHat2
  • HPC concepts including parallelization, single- vs. multi-process execution, shared vs. distributed memory, and CPU, memory and I/O constraints
  • Diagnostic tools for debugging and monitoring of parallel programs
  • Benchmarking of various technology and system architecture approaches
  • Cloud-based analytics
  • Scaling up a workflow to deal with a production scale environment and increasingly large datasets
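
As a rough illustration of the single- versus multi-process distinction listed above (this is not taken from the course materials), the sketch below computes the GC fraction of a set of reads twice: once in a single process, and once by splitting the reads into chunks that are handled by separate worker processes. The function names, chunk size, worker count and synthetic reads are all assumptions made for the example, and any real speed-up would depend on the data size and on the cost of passing chunks between processes.

```python
from concurrent.futures import ProcessPoolExecutor


def count_gc(reads):
    """Return (GC bases, total bases) for a list of upper-case read sequences."""
    gc = sum(seq.count("G") + seq.count("C") for seq in reads)
    total = sum(len(seq) for seq in reads)
    return gc, total


def chunked(reads, chunk_size):
    """Split the read list into consecutive chunks of at most chunk_size reads."""
    return [reads[i:i + chunk_size] for i in range(0, len(reads), chunk_size)]


def gc_fraction_serial(reads):
    """Single-process baseline: one pass over all reads."""
    gc, total = count_gc(reads)
    return gc / total if total else 0.0


def gc_fraction_parallel(reads, workers=4, chunk_size=5_000):
    """Multi-process version: each chunk is counted in a worker process,
    and the per-chunk counts are summed in the parent process."""
    gc = total = 0
    with ProcessPoolExecutor(max_workers=workers) as pool:
        for chunk_gc, chunk_total in pool.map(count_gc, chunked(reads, chunk_size)):
            gc += chunk_gc
            total += chunk_total
    return gc / total if total else 0.0


if __name__ == "__main__":
    # Synthetic reads stand in for real sequencing data (hypothetical input).
    reads = ["ACGTGGCCAT" * 10] * 20_000
    print(gc_fraction_serial(reads))
    print(gc_fraction_parallel(reads))
```

Both functions should return the same value; timing them on progressively larger inputs is one simple way to see where process start-up and data transfer begin to outweigh the benefit of extra workers.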


Kristina Kermanshahche (Intel)
Clay Beshears (Intel)
Ketan Paranjape (Intel)
Vincent Plagnol (UCL)
Robert Sugar (Cancer Research UK London Research Institute)
Ernest Turro (Department of Haematology, University of Cambridge (tbc))
Kathi Zarnack (Cancer Research UK London Research Institute)


September 3, 2013
September 6, 2013


Darwin, Ground Floor, Wellcome Trust
Gibbs Building, 215 Euston Road, London, NW1 2BE United Kingdom


Intel and The Francis Crick Institute
