The Big Biology Days are an initiative by the Society of Biology to bring together researchers from the life sciences and young students, as well as the broader public. The goal is to introduce children to the basic concepts of biology through play activities.
I ran the DNA sequencing, Sorting Algorithm and DNA alphabet activities.
October 2014
Low-depth sequencing makes it possible to assess the genetic landscape of individuals at a lower cost than other methods, but with lower accuracy. I am in charge of developing an analysis pipeline that accurately predicts millions of mutations in small (thousands) to large (hundreds of thousands) human populations, and uses them to model traits such as disease status, height or cholesterol levels. This involves large multidimensional analyses to characterise and choose the best types of population to sequence, building and testing disease models, and data quality control, among other things.
Skills used: Bash, Perl, C++ (boost libraries), Python (scipy libraries), R, Tableau
Statistics used: Logistic regression, PCA/MDS, Linear Mixed Models, hypothesis testing, clustering (k-means), methods for sparse data (imputation)
August 2013
Topics covered: Introduction to statistics, estimation theory, hypothesis testing, regression models, (M)AN(C)OVA, model building. Fortnightly 2-hour sessions and practicals using R.
Autumn/Winter 2012
Genome sequencing reads the DNA from living cells by duplicating it many times and breaking it into small fragments. The fragments can be read individually, but the whole sequence has to be reconstructed algorithmically, which is done using a reference sequence onto which the fragments are aligned. However, when structural variations (such as copy and paste of long sequences) affect the genome, alignment will fail. I designed a program, TE-Tracker, that clusters sequencing reads that do not align properly and tries to model the structural mutation that produced them.
Skills used: Bash, Perl, C++ (boost libraries)
Statistics used: clustering (single-linkage), supervised classification
January 2012
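To illustrate the single-linkage idea mentioned above, here is a minimal sketch in Python. It is not TE-Tracker's code: the function name `cluster_positions`, the `max_gap` parameter and the example coordinates are all hypothetical, and it assumes each poorly aligned read has already been reduced to a single genomic position. In one dimension, cutting a single-linkage clustering at a distance threshold amounts to sorting the positions and starting a new cluster at every gap wider than the threshold.

```python
def cluster_positions(positions, max_gap):
    """Group 1-D positions by single linkage cut at max_gap.

    Two positions end up in the same cluster when they can be joined
    by a chain of neighbours that are each at most max_gap apart.
    (Hypothetical helper, for illustration only.)
    """
    clusters = []
    for pos in sorted(positions):
        # Extend the current cluster if this position is close enough
        # to the last (largest) position already placed in it.
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


# Made-up read positions: two dense groups and one isolated read.
example_reads = [100, 2050, 130, 2000, 160, 9000]
example_clusters = cluster_positions(example_reads, max_gap=50)
```

Each resulting cluster would then be a candidate region in which to look for a structural variant; how such a region is actually modelled is outside the scope of this sketch.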
When predicting the value and risk of an investment portfolio, the worth of each financial asset needs to be priced accurately by incorporating market data. Due to the complexity of exotic financial products, no explicit formula can be derived to value them using probabilistic theory. It is however possible to estimate their price using simulation.
A client requested this functionality in a call for tender, so I was sent to the Paris R&D department to develop it. I used a Monte-Carlo pricing algorithm that models the behavior of the derivative under different volatility scenarios using a volatility surface based on historical measurements.
Skills used: C#
Statistics used: Monte-Carlo methods, Stochastic processes, derivatives valuation models
August 2011
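The principle of pricing by simulation can be shown with a short Python sketch. This is not the C# code described above: it assumes a geometric Brownian motion with a single constant volatility rather than a volatility surface, it prices one simple exotic (an arithmetic-average Asian call), and the function name `asian_call_mc` and all parameter values are hypothetical.

```python
import math
import random


def asian_call_mc(spot, strike, rate, vol, maturity,
                  n_steps=50, n_paths=5000, seed=0):
    """Monte Carlo estimate of an arithmetic-average Asian call price.

    Assumes geometric Brownian motion with constant rate and volatility
    (an illustrative simplification, not a production model).
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    drift = (rate - 0.5 * vol ** 2) * dt
    diffusion = vol * math.sqrt(dt)

    total_payoff = 0.0
    for _path in range(n_paths):
        price = spot
        running_sum = 0.0
        for _step in range(n_steps):
            # Exact one-step update of the log-price under GBM.
            price *= math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
            running_sum += price
        average_price = running_sum / n_steps
        total_payoff += max(average_price - strike, 0.0)

    # Discount the mean simulated payoff back to today.
    return math.exp(-rate * maturity) * total_payoff / n_paths


estimate = asian_call_mc(spot=100.0, strike=100.0, rate=0.02,
                         vol=0.2, maturity=1.0, n_paths=2000)
```

The estimate converges slowly, with an error that shrinks roughly as one over the square root of the number of paths, which is why the path count is exposed as a parameter.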
Misys Sophis is a major solutions provider to the banking and fund management industry. Its main products at the time, Risque and Value, were designed to manage large investment portfolios down to the single trade. My role was to provide technical support to institutional clients and to respond to specific client requests and calls for tender.
Skills used: C#, Excel, SQL
January 2011