Analytics


The main role of the Analytical Platform is to provide the necessary input data by establishing reference and use-case data sets, and to develop the new techniques required for the future of truly personalised medicine.

Three main lines of activity will be pursued: a) Establishment of reference data sets using the current generation of technology; these are essential for understanding the baseline, b) Assembly and completion of datasets for the use cases; whenever possible we will rely on datasets generated in the context of other projects such as the ICGC, OncoTrack, BLUEPRINT etc. and complement them with the missing data types needed for the models, and c) Technology development; the current generation of tools allows us to describe systems only at a certain resolution. Achieving further analytical potential will require technological developments and their implementation.

Establishment of reference data sets using the current generation of technology

The development of reliable computational models of diseases requires that molecular data are properly attributed to the various cell types in an organ (e.g. epithelial and stromal cells, different lymphocyte sub‑types etc.). Furthermore, cellular heterogeneity is a major challenge in modelling cancer, since different sets of mutations and epigenetic alterations are present in different cells within the cancer of an individual. To establish baseline measures of human physiology, we will use tissue from different organs and sort the most abundant cell types in each tissue. Using this material, as well as the tissue samples, we will measure the genome sequence of the individual, and the epigenome, transcriptome, proteome and metabolome in each cell sample. This will be complemented by data from new technologies as they become available from the Technology Development work, e.g. to provide spatially resolved deep sequence information.
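The sampling design described above can be pictured as a small measurement plan. The following Python sketch is illustrative only: the names `Measurement` and `plan_measurements`, the donor label and the "colon" tissue are hypothetical, and it assumes that the unsorted tissue sample is profiled with the same four data layers as each sorted cell type, which the text does not state explicitly.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Optional

# Data layers named in the text. The genome is measured once per individual;
# the other layers are measured in each cell sample.
PER_INDIVIDUAL_LAYERS = ("genome",)
PER_SAMPLE_LAYERS = ("epigenome", "transcriptome", "proteome", "metabolome")


@dataclass(frozen=True)
class Measurement:
    """One planned assay (hypothetical record type, for illustration only)."""
    individual: str
    tissue: Optional[str]   # None for individual-level measurements
    sample: Optional[str]   # "whole tissue" or a sorted cell type
    layer: str


def plan_measurements(individual: str,
                      tissues: Dict[str, List[str]]) -> List[Measurement]:
    """Enumerate planned assays for one individual.

    `tissues` maps a tissue name to its most abundant (sorted) cell types.
    Assumption: the unsorted tissue sample gets the same per-sample layers.
    """
    plan = [Measurement(individual, None, None, layer)
            for layer in PER_INDIVIDUAL_LAYERS]
    for tissue, cell_types in tissues.items():
        samples = ["whole tissue", *cell_types]
        for sample, layer in product(samples, PER_SAMPLE_LAYERS):
            plan.append(Measurement(individual, tissue, sample, layer))
    return plan


# Hypothetical example: one donor, one tissue, two sorted cell types.
example_plan = plan_measurements(
    "donor-1", {"colon": ["epithelial cells", "stromal cells"]}
)
```

Keeping the genome at the level of the individual and the other layers at the level of the sample mirrors the attribution of molecular data to cell types that the models require; further layers from new technologies could be added to `PER_SAMPLE_LAYERS` as they become available.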

Assembly and completion of the datasets for four use cases

The data sets for four use cases [colorectal cancer (CRC), chronic lymphocytic leukaemia (CLL), metastatic melanoma, type 2 diabetes (T2D)] will be established. Many of the required datasets already exist, at least in part, and can be complemented with the necessary additional measurements. In cancer, somatic genome alterations have a large functional impact, and comprehensive datasets from even a small number of patients are already a good starting point for establishing a model.

Additions to the use cases: There are several opportunities to complement the datasets with data from other technologies that are highly relevant to the use case. In the CRC use case in particular, we will additionally integrate metagenome analysis, exposome analysis, functional effects, cell spectral phenotyping and whole‑body imaging with the existing genome, epigenome, transcriptome, proteome and metabolome data.

Technology development

We will develop methods that provide information that is currently not obtainable; these methods will have great potential in clinical diagnostics in the coming years and will provide valuable input for modelling. It is very important to understand the interplay between technologies that can already be applied to deliver data for the use cases and those that we need in order to access the most suitable compartment of biological information. There is every reason to expect that technologies will evolve rapidly. The technologies that will be needed in the future to unfold the full potential of ITFoM will therefore be prioritised in the ramp‑up phase. They target specific issues that are currently not well covered by available technologies.