From data fusion to data driven visualizations of embryogenesis – Presentation by Paul Villoutreix (Princeton University)
“Embryogenesis is a fascinating process. It is the process by which a single fertilized cell is turned into a multi-cellular organism. It is a really beautiful phenomenon, studied in various species, from the sea urchin, to the fruit fly, to the zebrafish, to the chicken, to the mouse, to the human.
Recent technological developments in microscopy imaging techniques have transformed embryology into a data-intensive science. As an interdisciplinary scientist, I am interested in topics at the intersection of developmental biology, mathematics and data art. Currently, I am a postdoc in Stas Shvartsman's lab at Princeton University and a visiting fellow of the Center for Data Arts at the New School.
In addition, I am leading the Embryo Digital Atlas, an open-source, web-based platform for visualizing complex experimental datasets of embryogenesis in an easy and beautiful way. This is supported by the Mozilla Science Lab and you can contribute here. Thanks, Abby, for the interview! Here is a blog post describing The Embryo Digital Atlas’ journey to the Global Sprint.”
Animal embryogenesis is a multivariate process:
Changes in morphology
Changes in patterns of gene expression
Anatomical descriptions are works of art
Ramón y Cajal (1852 – 1934)
Ernst Haeckel (1834 – 1919)
Henry Gray (1827 – 1861)
Molecular and interactive visualization
The HIV gag polyprotein (shown in red) is translated from the HIV RNA genome (in yellow) by cellular ribosomes.
Can we build accurate visualizations of developing embryos that can be data-driven, interactive and visually appealing?
Data fusion: an algorithm to merge complex and heterogeneous datasets in fly embryogenesis
A mapping between morphology and chemical signal
We learn a mapping between images of morphology and stained images
An accurate movie is obtained from the reconstruction
Towards interactive and visually appealing visualizations
Data Visualization – Choice of colors
Chemical signals can be described by molecule type, spatial position and concentration => Encoded by coloring a pixel with a given hue and a given brightness.
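The encoding above can be sketched in a few lines of Python. This is only an illustration, not the platform's actual code: it assumes that the molecule type selects the hue, that the concentration (normalized by a maximum) sets the brightness, and that the spatial position is the pixel coordinate. The molecule names, the hue table and the function names are hypothetical.

```python
import colorsys

# Hypothetical hue assignments (degrees on the color wheel), one per molecule type.
MOLECULE_HUES = {"signal_a": 0.0, "signal_b": 120.0, "signal_c": 240.0}


def encode_pixel(molecule, concentration, max_concentration):
    """Return an (r, g, b) tuple of 0-255 integers for one measurement.

    Assumption: hue encodes the molecule type, brightness encodes the
    concentration relative to max_concentration.
    """
    hue = MOLECULE_HUES[molecule] / 360.0
    # Normalize the concentration and clamp it to [0, 1].
    brightness = min(max(concentration / max_concentration, 0.0), 1.0)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, brightness)
    return tuple(round(c * 255) for c in (r, g, b))


def render(measurements, width, height, max_concentration):
    """Paint measurements given as (x, y, molecule, concentration) onto a black image.

    The image is a list of rows, each row a list of (r, g, b) tuples.
    """
    image = [[(0, 0, 0)] * width for _ in range(height)]
    for x, y, molecule, concentration in measurements:
        image[y][x] = encode_pixel(molecule, concentration, max_concentration)
    return image
```

With this choice, a pixel with zero concentration stays black and a pixel at the maximum concentration shows the fully bright hue of its molecule. Overlapping signals at the same position would need a blending rule, which is left out here.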
Data Visualization – How to interact with the datasets – Exploded view
To enable wide use of this platform as a generic visualization tool, researchers need to be able to visualize and possibly share their own datasets, which requires a data storage solution such as OMERO and a unified file format.
An online library with datasets that people can use to learn, share knowledge through collaborative tagging of the datasets, generate figures and videos, and extract measurements.
Datasets so far:
- 2D cross sections of the developing fly
- The Transparent Human Embryo database
- 3D visualization of the developing fly?
Towards a credit system behind data fusion?
Data-driven visualizations can be obtained by aggregating datasets from heterogeneous sources:
- Various labs
- Various teams
- Various experimental systems
- Published or unpublished papers
-> How to give credit to people, based on how much their datasets have been used?
-> How to guarantee the source of data?
Using Blockchain on Data
Address data by content and keep track of its history and integrity.
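A minimal sketch of this idea, using only Python's standard library: a dataset is addressed by the SHA-256 hash of its bytes, and each version record stores the hash of the previous record, so any later change to the history can be detected. This is a plain hash chain held in memory, not a full blockchain (there is no network and no consensus protocol), and the record fields and function names are hypothetical.

```python
import hashlib
import json


def content_address(data: bytes) -> str:
    """Address data by content: the address is the SHA-256 digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


def append_version(history, data: bytes, author: str):
    """Return a new history with one more record, linked to the previous record."""
    previous = history[-1]["record_hash"] if history else None
    record = {
        "data_hash": content_address(data),  # what the dataset contains
        "author": author,                    # who contributed it
        "previous": previous,                # link to the earlier record
    }
    # Hash the record itself so that later tampering is detectable.
    serialized = json.dumps(record, sort_keys=True).encode("utf-8")
    record["record_hash"] = content_address(serialized)
    return history + [record]


def verify(history) -> bool:
    """Check that every record is intact and correctly linked to its predecessor."""
    previous = None
    for record in history:
        body = {k: v for k, v in record.items() if k != "record_hash"}
        serialized = json.dumps(body, sort_keys=True).encode("utf-8")
        if record["previous"] != previous:
            return False
        if record["record_hash"] != content_address(serialized):
            return False
        previous = record["record_hash"]
    return True
```

For example, calling `append_version` twice with two versions of a dataset gives a two-record history for which `verify` returns `True`; changing the author field of the first record afterwards makes `verify` return `False`.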
Can we replace the role of journals, which serve as trusted third parties, with a distributed system?
Bitcoin is a monetary system that overcomes the need for a central bank by using a distributed cryptographic protocol based on a blockchain.
Also used in GitHub … and in the InterPlanetary File System (IPFS).