"To consult the statistician after an experiment is finished
is often merely to ask him to conduct a post mortem examination. He
can perhaps say what the experiment died of." R. A. Fisher,
geneticist, biologist and statistician
In other words: think before you start!
The success of a project relies heavily on good planning,
including experimental design and power calculation. We can advise
researchers on these topics, including when high-dimensional data
Starting with the main research questions, we put together an
experimental design and an analysis plan. Power analyses can
subsequently be done. This part should be included in the project
Our expertise
Our group has wide expertise, covering classic statistical problems
as well as the analysis of high-dimensional data, such as omics data
and genetic screen data.
Transparent and reproducible research
Data analyses need to be well understood by everyone involved in the
project. This helps us provide better advice and helps researchers
present their results better. In addition, research results must be
reproducible and traceable, so that others can later follow and
reproduce the findings.
To this end, we use R for all data analysis. R offers flexibility,
as even relatively new statistical methods are implemented as R
packages. In addition, we use RMarkdown, which combines comments
describing the analysis, scripts performing the analysis, and
Contact us
Researchers can contact us by email at email@example.com for an
Preparing your data
Researchers are responsible for providing the data to be analysed in
a suitable format. Datasets from classic clinical studies should be
organized as a table, with one row per individual and one column per
variable. If categorical variables are coded, a separate file listing
the class corresponding to each code must be provided. The data
should reach us in a tab- or comma-delimited format.
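As an illustration of the layout described above, the following R sketch reads a small tab-delimited clinical table and a separate codebook, then converts the coded columns to labelled factors. The column names (`id`, `age`, `sex`, `treatment`), the codebook layout (`variable`, `code`, `label`) and all values are hypothetical and chosen only for the example; they are not a required format.

```r
# Hypothetical clinical table: one row per individual, one column per variable.
clinical_txt <- "id\tage\tsex\ttreatment
P01\t54\t1\t2
P02\t61\t2\t1
P03\t47\t2\t2"

# Hypothetical codebook: one row per (variable, code) pair.
codebook_txt <- "variable\tcode\tlabel
sex\t1\tmale
sex\t2\tfemale
treatment\t1\tplacebo
treatment\t2\tactive"

# In practice, pass a file path instead of the 'text' argument.
clinical <- read.delim(text = clinical_txt, stringsAsFactors = FALSE)
codebook <- read.delim(text = codebook_txt, stringsAsFactors = FALSE)

# Replace each coded column by a factor with readable labels.
for (v in unique(codebook$variable)) {
  key <- codebook[codebook$variable == v, ]
  clinical[[v]] <- factor(clinical[[v]], levels = key$code, labels = key$label)
}

# One row per individual: identifiers must not repeat.
stopifnot(anyDuplicated(clinical$id) == 0)

str(clinical)
```

For a comma-delimited file, `read.csv()` can be used in place of `read.delim()`.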
For studies involving omics data, the following (tab- or
comma-delimited) files should be provided for each data type: the omics data
with features on rows and samples on columns, with unique row
identifiers as well as unique column identifiers; the annotation
data with features on rows and annotation variables on columns,
with the same type of row identifiers as the omics data; and a
phenotypic data table, which is a file with samples on rows and
sample annotation variables on columns, including one column with
the column identifiers of the omics data file. If multiple types of
omics data are provided, then a single phenotypic data table may be
used, where the column identifiers of each omics data file are
included as a separate variable.
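The relationship between the three files can be checked before sending them. The R sketch below uses small toy objects in place of the files; the feature names, sample names, annotation columns and the `sample_id` column are all hypothetical. It verifies that identifiers are unique, that every feature and sample can be matched, and then aligns the annotation and phenotypic tables to the omics data.

```r
# Hypothetical omics data: features on rows, samples on columns.
omics <- matrix(
  c(5.1, 4.8, 6.0,
    2.3, 2.9, 2.5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("gene1", "gene2"), c("S1", "S2", "S3"))
)

# Hypothetical annotation data: same type of row identifiers as the omics data.
annotation <- data.frame(
  symbol = c("ABC1", "XYZ2"),
  chromosome = c("1", "7"),
  row.names = c("gene2", "gene1")
)

# Hypothetical phenotypic data: samples on rows, with one column holding
# the column identifiers of the omics data.
pheno <- data.frame(
  sample_id = c("S3", "S1", "S2"),
  group = c("case", "control", "case")
)

# Row and column identifiers must be unique.
stopifnot(anyDuplicated(rownames(omics)) == 0,
          anyDuplicated(colnames(omics)) == 0)

# Every feature needs an annotation row and every sample a phenotype row.
stopifnot(all(rownames(omics) %in% rownames(annotation)),
          all(colnames(omics) %in% pheno$sample_id))

# Reorder annotation and phenotype tables to match the omics data.
annotation <- annotation[rownames(omics), , drop = FALSE]
pheno <- pheno[match(colnames(omics), pheno$sample_id), , drop = FALSE]
```

With several omics data types and a single phenotypic table, the same matching step would be repeated once per data type, each time using the column that holds the identifiers of that data type.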
Resources
In all our projects, we use R.
A convenient interface for R is RStudio, which is free and can be
used on all major operating systems. It has useful features that help
you write clear code.
Another convenient tool is the R package RMarkdown.
This tool enables you to combine comments, scripts and output in a
single document. This is an easy way of making your research
transparent and reproducible. Your RMarkdown report containing your
data analysis can be made available for publication in a scientific
journal or repository, together with your manuscript and the