tmtk - TranSMART data curation toolkit

Author: Jochem Bijlard
Source Code: https://github.com/thehyve/tmtk/
Generated: Jan 29, 2018
License: GPLv3
Version: 0.2.0

Philosophy

A toolkit for curating data during ETL into tranSMART, a data warehouse for translational research.

The tranSMART curation toolkit (tmtk) aims to provide a language and a set of classes for describing data to be uploaded to tranSMART. The toolkit can be used to edit and validate studies before loading them with transmart-batch.

Functionality currently available:
  • Create a transmart-batch-ready study from clinical data files.
  • Load an existing study and validate its contents.
  • Edit the tranSMART concept tree in The Arborist graphical editor.
  • Create chromosomal region annotation files.
  • Map HGNC gene symbols to their corresponding Entrez gene IDs using mygene.info.
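
To give a feel for the last item, here is a minimal sketch of a symbol-to-Entrez lookup against mygene.info. It does not use tmtk and is not how tmtk implements the feature; it assumes the third-party mygene client package is installed, and the helper names build_symbol_map and fetch_entrez_ids are hypothetical, chosen only for this illustration.

```python
from typing import Dict, Iterable, List, Optional


def build_symbol_map(hits: Iterable[dict]) -> Dict[str, Optional[str]]:
    """Collapse mygene-style hits into {symbol: Entrez ID or None}.

    Each hit is a dict with a "query" key holding the symbol that was
    looked up. Hits without an "entrezgene" key (for example those
    flagged "notfound") map to None. If a symbol has several hits, the
    first one that carries an ID is kept.
    """
    mapping: Dict[str, Optional[str]] = {}
    for hit in hits:
        symbol = hit["query"]
        entrez = hit.get("entrezgene")
        if entrez is None:
            # Record the symbol as unresolved unless already seen.
            mapping.setdefault(symbol, None)
        elif mapping.get(symbol) is None:
            mapping[symbol] = str(entrez)
    return mapping


def fetch_entrez_ids(symbols: List[str]) -> Dict[str, Optional[str]]:
    """Query mygene.info for human genes (requires network access)."""
    import mygene  # assumption: third-party client, imported lazily

    client = mygene.MyGeneInfo()
    hits = client.querymany(
        symbols, scopes="symbol", fields="entrezgene", species="human"
    )
    return build_symbol_map(hits)
```

Calling fetch_entrez_ids(["TP53", "BRCA1"]) in a notebook cell would return a plain dictionary that can be inspected for unresolved symbols (values of None) before any annotation file is written.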

Note

tmtk is a Python 3 package meant to be run in Jupyter notebooks. It may behave differently in other setups.
