tmtk - TranSMART data curation toolkit
Generated: Jan 29, 2018
A toolkit for ETL curation for the tranSMART data warehouse for translational research.
The TranSMART curation toolkit (tmtk) aims to provide a language and set of classes for describing data to be uploaded to tranSMART. The toolkit can be used to edit and validate studies prior to loading them with transmart-batch.
Functionality currently available:
- Create a transmart-batch ready study from clinical data files.
- Load an existing study and validate its contents.
- Edit the tranSMART concept tree in The Arborist graphical editor.
- Create chromosomal region annotation files.
- Map HGNC gene symbols to corresponding Entrez gene IDs using mygene.info.
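To make the last item concrete, here is a minimal sketch of the idea behind a symbol-to-Entrez lookup against the mygene.info v3 batch query endpoint. It does not use tmtk's own API, which is not shown on this page. The helper names `build_batch_query` and `parse_hits` are hypothetical, and the sample response is hand-written to resemble the service's batch format rather than captured from a live call.

```python
import json
from urllib.parse import urlencode

# Public batch query endpoint of mygene.info (the body below would be sent via POST).
MYGENE_QUERY_URL = "https://mygene.info/v3/query"


def build_batch_query(symbols):
    """Return a form-encoded request body for a batch gene symbol lookup.

    Hypothetical helper for illustration; not part of tmtk.
    """
    return urlencode({
        "q": ",".join(symbols),   # comma-separated query terms
        "scopes": "symbol",       # match terms against gene symbols
        "fields": "entrezgene",   # only return the Entrez gene ID
        "species": "human",
    })


def parse_hits(hits):
    """Map each queried symbol to its Entrez gene ID, or None if no ID was returned.

    Hypothetical helper for illustration; keeps the first usable hit per symbol.
    """
    mapping = {}
    for hit in hits:
        symbol = hit["query"]
        entrez = hit.get("entrezgene")
        if mapping.get(symbol) is None:
            mapping[symbol] = str(entrez) if entrez is not None else None
    return mapping


# Hand-written sample shaped like a batch response (assumption, not live output).
sample_response = json.loads("""
[
  {"query": "TP53", "_id": "7157", "entrezgene": "7157"},
  {"query": "BRCA1", "_id": "672", "entrezgene": "672"},
  {"query": "NOT_A_GENE", "notfound": true}
]
""")

body = build_batch_query(["TP53", "BRCA1", "NOT_A_GENE"])
mapping = parse_hits(sample_response)
print(mapping)
```

Symbols without a match map to `None`, so they can be reviewed by hand before an annotation file is written.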
tmtk is a python3 package meant to be run in Jupyter notebooks. Results for other setups may vary.