Let the testing BEGIN!

Today we are announcing an early-bird test demo site aimed mainly at avid testers who would rather break the application than see it work! Over the next two to three months the application enters its first main testing phase, which we hope will result in a fully functional application, database and tools for the management and integration of translational research data.

Preview software

Please note that eHS v0.57102 is currently released as preview software. There is no guarantee that it will behave as intended.

We have provided a sample CDISC study for users to download and use to test the eHS. Get it here.

An eHS refresher

The eHS is a data integration platform for end users. Data curation has long been seen as a tedious, boring activity that takes place only within silos of experts who know the art of cleaning and polishing data so it looks its best when its time comes to pose in a fancy journal figure. With the diversity of data being generated in translational research, the risk of losing the focus of these studies is high: the technologies become more prominent than the actual research, and little swamps of potentially rich data end up drying out at the source.

The main problem with this scheme is that the owners of the data, the authors who best understand the context in which it was produced, may never get the chance to bring their insight into the bigger picture. This is mainly due to the absence of user-friendly, intuitive and modern tools that bring curation know-how to the data owners, who can do far better at integrating and interpreting their data than others who know the tools but know nothing about the data.

For these reasons we have been (and still are) building the different components of the eHS platform. At its core is a multilayered metadata model describing contextual, structural and descriptive information about a translational research project. This is implemented by Biospeak-db, a standards-compliant data repository with tools for translational research data management. Complementary tools let users bring in their unformatted and unstructured data and achieve syntactic and semantic harmonisation.

In simple terms, we break the process of data harmonisation down into two steps. The first achieves structural conformity and syntactic standardisation by providing a standard organisational model: community-standard-based templates across different domains into which the data are mapped. This makes way for the second step, which achieves content harmonisation and semantic conformity across the different data sources.
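To make the two steps concrete, here is a minimal sketch in Python. The column names, template mapping and vocabulary codes are purely illustrative assumptions, not the eHS schema: step one renames source columns onto a standard template (structural conformity), step two normalises the values against a controlled vocabulary (semantic conformity).

```python
# Step 1: structural/syntactic mapping of source columns to a template.
# (Hypothetical mapping; the real templates are community-standard based.)
TEMPLATE_MAPPING = {"subj_id": "USUBJID", "gender": "SEX", "age_years": "AGE"}

# Step 2: semantic mapping of free-text values to a controlled vocabulary.
VOCABULARY = {"SEX": {"male": "M", "m": "M", "female": "F", "f": "F"}}

def harmonise(record: dict) -> dict:
    # Rename columns according to the template (structural conformity).
    out = {TEMPLATE_MAPPING.get(k, k): v for k, v in record.items()}
    # Normalise values against the vocabulary (semantic conformity).
    for col, codes in VOCABULARY.items():
        if col in out:
            out[col] = codes.get(str(out[col]).strip().lower(), out[col])
    return out

print(harmonise({"subj_id": "001", "gender": "Female", "age_years": 34}))
# → {'USUBJID': '001', 'SEX': 'F', 'AGE': 34}
```

The point of splitting the two steps is that structural mapping can be done once per source format, while semantic mapping is refined per domain vocabulary.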

As of today we have provided an implementation that puts standards into action by helping the user build these templates and load their data into the resulting standard-formatted datasets. The short tutorials below show the main ETL steps for loading data into the eHS. More videos to come. Watch this space!

Getting started

eHS is accessible through this website in two forms: direct access to our demo server, and packaged binaries, including a Docker container distribution package, available by early October this year.

Where is the source code?

As the code is currently undergoing aggressive testing, the source code is not yet available outside eTRIKS.

0. How do I log in?

Notice the little Google Authenticator barcode that gets generated after you create your account. It is not used for now, but if you are dealing with sensitive data and want a higher level of security, two-factor authentication can be enabled.

1. Project Setup

A project serves as a workspace for bringing together data from different studies; a TR project usually involves multiple studies working towards a common research goal. An activity is usually a clinical activity that results in data. A dataset template is a standard-formatted dataset definition into which the study data are loaded.
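The hierarchy described above (project → study → activity → dataset template) can be sketched as plain data structures. This is an assumption about the shape of the model for illustration only; the actual eHS schema may differ.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DatasetTemplate:
    name: str                # e.g. "Demographics"
    standard: str            # community standard it is based on, e.g. "CDISC"
    columns: List[str] = field(default_factory=list)

@dataclass
class Activity:
    name: str                # a clinical activity that results in data
    templates: List[DatasetTemplate] = field(default_factory=list)

@dataclass
class Study:
    name: str
    activities: List[Activity] = field(default_factory=list)

@dataclass
class Project:
    name: str                # workspace grouping studies under a common goal
    studies: List[Study] = field(default_factory=list)

demo = DatasetTemplate("Demographics", "CDISC", ["USUBJID", "AGE", "SEX"])
project = Project("My TR project",
                  [Study("Study A", [Activity("Screening visit", [demo])])])
print(project.studies[0].activities[0].templates[0].name)  # → Demographics
```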

2. Data Stager

This is equivalent to your drive on the platform: you upload here all the data files that need to be loaded into the previously generated datasets. It is the main ETL area for loading, unloading and manipulating data files before upload.

3. Load Files

This is a wizard that helps the user map their data into the standard datasets created earlier. In this example it is fairly straightforward, as the files are already in the standard format. In future videos we will show how to upload a non-standard file and let the wizard guide you through transforming it. The first dataset to be loaded is Demographics, as it contains information about the study (or studies) and subjects to which the rest of the datasets will relate.
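The demographics-first ordering is a referential-integrity rule: later datasets can only reference subjects that Demographics has already registered. A minimal sketch of that check, assuming CDISC-style `USUBJID` columns (the actual eHS loader may work differently):

```python
import csv, io

def load_subjects(demographics_csv):
    """Collect the subject IDs registered by the Demographics dataset."""
    return {row["USUBJID"] for row in csv.DictReader(io.StringIO(demographics_csv))}

def orphan_rows(dataset_csv, known_subjects):
    """Return rows of a later dataset whose subject is missing from Demographics."""
    return [row for row in csv.DictReader(io.StringIO(dataset_csv))
            if row["USUBJID"] not in known_subjects]

demo = "USUBJID,AGE,SEX\n001,34,F\n002,41,M\n"
labs = "USUBJID,LBTEST,LBORRES\n001,GLUC,5.1\n003,GLUC,6.2\n"

subjects = load_subjects(demo)
print(orphan_rows(labs, subjects))  # flags the row for unknown subject 003
```

Loading Demographics first means such orphan rows can be rejected (or queued for correction) at upload time rather than discovered during analysis.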

4. Load and explore

This video shows how to upload more data files into the different dataset templates once the demographics data has been loaded into the database.