Fig. 3 | Biology Direct

Fig. 3

From: A novel framework for horizontal and vertical data integration in cancer studies with application to survival time prediction models

Workflow of the integration of independent datasets, as performed within our framework. In the data preparation phase, we transform the raw data from their different formats and store them in a document database, performing horizontal data integration per data type. We generate relations between the data based on the available raw patient datasets, including clinical information and molecular data, and store these relations in a graph-based database, creating an internal network. We then look up mutated proteins within this network and search for related information in the external knowledge sources. In this way we build the new general relations network, which completes the vertical data integration. We store these enriched relations in the graph-based database, together with the internal relationships.
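
The workflow in the caption can be illustrated with a minimal sketch. It is not the authors' implementation: plain Python dictionaries and sets stand in for the document database and the graph-based database, and every record, field name, relation label, and function name below is a hypothetical example chosen only to show the order of the steps.

```python
# Hypothetical raw inputs: two independent datasets per data type.
raw_clinical = [
    [{"patient": "P1", "survival_days": 420}],
    [{"patient": "P2", "survival_days": 130}],
]
raw_mutations = [
    [{"patient": "P1", "protein": "TP53"}],
    [{"patient": "P2", "protein": "KRAS"}, {"patient": "P2", "protein": "TP53"}],
]

# Hypothetical external knowledge source: protein -> annotations.
external_knowledge = {
    "TP53": ["pathway:apoptosis"],
    "KRAS": ["pathway:MAPK"],
}


def horizontal_integration(datasets):
    """Merge independent datasets of one data type into a single collection."""
    return [record for dataset in datasets for record in dataset]


def build_internal_network(clinical, mutations):
    """Derive (source, relation, target) edges from the patient data only."""
    edges = set()
    for rec in clinical:
        edges.add((rec["patient"], "has_survival_days", rec["survival_days"]))
    for rec in mutations:
        edges.add((rec["patient"], "has_mutated_protein", rec["protein"]))
    return edges


def vertical_integration(internal_edges, knowledge):
    """Add external annotations for every mutated protein in the network."""
    edges = set(internal_edges)  # keep the internal relationships
    for _source, relation, target in internal_edges:
        if relation == "has_mutated_protein":
            for annotation in knowledge.get(target, []):
                edges.add((target, "annotated_with", annotation))
    return edges


# Stand-in for the document database: one collection per data type.
document_store = {
    "clinical": horizontal_integration(raw_clinical),
    "mutations": horizontal_integration(raw_mutations),
}

# Stand-in for the graph database: internal edges plus enriched edges.
internal_network = build_internal_network(
    document_store["clinical"], document_store["mutations"]
)
graph_store = vertical_integration(internal_network, external_knowledge)
```

In this sketch the enriched network is a superset of the internal one, which mirrors the caption's statement that enriched relations are stored together with the internal relationships.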
