Data Sources

Overview of all biomedical data sources integrated into OptimusKG.

OptimusKG integrates data from 12 external biomedical data sources. Each source is processed in the bronze layer by dedicated node functions.

Source Overview

| Source      | Type                      | Description                                       |
|-------------|---------------------------|---------------------------------------------------|
| OpenTargets | Targets, Diseases, Drugs  | Platform for target identification and validation |
| DrugBank    | Drugs, Proteins           | Comprehensive drug and drug-target database       |
| DrugCentral | Drugs, Diseases           | Drug information resource                         |
| CTD         | Exposures                 | Comparative Toxicogenomics Database               |
| DisGeNET    | Diseases, Genes           | Disease-gene association database                 |
| Bgee        | Gene Expression           | Gene expression in anatomy                        |
| OnSIDES     | Side Effects              | Drug side effects from FDA labels                 |
| Reactome    | Pathways                  | Biological pathway database                       |
| Ontologies  | Ontologies                | GO, HP, MONDO, UBERON                             |
| PPI         | Interactions              | Protein-protein interactions                      |
| Gene Names  | Nomenclature              | HGNC gene nomenclature                            |

Data Flow

Each data source follows the same processing pattern:

  1. Landing: Raw data downloaded via Providers
  2. Bronze: Parsed and standardized into typed Polars DataFrames
  3. Silver: Merged with other sources into unified entity and relationship tables
