Data Sources
Overview of all biomedical data sources integrated into OptimusKG.
OptimusKG integrates data from 12 external biomedical data sources. Each source is processed in the bronze layer by dedicated node functions.
Source Overview
| Source | Type | Description |
|---|---|---|
| OpenTargets | Targets, Diseases, Drugs | Platform for target identification and validation |
| DrugBank | Drugs, Proteins | Comprehensive drug and drug-target database |
| DrugCentral | Drugs, Diseases | Drug information resource |
| CTD | Exposures | Comparative Toxicogenomics Database |
| DisGeNET | Diseases, Genes | Disease-gene association database |
| Bgee | Gene Expression | Gene expression across anatomical structures |
| OnSIDES | Side Effects | Drug side effects from FDA labels |
| Reactome | Pathways | Biological pathway database |
| Ontologies | Ontologies | GO, HP, MONDO, UBERON |
| PPI | Interactions | Protein-protein interactions |
| Gene Names | Nomenclature | HGNC gene nomenclature |
Data Flow
Each data source passes through the same three stages, in order:
- Landing: Raw data downloaded via Providers
- Bronze: Parsed and standardized into typed Polars DataFrames
- Silver: Merged with other sources into unified entity and relationship tables
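
The three stages above can be illustrated with a minimal, dependency-free sketch. It is not OptimusKG's implementation: the real bronze layer produces typed Polars DataFrames, whereas this example swaps in plain dataclasses so it runs with the standard library alone. The names `RAW_A`, `RAW_B`, `GeneDiseaseRow`, `bronze_parse`, and `silver_merge`, the column layout, and the identifiers in the sample data are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional

# Landing: hypothetical raw payloads standing in for files a Provider downloaded.
RAW_A = "gene_id\tdisease_id\tscore\nGENE:1\tDISEASE:1\t0.82\nGENE:2\tDISEASE:2\tn/a\n"
RAW_B = "gene_id\tdisease_id\tscore\nGENE:1\tDISEASE:1\t0.40\n"


@dataclass(frozen=True)
class GeneDiseaseRow:
    """One typed bronze record (stand-in for a row of a typed DataFrame)."""
    gene_id: str
    disease_id: str
    score: Optional[float]
    source: str


def parse_score(value: str) -> Optional[float]:
    """Convert a raw score string to float; unparseable values become None."""
    try:
        return float(value)
    except ValueError:
        return None


def bronze_parse(raw_text: str, source: str) -> list:
    """Bronze: parse raw TSV text into standardized, typed records."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter="\t")
    return [
        GeneDiseaseRow(
            gene_id=record["gene_id"],
            disease_id=record["disease_id"],
            score=parse_score(record["score"]),
            source=source,
        )
        for record in reader
    ]


def silver_merge(*bronze_tables: list) -> dict:
    """Silver: merge bronze tables into one relationship table.

    Rows are keyed by (gene_id, disease_id); each key maps to the sorted
    list of sources that reported that relationship.
    """
    merged = {}
    for table in bronze_tables:
        for row in table:
            merged.setdefault((row.gene_id, row.disease_id), set()).add(row.source)
    return {key: sorted(sources) for key, sources in merged.items()}


bronze_a = bronze_parse(RAW_A, source="source_a")
bronze_b = bronze_parse(RAW_B, source="source_b")
silver = silver_merge(bronze_a, bronze_b)
```

Keeping the source name on every bronze row means the silver table can still report which sources contributed each relationship after the merge.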