
Edges

Edge types and their schema in OptimusKG.

OptimusKG encodes 26 edge types connecting the node types across molecular, clinical, anatomical, and environmental domains.

Label | Relation(s) | Count
DIS-GEN | ASSOCIATED_WITH | 9,734,774
ANA-GEN | EXPRESSION_PRESENT, EXPRESSION_ABSENT | 8,787,955
DRG-DRG | SYNERGISTIC_INTERACTION, PARENT | 1,345,376
PHE-GEN | ASSOCIATED_WITH | 793,279
GEN-GEN | INTERACTS_WITH | 327,924
DIS-PHE | PHENOTYPE_PRESENT | 157,144
BPO-GEN | INTERACTS_WITH | 158,410
DRG-DIS | INDICATION, CONTRAINDICATION, OFF_LABEL_USE | 70,442
MFN-GEN | INTERACTS_WITH | 90,933
DRG-PHE | ADVERSE_DRUG_REACTION, ASSOCIATED_WITH, CONTRAINDICATION, INDICATION, OFF_LABEL_USE | 13,758
PWY-GEN | INTERACTS_WITH | 46,977
BPO-BPO | IS_A | 44,494
DIS-DIS | PARENT | 44,215
CCO-GEN | INTERACTS_WITH | 105,309
DRG-GEN | ACTIVATOR, AGONIST, ALLOSTERIC_ANTAGONIST, ANTAGONIST, BINDING_AGENT, BLOCKER, CARRIER, DEGRADER, ENZYME, INHIBITOR, INVERSE_AGONIST, MODULATOR, NEGATIVE_ALLOSTERIC_MODULATOR, NEGATIVE_MODULATOR, OPENER, PARTIAL_AGONIST, POSITIVE_ALLOSTERIC_MODULATOR, POSITIVE_MODULATOR, RELEASING_AGENT, STABILISER, SUBSTRATE, TARGET, TRANSPORTER | 20,694
PHE-PHE | PARENT | 24,862
MFN-MFN | IS_A | 12,587
PWY-PWY | PARENT | 2,819
EXP-GEN | INTERACTS_WITH | 2,989
EXP-DIS | LINKED_TO | 2,391
EXP-EXP | PARENT | 2,443
EXP-BPO | INTERACTS_WITH | 2,260
ANA-ANA | PARENT | 17,082
CCO-CCO | IS_A | 4,639
EXP-MFN | INTERACTS_WITH | 47
EXP-CCO | INTERACTS_WITH | 13
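
Each label pairs two three-letter node-type codes. The sketch below splits a label into readable names. The code-to-name mapping is read off the section headings on this page, and `NODE_TYPES` and `split_label` are hypothetical names used only for illustration, not part of OptimusKG.

```python
# Node-type codes as they appear in edge labels; the full names come from
# the section headings of this page (e.g. "Disease-Gene" for DIS-GEN).
NODE_TYPES = {
    "ANA": "Anatomy",
    "BPO": "Biological Process",
    "CCO": "Cellular Component",
    "DIS": "Disease",
    "DRG": "Drug",
    "EXP": "Exposure",
    "GEN": "Gene",
    "MFN": "Molecular Function",
    "PHE": "Phenotype",
    "PWY": "Pathway",
}


def split_label(label: str) -> tuple[str, str]:
    """Split an edge label such as 'DIS-GEN' into its two node-type names."""
    left, right = label.split("-")
    return NODE_TYPES[left], NODE_TYPES[right]


print(split_label("DIS-GEN"))  # ('Disease', 'Gene')
```

An unknown code raises `KeyError`, which is usually preferable to silently passing through a mistyped label.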

All edges share the same base schema in the unified edges.parquet and largest_connected_component_edges.parquet tables:

from | String | Source node identifier in CURIE format
to | String | Target node identifier in CURIE format
label | String | Edge type label (e.g. DIS-GEN)
relation | String | Relation type (e.g. ASSOCIATED_WITH)
undirected | Boolean | True if the edge has no intrinsic directionality
properties | String | JSON-encoded edge-specific properties. Expanded to a native Struct in per-type parquet files.
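
The `undirected` flag matters when the edge table is turned into a neighbour lookup. The following is a minimal sketch, assuming rows are available as plain dictionaries keyed by the column names above and that an undirected edge may be stored only once (the page does not say either way). The CURIEs are placeholders and `build_adjacency` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows in the base schema; identifiers are placeholders.
edges = [
    {"from": "EX:1", "to": "EX:2", "label": "DIS-GEN",
     "relation": "ASSOCIATED_WITH", "undirected": True},
    {"from": "EX:3", "to": "EX:4", "label": "DIS-DIS",
     "relation": "PARENT", "undirected": False},
]


def build_adjacency(rows):
    """Map each node to its neighbours, mirroring undirected edges."""
    adjacency = defaultdict(set)
    for row in rows:
        adjacency[row["from"]].add(row["to"])
        if row["undirected"]:
            # No intrinsic direction: make the edge reachable from both ends.
            adjacency[row["to"]].add(row["from"])
    return adjacency


adj = build_adjacency(edges)
```

Because neighbours are kept in sets, the result is the same if an undirected edge happens to be stored in both directions.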

In the stratified per-type parquet files (edges/<label>.parquet), properties is expanded into native typed columns as a Polars Struct.
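
In the unified tables, `properties` has to be decoded before its fields can be used. A minimal sketch with the standard library follows. The row is invented for illustration, the nesting of `sources` inside `properties` is an assumption based on the per-type tables below, and `decode_properties` is a hypothetical helper.

```python
import json

# Hypothetical row from the unified table; every value is a placeholder.
row = {
    "from": "EX:0001",
    "to": "EX:0002",
    "label": "ANA-GEN",
    "relation": "EXPRESSION_PRESENT",
    "undirected": True,
    "properties": (
        '{"expression_rank": 120, "call_quality": "gold", '
        '"sources": {"direct": ["dataset_a"], "indirect": []}}'
    ),
}


def decode_properties(edge_row):
    """Return a copy of the row with `properties` parsed from JSON."""
    decoded = dict(edge_row)
    raw = edge_row.get("properties")
    # Treat a missing or empty string as "no properties".
    decoded["properties"] = json.loads(raw) if raw else {}
    return decoded


edge = decode_properties(row)
print(edge["properties"]["call_quality"])  # gold
```

The per-type files avoid this step because the fields already arrive as typed Struct members.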


Anatomy-Anatomy

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (ANA-ANA)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship
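
The `sources` fields recur in every edge type, so a provenance tally works the same way for all of them. The sketch below counts how many edges each dataset contributed directly. It assumes rows have been converted to nested dictionaries with `sources` nested under `properties`. The dataset names are placeholders and `count_direct_sources` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical rows; only the provenance fields are shown.
rows = [
    {"properties": {"sources": {"direct": ["dataset_a"], "indirect": ["dataset_b"]}}},
    {"properties": {"sources": {"direct": ["dataset_a", "dataset_b"], "indirect": []}}},
]


def count_direct_sources(edge_rows):
    """Count edges per dataset listed under sources.direct."""
    counts = Counter()
    for row in edge_rows:
        sources = row["properties"]["sources"]
        # A null list is treated as empty.
        counts.update(sources.get("direct") or [])
    return counts


direct_counts = count_direct_sources(rows)
```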

Anatomy-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (ANA-GEN)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
expression_rank | Int32 | Bgee expression rank score (lower = higher expression)
call_quality | String | Expression call quality (gold/silver)
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship
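
Since a lower `expression_rank` means higher expression, ranking edges by expression is an ascending sort. The sketch below keeps gold-quality EXPRESSION_PRESENT calls and orders them from most to least expressed. It assumes rows are nested dictionaries. The identifiers are placeholders and `top_expressed` is a hypothetical helper.

```python
# Hypothetical ANA-GEN rows; identifiers and values are placeholders.
rows = [
    {"to": "EX:G1", "relation": "EXPRESSION_PRESENT",
     "properties": {"expression_rank": 300, "call_quality": "gold"}},
    {"to": "EX:G2", "relation": "EXPRESSION_PRESENT",
     "properties": {"expression_rank": 50, "call_quality": "gold"}},
    {"to": "EX:G3", "relation": "EXPRESSION_PRESENT",
     "properties": {"expression_rank": 10, "call_quality": "silver"}},
    {"to": "EX:G4", "relation": "EXPRESSION_ABSENT",
     "properties": {"expression_rank": 5, "call_quality": "gold"}},
]


def top_expressed(edge_rows, limit=10):
    """Gold-quality present calls, most strongly expressed first."""
    kept = [
        r for r in edge_rows
        if r["relation"] == "EXPRESSION_PRESENT"
        and r["properties"]["call_quality"] == "gold"
        and r["properties"]["expression_rank"] is not None
    ]
    # Lower rank score means higher expression, so sort ascending.
    return sorted(kept, key=lambda r: r["properties"]["expression_rank"])[:limit]


ranked = top_expressed(rows)
```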

Biological Process-Biological Process

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (BPO-BPO)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Biological Process-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (BPO-GEN)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
evidence | List[String] | GO evidence codes (e.g. IDA, IMP, TAS)
gene_product | List[String] | Gene product IDs annotated to this term
eco_ids | List[String] | Evidence & Conclusion Ontology (ECO) IDs
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Cellular Component-Cellular Component

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (CCO-CCO)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Cellular Component-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (CCO-GEN)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
evidence | List[String] | GO evidence codes (e.g. IDA, IMP, TAS)
gene_product | List[String] | Gene product IDs annotated to this term
eco_ids | List[String] | Evidence & Conclusion Ontology (ECO) IDs
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Disease-Disease

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (DIS-DIS)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Disease-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (DIS-GEN)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
evidence_score | Float64 | Aggregated association evidence score
evidence_count | Int64 | Number of evidence items supporting the association
evidence_index | Float64 | Combined evidence index (Open Targets)
disease_specificity_index | Float64 | DSI, specificity of the gene to this disease
disease_pleiotropy_index | Float64 | DPI, number of disease classes the gene is associated with
disgenet_score | Float64 | DisGeNET gene–disease association score
year_initial | String | Year of the earliest supporting publication
year_final | String | Year of the most recent supporting publication
number_of_pmids | Int16 | Number of supporting PubMed publications
number_of_snps | Int16 | Number of supporting SNPs (GWAS evidence)
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Disease-Phenotype

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (DIS-PHE)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
aspect | List[String] | HPO annotation aspect (P=phenotypic, I=inheritance, etc.)
evidence_type | List[String] | Evidence type codes (e.g. IEA, PCS, TAS)
frequency | List[String] | Phenotype frequency annotations
onset | List[String] | Age of onset annotations
modifiers | List[String] | Clinical modifier annotations
sexes | List[String] | Sex-specific annotations
qualifier_not | Boolean | True if phenotype is explicitly absent
bio_curation | List[String] | Biocuration provenance entries
references | List[String] | Supporting publication or database references
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship
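
Because `qualifier_not` marks a phenotype as explicitly absent, consumers that only want positive disease-phenotype links need to separate the two cases. A minimal sketch, assuming rows are nested dictionaries and treating a null flag as "not negated" (an assumption, not something this page states). The identifiers are placeholders and `split_by_qualifier` is a hypothetical helper.

```python
# Hypothetical DIS-PHE rows; identifiers are placeholders.
rows = [
    {"from": "EX:D1", "to": "EX:P1", "properties": {"qualifier_not": False}},
    {"from": "EX:D1", "to": "EX:P2", "properties": {"qualifier_not": True}},
    {"from": "EX:D2", "to": "EX:P1", "properties": {"qualifier_not": None}},
]


def split_by_qualifier(edge_rows):
    """Separate edges into (present, explicitly_absent) lists."""
    present, absent = [], []
    for row in edge_rows:
        if row["properties"]["qualifier_not"]:
            absent.append(row)
        else:
            # False or null: the phenotype is not marked as absent.
            present.append(row)
    return present, absent


present, absent = split_by_qualifier(rows)
```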

Drug-Disease

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (DRG-DIS)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
highest_clinical_trial_phase | Float64 | Highest clinical trial phase for this indication
structure_id | String | DrugCentral structure ID
drug_disease_id | String | DrugCentral drug–disease identifier
reference_ids | List[String] | Supporting reference identifiers
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Drug-Drug

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (DRG-DRG)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
interaction_description | String | Description of the drug–drug interaction
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Drug-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (DRG-GEN)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
mechanisms_of_action | List[String] | Mechanism of action descriptions
source_ids | List[String] | Source-specific interaction identifiers
source_urls | List[String] | URLs to source evidence records
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Drug-Phenotype

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (DRG-PHE)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
highest_clinical_trial_phase | Float64 | Highest clinical trial phase
structure_id | String | DrugCentral structure ID
drug_disease_id | String | DrugCentral drug–disease identifier
reference_ids | List[String] | Supporting reference identifiers
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Exposure-Biological Process

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (EXP-BPO)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
evidence_count | UInt32 | Number of evidence entries
number_of_receptors | Int64 | Number of receptor/study participants
receptors | List[String] | Receptor identifiers (e.g. cell line, organism)
receptor_notes | List[String] | Free-text notes on receptors
smoking_statuses | List[String] | Smoking status of study subjects
sexes | List[String] | Sex of study subjects
races | List[String] | Race/ethnicity of study subjects
methods | List[String] | Measurement methods used
mediums | List[String] | Biological mediums measured (e.g. blood, urine)
detection_limit | List[String] | Lower limit of detection values
detection_limit_uom | List[String] | Units of detection limit values
detection_frequency | List[String] | Detection frequency values
age_entries | UInt32 | Number of age-stratified entries
age_range_values | List[String] | Age range values for subjects
age_mean_values | List[String] | Mean age values
age_median_values | List[String] | Median age values
age_point_values | List[String] | Point age values
age_open_range_values | List[String] | Open-ended age range values
study_countries | List[String] | Countries where studies were conducted
states_or_provinces | List[String] | States or provinces of study
city_town_region_areas | List[String] | City/town/region of study
outcome_relationships | List[String] | Observed outcome relationships
exposure_event_notes | List[String] | Notes on the exposure event
exposure_outcome_notes | List[String] | Notes on the exposure outcome
references | List[String] | Supporting literature references
associated_study_titles | List[String] | Titles of associated studies
enrollment_start_years | List[String] | Study enrollment start years
enrollment_end_years | List[String] | Study enrollment end years
study_factors | List[String] | Study design factors
assay_notes | List[String] | Notes on the assay used
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Exposure-Cellular Component

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (EXP-CCO)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
evidence_count | UInt32 | Number of evidence entries
number_of_receptors | Int64 | Number of receptor/study participants
receptors | List[String] | Receptor identifiers (e.g. cell line, organism)
receptor_notes | List[String] | Free-text notes on receptors
smoking_statuses | List[String] | Smoking status of study subjects
sexes | List[String] | Sex of study subjects
races | List[String] | Race/ethnicity of study subjects
methods | List[String] | Measurement methods used
mediums | List[String] | Biological mediums measured (e.g. blood, urine)
detection_limit | List[String] | Lower limit of detection values
detection_limit_uom | List[String] | Units of detection limit values
detection_frequency | List[String] | Detection frequency values
age_entries | UInt32 | Number of age-stratified entries
age_range_values | List[String] | Age range values for subjects
age_mean_values | List[String] | Mean age values
age_median_values | List[String] | Median age values
age_point_values | List[String] | Point age values
age_open_range_values | List[String] | Open-ended age range values
study_countries | List[String] | Countries where studies were conducted
states_or_provinces | List[String] | States or provinces of study
city_town_region_areas | List[String] | City/town/region of study
outcome_relationships | List[String] | Observed outcome relationships
exposure_event_notes | List[String] | Notes on the exposure event
exposure_outcome_notes | List[String] | Notes on the exposure outcome
references | List[String] | Supporting literature references
associated_study_titles | List[String] | Titles of associated studies
enrollment_start_years | List[String] | Study enrollment start years
enrollment_end_years | List[String] | Study enrollment end years
study_factors | List[String] | Study design factors
assay_notes | List[String] | Notes on the assay used
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Exposure-Disease

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (EXP-DIS)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
evidence_count | UInt32 | Number of evidence entries
number_of_receptors | Int64 | Number of receptor/study participants
receptors | List[String] | Receptor identifiers (e.g. cell line, organism)
receptor_notes | List[String] | Free-text notes on receptors
smoking_statuses | List[String] | Smoking status of study subjects
sexes | List[String] | Sex of study subjects
races | List[String] | Race/ethnicity of study subjects
methods | List[String] | Measurement methods used
mediums | List[String] | Biological mediums measured (e.g. blood, urine)
detection_limit | List[String] | Lower limit of detection values
detection_limit_uom | List[String] | Units of detection limit values
detection_frequency | List[String] | Detection frequency values
age_entries | UInt32 | Number of age-stratified entries
age_range_values | List[String] | Age range values for subjects
age_mean_values | List[String] | Mean age values
age_median_values | List[String] | Median age values
age_point_values | List[String] | Point age values
age_open_range_values | List[String] | Open-ended age range values
study_countries | List[String] | Countries where studies were conducted
states_or_provinces | List[String] | States or provinces of study
city_town_region_areas | List[String] | City/town/region of study
outcome_relationships | List[String] | Observed outcome relationships
exposure_event_notes | List[String] | Notes on the exposure event
exposure_outcome_notes | List[String] | Notes on the exposure outcome
references | List[String] | Supporting literature references
associated_study_titles | List[String] | Titles of associated studies
enrollment_start_years | List[String] | Study enrollment start years
enrollment_end_years | List[String] | Study enrollment end years
study_factors | List[String] | Study design factors
assay_notes | List[String] | Notes on the assay used
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Exposure-Exposure

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (EXP-EXP)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
evidence_count | UInt32 | Number of evidence entries
number_of_receptors | Int64 | Number of receptor/study participants
receptors | List[String] | Receptor identifiers (e.g. cell line, organism)
receptor_notes | List[String] | Free-text notes on receptors
smoking_statuses | List[String] | Smoking status of study subjects
sexes | List[String] | Sex of study subjects
races | List[String] | Race/ethnicity of study subjects
methods | List[String] | Measurement methods used
mediums | List[String] | Biological mediums measured (e.g. blood, urine)
detection_limit | List[String] | Lower limit of detection values
detection_limit_uom | List[String] | Units of detection limit values
detection_frequency | List[String] | Detection frequency values
age_entries | UInt32 | Number of age-stratified entries
age_range_values | List[String] | Age range values for subjects
age_mean_values | List[String] | Mean age values
age_median_values | List[String] | Median age values
age_point_values | List[String] | Point age values
age_open_range_values | List[String] | Open-ended age range values
study_countries | List[String] | Countries where studies were conducted
states_or_provinces | List[String] | States or provinces of study
city_town_region_areas | List[String] | City/town/region of study
outcome_relationships | List[String] | Observed outcome relationships
exposure_event_notes | List[String] | Notes on the exposure event
exposure_outcome_notes | List[String] | Notes on the exposure outcome
references | List[String] | Supporting literature references
associated_study_titles | List[String] | Titles of associated studies
enrollment_start_years | List[String] | Study enrollment start years
enrollment_end_years | List[String] | Study enrollment end years
study_factors | List[String] | Study design factors
assay_notes | List[String] | Notes on the assay used
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Exposure-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (EXP-GEN)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
evidence_count | UInt32 | Number of evidence entries
number_of_receptors | Int64 | Number of receptor/study participants
receptors | List[String] | Receptor identifiers (e.g. cell line, organism)
receptor_notes | List[String] | Free-text notes on receptors
smoking_statuses | List[String] | Smoking status of study subjects
sexes | List[String] | Sex of study subjects
races | List[String] | Race/ethnicity of study subjects
methods | List[String] | Measurement methods used
mediums | List[String] | Biological mediums measured (e.g. blood, urine)
detection_limit | List[String] | Lower limit of detection values
detection_limit_uom | List[String] | Units of detection limit values
detection_frequency | List[String] | Detection frequency values
age_entries | UInt32 | Number of age-stratified entries
age_range_values | List[String] | Age range values for subjects
age_mean_values | List[String] | Mean age values
age_median_values | List[String] | Median age values
age_point_values | List[String] | Point age values
age_open_range_values | List[String] | Open-ended age range values
study_countries | List[String] | Countries where studies were conducted
states_or_provinces | List[String] | States or provinces of study
city_town_region_areas | List[String] | City/town/region of study
outcome_relationships | List[String] | Observed outcome relationships
exposure_event_notes | List[String] | Notes on the exposure event
exposure_outcome_notes | List[String] | Notes on the exposure outcome
references | List[String] | Supporting literature references
associated_study_titles | List[String] | Titles of associated studies
enrollment_start_years | List[String] | Study enrollment start years
enrollment_end_years | List[String] | Study enrollment end years
study_factors | List[String] | Study design factors
assay_notes | List[String] | Notes on the assay used
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Exposure-Molecular Function

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (EXP-MFN)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
evidence_count | UInt32 | Number of evidence entries
number_of_receptors | Int64 | Number of receptor/study participants
receptors | List[String] | Receptor identifiers (e.g. cell line, organism)
receptor_notes | List[String] | Free-text notes on receptors
smoking_statuses | List[String] | Smoking status of study subjects
sexes | List[String] | Sex of study subjects
races | List[String] | Race/ethnicity of study subjects
methods | List[String] | Measurement methods used
mediums | List[String] | Biological mediums measured (e.g. blood, urine)
detection_limit | List[String] | Lower limit of detection values
detection_limit_uom | List[String] | Units of detection limit values
detection_frequency | List[String] | Detection frequency values
age_entries | UInt32 | Number of age-stratified entries
age_range_values | List[String] | Age range values for subjects
age_mean_values | List[String] | Mean age values
age_median_values | List[String] | Median age values
age_point_values | List[String] | Point age values
age_open_range_values | List[String] | Open-ended age range values
study_countries | List[String] | Countries where studies were conducted
states_or_provinces | List[String] | States or provinces of study
city_town_region_areas | List[String] | City/town/region of study
outcome_relationships | List[String] | Observed outcome relationships
exposure_event_notes | List[String] | Notes on the exposure event
exposure_outcome_notes | List[String] | Notes on the exposure outcome
references | List[String] | Supporting literature references
associated_study_titles | List[String] | Titles of associated studies
enrollment_start_years | List[String] | Study enrollment start years
enrollment_end_years | List[String] | Study enrollment end years
study_factors | List[String] | Study design factors
assay_notes | List[String] | Notes on the assay used
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Gene-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (GEN-GEN)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Molecular Function-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (MFN-GEN)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
evidence | List[String] | GO evidence codes (e.g. IDA, IMP, TAS)
gene_product | List[String] | Gene product IDs annotated to this term
eco_ids | List[String] | Evidence & Conclusion Ontology (ECO) IDs
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Molecular Function-Molecular Function

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (MFN-MFN)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Pathway-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (PWY-GEN)
relation | String | Relation type
undirected | Boolean | True
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Pathway-Pathway

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (PWY-PWY)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Phenotype-Gene

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (PHE-GEN)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
evidence_score | Float64 | Aggregated association evidence score
evidence_count | Int64 | Number of evidence items supporting the association
evidence_index | Float64 | Combined evidence index (Open Targets)
disease_specificity_index | Float64 | DSI, specificity of the gene to this disease
disease_pleiotropy_index | Float64 | DPI, number of disease classes the gene is associated with
disgenet_score | Float64 | DisGeNET gene–disease association score
year_initial | String | Year of the earliest supporting publication
year_final | String | Year of the most recent supporting publication
number_of_pmids | Int16 | Number of supporting PubMed publications
number_of_snps | Int16 | Number of supporting SNPs (GWAS evidence)
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship

Phenotype-Phenotype

from | String | Source node ID (CURIE format)
to | String | Target node ID (CURIE format)
label | String | Edge type label (PHE-PHE)
relation | String | Relation type
undirected | Boolean | False
properties | Struct | Edge-specific properties
sources | Struct | Provenance of this edge
direct | List[String] | Datasets that directly contributed this relationship
indirect | List[String] | Datasets that referenced this relationship
