
CLI Utilities

Command-line tools for maintenance and validation.

OptimusKG ships a Typer-based CLI for common maintenance tasks.

uv run cli --help

sync-catalog

Synchronize catalog schemas and checksums. For ParquetDataset entries, the command reads the Parquet file on disk and updates the YAML schema to match. For any dataset with a metadata.checksum field, it recomputes the BLAKE2b checksum.
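To make the checksum step concrete, here is a minimal sketch of computing a BLAKE2b digest for a file with Python's standard library. The function name `blake2b_checksum`, the default 64-byte digest size, the hex encoding, and the chunk size are all assumptions for illustration; the digest parameters OptimusKG actually uses may differ.

```python
import hashlib
from pathlib import Path


def blake2b_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex BLAKE2b digest of a file, reading it in chunks.

    Hypothetical helper: the digest size (hashlib's default of 64 bytes)
    and the hex encoding are assumptions, not OptimusKG's documented format.
    """
    digest = hashlib.blake2b()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large Parquet files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat regardless of file size, which matters for large Parquet files.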

# Sync all schemas and checksums
uv run cli sync-catalog

# Preview changes without writing
uv run cli sync-catalog --dry-run

# Validate without updating (useful in CI)
uv run cli sync-catalog --validate

# Target a specific layer
uv run cli sync-catalog --layer bronze

# Target a specific dataset
uv run cli sync-catalog --dataset bronze.opentargets.disease

Option          Short   Description
--layer         -l      Target layer: landing, bronze, silver, or all (default).
--dataset       -d      Specific dataset name.
--validate      -v      Validate without updating files.
--dry-run       -n      Preview changes without writing.
--catalog-dir           Path to the catalog directory (default: conf/base/catalog).
--data-dir              Path to the data directory (default: data).
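
For scripted use, such as a CI job, the options above can be assembled programmatically. The sketch below is a hypothetical wrapper: `sync_catalog_command` and `run_sync_catalog` are illustrative names, not part of OptimusKG. It also assumes the CLI exits with a non-zero status when validation fails, which this page does not state.

```python
import subprocess
from typing import List, Optional


def sync_catalog_command(
    layer: Optional[str] = None,
    dataset: Optional[str] = None,
    validate: bool = False,
    dry_run: bool = False,
) -> List[str]:
    """Build the argv list for `uv run cli sync-catalog` (hypothetical helper)."""
    command = ["uv", "run", "cli", "sync-catalog"]
    if layer is not None:
        command += ["--layer", layer]
    if dataset is not None:
        command += ["--dataset", dataset]
    if validate:
        command.append("--validate")
    if dry_run:
        command.append("--dry-run")
    return command


def run_sync_catalog(**options) -> int:
    """Run the command and return its exit status.

    Assumption: a non-zero status signals a failed validation.
    """
    return subprocess.run(sync_catalog_command(**options)).returncode
```

A CI step could then call `run_sync_catalog(validate=True)` and fail the job when the returned status is not zero.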

checksum

Log and validate file checksums.

uv run cli checksum

metrics

Generate metrics Parquet files from the gold knowledge graph data.

uv run cli metrics

figure

Generate visualization figures for the knowledge graph.

uv run cli figure <figure-type>

Available figure types:

  • adjacency-heatmap
  • ccdf-degree-distribution
  • closeness-centrality
  • degree-distribution
  • metaedge-bubble-plot
  • metapath-counts
  • property-type-distribution
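
To generate every figure in one pass, the listed figure types can be looped over. This is a minimal sketch: `FIGURE_TYPES`, `figure_command`, and `generate_all_figures` are hypothetical names, and it assumes each invocation is independent and that `uv` is on the PATH.

```python
import subprocess
from typing import List

# Figure types copied from the list above.
FIGURE_TYPES = [
    "adjacency-heatmap",
    "ccdf-degree-distribution",
    "closeness-centrality",
    "degree-distribution",
    "metaedge-bubble-plot",
    "metapath-counts",
    "property-type-distribution",
]


def figure_command(figure_type: str) -> List[str]:
    """Build the argv list for one figure (hypothetical helper)."""
    if figure_type not in FIGURE_TYPES:
        raise ValueError(f"unknown figure type: {figure_type}")
    return ["uv", "run", "cli", "figure", figure_type]


def generate_all_figures() -> None:
    """Run the figure command once per type, stopping at the first failure."""
    for figure_type in FIGURE_TYPES:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(figure_command(figure_type), check=True)
```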
