Lbron 1322 script for downloading datasets#105

Merged
brechtvdv merged 20 commits into development from lbron-1322-scrip-for-downlaoding-datasets
May 29, 2026
JonasVanHoof commented May 22, 2026

🗒️ Description

We want an export of all the triples per subset and per municipality, preferably as queries that can be run from a script, so that we can call it later from a cron job or something similar.

Subsets:

  • codelists
  • restricted mobility zones
  • expressions/work/manifests
  • human-validations
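
Since the goal is to call the export from a cron job, a small wrapper around the script could look like the sketch below. It assumes the script is invoked as `extract_ttl_to_file.py --job <name>`, as in the test steps of this PR. `build_command` and `run_jobs` are hypothetical helper names, and only `codelists` is a job name confirmed here; the names for the other subsets should be taken from `--list`.

```python
import subprocess
import sys

# Script name as used in the test steps of this PR.
SCRIPT = "extract_ttl_to_file.py"


def build_command(job: str) -> list[str]:
    """Build the command line for one extraction job."""
    return [sys.executable, SCRIPT, "--job", job]


def run_jobs(jobs, runner=subprocess.run):
    """Run each job in turn and return the names of the jobs that failed.

    `runner` is injectable so the wrapper can be exercised without
    actually starting the extraction.
    """
    failed = []
    for job in jobs:
        result = runner(build_command(job), check=False)
        if result.returncode != 0:
            failed.append(job)
    return failed
```

A cron entry could then call a script that runs `run_jobs(["codelists"])` from the `download_datasets` directory and exits non-zero when the returned list is not empty.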

🦮 How to test

  1. Use the test database dump.
  2. Run the extractions for every subset (make sure you have a dataset that contains these values, otherwise the output files will be empty ;p)
cd ../scripts/download_datasets
python extract_ttl_to_file.py --list # Lists all jobs; use the name of one in the next command
python extract_ttl_to_file.py --job codelists
=== Job: Codelist annotations (SDG) for municipality ===

[Step 1] Counting distinct subjects (?annotation  ?activity  ?agent) …
  [Step 1] ?annotation: 75 distinct
  [Step 1] ?activity: 2 distinct
  [Step 1] ?agent: 1 distinct
[Step 1] 78 distinct subject URIs → 1 batch(es) of 100000 (4 parallel)
  [Step 1] 1/1 batches done
[Step 1] Done. 78 distinct subject URIs queued.
[Pipeline] Processed 78 subjects …[Pipeline] Tmp graph is empty – extraction complete.

[Pipeline] Finished. 78 subjects written to './output/codelists.ttl'.
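
The log above reports 78 subject URIs split into 1 batch of 100000 with 4 parallel workers. The script's internals are not shown in this PR, so the following is only a sketch of how such batching could work, not the actual implementation. `make_batches`, `process_batches`, and `handle_batch` are hypothetical names; the defaults mirror the numbers in the log.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def make_batches(subjects, batch_size=100_000):
    """Yield consecutive lists of at most `batch_size` subjects."""
    it = iter(subjects)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def process_batches(subjects, handle_batch, batch_size=100_000, workers=4):
    """Apply `handle_batch` to every batch using a thread pool.

    Returns the number of batches and the per-batch results, in batch order.
    """
    batches = list(make_batches(subjects, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(handle_batch, batches))
    return len(batches), results
```

With 78 subjects and the default batch size this produces a single batch, which matches the "1 batch(es) of 100000" line in the log.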

JonasVanHoof self-assigned this May 22, 2026
JonasVanHoof force-pushed the lbron-1322-scrip-for-downlaoding-datasets branch from 479c5fc to dd34785 May 22, 2026 11:48
JonasVanHoof requested a review from Rahien May 22, 2026 12:01
JonasVanHoof changed the title from "Lbron 1322 scrip for downlaoding datasets" to "Lbron 1322 script for downloading datasets" May 27, 2026
brechtvdv self-requested a review May 27, 2026 07:20
JonasVanHoof marked this pull request as ready for review May 29, 2026 07:02
results in query returned organization & municipality

Refs: lbron-1322