Lexicon is the repository where we gather and normalize all the reference data
that feeds the lexicon schema of Ekylibre and the Lexicon API project.
Lexicon runs under Docker with a Python 3 / Ruby and PostgreSQL / PostGIS stack.
Check your hardware, and disk space in particular, before running it: you need at least an 8-core CPU above 3 GHz, 32 GB of RAM and 1 TB of free SSD space.
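As a rough pre-flight check, the minimums above can be compared against the host before starting a build. This is only a sketch and not part of Lexicon: the names `shortfalls` and `probe_host` are hypothetical, the RAM probe relies on POSIX `sysconf` (so it is skipped where that is unavailable), and CPU clock speed is not checked at all.

```python
import os
import shutil

# Minimums quoted in the requirements above.
MIN_CORES = 8
MIN_RAM_GB = 32
MIN_FREE_DISK_GB = 1000  # 1 TB


def shortfalls(cores, ram_gb, free_disk_gb):
    """Return a list of problems; an empty list means the host qualifies.

    ram_gb may be None when the amount of RAM could not be determined.
    """
    problems = []
    if cores < MIN_CORES:
        problems.append(f"CPU cores: {cores} < {MIN_CORES}")
    if ram_gb is not None and ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.0f} GB < {MIN_RAM_GB} GB")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"Free disk: {free_disk_gb:.0f} GB < {MIN_FREE_DISK_GB} GB")
    return problems


def probe_host(path="."):
    """Measure cores, physical RAM (POSIX only) and free space under `path`."""
    cores = os.cpu_count() or 0
    try:
        ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    except (AttributeError, ValueError, OSError):
        ram_gb = None  # not measurable on this platform
    free_disk_gb = shutil.disk_usage(path).free / 1e9
    return cores, ram_gb, free_disk_gb


if __name__ == "__main__":
    for problem in shortfalls(*probe_host()):
        print("WARNING:", problem)
```

Pass the directory that will hold the Docker volumes to `probe_host` so that the free-space figure refers to the right disk.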
See doc/INSTALL.md for installation instructions.
See doc/UPGRADING.md for upgrade instructions.
On the host, you can use ./lexicon <COMMAND> <OPTIONS>.
To build a dataset (collect, load, normalize) in one step, use the 'run' command.
Example for the phytosanitary dataset:
./lexicon run phytosanitary
You can dump all datasets, or use flavors to filter them. A package <VERSION> is produced in the out folder, named <VERSION_NUMBER>-<FLAVOR>.
Example for all datasets:
./lexicon dump all --no-validate
Example for datasets restricted to the SAINT-PORCHAIRE (17250) zone:
./lexicon dump all --flavor saint-porchaire --no-validate
with the flavor file saint-porchaire.yml:
---
name: saint-porchaire
without:
  - pesticide_frequency_indicator
  - enterprises
datasources:
  cadastre: # name of the datasource, corresponding to a file in lib/datasources
    registered_cadastral_parcels: # name of the table to filter in the datasource
      filter: WHERE postgis.ST_DWithin(centroid , postgis.ST_PointFromText('POINT(-0.78 45.81)',4326) , 0.10)
  cadastral_prices:
    registered_cadastral_prices:
      filter: WHERE postal_code = '17250' ORDER BY id
  graphic_parcels:
    registered_graphic_parcels:
      filter: WHERE postgis.ST_DWithin(centroid , postgis.ST_PointFromText('POINT(-0.78 45.81)',4326) , 0.10) ORDER BY id
  hydrography:
    registered_hydrographic_items:
      filter: WHERE postgis.ST_DWithin(centroid , postgis.ST_PointFromText('POINT(-0.78 45.81)',4326) , 0.10)
    registered_area_items:
      filter: WHERE postgis.ST_DWithin(centroid , postgis.ST_PointFromText('POINT(-0.78 45.81)',4326) , 0.10)
    registered_cadastral_buildings:
      filter: WHERE postgis.ST_DWithin(centroid , postgis.ST_PointFromText('POINT(-0.78 45.81)',4326) , 0.10)
  postal_codes:
    registered_postal_codes:
      filter: WHERE postal_code = '17250'
  weather:
    registered_weather_stations:
      filter: WHERE country = 'FR' AND country_zone = '17'
    registered_hourly_weathers:
      filter: WHERE station_id LIKE 'FR17%'
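The spatial filters above build their reference point in SRID 4326. Assuming `centroid` is a PostGIS geometry column (not geography), the 0.10 threshold passed to ST_DWithin is therefore expressed in degrees, not metres. The sketch below estimates what that radius covers around the reference point, using a spherical-Earth approximation. The helper name `degrees_to_km` is hypothetical and the figures are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def degrees_to_km(deg, latitude_deg):
    """Approximate north-south and east-west extent of `deg` degrees at a latitude."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = deg * km_per_degree
    # Meridians converge towards the poles, so a degree of longitude shrinks
    # with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# POINT(-0.78 45.81) is written as (longitude latitude), so the latitude is 45.81.
ns_km, ew_km = degrees_to_km(0.10, 45.81)
print(f"0.10 degree at 45.81 N: ~{ns_km:.1f} km north-south, ~{ew_km:.1f} km east-west")
```

Under these assumptions the filter selects an area roughly 11 km deep north-south and a bit under 8 km east-west on each side of the point, which is worth keeping in mind when adapting the flavor to another zone.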
Set the credentials in your .env:
MINIO_HOST=<YOUR-HOST> # https://api.opensourcefarm.org/
MINIO_ACCESS_KEY=<YOUR-ACCESS-KEY>
MINIO_SECRET_KEY=<YOUR-SECRET-KEY>
Then you can launch the remote upload command:
./lexicon remote upload <VERSION>
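Before launching an upload, it can help to confirm that the three MINIO_* variables are actually filled in. The following is a minimal sketch, not something Lexicon ships: `parse_env` and `missing_credentials` are hypothetical helpers, and the parser only handles simple KEY=VALUE lines with optional trailing comments, as in the snippet above.

```python
import re

REQUIRED_KEYS = ("MINIO_HOST", "MINIO_ACCESS_KEY", "MINIO_SECRET_KEY")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines, comment lines and trailing ' # ...' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Only whitespace followed by '#' starts a comment, so a '#' inside a URL is kept.
        value = re.split(r"\s+#", value, maxsplit=1)[0].strip()
        values[key.strip()] = value
    return values


def missing_credentials(values):
    """Return the required keys that are absent, empty, or still a <PLACEHOLDER>."""
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or re.fullmatch(r"<.*>", values[key])
    ]


sample = """\
MINIO_HOST=https://api.opensourcefarm.org/ # storage endpoint
MINIO_ACCESS_KEY=<YOUR-ACCESS-KEY>
"""
print(missing_credentials(parse_env(sample)))
# prints ['MINIO_ACCESS_KEY', 'MINIO_SECRET_KEY']
```

In practice the text would come from reading the .env file itself. An empty list means all three values look usable.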
See doc/USAGE.md for more usage information.
| Datasource | Record count | Size (MB) | Spatial | Last updated | Provider | Description |
|---|---|---|---|---|---|---|
| administrative_areas | 119 | 2.34 | ✔ | 2021-01-01 | INSEE | French administrative reference data (regions & departments, COG INSEE 2021) |
| agroedi | 1,814 | 0.26 | ⨯ | 2023-01-01 | AgroEDI | AgroEDI Europe |
| budgets | 34 | 0.03 | ⨯ | 2021-07-12 | Ekylibre SAS | Budgets |
| cadastral_prices | 20,382,915 | 6,728 | ✔ | 2026-04-14 | Etalab | Prices of cadastre |
| cadastre | 93,487,746 | 102,200 | ✔ | 2026-03-29 | Etalab | Official cadastre |
| cadastre_owners | 46,582,299 | 10,407 | ⨯ | 2025-08-19 | DGFiP | Cadastre owners (legal entities / personnes morales), DGFiP MAJIC 2025 |
| cap_beneficiaries | 1,606,007 | 757 | ⨯ | 2024-12-31 | Agence de Services et de Paiement (ASP) | Beneficiaries of Common Agricultural Policy (CAP, French: PAC) subsidies, FEAGA & FEADER |
| chart_of_accounts | 1,595 | 0.41 | ⨯ | 2020-02-12 | Ekylibre SAS | Chart of accounts |
| enterprises | 1,341,069 | 278 | ✔ | 2025-01-01 | INSEE | French Enterprises datasource |
| eu_market_prices | 184,729 | 62.4 | ⨯ | 2026-05-10 | European Union | Europe market prices |
| graphic_parcels | 9,686,744 | 12,414 | ✔ | 2024-01-01 | IGN / ASP / MASA | RPG v3.0: recorded agricultural parcels (metropolitan France) |
| hydrography | 88,841,350 | 105,679 | ✔ | 2024-09-15 | IGN | Hydrographic data from IGN (national BD TOPO GeoPackage) |
| intervention_models | 532 | 0.33 | ⨯ | 2021-09-10 | Ekylibre SAS | Intervention models references used for ITKs |
| legal_positions | 21 | 0.03 | ⨯ | 2020-02-12 | Ekylibre SAS | Legal positions |
| msa_populations | 49,756 | 7.92 | ⨯ | 2026-05-16 | Caisse Centrale de la Mutualité Sociale Agricole (CCMSA) | MSA populations: retirees, farm managers and newly established farmers per municipality |
| phenological_stages | 54 | 0.03 | ⨯ | 2020-02-12 | Ekylibre SAS | Phenological stages |
| phytosanitary | 122,292 | 140 | ⨯ | 2025-05-20 | ANSES | Phytosanitary products database from Ephy |
| postal_codes | 39,192 | 711 | ✔ | 2026-05-08 | Groupe La Poste | French Enterprises postal and INSEE codes with GPS coordinates |
| prices | 2,337 | 0.77 | ⨯ | 2022-02-22 | Ekylibre SAS | Price catalog of variants |
| productions | 19,553 | 3.21 | ⨯ | 2021-09-10 | Ekylibre SAS | Production database |
| protected_natural_zones | 1,762 | 90.1 | ✔ | 2024-12-01 | MNHN / INPN | Natura 2000: SIC and ZPS sites (MNHN/INPN, NATURA_BDD 12/2024) |
| protected_water_zones | 1,585 | 8.09 | ✔ | 2025-01-27 | IGN | Protected water zones from SANDRE |
| quality_and_origin_signs | 5,255 | 1.05 | ⨯ | 2025-03-10 | INAO | AOC - AOP - IGP |
| rica | 96,809 | 648 | ⨯ | 2025-04-07 | SSP / Agreste | RICA (Réseau d'Information Comptable Agricole): annual farm accounting microdata |
| seed_varieties | 20,479 | 3.47 | ⨯ | 2025-01-26 | SEMAE | Seed varieties from GNIS (Groupement National Interprofessionnel des Semences et plants) |
| soil | 3,651 | 10.5 | ✔ | 2022-01-12 | INRAE | Agronomical soil data from INRAE |
| taxonomy | 855 | 0.2 | ⨯ | 2021-01-27 | Ekylibre SAS | Taxonomy |
| technical_workflow_sequences | 25 | 0.13 | ⨯ | 2020-02-12 | Ekylibre SAS | Technical workflows chaining for multiannual production |
| technical_workflows | 1,321 | 1.02 | ⨯ | 2021-09-10 | Ekylibre SAS | Technical workflows references |
| translations | 3,081 | 0.63 | ⨯ | 2021-01-27 | Ekylibre SAS | Translations of variants, productions, taxonomy and user_roles |
| units | 189 | 0.17 | ⨯ | 2022-02-23 | Ekylibre SAS | Dimensions, units and packaging |
| user_roles | 4 | 0.03 | ⨯ | 2022-02-23 | Ekylibre SAS | User roles |
| variants | 1,607 | 0.69 | ⨯ | 2021-09-10 | Ekylibre SAS | Articles, Equipments, Services, Crops, Animals, Workers and Zones |
| vine_varieties | 416 | 0.2 | ⨯ | 2020-10-19 | FranceAgriMer | Vine varieties |
| weather | 214,843,019 | 44,144 | ✔ | 2025-05-23 | Météo France | Historical weather |
Last refreshed: 2026-05-16 16:59 UTC
See doc/DATASOURCES.md for more information on the datasources.
See doc/CONTRIBUTING.md for contribution guidelines.