hannahbast commented Dec 6, 2025

During parsing, for each triple with a literal object, and for each word in that literal, add an internal triple `subject ql:has-word "word"`. These triples can be used for highly customized text search. To make such searches efficient, materialized views can be used.

TODO: This is currently done unconditionally, which makes it easier to test (we don't need special options in the Qleverfile). Eventually, there should be an option `--add-has-word-triples` to `IndexBuilderMain` to enable this behavior. Tests are also still missing.
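
The step described above can be sketched as follows. This is only an illustration in Python, not the C++ code from this PR: the function names are hypothetical, and the tokenization (lowercasing, splitting on non-word characters) is an assumption, since the description does not say how a literal is split into words. In this sketch, a word that occurs several times in a literal yields one triple per occurrence.

```python
import re

# Predicate as written in the PR description; the full IRI behind the
# `ql:` prefix is not shown there, so the prefixed form is used as-is.
HAS_WORD = "ql:has-word"


def is_literal(term: str) -> bool:
    """Treat a term as a literal if it starts with a double quote."""
    return term.startswith('"')


def literal_words(literal: str) -> list[str]:
    """Split the content of a literal into lowercase words.

    ASSUMPTION: lowercasing and splitting on non-word characters; the
    actual tokenizer used by the index builder may differ.
    """
    # Keep the text between the first and the last double quote, which
    # drops a trailing language tag or datatype.
    content = literal[1:literal.rindex('"')]
    return re.findall(r"\w+", content.lower())


def add_has_word_triples(triples):
    """Yield each input triple, followed by one internal triple
    `subject ql:has-word "word"` per word of a literal object."""
    for subject, predicate, obj in triples:
        yield (subject, predicate, obj)
        if is_literal(obj):
            for word in literal_words(obj):
                yield (subject, HAS_WORD, f'"{word}"')
```

For example, the triple `<e> <label> "Albert Einstein"@en` would be followed by `<e> ql:has-word "albert"` and `<e> ql:has-word "einstein"`, while a triple with an IRI object passes through unchanged.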

Hannah Bast added 5 commits December 6, 2025 02:41
During parsing, for each triple with a literal object, and for each
word in that literal, add an internal triple `subject ql:has-word
"word"`.

TODO: This is currently done unconditionally, which makes it easier to
test (we don't need special options in the Qleverfile). Eventually,
there should be an option `--add-has-word-triples` to `IndexBuilderMain`
to enable this behavior. Tests are also still missing.

codecov bot commented Dec 6, 2025

Codecov Report

❌ Patch coverage is 98.18182% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 91.20%. Comparing base (50d08bc) to head (6dcd451).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/index/IndexBuilderTypes.h | 97.43% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2579   +/-   ##
=======================================
  Coverage   91.20%   91.20%           
=======================================
  Files         473      473           
  Lines       40233    40264   +31     
  Branches     5378     5386    +8     
=======================================
+ Hits        36695    36724   +29     
- Misses       2006     2007    +1     
- Partials     1532     1533    +1     

Writing the position is more general. But computing the term frequencies
for each text-word pair is currently not efficient in QLever (it
requires too much memory and a GROUP BY with two variables is much
slower than a GROUP BY with one variable). Since we never needed
positions so far, but we do want term frequencies for scoring, let's
make this the default for now.
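
The trade-off in this commit message can be illustrated with a small sketch. It is written in Python with hypothetical names and an assumed tokenizer, and it says nothing about how QLever actually encodes positions or frequencies in the internal triples. Positions are the more general representation, because the frequency of a word in a text is the number of its positions. Recovering that count afterwards, however, means grouping by both the text and the word, which is the two-variable `GROUP BY` the message describes as slow. Storing the frequency directly avoids that step.

```python
import re
from collections import Counter, defaultdict


def words(text: str) -> list[str]:
    """ASSUMED tokenization: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def word_positions(texts: dict[str, str]) -> dict[tuple[str, str], list[int]]:
    """For each (text, word) pair, list every position of the word."""
    positions = defaultdict(list)
    for text_id, text in texts.items():
        for pos, word in enumerate(words(text)):
            positions[(text_id, word)].append(pos)
    return dict(positions)


def term_frequencies(texts: dict[str, str]) -> dict[tuple[str, str], int]:
    """For each (text, word) pair, count the occurrences while tokenizing."""
    freqs = Counter()
    for text_id, text in texts.items():
        for word in words(text):
            freqs[(text_id, word)] += 1
    return dict(freqs)


def frequencies_from_positions(positions):
    """Derive frequencies from positions: one group per (text, word) key."""
    return {key: len(pos_list) for key, pos_list in positions.items()}
```

Both routes give the same frequencies; the difference is whether the grouping over two keys happens once at index time or at query time.
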
@sparql-conformance

Overview

| Number of Tests | Passed ✅ | Intended ✅ | Failed ❌ | Not tested |
|---|---|---|---|---|
| 525 | 379 | 67 | 79 | 0 |

Conformance check passed ✅

No test result changes.

Details: https://qlever.dev/sparql-conformance-ui?cur=6dcd451a4878273ffb816ba5255d33023376610c&prev=50d08bc03a4c1fe7495d4209a3542c90ce36b997

sonarqubecloud bot commented Dec 7, 2025
