Commit Graph

4 Commits

Author SHA1 Message Date
ghreprimand 22bd23f044 Word-boundary match for snippet and subject-term ranking (#1473 follow-up)
#1473 converted the title and sports-hint matches in services/search/ranking.py
to word boundaries but left two raw substring tests:

  - snippet_score: 'term in snippet.lower()' — query term 'port' hits
    'transport'/'support', inflating a result's relevance.
  - news_quality_adjustment: 't in text or t in netloc' for the subject term —
    query 'us' substring-matches 'business'/'music', so an off-topic page
    wrongly escapes the off-topic penalty on a country/subject news query.

Add a _has_word helper (the same \b...\b pattern title_score already used) and
route all three word checks (title, snippet, subject) through it, so the file
stays consistent and a future partial fix can't reintroduce the same bug class.
Pure ranking refinement: scores change only for spurious substring matches; no
API or schema change.
2026-06-03 07:07:12 -05:00
Alexandre Teixeira a75dd4a231 fix(search): apply recency UTC fix to live ranking module 2026-06-03 12:49:32 +01:00
Afonso Coutinho b55c970ec5 fix: sports-hint ranking penalty fires on 'transport'/'passport' substrings (#1473)
* fix: sports-hint ranking penalty fires on 'transport'/'passport' substrings

* Apply word-boundary sports-hint fix to src/search/ranking.py as well
2026-06-03 14:23:52 +09:00
pewdiepie-archdaemon e5c99a5eee Odysseus v1.0 2026-05-31 23:58:26 +09:00