
fix(reranker): detect pre-normalized scores and use rank-based normalization #1512

Open
xuli500177 wants to merge 2 commits into vectorize-io:main from xuli500177:fix/reranker-sigmoid-normalization

@xuli500177 xuli500177 commented May 7, 2026

Problem

External API rerankers (SiliconFlow, Cohere, etc.) return a pre-normalized relevance_score in [0, 1], often with very small absolute values. The current code applies sigmoid to all scores, assuming they are logits. For inputs near zero, sigmoid(x) ≈ 0.5 + x/4, so every score is compressed to ~0.5, which destroys the ranking signal and leaves recency as the sole sorting factor.

Example with SiliconFlow BAAI/bge-reranker-v2-m3

| Document | Raw score | After sigmoid | After this fix |
|---|---|---|---|
| Relevant memory | 0.0029 | 0.5007 | 1.0000 |
| Somewhat relevant | 0.0003 | 0.5001 | 0.5000 |
| Irrelevant | 0.0000 | 0.5000 | 0.0000 |

With sigmoid, all scores are ~0.5 and recency becomes the only ranking signal. With rank-based normalization, the cross-encoder (CE) signal dominates as intended.
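
To make the compression concrete, here is a small standalone check that applies a plain sigmoid to the three raw scores from the example. It is only an illustration of the arithmetic; the `sigmoid` helper and the dictionary names are hypothetical and not taken from the codebase.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function (safe for the small inputs used here)."""
    return 1.0 / (1.0 + math.exp(-x))


# Raw relevance scores from the SiliconFlow example in this description.
raw_scores = {
    "Relevant memory": 0.0029,
    "Somewhat relevant": 0.0003,
    "Irrelevant": 0.0,
}

# Apply sigmoid as if the scores were logits.
squashed = {name: sigmoid(score) for name, score in raw_scores.items()}

# Distance between the best and worst document after sigmoid.
spread = max(squashed.values()) - min(squashed.values())

for name, value in squashed.items():
    print(f"{name}: {value:.4f}")
print(f"spread after sigmoid: {spread:.6f}")
```

The spread between the most and least relevant document ends up below 0.001, so any other term that is combined with the score on a comparable scale can easily outweigh it.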

Fix

Detect the score range in CrossEncoderReranker.rerank():

  • If all scores are in [0, 1]: Use rank-based normalization with tie handling (equal scores get equal ranks). This preserves relative ordering without depending on absolute score magnitudes.
  • Otherwise (logits): Use sigmoid as before. This maintains backward compatibility with local models (e.g. cross-encoder/ms-marco-MiniLM-L-6-v2).
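
The two branches above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: `normalize_scores` is a hypothetical helper name, dense ranking over distinct values is one assumed way to implement "equal scores get equal ranks", and the value returned when every score is identical (1.0 here) is an arbitrary choice that the description does not specify.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def normalize_scores(raw: Sequence[float]) -> List[float]:
    """Hypothetical helper: map raw reranker scores into [0, 1]."""
    if not raw:
        return []

    # Branch 1: every score already lies in [0, 1], so treat the scores as
    # pre-normalized and keep only their relative order.
    if all(0.0 <= s <= 1.0 for s in raw):
        distinct = sorted(set(raw))
        if len(distinct) == 1:
            # Assumption: identical scores all get the same value (1.0 here).
            return [1.0] * len(raw)
        # Dense rank: tied scores share a position, lowest = 0, highest = top.
        position = {value: i for i, value in enumerate(distinct)}
        top = len(distinct) - 1
        return [position[s] / top for s in raw]

    # Branch 2: at least one score falls outside [0, 1], so treat all of
    # them as logits and squash with sigmoid.
    return [sigmoid(s) for s in raw]


print(normalize_scores([0.0029, 0.0003, 0.0]))  # pre-normalized API scores
print(normalize_scores([0.9, 0.9, 0.1]))        # tie at the top
print(normalize_scores([2.5, 1.8, 0.3]))        # illustrative logits
```

One consequence of a range-based check like this is that a batch of logits that all happen to fall inside [0, 1] would take the rank-based branch; the ordering is still preserved in that case, only the absolute magnitudes are discarded.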

Testing

Verified with real SiliconFlow API scores:

  • SiliconFlow [0,1] small values: correctly rank-normalized (1.0 > 0.5 > 0.0)
  • Local model logits: correctly uses sigmoid (0.92 > 0.86 > 0.57)
  • All identical scores: correctly assigned equal ranks (no artificial separation)
  • Tied scores: correctly grouped (e.g. two tied at top → 1.0, distinct lower → 0.0)
  • Empty input: returns []
  • Passthrough reranker: unaffected (apply_combined_scoring overrides with RRF rank)

Unit tests added in tests/test_reranker_score_normalization.py.

Related

…ization

External API rerankers (SiliconFlow, Cohere, etc.) return pre-normalized
relevance_score in [0, 1] with very small absolute values. Applying
sigmoid to these compresses everything to ~0.5, destroying the ranking
signal and making recency the sole sorting factor.

This fix detects the score range:
- If all scores are in [0, 1]: use rank-based normalization with tie
  handling (equal scores get equal ranks)
- Otherwise (logits): use sigmoid as before

This preserves the correct behavior for local models (logits) while
fixing ranking quality for external API rerankers.
Collaborator

@nicoloboschi nicoloboschi left a comment

lgtm, can you add unit tests on this function

- Rank-based normalization for [0,1] scores
- Tied scores receive identical normalized values
- Sigmoid normalization for logit scores
- Empty candidates returns [] without calling predict()
- Fix typo: "sole排序 factor" -> "sole sorting factor"
@xuli500177
Author

Unit tests added for CrossEncoderReranker.rerank() score normalization:

  • Rank-based normalization for [0,1] scores
  • Tied scores get identical normalized values
  • Sigmoid normalization for logit scores
  • Empty candidates returns [] without calling predict()
  • Boundary values (0.0, 1.0) handling

Also fixed a typo in the comment: "sole排序 factor" → "sole sorting factor".

Tests use AsyncMock, no external API calls.
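
For readers unfamiliar with the pattern, a test of this shape might look roughly like the sketch below. It does not reproduce tests/test_reranker_score_normalization.py: `SketchReranker` is a hypothetical stand-in for CrossEncoderReranker, its `backend.predict` coroutine is an assumed interface, and the normalization inside it is the same assumed dense-rank scheme rather than the PR's actual implementation.

```python
import asyncio
import math
from unittest.mock import AsyncMock


class SketchReranker:
    """Hypothetical stand-in; not the project's CrossEncoderReranker."""

    def __init__(self, backend):
        self.backend = backend  # assumed to expose an async predict(pairs)

    async def rerank(self, query, candidates):
        if not candidates:
            return []  # short-circuit before touching the backend
        raw = await self.backend.predict([(query, c) for c in candidates])
        if all(0.0 <= s <= 1.0 for s in raw):
            distinct = sorted(set(raw))
            if len(distinct) == 1:
                return [1.0] * len(raw)  # assumed value for all-equal scores
            top = len(distinct) - 1
            return [distinct.index(s) / top for s in raw]
        # sigmoid(x) written as 0.5 * (1 + tanh(x / 2)) to avoid overflow
        return [0.5 * (1.0 + math.tanh(0.5 * s)) for s in raw]


def test_empty_candidates_skip_predict():
    backend = AsyncMock()
    result = asyncio.run(SketchReranker(backend).rerank("q", []))
    assert result == []
    backend.predict.assert_not_awaited()


def test_tied_scores_share_a_value():
    backend = AsyncMock()
    backend.predict.return_value = [0.9, 0.9, 0.1]
    result = asyncio.run(SketchReranker(backend).rerank("q", ["a", "b", "c"]))
    assert result == [1.0, 1.0, 0.0]
    backend.predict.assert_awaited_once()


def test_logits_use_sigmoid():
    backend = AsyncMock()
    backend.predict.return_value = [2.5, -2.5]
    high, low = asyncio.run(SketchReranker(backend).rerank("q", ["a", "b"]))
    assert high > 0.5 > low
    assert abs(high + low - 1.0) < 1e-12  # sigmoid(x) + sigmoid(-x) == 1
```

Because the backend is an `AsyncMock`, the scores are fully controlled by the test and no network call is made.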
