Add threshold (fuzzy match) parameters in match_text #460

Open
sc-stbt wants to merge 2 commits into stb-tester:main from sc-stbt:master

Conversation

sc-stbt commented Oct 15, 2017

I added a fuzzy_match based on difflib so that match_text tolerates small mismatches, especially with special characters.
I also added unidecode to the _hocr_find_phrase function.
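
For illustration, a minimal sketch of a difflib-based threshold match of the kind described above. The function name `fuzzy_match`, its signature, and the default threshold of 0.8 are assumptions made for this example, not the code in this pull request.

```python
import difflib


def fuzzy_match(expected, actual, threshold=0.8):
    """Return True if `actual` is similar enough to `expected`.

    Hypothetical helper: the name, signature and default threshold are
    assumptions for illustration only.
    """
    # SequenceMatcher.ratio() returns a similarity score between 0.0 and 1.0.
    ratio = difflib.SequenceMatcher(
        None, expected.lower(), actual.lower()).ratio()
    return ratio >= threshold


# OCR misreads "ç" as "5"; the remaining characters still match.
print(fuzzy_match("français", "fran5ais"))
# An unrelated word falls well below the threshold.
print(fuzzy_match("français", "settings"))
```

A threshold of 1.0 would require an exact (case-insensitive) match, so a caller could keep the existing strict behaviour by default and relax it only where OCR is known to be unreliable.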

Youssef TRIKI added 2 commits October 15, 2017 19:40
Sometimes match_text did not find the text, so we added a threshold to make it more flexible.
This is reproducible with special characters. It is based on fuzzy matching.
I added the unidecode lib; there may be some impact.
@drothlis
Contributor

_stbt/core.py:2895:4: [E0401(import-error), _hocr_find_phrase] Unable to import 'unidecode'

The unit tests fail because you've added a new dependency unidecode.

What problem does this new dependency solve? Would it be possible to add the fuzzy_match functionality without adding a new dependency?

@sc-stbt
Author

sc-stbt commented Nov 6, 2017

Sorry for the delay, I was busy with other urgent tasks.
We used the unidecode lib to avoid failures caused by some special characters (for instance, ç will be read as 5). To move ahead with our tests, we passed the unidecoded word as the text to match. But of course we can drop this lib.
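
As a rough sketch of how accents could be removed without a new dependency, the standard library's `unicodedata` module can decompose characters and drop the combining marks. The helper name `strip_accents` is hypothetical, and unlike unidecode this only handles characters that have a Unicode decomposition (it leaves letters such as "ø" or "ß" unchanged).

```python
import unicodedata


def strip_accents(text):
    """Remove accents using only the standard library.

    Hypothetical helper for illustration; it is not part of stb-tester.
    """
    # NFKD splits "ç" into "c" followed by a combining cedilla (U+0327).
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep base characters and drop the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("français"))
```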

@drothlis drothlis changed the title Add threshold parameters in match_text Add threshold (fuzzy match) parameters in match_text Jun 19, 2018