Is there a way to access confidence level? #87

Closed
1a1a11a opened this issue Mar 3, 2016 · 15 comments

Comments

@1a1a11a

1a1a11a commented Mar 3, 2016

We are running it on some documents and we need absolute accuracy, so we are using human proofreading. Is there any way to access the confidence level, so that we only have to examine the results with low confidence?

@zuphilip
Collaborator

zuphilip commented Mar 3, 2016

I tried to adapt some work built on @danvk for this purpose, i.e. to show probabilities for each letter. It is really just a small hack from my side, but you can have a look at the branch https://github.com/zuphilip/ocropy/commits/probabilities

@Sandy4321

Does it run fast? Why not use a fast framework like Theano? It has LSTM too.

@1a1a11a
Author

1a1a11a commented Mar 3, 2016

Thank you, @zuphilip, I am checking that! It is great! By the way, do you know whether the result is the same every time? I mean, if I use the same model on the same image, is it possible to get different results?

@1a1a11a
Author

1a1a11a commented Mar 3, 2016

@Sandy4321 It's great to know about Theano, but it seems to be just a mathematics library. How does that relate to this?

@Sandy4321

It is a deep learning framework.

@1a1a11a
Author

1a1a11a commented Mar 4, 2016

@zuphilip I just tried your version, and it is so, so nice! But I have trouble interpreting the probability numbers, for example the ones listed below. What do the first and second numbers mean in each line? BTW, it is great to see your work; it's really amazing and useful.

4    21.0    0.998147512334
1    18.0    0.998673532294
6    23.0    0.933822265371
     1.0     0.985126139538
W    56.0    0.995752034066
     0.0     0.669438453921
l    77.0    0.771479722076
l    77.0    0.826818186054
l    77.0    0.862084065346
a    66.0    0.992835549207
m    78.0    0.995445149948
s    84.0    0.982924709589
     1.0     0.986920309815
J    43.0    0.917265339312
     1.0     0.991153504035
W    56.0    0.665979106147
     0.0     0.343662815981
     1.0     0.993414011944
(    9.0     0.988318526414
e    70.0    0.664349017539
)    10.0    0.994918996523
     1.0     0.993739754716
g    72.0    0.981143592017
r    83.0    0.997430796532
o    80.0    0.989413900792
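
A listing like the one above can be screened for low-confidence characters with a few lines of Python, which is what the original question was after. This is only a minimal sketch: it assumes the three whitespace-separated columns shown here (character, index, probability), that the character column may be empty, and that the last column can be read as a per-character confidence. The function names and the 0.9 threshold are made up for illustration and are not part of ocropy.

```python
def parse_line(line):
    """Split one 'char index probability' line; the char column may be blank."""
    fields = line.split()
    if len(fields) == 3:
        char, index, prob = fields
    elif len(fields) == 2:
        # The character column is empty or a space, so only two fields remain.
        char = ""
        index, prob = fields
    else:
        raise ValueError("unexpected line: %r" % line)
    return char, int(float(index)), float(prob)


def low_confidence(lines, threshold=0.9):
    """Return (position, char, probability) for rows below the threshold."""
    flagged = []
    for pos, line in enumerate(lines):
        if not line.strip():
            continue  # skip empty lines
        char, _index, prob = parse_line(line)
        if prob < threshold:
            flagged.append((pos, char, prob))
    return flagged


# A few rows copied from the listing in this thread.
sample = """\
W   56.0    0.995752034066
    0.0 0.669438453921
l   77.0    0.771479722076
a   66.0    0.992835549207
""".splitlines()

for pos, char, prob in low_confidence(sample):
    print(pos, repr(char), prob)
```

Only the flagged positions would then need to go to a human proofreader; a sensible threshold has to be found by experiment on your own documents.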

@zuphilip
Collaborator

zuphilip commented Mar 4, 2016

I am no expert on this subject and I just tried out some code. The first number looks to be some index representing the letter; for example, all W are encoded as 56. The first column is calculated from the second column by the code l = network.l2s([p]), but I haven't looked further into what this means. Therefore, the really interesting thing is the third column. I have no clue how it is calculated. There are some older issues on this, see #25, #47. Maybe that helps.
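
To illustrate the index-to-letter relationship described above, here is a toy lookup built only from the index/character pairs visible in the listing in this thread. It is a hypothetical stand-in and not taken from the ocropy source; what network.l2s actually does has not been checked here. Treating index 0 as "no character" and index 1 as a space is an assumption based on those rows having an empty first column.

```python
# Index/character pairs copied from the listing in this thread.
# Assumption: 0 means "no character" and 1 is a space (both print as blank).
OBSERVED = {
    0: "", 1: " ", 9: "(", 10: ")",
    18: "1", 21: "4", 23: "6",
    43: "J", 56: "W",
    66: "a", 70: "e", 72: "g", 77: "l", 78: "m", 80: "o", 83: "r", 84: "s",
}


def labels_to_string(labels):
    """Map a sequence of class indices to text; unknown indices become '?'."""
    return "".join(OBSERVED.get(int(label), "?") for label in labels)


# Rows 5 to 12 of the listing: W, (blank), l, l, l, a, m, s
print(labels_to_string([56.0, 0.0, 77.0, 77.0, 77.0, 66.0, 78.0, 84.0]))
```

The real model's table covers its full character set; this one only contains the entries that happen to appear in the sample output.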

@Sandy4321

Could you please share the code so I can try it?

@1a1a11a
Author

1a1a11a commented Mar 4, 2016

You really helped a lot, thanks so much @zuphilip

@1a1a11a
Author

1a1a11a commented Mar 4, 2016

Whose code are you asking about, @Sandy4321? zuphilip has his code in his repository, and he posted the URL above.

1a1a11a closed this as completed Mar 8, 2016
@Sandy4321

Super, thanks a lot.
Which project do you mean from
https://github.com/zuphilip?tab=repositories


@zuphilip
Collaborator

zuphilip commented Mar 8, 2016

Same project: ocropy, different fork, different branch:
https://github.com/zuphilip/ocropy/tree/probabilities

@kba
Collaborator

kba commented May 24, 2016

We should put this in the wiki and/or merge it.

@1a1a11a
Author

1a1a11a commented May 24, 2016

Yes, but it seems this repository is no longer maintained?

@zuphilip
Collaborator

zuphilip commented Nov 6, 2016

An improved version is now merged. Example usage:

ocropus-rpred tests/1175-01002b.png --probabilities

zuphilip closed this as completed Nov 6, 2016