FRDB Archives

Freethought & Rationalism Archive

Old 05-26-2013, 09:19 AM   #1
Veteran Member
 
Join Date: Jan 2007
Location: Mondcivitan Republic
Posts: 2,550
Default Seeking Proof Reader for Latin Irenaeus OCR

Hey,

I finally got fed up with the fact that an online version of the Latin text of Irenaeus' Against Heresies is not available on the Web.

Maybe one can find a Logos module for $200 (and copyrighted), but not on the web.

One can find Unicode text of the surviving Greek fragments, which was scanned from J P Migne's PG edition (volume 7), just about everywhere on the web (I can locate at least two sites).

What is funny about that is that the Latin text was in the very same volume as the Greek. Despite the fact that manuscripts of the Latin translation are the only complete or nearly complete ones to have survived, nobody has thought it important enough to OCR a critical Latin text for general distribution.

I think that this situation betrays a bias against the writer. Since his work has not survived in Greek except for fragments (including the entire first book), he is, apparently, unworthy of serious study by amateur "dilettantes" (I am being facetious, of course).

Well, in any event, I have opened copies of the PDFs of W Wigan Harvey's edition of 1857 in ABBYY FineReader and OCR'd the Latin text only.

While the crappy quality of the original printed text, and the problems that ABBYY has with reading Latin ligatures (ae & oe, which to make it worse are identical in italics), and the letters e & s, made for some serious scanning errors, I was able to eradicate the greater bulk of them. However, I am quite confident that there are more.

Unfortunately I do NOT read Latin and would like someone with a background in the Classics, or in Latin in general, to proofread the resulting text. The PDFs of the original volumes are available online (or I can send them to you along with the Word file of the OCR results).

So far I've worked through Vol. I (Books I & II), and am starting on Vol. II.

Let me know if anyone is game.

Thanks!

DCH
DCHindley is offline  
Old 05-26-2013, 10:04 AM   #2
Veteran Member
 
Join Date: Jun 2010
Location: seattle, wa
Posts: 9,337
Default

Smart thinking. I've often wondered about this. I am busy dealing with other things but there's someone in Toronto I know.
stephan huller is offline  
Old 05-26-2013, 11:16 AM   #3
Veteran Member
 
Join Date: Nov 2005
Location: Texas
Posts: 3,884
Default

Roger Pearse has a website where he tries to collect such things: The Tertullian Project. You might ask there if he knows anyone who might be willing to help.
http://www.tertullian.org/

Cheerful Charlie
Cheerful Charlie is offline  
Old 05-26-2013, 12:28 PM   #4
Veteran Member
 
Join Date: Jan 2007
Location: Mondcivitan Republic
Posts: 2,550
Default

Cheerful,

Yes, Roger has done much the same thing with several of the texts he has placed online. I'm pretty sure he will eventually see my message.

Since he has his own circles that he hangs with, I am hoping he might spread the word.

DCH

DCHindley is offline  
Old 05-27-2013, 07:30 AM   #5
Veteran Member
 
Join Date: Apr 2002
Location: N/A
Posts: 4,370
Default

Quote:
Originally Posted by DCHindley View Post
Yes, Roger has done much the same thing with several of the texts he has placed online. I'm pretty sure he will eventually see my message.
You are kind to think of me. But I cannot help -- sorry. I just don't seem to have the energy any more for what is, as you have discovered, back-breaking labour. Fifteen years ago I would have jumped at this project. Even ten years ago, perhaps even five. I couldn't do it now. Indeed I find that I have not had enough energy to finish my own last OCR project (an English translation of Theodoret's Commentary on Romans, which I discovered, forgotten, in an 1840 journal, a couple of years ago; maybe one day I will at least get that done!)

Now I always did my own OCR, so I don't know people who could help; unless, perhaps Gutenberg are still in the market for texts that can be proof-read? They have a system of proof-readers.

Latin is quite horrible to OCR anyway. I always wanted to make a spelling checker for it, but never got around to it. Never will now, I know. But when I did the Latin text for the Eusebius book, I was so tired at the time that I made an awful hash of it, which the translator, thankfully, caught.

One suggestion. You will find the Sources Chrétiennes Latin text far easier to OCR in FineReader than any 19th century text. Something about the font. The main error I recall was treating "-um" as "-urn".
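
A minimal sketch, in Python, of how that "-um" / "-urn" confusion might be flagged automatically. The tiny word list, the function name flag_urn_endings, and the sample sentence are all assumptions made for illustration; none of this comes from an existing tool, and a real check would need a full Latin word list.

```python
import re

# Hypothetical mini-lexicon for illustration only; a real check would
# load a complete Latin word list from a file.
KNOWN_WORDS = {"verbum", "deum", "principium", "apud", "erat", "in", "et"}


def flag_urn_endings(text, known_words=KNOWN_WORDS):
    """Return (original, suggestion) pairs for words ending in 'urn'
    that are not in the word list but whose 'um' form is."""
    suggestions = []
    for word in re.findall(r"[A-Za-z]+", text):
        lower = word.lower()
        if lower.endswith("urn") and lower not in known_words:
            candidate = lower[:-3] + "um"
            if candidate in known_words:
                # Keep the original capitalisation of the stem.
                suggestions.append((word, word[:-3] + "um"))
    return suggestions


# Sample input with two deliberately planted OCR-style errors.
sample = "In principio erat verburn, et verbum erat apud Deurn"
for original, suggestion in flag_urn_endings(sample):
    print(f"{original} -> {suggestion}")
```

The sketch only suggests a change when the corrected form is in the list, so a word that really does end in "urn" is left alone. The same pattern could be extended to other confusions (the ae/oe ligatures, e and s), but each would need its own rule.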

One other thought. The Latin Irenaeus comes with ancient indexes, prefaced to each book. Do include these; for the odds that the ancient book went forth into the world equipped with them are really quite good.

It is, I agree, a very nice idea to do this project. It ought to be online.

All the best,

Roger Pearse
Roger Pearse is offline  
Old 05-27-2013, 07:37 AM   #6
Veteran Member
 
Join Date: Jun 2010
Location: seattle, wa
Posts: 9,337
Default

Book One, as we have it, I believe is almost entirely made up of Epiphanius's text in Panarion Book One. I might suggest the following. I don't know if it is any faster but it is what I do. Go to archive.org. Get the Latin text. Then go to Google and enter in small Latin 'bites' and see the results in Google. Copy the results. Paste it in your document. Go back to your source in archive. Go back to Google. Do the next chunk. Copy and paste.

I find this method avoids many clerical errors because if you make the error in Google no results show up and what comes from Google is generally pristine.
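
The "small bites" step could be sketched roughly as follows in Python. The function name latin_bites, the default of six words per bite, and the choice to wrap each bite in quotation marks (so a search engine treats it as an exact phrase) are assumptions for illustration; the searching and copying stay manual, as described above.

```python
def latin_bites(text, words_per_bite=6):
    """Split text into consecutive groups of words, each wrapped in
    double quotes so it can be pasted into a search box as a phrase."""
    words = text.split()
    return [
        '"' + " ".join(words[i:i + words_per_bite]) + '"'
        for i in range(0, len(words), words_per_bite)
    ]


# Sample input only; any run of OCR'd Latin would do.
sample = "In principio erat verbum et verbum erat apud Deum"
for bite in latin_bites(sample, words_per_bite=4):
    print(bite)
```

Shorter bites make it more likely that a single OCR error spoils only one search rather than a whole sentence.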
stephan huller is offline  
Old 05-27-2013, 11:46 AM   #7
Veteran Member
 
Join Date: Nov 2005
Location: United Kingdom
Posts: 3,619
Default

Quote:
Originally Posted by Roger Pearse View Post
You are kind to think of me. But I cannot help -- sorry. I just don't seem to have the energy any more for what is, as you have discovered, back-breaking labour. Fifteen years ago I would have jumped at this project. Even ten years ago, perhaps even five. I couldn't do it now. Indeed I find that I have not had enough energy to finish my own last OCR project (an English translation of Theodoret's Commentary on Romans, which I discovered, forgotten, in an 1840 journal, a couple of years ago; maybe one day I will at least get that done!)
I had always thought of you as a young man in the prime of life; I still do, notwithstanding your post!
Iskander is offline  
Old 05-27-2013, 12:50 PM   #8
Veteran Member
 
Join Date: Apr 2002
Location: N/A
Posts: 4,370
Default

Quote:
Originally Posted by Iskander View Post
Quote:
Originally Posted by Roger Pearse View Post
You are kind to think of me. But I cannot help -- sorry. I just don't seem to have the energy any more for what is, as you have discovered, back-breaking labour. Fifteen years ago I would have jumped at this project. Even ten years ago, perhaps even five. I couldn't do it now. ...
I had always thought of you as a young man in the prime of life; I still do, notwithstanding your post!
You're very kind. Of course, when I started working on the website, I *was* a young man! But that is now more than fifteen years ago...

On a more positive note, I have been asked to contribute a paper to an academic volume on manuscript studies, to be published by a Very Serious university press. Never written an academic paper in my life. Don't possess any relevant qualifications. Nothing controversial, tho. Hadn't better say anything about content until submitted. Invitation came after the volume editor kept finding repeatedly that a web-page of mine, and blog posts by me, were the best available information on the subject. FRDB is partly responsible. Due by September. Wish me well... Very nervous!

All the best,

Roger Pearse
Roger Pearse is offline  
Old 05-27-2013, 01:13 PM   #9
Veteran Member
 
Join Date: Nov 2005
Location: United Kingdom
Posts: 3,619
Default

I will ask my friends to pray for your success; I know you will do very well.


Reading your posts has taught me a great deal. Thank you.
Iskander is offline  
Old 05-28-2013, 04:19 AM   #10
Veteran Member
 
Join Date: Apr 2002
Location: N/A
Posts: 4,370
Default

Thank you. And I am glad to have helped anyone, at least a little.
Roger Pearse is offline  
 
