Thank you for your interest in participating.
I will not assume you know Norwegian, but of course most of you do.
Knowing Norwegian is not necessary to help out. I can understand
Norwegian but my writing is still "not so good". You can correspond
with me in either English or Norwegian.
To give you a little background...
NSF has issued 2 "hefter" (booklets) per year for the last 70 years.
4 hefter then make up a "bind" (volume). So there have been about 35
binds produced.
Individual indexes have been published for the 35 volumes (but in
some cases an index was created for 2 binds).
I have already OCR'ed and corrected volumes 1 through 10, and 34 and
35. I am looking for help with the remaining volumes.
I wrote a few programs to merge 2 volumes which went pretty well, and
if it worked for 2, well.... with a bit more work, I can get the 35
volumes merged.
This is what I would ask you to help with:
To take my OCR'ed output and compare it to the "original". You will
be provided with a plain "text" file of the unedited OCR output. You
can edit it in a plain text editor like "notepad". The "original" can
be a graphical image of the page, or I can send you a hardcopy.
The graphical image will be in TIFF format, which can be read by most
image viewers. The image viewer/editor Paint Shop Pro (shareware) is
what I use. You can also try IrfanView, an excellent graphics viewer
which is free. I put the text editor and the image viewer side-by-side
on my screen and do the proofreading that way. I must admit that if
you prefer a hardcopy, the text is clear but a little small. You
can find the download for Paint Shop Pro (32 bit and 16 bit versions
are available) if you go back to my homepage and click on "Computers,
Internet". You may want to try some of the text editors noted.
There is a sample TIFF file, unedited text file and edited text file
at
http://home.nc.rr.com/risholmr/downloads
The most important things to note are:
- If a person's information "wraps" to a second line, I edit it to be
on one line. This way each line has a complete set of information for
that one person.
- Punctuation (commas and periods) is important.
- There should always be a "space" after a comma or a period, except
for the period at the end of the line.
- If in doubt about a character, or you just can't read it, just
put a question mark (?) there and I will try to resolve it.
- Do not be concerned that some of the original characters are in
BOLD or ITALICS - just type the character.
- Some of the indexes were in a format where individual names were
not started on a new line. I have pre-done some work for you to put
them on separate lines.
- Watch out for OCR errors which are harder to spot, such as the
letter "l" in place of the numeral 1 and vice-versa.
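
As a rough aid, some of the rules above can be checked mechanically
before an edited file is sent back. The Python sketch below is only an
illustration and not one of the programs mentioned earlier. It assumes
that every finished entry ends with a period, and the function name
check_line and the sample entries are made up for the example. It will
also flag some correct lines (for example "12.5"), so treat its output
as hints, not as errors.

```python
import re

# A comma or period followed directly by a non-space character.
# A period at the very end of the line has nothing after it, so it passes.
MISSING_SPACE = re.compile(r"[,.](?=\S)")

# Letter "l" touching a digit, or digit "1" touching a letter:
# both are typical OCR mix-ups worth a second look.
L_ONE_MIXUP = re.compile(r"(?<=\d)l|l(?=\d)|(?<=[^\W\d_])1|1(?=[^\W\d_])")


def check_line(line):
    """Return a list of warnings for one line of the index text."""
    text = line.rstrip()
    warnings = []
    if not text:
        return warnings  # blank lines are ignored
    if MISSING_SPACE.search(text):
        warnings.append("missing space after comma or period")
    if L_ONE_MIXUP.search(text):
        warnings.append("possible l/1 mix-up")
    # Assumption: a complete entry ends with a period, so a line without
    # one may be an entry that still wraps onto the next line.
    if not text.endswith("."):
        warnings.append("no final period (wrapped entry?)")
    return warnings


if __name__ == "__main__":
    # Hypothetical entries, not taken from the real index.
    samples = [
        "Hansen, Ole, 12, 345.",
        "Hansen,Ole, l2, 345.",
        "Hansen, Ole, 12,",
    ]
    for number, entry in enumerate(samples, start=1):
        for warning in check_line(entry):
            print(f"line {number}: {warning}")
```

A check like this does not replace comparing the text against the
page image. It only points at lines that deserve a second look.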
The Norwegian characters ÆØÅæøå
can be difficult if you are using the standard English keyboard.
These characters can be found by running a program under Windows
called "charmap". You will also see some other characters like
ü which you may need to find using charmap.
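
If it helps to locate these characters in charmap, their Unicode code
points and official names can be listed with a few lines of Python.
This is just a convenience sketch, and the helper name describe is made
up for the example. It relies only on Python's standard unicodedata
module.

```python
import unicodedata


def describe(chars):
    """Return (character, code point, Unicode name) for each character."""
    return [(c, f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]


if __name__ == "__main__":
    # The six Norwegian letters plus the u-umlaut mentioned above.
    for char, code, name in describe("ÆØÅæøåü"):
        print(char, code, name)
```

For example, Æ is listed as U+00C6 and å as U+00E5.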
After reading this please get back to me with whether you can use the
graphical type output, in which case I will put TIFF files in the
same directory as the sample which you can then "download" as you did
the sample. The unedited text will be sent to you via email. You then
just need to send me back the edited text.
Your efforts will be much appreciated.
Please get back to me with your decision on how you would like the
"original", graphics file or hardcopy. If you can take the image
files in a large "zipped" email (about 1 meg) I can send it to you
that way, or I will put the individual "pages" (about 80k each) in the
same directory as the samples.
Please send me your address information so that I can send you your
copy of the book when it is published.
Thank You
Robert Anders Risholm
risholmr@yahoo.com