Read the instructions below or watch the tutorial.
We are making a semi-diplomatic transcription of the documents. A diplomatic transcription copies everything exactly as it is, including the size and layout of words on the page; a semi-diplomatic transcription is less strict and makes some changes to improve clarity and readability.
You have two tasks:
- correcting the already automatically generated transcription;
- tagging some words.
The following instructions help all of us transcribers to create consistent and valuable transcriptions. Please take 5 minutes to read them!
These instructions cover most issues you will encounter, but if you have questions not covered here, post them in the Discussion Forum or send an email to firstname.lastname@example.org
Consider only the text of the main page; ignore the bits of the adjacent page that you may see in the image.
Do not consider any page numbers or catalogue letters or numbers.
Transkribus automatically divides the images into regions and lines (layout analysis), but the recognition is not always perfect. Delete any region or line that does not belong to the main text (e.g. bits of the adjacent page, archival notes, page numbers…) by selecting it and clicking the red “Remove a shape” button.
Sometimes the program creates two lines where only one is needed (this happens especially with superscript abbreviations). To merge them, hold down the “CTRL” key on your keyboard and click on both lines, then click the “Merges the selected shapes” button in the menu.
The transcription represents exactly what is on the page: correct the automatically generated text only when it doesn’t match the manuscript.
Preserve punctuation, grammar, word order, Arabic and Roman numerals.
Preserve original text spelling, except when you find:
- u/v and U/V: modernise the spelling as in today’s Italian (for example ‘uero’ is transcribed as ‘vero’)
- i/j: the letter ‘j’ is always transcribed as the letter ‘i’ (for example ‘discorsij’ is transcribed as ‘discorsii’; ‘Cauallj’ as ‘Cavalli’)
- long s: it looks like a lowercase ‘f’; transcribe it as a lowercase ‘s’ (for example, ‘perſone’ is transcribed as ‘persone’)
- if ‘che’ or ‘cħ’ appears at the beginning of a sentence, capitalise the letter ‘c’ even if it looks lowercase. In newsletters, the conjunction ‘che’ marks the beginning of a new news item.
- two separate syntagmas which in today’s Italian are read together: join the two syntagmas (for example, ‘gatti pardi’ is transcribed as ‘gattipardi’)
- accents: accents are used according to modern Italian practice.
Except for the cases above, do not modify the spelling or punctuation, even if a word seems to you misspelled or the punctuation seems wrong.
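The two purely mechanical rules above (long s and j) can be sketched in a few lines of Python. This is only an illustration, not part of the project workflow: the function name `normalise_mechanical` is hypothetical, and the sketch deliberately leaves out u/v, word joining, accents and capitalisation, which depend on context and on modern Italian usage. It also handles lowercase ‘j’ only.

```python
# Hypothetical helper: applies only the two context-free spelling rules.
LONG_S = "\u017f"  # ſ, LATIN SMALL LETTER LONG S

# Translation table: long s becomes 's', lowercase 'j' becomes 'i'.
_TABLE = str.maketrans({LONG_S: "s", "j": "i"})


def normalise_mechanical(text: str) -> str:
    """Replace long s with 's' and 'j' with 'i'; leave everything else untouched."""
    return text.translate(_TABLE)


print(normalise_mechanical("per\u017fone"))  # persone
print(normalise_mechanical("discorsij"))     # discorsii
```

Note that ‘Cauallj’ would come out as ‘Caualli’: the u/v change to ‘Cavalli’ still has to be made by the transcriber.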
Capture the formatting only when the word is underlined or struck through: select the letters or words and mark them up with the “Tag as underlined” / “Tag as strikethrough” button in the Text Editor field.
Do not format the text in bold or italic. Do not try to reproduce the formatting of superscript and subscript letters: transcribe them as standard letters without any styling, even within an abbreviation (for example, ‘Ambasre’ is transcribed as ‘Ambasre’).
In the text, abbreviations are transcribed as you see them.
Superscript or subscript letters within an abbreviation are transcribed as standard letters.
To indicate a generic abbreviation sign within or at the end of an abbreviated word, use the ⁀ symbol (Character Tie, Unicode number U+2040). You can find it in Transkribus’ virtual keyboard, under the “General Punctuation” tab.
(for example, is transcribed as ‘Per l⁀re di Roma’).
The special characters used within abbreviations are:
|⁊||U+204A||TIRONIAN SIGN ET|
|▽||U+25BD||WHITE DOWN-POINTING TRIANGLE|
|ā||U+0101||LATIN SMALL LETTER A WITH MACRON|
|Ā||U+0100||LATIN CAPITAL LETTER A WITH MACRON|
|đ||U+0111||LATIN SMALL LETTER D WITH STROKE|
|ē||U+0113||LATIN SMALL LETTER E WITH MACRON|
|Ē||U+0112||LATIN CAPITAL LETTER E WITH MACRON|
|ħ||U+0127||LATIN SMALL LETTER H WITH STROKE|
|ī||U+012B||LATIN SMALL LETTER I WITH MACRON|
|Ī||U+012A||LATIN CAPITAL LETTER I WITH MACRON|
|ō||U+014D||LATIN SMALL LETTER O WITH MACRON|
|Ō||U+014C||LATIN CAPITAL LETTER O WITH MACRON|
|ꝑ||U+A751||LATIN SMALL LETTER P WITH STROKE THROUGH DESCENDER|
|Ꝑ||U+A750||LATIN CAPITAL LETTER P WITH STROKE THROUGH DESCENDER|
|ꝓ||U+A753||LATIN SMALL LETTER P WITH FLOURISH|
|ꝗ||U+A757||LATIN SMALL LETTER Q WITH STROKE THROUGH DESCENDER|
|ū||U+016B||LATIN SMALL LETTER U WITH MACRON|
|Ū||U+016A||LATIN CAPITAL LETTER U WITH MACRON|
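If you are unsure whether a character you typed or pasted is one of those listed above, its code point and official Unicode name can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe` is made up for the example.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# A few characters from the table, written as escapes to avoid copy-paste issues.
for ch in ("\u204a", "\u25bd", "\u0127", "\ua751", "\u2040"):
    print(ch, describe(ch))
```

The output can be compared line by line with the table; the last entry is the Character Tie used as the generic abbreviation sign.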
Abbreviations are expanded in tags, not within the text.
If you are using Transkribus on your computer, the tagging interface can be found by clicking the “Metadata” tab, and then the “Textual” tab. Select the abbreviation in the text, click on the green + button of the “abbrev” tag and type the expansion in the “Tag Specification” section of the window.
If you are using the browser version, Transkribus Lite, tick the “Annotation” box, then select the abbreviation, right-click to add the “Abbrev” tag and type the expansion in the window that appears in the bottom-left corner.
Many abbreviations have already been automatically tagged and expanded: your main task will be to check their correctness, both for the transcription and the expansion, and to add the missing ones.
Here you can find a list of the most common abbreviations found in Bartoli’s documents: use it if you have any doubts about how to transcribe or expand an abbreviation.
If you are not sure how to expand an abbreviation, just tag it without writing its expansion.
When there is an Arabic or Roman numeral with a line above, transcribe it as it appears, using the ̅ symbol (Combining Overline, Unicode number U+0305). Then, tag it and type the numeral without the overline as expansion.
For example: is transcribed as “1̅4̅0̅ galee”. 1̅4̅0̅ is tagged as “Abbrev” and expanded as “140”.
Another frequent case is a numeral with a line and the letter ‘m’ above. The letter ‘m’ stands for “mila” (thousand).
is transcribed as “4̅m fanti”. 4̅ is tagged as “Abbrev” and expanded as “4”. The letter “m” is tagged as “Abbrev” and expanded as “mila”.
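The relationship between an overlined numeral and its expansion is mechanical: the transcription places a Combining Overline (U+0305) after each digit, and the expansion is the same string with those marks removed. A small Python sketch, with hypothetical function names `add_overline` and `expansion`, shows this; it is an illustration, not a Transkribus feature.

```python
OVERLINE = "\u0305"  # COMBINING OVERLINE


def add_overline(digits: str) -> str:
    """Put a combining overline after every character, e.g. '140' with a line above."""
    return "".join(ch + OVERLINE for ch in digits)


def expansion(transcribed: str) -> str:
    """Strip the combining overlines to obtain the text typed as the tag expansion."""
    return transcribed.replace(OVERLINE, "")


marked = add_overline("140")
print(marked + " galee")   # the numeral as it is transcribed
print(expansion(marked))   # 140
```

Because the overline is a combining character, it must be typed after the digit it sits on, not before it.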
When text has been inserted over a line or written in the margin, but should be read as part of a sentence, bring it down into the original text and type it in the order you would read it aloud.
- UNSURE TEXT
Don’t leave notes in the text: if you are unsure, leave the text as it is or post a question in the Discussion Forum.
If you cannot read a word or are unsure about a sentence, that’s ok! The project coordinator will review all the transcriptions before using them for the digital edition.
- Last but not least: DON’T FORGET TO SAVE YOUR WORK BEFORE YOU LEAVE THE DOCUMENT!
Within Transkribus, while you are working on the page, save it as “in progress”.
After correcting and tagging it, save it as “done” / “ready for review”. Please don’t use “GT/Ground Truth”, because it indicates that the page has been reviewed by the project coordinator and is ready for the digital edition.
Remember also to change the Status in the spreadsheet: to “In Progress” once you have started and to “Complete” once you have finished.