Category:Polyglotta:Documentation:ImportManual

From hf/dmlf
Revision as of 21 Sep. 2011, 11:53, by Fredrili@uio.no (Abbreviations:)


Input formats

Abbreviations:

bm: bookmark;
cn: critical note;
cs: container style;
ct: continuous text;
[o: order changed; the o is followed, after :: inside the double parentheses, by a text reference giving the place from which a line has been moved;]
rfn: reference note;
rn: realia note;
ss: sentence by sentence;
double parentheses: [(...)];


There are two main display modes for text in the BP: sentence by sentence (ss) and continuous text (ct). Some of the formats are relevant only to the latter. Texts are produced in these formats with the line as the basic textual unit, called a sentence. The sentence is defined in the text of the original language, from which the translations are made. The translation sentences, held in another file, are aligned line by line with the file of the original language. Apart from line breaks and html tags, all format indicators are entered only in the file of the original language. The allowed html tags are <i>...</i>, <b>...</b> and <br />. The <br /> tag is usually used for verses, and in other cases where a sentence should fill more than one line in ss or ct while still counting as one line in the input format. The following characters are not allowed in the input files and have to be replaced as follows:


< less than &lt; &#60;
> greater than &gt; &#62;
& ampersand &amp; &#38;
¢ cent &cent; &#162;
£ pound &pound; &#163;
¥ yen &yen; &#165;
€ euro &euro; &#8364;
§ section &sect; &#167;
© copyright &copy; &#169;
® registered trademark &reg; &#174;
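
The replacement table above can be applied mechanically before import. The following is a minimal Python sketch, not part of the BP tooling: the names ENTITIES, escape_text and escape_line are hypothetical, and it assumes that only the three allowed tags and already-written entities should be left untouched. Note that under this assumption any other markup, including a <div> wrapper, would be escaped as well.

```python
import re

# Replacement table from the list above (character -> named entity).
ENTITIES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "¢": "&cent;",
    "£": "&pound;",
    "¥": "&yen;",
    "€": "&euro;",
    "§": "&sect;",
    "©": "&copy;",
    "®": "&reg;",
}

# The only tags the manual allows in input files.
ALLOWED_TAG = re.compile(r"(</?i>|</?b>|<br />)")
# An ampersand that already starts an entity (&lt; or &#60;) is left alone.
EXISTING_ENTITY = re.compile(r"&(?:[A-Za-z]+|#\d+);")


def escape_text(chunk):
    """Escape forbidden characters in a chunk that contains no allowed tag."""
    out = []
    i = 0
    while i < len(chunk):
        match = EXISTING_ENTITY.match(chunk, i)
        if match:
            # Copy an existing entity verbatim to avoid double escaping.
            out.append(match.group(0))
            i = match.end()
            continue
        out.append(ENTITIES.get(chunk[i], chunk[i]))
        i += 1
    return "".join(out)


def escape_line(line):
    """Escape one input line while keeping <i>, <b> and <br /> intact."""
    parts = ALLOWED_TAG.split(line)
    # With one capturing group, odd indices hold the matched tags.
    return "".join(
        part if index % 2 else escape_text(part)
        for index, part in enumerate(parts)
    )
```

For example, escape_line("<i>A & B</i> < 5 £") keeps the italic tags and replaces the ampersand, the less-than sign and the pound sign with their entities.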


It is important to have a text editor that can number lines independently in the left margin (such as TextWrangler for Mac, freely available on the internet), so that one can always check that the sentences in each file (language, translation) correspond to the sentences in the file of the original language.
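
The same check can be done with a small script instead of by eye. This is a sketch only: split_lines and check_alignment are hypothetical names, and the rule that blank lines must fall at the same line numbers in both files is an assumption drawn from the statement that line breaks are entered in every file.

```python
def split_lines(text):
    """Split file content into lines, treating \\r\\n, \\n and \\r alike."""
    return text.replace("\r\n", "\r").replace("\n", "\r").split("\r")


def check_alignment(original, translation):
    """Return a list of human-readable alignment problems (empty if none)."""
    orig = split_lines(original)
    trans = split_lines(translation)
    problems = []
    if len(orig) != len(trans):
        problems.append(f"line count differs: {len(orig)} vs {len(trans)}")
    # Assumption: a blank line in one file should be blank in the other too.
    for number, (o, t) in enumerate(zip(orig, trans), start=1):
        if bool(o.strip()) != bool(t.strip()):
            problems.append(f"line {number}: blank in one file only")
    return problems
```

An empty result means the two files have the same number of lines and the same paragraph divisions; it does not show that the content of each line actually corresponds.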


Some test files containing most of the formats: 

Sanskrit with manually entered references and bookmarks
Tibetan with automatically generated references    
English without references


See also the examples at the end of this document.


1. A line break, \r, indicates the division between sentences. In ss it marks the end of a multilingual sentence unit, also called a multiple; in ct it is replaced with a space.

2. A double line break, \r\r, also indicates a division between sentences. In ss it is equivalent to the single line break, marking the end of a multiple; in ct it is represented as a division between paragraphs, with 1.5 line spacing. Every format indicator mentioned below runs until the next \r\r. In the export, \r\r is represented as the tag [(cs :: paragraph)].
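
Points 1 and 2 can be illustrated with a short Python sketch. The function names to_ss and to_ct are hypothetical and the code only models how the two kinds of line break are treated; it ignores format indicators and is not the BP implementation.

```python
def to_ss(text):
    """Sentence-by-sentence view: every non-empty line is one sentence unit."""
    return [line for line in text.split("\r") if line]


def to_ct(text):
    """Continuous-text view: \\r\\r separates paragraphs, \\r becomes a space."""
    return [para.replace("\r", " ") for para in text.split("\r\r") if para]
```

With the input "s1\rs2\r\rs3", to_ss gives three sentences, while to_ct gives two paragraphs, the first being "s1 s2".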

3. [(bm::1::Section 1)] A bookmark usually marks a section, and bookmarks are numbered consecutively from 1. The bookmark is placed with the first line of the section, e.g. [(bookmark::1::Section 1)]. The bookmark is not represented in ss; in ct it appears as a double line break \r\r after the line to which it belongs. The first number in the bookmark is its level; thus, under each bookmark of level 1 there may be bookmarks of lower order or level. After the second :: in the tag comes the title of the bookmark, which is shown on screen as the representation of the bookmark. If the title is left empty, the first few words of the section denoted by the bookmark are shown instead.
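
A bookmark tag of this shape can be read with a regular expression. The sketch below is illustrative: parse_bookmark is a hypothetical helper, it accepts both spellings shown above (bm and bookmark), and the number of words used for an empty title (fallback_words=5) is an assumption, since the manual only says "the first few words".

```python
import re

# Matches [(bm::1::Title)] and [(bookmark :: 1 :: Title)], spaces optional.
BOOKMARK = re.compile(
    r"\[\(\s*(?:bm|bookmark)\s*::\s*(\d+)\s*::\s*(.*?)\s*\)\]"
)


def parse_bookmark(line, fallback_words=5):
    """Return (level, title, remaining text) or None if no bookmark is found."""
    match = BOOKMARK.search(line)
    if match is None:
        return None
    level = int(match.group(1))
    title = match.group(2)
    rest = line[match.end():].strip()
    if not title:
        # Empty title: fall back to the first few words of the section.
        title = " ".join(rest.split()[:fallback_words])
    return level, title, rest
```

For the line "[(bm::1::Section 1)]First line of text" this yields level 1, the title "Section 1" and the remaining text "First line of text".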

4. [(cs :: Maintitle)] Put the main title of a work on the same line after this indicator and make a double line break after to end the range of its influence. It is usually one line, but can have more than one line, either indicated with more lines in the input or a <br /> within the one line.

5. [(cs :: Subtitle)] Put the subtitles of a work on the same line after this indicator and make a double line break after to end the range of its influence. The title is usually one line, but if the title is more than one line, one can indicate that in the input with a <br /> within the one line of the input. If some of the translations contain subtitles not in the original language text, these have to be accommodated by means of the <br /> tag or with the format indicator [(Subtitle)] and no text in the original.

6. [(cs :: verse)] This indicator is put before the first line of any number of verses, with each verse one line. The parts of the verses may be put onto separate lines by <br />. Make a double line break before it, and after the end of the verses to end the range of the indicator’s influence.

Note: The indicators described in points 3-6 above should be put before any manually input reference, e.g. [(bm :: 1 :: Section 9)]IX :: de nas ’jam dpal gźon (Q265b) nur gyur pas

7. Any text-critical note (cn) has the format {word in text}[(cn :: {Ms references}: {variant})], e.g. su źig stsol[(cn :: DJN: stsal)] bar bgyid. Realia notes (rn) have the same format, but with rn instead of cn. In ct, cn are presented as running text under the paragraph to which the locus pertains. These notes can also be accessed through a pop-up and by clicking. rn are treated as end notes, accessible by clicking. A reference note (rfn) is a note attached to a sentence where one wishes to refer to another sentence in BP, either in the same text or in another text. Reference notes have to be finalized after the input.
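
To see how such notes separate from the running text, here is a small Python sketch. Note and extract_notes are hypothetical names, only cn and rn are handled (the exact form of rfn is not specified above), and taking the last whitespace-delimited word before the tag as the locus is an assumption.

```python
import re
from typing import NamedTuple

# Matches [(cn :: ...)] and [(rn :: ...)], spaces around :: optional.
NOTE = re.compile(r"\[\(\s*(cn|rn)\s*::\s*(.*?)\s*\)\]")


class Note(NamedTuple):
    kind: str   # "cn" or "rn"
    locus: str  # word in the text that the note is attached to (assumed)
    body: str   # note content, e.g. "DJN: stsal"


def extract_notes(line):
    """Return (line without note tags, list of notes found in the line)."""
    notes = []
    clean = []
    pos = 0
    for match in NOTE.finditer(line):
        clean.append(line[pos:match.start()])
        words = "".join(clean).split()
        locus = words[-1] if words else ""
        notes.append(Note(match.group(1), locus, match.group(2)))
        pos = match.end()
    clean.append(line[pos:])
    return "".join(clean), notes
```

Applied to the example above, the cleaned line is "su źig stsol bar bgyid" and the single note has kind cn, locus stsol and body "DJN: stsal".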

8. References to page and line in sources should be given in the input lines in double parentheses, e.g. [(15,1)]. See the example of gre below. After the page number, the line number is given after a comma, or after a, b, c etc. when these are used for recto and verso. To number every line, use the regex function of your text editor if the numbers are not given. If sentences in a translation have been moved, so that their order does not fit the original, these lines are numbered manually, with a triple colon separating the reference and the text, e.g. 318,4-5 ::: xxxxx; or 1359b12-14 ::: xxxx.
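
Both kinds of reference are easy to pick out programmatically. The following Python sketch uses hypothetical helper names and covers only the page,line form with a comma shown in [(15,1)]; references with a recto/verso letter inside double parentheses are not handled, because their exact shape is not given here.

```python
import re

# Inline page,line reference in double parentheses, e.g. [(15,1)].
INLINE_REF = re.compile(r"\[\((\d+),\s*(\d+)\)\]")


def split_manual_reference(line):
    """Split 'reference ::: text' into its parts; (None, line) if no ':::'."""
    ref, sep, text = line.partition(":::")
    if not sep:
        return None, line
    return ref.strip(), text.strip()


def inline_references(line):
    """Return all (page, line) pairs given in double parentheses."""
    return INLINE_REF.findall(line)
```

Because the pattern requires digits directly after the opening [(, tags such as [(o::4,1)] or [(cs :: verse)] are not mistaken for page references.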

9. The translations always follow the original, even when the order of the translated text has to be changed. As stated in the previous point, the line references are in these cases numbered manually, and this is marked in the database, so that the translated text can also be written out in the right order, if wished. [The place from which a sentence is moved is marked with [(o::{page,line})], e.g. [(o::4,1)], "place from which the line numbered page 4, line 1 is moved", and the place to which it is moved is marked manually with 4,1 ::: {text}. A line can be moved from either before or after the locus in question.] Where the change of order in the translation affects only a small number of consecutive lines, the translation may be put into one line, and the blank lines are marked with "see previous (or: next) record". In all cases where there is no translation, version, or even original, this is noted by "No English", "No Sanskrit", "No Chinese" etc., which is retained in ss but removed in ct.
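
The different treatment of the "No English" type of placeholder in ss and ct could be sketched as follows. This is an illustration only: lines_for_ct is a hypothetical name, and the pattern assumes that a placeholder is always a whole line consisting of "No" followed by one capitalised language name.

```python
import re

# Assumed shape of a placeholder line: "No English", "No Sanskrit", ...
PLACEHOLDER = re.compile(r"No [A-Z][a-z]+")


def lines_for_ct(lines):
    """Drop placeholder lines for continuous text; ss keeps the list as is."""
    return [
        line for line in lines
        if not PLACEHOLDER.fullmatch(line.strip())
    ]
```

A real sentence that happens to consist of just these two words would also be dropped, so a production version would more safely compare against an explicit list of language names.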

10. Before and after the text to be input there are an XML head and tail, which contain the information on how the files are processed to produce the display on screen, and other information on formats etc. The reference engine should be integrated into the tail when needed; cf. also point 8 above. In the tail for Sanskrit below the refengine is integrated.

11. To name the source in the reference, use a tag such as [(source :: kin 幾何原本)]. Such tags can be put in several places in the text if the source changes, but in general the tag is placed after the main title of a work.

12. In cases where one wishes to display something in ct but not in ss, the content in question may be framed by the div class "nsbs" (not in sentence-by-sentence): <div class="nsbs">...</div>
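
How the ss view might leave out such content can be shown in a few lines of Python. The helper name strip_nsbs is hypothetical, and the sketch assumes the div is written exactly as above, without extra attributes and without nested div elements.

```python
import re

# Content framed by <div class="nsbs">...</div>, possibly spanning lines.
NSBS = re.compile(r'<div class="nsbs">.*?</div>', re.DOTALL)


def strip_nsbs(text):
    """Remove nsbs-framed content, as the ss view would not show it."""
    return NSBS.sub("", text)
```

For ct the text would be passed on unchanged, so that the framed content is displayed there.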

