Difference between revisions of «Kategori:Polyglotta:Documentation:SearchHelp»

Revision as of 23:41, 21 November 2011

Search Documentation for Bibliotheca Polyglotta

One can search for any word or phrase in the whole corpus, in a specific library, or in a chosen set of texts. The search results are written out, and clicking "Go to record" opens any result in its sentence-by-sentence multilingual mode.

In Sanskrit, words may be searched either as connected in sandhi or as dissolved. For example, if the word sā is connected with eva, producing saiva, then all of the search words sā, eva and saiva will find saiva.
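
The sandhi behaviour described above can be pictured with a small sketch. It assumes, purely for illustration, that each connected form is stored together with its dissolved words; the index `SANDHI_INDEX` and the function `find_forms` are hypothetical names and say nothing about how the BP actually stores its data.

```python
# Hypothetical index (an assumption for this sketch): each form connected
# in sandhi is stored alongside the dissolved words it is made of.
SANDHI_INDEX = {
    "saiva": ("sā", "eva"),
}


def find_forms(query, index=SANDHI_INDEX):
    """Return the connected forms that a search word should find.

    A form is found when the query equals the connected form itself
    or any one of its dissolved words.
    """
    return [
        surface
        for surface, parts in index.items()
        if query == surface or query in parts
    ]


for word in ("sā", "eva", "saiva"):
    print(word, "->", find_forms(word))
```

Under this assumption each of the three search words returns the same connected form, which is the behaviour the paragraph describes.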

In Greek, searches both with and without diacritics will find the word in question; thus both πραξις and πρᾶξις will find πρᾶξις.
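
One common way to obtain this kind of diacritic-insensitive matching is to compare texts after removing combining marks. The following is a minimal sketch of that technique, not a description of the BP's own implementation; the function names `strip_diacritics` and `matches` are chosen here for illustration.

```python
import unicodedata


def strip_diacritics(text):
    """Decompose the text (NFD) and drop combining marks such as accents and breathings."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query, record):
    """True if the query occurs in the record once diacritics are ignored on both sides."""
    return strip_diacritics(query) in strip_diacritics(record)


print(matches("πραξις", "πρᾶξις"))
print(matches("πρᾶξις", "πρᾶξις"))
```

Both calls compare the same stripped string, so a query typed with or without diacritics reaches the accented word.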

The default search is performed within one's present location in the BP. The four search modes are:

  • Search for exact phrase (default; searches for every instance of the exact fragment, word or phrase in a record);
  • Search for exact phrase with regular expressions (same as above, with regex; see description below);
  • Search for every word in one record (searches for every instance of whole words, either a single word or two or more different words occurring separately within one multilingual record);
  • Search for every word fragment in one record (searches for every instance of a word fragment, either a single fragment or two or more different fragments occurring separately within one multilingual record).
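
The difference between the four modes can be sketched as follows. This is an illustration only: the function `search`, its mode names, and the choice of case-sensitive matching are assumptions made for the example, and Python's `re` module stands in for PCRE.

```python
import re


def search(record, query, mode="phrase"):
    """Return True if the record matches the query in the given mode.

    Mode names are hypothetical labels for the four search modes:
    "phrase", "regex", "words" and "fragments".
    """
    if mode == "phrase":
        # Exact fragment, word or phrase anywhere in the record.
        return query in record
    if mode == "regex":
        # Same as "phrase", but the query is a regular expression.
        return re.search(query, record) is not None

    terms = query.split()
    if mode == "words":
        # Every term must occur as a whole word, in any order.
        words = set(re.findall(r"\w+", record))
        return all(term in words for term in terms)
    if mode == "fragments":
        # Every term must occur somewhere, even inside a longer word.
        return all(term in record for term in terms)
    raise ValueError("unknown mode: " + mode)


record = "One can search any word or phrase in the corpus"
print(search(record, "phrase word"))               # words are not adjacent in this order
print(search(record, "phrase word", "words"))      # both whole words are present
print(search(record, "phra wor", "fragments"))     # both fragments are present
```

The two "every word" modes split the query on whitespace and require all terms to be present, which is why word order does not matter for them.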

The advanced search option "Choose specific texts..." may be used to limit one's search to specific libraries or specific texts.

For searching with regular expressions (regex) the BP employs Perl Compatible Regular Expressions (PCRE). The most important meta-characters include:

\     Quote the next metacharacter
^     Match the beginning of the line
.     Match any character (except newline)
$     Match the end of the line (or before newline at the end)
|     Alternation
()    Grouping
[]    Bracketed Character class
*     Match 0 or more times
+     Match 1 or more times
?     Match 1 or 0 times
\l    lowercase next char (think vi)
\u    uppercase next char (think vi)
\L    lowercase till \E (think vi)
\U    uppercase till \E (think vi)
\w    Match a "word" character (alphanumeric plus "_", plus other connector punctuation chars plus Unicode marks)
\W    Match a non-"word" character
\s    Match a whitespace character
\S    Match a non-whitespace character
\d    Match a decimal digit character
\D    Match a non-digit character
\X    Match Unicode "eXtended grapheme cluster"

(For a full description see perldoc.perl.org/perlre.html.)
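
A few of the meta-characters above can be tried out in a short sketch. It uses Python's `re` module as a stand-in, which is an assumption of this example: `re` is not PCRE and does not support every item in the list (for instance \X), so only meta-characters that behave the same way in both are used. The sample lines and the helper `hits` are invented for the illustration.

```python
import re

# Invented sample lines, standing in for records in the corpus.
samples = ["saiva", "sā eva", "chapter 12", "a.b"]


def hits(pattern, lines=samples):
    """Return the sample lines in which the pattern matches somewhere."""
    return [line for line in lines if re.search(pattern, line)]


print(hits(r"^s"))         # ^ anchors the match at the beginning of the line
print(hits(r"\d+$"))       # one or more digits at the end of the line
print(hits(r"a\.b"))       # \ quotes the dot so that it matches literally
print(hits(r"(ai|e)va"))   # grouping with alternation
print(hits(r"^\w+$"))      # lines made up of "word" characters only
print(hits(r"\s"))         # lines containing a whitespace character
```

In the BP itself such patterns would be entered with the "Search for exact phrase with regular expressions" mode.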
