Modul:ur-translit/doc

This module transliterates Urdu-language text.

Preferably, this module should not be called directly from other templates or modules. From a template, use the {{xlit}} helper template; from a module, use the Module:languages#Language:transliterate method.

Functions

tr(text, lang, sc)
Transliterates the part of text that is written in the script with code sc and in the language lang. Returns nil if the transliteration fails.

Background

This module requires diacritics, and they must be used correctly. Diacritics can be found at http://udb.gov.pk (which is not always correct). The module should work correctly for the majority of input, although it is still a work in progress.

Read #Usage notes for tips on how to use the module correctly.

Usage notes

  1. Every consonant must be paired with a vowel (a, i, u, ā, e, ī, o, ū) or a sukoon/jazm; otherwise the module returns nil (blank).
  2. Alif must be paired with either a consonant or a vowel, or the module returns nil.
    If paired with a vowel, the alif represents that vowel. If paired with a consonant, the alif represents "ā". If an initial alif is not paired with a vowel, the module returns nil.
    Compare the pairings نا‎, اَن‎, اُن‎, and اِن‎.
  3. Whenever two semivowels are adjacent, the first becomes the consonant and the second becomes the vowel. To reverse that order, put a sukoon on the second semivowel.
    Baṛī ye (e, ai) and choṭī ye (ī, y) are distinguished medially using diacritics. In final position, however, choṭī ye is always -ī (since choṭī ye cannot be "y" in final position), and a final baṛī ye defaults to -ē but becomes -ai after a preceding zabar.
    If, for some reason, a final choṭī ye needs to be "y", put a sukoon on it.
  4. Noon ghunna ـن٘ـ/ـں generally represents a nasal vowel. The only exceptions are positions where a nasal vowel is not possible in Urdu, such as before b, bh, j, jh, g, gh, ḍh, dh, th, q (and before kh, ṭh, ṭ, and ph, unless the nasal vowel is ā, ī, or ū). Since a nasal vowel cannot appear before these sounds, the module returns an assimilated nasal.
    1. If a nasal vowel is not desired, use a sukoon to return a regular noon.
    2. If ghunna occurs after ā, ī, or ū but needs to assimilate with ph or kh, the assimilation must be entered manually, because, unlike all other vowels, ā, ī, and ū can actually be nasalized before these letters. For all consonants not mentioned, nasal assimilation can be predicted with only a jazm/sukoon (if it occurs at all).
      • Noon ghunna assimilating with "k" is less common in Urdu than in Hindi; it mainly appears in English loanwords (and only at the end of a word), in which case the assimilation can be predicted with a sukoon/jazm. This assimilation should disappear when the word is inflected. Compare بَینْک‎ to بَین٘کوں.
  5. The aspirate "he" (i.e. do-chashme-he, ھ) does not need any diacritics under any circumstances; all diacritics should go either on the preceding letter or on the following letter.
    The module works regardless, but by common practice putting a vowel on the aspirate ھ is inappropriate.
  6. The tashdeed/shadda works together with another diacritic as well as alone.
  7. The sukoon/jazm diacritic is required for consonant clusters; otherwise the module returns nil.
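
Rule 4 amounts to a small decision table. The following Python sketch is not the module's code (the module itself is a Lua module); the function name `ghunna_outcome` and the use of romanized strings as input are assumptions made purely for illustration. It encodes only the two consonant lists stated in rule 4 and ignores the sukoon override and manual assimilation.

```python
import unicodedata

# Consonants before which rule 4 says a nasal vowel is not possible.
ALWAYS_ASSIMILATE = {"b", "bh", "j", "jh", "g", "gh", "ḍh", "dh", "th", "q"}
# Consonants before which only ā, ī and ū can stay nasalized.
ASSIMILATE_UNLESS_LONG = {"kh", "ṭh", "ṭ", "ph"}
LONG_VOWELS = {"ā", "ī", "ū"}


def _nfc(values):
    # Normalize so composed and decomposed spellings compare equal.
    return {unicodedata.normalize("NFC", v) for v in values}


ALWAYS_ASSIMILATE = _nfc(ALWAYS_ASSIMILATE)
ASSIMILATE_UNLESS_LONG = _nfc(ASSIMILATE_UNLESS_LONG)
LONG_VOWELS = _nfc(LONG_VOWELS)


def ghunna_outcome(vowel: str, next_consonant: str) -> str:
    """Classify a noon ghunna as "nasal vowel" or "assimilated nasal".

    `vowel` is the romanized vowel before the ghunna; `next_consonant`
    is the romanized consonant after it ("" at the end of a word).
    """
    vowel = unicodedata.normalize("NFC", vowel)
    next_consonant = unicodedata.normalize("NFC", next_consonant)
    if next_consonant in ALWAYS_ASSIMILATE:
        return "assimilated nasal"
    if next_consonant in ASSIMILATE_UNLESS_LONG and vowel not in LONG_VOWELS:
        return "assimilated nasal"
    return "nasal vowel"
```

Under these assumptions, `ghunna_outcome("ā", "kh")` yields a nasal vowel, which agrees with ā̃khõ in the #Example section, while the same call with a short "a" yields an assimilated nasal.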

Updates

To do

  1. Require an initial alif to be paired with a vowel (either a diacritic or vao/ye); this is needed to prevent false positives.
  2. Support al- assimilation in Arabic loanwords.
    • How would an initial al- be distinguished in non-Arabic words? This is probably impossible and might not happen.
  3. Distinguish between ṇ, ṅ, ñ, and n.
  4. Detect when transliteration is needed and when it is not (i.e. whether diacritics are present/needed).
  5. Support izafa/ezafe.
  6. Revert the module to transliterate initial "ای" as 'ē'.
  7. اَللہ = sort out ہ + khari zabar diacritic.
  8. Transliteration detection often gives false positives.
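
Item 4 asks for a way to tell whether a string carries diacritics at all. As one possible starting point, here is a sketch in Python rather than in the module's own Lua. The function name `has_urdu_diacritics` and the chosen set of marks are assumptions, not something the module is known to use.

```python
# Assumed set of marks: zabar (U+064E), zer (U+0650), pesh (U+064F),
# sukoon/jazm (U+0652), tashdeed/shadda (U+0651), khari zabar (U+0670),
# do zabar (U+064B) and the noon ghunna mark (U+0658).
URDU_DIACRITICS = frozenset(
    "\u064E\u0650\u064F\u0652\u0651\u0670\u064B\u0658"
)


def has_urdu_diacritics(text: str) -> bool:
    """Return True if `text` contains at least one of the assumed marks."""
    return any(ch in URDU_DIACRITICS for ch in text)
```

A check like this only reports that some mark is present somewhere in the string. It says nothing about whether every consonant is marked, so on its own it would not resolve item 8.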

Example

Test Urdu:

تَرْکِ تَعَلُقّات پِہ رویا نَہ تُو نَہ مَیں لیکِن یِہ کْیا کَہ چَین سے سویا نَہ تُو نَہ مَیں

وُہ ہَمْسَفَر تھا مَگَر اُس سے ہَمْنَوَائی نَہ تھی کَہ دُھوپ چھاؤں کا عالَم رَہا جُدائی نَہ تھی

عَداوَتیں تِھیں، تَغافُل تھا رَنْجِشیں تِھیں مَگَر بِچَھڑْنے والے میں سَب کُچھ تھا بے وَفائی نَہ تھی

کاجَل ڈالو کُرْکُرا سُرْمَہ سَہا نَہ جائے جِن نَین میں پِی بَسے دُوجا کَون سَمائے؟

بِچَھڑْتے وَقْت اُن آن٘کھوں میں تھی ہَماری غَزَل غَزَل تھی وُہ جو کِسی کو کَبھی سُنائی نَہ تھی

Result:

tark-i ta'aluqqāt pe royā na tū na ma͠i lekin ye kyā ka cain se soyā na tū na ma͠i

vo hamsafar thā magar us se hamnavāī na thī ka dhūp chāõ kā 'ālam rahā judāī na thī

'adāvatẽ thī̃, taġāful thā rañjiśẽ thī̃ magar bichaṛne vāle mẽ sab kuch thā be vafāī na thī

kājal ḍālo kurkurā surma sahā na jāe jin nain mẽ pī base dūjā kaun samāe?

bichaṛte vaqt un ā̃khõ mẽ thī hamārī ġazal ġazal thī vo jo kisī ko kabhī sunāī na thī

Expected:

tark-i ta'aluqqāt pe royā na tū na ma͠i lekin ye kyā ka cain se soyā na tū na ma͠i

vo hamsafar thā magar us se hamnavāī na thī ka dhūp chāõ kā 'ālam rahā judāī na thī

'adāvatẽ thī̃, taġāful thā rañjiśẽ thī̃ magar bichaṛne vāle mẽ sab kuch thā be vafāī na thī

kājal ḍālo kurkurā surma sahā na jāe jin nain mẽ pī base dūjā kaun samāe?

bichaṛte vaqt un ā̃khõ mẽ thī hamārī ġazal ġazal thī vo jo kisī ko kabhī sunāī na thī

Test cases

4 tests failed.

test_translit_urdu:

| Status | Text | Expected | Actual | First difference |
| --- | --- | --- | --- | --- |
| Passed | اِیرانِی | īrānī | īrānī | |
| Passed | ماشاءاَللّٰہ | māśā'allāh | māśā'allāh | |
| Passed | پَیدائِش | paidāiś | paidāiś | |
| Passed | بَرْقِیات | barqiyāt | barqiyāt | |
| Passed | عَقْل | 'aql | 'aql | |
| Passed | عِزَّت | 'izzat | 'izzat | |
| Passed | عَین | 'ain | 'ain | |
| Passed | عالَم | 'ālam | 'ālam | |
| Passed | عَورَت | 'aurat | 'aurat | |
| Passed | شُرُوع | śurū' | śurū' | |
| Passed | اِشْعاع | iś'ā' | iś'ā' | |
| Passed | تَعَلُّقات | ta'alluqāt | ta'alluqāt | |
| Passed | تَعَلُّق | ta'alluq | ta'alluq | |
| Passed | مُتَعَلِّق | muta'alliq | muta'alliq | |
| Passed | متعلق | (nil) | (nil) | N/A |
| Passed | عُمْر | 'umr | 'umr | |
| Passed | دَفْعَہ | daf'a | daf'a | |
| Passed | بَچَّہ | bacca | bacca | |
| Passed | قُوَّت | quvvat | quvvat | |
| Passed | مَۓ عِشْق | ma-ye 'iśq | ma-ye 'iśq | |
| Passed | شیرِ پَنْجاب | śer-i pañjāb | śer-i pañjāb | |
| Passed | مَلْکَۂ دُنْیا | malka-yi dunyā | malka-yi dunyā | |
| Passed | جَمُّوں | jammū̃ | jammū̃ | |
| Passed | آم | ām | ām | |
| Passed | اِشْتِراکِیَّت | iśtirākiyyat | iśtirākiyyat | |
| Passed | سِسَکْنا | sisaknā | sisaknā | |
| Passed | پُل | pul | pul | |
| Passed | عِیسیٰ | 'īsā | 'īsā | |
| Passed | اَعْلیٰ | a'lā | a'lā | |
| Passed | لَفْظ | lafz | lafz | |
| Passed | حاضِر | hāzir | hāzir | |
| Passed | بَہورا | bahorā | bahorā | |
| Passed | نَہِیں | nahī̃ | nahī̃ | |
| Passed | اِشْتِمالِیَت | iśtimāliyat | iśtimāliyat | |
| Passed | چَوڑا | cauṛā | cauṛā | |
| Passed | تِھیں | thī̃ | thī̃ | |
| Passed | کُتّا | kuttā | kuttā | |
| Passed | پَہْلے | pahle | pahle | |
| Passed | کِھلائی | khilāī | khilāī | |
| Passed | کھلائی | (nil) | (nil) | N/A |
| Passed | ٹَھہَرْنا | ṭhaharnā | ṭhaharnā | |
| Passed | تَیمُور | taimūr | taimūr | |
| Passed | فَوراً | fauran | fauran | |
| Passed | کوئے | koe | koe | |
| Passed | مَنَّتوں | mannatõ | mannatõ | |
| Passed | گان٘وں | gā̃õ | gā̃õ | |
| Passed | مَیں | ma͠i | ma͠i | |
| Passed | آئی | āī | āī | |
| Passed | مَکَّھن | makkhan | makkhan | |
| Passed | خُدا | xudā | xudā | |
| Passed | کَئی | kaī | kaī | |
| Passed | کُئی | kuī | kuī | |
| Passed | چائے | cāe | cāe | |
| Passed | کُھلْواؤ | khulvāo | khulvāo | |
| Passed | غَدّار | ġaddār | ġaddār | |
| Passed | بَیٹھو | baiṭho | baiṭho | |
| Passed | بَطَّخ | battax | battax | |
| Passed | مُتَّحِدَۂ | muttahida-yi | muttahida-yi | |
| Passed | ساؤُتھ اَفْرِیقَہ | sāuth afrīqa | sāuth afrīqa | |
| Passed | کُلِّیَّہ | kulliyya | kulliyya | |
| Passed | دائِرَۃُ | dāiratu | dāiratu | |
| Passed | سُورَۃ | sūra | sūra | |
| Passed | بِلّا | billā | billā | |
| Failed | دائِرَۃُ الْمَعارِف | dāiratu l-ma'ārif | dāiratu اlma'ārif | 9 |
| Failed | دائِرَۃْ اُلْمَعارِف | dāirat ulma'ārif | dāirah ulma'ārif | 6 |
| Passed | آیَت اُللّٰہ | āyat ullāh | āyat ullāh | |
| Failed | صَیّاد | sayyād | saiyād | 3 |
| Failed | کہاں | (nil) | khā̃ | N/A |
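
The rows whose expected value is (nil) are inputs written without diacritics. A rough pre-check in that spirit could look like the Python sketch below. It is not the module's Lua logic; the name `first_unmarked_consonant`, the sets of marks and vowel-carrying letters, and the "word-final consonants are fine" shortcut are all assumptions. It is deliberately crude and will disagree with the module on some inputs (for example, doubled letters written without a mark on the first one).

```python
import unicodedata

# Assumed diacritics: zabar, zer, pesh, sukoon, shadda, khari zabar,
# do zabar and the noon ghunna mark.
MARKS = frozenset("\u064E\u0650\u064F\u0652\u0651\u0670\u064B\u0658")
# Letters treated here as able to carry a vowel: alif, alif madda, vao,
# choti ye, bari ye, and vao/ye with hamza.
VOWEL_LETTERS = frozenset("\u0627\u0622\u0648\u06CC\u06D2\u0624\u0626")
ASPIRATE = "\u06BE"     # do-chashmi he, takes no diacritic (usage note 5)
NOON_GHUNNA = "\u06BA"  # final noon ghunna, not checked here


def first_unmarked_consonant(text: str):
    """Return the index of the first consonant that is followed by
    another consonant with no mark in between, or None if none is found."""
    for i, ch in enumerate(text):
        if unicodedata.category(ch) != "Lo":
            continue  # marks, spaces, punctuation, Latin letters
        if ch in VOWEL_LETTERS or ch in (ASPIRATE, NOON_GHUNNA):
            continue
        j = i + 1
        while j < len(text) and text[j] == ASPIRATE:
            j += 1  # look past the aspirate
        if j == len(text):
            continue  # consonant at the end of the text
        nxt = text[j]
        if unicodedata.category(nxt) not in ("Lo", "Mn"):
            continue  # consonant at the end of a word
        if nxt in MARKS or nxt in VOWEL_LETTERS:
            continue
        return i
    return None
```

With these assumptions the unmarked spellings متعلق, کھلائی and کہاں are all flagged at index 0, while their fully marked counterparts مُتَعَلِّق and کِھلائی pass.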