# Arabic script in different languages



## jonquiliser

Hi,

as I'm learning Arabic, I keep reading the claim that the structure of the language (that it's a language of patterns) is what makes the unvowelled script possible (or well, no short vowels).  

But then I wonder, how is it with other languages that use the Arabic script? Languages as diverse as Kazakh and Urdu, or Malay, Panjabi, Kurdish, Sindhi. Do they share structural features with Arabic, or do they have some other structure that makes it possible?

Or is it just a question of context that enables understanding particular words?

(I imagine also English could to some extent be written and read in the Arabic script, but aren't there too many words that would then be indistinguishable? Such as tan, tin, ten, ton... Would context really be enough?)
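To make the tan/tin/ten/ton worry concrete, here is a minimal Python sketch. It assumes a deliberately crude model in which every vowel letter is simply dropped (a real Arabic-script spelling of English would presumably still write some vowels), and the names `skeleton` and `collisions` as well as the word list are made up for illustration.

```python
from collections import defaultdict

VOWEL_LETTERS = set("aeiou")


def skeleton(word: str) -> str:
    """Return the word with all vowel letters removed (crude abjad-style spelling)."""
    return "".join(ch for ch in word.lower() if ch not in VOWEL_LETTERS)


def collisions(words):
    """Group words by consonant skeleton and keep only the ambiguous groups."""
    groups = defaultdict(list)
    for word in words:
        groups[skeleton(word)].append(word)
    return {key: group for key, group in groups.items() if len(group) > 1}


sample = ["tan", "tin", "ten", "ton", "bat", "bit", "dog"]
print(collisions(sample))
```

On this toy list the four t-n words fall together, as do "bat" and "bit", while "dog" stays unique. How often such groups occur in running text, and how often context resolves them, is exactly the open question.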

Thanks for any thoughts.


----------



## huhmzah

Mostly from context. Words in these languages do have some recurring internal structures, but nowhere close to the regular templates or patterns of Arabic (or other Semitic languages). This makes spelling in these languages quite tricky, since most of the time you're working with only the three vowel letters (the diacritics exist but aren't written, so words with short vowels are even trickier). I can't speak for Malay or Kurdish, but I do know Urdu and Punjabi and have some experience with Turkic languages.

Urdu and Punjabi, which are Indo-European languages, have 8 different vowel sounds which are conveyed by these three vowel letters. For instance, in Urdu the word سو could be "soo" (direction), "so" (thus) or "sau" (hundred). In Punjabi a word like ميل could be "mel" (meeting) or "meel" (mile), etc. Similarly, Turkic languages have 6 vowel sounds being conveyed by three vowel letters. I studied Ottoman Turkish a while back and remember a telling example: احمد باشا اولدى could be pronounced "Ahmet paşa oldu" (Ahmed became a pasha) or "Ahmed paşa öldü" (Ahmed Pasha died). The Turkic "vowel harmony" rules added to the complication: besides the fact that the pronunciation could be oldu or öldü, changing the meaning, the spelling of the word was "a-u-l-d-y" (so a novice would pronounce it awldi or avldi or something).
(Ottoman Turkish is an extreme example since there are languages like Uygur, another Turkic language, which uses the Arabic script but is modified to convey all the vowels and is thus as regular and "efficient" with its spellings as modern Turkish.)

That being said I learned Urdu as a child and have never had a problem reading it -- the act of "reading" after a certain point is much more based on recognizing shapes and our mind predicting through context rather than actually reading letter-by-letter. We've all gotten that email where the words are jumbled up yet we have no problem reading them, showing that we don't actually read out entire words but that our mind sort of predicts them through the rough shape and some tags etc:

http://www.1000topics.com/images/jumbled-letters-still-readable.gif


----------



## Frank06

Hi,


jonquiliser said:


> I keep reading the claim that the structure of the language (that it's a language of patterns) is what makes the unvowelled script possible (or well, no short vowels).


What exactly do you mean by structure in this context?
Anyway, I am very curious about which arguments are used in your sources to back up this claim. Frankly, I cannot think of any language-related reason not to write Arabic in a (modified) Latin, cuneiform, hangeul, or katakana script. I can think of some _cultural_ reasons.



> But then I wonder, how is it with other languages that use the Arabic script? Languages as diverse as Kazakh and Urdu, or Malay, Panjabi, Kurdish, Sindhi. Do they share structural features with Arabic, or do they have some other structure that makes it possible?


Non-Semitic, Indo-European Persian uses a modified Arabic script, and the only (morphological) features it shares are a handful of so-called 'broken plurals'. It's perfectly possible to write Persian in the Latin script. This site gives a good example.
As for Urdu... well, look at Hindi. As for Turkish: look at the pre-Atatürk period. There is even a corpus of Afrikaans texts written in the Arabic script! For a quick reference see here.



> Or is it just a question of context that enables understanding particular words?


As far as Persian is concerned: at home I have the language books of Iranian primary schools, and only in the first year (out of five) are the diacritics given. So it's a matter of learning, studying and context.



> (I imagine also English could to some extent be written and read in the Arabic script, but aren't there too many words that would then be indistinguishable? Such as tan, tin, ten, ton... Would context really be enough?)


As long as you modify the Arabic script, which, as far as I know, is what Persian, Urdu, etc. have done anyway, I don't see a problem. It's just a matter of reaching a consensus and learning that consensus.

Which actually means that I don't see any linguistic reason not to write down any other language in the Arabic script (given obviously, the necessary modifications).

Groetjes,

Frank


----------



## jonquiliser

huhmzah, thank you for your insights! It is very true that reading is mostly a practice of understanding context and anticipating what follows, rather than 'deciphering' words one by one. And the letter with scrambled words that has circulated the web in various languages is a case in point. Nonetheless, I think the ability to read a text where words are messed up (with first and last letter being in their correct position) depends on the fact that there is a _right_ way of spelling words, where phonemes come in their right order of pronunciation. That is to say, it would be very hard, not to say impossible, to learn to read if _all_ writing were more or less arbitrary, like the scrambled-words one, even if it were one's native language.

What motivated my doubt was the fact that languages are very different, and while scripts and spelling are indeed matters of convention, not all existing scripts will be equally functional for each and every language. 

I'm completely at a loss, Frank, as to what you are trying to say. The claim I mentioned wasn't as much a "defence" of Arabic being written in the script it is, as it is an explanation to an English-speaker who finds the Arabic unvowelled writing hard to come to grips with. Something along these lines: "It may seem difficult at first but given the regular word formation of Arabic with a number of fixed patterns, the unmarked vowels are in fact quite predictable." As I thought all the languages using the Arabic script would hardly be equal in their structure, I was wondering what difficulties, if any, arise in these other languages, and how they work out any such issues. Nothing more nor less than that.


----------



## astlanda

Hei.



jonquiliser said:


> I keep reading the claim that the structure of the language (that it's a language of patterns) is what makes the unvowelled script possible (or well, no short vowels).
> ...
> Or is it just a question of context that enables understanding particular words?



I think the original reason to ignore the short vowels was the structure of Semitic languages. They have only a few vowels, which can vary in dialects and are not crucial for understanding.

On the other hand, I believe that the Arabic script shares some similarities with the Peruvian "quipu". It's meant for recognizing words that you already know.

E.g.
Once an Egyptian wrote down an address for me, and as usual he omitted not only the diacritics but also the "dots" which distinguish ب, ت, ث and ن etc. So for me there were only knots and hooks. I had to ask how to pronounce it.


----------



## Faylasoof

Many Semitic languages had strong oral traditions, at least in their ‘earlier’ stages before writing became paramount - certainly true of Arabic. So it would make sense that the script was just an aid to recognising known words as Astlanda says. 

  In 7th century Arabia not only were the diacritics absent, many of the consonants could not be distinguished as “dots” on them were absent too. These included (if my memory hasn’t deserted me) the following:

 ب ت ث ط ظ د ذ ر ز ص ض  ف ق ح خ ج

 It was a test of one’s complete mastery of the language that he / she could determine from the context which “dot less”-lettered word was meant to be there. 
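The effect of leaving out the dots can be imitated with a small Python sketch. It is only an approximation under stated assumptions: the mapping `DOTLESS` covers just some of the letters mentioned in this thread, it ignores positional letter forms (ن shares its shape with ب ت ث only at the start or in the middle of a word), and the three sample words are chosen purely for illustration.

```python
# Map dotted letters to a shared undotted shape (a simplified, partial table).
DOTLESS = {
    "ب": "\u066e",  # U+066E ARABIC LETTER DOTLESS BEH
    "ت": "\u066e",
    "ث": "\u066e",
    "ن": "\u066e",  # assumption: only valid for initial/medial noon
    "ج": "ح",
    "خ": "ح",
    "ذ": "د",
    "ز": "ر",
    "ض": "ص",
    "ظ": "ط",
}
TABLE = str.maketrans(DOTLESS)


def undotted(word: str) -> str:
    """Replace each dotted letter with its undotted base shape."""
    return word.translate(TABLE)


# bint (girl), nabata (it sprouted), thabata (it was firm)
words = ["بنت", "نبت", "ثبت"]
shapes = {undotted(w) for w in words}
print(len(shapes))  # all three words share a single undotted shape
```

In this toy model the three words become indistinguishable on the page, so only context (or prior knowledge of the text) tells the reader which one is meant.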

 I agree with what Frank has said, and esp. when he says:



> …Which actually means that I don't see any linguistic reason not to write down any other language in the Arabic script (given obviously, the necessary modifications)….
> 
> Frank




  … and Astlanda I know what you are saying but I’m not quite sure if the short vowels were always unimportant:



> I think the original reason to ignore the short vowels was the structure of Semitic languages. They have only a few vowels, which can vary in dialects and are not crucial for understanding.



  In the ‘classical’ stage of a particular Semitic language when the written language became all important, it was crucial to get the vowels right as even a ‘subtle’ change of vowel radically altered the meaning. Biblical scholars have made this point repeatedly. For dialects it is a different matter but they came later anyway.


----------



## Frank06

Hi,


jonquiliser said:


> I'm completely at a loss, Frank, as to what you are trying to say.


Sorry.
I was basically reacting to your opening statement:


> I keep reading the claim that the structure of the language (that it's a language of patterns) is what makes the unvowelled script possible (or well, no short vowels).


What I understood from this -- and my apologies if I misunderstood -- is that there would be something within the "structure of a language" which makes a particular script, be it alphabet or abjad, possible.
I tried to give reasons why I don't think there is any linguistic reason to make this claim.

Groetjes,

Frank


----------



## Joannes

Frank06 said:


> What I understood from this -- and my apologies if I misunderstood -- is that there would be something within the "structure of a language" which makes a particular script, be it alphabet or abjad, possible.
> I tried to give reasons why I don't think there is any linguistic reason to make this claim.


Allez, Frank. You're right that any _type_ of script (be it alphabet, abjad or abugida) is theoretically _possible_ for any language. But, assuming that a phonological script is the best option (which is the case and the reason why English spelling sucks), an abjad would obviously work better (or would be easier to apprehend) with some languages than with others. Arabic has a small inventory of vowels and a large inventory of consonants. Moreover the morphology of Arabic would give you an idea of the meaning of the word just by its (consonantal) root. An abjad (_any_ abjad, not just Arabic script) would not work as well for Swedish or Dutch as it would for Arabic, _punt._ (Let alone Vietnamese or other tonal languages.)

While the Arabic script was designed for (Classical) Arabic, for Persian, Urdu and Ottoman Turkish it is (was) a mere cultural heritage. Even with the modifications that were made, I'm sure this has effects on the learnability of the writing system.


----------



## jonquiliser

astlanda and Faylasoof, maybe it's not so much an issue of vowels being irrelevant, rather the small number of vowels makes the risk of confusion small, paired with patterns with predictable vowel combinations? In Swedish, with nine vowels and several more phonemic variations of those vowels, it would be hard to read and write using only three vowel symbols. 



> (Frank)
> What I understood from this -- and my apologies if I misunderstood -- is that there would be something within the "structure of a language" which makes a particular script, be it alphabet or abjad, possible.
> I tried to give reasons why I don't think there is any linguistic reason to make this claim.


Isn't this off-topic? 

Not every corner hides a hideous metaphysical monster. I thought the context of my post would have made clear that I didn't have in mind a discussion of the ontological status of any given script - I agree it's a matter of convention. But, as Joannes explained well, usefulness varies. Different languages adapt scripts that were developed using some given language, to suit their own needs. Or, for that matter, change the script they've developed, spelling reforms etc. (Swedish marked certain vowels by doubling them, until they developed into distinct letters - å ä ö. Because it made writing clearer. Yet many vowel sounds are not distinguished in writing, so at least the need is not all that pressing. It is true that text without these letters is quite possible to read, as evidenced by the many mails sent from computers lacking dots and circles, but it takes an extra effort and there definitely are cases of hilarious changes of meaning in certain sentences.) Given the idiosyncrasies of Arabic, I was wondering how other languages using the same script dealt with the "vowellessness". huhmzah's descriptions of Uygur, Urdu and Ottoman Turkish are good examples; that is the kind of info I was after.


----------



## arsham

Another important problem that has not been addressed in the previous posts is the spelling of compounds. In Persian, as in most Indo-European languages, composition and derivation through affixation are the most productive mechanisms of word formation. Compounds in particular abound, and since in many cases Arabic letters can be written either attached to or detached from the following letter, many compounds can have several orthographies. For example you can write aab-paash (watering can/sprinkler) both as آبپاش and آب پاش . This is a serious problem, as even to this day there is not a clear set of rules governing the spelling of compounds. The other issue is the spelling of Arabic loanwords. This may seem like a trivial question, as you might think that one just needs to keep the original spelling; but this isn't always the case (at least in Persian), especially with the spelling of hamze!


----------



## astlanda

jonquiliser said:


> Hi,
> I wonder, how is it with other languages that use the Arabic script? Languages as diverse as Kazakh and Urdu, or Malay, Panjabi, Kurdish, Sindhi. Do they share structural features with Arabic, or do they have some other structure that makes it possible?



I think huhmzah said almost everything there was to say already.
It has nothing to do with their structure. It came with Islam and they had to modify the script.

Just as an addition to your example of Swedish having 9 vowels: Estonian has 9 as well. Moreover, Estonian has tons of diphthongs: õi, õe, öe, äe etc.
It is obvious that it makes no sense to use the Arabic script here without modifications.

http://forum.wordreference.com/showthread.php?t=167368#post3589432


----------



## jonquiliser

arsham, does the change of spelling of Arabic loanwords in Persian reflect a difference in how these words are pronounced in Persian?

astlanda; quite. Modifications and adaptations would seem normal. I know though that several languages, as well as dialects of Arabic, use added consonants (modifications of consonant letters) for sounds not present in Classical Arabic or Fus7a. But I didn't know anything about how the issue of vowels was treated in languages using the Arabic script, hence my question.


----------



## astlanda

jonquiliser said:


> I didn't know anything about how the issue of vowels was treated in languages using the Arabic script, hence my question.



Maybe this helps:
http://en.wikipedia.org/wiki/Uyghur_language#Writing_system

I'm not too familiar with the language, but a Chinese-Uyghur conversation guide I have seems to ignore some short vowels at least in some expressions. Must be a nightmare for a foreigner to learn it.


----------



## sokol

Jonquiliser,
my Persian textbook (sadly not much used by me so far) says that Persian is always written without vowel markers - only texts for learners have those.

Thus (and as there are significantly more vowels in Persian than in Arabic) it is quite difficult for learners to read Persian, says my textbook, because you have (to some extent) to guess the meaning correctly before you can read properly, or (quote from Bozorg Alavi, Manfred Lorenz: Lehrbuch der persischen Sprache, Leipzig 1988, p. 15; my translation):
"One can read Persian correctly only if one gets the meaning of the words correctly."

Also Arabic loans are a problem even if there were no special issues with hamze (which I am sure arsham will answer); what arsham has not mentioned is that Arabic loans aren't pronounced as in Arabic, so in several cases Arabic letters are neutralised to the same pronunciation: for example "sin", "sad" and "se" are all pronounced /s/ in Persian - but they are _*all*_ written.

So in Persian there's not only the problem of vowels not being written; there's also the one of a consonant written with two or three or even more Arabic letters (for "z" there are 4!).

Further, Persian has added four letters to the Arabic script: "pe", "čim", "že" and "gaf", to adapt the script better for Persian.

To conclude, there have been some adaptations to make the script fit better (optional vowel markers hardly ever used, and four additional letters) but for the most part the Arabic script remains unchanged.


----------



## arsham

jonquiliser said:


> arsham, does the change of spelling of Arabic loanwords in Persian reflect a difference in how these words are pronounced in Persian?


In general yes, they do reflect the difference in pronunciation, as the final hamze is never pronounced in Persian if it follows a long vowel. However, there are also cases where the pronunciations don't differ. For example موسی musaa (Moses) should be spelt موسا when in the ezafe construction!
That said, sokol raised another "_interesting_" problem, that is the multiplicity of letters representing the same consonant. Sometimes Persian words are also written using letters representing typically Arabic sounds, such as صد sad (hundred, probably to avoid confusion with Arabic سد meaning dam), but its derivatives are written with س, like سده sade (century)!
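The many-letters-to-one-sound situation can be sketched in a few lines of Python. This is a rough illustration, not a full transcription scheme: the table `PERSIAN_SOUND` lists only the letters named in this thread plus د, the romanised labels are informal, and the function names are invented for the example.

```python
from collections import defaultdict

# Partial letter-to-consonant table for Persian (only the letters discussed here).
PERSIAN_SOUND = {
    "س": "s", "ص": "s", "ث": "s",            # sin, sad, se
    "ز": "z", "ذ": "z", "ض": "z", "ظ": "z",  # four letters for /z/
    "د": "d",
}


def consonant_sounds(word: str) -> str:
    """Spell out the consonant sounds of a word; unknown letters become '?'."""
    return "".join(PERSIAN_SOUND.get(ch, "?") for ch in word)


def letters_per_sound():
    """Invert the table: which letters can stand for each sound?"""
    inverse = defaultdict(list)
    for letter, sound in PERSIAN_SOUND.items():
        inverse[sound].append(letter)
    return dict(inverse)


print(consonant_sounds("صد"), consonant_sounds("سد"))  # hundred vs. dam
print({sound: len(letters) for sound, letters in letters_per_sound().items()})
```

Both spellings come out as the same consonant sequence, which is why a writer has to memorise which letter a given word takes: the sound alone does not say.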


----------



## jonquiliser

Thank you all for your input, very interesting!


----------



## astlanda

Hi.

Could anybody explain to us why Iranians, Turks, Uighurs, Malayans and other ancient nations shifted from their classical orthography to a less appropriate one?


----------



## Lugubert

astlanda said:


> Hi.
> 
> Could anybody explain to us why Iranians, Turks, Uighurs, Malayans and other ancient nations shifted from their classical orthography to a less appropriate one?


How is for example today's Turkish script less appropriate than Arabic? Now they can and do write all their eight vowels, compared to almost none before.

I suppose the same principle applies to Malay, but I can't be bothered to look up how many vowels they have.

The Pahlavi script is so complicated that the Iranians are probably very happy they left it.

Re Uighur, I suppose you refer to their adaptation of Arabic script as their "classical orthography". Well, it isn't bad at all, but the four alphabets in use now should be as appropriate. And they also use Chinese characters.

If I'm not too mistaken, Kurdish in Sweden even has symbols for Swedish vowels in their adaptation of the Perso-Arabic script.


----------



## panjabigator

astlanda said:


> Hi.
> 
> Could anybody explain to us why Iranians, Turks, Uighurs, Malayans and other ancient nations shifted from their classical orthography to a less appropriate one?



Can you explain what you mean here?  How is their current script inappropriate?


----------



## Alijsh

jonquiliser said:

> as I'm learning Arabic, I keep reading the claim that the structure of the language (that it's a language of patterns) is what makes the unvowelled script possible


It has nothing to do with language but with circumstance (environmental culture, invasion, etc.; anything but language). Arabs had an abjad option at hand and wrote their language with it. Arabic can be written in the Latin alphabet. The Latin alphabet has been extended enough to have room for all phonemes of Arabic. I have not seen Arabic textbooks written for Westerners but I think there is a transliteration scheme, which proves my point. The best proof for "circumstance", however, is Persian: during its long history, it has been written with different writing systems: syllabic (-> Old Persian Cuneiform), abjad and phonemic. Can you relate that to any feature of the language?



			
arsham said:

> This is a serious problem as even to this day there is not a clear set of rules governing the spelling of compounds.


The Perso-Arabic abjad is inherently irregular (some letters have medial forms, some don't, etc.) and thus we can never have a set orthography, but it can be improved from what we have currently. Read this interview in Persian (I found it here with a Google search. It is also hosted on other sites)



			
Lugubert said:

> If I'm not too mistaken, Kurdish in Sweden even has symbols for Swedish vowels in their adaptation of the Perso-Arabic script.


Kurds have made an alphabet from the Perso-Arabic abjad (I have also seen a proposed one for Persian). They write their language in an alphabet (see here).



			
Lugubert said:

> The Pahlavi script is so complicated that the Iranians are probably very happy they left it.


That's an insult, but I'll ignore it. They didn't leave it! You talk as if it had been optional and there had been no Islamic conquest of Iran! In either case, coming from the same source as the Pahlavi script, Arabic has its own problems and thus doesn't leave much room for being happy, let alone "very happy". Besides, in the pre-Islamic era we had invented an alphabet (there were in fact several scripts) and could have written our language in it if we had ever wanted to be "very happy".


*The discussion about the Pahlavi script has been moved.
Frank, moderator*


----------



## astlanda

Lugubert said:


> How is for example today's Turkish script less appropriate than Arabic? Now they can and do write all their eight vowels, compared to almost none before.
> 
> I suppose the same principle applies to Malay, but I can't be bothered to look up how many vowels they have.
> 
> The Pahlavi script is so complicated that the Iranians are probably very happy they left it.
> 
> Re Uighur, I suppose you refer to their adaptation of Arabic script as their "classical orthography". Well, it isn't bad at all, but the four alphabets in use now should be as appropriate. And they also use Chinese characters.
> 
> If I'm not too mistaken, Kurdish in Sweden even has symbols for Swedish vowels in their adaptation of the Perso-Arabic script.



Sorry, I asked an intriguing question.

I meant Orkhon script as the more proper script for Turkish and Uighur (http://en.wikipedia.org/wiki/Orkhon_script), Kawi or Pallava for Malay (http://en.wikipedia.org/wiki/Old_Malay).

I don't know anything about Old Malay. So the Arabic script might have been even better for it than the previous ones.
The development of the Iranian languages is well explained here:
http://en.wikipedia.org/wiki/Iranian_languages#Middle_Iranian_languages

What I was asking about is the political background of the introduction of the Arabic script to those cultures, which already had an orthography.
I think that some kind of persecution of paganism by Christians interrupted the long-lasting development of Ancient Egyptian knowledge. The Coptic tradition looks like an immigrant culture from another world, but in fact it is not.
I wonder if something similar happened with the introduction of Islam: if the old tradition was abandoned as a pagan thing for religious reasons. I suppose that the ancient scribes were usually educated by pagan priests.



----------



## astlanda

Alijsh said:


> Arabic can be written in the Latin alphabet. The Latin alphabet has been extended enough to have room for all phonemes of Arabic. I have not seen Arabic textbooks written for Westerners but I think there is a transliteration scheme, which proves my point.


I have seen them, but I personally prefer Arabic script with "haraka" diacritics, just as I prefer hangeul for Korean over the numerous _Latinization_s.


----------



## Joannes

It is remarkable that many languages that use Arabic script now, or used it in the past, had already been written in other scripts before. Whether or not those former writing systems were (phonologically) more appropriate, I cannot judge. The reasons to change are probably manifold: social, economic, political and cultural. Just like the reasons for some languages to change script again later on. I think another factor that played a role was that in the days when those languages adopted Arabic script, spelling wasn't really formalized like it is now. Nowadays many people focus (too much) on written language, while I think back then writing was rather just a way to represent spoken language, so less rigid and less fixed. It would be way harder for the Persian Language Academy to adopt an alphabet now, than it was for Atatürk to do it, which in its turn was probably a harder job than convincing the Turks to adopt Arabic script before that.


----------



## Alijsh

Joannes said:


> It would be way harder for the Persian Language Academy to adopt an alphabet now, than it was for Atatürk to do it, which in its turn was probably a harder job than convincing the Turks to adopt Arabic script before that.


At the time of Ataturk, there were also attempts in Iran to switch to the Latin alphabet (_they date back 100 years, so they are even earlier_) but they were futile for various reasons. Neither official bodies (say, the Academy) nor the masses agree with switching to the Latin alphabet. They don't feel any need for change. They are habituated to the script and its received orthography (whatever it is), just like Anglophones with the complex English spelling.


----------



## Joannes

(I'm with you, Alijsh, I think you're proving my point - if that wasn't your intention, let me know.)


----------



## arsham

It's also possible to modify the Arabic script through the addition of new characters and by assigning new values to letters that do not represent a native sound in languages other than Arabic, in order to make the Arabic script phonetic! That's how the Greeks got their alphabet from the Phoenicians!


----------



## zappbrannigan

jonquiliser said:


> But then I wonder, how is it with other languages that use the Arabic script? Languages as diverse as Kazakh and Urdu, or Malay, Panjabi, Kurdish, Sindhi. Do they share structural features with Arabic, or do they have some other structure that make it possible?



I'm only vaguely familiar with Arabic as a language, but the way the script is used in Malay (where it is called Jawi) is simply as a substitute of sorts for the Roman alphabet (though methinks the Malays were introduced to the Arabic alphabet first). Especially in recent times, new letters have also been crafted so that Malay can be conveyed better through Jawi (e.g. the Jawi letter "pa" to correspond to the "p" in the Roman alphabet, which is used in Malay as well). The letters "alif", "ya" and "wau" are used as vowels - we haven't found any proper way to compensate for the lack of "o" and "é/è", though, so it's always a guess (for those unfamiliar) as to whether "wau" is a "u" or an "o", etc. I would also say there's a more liberal use of "hamzah" and "alif" to indicate the "ah" sound, but I'm not too sure how to explain that.


----------



## Maurice92

Lugubert said:


> How is for example today's Turkish script less appropriate than Arabic? Now they can and do write all their eight vowels, compared to almost none before.


As a matter of fact, none of these scripts is a pure Arabic or a pure Latin script; so there is no reason that an extended Latin or Arabic script cannot be fitted to any language by adding a number of vowels or consonants, possibly by adding diacritic signs to existing letters.


----------



## arsham

Check this link:

http://www.unilang.org/viewtopic.php?f=73&t=25835&p=495094#p495094


----------



## yuechu

Re : Lugubert
Just a small correction: I don't believe that the Uyghur language is written in Chinese characters (or at least it never has been nor is written in Chinese characters widely, for reasons of practicality). Just the Arabic script (+modified now), Roman alphabet and the Cyrillic script (=minority) if I am not mistaken.


----------



## MarX

I read that there was a time when Malay was written using Arabic script.

As I've learned Perso-Arabic script, I find it hard to imagine Malay written with a similar system. I think clear vowels and consonants are quite important in Malay.


----------



## Lugubert

baosheng said:


> Re : Lugubert
> Just a small correction: I don't believe that the Uyghur language is written in Chinese characters (or at least it never has been nor is written in Chinese characters widely, for reasons of practicality). Just the Arabic script (+modified now), Roman alphabet and the Cyrillic script (=minority) if I am not mistaken.


I think you're right and I was a bit hasty. I checked a few of my sources, and what I remembered seems to have been rather purely Chinese translations of signs and 100% Chinese geographical names, like 





			
Wiki said:

> *Kashgar* or *Kashi* (officially transliterated as Kashgar) *Kaxgar* in Uyghur: قەشقەر/K̡ǝxk̡ǝr; Chinese: 喀什; pinyin: Kāshí.


----------

