# Languages Family Trees



## ronanpoirier

Hey, I wanna know what the criteria are for separating languages into families... Why is this one Indo-European and that one Finno-Ugric? Is it related only to grammar, sounds and, sometimes, geographical area, or is there something more?

Thanks in advance!


----------



## Whodunit

Whether your language can be classified as Indo-European is relatively easy to explain:
1. If your script is Latin, Cyrillic, or Greek, you have a good chance to be Indo-European. (consider Afro-Asiatic languages, Sino-Tibetan languages, etc.)
2. The normal word order is subject-verb-object. (unlike in Arabic, for instance)
3. Sentences consist of words, not of symbols. (as in Chinese)
4. A word can be easily made by putting some letters together. (in Hebrew you have to pay attention to whether a letter is at the end or not, same in Arabic)
5. Indo-European languages have the same root. The Indo-European stem *u̯endh, for instance, can be found in German "winden", English "wind"/even in "went", French "venir", Latin "ventum", Czech "vinout" etc.


I don't think that the sounds of letters have anything to do with classification. For instance, there are sounds that do not occur, or hardly occur, in any other European language, like Czech ř [r̝], German ch [ç], or French "um"/"un" [œ̃].


----------



## pjay

Well it goes a lot deeper than that.

Obviously, languages change over time. But they don't change in random fashion. There are rules. For instance: in German you say Wasser, meaning water. So at some time in the past all [t]-sounds inside a word were replaced by [s] in a completely systematic way. This is called the second Germanic sound shift. Similar are English "to hate", German "hassen"; English "rattle", German "rasseln"; English "let", German "lassen"; English "cat", German "Katze"; English "time", German "Zeit".

The interesting thing with sound change rules is that they apply to all the sounds of a language, but only during a specific period of time, let's say over 50 years or so. Some words are used so rarely that they are spared. Others occur so often that speakers never apply the new rule to them. Others again enter the language too late for the rule to apply.

Given these caveats, language change is surprisingly uniform. Now if you know the rules, you can actually trace them back to a common ancestor language. We don't have any written records of Indo-European, but we are able to reconstruct a language spoken more than 7000 years ago. We just have to trace back all the rules incorporated into the modern daughter languages. This is how you can create a family tree of languages.

Following this approach, you find that some very old words that must have existed long before western civilization are similar in all European languages:

English brother, German Bruder, Russian brat, Latin frater, French frère, Spanish hermano, Indian-Sanskrit bhrathar

This cannot be a coincidence. I don't know what folks in Hungary say for brother but it's probably something very different. So there are actually only three languages in all of Europe that do not fit this pattern and are therefore considered the only non-Indo-European languages in Europe.


----------



## modus.irrealis

Just to add to pjay's answer, because sometimes it's unclear what the actual sound changes were, you show that two languages are related by showing sound correspondences between the two languages. This means that you show that a sound in a word in one language corresponds by a rule to a sound in a word with the same meaning (or very similar meaning) in the second language. And for this to be persuasive you want to show that this occurs in a lot of words with very few exceptions and that it especially occurs in basic vocabulary, since otherwise the correspondences could be explained as one language having borrowed the word from another.

pjay's example shows correspondences like English b ~ Latin f, and this is further supported by such pairs as bear ~ fero, blow ~ flo, break ~ frango, and so on. But the English word fraternal is not an exception, because we know it was borrowed from Latin.

Also, the sounds that correspond to each other don't need to be similar. My favourite example of this is that English tw corresponds to Armenian erk in words like two ~ erku.
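
The idea of a systematic correspondence can be sketched in a few lines of Python. This is only an illustration, assuming that the first letter of each spelling stands in for the first sound (which is not generally safe). The word pairs come from the posts above, and the function name `initial_correspondences` is made up for this example.

```python
from collections import Counter

# (English, Latin) word pairs taken from the posts above.
PAIRS = [
    ("brother", "frater"),
    ("bear", "fero"),
    ("blow", "flo"),
    ("break", "frango"),
]


def initial_correspondences(pairs):
    """Count how often each (first letter, first letter) pairing occurs."""
    return Counter((a[0], b[0]) for a, b in pairs)


counts = initial_correspondences(PAIRS)
print(counts.most_common(1))  # [(('b', 'f'), 4)]
```

A pairing that keeps recurring across basic vocabulary, like b ~ f here, is the kind of pattern being described. A loan such as fraternal ~ frater would show up as a stray f ~ f pairing instead.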


----------



## Whodunit

pjay said:
			
		

> Well it goes a lot deeper than that.


 
Of course, it does. I've tried to summarize and generalize it. 



> Obviously, languages change over time. But they don't change in random fashion. There are rules. For instance: in German you say Wasser, meaning water. So at some time in the past all [t]-sounds inside a word were replaced by [s] in a completely systematic way. This is called the second Germanic sound shift. Similar are English "to hate", German "hassen"; English "rattle", German "rasseln"; English "let", German "lassen"; English "cat", German "Katze"; English "time", German "Zeit".


 
You are right. Here's more about this phenomenon:
*p:*
—> pf (in initial position/anlaut and after a consonant)
—> ff (following a vowel)

German: *Pf*eife (OHG: pfîfa); English: *p*ipe; Lower German: *P*ipe; Dutch: *p*ijp; Czech: *p*íšťala; French: *p*ipe
German: stam*pf*en (OHG: stam*pf*ôn; Gothic: stam*p*on); English: to stam*p*; Dutch: stam*p*en
German: Schi*ff* (OHG: sci*f*; Gothic: ski*p*); English: shi*p*; Lower German: Schi*pp*; Dutch: schi*p*
*t:*
—> ts (in initial position and after a consonant, mostly spelled z)
—> ss (following a vowel; mostly spelled ʐ or ʐʐ in OHG, or ß/ss/s in NHG)

German: *Z*unge (OHG: *z*unga; Gothic: *t*uggô); English: *t*ongue; Lower German: *T*ung; Dutch: *t*ong
German: e*ss*en (OHG: e*ʐ**ʐ*an; Gothic: i*t*an); English: to ea*t*; Lower German: e*t*en; Dutch: e*t*en; Czech: jí*s*t
German: *z*u (OHG: *z*uo; Gothic: *d*u); English: *t*o; Lower German: *t*o; Dutch: *t*e
*k:*
—> kch (in initial position and after a consonant; nowadays only in Alemannic and Swiss German)
—> ch (following a vowel; written as h or hh in OHG, in today's German ch)

German: ma*ch*en (OHG: ma*hh*ôn); English: to ma*k*e; Lower German: mo*k*en; Dutch: ma*k*en
German: Bu*ch* (OHG: buo*h*; Gothic: bô*k*ôs); English: boo*k*; Lower German: Boo*k*; Dutch: boe*k*
German: su*ch*en (OHG: suo*hh*en; Gothic: sô*k*jan); English: to see*k*; Dutch: zoe*k*en
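
The three rules listed above can be written as a toy Python function. This is a rough sketch, not a model of the real sound shift: it works on spellings rather than sounds, writes "ts" where German spells z, treats every p, t and k the same way, and ignores the known exceptions (clusters such as st, for example, did not shift). The names `SHIFT` and `shift` are made up for this example.

```python
VOWELS = set("aeiou")

# consonant -> (result word-initially or after a consonant, result after a vowel)
SHIFT = {
    "p": ("pf", "ff"),
    "t": ("ts", "ss"),
    "k": ("kch", "ch"),
}


def shift(word):
    """Apply the simplified p/t/k rules from the post to an unshifted spelling."""
    out = []
    for i, ch in enumerate(word):
        if ch in SHIFT:
            affricate, fricative = SHIFT[ch]
            after_vowel = i > 0 and word[i - 1] in VOWELS
            out.append(fricative if after_vowel else affricate)
        else:
            out.append(ch)
    return "".join(out)


# Unshifted Dutch / Lower German forms from the lists above.
print(shift("eten"))   # essen
print(shift("maken"))  # machen
print(shift("schip"))  # schiff
print(shift("tong"))   # tsong (compare German Zunge)
```

The vowels of the real German words differ as well, so only the consonants should be compared.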



> Following this approach, you find that some very old words that must have existed long before western civilization are similar in all European languages:
> 
> English brother, German Bruder, Russian brat, Latin frater, French frère, Spanish hermano, Indian-Sanskrit bhrathar


 
Here's another of those universal words:
German: sehen
English: to see
Middle High German: sehen
Old High German: sehan
Gothic: saíƕan
Swedish: se
Latin: sequi (to follow)
Old Indian: sácetē (he follows)
Latvian: sekt (to follow)
Indo-European stem: *_sek_



> This cannot be a coincidence. I don't know what folks in Hungary say for brother but it's probably something very different. So there are actually only three languages in all of Europe that do not fit this pattern and are therefore considered the only non-Indo-European languages in Europe.


 
But unfortunately, you can't often find words that match well across German, French, Russian, Portuguese, and Swedish.

It's interesting, though.


----------



## mansio

Pjay

I think there are a few more than three native (as distinct from immigrant) non-IE languages in Europe. Not to mention all the languages of the Caucasus, you have Basque, Hungarian, Finnish and Estonian, Sami, the Turkic languages of the Gagauz, Tatars, Bashkirs, etc.


----------



## ronanpoirier

Pjay,
Brother in Hungarian is _fiútestvér_, fiú = boy, son + test = body + vér = blood. :-S

All these similarities among languages are really interesting... but does anyone have an idea about the families of Far Eastern languages, which include Japanese and Korean? And why are some languages outside Europe in the Indo-European group? Were they the root of the Indo-European languages?


----------



## Outsider

pjay said:
			
		

> English brother, German Bruder, Russian brat, Latin frater, French frère, Spanish hermano, Indian-Sanskrit bhrathar
> 
> This cannot be a coincidence.


Spanish _hermano_ (as Portuguese _irmão_) has a different origin than the other words in that list. But both Spanish and Portuguese have words like _frade_, related to Latin _frater_.


----------



## Outsider

ronanpoirier said:
			
		

> All these similarities among languages are really interesting... but does anyone have an idea about the families of Far Eastern languages, which include Japanese and Korean? And why are some languages outside Europe in the Indo-European group? Were they the root of the Indo-European languages?


It's usually speculated that Indo-European originated somewhere near the Black Sea, and then spread eastwards to northern India and westwards to Europe by migrations or by cultural borrowing. You'll find some maps here.


----------



## Pivra

Thai belongs to the Tai-Kadai family. It belongs to the larger Austro-Tai group, which is in the Austric family. Most of the languages in this group use South Indian-influenced alphabets; the Thai writing system works more like Devanagari, but its appearance is clearly Dravidian-ish.

A long time ago this language family was not recognized; its languages were instead put in the Sino-Tibetan group. But later (30 years ago or so, I guess) we became our own language group, like most of the eastern languages, which are not related at all. The main Tai-Kadai languages are:

Thai, Laotian (Laos), Isan (spoken in Thailand), Lanna (Kham Meung, Northern Thailand), and Shan (Shan State, Burma).

Thai speakers can understand Lanna, Laotian, and Isan to some extent without studying the languages, but not Shan. For reading, Laotian is very simple for us, but not Shan or Lanna.

These languages are tonal and analytic. Thai has a very difficult orthography. Isan and Lanna have more tones than we do; I do not know about Laotian and Shan.

All languages in this group except Shan have the same word order in sentences.


----------



## zaigucis

Whodunit said:


> Here's another of those universal words:
> German: sehen
> English: to see
> Middle High German: sehen
> Old High German: sehan
> Gothic: saíƕan
> Swedish: se
> Latin: sequi (to follow)
> Old Indian: sácetē (he follows)
> Latvian: sekot (to follow)
> Indo-European stem: *_sek_


----------



## Outsider

Here's a good explanation.


----------



## DrLindenbrock

Outsider said:


> Spanish _hermano_ (as Portuguese _irmão_) has a different origin than the other words in that list. But both Spanish and Portuguese have words like _frade_, related to Latin _frater_.


 
The Spanish and Portuguese word comes from Latin _germanus_, which my dictionary says means either _brother_ or _stepbrother_. In Catalan it's _germà_.
Unfortunately, my dictionary does not provide a contrastive analysis of _germanus_ and _frater_ (the word that led to _fratello_ [from the diminutive form of _frater_, _fratellus_] in Italian and _frère_ in French).

Anyhow, my point related to the scope of this thread is that all the Romance languages derive the word for brother from Latin.
Cheers


----------



## Frank06

Hi,
Some remarks...


Whodunit said:


> Whether your language can be classified as Indo-European is relatively easy to explain:
> 1. If your script is Latin, Cyrillic, or Greek, you have a good chance to be Indo-European. (consider Afro-Asiatic languages, Sino-Tibetan languages, etc.)


I'm sorry, but script has nothing to do with a language being IE or not.



> 2. The normal word order is subject-verb-object. (unlike in Arabic, for instance)


There are so many IE languages with a deviating word order (and a mixed one, like German and Dutch), that this cannot be a good parameter either. Hindi, Persian and Armenian are OV languages.
BTW, word order is a typological feature...



> 3. Sentences consist of words, not of symbols. (as in Chinese)


Here you mix up words in se with the written representation of words. A Chinese sentence consists of words the same way an English sentence consists of words: 他是老师 consists of three words, not of four symbols.
BTW, Chinese characters are anything but symbols.



> 4. A word can be easily made by putting some letters together. (in Hebrew you have to pay attention to whether a letter is at the end or not, same in Arabic)


To what extent is that a feature of IE languages? I really have the impression that you mix up letters (graphemes) and sounds...



> 5. Indo-European languages have the same root. The Indo-European stem *u̯endh, for instance, can be found in German "winden", English "wind"/even in "went", French "venir", Latin "ventum", Czech "vinout" etc.


This is rather black and white, but more or less okay, imho. [But between 30 and 60% of Proto-Germanic words are not derived from PIE.]
But this is basically the criterion for deciding whether languages belong to the same family: whether or not linguists can find reasons to believe that there was an ancestor language from which language X and Y are derived. That ancestor language can even be reconstructed (the result would be *Proto-*(name of language), as in Proto-Indo-European or Proto-Germanic).
Looking for such an ancestor language does not involve scripts as suggested above: this would pose some problems with Old Persian (cuneiform), Luwian (hieroglyphs), Old Norse (runes), etc. It does involve a thorough search and comparison of the lexicon, of phonological, morphological and morpho-phonological features, etc., and a search for *regular* (sound) changes or regular patterns, or at least changes which can be explained (as in the case of analogy, etc.).


Groetjes,

Frank


----------



## vince

Frank06 is right: the script in which a language is written does not necessarily have any relation to its language family.

For example, Vietnamese is written in the Latin alphabet, but it is not Indo-European. Mongolian is written in the Cyrillic alphabet, but it is not Indo-European.

You may say "but how about historically? Perhaps the script that the classical literature of the language is in reflects the language family?"

No. Japanese, Vietnamese, and Korean, in the early days, mainly wrote in Chinese but pronounced the characters in their native language's phonology (similar to today's Chinese "dialects"). Or, as a precursor to developing their own writing systems, they used Chinese characters to phonetically represent words in their own languages. Yet none of Japanese, Vietnamese, or Korean is related to the Chinese language family.


----------



## Anatoli

vince said:


> No. Japanese, Vietnamese, and Korean, in the early days, mainly wrote in Chinese but pronounced the characters in their native language's phonology (similar to today's Chinese "dialects"). Or, as a precursor to developing their own writing systems, they used Chinese characters to phonetically represent words in their own languages. Yet none of Japanese, Vietnamese, or Korean is related to the Chinese language family.


Agree with you, Vince. I only wish to comment that a huge number of words from Chinese have penetrated these languages in such a way that even new words are often coined using Chinese components, although the pronunciation of those words/components is significantly different from Middle Chinese or from any modern Chinese dialect, including Mandarin. Vietnamese is somewhat similar grammatically to Chinese; Japanese and Korean are very different grammatically from Chinese, but the percentage of borrowed Chinese words/components in Japanese and Korean is from 40% to 60% (depending on the source). A learned, high-level text in Japanese or Chinese could be somewhat comprehensible to a speaker of the other language.

--
All Slavic languages belong to the Indo-European group of languages; some grammar and vocabulary can be traced to other European groups, as in the above examples and many more. E.g. the Russian word "пескарь" (gudgeon) is related to English "fish", German "Fisch", French "poisson", etc.


----------



## vince

Anatoli said:


> Agree with you, Vince. I only wish to comment that a huge number of words from Chinese have penetrated these languages in such a way that even new words are often coined using Chinese components,



English and other European languages do the same thing with Latin and Greek compounds: words like nanotechnology, cryogenics, and genome certainly didn't exist in Greco-Roman Antiquity, but they are composed of individual Greek/Latin parts.



> although the pronunciation of those words/components is significantly different from Middle Chinese or from any modern Chinese dialect



Japanese, Vietnamese, and Korean pronunciations for Chinese-derived Chinese characters (i.e., not the Chinese character representations of native words) are usually closer phonetically to southern Chinese languages such as Min and Cantonese, due to the fact that Mandarin has changed so much phonologically.

It is amusing that you refer to Chinese languages as "dialects". Do you consider Polish and Czech to be the same language as Russian, since Russians can understand more Polish and Czech than a Mandarin speaker can understand of Cantonese (even when spoken slowly and in formal register)?



> , including Mandarin. Vietnamese is somewhat similar grammatically to Chinese; Japanese and Korean are very different grammatically from Chinese, but the percentage of borrowed Chinese words/components in Japanese and Korean is from 40% to 60% (depending on the source). A learned, high-level text in Japanese or Chinese could be somewhat comprehensible to a speaker of the other language.



None are genetically related, so any grammatical similarities are due to "influences" of neighboring languages. Russian and English have far more in common with Bengali and Persian than Japanese has with Korean, Mandarin, and Vietnamese.

However, vocabulary is a different thing. You are right about Japanese and Chinese texts being partially intelligible when written, despite having no genetic relationship. But if Japanese and the Chinese languages adopted fully phonetic writing systems, this intercomprehension would drop close to zero (like a Spanish/French speaker trying to read Basque).


----------



## chung

ronanpoirier said:


> Hey, I wanna know what are the criteria to separate the languages into families... why is this Indo-European and why is that Finno-Ugric? Is it related only to grammar, sounds and, sometimes, geographical area or there is something more?
> 
> Thanks in advance!


 
You observe a bunch of languages and see if there are patterns. Based on what you observe with the patterns, you try to figure out whether these similarities exist because the languages are related to each other (i.e. derive from a common source), because they have been in intense contact with each other for many years (i.e. leading to borrowing), or by mere coincidence.

After figuring out which similarities and differences you think arise from the fact that the languages derive from a common source, you work backwards and put them into language families.

There is also a certain element of process or politics involved in creating these family trees. There's always debate and argument among comparative linguists over classification and the methods used to classify or compare languages. Do a search on Google for "splitters" vs. "lumpers", "anti-Altaicists" and "Altaicists", or Joseph Greenberg and Merritt Ruhlen. In turn, the results can be used to make the speakers of one language seem special by association with another, prestigious language.


----------



## PianoMan

Plenty of it is geographical, and is based on the original main languages that conquered the regions surrounding them. For example, Proto-Indo-European was a language spoken by many people coming from Ukraine around 4000 BC; they conquered to the west, north, and southeast, giving the countries of those regions, relative to Ukraine, the Indo-European classification. The same is used for many other areas. This explains why Japanese is an independent language in family classification: because of its isolation when linguistics was taking hold in the world. In addition, Vietnamese had its own Chinese-character-based script, but it was abandoned and replaced with the Latin alphabet when the French conquered, which explains why they use a circumflex. Also, Mongolian has a traditional script that is written vertically and only started using Cyrillic when the U.S.S.R. annexed them.


----------



## übermönch

As said before, language families are established by comparison and tracing back to a common root. The most important features are not lexical but grammatical: how many grammatical "genders" a language has (e.g. male, female, neuter, or old, young, living, superior, inferior); how certain aspects such as future or possession are expressed; what is reflected on what (in Indo-European languages, e.g., the subject is reflected on the verb, vs. Basque, on the object); how respect is expressed; whether tone changes the meaning; whether there are other random features; basically, all the things that constitute a language! It's not hardcoded in our brain, therefore there are quite some differences between unrelated families!

The lexicon is one of the other, minor aspects. Hungarian, for instance, has as many Turkic words as Ugric ones; however, since it bears the features of a Ugric language, it's counted in the F-U family. Also, the Kartvelian, Dravidian and Indo-European families have some similar features, but the vocabulary is completely different; ergo, some linguists think they derive from one superfamily. The same is sometimes applied to Turkic, Mongolic, Ugric, Japanese, Korean and some others.



PianoMan said:


> Plenty of it is geographical, and is based on the original main languages that conquered the regions surrounding them. For example, Proto-Indo-European was a language spoken by many people coming from Ukraine around 4000 BC; they conquered to the west, north, and southeast, giving the countries of those regions, relative to Ukraine, the Indo-European classification. The same is used for many other areas. This explains why Japanese is an independent language in family classification: because of its isolation when linguistics was taking hold in the world. In addition, Vietnamese had its own Chinese-character-based script, but it was abandoned and replaced with the Latin alphabet when the French conquered, which explains why they use a circumflex.


Well, they began to write Chinese exactly the same way. The Greeks, for instance, borrowed their alphabetic writing from the Semitic Phoenicians and passed it through the isolated Etruscans to the Indo-European Italics. The writing really isn't related to the language family.


> Also, Mongolian has a traditional script that is written vertically and only started using Cyrillic when the U.S.S.R. annexed them.


That 'traditional' script cannot be traced to the Altaic roots of the Mongolian language: it was invented during the 13th century, based on existing writing systems. Also, the USSR didn't annex them. Don't know why they started using Cyrillic.


----------



## Outsider

übermönch said:


> Also, the USSR didn't annex them. Don't know why they started using Cyrillic.


Not explicitly, but, given their "conversion" to communism...


----------



## übermönch

Outsider said:


> Not explicitly, but, given their "conversion" to communism...


Ironically, most of the Central Asian republics first switched to Latin and only under Stalin to Cyrillic. Maybe those Mongolian comrades tried to do a favor.


----------



## PianoMan

Yeah, it was something like that: they adopted Cyrillic under the influence of their neighbors to the north when switching to communism. But overall, yes, the main factor in deciding the language family is the ancient "proto-language" that influenced surrounding areas, with more distinctive sub-languages then appearing in the influenced area. So in all, it's geographic in the sense that the areas around the initial birthplace of the mother tongue are the ones that are members of its linguistic family, and it's also grammatical in that the languages that follow would all have similar structural ties to their predecessor.


----------

