# What do linguists need to prove etymology?



## Encolpius

Hello, first of all I am not a linguist, so I haven't got the slightest idea how professional linguists work. We had a small discussion in the Hungarian forum, but since it is off-topic there, I am trying to find a proper response here.

I think even a layman does not need much evidence to prove the origin of the word goulash, vodka, samurai or other culture-related words. I also think it is no problem to find the origin of the word robot, since in the 1920s the Czech writer Karel Čapek introduced the word.

But how about older words, where the only evidence is written documents or maybe nothing at all? Here is an example: the origin of the Hungarian word szalma [straw]. The dictionary says it is of Slavic origin. OK, you can accept it; if you speak any Slavic language you know it is a similar word. The word is dated back to the 14th century in Hungarian. I cannot imagine what evidence the scientists found to be sure it is of Slavic origin. They found a document from the 14th century, then what?

On the other hand, I was told that there can be two similar words in two languages, e.g. Hungarian vidék and Slovak vidiek, and yet my dictionary says the origin is uncertain. How come they are certain in the first case and not certain in the second one?

How do linguists prove the origin of a word? Can they be 100% sure? How sure are they? 90%?

Thanks.


----------



## Riverplatense

Encolpius said:


> I think even a layman does not need much evidence to prove the origin of the word goulash, vodka, samurai or other culture-related words



Well yes, it's not difficult to find their origin, yet the question of etymology is not settled by that alone, of course. But you are right if it's only about the immediate origin of a word.



Encolpius said:


> if you speak any Slavic language you know it is a similar word



Usually it's not that easy. If there's a _similar_ word it might be a coincidence, too, or, more probably, both the Slavic language and Hungarian may have imported the word from another language. So someone who doesn't know Romanian might find a Hungarian word with a formal and semantic counterpart in a Slavic language and think it must come from Slavic, even though it's actually Romance. So you also have to know the etymology of the Slavic word the Hungarian word is supposed to be derived from.



Encolpius said:


> I cannot imagine what evidence the scientists found to be sure it is of Slavic origin.



If you consider the phonological development of Hungarian (including sound substitution, which is also important when adopting words from another language) and the typical structure of its words, and then the etymology of the Slavic word you can find pretty good evidence.



Encolpius said:


> Can they be 100% sure?



Since in most cases there's no 100% continuity, one can't be 100% sure, above all in languages where there's no long or continuous written tradition. In Italian you have a very high ratio of relatively certain etymologies, while in German there are more doubtful cases, and in a language like Basque you just can't go back the same 2500–3000 years, so reconstruction is more important.


----------



## ahvalj

_Szalma_ resembles the reconstructed Late Common Slavic *_salmā_ "straw", which has phonetically regular counterparts in other Indo-European languages, e.g. in Ancient Greek (can't type Greek letters from my phone, so no examples), which makes it unlikely that it was borrowed the other way, from Hungarian into Common Slavic. Plus, most probably this word is absent in other Uralic languages. So, we have a well-grounded Slavic word vs. an isolated Hungarian one: the conclusion is that it was borrowed into Hungarian and not vice versa. That's how etymology works in this case.


----------



## Encolpius

Ahvalj, so you do not need any written evidence to prove the origin of the Hungarian word szalma. You know it does not work like: in 1345 written in a Codex: "now we Hungarians started to use the word szalma which we heard from a Croatian peasant last Monday"...


----------



## ahvalj

Encolpius said:


> Ahvalj, so you do not need any written evidence to prove the origin of the Hungarian word szalma. You know it does not work like: in 1345 written in a Codex: "now we Hungarians started to use the word szalma which we heard from a Croatian peasant last Monday"...


Strictly speaking, nobody can prove that the Hungarian word was borrowed from Slavic: this similarity may be a tragic coincidence, but one can prove that *_salmā_ in Slavic was inherited.


----------



## Encolpius

Yes, that is what I have thought. And do you have any experience how old, ancient documents can help to find the etymology?


----------



## ahvalj

Encolpius said:


> Yes, that is what I have thought. And do you have any experience how old, ancient documents can help to find the etymology?


Old documents may contain older forms of the words, lost cognate words and lost additional meanings: each of these can help to clarify the etymology.

Also, since the evolution of unrelated languages is often very different, the recipient language sometimes preserves the ancient shape of the word, no longer discernible in the donor language (_szalma_ with the unmetathesized _al_ is a nice example; Finnish preserves such words as _kuningas_ and _lusikka_ vs. modern Swedish _kung_ and Russian _ložka_).


----------



## fdb

Etymology is not just about “old documents”; it is mainly about regular sound laws. The word for “straw” appears for example as:

- Greek _κάλαμος_
- Latin _culmus_
- German _Halm_
- Old Church Slavonic _slama_
- Old Prussian _salme_

These are all more or less identical apart from the first consonant. But there are lots of other words that have k- in Greek and Latin, h- in Germanic, and s- in Slavic and Baltic. For these, linguists have reconstructed a Proto-Indo-European form with *k̑-, which develops into k-, h-, s- etc. in the daughter languages. Thus, Slavic _slama_ has a perfectly regular IE etymology.
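To make the idea of a regular correspondence concrete, here is a minimal Python sketch. It only encodes the initial-consonant pattern described above (k- in Greek and Latin, h- in Germanic, s- in Slavic and Baltic) and checks the five "straw" words against it. The romanization `kalamos`, the branch labels and the crude first-letter test are assumptions made for illustration; real comparative work looks at every segment of the word across many word sets, not just the first letter of one.

```python
# Illustrative only: expected initial reflexes of PIE *k̑ per branch,
# as summarised in the post above.
REFLEXES = {
    "Greek": "k",
    "Latin": "k",      # spelled <c> in Latin orthography
    "Germanic": "h",
    "Slavic": "s",
    "Baltic": "s",
}

# (language, branch, romanized form); forms are taken from the list above,
# with Greek κάλαμος romanized as "kalamos" (an assumption of this sketch).
COGNATES = [
    ("Greek", "Greek", "kalamos"),
    ("Latin", "Latin", "culmus"),
    ("German", "Germanic", "Halm"),
    ("Old Church Slavonic", "Slavic", "slama"),
    ("Old Prussian", "Baltic", "salme"),
]


def initial_sound(word):
    """Return the first letter, lower-cased, reading Latin <c> as k."""
    first = word[0].lower()
    return "k" if first == "c" else first


def is_regular(branch, word):
    """True if the word starts with the reflex expected for its branch."""
    return initial_sound(word) == REFLEXES[branch]


for language, branch, word in COGNATES:
    print(f"{language:20} {word:8} regular: {is_regular(branch, word)}")
```

A form that broke the pattern, say a Slavic word beginning with k-, would fail this check, and that kind of mismatch is what makes an etymologist suspect a loan or an unrelated word.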


----------



## Encolpius

fdb said:


> Etymology is not just about “old documents”; it is mainly about regular sound laws. The word for “straw” appears for example as:...



I see. Is it possible to say, to guess in what percentage do documents help scientists? Or do they mainly stick to the rules of historical development? 
But there are a few exceptions in all rules, right?


----------



## ahvalj

Encolpius said:


> I see. Is it possible to say, to guess in what percentage do documents help scientists? Or do they mainly stick to the rules of historical development?
> But there are a few exceptions in all rules, right?


On this forum you can read the conversations between M. de Blois and Iranic-speaking posters: that's how etymology works in practice. Some languages are better investigated and overall more transparent, others are not, so it is hard to generalize. Hungarian is an isolated language in its current location, so studying the etymology of its words has both advantages and disadvantages.


----------



## francisgranada

Encolpius said:


> ... Is it possible to say, to guess in what percentage do documents help scientists? ...


In my opinion it varies from case to case ... (the lack of written documents may also be important information).

Two _ad hoc_ examples for illustration that may give some idea about the importance of the documented sources:

Example 1 - _gomba_ (mushroom, fungus)
If we knew only today's forms of this noun, we perhaps wouldn't consider the Hungarian word _gomba_ to be of Slavic origin, as the modern Slavic counterparts are mainly _huba_, _guba_ (_goba_ in Slovenian). However, according to old Slavic written documents, prior to the 10th/11th century (depending on the specific area) this word sounded approximately like _gomba_ in the Slavic languages (still preserving the nasal _ǫ_). At the same time, this word is documented in Hungarian from the 12th century, indicating that it is not a recent creation or loan. Conclusion: the Slavic origin is highly probable (also taking into consideration the presence of other loanwords of Slavic origin that maintain the nasal vowels, of course).

Example 2 - _dézsmál_ (to pilfer)
Judging by today's form and meaning of this verb, we would probably consider it of "unknown origin". However, the noun _dézsma_ (old spelling _desma_) is documented from the 13th century with the meaning of a certain "tax/tribute corresponding to the tenth part of the income/takings/etc". This suggests that _dézsma_ ultimately comes from the Latin _decima_ (tenth part). As it is difficult to derive _dézsma_ phonetically directly from the medieval/ecclesiastical Latin form, we also have to take other factors into consideration, namely other documented loanwords with similar phonetics and the (documented) presence of Italian monks in medieval Hungary. Conclusion: the noun _dézsma_ is (with very high probability) of Northern Italian provenance; the verb _dézsmál_ is a later Hungarian creation/derivation.

P.S. Out of curiosity, see also the Spanish words _diezmo_, _diezma_.


----------



## francisgranada

ahvalj said:


> _Szalma_ resembles the reconstructed Late Common Slavic *_salmā_ "straw" ...


In this case, the first _*a*_ in _sz*a*lma_ is rather a Hungarian solution to resolve the "forbidden" consonant cluster at the beginning of the word (more or less: _szlama > *szalama > szalma_). I.e. the word _szalma_ is supposed to come from the Slavic variant _slama_. An analogous example is _szreda_ > _szereda_ > _szerda_. (The form _szereda_ is documented.)

P.S. The Hungarian _sz_ is pronounced [s].
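The two-step path proposed in this post (break up the initial cluster, then drop the middle vowel) can be sketched in a few lines of Python. This is only a toy model fitted to the two examples given here: the function names are hypothetical, the choice to copy the word's first vowel as the inserted vowel is an assumption made to reproduce _szalama_ and _szereda_, and real loanword adaptation in Hungarian is not claimed to follow exactly these rules.

```python
VOWELS = set("aeiou")


def break_initial_cluster(word):
    """Insert a vowel after the first consonant of an initial cluster.

    Assumption of this sketch: the inserted vowel is a copy of the first
    vowel of the word (szlama -> szalama, szreda -> szereda).
    """
    # The digraph <sz> spells one consonant, [s], in Hungarian.
    onset = "sz" if word.startswith("sz") else word[0]
    rest = word[len(onset):]
    if rest and rest[0] not in VOWELS:
        first_vowel = next((ch for ch in rest if ch in VOWELS), "")
        return onset + first_vowel + rest
    return word


def drop_second_vowel(word):
    """Delete the second vowel of a word with three or more vowels."""
    positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(positions) >= 3:
        cut = positions[1]
        return word[:cut] + word[cut + 1:]
    return word


def derive(word):
    """Return the chain: borrowed form, cluster broken, middle vowel lost."""
    step1 = break_initial_cluster(word)
    step2 = drop_second_vowel(step1)
    return [word, step1, step2]


for source in ("szlama", "szreda"):
    print(" > ".join(derive(source)))
```

The point of writing it out is that the same two rules, applied in the same order, account for both words, which is the kind of regularity that makes the proposed path plausible.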


----------



## ahvalj

francisgranada said:


> The first _*a*_ in _sz*a*lma_ in this case is rather a later Hungarian solution to resolve the "forbidden" consonant cluster at the beginning of the word (more or less: _szlama > *szalama > szalma_). An analogous example is _szreda_ > _szereda_ > _szerda_ (the form _szereda_ is documented).
> 
> P.S. The Hungarian _sz_ is pronounced [s].


Yet both _*salmā_ and _*serdā_ were contemporary forms at the time of the Hungarian invasion. Shevelov, G. Y. (1964), _A Prehistory of Slavic: The Historical Phonology of Common Slavic_, p. 417, writes:


> It is also since about 860 that German, It and Gr documents and chronicles which fix some Sl names of persons and places contain the forms with metathesis. Before that time, e. g., Byzantine chronicles mentioned Βαλδίμερ (Georgius Monachus Continuatus), Δαργαμηρός (Theophanus, _Chronographia_), Περσθλάβαν (Scylitzes-Cedrenus, _Historiarum Compedium_), cf. Bg _Vladimir, Dragomir, Preslav._ The turning point in the rendition of names of that type was about 860. This also applies to the Sn area. It is in 860 that the name _Trebinam_ is recorded (_*Trěbĭnā_), in 864 _Zebedrach_ (_*Sobědragъ_) in Salzburg charters, etc.



Thus, _szereda_ may have been a parallel East Slavic form and/or the intermediate stage within Hungarian that you described.


----------



## Angelo di fuoco

fdb said:


> Etymology is not just about “old documents”; it is mainly about regular sound laws. The word for “straw” appears for example as:
> 
> - Greek _κάλαμος_
> - Latin _culmus_
> - German _Halm_
> - Old Church Slavonic _slama_
> - Old Prussian _salme_


I thought Latin had calamus... (lapsus calami)


----------



## fdb

culmus is the "genuine" Latin word; calamus is a loan from Greek in the meaning "reed for writing".


----------

