# Meaning in comparative-historical linguistics



## DenisBiH

Hello everyone!

I am not a professional linguist so I apologize in advance for a potentially silly / stupid question, and I thank you for your patience. 

I have read works on historical-comparative linguistics, mostly for Slavic and other Indo-European languages. I may be mistaken, but the primary basis for classifying languages seems to be (collections of) isoglosses that represent common sound shifts, together with their relative chronology, I guess. I may have stated this awkwardly, but pretty much every book that touches on historical-comparative linguistics has a table somewhere near the beginning showing cognates in different IE languages for the word *father* and the like, as an illustration for the reader, followed by an explanation of the sound shifts involved.

Now, words coming from some shared reconstructed root can have slightly, or not so slightly, different meanings in the daughter languages. This is understandable. 

But my question is - has anyone tried classification based not (only) on shared sound changes, but on shared/close/not-diverging-too-much meanings for words that can be established to originate from a single root?

I guess this might not make much sense in some comparisons (what would be the sense in comparing the meanings of Hittite and Celtic cognates for example), but let's say for grouping within certain language families such as Slavic.

Would there be any sense in, for example, taking the Swadesh list, finding the closest matches for the words in it that could be reconstructed in Proto-Slavic (closest matches by reconstructed meaning) and then going to daughter languages grouping them according to similarity/difference in meanings of cognates.

The reason I'm asking this (other than sheer curiosity) is the potential to get some further insight into cultural contacts between ancestors of speakers of these daughter languages. The idea would be, even when sound shifts start breaking apart a single speech community, those parts of that community that are in closer contact would (would they?) have a tendency to keep similar meanings for words that are frequently in use, due to frequent communication between speakers of their respective dialects. For those groups that are further apart, either geographically or by virtue of lesser contacts for any other reason (political etc) there would be no such force keeping the meanings close.

I guess even if one should try such a thing there is a strong possibility the classification would be no different than it is already, but maybe some additional information on past relationships could be obtained.

I hope I haven't bored you too much. I would welcome your opinions. 

If there is already a thread on this or a similar topic, I would love to read it.


----------



## humvee

Denis, I'm lost. Could you please summarize your question in one sentence?
I assume you mean semantic change. Or more precisely, semantic drift. Unlike syntax, semantic drift is totally unpredictable. "Nice" once meant "foolish" and now means "good". "Bad" in Black English means good.


----------



## DenisBiH

Well, in one sentence, I guess I'm talking about "classifying" closely genetically related languages according to the meanings of cognates, not only their forms.

Let me give an example. Someone goes into some distant part of the world and finds six tribes living in some area speaking their own dialects. One notes there are similarities that can only be explained by those dialects originating from the same proto-language. One goes about reconstructing such a language and classifying these six dialects into subgroups.

Let's say this is an example of cognates in those six languages

a) putuk walk
b) putuka run
c) bodoge walk
d) buduga walk
e) bogoda walk
f) badag run

Let's say one reconstructs *bodoga, with the meaning "walk" for the protolanguage. Let's say all the six languages are pretty similar in morphology/grammar.

Let's say the first two languages, apart from devoicing voiced stops, share a number of other sound changes from the reconstructed proto-language. Let's say the remaining four languages also, among them, share a number of sound changes.

Am I right in assuming that in these circumstances one would group the first two languages in one subgroup, and the remaining four in another?

However, once the proto-language has been reconstructed, one notes that languages b) and f), while having different forms, in some cases, *consistently* share the meaning of the words, contrasting them to the other four languages. These words are of the core vocabulary kind. These two languages either have preserved a meaning closer to the reconstructed meaning in the proto-language, or both share the same semantic shift.

One then goes about studying this in greater detail, to make sure this is not random or limited to a few choice words, but covers a large part of the vocabulary. The resulting conclusion that b) and f) share the meanings in a significant number of cases contrasting them to other four might not be of much interest to linguists, but it could be very interesting to historians trying to use linguistics to infer more about the past of the six tribes.
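To make the idea a bit more concrete, here is a minimal Python sketch of the check described above: for every pair of languages, count in what share of cognate sets the reflexes have the same meaning. Only the first cognate set comes from the example; the other two sets, the meaning labels, and the function name `semantic_agreement` are invented for illustration, and treating meanings as simply "same" or "different" is a strong simplifying assumption.

```python
from itertools import combinations

# Each cognate set maps a language label to the meaning of its reflex.
# The first set is the *bodoga example from this post; the other two
# are invented purely for illustration.
COGNATE_SETS = {
    "*bodoga": {"a": "walk", "b": "run", "c": "walk",
                "d": "walk", "e": "walk", "f": "run"},
    "hypothetical-1": {"a": "stone", "b": "hill", "c": "stone",
                       "d": "stone", "e": "stone", "f": "hill"},
    "hypothetical-2": {"a": "see", "b": "see", "c": "see",
                       "d": "know", "e": "see", "f": "see"},
}


def semantic_agreement(cognate_sets):
    """Return {(lang1, lang2): share of cognate sets with identical meaning}."""
    languages = sorted(
        {lang for reflexes in cognate_sets.values() for lang in reflexes}
    )
    scores = {}
    for first, second in combinations(languages, 2):
        # Only count sets in which both languages actually have a reflex.
        shared = [r for r in cognate_sets.values() if first in r and second in r]
        if not shared:
            continue
        same = sum(1 for r in shared if r[first] == r[second])
        scores[(first, second)] = same / len(shared)
    return scores


if __name__ == "__main__":
    ranked = sorted(semantic_agreement(COGNATE_SETS).items(),
                    key=lambda item: -item[1])
    for pair, score in ranked:
        print(pair, round(score, 2))
```

With this toy data, b) and f) agree in every set even though the sound changes put them in different subgroups. Whether such a pattern means anything would depend on having far more cognate sets than three.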

Now, my question - has this been done anywhere, for any group of languages in the world, in a consistent fashion (not just randomly noting shared semantic shifts in an etymological dictionary)? Is there a field of linguistics that deals with this? Judging by the description given here, I would say I'm talking about something that might be "historical & comparative semasiology", if it existed.


----------



## DenisBiH

Frank, I'm sorry about a second post, but I would like to keep this example separate from the post above if at all possible. If not, then you may join it to the post above.

Here's an example:



> Meillet and Vaillant considered that the semasiological development of the Proto-Slavic word for 'god' was an Iranianism. In both Slavic and Indo-Iranian, the root that denotes 'deity' also denotes 'wealth, share' (Proto-Slavic *_bagu_ > Common Slavic *_bogъ_) and Indo-Iranian (Old Persian _baga_, Sanskrit _bhága_).[7] However, they did not argue that the Proto-Slavic root itself was a borrowing, despite its similarity to the Old Persian and Sanskrit roots.


So basically, in this case, something is considered an Iranianism in Slavic not necessarily because of the form of the word, but because of the development in meaning. Now, rather than simply noting this in a few cases, has anyone tried to do this systematically, not necessarily for Slavic-vs-Iranian, but for any language?

Or to put it in yet another form, is there a discipline within linguistics that specifically studies "*semantic isoglosses*", as used in this article by Vyacheslav Ivanov?


----------



## berndf

DenisBiH said:


> Now, my question - has this been done anywhere, for any group of languages in the world, in a consistent fashion (not just randomly noting shared semantic shifts in an etymological dictionary)?


Not to my knowledge; not in a systematic way. As humvee noted earlier, semantic changes tend to be much more erratic than phonetic shifts and are therefore in general less useful for deriving robust classifications.


----------



## sokol

I also cannot remember any systematic classification which takes semantic changes _*into account*_ (in my opinion it would not make much sense to base one on semantics primarily - semantics is just too erratic for that, to use berndf's wording).

But I see how it could be relevant to _include_ semantics in language classifications with special reference to Slavic languages. 
However, one would have to be very careful about conclusions drawn from semantics, as calques often give a wrong picture. For example, Slovene introduced Czech loans and calques, like "vlak", which was even introduced into Croatian. Words like "vlak" might lead you to think that Slovene is relatively closely related to Western Slavic. That happens to be the case, but you would have arrived at the conclusion for the wrong reasons: one shouldn't cite the word "vlak" here but rather phonological isoglosses.


----------



## DenisBiH

Thank you very much berndf and sokol for your insights. 

sokol, the idea actually occurred to me last night while reading the False friends thread on the Other Slavic Languages subforum, and it is becoming clearer only now that we are discussing it (I thank you for that). I definitely felt uneasy about using the term _classification_ when writing; I even used quotes once or twice, but still chose to use it as I couldn't find a better word.

I guess one could not make a classification similar to the one existing now, but maybe a map (or better yet, a GIS map) of the more important semantic isoglosses would show some more patterns?

The example you mention, "vlak", is interesting. I didn't know it was a borrowing/calque, actually. Anyway, here's the thing - a comparison could be limited to cognates that are reflexes of Proto-Slavic core vocabulary, thus minimizing the occurrence of "trains" and hoping that strength indeed lies in numbers, i.e. that a comparison of a number of words/meanings would level out those oddities here and there. Though I guess one could debate what would constitute core vocabulary, and what would be a reflex to be examined, given how Slavic languages make heavy use of prefixes on inherited roots to form new meanings.

But let's say vlak does end up in the comparison. No, any conclusion on a close _genetic_ relationship between Croatian and Czech (i.e. closer than is already established, both being Slavic) would be wrong. But a conclusion on _mutual influences_ between Czechs-Slovenians-Croats in a previous time period would be valid. Now that's probably not much use to us now, but let's say for a historian in a thousand years...if he had limited other sources (which is the case for early Slavic history), that might actually be a pretty important piece of information.

So, from the strict perspective of a linguist, I guess, classification or rather "classification" based on semantic isoglosses would be hard to do, if not impossible, and probably pointless, but a study that ends up detailing some of those more important semantic isoglosses (if such exist) in an accessible manner could be gold for researchers of the past.

Now, one could argue that etymological dictionaries and books and articles such as the one by Ivanov indeed already have this information, one only needs to extract it. But, for a non-linguist knowing what is important and what less important in terms of semantic shifts would be much harder, if it is already hard enough for a linguist. And the learning curve needed to be comfortable with historical & comparative linguistics literature can be rather steep.

But anyway, let me stop now. I thank you again for your input, dear sirs.


----------



## sokol

Well, Denis, between Slavic languages there have been many borrowings - loans and calques in some cases (that's the case with "vlak"), and others which worked on an "involuntary" basis, just through cultural influence (an interesting word here is Czech "musit", which travelled quite far; I think even Ukrainian has it, and Slovak and Polish - musieć - certainly do; originally it is a German loan, borrowed first by Czechs, I think, and handed on by them to other Slavic languages).

So what would be interesting concerning _*genetic*_ relationship only would be those similarities in meaning which are _*not*_ due to loans occurring at a later stage; of course, those loans still would be interesting - but not in the genetic sense.

Concerning Slovene I can tell you that there has been something of a "cultural influence" of Czech on Slovene in the 19th century: for Slovene nationalists of the 19th century the (then) yet more stabilised and much stronger Czech language culture was exemplary, they not only took over loans and calques but also the concept of how to "defend" (for want of a better word) their tiny language against the (then, in the Austrian Empire) all-dominating German language.

So there's a reason why Czech loans should appear in Slovene - which go back to the 19th century.
And it seems that Croats partially also borrowed both the concept of language planning and loans/calques from Czech, either directly or probably through Slovene influence - that I don't know (you probably can find something about this in literature about the Croatian language in the 19th century).

But what is _*really*_ interesting from a genetic point of view is that e. g. Slovene has retained some Old Church Slavonic meanings while neither Western Slavic nor their southern neighbours (BCS) have. Unfortunately I cannot remember any for now, I think (!) there are some which have cognates in Russian; however, that again might lead you on the wrong track - the meaning might have been preserved in Slovene, but in Russian it might be a loan from Old Church Slavonic (from which there was heavy borrowing).

So you see again, semantics is really tricky if you want to make use of it for genetic classification (and yes, one can use this word as long as one's aware of the pitfalls semantics might hold).


----------



## koniecswiata

Many European languages that are not derived genetically from Latin (as French or Spanish are) have created many calques based on Latin, due to cultural influence. German, and following it the Scandinavian and Slavic languages, have "tons" of vocabulary based on Latin calques. You can literally start taking verbs apart and notice how the same (or a similar) prefix is used with a root. Example: Eindruck = Impression (sorry, the English version of the Latin word).
However, this would just show the cultural force of Latin on all those languages. I suppose one could create cultural groupings around this--"Semantically Latin-derived languages". Though, this would just show a cultural influence.


----------



## DenisBiH

koniecswiata said:


> Many European languages that are not derived genetically from Latin (as French or Spanish are) have created many calques based on Latin, due to cultural influence. German, and following it the Scandinavian and Slavic languages, have "tons" of vocabulary based on Latin calques. You can literally start taking verbs apart and notice how the same (or a similar) prefix is used with a root. Example: Eindruck = Impression (sorry, the English version of the Latin word).
> However, this would just show the cultural force of Latin on all those languages. I suppose one could create cultural groupings around this--"Semantically Latin-derived languages". Though, this would just show a cultural influence.




This is a very interesting and important point. One BCS word for impression is *utisak*, u (in) + tisak (pressing, mostly used today in Croatian, in the meaning press, as in newspapers). But on the other hand there is the word *dojam*, deriving from Common Slavic *dojьmъ, again a compound do + imati, but a Common Slavic one, it seems. Now I would like to see how *dojьmъ developed in meaning in Slavic languages.

Etymological dictionaries (and the entire historical&comparative linguistics) and text corpora analysis could provide some guidance as to which words are later calques on Latin, German or some other language. These could be excluded from the comparison.


----------



## sokol

DenisBiH said:


> Etymological dictionaries (and the entire historical&comparative linguistics) and text corpora analysis could provide some guidance as to which words are later calques on Latin, German or some other language. These could be excluded from the comparison.


Exactly, you must at first exclude any possible calques and even "loaned semantics" (in some cases, the semantics of a Latin word were "transferred" to an already existing native word, so a change of meaning influenced by Latin happened; again, sorry but I have no good example at hand).

So to work like that really would be very hard work indeed; you need to do a lot of work before you can isolate the semantic "core" which is not loaned in any way, and thus might indicate genetic relationship.


----------



## DenisBiH

sokol, thanks for the interesting post. 

What I've been thinking about is maybe not far away from some of the points you made in your post.

Assuming one could successfully eliminate all the recent meaning-loans/calques which are not that interesting either for the study of the more distant cultural influences or some genetic relationships, what would one do next if it was shown there were layers upon layers of meaning-borrowings? How would one separate those layers for any sort of meaningful analysis?

Your example with Slovene and Russian is a good one. Basically, without knowing which words belonged to which layers one would be comparing apples and oranges. Maybe focusing on dialects more than on standard/literary languages could help a bit to eliminate high-culture meaning borrowings/calques? Focusing on dialects would also enable much better precision (and usefulness) if this data were to be plotted on a map. I may be wrong, but I think modern Geographic Information Systems have spatial analysis features, such as cluster detection, that would maybe eliminate the need to do the analysis fully manually. What would be a very tedious task then would be getting the words of the core vocabulary, getting their meanings in the various dialects, and then transforming this into GIS maps (isoglosses as polygons, maybe). From there, hopefully, one could run the spatial analysis and check the results for interesting patterns (but I'm not a GIS expert).
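Just to illustrate what I mean by cluster detection, here is a rough Python sketch that does not use any real GIS software. It takes survey points with a recorded meaning for one word and groups neighbouring points that share a meaning into "areas", which is roughly what a semantic isogloss would enclose. The points, coordinates, the `max_gap` threshold and the function name `meaning_areas` are all invented for illustration; a real study would presumably rely on a proper GIS package and its spatial statistics.

```python
from math import hypot

# Invented survey points: (label, x, y, meaning of the reflex).
# Coordinates are arbitrary planar units, not real dialect locations.
POINTS = [
    ("p1", 0.0, 0.0, "walk"),
    ("p2", 1.0, 0.5, "walk"),
    ("p3", 1.5, 1.0, "run"),
    ("p4", 8.0, 8.0, "run"),
    ("p5", 8.5, 7.5, "run"),
    ("p6", 9.0, 8.5, "walk"),
]


def meaning_areas(points, max_gap):
    """Group points that share a meaning and lie within max_gap of a neighbour."""
    areas = []  # each area is a list of (label, x, y, meaning) tuples
    for point in points:
        _, x, y, meaning = point
        # Existing areas with the same meaning that this point is close to.
        touching = [
            area for area in areas
            if area[0][3] == meaning
            and any(hypot(x - px, y - py) <= max_gap for _, px, py, _ in area)
        ]
        # Merge the new point with every area it touches.
        merged = [point]
        for area in touching:
            merged.extend(area)
        areas = [a for a in areas if not any(a is t for t in touching)]
        areas.append(merged)
    return [sorted(label for label, *_ in area) for area in areas]


if __name__ == "__main__":
    for area in meaning_areas(POINTS, max_gap=2.0):
        print(area)
```

In this toy data the "run" meaning shows up in two separate places (p3 on its own, p4 and p5 together), which is the kind of pattern one would then want to explain: shared retention, shared innovation, or later contact.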

I'm rambling again so better stop. I see how this could prove to be a very difficult task for a larger number of words.


----------



## sokol

Well, Denis, you summed it up quite well - it might be extremely difficult to separate layers; especially (your post reminded me of this!) as there exist different words and meanings in different dialects of the same language - Slovene certainly would offer plenty of cases for that one.

It could be done still, and it might give insights, but it will be always difficult to be sure about which meanings were original and which weren't - even with sounds this oftentimes is difficult to decide, and meanings usually are even more evasive.


----------



## koniecswiata

It would be interesting to go through all the "semantic layers of borrowing". However, even if the calques are based on another language, I don't see them as being somehow foreign, borrowed, or "intrusive". They demonstrate the cultural history of a language and are an integral part of it. I'm generally opposed to the term "borrowing" anyways. For the native speaker of any language, all words that they grow up with are a natural part of the language for them.


----------



## sokol

koniecswiata said:


> It would be interesting to go through all the "semantic layers of borrowing".  However, even if the calques are based on another language, I don't see them as being somehow foreign, borrowed, or "intrusive".


Yes, of course loans and calques are natural parts of language. My post wasn't about this but about the fact that loans and calques do not indicate genetic (but rather cultural) relationships.  (Which of course would be an interesting topic in itself.)


----------

