# Romance Languages: origin of the definite article



## ronanpoirier

In my French grammar (dating from 1930!) there are a few lines about the origin of the French language, and it says that since Latin had no articles, the neo-Latin languages must have borrowed them from another language or group of languages. Some theories pointed to Arabic and another to the Germanic languages. There was another one, but I can't remember it.

So what do you people think?


----------



## vince

The definite articles come from the Latin demonstratives, and the indefinite article comes from the number "one" in Latin (which is why the word for "a" and the word for "one" look alike or nearly alike in Romance languages).

ille and illa ("that"):
Le la (French)
El La (Spanish)
O A (Portuguese)
El La (Catalan)
Il La (Italian)

Sardinian has its own definite articles, unrelated to these.

An interesting fact is how many Romance languages say _si_ for "yes". This of course comes from Latin _sic_. But French oui comes from _hoc ille_ and Occitan oc from _hoc_.


----------



## modus.irrealis

Hi,

I don't know if another language gave the impetus to the Romance languages to use a definite article, but the definite article itself was formed out of Latin material. The Romance languages, at least the ones I'm familiar with, all use forms of the Latin word _ille_ "that." One big difference I know of is that Romanian puts it after the noun while the other languages put it before.

But this also seems to be a very regular development. Both English and Greek produced their definite article out of a word meaning "that," so maybe there's no need to think that there was some influence from another language on the Romance languages.


----------



## MarcB

vince said:
			
		

> Sardinian has its own definite articles, unrelated to these.


Sardinian su sa. Corsican  u  a both related to Portuguese


> An interesting fact is how many Romance languages say _si_ for "yes". This of course comes from Latin _sic_. But French oui comes from _hoc ille_ and Occitan oc from _hoc_.


French has si meaning yes as an answer to a negative question.


----------



## robbie_SWE

As was said, Romanian puts its definite articles at the end of the noun. But Romanian does have indefinite articles, which express gender (in Romanian there are three genders: masculine, feminine and neuter).

*Un barbat (m.)* = a man 
*O floare (f.)* = a flower
*Un tren (n.)** = a train 

_*(N.B. Romanian is the only Latin language that kept the Latin neuter. Personally I have a very hard time distinguishing between masculine and neuter, because they use the same article)._ 

To express the definite article, you have to add an ending to the noun according to the gender. The endings are formed according to the Latin "_ille_" and "_illa_" (but adapted "Vulgar Latin" forms!). 

*un barbat (m.)* → *barbatul* = the man
*o floare (f.)* → *floarea* = the flower
*un tren (n.)* → *trenul* = the train

Latin did not have a word for yes, so the Romanian equivalent became "*da*". But you can also say "*desigur*" (also meaning "_yes_", but a bit more affirmative). "*Desigur*" is constructed de + sigur ("_sigur_" coming from the Greek word "*síghuros*"). 

 robbie


----------



## vince

What is the origin of Romanian _da_? Is it from Latin? Or is it of Slavic origin (like Russian da).


----------



## robbie_SWE

vince said:
			
		

> What is the origin of Romanian _da_? Is it from Latin? Or is it of Slavic origin (like Russian da).


 
Romanian "da" comes from its Slavic neighbours. But in Romanian you will also see many instances of "*și*" (an _s_ with a little tail under it). "Și" in Romanian means "and" (like the French "et"). It derives from the Latin "sic".


----------



## modus.irrealis

robbie_SWE said:
			
		

> Latin did not have a word for yes, so the Romanian equivalent became "*da*". But you can also say "*desigur*" (also meaning "_yes_", but a bit more affirmative). "*Desigur*" is constructed de + sigur ("_sigur_" coming from the Greek word "*síghuros*").


And Greek borrowed that from Venetian, so you still get a Latin source in the end.



> _*(N.B. Romanian is the only Latin language that kept the Latin neuter. Personally I have a very hard time distinguishing between masculine and neuter, because they use the same article)._


I've had a question for a while about the neuter gender in Romanian. Is it the case that neuter nouns act like masculine nouns in the singular and feminine nouns in the plural, or are they set apart completely?


----------



## robbie_SWE

That's a tough one! 



> _More specifically, in Romanian, neuter nouns behave *in the singular as masculine nouns* and *in the plural as feminine nouns*. As such, all noun determiners and all pronouns only have two possible gender-specific forms instead of three. From this perspective, one can say that in Romanian there are really just two genders, masculine and feminine, and the category labeled as neuter contains nouns whose gender switches with the number._


 
So, you're totally correct in your statement! 

PS: thank you for the information about "desigur". Had no idea, but it's fascinating! 

Hope this helped! 

 robbie


----------



## karuna

To me articles seem such a radical change in a language that it is hard to imagine it happening without external influence from another language.

When the first Christian missionaries from Germany started to translate Biblical texts into Latvian they hadn't learned Latvian very well and tried to use the demonstrative pronoun as the definite article in their translations. Even the modern Bible translation by tradition uses _tas Kungs_ ("the Lord", literally "that Lord"). But it sounds very stupid and serves as a living example that articles are incompatible with the Latvian language.


----------



## Outsider

MarcB said:
			
		

> Sardinian su sa. Corsican u a both related to Portuguese


If I'm not mistaken, some dialects of Italian use the articles _la, lo_ (pronounced [lu]). So maybe the source is not Portuguese, but some non-standard dialect of Italian. Or it could be an independent development...


----------



## MarcB

I meant that it is a development similar to the one in Portuguese, but of course Latin is the origin, not Portuguese.


----------



## cajzl

The Sardinian su/sa came from the Latin demonstrative ipse/ipsa/ipsum.


----------



## alitza

robbie_SWE said:
			
		

> Latin did not have a word for yes, so the Romanian equivalent became "*da*". But you can also say "*desigur*" (also meaning "_yes_", but a bit more affirmative). "*Desigur*" is constructed de + sigur ("_sigur_" coming from the Greek word "*síghuros*").


 
Personally, I wouldn't put an equals sign between "da" and "desigur", just as "yes" and "of course" are not synonyms in English. I'd rather translate "desigur" as "of course", and its synonym would be "bineinteles". The latter is a compound adverb, where "bine" means "well" and "inteles" means "understood".


----------



## danielstan

While Classical Latin (spoken by educated people in the 1st century BC and used in writing by Cicero and others) had no definite article,
all of today's Romance languages have one.
Linguists have put forward many hypotheses about the origin of this important feature. I will state my opinion based on the theories I have read so far.

Because all Romance languages have the definite article, we deduce that this feature developed before the fall of the Western Roman Empire in *476 AD* (conquered by Odoacer).
The text _Peregrinatio Aetheriae_ (or _Itinerarium Egeriae_), thought to have been written around *380 AD*, is significant evidence of the emergence of a definite article in late Vulgar Latin.
Excerpts from this text show the demonstrative pronoun _*ille*_ in enclitic position (after the noun):



> 1. ... Interea ambulantes peruenimus ad quendam locum, ubi se tamen montes _illi_, inter quos ibamus, aperiebant et faciebant uallem infinitam ingens, planissimam et ualde pulchram et trans uallem apparebat mons sanctus Dei Syna. Hic autem locus, ubi se montes aperiebant, iunctus est cum eo loco, quo sunt memoriae concupiscentiae. In eo ergo loco cum uenitur, ut tamen commonuerunt deductores sancti _illi_, qui nobiscum erant, dicentes: consuetudo est, ut fiat hic oratio ab his qui ueniunt, quando de eo loco primitus uidetur mons Dei: sicut et nos fecimus. Habebat autem de eo loco ad montem Dei forsitan quattuor milia totum per ualle _illa_, quam dixi ingens.



but also in proclitic positions (before the noun):



> 2. ... _Illud_ sane satis admirabile est et sine Dei gratia puto illud non esse, ut cum omnibus altior sit _ille_ medianus, qui specialis Syna dicitur, id est in quo descendit maiestas Domini, tamen uideri non possit, nisi ad propriam radicem _illius_ ueneris, ante tamen quam eum subeas; nam posteaquam completo desiderio descenderis inde, et de contra _illum_ uides; quod antequam subeas, facere non potes.



It is conceivable that this important evolution of spoken Latin took place under the influence of a foreign language which already had a definite article.
The most likely candidate is German (of course, the German language spoken in Antiquity). The Germans taken prisoner and moved inside the Roman Empire, or the Germanic tribes at the frontiers which were settled as _foederati_, could have brought their definite article as a _linguistic calque_ into the Vulgar Latin they learnt.

Another ancient language which could have offered Latin a definite article is Greek, but in Greek there is a single article (no distinction between definite and indefinite article) and its usage differs from that in today's Romance languages.


----------



## wtrmute

robbie_SWE said:


> _*(N.B. Romanian is the only Latin language that kept the Latin neuter. Personally I have a very hard time distinguishing between masculine and neuter, because they use the same article)._



Not quite; Asturian also has it, although it is not a national language.

As for the definite article, even in Homeric times the Greek article worked like a weak demonstrative, so the same thing happening 1000 years later to Latin isn't so far-fetched.


----------



## danielstan

robbie_SWE said:


> _*(N.B. Romanian is the only Latin language that kept the Latin neuter. Personally I have a very hard time distinguishing between masculine and neuter, because they use the same article)._
> 
> To express the definite article, you have to add an ending to the noun according to the gender. The endings are formed according to the Latin "_ille_" and "_illa_" (but adapted "Vulgar Latin" forms!).
> 
> *un barbat (m.)  barbatul* = the man
> *o floare (f.)  floarea* = the flower
> *un tren (n.)  trenul* = the train


In Romanian one can distinguish the neuter from the other genders by the following alternative rules:
1)
*un* băiat (m.) - *doi* băieț*i*
*o* floare (f.) - *două* flor*i*
*un* tren (n.) - *două* tren*uri*
-------
N.B. the numerals 1 and 2 have different forms for the masculine and feminine genders:
un (m. sg.) - o (f. sg.)
doi (m. pl.) - două (f. pl.)
The numerals for the neuter gender behave like the masculine in the singular and like the feminine in the plural.
The ending -*uri* (from Latin -*ora*, as in temp*us* - temp*ora*) is very common for Romanian neuters, but it is not the only possible one.
Example of a neuter without -*uri*:
*un* vis (n.) - *două* vis*e*

2)
băiat bun (m.) - băieț*i* bun*i*
floare bun*ă* (f.) - flor*i* bun*e*
tren bun (n.) - tren*uri* bun*e*
--------
N.B. the adjective in the neuter gender behaves like the masculine in the singular and like the feminine in the plural:
bun (m. sg.) - bun*ă* (f. sg.)
bun*i* (m. pl.) - bun*e* (f. pl.)
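
The two rules above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function and table names (`agreement_gender`, `NUMERAL`, `ADJ_BUN`) are made up for this example, and the tables contain only the forms listed in this post.

```python
# Forms of the numerals 1/2 and of the adjective "bun", as listed above.
NUMERAL = {("m", "sg"): "un", ("f", "sg"): "o",
           ("m", "pl"): "doi", ("f", "pl"): "două"}
ADJ_BUN = {("m", "sg"): "bun", ("f", "sg"): "bună",
           ("m", "pl"): "buni", ("f", "pl"): "bune"}


def agreement_gender(gender: str, number: str) -> str:
    """Return the gender used for agreement.

    Neuter ("n") nouns agree like masculines in the singular
    and like feminines in the plural; other genders are unchanged.
    """
    if gender == "n":
        return "m" if number == "sg" else "f"
    return gender


def numeral_for(gender: str, number: str) -> str:
    """Pick "un/o" (sg) or "doi/două" (pl) for a noun of the given gender."""
    return NUMERAL[(agreement_gender(gender, number), number)]


def bun_for(gender: str, number: str) -> str:
    """Pick the form of "bun" that agrees with a noun of the given gender."""
    return ADJ_BUN[(agreement_gender(gender, number), number)]


# "tren" is neuter: un tren bun, but două trenuri bune.
print(numeral_for("n", "sg"), "tren", bun_for("n", "sg"))
print(numeral_for("n", "pl"), "trenuri", bun_for("n", "pl"))
```

The point of the sketch is that no third column is needed in either table: the neuter is handled entirely by switching between the two existing agreement patterns.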

As general rules of the Romanian language:
- living beings are either of masculine or feminine gender, but never of neuter gender
- inanimate things may be of any gender. During the evolution of the language some nouns have changed from masculine to neuter or vice versa.


----------



## danielstan

Preamble:
Linguists have observed that there are some regular changes in the structure of words during the continuous evolution of a language.
Those regular changes are called "*phonetic rules*" (sound laws); they are rules discovered empirically (postulated) by linguists and verified in the vast majority of cases. Based on these phonetic rules, which can be verified in many cases, the evolution of some words can be reasonably reconstructed.
The "reconstructed" words (not attested in written sources from Antiquity) are marked with an asterisk (*) before them,
e.g. _*lupu_ (from Latin _lupum_)

Observation:
Most Romanian words of Latin origin are inherited from their *accusative* forms.
E.g.: _mons _(lat. nominative)/_montem _(lat. accusative) > _munte _(rom.) ("mountain")
The same goes for: Italian: _mons/montem_ (lat. acc.) > _monte _(it.); French: _serpens/serpentem_ (lat. acc.) > _serpent _(fr.)  ("snake"); Spanish: _serpens/serpentem_ (lat. acc.) > _serpiente _(sp.); Portuguese: _serpens/serpentem_ (lat. acc.) > _serpente _(pg.)

Reconstructing the evolution of the *definite article (masculine singular)* in Romanian:
-----------------------------------------------------------------------------------------------
1) definite article (masc. sg.)
E.g.: _lup _(m.) / _lup*ul* _(m. articulated) ("wolf" / "the wolf")

Reconstruction from Classical Latin to modern Romanian:
1.1 _lupus ille_ (lat. nom.) / _lupum illum_ (lat. acc.) > _*lupu illu_ (vulgar Latin)
1.2 _*lupu illu > *lupulu_ (proto-Romanian)
1.3 _*lupulu > lupul_ (modern Romanian)

1.1 Phonetic rule: _-m_ and _-s_ at the end of Latin words disappeared starting from the 1st century AD.
cf. (consult) the text _De Aedificiis_ (by Procopius of Caesarea, c. 560 AD), with the Balkan toponyms: Burgualtu (Classical Latin: Burgus Altus), Asilva (c. lat. Ad Silvam), Castellonovo (c. lat. Castellum Novum)

1.2 Phonetic rule: Latin intervocalic *-LL-* > Romanian intervocalic *-L-*
e.g. _vallem_ (lat. acc.) > _vale_ (rom.)
Disappearance of _*-i-*_ from _illum_ cannot be accounted for by phonetic rules.

1.3 Phonetic rule: final _-u _in proto-Romanian words has disappeared in most of the cases.
There are exceptions like:
_lucrum _(lat.) > _lucr*u* _(rom.) / _lucr*ul* _(rom. articulated) ("thing" / "the thing")
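
The three steps 1.1-1.3 can be sketched as a small chain of string rewrites. This is a toy illustration, not a real historical-phonology tool: the function names are invented for the example, the rules are applied to plain spelling rather than to sounds, and the loss of the initial _i-_ of _illu_ is simply hard-coded, since it is not covered by a regular rule.

```python
import re


def drop_final_m_s(phrase: str) -> str:
    """Rule 1.1: word-final -m and -s are lost (lupum illum > *lupu illu)."""
    return " ".join(re.sub(r"[ms]$", "", word) for word in phrase.split())


def fuse_postposed_article(phrase: str) -> str:
    """Rule 1.2: noun and postposed demonstrative fuse (*lupu illu > *lupulu).

    The initial i- of *illu is dropped (hard-coded, not a regular rule)
    and intervocalic -ll- is simplified to -l-.
    """
    noun, demonstrative = phrase.split()
    return noun + demonstrative[1:].replace("ll", "l")


def drop_final_u(word: str) -> str:
    """Rule 1.3: final -u is lost (*lupulu > lupul)."""
    return word[:-1] if word.endswith("u") else word


def derive_romanian(accusative_phrase: str) -> list[str]:
    """Return every stage from the Latin accusative to the Romanian form."""
    stage1 = drop_final_m_s(accusative_phrase)
    stage2 = fuse_postposed_article(stage1)
    stage3 = drop_final_u(stage2)
    return [accusative_phrase, stage1, stage2, stage3]


print(" > ".join(derive_romanian("lupum illum")))
# lupum illum > lupu illu > lupulu > lupul
```

Stopping the chain after the second stage gives the _*lupulu_ form that the Aromanian derivation further down starts from.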

Exceptions:
Masculine nouns ending in vowels take a different form of the definite article, also from Latin _*ille*_ but adapted to the internal needs of the language:
_fratrem _(lat. acc.) > _frate _(rom.) / _frate*le* _(rom. articulated) ("brother")
_tata _(lat. nom.) / _tatam _(lat. acc.) > _tat*ă *_(rom.) / _tată*l *_(rom. articulated) ("father")

Reconstruction of non-articulated masc. sg. in Romanian:
------------------------------------------------------------------------------------------
_lupum _(lat. acc.) > *_lupu _(vulgar Latin) > _lup _(rom.)

Reconstruction of definite article (masc. sg.) in Aromanian (dialect of Romanian spoken by small groups of latinophone populations in Greece, Macedonia and Albania):
--------------------------------------------
1.1 _lupus ille_ (lat. nom.) / _lupum illum_ (lat. acc.) > _*lupu illu_ (vulgar latin)
1.2 _*lupu illu > *lupulu_ (proto-Romanian)
===========> so far it's like the Romanian evolution
1.3 _*lupulu > luplu_ (modern Aromanian)

Conclusions:

The definite article for masculine singular declension is normally *-ul*.
The _*-u*_ from _-ul_ comes from the Latin noun at masc. sg. which usually ended in _-*u*s_ (nominative) / _-*u*m_ (accusative).


----------



## danielstan

Reconstruction of the evolution of definite article (masc. sg.) in Italian:
--------------------------------------------------------------------------------
1.1 _ille lupus _(lat. nom.) / _illum lupum _(lat. acc.) > _*illu *lupu _(vulgar Latin) ("the wolf")
1.2 _*illu *lupu > *illo lupo_
1.3 _*illo lupo > il lupo_ (it.)

1.1 Phonetic rule: _-m_ and _-s_ at the end of Latin words disappeared starting from the 1st century AD.

1.2 Phonetic rule: -u termination > -o termination
e.g. _ursus / ursum_ (lat. acc.) > _*ursu > orso_ (it.)

1.3 *_*illo*_ has evolved to _*il*_ for most masculine nouns in Italian

For the nouns beginning with s-, z- and others:
--------------------------------------------------------------------------------
1.1 _ille sclavus _(lat. nom.) / _illum sclavum _(lat. acc.) > _*illu *sclavu _(vulgar Latin) ("the slave")
1.2 _*illu *sclavu > *illo sclavo  ===> _same as above
1.3 _*illo sclavo > lo schiavo_ (it.)

1.3 *_*illo*_ has evolved to _*lo*_ for masculine nouns starting with s-, z- and a few other cases
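
The Italian derivation can be sketched the same way. Again this is only a toy model on spelling, with invented function names. Two simplifications are assumptions of the sketch rather than part of the post: the choice between _il_ and _lo_ is reduced to "_lo_ before s + consonant or z", and the change _cl_ > _chi_ (needed to get from _sclavo_ to _schiavo_) is added as an extra step.

```python
import re


def drop_final_m_s(word: str) -> str:
    """Rule 1.1: word-final -m and -s are lost (illum > *illu)."""
    return re.sub(r"[ms]$", "", word)


def final_u_to_o(word: str) -> str:
    """Rule 1.2: final -u becomes -o (*lupu > lupo)."""
    return re.sub(r"u$", "o", word)


def reduce_article(noun: str) -> str:
    """Rule 1.3 (simplified): *illo > lo before s + consonant or z, else il."""
    return "lo" if re.match(r"s[^aeiou]|z", noun) else "il"


def derive_italian(accusative_phrase: str) -> list[str]:
    """Return every stage from the Latin accusative to the Italian form."""
    article, noun = (drop_final_m_s(w) for w in accusative_phrase.split())
    stage1 = f"{article} {noun}"
    article, noun = final_u_to_o(article), final_u_to_o(noun)
    stage2 = f"{article} {noun}"
    # Extra step assumed for this sketch: Latin cl > Italian chi.
    noun = noun.replace("cl", "chi")
    stage3 = f"{reduce_article(noun)} {noun}"
    return [accusative_phrase, stage1, stage2, stage3]


print(" > ".join(derive_italian("illum lupum")))
print(" > ".join(derive_italian("illum sclavum")))
```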


----------



## Penyafort

ronanpoirier said:


> In my French Grammar (dating from 1930!) there are a few lines about the origin of the French language and it says that since Latin had no articles, then, the neo-latin languages should have borrowed them from another language or group of languages. Some theories were from Arabic and another from Germanic languages. There were another one but I can't remember.
> 
> So what do you people think?



The Arabic theory can't be taken seriously since:

1- You find articles in all of the Romance languages, many of which were never directly influenced by Arabic.
2- The derivational process from Latin ILLE/ILLU, ILLA in most of the Romance languages is perfectly explainable. Some very conservative varieties such as Central Aragonese even preserved the whole _el·lo, el·la _forms until the 20th century.
3- There are also articles deriving from similar Latin forms IPSE/IPSU, IPSA in Sardinian (su, sa, sos/is, sas/is) and Balearic Catalan (es/so/s', sa/s', es/sos, ses).
4- The change of meaning from a demonstrative into an article is plausible and has happened in other languages. What is debatable is at what point these demonstratives/pronouns began to be used as simple articles in Vulgar Latin, probably at a very early time.


----------



## Nino83

danielstan said:


> Because all Romance languages have the definite article we deduce this feature has developed before the fall of the Western part of Roman Empire in *476 AD *(conquered by Odoacer).



I agree.
Probably in spoken Latin the use of the definite article was common.



danielstan said:


> Reconstruction from Classical Latin to modern Romanian:
> 1.2 _*lupu illu > *lupulu_ (proto-Romanian)
> 
> 1.2 Phonetic rule: Latin intervocalic *-LL-* > Romanian intervocalic *-L-*
> e.g. _vallem _(lat. acc.) > _vale _(rom.)



I don't agree about it.
I think it is more probable that there was just a contracted form of _illo/a_ in spoken Vulgar Latin because it is more coherent with the following phonetic rules, at least in Western and Italo-Dalmatian Romance languages.

This is clear if we consider those languages which retained the double and single intervocalic /l/ and those, like Portuguese, which dropped intervocalic single /l/ and degeminated intervocalic double /ll/ to /l/.
And it is more clear if we consider the third person singular pronoun, from _ille/illum/illam_.

So, I think, there was just in spoken Vulgar Latin a weak form, _lo/la_ used as a definite article and a strong form, _elle/ello/ella_ used as personal pronoun, _he/she_.

Double /ll/ and single /l/ in intervocalic position:
Tuscan Italian: caballu(m) > kavallo, colore(m) > kolore
Central and Southern Italian: caballu(m) > kavallo/u/ə, kavaɖɖu, colore(m) > kolore, kuluri
Spanish: caballu(m) > kaβaʎo, colore(m) > kolor

Portuguese: caballu(m) > kavalu, colore(m) > kolor > koor > kor

Definite article:
Tuscan Italian: elle > eʎʎi, ella > ella, lo/la > lo/la
Central and Southern Italian: ello > illu, illə (metaphony, final -u), iɖɖu (Sicilian vocalic system, 5 vowels), ella > ella, ellə, iɖɖa, lo/la > lo/lu/la (somewhere also contracted forms o/u/a)
Spanish: ello > eʎo, ella > eʎa, lo/la > lo/la

Portuguese: elle > ele, ella > ela, lo/la > o/a

If we assume that the definite article in spoken Vulgar Latin was illo/illa, phonetic changes like illo/illa > o/a in Portuguese (with the elision of the double /l/) and illo/illa > lo/la in Italian (degemination of double /ll/) and illo/illa > lo/lu/la or o/u/a in Central and Southern Italian (degemination or elision of double /ll/) couldn't be explained.

I think there was just a difference between lo/la and ello/ella in spoken Vulgar Latin, both from illo/illa (< illum/illam), the first was the definite article and the second the third person singular subject pronoun.


----------



## danielstan

Well, you try to find a "common denominator" in spoken Vulgar Latin for the definite article ("the") and the personal pronoun ("he/she") in Romance languages.

While I agree that the Classical Latin _illum/illam_ is  (very probably) the source of both definite article and personal pronoun of 3rd person in Romance languages,




Nino83 said:


> I agree.
> So, I think, there was just in spoken Vulgar Latin a weak form, _lo/la_ used as a definite article and a strong form, _elle/ello/ella_ used as personal pronoun, _he/she_.



I observe that your intuition leaves some important linguistic facts unexplained:

Italian definite article for masc. sg. has 2 forms: _il/lo_, which could be reasonably explained by a _*illo_ or _*ilo_ (I don't argue on single _/L/_ or double _/LL/(*)_ here)
at some time during the evolution of Vulgar Latin to Italian, but cannot be acceptably explained from a _lo/la_ form.
(*) I prefer the notation _-LL-_ or_ /LL/_ instead of _/ll/ _because I encountered it in some books with the justification that _/ll/ _could be confused with "double _i_ capitalized"

Spanish definite article for masc. sg.: _el_ cannot be explained by a _lo/la_ weak form in Vulgar Latin

I don't know other dialects of Italian or Spanish but I guess there are some of them where the masc. sg. definite article is similar to il (it.) or el (sp.). 
I invite knowledgeable people to give examples, if any.



My opinion (based on some intuition and a few things I read about this topic) about the personal pronoun of the 3rd person:

Their original source should have been _illum/illam_, which became _*illu/illa_ in spoken Vulgar Latin
Before the fall of the Western Roman Empire (476 AD) it is conceivable that Vulgar Latin _*illu/illa_ had moved to _*ellu/ella_ when used as a personal pronoun
After the Romance languages were "born" from Vulgar Latin, the evolution inside each language was governed by:
- the "need" of the speakers to distinguish between 2 different grammatical features: the definite article / the personal pronoun
- because the definite article is more frequently used in speech, it was the best candidate for "reduction in length".
"reduction in length" was from the 2-syllable _*ellu/ella_ to the 1-syllable form of today.
See the definite articles _o/a_ in Portuguese and _-ul/-a_ in Romanian.

As a "principle" in the evolution of languages (confirmed in many cases), the words with the highest frequency of use tend to reduce in length (syllables and phonemes).
This "principle" of shortening high-usage words is known to linguists and mentioned in books.


----------



## danielstan

Continuous evolution of languages. The case of Romanian definite article masc. sg. during last 2 centuries
-----------------------------------------------------------------------------------------------------------------------
As an illustration of the above-mentioned principle of shortening high-frequency words, I will discuss the case of the definite article in Romanian (termination -_ul_).

Today in the spoken (familiar) language the _-l_ of the _-ul_ termination is not pronounced.
In academic speeches or in official situations (political speeches, theater etc.) the _-l_ is clearly pronounced.
Remark: the distinction between articulated and non-articulated forms of a noun is still easily made with the -u termination: _om/omu_ instead of the correct form *om/omul* ("man/the man")

The Romanian texts of 500-600 years ago use the _-ul_ termination everywhere (official documents, private letters etc.).
There are Romanian words borrowed by the neighboring languages where the final _-l_ is pronounced today (e.g. the Serbian toponym Radula comes from a Romanian masc. sg. name Rad*ul*).
We thus have every reason to believe that they spelled the _-ul_ termination as they pronounced it.

Starting from the 19th century the family names which ended with a definite article in the masc. sg. (_-ul_) are written with a -u termination (_-l_ disappeared in writing),
coexisting with family names ending in _-ul_.
Today all family names denoting an articulated masc. sg. word end in _-u_.
E.g. some known Romanian people with the _-ul_ termination: Dimitrie Onci*ul* (1856-1923), Aron Pumn*ul* (1818-1866)
E.g. some Romanian people with the *-u* termination: Mihail Eminesc*u* (1850-1889), Nicolae Ceausesc*u* (1921-1989) etc.

All this time (the last 500 years) the definite article for common nouns was spelled *-ul* in official documents (thus in the literary language), while it was occasionally spelled with the *-u* termination in private letters.

Romanian linguists have been mentioning the discrepancy between the spoken and written forms of this definite article since the beginning of the 20th century.
I have read testimonies about old people in rural areas in the 1970s-1980s who still pronounced the final _-ul_ very clearly, with a movement of opening the lips after the _-l_,
while most Romanians do not pronounce it or pronounce it very weakly.

Conclusions:
The final _*-l*_ in the definite article masc. sg. (the *-ul* suffix) is on the path to extinction and this process has been accelerating since the 19th century.
The pronunciation is not the same all over the country, but the vast majority of Romanian speakers have lost this termination.
The literary form of the language (used in schools, public offices etc.) still uses this feature as this is the "correct academic" pronunciation.


----------



## Angelo di fuoco

Outsider said:


> If I'm not mistaken, some dialects of Italian use the articles _la, lo_ (pronounced [lu]). So maybe the source is not Portuguese, but some non-standard dialect of Italian. Or it could be an independent development...



No, it's just the more conservative variant. E. g. Dante:
_Lo giorno se n'andava_



danielstan said:


> _tata _(lat. nom.) / _tatam _(lat. acc.) > _tat*ă *_(rom.) / _tată*l *_(rom. articulated) ("father")



_Tat*ă*_ is definitely not Latin, but Slavic: cf. Polish tata and Russian тятя, both with the same meaning (actually, rather "daddy" than "father").


----------



## robbie_SWE

Angelo di fuoco said:


> No, it's just the more conservative variant. E. g. Dante:
> _Lo giorno se n'andava_
> 
> 
> 
> _Tat*ă*_ is definitely not Latin, but Slavic: cf. Polish tata and Russian тятя, both with the same meaning (actually, rather "daddy" than "father").



Just because it coincides doesn't mean that it comes from a Slavic language Angelo di fuoco.

https://en.wiktionary.org/wiki/tata#Latin
http://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.04.0059:entry=tata

Then I guess Aromanian _tatã_, Dalmatian _tuota_, _teta_, (regional) Italian _tata_, Neapolitan _tata_, Portuguese _tatá_, Spanish _tata_, _tato _and _taita _are all Slavic, right?


----------



## Angelo di fuoco

The Aromanian and Dalmatian could be Slavic, given their linguistic environment.

_Scherzi a parte_: sorry for that one, but in Italian (modern Italian: papà, more conservative - Tuscan-oriented Italian: babbo) and peninsular Spanish as well as Portuguese it's rather _very_ rare. Won't say for Neapolitan, but suppose this word is on its deathbed, too.


----------



## robbie_SWE

Angelo di fuoco said:


> The Aromanian and Dalmatian could be Slavic, given their linguistic environment.
> 
> _Scherzi a parte_: sorry for that one, but in Italian (modern Italian: papà, more conservative - Tuscan-oriented Italian: babbo) and peninsular Spanish as well as Portuguese it's rather _very_ rare. Won't say for Neapolitan, but suppose this word is on its deathbed, too.



It doesn't matter how archaic or rare the word is - it exists nonetheless in several Romance languages. The reason why Romanian *tată *is believed to be from Latin, is because it fits morphologically and semantically. You're not actually denying that the word existed in Latin are you? 

From Treccani: 



> *tata* s. f. e m. [duplicazione della sillaba _ta_, consueta nel balbettio e nel richiamo dei bambini, già presente nel lat. _tata_ (masch., «papà» e «balio»), nel gr. τάτα, τατᾶ e τέττα (con usi analoghi all’ital. _tata_ e _tato_) e fin nel sanscr. _tatah_] (pl. m. -_i_). – Voce infantile, usata con sign. diversi nelle varie parti d’Italia:
> 
> *a.* Al femm., per indicare la balia, la bambinaia, la governante, la sorella maggiore o, più genericam., la donna, diversa dalla madre, che si prende cura di un bambino.
> 
> *b.* Al masch., region. e raro per indicare il padre (v. anche tato): _aveva mandato a Napoli il figliuolo maggiore_,_ con qualche soldo_,_ ad assistere suo padre_,_ il suo_ «_tata_», _come là si dice_ (De Amicis).



From RAE:


> *tata.*
> 
> (Del lat. _tata_).
> 
> 1. f. coloq. Niñera y, por ext., muchacha de servicio.
> 2. f. coloq. Voz de cariño con que se designa a una hermana.
> 3. m. afect. Am. padre (‖ varón que ha engendrado). U. en algunos lugares de América como tratamiento de respeto.


----------



## Angelo di fuoco

Robbie, I know many of you Romanians aren't proud of the Slavic vocabulary in your language (and of any other Slavic connection), but there's no need to get so upset.
I acknowledged my error, so just give it a break.


----------



## robbie_SWE

Angelo di fuoco said:


> Robbie, I know many of you Romanians aren't proud of the Slavic vocabulary in your language (and of any other Slavic connection), but there's no need to get so upset.
> I acknowledged my error, so just give it a break.



I'm sorry it comes off that way Angelo, since I truly always have respected your opinions and contributions and will continue doing so. I'm not ashamed of the Slavic vocabulary in Romanian – it's a part of the language and nothing to be ashamed of, just like the Germanic elements of French and the Arabic elements of Spanish and Portuguese.

What bugs me though is the tendency to generalise when something doesn't fit naturally into the mould one has built for the world.
With that said, I apologise for being a bit standoffish.

PS: I'm not Romanian, Swedish to be precise, so let's not open that can of worms.


----------



## Angelo di fuoco

Apology accepted. 

However, if I hadn't met some Romanians on this very forum who wish the Slavic elements in both the Romanian language and nation had never existed (because they seemingly diminish the Roman and pre-Roman origins), I probably wouldn't have reacted that way.
I have my points of view, but I'm happy to correct them when presented with substantial arguments or evidence pointing in another direction.

A Swede with a native-like command of Romanian: how come?


----------



## Nino83

danielstan said:


> I observe that your intuition leaves some important linguistic facts unexplained:
> 
> Italian definite article for masc. sg. has 2 forms: _il/lo_, which could be reasonably explained by a _*illo_ or _*ilo_ (I don't argue on single _/L/_ or double _/LL/(*)_ here)
> at some time during the evolution of Vulgar Latin to Italian, but cannot be acceptably explained from a _lo/la_ form.
> 
> Spanish definite article for masc. sg.: _el_ cannot be explained by a _lo/la_ weak form in Vulgar Latin



It is simple.
Some varieties used _il/el_, and there is a _quasi-continuum_: Gallo-Italian languages, Catalan and Castilian use _el_, which can be considered the "central area", while all other areas, geographically distant, use _lo_ (French _le_, Occitan _lo_, Portuguese _o_, Ancient Romanesco _lo_, Median Italian dialects _lo/lu_, Neapolitan _(l)o_, Apulian _(l)u_, Sicilian _(l)u_).

In Tuscan and in the language of Dante, _lo_ was the only form possible at the start of a sentence and was the default elsewhere, while _il/'l_ was used only when the article followed a vowel, for example _m'avea di paura il cor compunto, e il sol montava, dove il sol tace, in fin che'l Veltro_ (Dante) but _a rimirar lo passo, per lo suo mezzo cerchio_ and (when starting a sentence) _lo giorno se ne andava_.
Later the form _il_ (and _i_ with _raddoppio fonosintattico_, e.g. _il giorno > i ggiorno_ in the Tuscan dialect) came to be used in most contexts, while _lo_ was kept only before _s + consonant, gn, z, j_.

Source: Rohlfs, _Grammatica storica dell'italiano e dei suoi dialetti_



danielstan said:


> My opinion (based on some intuition and a few things I read about this topic) about the personal pronoun on 3rd person:
> 
> 
> After the Romance languages have been "born" from Vulgar Latin the evolution inside each language was governed by:
> - the "need" of the speakers to make distinction between 2 different grammatical features: the definite article / the personal pronoun
> - because the definite article is more frequently used in talking it was the best candidate for "reduction in length".
> "reduction in length" was from 2-silabic _*ellu/ella _to 1-silabic form of today.
> See the definite articles_ o/a_ in Portuguese and _-ul/-a_ in Romanian.
> 
> As a "principle" in the evolution of languages (confirmed in many cases) the words with highest frequences in usage tend to reduce in length (silabes and phonemes).
> This "principle" of shortening high-usage words is known by linguists and mentioned in books.


This explanation doesn't convince me because the changes /LL/ > /L/ and /LL/ > /Ø/ can't be explained in Spanish, the Italian languages, Catalan and Portuguese.
This explanation seems to be a bit Romanian-centric.



Outsider said:


> If I'm not mistaken, some dialects of Italian use the articles _la, lo_ (pronounced [lu]). So maybe the source is not Portuguese, but some non-standard dialect of Italian. Or it could be an independent development...



Yes, that is so, but through a different evolution.

- _dialetti mediani_ (south of the Roma-Ancona line and north of the Gaeta-Ascoli line) retained the difference between Latin final _-o_ and _-u_, for example _porto < portō_ (I bring/take) and _portu < portǔm_ (the port), and have two different articles: _lo_ for uncountable abstract and inanimate nouns (called the new neuter) and _lu_ for animate nouns and concrete/countable inanimate nouns, so they have _lu ferru_ (steam iron, countable, concrete) and _lo ferru_ (iron, material, uncountable)
- _dialetti alto-meridionali_ (Neapolitan, Apulian) behave like the _dialetti mediani_, but final unstressed vowels became schwa /ə/ and the two articles are pronounced alike, _o_ in Neapolitan and _u_ in Apulian; the "neuter" article, however, triggers _raddoppio fonosintattico_, for example _o fierrə_ (Neapolitan) _u ferrə_ (Apulian), steam iron, and _o ffierrə_ (Neapolitan) _u fferrə_ (Apulian), iron, material

In Sicilian there is no difference between Latin final _-o_ and _-u_ or between abstract/uncountable and concrete/countable nouns. It is like in Portuguese, two genders (masculine/feminine) and two articles.
In Sicilian, closed /é/ and /ó/ became /i/ and /u/, so the two articles are _(l)u_ and _(l)a_ (the "l" is retained only in Southern Sicily).

So, we have _u pòrtu_ (the port) and _jo pòrtu_ (I bring/take), like the Portuguese _o porto /u pòrtu/_ and _eu porto /eu pòrtu/_, and in both languages _a casa_ (the house).

Ancient Romanesco (before it was Tuscanized in 1500) had _lo_ (today it has _er < el_ from Tuscan _il_).


----------



## Penyafort

Nino83 said:


> It is simple.
> Some varieties used _il/el_, there is a _quasi-continuum_, Gallo-Italian languages, Catalan and Castillan use _el_, this can be considered the "central area", while all other areas, geographically distant, use _lo_ (French _le_, Occitan _lo_, Portuguese _o_, Ancient Romanesco _lo_, Median Italian dialects _lo/lu_, Neapolitan _(l)o_, Apulian _(l)u_, Sicilian _(l)u_.



How is Occitan "geographically distant" if it forms the continuum between Catalan and Gallo-Italic?

But things are much more complex anyway.

In both Old and Modern Catalan, three forms of the article have coexisted: LO, EL and ES. Standard Catalan, based on the central dialect, uses EL, and so do Valencians, but the West and South of Catalonia use LO, and Majorcans use ES.

In Occitan and Gascon the general one is LO indeed, but you also find LE both in the south and north, ETH in Pyrenean Gascon, and even instances of SO in an area of Provence.

Not to mention Aragonese, in which the system is certainly complex, with general forms being O, A, OS, AS as in Portuguese, but retaining the LO, LA, LOS, LAS in intervocalic contexts, sometimes the L being pronounced as an R, and with the EL/LA/ES/LAS system for the Eastern varieties.


----------



## francisgranada

Just as a curiosity, without any deeper analysis: there is an Andalusian song "La luna y el toro" in which the following line (verse) occurs:
_
*El* torito se mete *en el* agua_

In the Andalusian interpretation it sounds (at least I hear it pronounced this way):

_*Er* torito se mete *nel* agua     _

What I want to say is that the forms "_er_" and "_nel_" do not automatically imply that Andalusian represents a continuum/transition between Castilian and Tuscan or Romanesco ... In other words, in my opinion the variety of the articles (_o, lo, el, il, er, 'l, l'..._) in the Romance languages can also be explained independently of each other.


----------



## Nino83

Penyafort said:


> How is Occitan "geographically distant" if it forms the continuum between Catalan and Gallo-Italic?
> But things are much more complex anyway.





Yes, they are more complex (I tried to summarize a bit).

The fact is that no Romance language retains a double /LL/ in its definite articles, and most Romance languages (I don't know whether "all" do) have a third-person pronoun (singular and plural) that derives from _ille, illa, illud_. So there are two possibilities: either the definite article and the third-person pronoun developed at different times, or there were two different forms of _ille_, a weak one and a strong one.

About Aragonese, there is nothing strange if _lo/la_ is retained in intervocalic position. Was single intervocalic /L/ dropped in Aragonese?
It wasn't in Catalan. It was in Galician-Portuguese.



francisgranada said:


> _*El* torito se mete *en el* agua_
> 
> In the Andalusian interpretation it sounds (at least I hear it pronounced this way):
> 
> _*Er* torito se mete *nel* agua    _
> 
> What I want to say is that the forms "_er_" and "_nel_" do not automatically imply  that the Andalusian represents a continuum/transition between the Castilian and the Tuscan or Romanesco ... In other words, in my opinion the variety of the articles (_o, lo, el, il, er, 'l, l'..._) in the Romance speaking area can be explained also independently on each other.



Yes.
Nobody says that Ligurian and Portuguese are related because they lost single intervocalic /L/, or that Caipira (Brazilian Portuguese) or Andalusian/Caribbean Spanish are related to Romanesco because of /L/ rhotacism.

EDIT:
It seems that in Aragonese, like in Spanish, intervocalic /LL/ and /L/ are retained:
caballo, color, calién (Portuguese cavalo, cor, quente), me lo dice (Portuguese mo diz < me o < me lo)

Also in Aragonese, there is /LL/ in _éll, ella, ell(o)s, ellas_ (he, she, they), one /L/ in _lo, la, los, las_ (him, her, it, them, accusative) and no /L/ in _o, a, os, as_ (the).

There seems to be a general tendency for the third-person pronoun to have more /L/ than the definite article.


----------



## danielstan

Angelo di fuoco said:


> Robbie, I know many of you Romanians aren't proud of the Slavic vocabulary in your language (and of any other Slavic connection), but there's no need to get so upset.
> I acknowledged my error, so just give it a break.


I am Romanian and I confirm that assessment.
During the last 3 centuries (XIX - XXI) Romanian intellectuals have emphasized the Latin origin of the Romanian language (which is a fact) and have chosen to ignore, repudiate or play down the strong Slavic influence that Romanian received in the Middle Ages.

The Latin origin was used by Romanian intellectuals as an argument in the struggle for the national awakening of the Romanians in the Austro-Hungarian monarchy.
I remember one relevant case:
August Treboniu Laurian (https://en.wikipedia.org/wiki/August_Treboniu_Laurian), a Romanian from Austria-Hungary, wrote a "Romanian dictionary" containing only the vocabulary of Latin origin and intended to publish a "Romanian glossary" containing the "foreign words of Romanian" (i.e. the non-Latin words, especially the Slavic ones).

In school I heard (I forgot the context) the expression "Romania is a Latin island in a Slavic sea".
During the last 200 years there was an effort of "re-Latinization" of the language through imports of words from French (mostly) and Italian. Whether this effort was conscious or a side effect of borrowing from French, which was the "lingua franca" at that time (while now the imports are coming from American English), is hard to tell.
And there are many other aspects where Latin is preferred and opposed to Slavic in Romania, but I don't want to insist.
---
Compare the recent efforts made in Croatia (after 1991) to promote a Croatian language distinct from Serbian, by reviving archaic Croatian words to replace the "Serbian" ones. Most Croatians do not accept the concept of a "Serbo-Croatian language".

Another case is the (Slavic)-Macedonian language/dialect which is regarded differently in Macedonia and Bulgaria.


----------



## danielstan

Nino83 said:


> This explanation doesn't convince me because the passage /LL/ > /L/ and /LL/ > /Ø/ can't be explained in Spanish, Italian languages, Catalan and Portuguese.
> This explanation seems to be a bit Romanian-centric.
> .


The Latin intervocalic double /LL/ was pronounced across two syllables. E.g. _vallem_ (lat.) = VAL-LEM
In today's Italian the pronunciation is the same: _stella_ (it.) = STEL-LA

In the case of Vulgar Latin _*illu/*illa_ I believe that somewhere in its evolution, whether while still in Vulgar Latin (*) (most probable, in my opinion) or later in all Romance languages,
the two-syllable word was reduced to a one-syllable form in the role of definite article.
Take the example of (standard) Italian, with 2 realisations of the masc. sg. definite article:
_*illu > il/lo_
I see 2 possible evolutions:
1) _*illu > illo_ (pronounced IL-LO) > _il / lo_ (reduction by 1 syllable)
2) _*illu_ (pronounced IL-LU) > _il / lu_ (reduction by 1 syllable) > _il/lo_ (final _-u_ > final _-o_, in Italian)
Neither of the 2 paths above interferes with the retention /LL/ > /LL/ (in standard Italian).
I think the above evolution could have happened (mutatis mutandis) in all the other Romance languages, because today the definite article is monosyllabic in all of them. Because of this common outcome I guess the reduction to 1 syllable more probably happened before 476 AD, but I don't exclude the possibility that it happened in some Romance languages after the separation of the Western and Eastern Roman empires.

The Portuguese case (where you deduced a /LL/ > /Ø/ transformation that I don't see):
_*illu > illo > lo > o
*illa > la > a_
- it seems obvious to me that the Portuguese version of the definite article is a more evolved form (by reduction) of the Spanish one,
at least in the case of the feminine sg.: Spanish _la_ > Portuguese _a_
I am confident there are medieval Portuguese texts where the _la/lo_ forms were preserved. I invite knowledgeable people to confirm or refute my assumption.
------------------
(*) In my understanding Vulgar Latin was a form of Latin spoken all over the Roman Empire, with possible small dialectal differences from region to region, but with a high degree of uniformity. I consider the Vulgar Latin phase to have ended in 476 AD with the fall of the Western Roman Empire (I use 476 AD as a reference only, knowing that a language is not transformed instantly into something else and also knowing that most of the Romance languages are supposed to have been "born" around 800 AD).
In this view I consider the forms _*illu/*illa_ as belonging to Vulgar Latin,
and the forms _*illo/*illa_ as belonging to the Western Romance languages, where the transformation
final _-u_ > final _-o_
happened.


----------



## Nino83

danielstan said:


> Because of this common outcome I guess the reduction to 1 sylabe has happened before 476 AD with more probability, but I don't exclude the possibility of this happening in some Romance languages after the separation of Western and Eastern Roman empires.



That is what I said, in other words: seeing that in all the Romance languages the definite article has fewer /L/ than the personal pronoun, this weak form was already present in spoken Vulgar Latin, before 476 AD.

About Spanish and Portuguese:
Spanish: /-LL-/ > /-ʎ-/ and /-L-/ > /-L-/: caballo > kaβaʎo, colore > kolor, calente > kaljente
Portuguese: /-LL-/ > /-L-/ and /-L-/ > /-Ø-/: caballo > kavalu, colore > kolor > koor > kor, calente > kalẽte > kaẽte > kẽte (quente)

Compare:
Spanish: ella > eʎa and la > la
Portuguese: ella > ela and la > a

Caballo, calente and colore were common, popular, words (core vocabulary) and they show us the phonetic changes in these languages. 

So, the article must have been reduced in Spanish and Portuguese (as in the other Romance languages) already before 476 AD.
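
The correspondences above can be sketched as a pair of toy rewrite functions. This is only an illustration under simplifying assumptions: the names `to_spanish` and `to_portuguese` are made up here, the input is a plain lowercase spelling of the Latin form (as in _caballo, colore, calente_), and only the two changes under discussion are modelled (no _b > v_, no final-vowel changes, no nasalisation).

```python
import re

VOWELS = "aeiou"

def to_spanish(latin: str) -> str:
    """Toy rule: geminate -ll- becomes palatal ʎ; single -l- is kept."""
    return latin.replace("ll", "ʎ")

def to_portuguese(latin: str) -> str:
    """Toy rules: single intervocalic -l- is dropped; geminate -ll- becomes -l-."""
    # Protect the geminate first so the deletion rule cannot touch it.
    protected = latin.replace("ll", "L")
    # Drop a single l only when it stands between two vowels.
    dropped = re.sub(rf"(?<=[{VOWELS}])l(?=[{VOWELS}])", "", protected)
    # Simplify the protected geminate to a single l.
    return dropped.replace("L", "l")

for form in ["caballo", "colore", "calente", "ella"]:
    print(form, "->", to_spanish(form), "/", to_portuguese(form))
```

A clitic group written as one word behaves the same way in this sketch: `to_portuguese("melo")` gives `meo`, in line with the _me lo > me o_ step mentioned earlier in the thread for Portuguese _mo_.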


----------



## danielstan

I don't insist on the matter of reduction of /LL/.

I would like to know from knowledgeable people (and you, @Nino83, seem to be one of them) which Romance languages and dialects have undergone *rhotacisation of /L/* during their evolution.

Romanian with all its dialects (Daco-Romanian spoken in Romania and the Rep. of Moldova; Aromanian, Megleno-Romanian and Istro-Romanian) is the most noticeable case and I thought, so far, that this was a distinctive mark of its evolution compared to the other Romance languages. Now I see I might be wrong.
For example, Istro-Romanian (spoken today by a few dozen people in Istria, Croatia) is considered a dialect of Romanian and not of neighbouring Italian because of (among other features) this rhotacisation.
E.g.:
_pilus_ (lat.) > _păr_ (Daco-Rom.); _per_ (Istro-Rom.) ("hair")


----------



## berndf

danielstan said:


> 1) *illu > illo (pronounced IL-LO) > il / lo (reduction of 1 sylabe)
> 2) *illu (pronnounced IL-LU) > il / lu (reduction of 1 sylabe) > il/lo (final -u > -o in Italian)


The shift from L /-um/=[-ʊ] or [-ʊ̃] to VL /-o/=[-o] is quite regular and not just Italian. I think it can safely be assumed to be an early development as a direct consequence of the loss of phonemic vowel length. I would therefore concentrate on the 1) option.

If I understand you correctly, you propose a split in pronunciation depending on the function, pronoun or article. And this change in pronunciation led to dropping the first syllable in one use and the second in the other. For this theory to be plausible, you must assume that the split started as a change in stress, i.e. /'il.lo/ splitting into /'il.lo/ and /il'lo/. Is that what you think?


----------



## danielstan

berndf said:


> If I understand you correctly, you propose a split in pronunciation depending on the function, pronoun or article. And this change in pronunciation led to dropping the first syllable in one use and the second in the other. For this theory to be plausible, you must assume that the split started as a change in stress, i.e. /'il.lo/ to split into /'il.lo/ and /il'lo/. It that what you think?


Yes, you correctly understood my ideas.
I did not pay attention to the stress and I don't know how we could reconstruct the stress in that phase of the evolution of Vulgar Latin (as stress is not marked in texts, if I understand correctly).

---------------
On the other hand, I know that the Eastern Romance languages (Romanian with its dialects, Dalmatian) were separated earlier (476 AD) from the Western Romance languages and the transformation:
L /-um/=[-ʊ] or [-ʊ̃] to VL /-o/=[-o]
seems not to have happened in the Balkans.
Is this VL /-o/ reconstructed from the modern Western Romance languages or are there any Late Antiquity texts with such spellings? (I think I have read somewhere of a _Bonifatzio_ or so mentioned in some inscriptions...)


----------



## Nino83

danielstan said:


> I am confident there are medieval Portuguese texts where the _la/lo_ forms were preserved. I invite knowledgeable people to confirm or infirm my assumption.



In Galician we have: cabalo /kaˈβalo/, cor /ˈkoɾ/ and quente /ˈkɛnte/.



> 2) *Loss of intervocalic -l-*: This phenomenon, probably the result of a velar pronunciation of intervocalic l, was to have important consequences. *It possibly occurred at the end of the 10th century*, since in a *document in barbarous Latin from 995* one reads Fiiz (< Felice) and Fafia (< Fáfila). It affected a large number of words and helped to create several groups of vowels in hiatus in Galician-Portuguese, e.g.: salire > sai, palatiu > paaço (today paço), calente > caente (today quente), dolore > door (today dor), colore > coor (today cor), colubra > coobra (today cobra), voluntade > voontade (today vontade), filu > fio, candela > candea (today candeia), populu > poboo (today povo), periculu > perigoo (today perigo), diabolu > diaboo (today diabo), nebula > névoa, etc.
> 
> It is the loss of intervocalic -l- that explains the plural form of words ending in -l in the singular: sol, plural soes, today sóis. In a large number of words of semi-learned or learned origin, intervocalic -l- was preserved, e.g.: escola, astrologia. In modern Portuguese, intervocalic -l- of this kind is countless, e.g.: palácio (alongside paço), calor (alongside quente < calente), alimento, cálice, guloso, volume, violento, etc.
> 
> *The loss of intervocalic -l- took place only in Galician-Portuguese. It appears neither to the east of the original area of this language (Leonese and Castilian do not know it)*, nor to the south, in the Mozarabic dialects. This last point is abundantly documented by toponymy: there is, for example, Mértola in the Alentejo (< Mĭrtŭla, for Myrtilis, the ancient name of that locality), or Molino (instead of Moinho), or again Baselga (< Basĭlĭca). In words of Arabic origin intervocalic -l- not infrequently remained, e.g.: azêmola, javali.



It happened in Galician-Portuguese and didn't happen in Asturian, Leonese and Castilian.

http://disciplinas.stoa.usp.br/pluginfile.php/158086/mod_resource/content/1/TEYSSIER_ HistoriaDaLinguaPortuguesa.pdf



danielstan said:


> I would like to know from knowleadgeable people (and you, @Nino83, seem to be one of them) what Romance languages and dialects have the *rhotacisation of /L/* during their evolution.



In Portuguese it happened only in some consonant clusters, like /pl/ in semi-learned words (in popular speech the change was /pl/ > /ʃ/, as in plano > chão /ʃɐ̃u̯/, plena > cheia): platea > praça, duplo > dobro/dobre. It also happens before a consonant in Andalusian (and Caribbean) Spanish, Caipira Portuguese (Brazil) and modern Romanesco (Italy).



berndf said:


> For this theory to be plausible, you must assume that the split started as a change in stress, i.e. /'il.lo/ to split into /'il.lo/ and /il'lo/.



And I find it hard to believe that this "split" happened *independently* in *all* the Romance languages, in the same manner.


----------



## berndf

Nino83 said:


> And I find it difficult that this "split" happened *independently* in *all* the Romance languages, in the same manner.


But wouldn't your theory of early clitic form _lo/la_ in contrast to full forms _elle/ello/ella_ require a similar development?


----------



## Nino83

berndf said:


> But wouldn't your theory of early clitic form _lo/la_ in contrast to full forms _elle/ello/ella_ require a similar development?



I'm saying that in spoken Vulgar Latin before 476 AD there was already a strong form used as pronoun (personal, demonstrative) and a weak (split?) form used as article. This could explain why in all Romance languages the definite article is monosyllabic, with one or no /L/, while personal and demonstrative pronouns have two or one /L/.

(hoc) illa > aque*l*a (Portuguese), aque*ll*a (Spanish), que*ll*a (Italian), che*ll*a (Neapolitan), chi*ɖɖ*a (Sicilian), aque*ll*a (Catalan)
illa > e*l*a (Portuguese), e*ll*a, e*ll*a, e*ll*a, i*ɖɖ*a, e*ll*a
la > a (Portuguese), la, la, la, la, la


----------



## danielstan

And how do you explain the Italian article with its 2 forms for the masc. sg. (_il / lo_) if there was only a monosyllabic weak form in Vulgar Latin to inherit from?

The same question goes for the Spanish article _el_ in comparison with the French article _le_: would they come from the same monosyllabic weak form in Vulgar Latin,
or do they come from different monosyllabic forms?


----------



## Nino83

danielstan said:


> And how do you explain the article in Italian with 2 forms for masc. sg. _(il / lo_) if there was *only a* monosyllabic weak form in Vulgar Latin to inherit from?



I haven't said there was *only one (equal for all, unique)* form.
For the feminine there was *la*, for the masculine there were *el* and *lo*, and in most languages (Tuscan included) at the beginning there was a competition between *el* and *lo*. In Tuscan (then Italian) *lo* was the most used form but *il* gained ground and *lo* came to be used only before some consonants. In Spanish, *lo* is now used only with some adjectives, indicating something abstract/neuter, as in *lo* bueno y *lo* malo (the good and the bad "thing").
The same thing happened in Catalan, as Penyafort said.  

The French _le_ comes from _illum_.  
https://en.wiktionary.org/wiki/le#French


----------



## berndf

Nino83 said:


> I'm saying that in spoken Vulgar Latin before 476 AD there was just a strong form used as pronoun (personal, demonstrative) and a weak (splitted?) form used as article.  This could explain why in all Romance languages the definite article is monosyllabic, with one or no /L/, and personal and demonstrative pronouns have two or one /L/.
> 
> (hoc) illa > aque*l*a (Portuguese), aque*ll*a (Spanish), que*ll*a (Italian), che*ll*a (Neapolitan), chi*ɖɖ*a (Sicilian), aque*ll*a (Catalan)
> illa > e*l*a (Portuguese), e*ll*a, e*ll*a, e*ll*a, i*ɖɖ*a, e*ll*a
> la > a (Portuguese), la, la, la, la, la


The gap in your theory is that you haven't explained (yet) where the monosyllabic forms with the _l_ at the beginning rather than the end come from in the first place. I can't understand how this could have happened unless you suppose a prior stress shift.


----------



## Nino83

berndf said:


> The gap in your theory is that you haven't explained (yet) where the monosyllabic forms with the _l _at the beginning rather then the end come from in the first place. I can't understand how this could have happened unless you suppose a prior stress shift.



There was a competition between _el_ and _lo_. It is well attested for Tuscan (I'm not sure about Spanish and Catalan, but it seems it happened there too).
Then, in some languages the form _lo_ prevailed and in others the form _el_, while in others there are both _el/il_ and _lo_ (as in Tuscan and in Spanish).


----------



## danielstan

The fem. sg. case of the article is simple - let's forget it.

The most difficult case is the masc. sg. definite article.
You suppose that already in the Vulgar Latin phase (before 476 AD) there was a monosyllabic form of the article for the masc. sg., but not the same one everywhere.

I guess you mean there were 2 forms of the monosyllabic article for the masc. sg. in Vulgar Latin, probably *_il_ and *_lo_ (both coming from a disyllabic *_illo_).
Already in the Vulgar Latin phase these 2 forms were in competition for the same function of masculine singular definite article.

Is this your theory?


----------



## Nino83

danielstan said:


> I guess you mean there were 2 forms of monosyllabic articles for masc. sg. in Vulgar Latin, probably *_il _and *_lo_ (all of them coming from a bisyllabic *_illo_).
> Even from the Vulgar Latin phase these 2 forms were in competition for the same function of definite article at masculine singular case.
> 
> Is this your theory?



Yes, and it is so today in some languages, like Italian and Spanish (which retained both _el/il_ and _lo_).


----------



## danielstan

Well, your choice of words was misleading when you spoke of "a weak form of article in Vulgar Latin".

Now that we understand each other, I grasp your theory.


----------



## berndf

Nino83 said:


> There was a competition between _el_ and _lo_.


We all know that there are monosyllabic forms that start with_ l _and those that end with _l_.

Again my question: Where do the _l-_ forms come from?


----------



## Nino83

berndf said:


> Again my question: Where do the _l-_ forms come?



_ǐllum > ǐllu > ǐl(lu) > el(lo)
ǐllum > ǐllu > (ǐl)lu > (el)lo_

The _el > il_ change in Italian is due to the Tuscan change _e > i_ in monosyllabic words in pretonic position; in fact Spanish, Catalan and the Gallo-Italian languages have _el_ and Romanesco has _er_ (_el mar, de Madrid_; _er mare, de Roma_; but Tuscan _il mare, di Roma_).

Another curiosity:
the change _(ǐl)li > li > gli_ /ʎi/ in Tuscan is due to the fact that, before a vowel, _li > lj > ʎi_, for example _li arabi > ljàrabi > ʎiàrabi_; hence we write _gli arabi_ but _i tedeschi_ (in Tuscan also _li tedeschi_).


----------



## danielstan

In my opinion the 2 monosyllabic forms for masc. sg. have evolved from the same bisyllabic form *_illo _under the influence of the phonemes at the beginning of the noun they preceded.
We are in the case of the Western Romance languages where the article precedes the noun (while in Romanian and its dialects the article is a suffix).

I think that the rule of the Italian masc. sg. article of today (which depends on what sounds the noun begins with)
was the *determinant rule* of the speakers of Vulgar Latin to reduce the article to one of the 2 forms (_*il / *lo_).
I see here a process similar to the reduction of the article before a noun starting with a vowel, as in French _l'homme_ and others.

After the reduction process was completed, the article evolved with its 2 forms in the Romance languages, where speakers may have chosen to eliminate one of the forms (the _*il_ form is not present in French) or may have changed the usage, as in Spanish.


----------



## danielstan

Nino83 said:


> _ǐllum > ǐllu > ǐl(lu) > el(lo)_
> Another curiosity:
> the change _(ǐl)li > li > gli /ʎi_/ in Tuscan is due to the fact that, before a vowel, _li > lj > ʎi_, for example _li arabi > ljàrabi > ʎiàrabi_, then we write _gli arabi_ but _i tedeschi_ (in Tuscan also _li tedeschi_).


I imagine that in casual speech you may hear the pronunciation
LIA-RA-BI (3 syllables) instead of the normal LI-A-RA-BI (4 syllables),
and so we can guess why the speakers "have chosen" such a rule during the evolution of the language (reduction of effort in speech).


----------



## berndf

Nino83 said:


> ǐllum > ǐllu > (ǐl)lu > (el)lo


Exactly! And that is not plausible unless you assume a prior stress shift. That's what I have been saying all along. And, if you don't agree, how else would you explain this?


----------



## danielstan

berndf said:


> Exactly! And that is not plausible unless you assume a prior stress shift. That's what I said all the time.


If you assume a previous stress shift you consider the word *_illo _in its evolution as an independent word,
but forget that as article it is always followed by a noun, so its context is determined by the phonemes at the beginning of its noun.

See the example of Fr. _l'homme_,
where the reduction of _le_ is determined by the noun starting with a vowel.

I don't see the role of the stress shift in the evolution of this article.
----------------
On the other hand, the _*ello/*ella_ as pronoun is used without a noun after it, so this word is independent of the context and has retained a longer form in its evolution.


----------



## berndf

danielstan said:


> I don't see the role of the stress shift in the evolution of this article.


If you drop a syllable, it is never the one carrying the main stress of the word. That's why German children shorten _Ele*fant*_ to _Fant_ and English children shorten _*Ele*phant_ to _Ele_.


----------



## danielstan

I know the syllable carrying the main stress of the word is the syllable most conservative in the evolution of the word.

On the other hand, the words with the highest frequency of use are the best candidates for reduction.

I see contradictory arguments in your approach and in mine...


----------



## Nino83

danielstan said:


> LIA-RA-BI (3 syllabes) instead of the normal LI-A-RA-BI (4 syllabes)



Today, in standard Italian there is only the first, [ˈʎaːrabi] or [ˈʎi ˈaːrabi] if you articulate these words with a pause in the middle.



berndf said:


> Exactly! And that is not plausible unless you assume a prior stress shift. That's what I said all the time. And, if you don't agree, how else you would explain this?



The fact that in most languages both forms existed at the beginning means that these two forms were not incompatible.
In medieval Tuscan _il_ was used only when the preceding word ended with a vowel, _m'avea di paura il cor compunto_, and _lo_ after a consonant, _per lo suo mezzo cerchio_, or when starting a sentence, _lo giorno se ne andava_ (Dante).
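
The distribution just described can be written down as a tiny decision rule. This is only a sketch of the rule as stated in this post, not a model of Dante's actual usage: the function name `old_tuscan_article` is invented for the illustration, the check is purely orthographic (last letter of the preceding word), and the enclitic _'l_ of _che'l_ is not modelled.

```python
from typing import Optional

VOWELS = set("aeiouàèéìòù")

def old_tuscan_article(previous_word: Optional[str]) -> str:
    """Pick il or lo from the end of the preceding word (None = sentence start)."""
    if previous_word is None:
        return "lo"  # sentence-initial position
    stripped = previous_word.strip(",.;:!?").lower()
    if stripped and stripped[-1] in VOWELS:
        return "il"  # after a vowel-final word
    return "lo"      # after a consonant-final word

print(old_tuscan_article("paura"))  # m'avea di paura il cor compunto
print(old_tuscan_article("per"))    # per lo suo mezzo cerchio
print(old_tuscan_article(None))     # lo giorno se ne andava
```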


----------



## berndf

Nino83 said:


> The fact that in most languages there were both forms at the beginning it means that these two forms were not incompatible.
> In medevial Tuscan _il_ was used only when the preceding word ended with a vowel _m'avea di paura il cor compunto_ and _lo_ after a consonant, _per lo suo mezzo cerchio_ or, when starting a sentence, _lo giorno se ne andava_ (Dante).


I see, I won't get an answer to my question (*how* and not *if*, _(il)lo>lo_ could happen).


----------



## francisgranada

In the Glosas Emilianenses we have e.g. "_tienet *era* mandacione_", "_denante *ela* sua face_" and in the Documento de Quesos "_in *ilo* bacelare_", "_kesos V in *ilo* alio de apate_", "_in *ila* vinia majore_", etc. Don't these examples contradict the "monosyllabic theory"?

P.S. At the same time, in the Glosas Emilianenses, when the article is used with a preposition, the forms *o* and *a* are used: "_con*o* Patre con*o* Spiritu Sancto_", "_en*a* honore_".


----------



## Nino83

berndf said:


> If you drop a syllable it never is the one carrying the main stress of the word.



In _A History of the Spanish Language_, Ralph John Penny says that



> In pre-literary Spanish, the definite article was still disyllabic (ela < illa, elos < illos, elas < illas, although the masculine singular form is not unambiguously attested at the same stage) but lack of stress allowed elision of the initial vowel for the plural form (> los, las).
> In the singular form, lack of stress led to the loss of one or other of the vowels (and, in some dialects, before a vowel, to loss of both). Thus masc. *elo > el. Pre-literary fem. ela is reduced, in Old Spanish, either to la (where the following word begins with a consonant) or to el (when next word had vocalic onset).



The forms are pre-literary, this is why he writes *elo.

So there were already a *weak form*, _elo/ela_, and a *strong form*, _ello/ella_, in pre-literary Spanish, i.e. from 476 to, more or less, 800 AD (?).

By the way, in Spanish we had _el_ in the singular and _los_ in the plural (and not _els_, like in Catalan), and the feminine _la_ but _el_ before a stressed vowel, for example _el agua_ (f.) and not _la agua_.
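
The Old Spanish split of feminine _ela_ described in the passage quoted above can be sketched in a few lines. The function name `old_spanish_fem_article` is hypothetical, and the test is purely orthographic (first letter of the following word), so silent _h-_ and the later restriction to stressed vowels are deliberately ignored.

```python
VOWELS = set("aeiouáéíóú")

def old_spanish_fem_article(next_word: str) -> str:
    """Reduce pre-literary ela to el before a vowel and to la before a consonant."""
    first = next_word[:1].lower()
    return "el" if first in VOWELS else "la"

print(old_spanish_fem_article("agua"))  # vowel onset
print(old_spanish_fem_article("casa"))  # consonant onset
```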



berndf said:


> I see, I won't get an answer to my question (*how* and not *if*, _(il)lo>lo_ could happen).



The question is complicated.
In Tuscan it depended on the previous syllable, while Spanish has both _el < ello < illum_ and _los < ellos < illos_, and the feminine changes depending on the next syllable.

I'm only saying that I think these weak forms were already present in spoken Vulgar Latin, because this difference is present in all Romance languages.
It seems that one form prevailed in some contexts and the other one in other contexts.
In some languages only one form prevailed, see Catalan _el/els_, Gallo-Italian _el/i_ (the plural _i_ is explained with the vocalic plural in Italian languages), Central and Southern Italian _lo/lu/li_, French _le/les_, Galician-Portuguese _o/os_, but Spanish _el/los_, Tuscan _il/lo/li_, Asturian _el_ but _l'_ before vowel and _los_, and so on.  



francisgranada said:


> In the Glosas Emilianenses we have [...] Don't these examples contradict the "monosyllabic theory"?



They could be disyllabic, but they were simply weak forms, with only one /L/ already in the pre-literary stages.


----------



## berndf

Nino83 said:


> The question is complicated.


The book you quoted says it: "*lack of stress* led to the loss of one or other of the vowels".

So if, as you said, the "weak" article form carried no stress at all, neither on the first nor on the second syllable, then either can be lost, more or less randomly. The question that remains is then: How are pronouns like _lui_ possible?


----------



## francisgranada

berndf said:


> ...  I can't understand how this could have happened unless you suppose a prior stress shift.


I have the feeling that we cannot avoid the disappearance of the originally stressed syllable, whichever theory we accept. Otherwise we couldn't obtain forms like _lo, la, o, a_ and _l'_. The reason might be that practically neither of the original two syllables of the article was stressed, as the article was pronounced together with the following word (which bears the stress).

Sorry, I hadn't noticed the previous 2 posts...


----------



## francisgranada

Nino83 said:


> ... They could be disyllabic, but they were simply weak forms, with only one /L/ already in the pre-literary stages.


We can also find written examples with double LL, e.g. in the Cartularios de Valpuesta (ca. A.D. 844): _de illa costegera, de illas uineas, ad illo plano de Elzeto, ..._ (however, this text is rather a mixture of Latin and vernacular)


----------



## Nino83

francisgranada said:


> We can also find written examples with double LL, e.g. in the Cartularios de Valpuesta (ca. A.D. 844): _de illa costegera, de illas uineas, ad illo plano de Elzeto, ..._ (however, this text is rather a mixture of Latin and vernacular)



Exactly, it depends on how the text is written and who wrote it.
For example, in Venice around 1300 there are, from the same period, vernacular texts where all final unstressed vowels are dropped and texts where they are retained except before a final _r, l, s_. The second kind is obviously written in the Italian _koiné_ spoken by merchants in the port of Venice (upper class).



berndf said:


> How are pronouns like _lui_ possible?



Are you speaking of _lui_ in French or in Italian?

_Lui_ comes from the Vulgar Latin/Proto-Romance oblique/dative form _ǐllui_ (by analogy with _cui_, dative of _qui, quae, quod_) with the first syllable deleted.
_Loro/leur_ come from the Vulgar Latin/Proto-Romance oblique form derived from _ǐllorum_, with the first syllable deleted.

In Tuscan, _lui_ is a tonic/*stressed* oblique pronoun (dico *a lui*, vai *con lui*), and it is also used as a subject pronoun instead of *egli*; it is not used in unstressed environments. The same usage is found in Romanesco. In the Southern languages we have _iɖɖu_ (Sicilian) and _illə_ (ancient Neapolitan, Apulian) from _illum_, and _chillə/issə_ (modern Neapolitan, from _hoc illum_ and _ipsum_), all used both as tonic oblique and as subject pronouns.

In Tuscan, _loro_ is a tonic/*stressed* oblique pronoun (dico *a loro*, vai *con loro*) and a subject pronoun, as it is in Romanesco and modern Neapolitan. In Sicilian we have _iɖɖi_ (from _illi_) as a tonic (stressed) and oblique pronoun.

Now, in Tuscan _loro_ is also used as a dative, but only after the verb, i.e. it has stress: dico *loro* = gli *di*co (in the other Italian languages there is _gli/je/lə/cə/ci_).

Anyway, in these forms, even though they are stressed, the first syllable of _illui/illorum_ was dropped.

In French we also have _lui_ and _leur_ in *unstressed* position (je lui *dis*, je leur *dis*), but the pronunciation is different.
In Italian it is [ˈluːi̯ ] /lu.i/, while in French it is [ˈlɥi], one syllable.
_Leur_ is also one syllable in French, [ˈlœʁ], while in Tuscan and standard Italian _loro_ is [ˈloːro] /lo.ro/; this is why it is used only in *stressed* position.


----------



## danielstan

There is a _lui_ in Romanian, too, coming from the same dative form _ǐllui_.
E.g.: French: je *lui *dis = Romanian: eu îi zic *lui*
French: je *leur *dis = Romanian: eu le zic *lor*


----------



## Nino83

danielstan said:


> E.g.: French: je *lui *dis = Romanian: eu îi zic *lui*
> French: je *leur *dis = Romanian: eu le zic *lor*



It seems that in Romanian they are used in *stressed* position, as happens in the Italian languages.


----------



## danielstan

Well, I don't know the concept of "stressed position" (I guess it means the stressed word in a sentence); I thought the problem with the disyllabic forms _*illo/*illa_ was the stressed syllable.

Is the Italian _lui_ pronounced in 2 syllables, like LU-I (I don't know the international phonetic notation, sorry)?


----------



## Nino83

danielstan said:


> Is the Italian _lui_ pronounced in 2 syllables, like LU-I (I don't know the international phonetic notation, sorry)?



Yes, it is.


----------



## danielstan

Romanian _lui _is monosyllabic.

Off topic:
I envy the Spanish and Italian linguists for the multitude of texts they have, probably from every century of the last 2000 years.
For Romanian, the oldest preserved text dates from 1521 AD; besides that there are only a few isolated glosses reproduced in Byzantine sources (written in Greek) in the 10th century, and a few toponyms and anthroponyms mentioned in Romanian documents written in Old Church Slavonic from 1300 AD onwards...


----------



## Hulalessar

danielstan said:


> I envy the Spanish and Italian linguists for the multitude of texts they have, probably in every century of the last 2000 years.



The snag, though, is the lack of documents in the vernacular during the period when the various Romance languages were emerging. The earliest document regarded as being in a Romance language is the Strasbourg Oaths, which date from 842, 366 years after the fall of the Western Roman Empire. There is precious little record of what Vulgar Latin was like, and it must have changed significantly over the centuries. Well after people stopped speaking anything we would recognise as Latin, if you were a Romance speaker and wrote, you wrote in Latin and not in the vernacular. With some texts there is uncertainty whether they are an attempt to write the vernacular or "bad" Latin influenced by the vernacular. So, despite the continuity of documents from ancient times to today, there is a gap in knowledge which has to be filled by hypotheses. Hence this thread.


----------



## wtrmute

berndf said:


> The book you quoted says it: "*lack of stress* led to the loss of one or other of the vowels".
> 
> So if, as you said, the "weak" article form carried no stress at all, neither on the first nor on the second syllable, then either can be lost, more or less randomly. The question that remains is then: How are pronouns like _lui_ possible?



That is what I'd expect: an atonic _illum/illam_ yields the article (in fact, all Romance articles today are atonic as well, I think?) and a tonic _illum/illam_ yields the personal pronoun.


----------



## danielstan

I cannot imagine a word of 2 syllables with no stress on either of them (because in the languages I know I have found no such case).
Could you give an example of such a word in any language, or explain how an atonic _illum_ would be pronounced in comparison with a tonic _illum_?

(I have no linguistic training, so I am not familiar with some of the terms used here.)


----------



## Nino83

danielstan said:


> I cannot imagine a word of 2 syllables with no stress on any of them (because in the languages I know I found no such case).



Francis probably meant "sentence stress". In Romance languages, articles, like monosyllabic prepositions and unstressed object pronouns, don't carry the strongest stress in the sentence.

For example, in _gli ho *de*tto, je lui ai *di*t,_ primary stress is on the verb, while _gli/lui_ are unstressed and _ho/ai_ have a secondary stress.
This is why, in unstressed position, those articles and object pronouns derived from _ille_ were shortened.

A simple way to understand sentence stress is vowel reduction in English.
_He went to the shop_ and _this is for you_ are pronounced _he *we*nt tə ðə shop_ and _this is fə *you*_ (_to, the_ and _for_ are unstressed and reduced).


----------



## virgulino

My two cents: Romance-based creole languages with African substrata usually use the definite article very differently from their superstrata. Haitian Creole places the definite article after the noun ("liv la"/"le livre") and Portuguese-based creoles don't even have a definite article.

Also, the use of the definite article in the Bahian dialect of Brazilian Portuguese suggests a decreolization process. Bahians rarely use the definite article before given names in a familiar context, which is usual both in Portugal and in the rest of Brazil. (Bahian "O filho _de _Paulo está?"/BP and EP "O filho _do_ Paulo está?")


----------



## wtrmute

danielstan said:


> I cannot imagine a word of 2 syllables with no stress on any of them (because in the languages I know I found no such case).
> Could you give an example of such a word in any language, or explain how an atonic _illum_ would be pronounced in comparison with a tonic _illum_?
> 
> (I have no linguistic training, so I am not familiar with some of the terms used here.)



Generally, an atonic word (which has no stress) will end up as a clitic (it will "lean on" another word), and phonetically they will probably look as if they're extra syllables on the appropriate word.

As for examples, I don't think I can quote any off-hand, but in Spanish there are groups of oblique pronouns which are atonic and "cluster together" on the verb. For example, in _No quieren dártelos_ ("they don't want to give them to you"), the oblique pronouns _te_ and _los_ are atonic and are hung off the infinitive verb _dar_. The phonetic realisation is not unlike what it would be if they were analysed as a single disyllabic word _*telos_, with no accent on either syllable.

EDIT: Thinking back after a while, there is the Spanish/Portuguese preposition _para_ "for" (ultimately from Latin _pro_/_per_ + _ad_). Two syllables, neither of which has stress. Often the syllables get deleted and one gets forms like _pra_, _pa_, _par'_, etc. running onto the next word, which is the word it modifies.


----------



## Ronin81

A brand new paper on the subject http://www.mcser.org/journal/index.php/mjss/article/viewFile/7800/7468


----------



## danielstan

I read that paper, and from its first page I was astonished by the conclusions:



> We suggest that the _main source_ of article borrowing into the ancient languages of Western Europe (Germanic and Romance) was_ the Bible_. Supposedly, the grammatical category in question penetrated into the languages when the Bible was translated into national languages. We present a historical analysis of literary monuments in Old French, Old Spanish, Old German, and Old English. This shows that these languages had acquired the article before the Bible was translated into the mentioned national languages. It allows us to suppose that _Ulfilas’ Gothic Bible, which appeared earlier, was the source of penetration of the article into Western European languages_. This assumption is based on the analysis of literary monuments in ancient languages spoken in Europe, as well as on the comparison of the geographical spread of the article in European languages and the map of Gothic conquests in the 6th century AD.



So, the author makes a distinction between the definite article in the Western Romance languages (where this article is proclitic, i.e. before the noun) and the Eastern Romance languages (Romanian and its Balkan dialects), where this article is enclitic (after the noun). But she does not explain how this article (with the same grammatical function) developed in all Romance languages.

As for the mechanism suggested by the author, I am amazed to read that the Bible was the source of a grammatical change in all Romance languages, at a time when most of the population was illiterate and there was no printing technology (so Bibles were copied by hand and were few in number).
Indeed, there are learned loanwords from medieval Latin which have entered the Western Romance languages, but they are isolated and few in number compared with the inherited Latin words.
But there is no grammatical change propagated through the Bible in the history of the Romance languages.
Not even in modern times, when information is easily accessible to all speakers of a language, are there any such linguistic developments triggered by books.
------------------
As I pointed out in another post of mine, there are indications that in Romanian the definite article was "floating" before and after the noun, because some Romanian words lack the initial "a" of their Latin ancestors:
rom. _prier _< lat. _Aprilis _("month of April")
rom. _miel _< lat. _agnellum _("lamb")
rom. (dialectal in Banat region) _nămaie _< lat. _animalia_

which suggests a phase where the feminine definite article *a* (lat. _illa_ > _la_ > _a_) was used before the noun and confused by speakers with the initial _"a"_ in the preceding examples:
lat. _agnellum _= "_a_" + _gnellum _> "_a_" + rom. _miel_

Thus the Romance definite article was an evolution of a Vulgar Latin definite article which was floating in the beginning and which evolved into a proclitic article in the West and an enclitic article in the East.

For the floating position of the definite article, see the text _Peregrinatio Aetheriae_, where the demonstrative _*ille*_ is used in abundance before or after the noun: http://www.orbilat.com/Languages/Latin_Vulgar/Texts/Peregrinatio_Aetheriae.html
----------------
Apart from the issue of the Romanian definite article, there is another example of popular etymology which influenced the evolution of words:

rom. _buric _("navel") < lat. *_umbulicus_

The reconstructed evolution:
lat. _umbilicus _> VL *_umbulicus _> _umbu*l*icu _> _umbu*r*icu _(rhotacism of intervocalic single /L/) > _umburic _>
confusion with the expression "_un buric_" ("a navel") > rom. _buric_


----------

