# Romanian Transliteration



## IreneStr

I am currently working on a transliteration module for as many languages as possible. I'd like to include Romanian as well, but I am not sure about the transliterations.
The transliterations will be used in URLs and in case people only have a non-Romanian keyboard available. Therefore, the non-standard Latin characters that are used in the Romanian alphabet should be transliterated to standard Latin characters. Please imagine you are writing on an old-fashioned British typewriter without diacritics/accents. What would you write in that situation?

This is my current list:

ă --> a
â --> a
î --> i
ș --> s? sh?
ț --> t? tz?

Are the above transliterations correct? What would you write for ș and ț? Do you have additions? Have I overlooked certain characters?

Thank you!


----------



## naicul

Wasn't there an effort to support internationalized domain names? Anyway, here is what you should probably use (the same is used even by some Romanian news sites, e.g. HotNews.ro - Actualitate):
ă - a
â - a
î - i
ș - s (note that some publications are also - wrongly - using the letter ş instead of ș (s with comma))
ț - t (note that some publications are also - wrongly - using the letter ţ instead of ț (t with comma))

And here is the full Romanian alphabet: Romanian alphabet - Wikipedia, the free encyclopedia
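
For what it's worth, here is a minimal Python sketch of the mapping above. It is only an illustration, not code from any existing module: the names `ROMANIAN_TO_ASCII` and `strip_romanian_diacritics` are made up, and the upper-case letters are my own addition, on the assumption that they should be handled the same way as the lower-case ones. The cedilla forms are mapped as well, since they turn up in existing text.

```python
# Hypothetical sketch: map Romanian letters with diacritics to basic Latin.
# Code points are written as escapes so that the comma-below and cedilla
# forms, which look almost identical, cannot be confused.
ROMANIAN_TO_ASCII = str.maketrans({
    "\u0103": "a", "\u0102": "A",  # a with breve
    "\u00E2": "a", "\u00C2": "A",  # a with circumflex
    "\u00EE": "i", "\u00CE": "I",  # i with circumflex
    "\u0219": "s", "\u0218": "S",  # s with comma below (correct form)
    "\u021B": "t", "\u021A": "T",  # t with comma below (correct form)
    "\u015F": "s", "\u015E": "S",  # s with cedilla (wrong but common)
    "\u0163": "t", "\u0162": "T",  # t with cedilla (wrong but common)
})


def strip_romanian_diacritics(text: str) -> str:
    """Replace each mapped letter; every other character is left unchanged."""
    return text.translate(ROMANIAN_TO_ASCII)
```

Called on a word spelled with either the comma or the cedilla form of ș, this should give the same plain-Latin result.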


----------



## IreneStr

Thank you naicul! Your information is really helpful.


----------



## patriota

The concept of "transliteration" is about writing a language with a whole different alphabet. Romanian already uses the Latin alphabet. The word you had in mind was "transcription".

Anyway, URLs in most languages that use Latin letters with diacritics simply use the base forms of those letters when Unicode isn't implemented. That's the case of websites in languages such as Romanian, Portuguese, and Vietnamese. Changing them to something else would be counterproductive to the user experience and search engine rankings.

The only special case I'm aware of is German websites, which often follow some patterns, like replacing _ü_ with _ue_.


----------



## IreneStr

Thank you patriota for your input. 
Changing a character with a diacritic into multiple characters is more frequent than you might think. It is common in many Nordic languages as well. For example Danish ø --> oe (but Faroese ø --> o), Nynorsk and Finnish å --> aa, Swedish ä --> ae. Whether just dropping diacritics or changing a character to multiple characters is better for user experience and rankings therefore depends on the language. That is why I'm working on language-specific modules and why I like to ask native speakers.
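
To illustrate why the rules have to be per language, here is a rough Python sketch. It is not the actual module: `LANGUAGE_RULES` and `transliterate` are hypothetical names, the language codes are assumed to be ISO 639-1, and the table only holds a few of the lower-case examples mentioned in this thread.

```python
# Hypothetical per-language rule tables; one character may map to several.
LANGUAGE_RULES = {
    "da": {"\u00F8": "oe"},  # Danish: o with stroke -> oe
    "fo": {"\u00F8": "o"},   # Faroese: o with stroke -> o
    "de": {"\u00FC": "ue"},  # German: u with umlaut -> ue
    "ro": {"\u0103": "a", "\u00E2": "a", "\u00EE": "i",
           "\u0219": "s", "\u021B": "t"},  # Romanian: drop the diacritic
}


def transliterate(text: str, lang: str) -> str:
    """Apply the rules for `lang`; unknown languages leave the text as-is."""
    rules = LANGUAGE_RULES.get(lang, {})
    return "".join(rules.get(ch, ch) for ch in text)
```

The same input character can then give different output depending on the language code passed in, which a single global table could not do.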


----------



## patriota

Interesting. I wonder if that only happens in Germanic languages. Turkish, Hungarian and Irish do the same as Latin languages and Vietnamese.


----------



## jimmyy

I agree with patriota. I did some transliteration from Cyrillic to the Latin alphabet. I also believe that the transcription is less frequent. If we are talking about writing properly in a certain language, then diacritics would be used; otherwise, for writing with Latin characters (basic Latin), let's say in an SMS or email, one would simplify and just replace å --> a with one a.

I've worked with Germans, and there was one pedantic person who had a name with an umlaut in it, and he never complained when I wrote to him without the dots, especially in emails. In official documents it was different.

I would be curious to learn in which circumstances the transcription that Irene mentioned is used.


----------



## IreneStr

Hi Jimmyy, Thank you for your input. The transliteration module I am working on is for website URLs. For example, if the title of your website would be 'cămaşă', it would change your URL to mywebsite.com/camasa, instead of mywebsite.com/cămaşă, because many browsers would change 'ş' and 'ă' to something illegible.
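
A minimal sketch of the idea in Python, for languages where the diacritic is simply dropped. This is not the module itself: `slugify` is a hypothetical name, and the approach (Unicode decomposition, then removing combining marks) is just one generic way to do it. Letters that do not decompose, such as ø, are discarded by the ASCII step here, so they would need an explicit rule first.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lower-case ASCII URL path segment."""
    # Split each accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks (breve, circumflex, comma below, cedilla...).
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Discard anything that is still outside ASCII.
    ascii_only = base.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")
```

Because both the comma-below and the cedilla forms decompose to a plain s or t plus a combining mark, either spelling of the title ends up with the same URL.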


----------



## jimmyy

IreneStr said:


> Hi Jimmyy, Thank you for your input. The transliteration module I am working on is for website URLs. For example, if the title of your website would be 'cămaşă', it would change your URL to mywebsite.com/camasa, instead of mywebsite.com/cămaşă, because many browsers would change 'ş' and 'ă' to something illegible.


Thank you, very interesting. I think Wikipedia does such transliterations, or at least they have a system to handle it.


----------



## naicul

jimmyy said:


> I agree with patriota. I did some transliteration from Cyrillic to the Latin alphabet. I also believe that the transcription is less frequent. If we are talking about writing properly in a certain language, then diacritics would be used; otherwise, for writing with Latin characters (basic Latin), let's say in an SMS or email, one would simplify and just replace å --> a with one a.
> 
> I've worked with Germans, and there was one pedantic person who had a name with an umlaut in it, and he never complained when I wrote to him without the dots, especially in emails. In official documents it was different.
> 
> I would be curious to learn in which circumstances the transcription that Irene mentioned is used.


I have a (Norwegian) colleague who signs his emails "Kaare". His name is Kåre.


----------

