# In Google we trust



## timpeac

Hi

I would like your opinions on how much trust we should put in google when searching for language usages. I am a firm believer in the fact that there are enough nutters in cyberspace for most things to have been written - rightly or wrongly - at some point and so the fact that you find a certain number of examples of a certain usage is not necessarily proof that it is correct/acceptable or even remotely admitted as a possibility by native speakers.

But how many hits showing a certain usage are necessary for us to say "whatever books on grammar, vocabulary or usage may say, it is clear that x or y usage/word/idiom is certainly said by a reasonable number of speakers?" In the French forum I just googled the supposedly French word "méditeur" which doesn't exist and had 135 hits (mostly misspellings of "médiateur" = "mediator"). Or would you say that that number of hits is enough to say that the word "méditeur" exists?

What are the pitfalls of googling? Should you try to exclude sites written by foreign speakers in your target language? How would you go about that?

It seems to me that google is a hugely useful and relevant source. But as with all statistical evidence you can prove pretty much whatever you want. What is the best way to get "clean" evidence?


----------



## Cath.S.

Besides...


----------



## timpeac

egueule said:
			
		

> Besides...
> just to make you smile, Tim, then I'll delete it!


 

75,400 hits for "shakspeare"!! Oh that's just fantastic. Actually, I've just started a thread on the theme of googling in the culture forum. Could you move this message and yours there? Cheers muchly.


----------



## astronauta

I love it and hate it....

It's highly commercial but useful.

However, have you tried looking for the direct phone # or postal address of a hotel???

It's almost impossible to find it unless you also google the area code as well; you get a zillion travel agent pages....

About what you said, New Scientist magazine published an article about the zillion pages that contain misspellings, including universities and things of the sort...

I don't know, I would not know how to answer your question.


----------



## Kelly B

I wish I could accurately remember the example I looked at recently. There was an ASTONISHING number of hits on a misspelled word. Google was smart enough, however, to ask "did you mean _*correct spelling*_?" One suggestion, then, is that it is very important to check for that little block. The fact that vast numbers of people use it incorrectly does not make it correct (yet!), particularly when the point here is to discuss correct spelling and grammar with people who care about that sort of thing. 
As you said, many Google entries are written by non-native speakers, increasing the likelihood of errors. They don't have it emblazoned across their websites any more than they have it written on their foreheads, and it's much harder to know, in the absence of an accent.... I would not be surprised to find that this is very often the case on university sites, given the demographics of those pursuing advanced degrees in the U.S.


----------



## rob.returns

From my point of view, if you do google, you need to verify that information on other sites. Separate fact from fiction. Ask questions. Do research. Know the key words. That way you get more truth and less trash on your topic.
And know "HOW" to search...


----------



## timpeac

astronauta vegetariana said:
			
		

> I love it and hate it....
> 
> It's highly commercial but useful.
> 
> However, have you tried looking for the direct phone # or postal address of a hotel???
> 
> It's almost impossible to find it unless you also google the area code as well; you get a zillion travel agent pages....
> 
> About what you said, New Scientist magazine published an article about the zillion pages that contain misspellings, including universities and things of the sort...
> 
> I don't know, I would not know how to answer your question.


 
Hi Astronauta

Your comments are interesting and true - however I need to make sure this thread stays on track -

I am asking *only* about the use of google as a tool to get linguistic evidence.

Please feel free to start a thread on the merits of google generally, I think that would be very interesting too!! 

Thanks! Tim


----------



## timpeac

I suppose I should give my opinion - I would say that a google hit count of fewer than 500 is pretty meaningless in terms of verifying a usage. I would like to see at least 10,000 before saying something is common.

However, of course, common does not equal right - as Egueule's googling of "Shakspeare" instead of "Shakespeare" returned 75,400 hits!! I imagine that is a bit of an exception since his name is very unusual but it goes to show some of the pitfalls.


----------



## panjandrum

Use of Google - or any equivalent search - needs to be accompanied by intelligence. As clearly stated by others, it finds the characters you search for. This gives no value judgement on the sites.
So you need to look at a good sample of the actual sites found. It is usually obvious whether or not they are credible - can you cite them as instances of legitimate use of the character string you searched for? 

Without care, you could find yourself endorsing usage that appears only in blogs.

Useful refinements are to list only UK sites or sites in a particular language.

I used Google recently to assess use of corpse v. body. Body outnumbered corpse by about 10:1 - but significantly, sites listed for "<words and> corpse" were fiction, sites for "<words and> body" were fact.
As with all information analysis, a great deal of judgement must be applied. Careless use of raw counts from Google should be punished severely.

You can get a much less random analysis of use of English by searching the British National Corpus.

Hit counts of <100 can still be useful, if you know why. My body count was looking for a very specific set of key words. But sure enough, anything that comes up with a low hit count is likely to be due to your poor spelling.

I should add that I am an information sceptic. I never trust any source on its own; always look for confirmation; never trust the first analysis; pain in the butt really.


----------



## Cath.S.

> What is the best way to get "clean" evidence?


There is none. Language is dirty in essence. 

But going back to your _méditeur_ example, we know the word does not - yet - exist because that's not what the person who spelled it that way intended to write.
I know it's a pretty hard question, but we must ask ourselves, every time we use Google, did the writer really intend to write what he did? 

In my book, a word or usage only exists if it is voluntary.


----------



## germinal

egueule said:
			
		

> Besides...


 

Interesting choice Shakespeare but you probably know that his name was often spelled in different ways:   

*1. Introduction*

One of the most common articles of Oxfordian faith is that there is great significance in the various spellings of Shakespeare's name. The spelling "Shakespeare," according to most Oxfordians, was used to refer to the author of the plays and poems, while the spelling "Shakspere" (or "Shaksper," in the version sometimes promoted by more militant Oxfordians such as Charlton Ogburn) was used to refer to the Stratford man. A milder version of this claim acknowledges that Elizabethan spelling was not absolute, but still says that the usual and preferred spelling of the Stratford man's name was "Shaksper(e)," as opposed to the poet "Shakespeare." These claims about spelling are usually accompanied by an assertion that the two names were pronounced differently: "Shakespeare" with a long 'a' in the first syllable, as we are accustomed to pronouncing it today, but "Shakspere" with a "flat" 'a,' so that the first syllable sounds like "shack." A separate but related claim involves hyphenation: the name was occasionally hyphenated in print as "Shake-speare," a fact which Oxfordians say points to it being a pseudonym. These claims are given more or less prominence in different presentations of the Oxfordian theory, but they are virtually always present in one form or another. Indeed, they are vital for the Oxfordian scenario, since they make it easier for Oxfordians to believe that the "William Shakespeare" praised as a poet was some mysterious figure with no apparent connection to the glover's son and actor "William Shaksper" from Stratford-upon-Avon. 

www.shaksper.net/archives

Germinal.

.


----------



## timpeac

germinal said:
			
		

> Interesting choice Shakespeare but you probably know that his name was often spelled in different ways:
> 
> *1. Introduction*
> 
> One of the most common articles of Oxfordian faith is that there is great significance in the various spellings of Shakespeare's name. The spelling "Shakespeare," according to most Oxfordians, was used to refer to the author of the plays and poems, while the spelling "Shakspere" (or "Shaksper," in the version sometimes promoted by more militant Oxfordians such as Charlton Ogburn) was used to refer to the Stratford man. A milder version of this claim acknowledges that Elizabethan spelling was not absolute, but still says that the usual and preferred spelling of the Stratford man's name was "Shaksper(e)," as opposed to the poet "Shakespeare." These claims about spelling are usually accompanied by an assertion that the two names were pronounced differently: "Shakespeare" with a long 'a' in the first syllable, as we are accustomed to pronouncing it today, but "Shakspere" with a "flat" 'a,' so that the first syllable sounds like "shack." A separate but related claim involves hyphenation: the name was occasionally hyphenated in print as "Shake-speare," a fact which Oxfordians say points to it being a pseudonym. These claims are given more or less prominence in different presentations of the Oxfordian theory, but they are virtually always present in one form or another. Indeed, they are vital for the Oxfordian scenario, since they make it easier for Oxfordians to believe that the "William Shakespeare" praised as a poet was some mysterious figure with no apparent connection to the glover's son and actor "William Shaksper" from Stratford-upon-Avon.
> 
> www.shaksper.net/archives
> 
> Germinal.
> 
> .


 
Ah, excellent quote Germinal. I had a feeling that the present spelling of Shakespeare was more by convention than anything else. But that is really interesting for trusting google - it may not tell us the most accepted form, but a hit rate as high as 75k is enough to show that the variant is far from unheard of!! Maybe this is just an example of how much most people couldn't care less if they are told to spell something a certain way, they will carry on expressing themselves as they see fit regardless.


----------



## Cath.S.

Tim, be careful with the numbers Google quotes!
The first number you get includes duplicates.
In fact there are only *783* pages that contain the Shakspeare spelling. You cannot trust the number that appears on the first page of results, what you have to do in order to get the real number of instances is this:
1. select advanced search
2. set the number of results per page to 100
3. go to the *last* page of results, at the bottom of the page you'll see a message like this one:
_In order to show you the most relevant results, we have omitted some entries very similar to the 783 already displayed._

_Edit_
_Germinal, the information you give about alternate spellings of Shakespeare is truly fascinating! Thanks!_


----------



## cuchuflete

To Tim's original question...usage, when prolonged and widespread, finds its way into grammar books and dictionaries. The following logically meaningless atrocity is on its way into the English language, and may have arrived, despite my futile protests:



> Results *1* - *10* of about *1,050,000* for *"very unique"*.


----------



## timpeac

egueule said:
			
		

> Tim, be careful with the numbers Google quotes!
> The first number you get includes duplicates.
> In fact there are only *783* pages that contain the Shakspeare spelling. You cannot trust the number that appears on the first page of results, what you have to do in order to get the real number of instances is this:
> 1. select advanced search
> 2. set the number of results per page to 100
> 3. go to the *last* page of results, at the bottom of the page you'll see a message like this one:
> _In order to show you the most relevant results, we have omitted some entries very similar to the 783 already displayed._


 
Thanks Egueule - this is the sort of tip I was hoping for, along with the discussion of the merits of using google for linguistic information.


----------



## timpeac

cuchuflete said:
			
		

> To Tim's original question...usage, when prolonged and widespread, finds its way into grammar books and dictionaries. The following logically meaningless atrocity is on its way into the English language, and may have arrived, despite my futile protests:


Haha, it's a brave new world Cuchu.


----------



## Edwin

Here are some more or less scholarly articles on this topic:

Google as a Quick'n Dirty Corpus Tool --TESL-Electronic Journal 

The Biggest Corpus of All 

Automatic Meaning Discovery Using Google

The Normalized Google Distance  (article describing the previous paper)


----------



## timpeac

Edwin said:
			
		

> Here are some more or less scholarly articles on this topic:
> 
> Google as a Quick'n Dirty Corpus Tool --TESL-Electronic Journal
> 
> The Biggest Corpus of All
> 
> Automatic Meaning Discovery Using Google
> 
> The Normalized Google Distance (article describing the previous paper)


 
That's absolutely brilliant - thanks!


----------



## lsp

I've questioned the use of Google in these forums many times. Spelling is a different case than grammar, because, as mentioned, Google will suggest an alternative. But I think that the numbers alone don't mean anything, anyway. They are not a percentage of anything meaningful. The sample itself is not pure. Our examples of misspellings and bad grammar serve to increase the instances in search results. Non-natives seeking to be corrected are included. Teen blogs, song lyrics, quizzes with multiple choice answers... and so on and so on. And there are plenty of sites that would likely add to the counts on the side of grammatical correctness but don't because they are subscription based or not well indexed for Google's spiders (ever notice the Wall St. Journal doesn't come up in any Google results?). It's too soon IMHO to declare the mere number of results a fair resource to guide our understanding of language usage.


----------



## Edwin

lsp said:
			
		

> I've questioned the use of Google in these forums many times. Spelling is a different case than grammar, because, as mentioned, Google will suggest an alternative. But I think that the numbers alone don't mean anything, anyway. They are not a percentage of anything meaningful. The sample itself is not pure. Our examples of misspellings and bad grammar serve to increase the instances in search results. Non-natives seeking to be corrected are included. Teen blogs, song lyrics, quizzes with multiple choice answers... and so on and so on. And there are plenty of sites that would likely add to the counts on the side of grammatical correctness but don't because they are subscription based or not well indexed for Google's spiders (ever notice the Wall St. Journal doesn't come up in any Google results?). It's too soon IMHO to declare the mere number of results a fair resource to guide our understanding of language usage.



LSP, in fact Google does give hits on WSJ articles. See 
"Wall Street Journal"  Bush 


Although counts don't count for grammatical constructions, I find it sometimes helpful to search on expressions restricted to Spanish language pages and then actually open some of the pages and try to determine the author and type of site.


----------



## lsp

Edwin said:
			
		

> LSP, in fact Google does give hits on WSJ articles. See
> "Wall Street Journal"  Bush


Only one of the results was an article within the domain of the Wall St. Journal. The rest referred to the journal but were on other sites. I am surprised there was even one, but if you were to have googled a quote from today's paper, without adding WSJ to your search terms, you'd have gotten no results.


----------



## panjandrum

Edwin said:
			
		

> Although counts don't count for grammatical constructions, I find it sometimes helpful to [...] actually open some of the pages and try to determine the author and type of site.


Absolutely agree.  Unless you do that for some of the links that are listed, you have no idea what you are really counting.


----------



## timpeac

lsp said:
			
		

> I've questioned the use of Google in these forums many times. Spelling is a different case than grammar, because, as mentioned, Google will suggest an alternative. But I think that the numbers alone don't mean anything, anyway. They are not a percentage of anything meaningful. The sample itself is not pure. Our examples of misspellings and bad grammar serve to increase the instances in search results. Non-natives seeking to be corrected are included. Teen blogs, song lyrics, quizzes with multiple choice answers... and so on and so on. And there are plenty of sites that would likely add to the counts on the side of grammatical correctness but don't because they are subscription based or not well indexed for Google's spiders (ever notice the Wall St. Journal doesn't come up in any Google results?). It's too soon IMHO to declare the mere number of results a fair resource to guide our understanding of language usage.


 
Ah - this highlights a very important point for me - I do not try to use google to find the "correct" usage - 9 times out of 10 this will just be the usage of the people with most power at the time, and there are lots of (conflicting) books to refer to on the issue - I use it to find out what is really said by people.

I agree you need to be really careful with using google to support usage - however, surely there is some number where you need look no further - it must be a normal spelling-usage. I don't know, if you had a million hits for something would you not be willing to bet a sizeable sum on that evidence alone that it is a normal usage?


----------



## Merlin

Google, google, google. Some say it's not accurate. Some say it is. All I can say is it helps a lot. You just have to find a way to get the best ones. It's a great start if you're searching for something. We just have to wisely use it...


----------



## lsp

timpeac said:
			
		

> Ah - this highlights a very important point for me - I do not try to use google to find the "correct" usage - 9 times out of 10 this will just be the usage of the people with most power at the time, and there are lots of (conflicting) books to refer to on the issue - I use it to find out what is really said by people.
> 
> I agree you need to be really careful with using google to support usage - however, surely there is some number where you need look no further - it must be a normal spelling-usage. I don't know, if you had a million hits for something would you not be willing to bet a sizeable sum on that evidence alone that it is a normal usage?


Reluctantly, with caveats. I still feel we don't know enough about or consider seriously enough what is searched, what is not, etc., to take numbers - even really big ones - at face value, which so many already are doing. I feel the tide is overwhelming, which means I'm afraid that further explanations won't be coming anytime soon as people accept what they already get from google as the end all and be all of incontrovertible scientific research, rather than interesting anecdotal information.


----------



## cuchuflete

It's curious that in all of this discussion about what Google is and is not good for, we have had such scant mention...one or maybe two posts...of the statistical sample from which Google data is extracted.

Imagine the largest concentric circle: All speakers and writers, both native and non-native, of a language. Now remove the majority of them, who do not have a computer or access to one. That tends to leave the wealthiest, together with business and non-commercial organizations, including government agencies. This inner circle is the population from which web pages emerge.

Perhaps in English and French one might argue that the computer users, and their written usage on line, are fairly representative of the population as a whole.  I am not so sure about that.  In Portuguese, a much smaller percentage of the total speakers have computer access.  


Google may be a useful tool in approximating the current usage of a significant sub-set of a population, but we need to take care in extrapolating to the total population, or we will risk some Google sized errors.   Groups probably under-represented in the Google sample include those over 50 years of age, the poor, and people in more rural areas without telecommunications infrastructure.  

In other words, if you are a Maine lobsterman, Google may not reflect your usage, while if you work for a bank in London, your words carry extra weight.


----------



## timpeac

cuchuflete said:
			
		

> It's curious that in all of this discussion about what Google is and is not good for, we have had such scant mention...one or maybe two posts...of the statistical sample from which Google data is extracted.
> 
> Imagine the largest concentric circle: All speakers and writers, both native and non-native, of a language. Now remove the majority of them, who do not have a computer or access to one. That tends to leave the wealthiest, together with business and non-commercial organizations, including government agencies. This inner circle is the population from which web pages emerge.
> 
> Perhaps in English and French one might argue that the computer users, and their written usage on line, are fairly representative of the population as a whole. I am not so sure about that. In Portuguese, a much smaller percentage of the total speakers have computer access.
> 
> 
> Google may be a useful tool in approximating the current usage of a significant sub-set of a population, but we need to take care in extrapolating to the total population, or we will risk some Google sized errors. Groups probably under-represented in the Google sample include those over 50 years of age, the poor, and people in more rural areas without telecommunications infrastructure.
> 
> In other words, if you are a Maine lobsterman, Google may not reflect your usage, while if you work for a bank in London, your words carry extra weight.


 
Very good points, Cuchu. You have underlined the exclusion of the usage of certain native speakers, so I will mention some other implications of this - the internet contains a lot of other evidence of usage other than current chatty usage. For a start it is going to favour the written over the spoken (so if you are someone who would always write "whom" but only ever say "who" you will be misrepresented). It will also contain quotes from people who haven't got access to the internet, the greatest majority of whom must be dead I suppose. So the fact that you can almost certainly find all of William Shakespeare's plays on line somewhere will influence our google-based decision on whether it is more usual to say "I am to bed" or "I'm off to bed".

It is part of my question as to how we can best filter down our body of evidence before googling - eg should we try to exclude non-natives? This is not a black and white question for me, since I do not think per se that someone can, or should be able, to influence a language only if they speak it as a mother tongue. But obviously you don't want people who can't speak it to save their lives to have influence either. Particularly for English, or rather English in its globish form, it would seem relevant to look at how Italians speak to Portuguese in English and vice versa.


----------



## panjandrum

Here is an illustrative example.
Supposing someone in these forums asked:
I am working in the elephant house in XXXXX zoo. I provide the elephant with food and water. I know I can say "I feed the elephant," but can I also say, "I water the elephant"?

So, here we go a googling (among the leaves so green..........)
....pause. Whistle Greensleeves.....

And I come back with this reply:
Thank you for your really interesting question. I have completed an exhaustive search of the known literature and can confirm that "water the elephant" is perfectly normal usage. It is a little less common than "feed the elephant", of course, but that is not surprising because the keepers must actually provide food for the elephants whereas the water is normally available from an automatic source.
Source: Google:
"feed the elephant" = 1,310 hits
"water the elephant" = 423 hits.

But in fact, when you look closely, none of the first ten links actually use the phrase in the sense I was looking for. To save you the bother of doing it yourself, of the first ten links:
5 = ...water. The elephant ...
3 = ...to get water, the elephant...
1 = ...sucking it full of water the elephant....
1 = ...The water the elephant sprays...

OK, so the hit rate is way less than timpeac's suggested million, but I think this makes the point, or some point, or at least wasn't totally pointless.

My sceptic's point, I suppose, is that Google is one of many sources.  Never trust any source - alone, and always make sure you know the quality and the characteristics of each.
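
A minimal sketch of that kind of sample inspection, assuming you have copied the result snippets into a list by hand. Nothing here queries Google, and the function name `classify` and its labels are made up for illustration. It separates hits where punctuation splits the phrase from hits where the words really are contiguous:

```python
import re
from collections import Counter


def classify(snippet: str, phrase: str) -> str:
    """Label how `phrase` occurs in `snippet` (a rough heuristic, not a parser)."""
    words = [re.escape(w) for w in phrase.split()]
    # Contiguous: the words are separated by whitespace only.
    contiguous = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    # Loose: the words appear in order, but punctuation may sit between them.
    loose = re.compile(r"\b" + r"\W+".join(words) + r"\b", re.IGNORECASE)
    if contiguous.search(snippet):
        return "contiguous"
    if loose.search(snippet):
        return "split by punctuation"
    return "absent"


# The four snippet patterns reported in the post above.
snippets = [
    "...water. The elephant ...",
    "...to get water, the elephant...",
    "...sucking it full of water the elephant....",
    "...The water the elephant sprays...",
]

tally = Counter(classify(s, "water the elephant") for s in snippets)
for label, count in tally.items():
    print(f"{label}: {count}")
```

On these four patterns, two are split by punctuation and two are contiguous, yet neither contiguous one uses "water" as a verb. So a script like this can narrow the sample, but it does not replace reading the pages.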


----------



## timpeac

panjandrum said:
			
		

> OK, so the hit rate is way less than timpeac's suggested million, but I think this makes the point, or some point, or at least wasn't totally pointless.


 
Pan - your points are excellent as always!! Can I please just point out that I was not suggesting a million - I was only picking a number so ridiculously large that it is pretty much beyond argument that there is something behind the usage. I would love to be able to bring that figure down somehow to something we can all agree is reasonable - I don't think that is going to be very likely though!!


----------



## panjandrum

Sorry Tim - I wasn't meaning to question your ridiculously large figure 

But even if you find a million of what you are looking for, you need to bear in mind that what Google counts is not necessarily what you think it counts - hence the need to inspect a good sample.


----------



## cuchuflete

I'm enjoying this thread because we have been using Google counts to determine whether or not certain words and phrases should be added to the WR translation dictionaries. 

We use a secret number, determined with total and absolute objective integrity by one M.K., to decide if a word or phrase is worthy. Translators are allowed to ignore this for certain technical terms which are not going to be in Google with any great frequency, and we use the Google counts more as a guideline than a rule.

I added a word with only about 400 citations, but it was very precise and correct in a narrow technical application. I also gave the more popular common speech version with its 50,000 appearances, and noted that the latter is colloquial. 

Where Google is of little use is determining which usages are current, and which are dated or obscure, as Panj pointed out with his mention of the dead writers.

Ah, the joys of playing Results 1 - 10 of about 103,000 for *lexicographer*


----------



## timpeac

cuchuflete said:
			
		

> Where Google is of little use is determining which usages are current, and which are dated or obscure, as *Panj *pointed out with his mention of the dead writers.


 
<<Spitting feathers>> Please! I don't mind not getting any recognition for my comments, but equating me with an Irish monkey...


----------



## panjandrum

timpeac said:
			
		

> <<Spitting feathers>> Please! I don't mind not getting any recognition for my comments, but equating me with an Irish monkey...


*~~~~Big Chuckle~~~~*
Oh - that was YOU writing about dead righters 
I spent ages looking back to find out what on earth cuchu was getting at.
I had finally concluded he was harking back to my early post about corpses and bodies - although that seemed too elusive an allusion even for cuchu.
I am so glad to have had that resolved that I will forgive your use of *that word* instead of *orangutan* - one day, maybe soon.


----------



## cuchuflete

Apologies to all those I have not offended...

"the greatest majority of whom must be dead I suppose..." was in fact the work of Pandemonium's altered ego.

To Tim goes the credit for the astute observation, to Panj the credit for his orangargantuan patience in dealing with my misattribution.

Reminds me that as a child, when we were being taught to use library resources, the English teacher kept referring to the D A B. Some sassy child finally mustered the courage to ask what it was. "Oh," she replied, "Dictionary of American Biography, but easier to remember if you think of it as DAB for Dead and Buried."


----------



## timpeac

cuchuflete said:
			
		

> Apologies to all those I have not offended...
> 
> "the greatest majority of whom must be dead I suppose..." was in fact the work of Pandemonium's altered ego.
> 
> To Tim goes the credit for the astute observation, to Panj the credit for his orangargantuan patience in dealing with my misattribution.
> 
> Reminds me that as a child, when we were being taught to use library resources, the English teacher kept referring to the D A B. Some sassy child finally mustered the courage to ask what it was. "Oh," she replied, "Dictionary of American Biography, but easier to remember if you think of it as DAB for Dead and Buried."


 
Ouch - did I write "the greatest majority"!! OK OK I take it back, it was Panj it was Panj!!


----------



## cuchuflete

timpeac said:
			
		

> Ouch - did I write "the greatest majority"!! OK OK I take it back, it was Panj it was Panj!!



They are out there, watching us.  They have friends in the silentest majority.
The smaller majorities are alive and well and counting their googles.


----------



## Edwin

I imagine that everyone reading this thread is familiar with the various advanced search techniques available with Google. For example, at  http://www.google.com/advanced_search  there are a number of available options.   In particular there are links to the following:



> Google Print - Search the full text of books
> Google Scholar - Search scholarly papers
> 
> Apple Macintosh - Search for all things Mac
> BSD Unix - Search web pages about the BSD operating system
> Linux - Search all penguin-friendly pages
> Microsoft - Search Microsoft-related pages
> 
> U.S. Government - Search all .gov and .mil sites
> Universities - Search a specific school's website



A complete list of advanced search operators can be found at http://www.google.com/help/operators.html 

Also by clicking the various links on the Google site one can find out a lot about how to get the most from Google. For example, here http://www.google.com/help/features.html you will find out about many search features including the Spell Checking that Google does for each inquiry. 

Actually there is probably enough material on optimal use of Google to justify a course. Eventually, I suppose one will be able to get a PhD in Google Studies. Maybe someone already has?


----------



## garryknight

cuchuflete said:
			
		

> Results 1 - 10 of about 1,050,000 for "very unique".


That's almost as bad as the following:



> Results *1* - *50* of about *6,770,000* for *alot*.


----------



## Edwin

garryknight said:
			
		

> That's almost as bad as the following:
> 
> Results 1 - 50 of about 6,770,000 for alot.



That's just *alot* of people being cool! The Macquarie Book of Slang says:

*alot*
(a common spelling error, but also *used deliberately as a 'cool' spelling*) adverb 1. a great deal: The traffic has eased up and we're cruising alot faster now. --adjective 2. many; a great number or amount of: It was alot of fun.


----------



## cuchuflete

And I suppose Dr. Google will give us a high number for alotment: the excessive use of that very unique word alot. Groannnnnnnnnn


----------



## modgirl

My favorite misspelling, "definately", receives

"*2,890,000* (hits) for *definately*"

Whoa baby......

It is true that several of those sites are correcting the spelling, but there are many in which the word is substituted for *definitely*!


----------



## Edwin

lsp said:
			
		

> Only one of the results was an article within the domain of the Wall St. Journal. The rest referred to the journal but were on other sites. I am surprised there was even one, but if you were to have googled a quote from today's paper, without adding WSJ to your search terms, you'd have gotten no results.



You are right, lsp. I was annoyed to find that out. Investigation shows that the WSJ is one of the many publications that are available online only for a price. However, many libraries provide access via such electronic databases as LexisNexis, Access World News, and ProQuest Newspapers. Unfortunately, LexisNexis does not provide access to the WSJ. I am informed by the reference desk of my library that "The 'ProQuest newspapers' database searches the WSJ (as well as some other papers, though not as many as LexisNexis) back to 1982 and the new 'ProQuest newspapers (historical)' database provides full-text access to the WSJ back to 1889."

Official description of Access World News: "Access world news from NewsBank provides full-text information and perspectives from over 600 U.S. and over *500 international sources*, each with its own distinctive focus offering diverse viewpoints on local, regional and world issues. Each newspaper or wire service provides unique coverage of local and regional news, including specific information about local companies, politics, sports, industries, cultural activities, and the people in the community. Paid advertisements are excluded."

Access World News might be useful for lexicographical  purposes. 

There are, of course, many other examples of databases that are accessible only for a fee. Much of the scientific literature has this problem. But by using libraries one can often get free access.


----------



## lsp

Edwin said:
			
		

> You are right, lsp. I was annoyed to find that out. Investigation shows that the WSJ is one of the many publications that are available online only for a price. However, many libraries provide access via such electronic databases as LexisNexis, Access World News, and ProQuest Newspapers. Unfortunately, LexisNexis does not provide access to the WSJ. I am informed by the reference desk of my library that "The 'ProQuest newspapers' database searches the WSJ (as well as some other papers, though not as many as LexisNexis) back to 1982 and the new 'ProQuest newspapers (historical)' database provides full-text access to the WSJ back to 1889."
> 
> Official description of Access World News: "Access world news from NewsBank provides full-text information and perspectives from over 600 U.S. and over *500 international sources*, each with its own distinctive focus offering diverse viewpoints on local, regional and world issues. Each newspaper or wire service provides unique coverage of local and regional news, including specific information about local companies, politics, sports, industries, cultural activities, and the people in the community. Paid advertisements are excluded."
> 
> Access World News might be useful for lexicographical  purposes.
> 
> There are, of course, many other examples of databases that are accessible only for a fee. Much of the scientific literature has this problem. But by using libraries one can often get free access.


Exactly, and to my earlier point, whole universes of spellings and grammatical usages that would certainly skew results in one direction or another are invisible to Google.


----------

