# Collective nouns - data <is, are> ... ?



## mrcoelho

Which of the sentences are correct regarding the use of "data + verb"?

a1. "Data about similar projects is useful for estimating project duration".
a2. "Data about similar projects are useful for estimating project duration".

b1. "Personal data is required."
b2. "Personal data are required."

I think I should use "are", but I'm really not sure.

Thanks.


----------



## marget

"Data" is plural.  Therefore, *data are* is correct.  It sounds odd to us, so we mistakenly use the verb in the singular fairly often, but the correct usage is plural.


----------



## mrcoelho

Thanks, marget. I'm asking the question because it sounds odd to me too, and I keep saying it wrong. I wanted to be sure to start trying to fix it.


----------



## panjandrum

Data is singular, data are plural, the usage varies.
In computing and associated areas, data is normally considered singular.
In statistical, scientific, and philosophical areas, data are normally considered plural.
These are generalisations.
If you are writing for a journal, or for a company, or an academic institution, find out what their house style is.

In the particular examples you give, I would regard data as singular.

For amusement, I had a look at the UK Data Protection Act.
It seems to consider data to be plural.
I also looked at UK Government guidance on applying the DP Act.
It seems to consider data to be singular.

There is a tendency for people to avoid the issue completely by talking about items of data, or data items.


----------



## lazarus1907

Properly speaking, it is the plural of datum (a word nobody uses), but it is generally used in singular as a synonym of "information".


----------



## mrcoelho

Thanks panj and lazarus.


----------



## Mr.Blue

(1) known facts or things used as a basis for inference or reckoning.
(2) quantities or characters operated on by a computer.



> (1) In scientific, philosophical, and general use, this word is usually considered to denote a number of items and is thus treated as plural with datum as the singular. (2) In computing and allied subjects (and sometimes in general use), it is treated as a mass (or collective) noun and used with words like this, that and much, with singular verbs, _*e.g. useful data has been collected*_. Some people consider use (2) to be incorrect but it is more common than use (1). However, data is not a singular countable noun and cannot be preceded by a, every, each, or neither, or be given a plural form _datas_.


 
I got this excellent explanation from the OED; it's very useful. After reading it, I think you will be able to answer your question.


----------



## panjandrum

lazarus1907 said:

> Properly speaking, it is the plural of datum (a word nobody uses), but it is generally used in singular as *a synonym of "information".*


... not by information and IT specialists, who will explain at length the difference between data and information.


----------



## DaleC

marget said:

> "Data" plural. Therefore, *data are* is correct. It sound*s* odd to us, so we mistakenly use the verb in the singular ~~fairly often~~ 99.9 percent of the time, but the correct usage is plural.


Hardly anybody says "data are", therefore it is *not* correct. 

Hardly anybody in America thinks "data" is a plural. "Data" is a mass noun like "sand" or "water". The opposite view is pedantic clinging to the word's etymology.


----------



## maxiogee

DaleC said:

> Hardly anybody says "data are", therefore it is *not* correct.
> Hardly anybody in America thinks "data" is a plural.



These are very personal judgements. 
Don't tempt me to come to terms with the 'rightness' of what "hardly anybody in America thinks".
Going by what "hardly anyone" does is never a good way to decide on correctness.


----------



## CAMullen

The majority of the users of the term "data" are IT professionals, and to us it is part of our professional jargon. This makes it immune to the rules of grammar, but it doesn't make it "English." The other major users of the term are the scientists. Their use of the word probably comes closer to the more standard English usage, and I would venture to guess they know that the word is plural. (Bear in mind, after all, that it is our profession that is responsible for "verbifying" such "nounifications" as "setup" and "logoff," so that I may be seen setupping my hard drive before logoffing from my computer.)

(But then again, I guess I have "an agenda.")


----------



## DaleC

maxiogee said:

> Going by what "hardly anyone" does is never a good way to decide on correctness.


I am making the point that it *is* usually the correct way in matters of good usage in a language. Most questions of usage are internal to language. Only when a word is rare and/or borrowed from another language or from a technical subvocabulary are we justified in appealing to an "objective" test. (As a hypothetical example, if there is a garment worn only by ethnic Japanese and English speakers use the word only when discussing what Japanese do, then the only correct way for English speakers to use the name of the garment would be the Japanese way.) But as I noted, even the objective criterion of word history is not properly a compelling one in usage disputes -- it does not properly take precedence over mass practice. As you yourself have noted in the last two days (in a different thread), meanings evolve.

The proper use of "chair" is "piece of furniture designed for one person to sit on that has a back". The sole reason this is the proper use of "chair" is that this is how the native speakers use it. 

The use of "data" as a mass noun instead of a count noun is not a youth innovation or a jargon phenomenon. Rather, it is old hat, having established itself in the 1950s or 1960s, if not generations earlier. Therefore, I can be confident my judgement on this one word is not highly personal, but eminently mainstream. 

Even if British Isles people like yourself disagree on "data", the member I was responding to is an American who was making a claim about AE.


----------



## panjandrum

British Isles people do not disagree about data being singular or plural. 
We agree to differ.
Our data protection legislation considers data to be plural.
Our public sector guidance on implementing that legislation considers data to be singular.
The information specialists that I work with consider data to be plural.
The information technology specialists that I work with consider data to be singular.
There is no disagreement, no tension, no conflict. We each respect the other's terminology.
As one who often lives on the boundary between the two, I can switch from one to the other at will. I have not been lynched yet.

I would not dare to declare either "data are" or "data is" to be incorrect.



			
CAMullen said:

> The majority of the users of the term "data" are IT professionals, and to us it is part of our professional jargon.


For goodness' sake, don't forget the information analysts. Ours are plural data through and through - and there are millions of them. Would your equivalents - statisticians and the like - be plural data people?


----------



## DaleC

For those for whom "data" is plural, its singular must be "datum". But the term "datum" is quite rare.

From Panjandrum's exposition, it's clear that only small intellectual cliques make a point of using "data" as a plural. I would remind everybody that the real point of this thread is not the merits of this principle, but how to advise learners of English. It is wrong to tell a learner that some way is "*the* correct way" when it is followed only in highly demarcated fields of academia (Panjandrum has noted how some information technology fields are on the bandwagon and others not), and is *not* adopted by other scientists and technicians, not to mention by the general public.

In Google: 

"how much data is": 162,000 hits; 
"how many data are": 9220 hits; 
"datum bits": 218 hits


----------



## Andy1

Hi everyone,

Which is correct:
a) There is no data.....
or
b) There are no data.....

I think technically b is correct as data is the plural of datum but a sounds better (and I think is the norm now)

What do you think?


Andy1


----------



## europefranc

Andy1 said:

> Hi everyone,
> 
> Which is correct:
> a) There is no data.....
> or
> b) There are no data.....
> 
> I think technically b is correct as data is the plural of datum but a sounds better (and I think is the norm now)
> 
> What do you think?
> 
> Andy1


Here is an interesting link:

http://www.bbc.co.uk/worldservice/learningenglish/radio/specials/1535_questionanswer/page50.shtml

I really hope that helps you.


----------



## Andy1

Cheers! That's cleared it up!


----------



## la reine victoria

Andy1 said:

> Hi everyone,
> 
> Which is correct:
> a) There is no data.....
> or
> b) There are no data.....
> 
> I think technically b is correct as data is the plural of datum but a sounds better (and I think is the norm now)
> 
> What do you think?
> 
> 
> Andy1


 

Hi Andy1,

Google shows equal hits for "is" and "are" - 898,000,000.

It sounds more natural to me to say, "There is no data."  I never use the singular "datum".




LRV


----------



## panjandrum

I've added today's question to the end of the most recent discussion on data is vs data are.

I'm a little surprised at LRV's 898,000,000 Google hits. I get:
about *219,000,000* for *"data is"*
about *90,900,000* for *"data are"*


----------



## panjandrum

I think you have left out the " " around "data is" and "data are".


----------



## la reine victoria

panjandrum said:

> I think you have left out the " " around "data is" and "data are".


 




> Results *1* - *10* of about *211,000,000* for *"data is"*.
> Results *1* - *10* of about *95,100,000* for *"data are"*.


 


So I did.  Thanks Panj.  



LRV


----------



## river

According to one style guide, "_data_ is rarely treated as a singular when it begins a clause and is not preceded by the definite article."

You don't have to be a rocket scientist to know that _data_, like _phenomena_, is plural and that not everyone accepts _data_ as a collective.

So, *a2*: "Data about similar projects *are* useful . . ." *b* could go either way, but in formal contexts, _data_ is plural.


----------



## Hutschi

*the Data are - or - the data is* 
Hi, 

Usually I write: the data are ...

But in a specification, I found: the data is ...

The Shorter Oxford English Dictionary says:

Data is plural or a collective singular.

In which case do you use the plural and in which the singular?

I would be very glad if somebody could help me to clarify the difference.

Could I generally use the plural form?

Best regards
Bernd


----------



## TrentinaNE

I work in a data analysis setting.  We always use the plural verb, e.g., "the data are..."

Elisabetta


----------



## nay92

"The data is" doesn't really make much sense to me, so I would use "the data are". However, I have heard my maths teacher say "the data is" a number of times. I think she is wrong, though!


----------



## caballoschica

Data is the plural of datum, as already discussed.
When you use the plural, as in "The data I recorded are being analyzed now", it means that each individual datum that you collected is being analyzed now.

"My data is related to..." means all your data is related to something in particular. You're talking about all your data at one time. "My data is for chemistry lab." It's collective. You say "the herd is...", not "the herd are...", although the herd is a group.

That's the slight difference.


----------



## Lemminkäinen

A search for 'data' in topic titles in this forum gave me this thread which poses the same question.


----------



## Robbo

Hutschi said:


> *the Data are - or - the data is*
> Hi,
> 
> Usually I write: the data are ...
> 
> But in a specification, I found: the data is ...
> 
> The Shorter Oxford English Dictionary says:
> 
> Data is plural or a collective singular.
> 
> In which case do you use the plural and in which the singular?
> 
> I would be very glad if somebody could help me to clarify the difference.
> 
> Could I generally use the plural form?
> 
> Best regards
> Bernd




"Data" used to be the plural form of "datum" but in modern usage it has become rather detached from its origins.


When we see this word data in familiar contexts, its behaviour does NOT follow the rules associated with plural nouns.  

For example: 

When used as an adjectival modifier, the singular or uncountable (mass) form is normally used:
The ten-man team won the race.
The Prison Service ( not the Prisons Service )
The horse ranch.  ( not the horses ranch )
The EU wine subsidies.
Tobacco taxes.

Similarly:
The Data Protection Act.
She works as a data analyst.
Data validation and data verification are important aspects of any data processing system.

Data, in my opinion, clearly does not behave as a countable noun plural.  You can say "some data" (cf some rice) but not "three data" and if you said "three rices" or "three wines" it would mean three types of rice or three types of wine.

It is useful to think of data as a mass or uncountable noun - like rice.

Nevertheless, "the data is" and "the data are" are both common forms and neither should be regarded as an error.


----------



## panjandrum

Today's question on this topic has been added to the several previous threads on the same topic.
The earlier posts are worth reading, and support Robbo's conclusion:


> Nevertheless, "the data is" and "the data are" are both common forms and neither should be regarded as an error.


Well, the posts I wrote support that.


----------



## TrentinaNE

TrentinaNE said:


> I work in a data analysis setting. We always use the plural verb, e.g., "the data are..."


In the interest of context, I should add that much of our work is for litigation purposes, so it's critical to be clear when the word "data" represents many "pieces of data." You don't want an expert being asked at trial "So, Prof. Fullofyourself, do you mean to say that you based your estimate on one data point?"  

And yes, it would not be unusual for said reports to refer to "one piece of data" as a _datum_.  

Elisabetta


----------



## Setwale_Charm

In principle, "data" is supposed to be a plural form of "datum". However, I often see it used with a singular verb, as in "data is ... not available", etc. I had never thought about it... until now!


----------



## cuchuflete

In current usage it takes either a plural or a singular verb.  

Here is a good usage note:  http://dictionary.reference.com/browse/data


----------



## Setwale_Charm

Thank you, cuchuflete, although I still cannot make up my mind as to which is preferable.


----------



## samarje

Usually scientists prefer to say "data are" - the rest of the population says "data is". 

My biology professor gets really mad at anyone who says "The data is inconclusive", because he is very picky about grammar and insists that "data *are* inconclusive" is the right way - probably because data (like you said) is the plural of datum, a set of multiple pieces of info. However, I have seen in newspapers "data is" and I think it is considered acceptable because they think of all the information as a single *set* of data. So, "the *set of data is* inconclusive", but also, "the *data are* inconclusive". Newspapers just omit the word "set".


----------



## Nezquirc

I think "the data are inconclusive" is wrong. "Data" works as an uncountable. Another example is money.

"This money are false."

This is obviously wrong. Either make it

"These data are inconclusive." or

"The datas are inconclusive."

In my opinion, it's always grammatically incorrect to treat a noun as both countable and uncountable in a sentence. I also think it's wrong to treat data as a countable, but that's maybe just my opinion.


----------



## river

_Garner's Modern American Usage_ notes that "_data_ is rarely treated as a singular when it begins a clause and is not preceded by the definite article."   _Data_ over the last three years _suggest_ that obesity is on the rise.

Garner also notes that _data_ is a "skunked term": whether you write _data is_ or _data are_, some will disapprove.


----------



## Nezquirc

river said:


> _Garner's Modern American Usage_ notes that "_data_ is rarely treated as a singular when it begins a clause and is not preceded by the definite article." _Data_ over the last three years _suggest_ that obesity is on the rise.
> 
> Garner also notes that _data_ is a "skunked term": whether you write _data is_ or _data are_, some will disapprove.


 
Well, this supports my idea of not changing the "countability" of the word within the sentence. However, seeing this example, I must also agree that you can change the countability depending on the situation.


----------



## samarje

Hmm, you make a good point, although I'm not sure I would treat money in the same way as data because people don't ever say "the moneys" unless they are lawyers, i.e. "monies"...

A very reliable college professor (an entomologist, actually) told me that "data are" is correct, but you never know... everyone's wrong once in a while, some more than others...especially those who are not English majors!

I don't think "The datas are inconclusive" would be correct ... it would be like trying to pluralize something that is already plural, so it would be redundant. "These data are inconclusive" sounds much better. That's just my personal preference though...


----------



## Setwale_Charm

But "money" may be an uncountable noun whereas "data" is a *plural* form.


----------



## Nezquirc

Setwale_Charm said:


> But "money" may be an uncountable noun whereas "data" is a *plural* form.


 
Perhaps, but we still treat data as an uncountable. Such as in "a set of data".

Of course you can't say "the moneys". That was my point. Data can be uncountable (a piece of data), but also countable (these data). However, money is always uncountable (except to the accountant... bada-bing!).


----------



## panjandrum

I've added this thread to the end of previous comments on this topic.  It's worth reading from the beginning, but I'll restate what I said in several previous posts.
Data is accepted as either a singular or plural word - _but not both in the same context_.


----------



## elirlandes

panjandrum said:


> I've added this thread to the end of previous comments on this topic.  It's worth reading from the beginning, but I'll restate what I said in several previous posts.
> Data is accepted as either a singular or plural word - _but not both in the same context_.



Strictly speaking, Data is plural and its singular form is datum. 

Data has become acceptable also as an uncountable noun.

"Data is..." and "Data are..." are both generally accepted forms.

I have never come across the use of the word "data" as a singular - i.e. you never hear in English "a data" or "one data". You do however hear "a piece of data" etc.


----------



## mplsray

elirlandes said:


> Strictly speaking, Data is plural and its singular form is datum.
> 
> Data has become acceptable also as an uncountable noun.
> 
> "Data is..." and "Data are..." are both generally accepted forms.
> 
> I have never come across the use of the word "data" as a singular - i.e. you never hear in English "a data" or "one data". You do however hear "a piece of data" etc.



The Merriam-Webster Online Dictionary labels _data_ as "noun plural but singular or plural in construction," but there's another way of looking at it. Under the entry "singular" in The Oxford Companion to the English Language, Tom McArthur writes: 

"In English, the term is often used to include uncountable noun usages like _love_ and _wine_ because they take singular verb concord, even though in other ways such nouns are different from singular countable nouns like _horse_ and _stone_."

That is how I see it: _Data_ is both singular and plural.


----------



## TrentinaNE

Or in common usage, the word data can be either collective (the data is) or plural (the data are).

When referring literally to one data point (granted, a rare occurrence), I think it would be misleading to speak of "the data," even with a singular verb form.

Elisabetta


----------



## databyte

I believe data to be a collective noun which takes a singular verb.  I accept the original derivation of datum as singular, and data as a plural of datum, but who uses datum any more?  The common usage of data I believe is singular being a collective noun.  There is a bit of data that is supporting this conclusion.  All collective nouns tend to use the singular sentence structure in common day to day usage.  I understand that there are exceptions in use for technical reasons, but I feel a lot more comfortable using collective nouns with singular verbs.  It sounds very clumsy to say, "The data are good" rather than "The data is good", or "The data have been taken", rather than "The data has been taken". 

The English language has been developed in part by common usage of it.  I think more people use data as a collective noun using singular sentence structure, and feel more comfortable doing so, in my humble opinion. 

Jim . . .


----------



## zMom

Since the mid-60s, in America, in a technical setting, I've always heard--and read--the word data used as a collective noun. The reason is: there is rarely a situation where there is only one bit (no pun intended) of data. If you are working with data, it is most likely a lot of data, else why use a computer to manage it? I've seen it used as plural by scientists who are usually user-level folks who collect lots of data points and, in those cases, they truly think of their data in the plural; makes sense in that setting. And I agree, recalling my high-school Latin, that datum is the singular form; however, who uses Latin these days when so many people are linguistically-challenged by simple English?


----------

