RBI Governor D Subbarao has recently expressed dismay that data on prices and industrial production are simply not reliable enough to form a good basis for policy. The inflation rate is frequently revised upward by as much as 100 basis points. Trends in industrial production data are so erratic as to leave analysts gasping. Governor Subbarao did not mention the National Sample Survey Organisation (NSSO), but its data are also looking unreliable enough to generate false conclusions and mistaken policy responses.
Consider the latest employment data from the 2009-10 round. These showed a fall in labour participation from 42% to 39.2% over five years, implying an increase of only two million jobs in this period. But after initial cries of “jobless growth” it became apparent that the real problem was “workerless growth.” Female willingness to work fell a whopping six percentage points in five years, implying that 35 million women workers had withdrawn from the labour force. All calculations about India’s demographic dividend went for a six.
This created a puzzle. Why on earth should poor people withdraw so massively at a time when wages are rising so sharply, even for casual unskilled workers? Rising enrolment in secondary and higher education can at best explain a small part of this.
Step back and take a longer look, and a new puzzle appears. Labour participation was 42% in the 1993-94 round of the NSSO, and fell to 39.7% in the 1999-2000 round. Then too this led to cries of “jobless growth” and theories on why economic reform had not created jobs. But then the participation rate rose again to 42% in the 2004-05 round, leading to the cheery conclusion that job growth was faster than workforce expansion, so the reforms were a great employment success. But now we once again have falling participation and talk of jobless growth. Deja vu, indeed.
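As a rough check on that arithmetic, the two female-participation figures can be combined to back out the population base they imply. This is only a sketch: it assumes the fall of six is measured in percentage points of a single female population, and the variable names are illustrative rather than NSSO terminology.

```python
# Figures quoted in the text above.
fall_in_female_participation = 0.06   # assumption: six percentage points of the female population
women_withdrawn = 35_000_000          # women said to have left the labour force

# Population base implied if both figures refer to the same group of women.
implied_female_population = women_withdrawn / fall_in_female_participation

print(f"Implied female population: {implied_female_population / 1e6:.0f} million")
```

The implied base is a little under 600 million, which is roughly India's entire female population of the time. That suggests the 35 million figure comes from applying the fall to all women, not only those of working age.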
What’s the true underlying trend? An expert from afar, looking at the data over two decades, will probably conclude that any trends that may indeed exist are getting obscured by data imperfections more than anything else. To put it succinctly, statistical noise may be drowning out genuine trends. Data imperfections may not be the only factors. But they do give a credible explanation of why participation seems to bounce up and down, why we see no sign of a demographic dividend although the numbers in the 15-60 age group have risen disproportionately, and why female participation has dipped so sharply as to be unbelievable.
While it is up to statistical experts to work on this puzzle and provide answers, let me highlight one possibility. This is that the sample size of the NSSO surveys has become too small, and hence is yielding too much noise to make out the underlying trends.
Back in the 1970s, the sample size of a “thick” NSSO survey was 140,000 after taking into account non-responses. Over time this has fallen to around 120,000, mainly because of increasing non-response. Now, increasing non-response itself is a problem. But more important, surely, is the fact that the sample size has fallen even as the population has more than doubled. Statisticians will tell you that the sample size need not go up in proportion to population, but surely it should not fall dramatically, as is the case.
Supporting evidence comes from poverty calculations. Economist Surjit Bhalla, a long-time NSSO critic, was overjoyed when the 2007-08 round showed that poverty had declined by a whopping 10 percentage points in the three years since 2004-05. Poverty reduction amounted to 3.3 percentage points per year, thrice the earlier rate of roughly one point per year. But since then, according to the 2009-10 survey, poverty has shot up by 5 percentage points in two years.
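The size of the swing is easier to see when both moves are annualised. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
# Poverty changes quoted in the text, in percentage points.
decline_points, decline_years = 10, 3   # 2004-05 to 2007-08
rise_points, rise_years = 5, 2          # 2007-08 to 2009-10

annual_decline = decline_points / decline_years
annual_rise = rise_points / rise_years
net_change = rise_points - decline_points      # over the full five years
total_years = decline_years + rise_years

print(f"Decline per year, 2004-05 to 2007-08: {annual_decline:.1f} points")
print(f"Rise per year, 2007-08 to 2009-10:    {annual_rise:.1f} points")
print(f"Net change over five years:           {net_change:+d} points")
print(f"Net change per year:                  {net_change / total_years:+.1f} points")
```

Taken over the full five years, the two surveys net out to a decline of about one point per year, which is the earlier trend rate. The two sharp moves sit on either side of that trend.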
Are such sharp fluctuations in poverty credible? Well, 2009-10 was a drought year, and that may be part of the explanation. But Bhalla points to a more potent factor. In 2007-08 the NSSO estimate of national consumption was 47% of consumption according to the national accounts. But by 2009-10, this proportion fell to 43%, one of the lowest ever anywhere in the world. This affects credibility.
Bhalla’s euphoria in 2007-08 was not echoed widely because it came from a “thin sample” of around 50,000, just 40% of a thick sample. Most experts believe that a sample cut to 40% of full size is unreliable for calculating poverty trends. Yet the thick sample has itself shrunk to roughly 40% of its 1970s size in relation to the population! How reliable can that be?
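The comparison can be checked against the sample sizes quoted earlier. This is a back-of-envelope sketch; the population growth factor of exactly 2 is an assumption, since the text says only that the population has "more than doubled".

```python
thick_1970s = 140_000     # thick sample in the 1970s (from the text)
thick_now = 120_000       # thick sample today (from the text)
thin_sample = 50_000      # 2007-08 thin sample (from the text)
population_growth = 2.0   # assumption: the text says only "more than doubled"

# Thin sample as a fraction of today's thick sample.
thin_vs_thick = thin_sample / thick_now

# Today's thick sample per head of population, relative to the 1970s.
thick_vs_population = (thick_now / thick_1970s) / population_growth

print(f"Thin sample as a share of a thick sample: {thin_vs_thick:.0%}")
print(f"Thick sample per head vs the 1970s:       {thick_vs_population:.0%}")
```

Both ratios come out at a little over 40%. Because the population has more than doubled, the second figure is an upper bound.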
The problem is compounded by lack of expert staff and rising hiring of contract workers with insufficient experience or training. It also appears that NSSO may not be covering newly urbanising areas properly.
The world over, people complain that there are lies, damned lies and statistics. India is no different. But it remains preferable to base our analyses and policies on statistics, no matter how flawed, than on mere personal impressions.
Some statistical issues are more fixable than others. For starters, we should increase the NSSO sample size, maybe double it. This will cost more, but is easily affordable for a fast-growing economy. The cost of making policy in a fog of statistical uncertainty is far greater than the cost of improving a country’s statistical system.
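How much a bigger sample might help can be gauged with the textbook rule that sampling error shrinks with the square root of the sample size. This is a sketch under that assumption only: the rule holds for simple random sampling, the NSSO's actual survey design is more complex, and the helper function name is hypothetical.

```python
import math

def relative_sampling_error(n, baseline_n):
    """Sampling error at size n relative to baseline_n, assuming error scales as 1/sqrt(n)."""
    return math.sqrt(baseline_n / n)

thick = 120_000  # current thick sample, from the text

for label, n in [("thin sample (50,000)", 50_000),
                 ("current thick sample", thick),
                 ("doubled thick sample", 2 * thick)]:
    ratio = relative_sampling_error(n, thick)
    print(f"{label}: {ratio:.2f}x the current sampling error")
```

Under this assumption, doubling the sample trims sampling error by a little under 30%, while the thin sample carries over one and a half times the error of a thick one. A larger sample does nothing for non-sampling problems such as non-response or undertrained staff.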