Saturday, July 15, 2017

Public Service Announcement: The US Labor Market is Still Losing Ground Relative to Trend

We keep hearing how good the labor market is these days. We've created more than 16 million jobs since the financial crisis! Unemployment is the lowest since 2001! Time to raise rates, since the economy is overheating. Of course, this mostly comes from current and former policy makers, all of whom have a stake in telling us that the Obama/Bernanke, or the Trump/Yellen, economy has done quite well. But how does job creation look these days relative to the long-run historical rate of job growth in the US? I plot total nonfarm employment relative to the long-run trend below. It looks a bit worse than I imagined, actually. We are now something like 23% below the long-run trend, but the surprise for me is that even in the past couple of years, as the Fed tightens monetary policy due to an economy that is supposedly overheating, we seem to still be moving further away from the long-run trend.
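For anyone who wants to replicate the exercise, the gap is just actual (log) employment minus a log-linear trend fit on an earlier sample. Here is a minimal sketch with made-up growth rates standing in for the actual payroll data (the 2%/1% rates, the 2008 break, and the starting level are all illustrative assumptions, not BLS numbers):

```python
import numpy as np

# Illustrative series, NOT actual BLS data: log employment growing ~2%/yr
# over 1950-2007, then only ~1%/yr over 2008-2017.
years = np.arange(1950, 2018)
log_emp = np.where(years <= 2007,
                   np.log(45.0) + 0.02 * (years - 1950),
                   np.log(45.0) + 0.02 * (2007 - 1950) + 0.01 * (years - 2007))

# Fit a log-linear trend on the pre-2008 sample only, then extrapolate it.
pre = years <= 2007
slope, intercept = np.polyfit(years[pre], log_emp[pre], 1)
trend = intercept + slope * years

# Percent gap from trend: 100 * (actual level / trend level - 1).
gap = 100 * (np.exp(log_emp - trend) - 1)
print(f"gap in 2017: {gap[-1]:.1f}%")   # about -9.5% with these made-up rates
```

With the real data, the choice of trend window matters, which is why a robustness check with a 1945-1995 window is worth doing.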


Of course, there are caveats here. Population growth did naturally slow a bit, and the absorption of women into the labor force was a one-time event that had mostly played out by the 2000s; 9/11 exogenously reduced immigration, and thus job growth; and the Boomers have been retiring, etc. Certainly, you could also quibble a bit with the trend. Yet, even if you estimate the trend using only 1945 to 1995, we still have not been gaining on this slower trend in recent years. These other events/excuses/caveats are not going to explain the relatively sudden collapse of employment to some 20% (give or take) below trend. And why should exogenous negative shocks to labor supply cause wage growth to slow? I'm afraid I've lost the thread of these other stories.

I have another explanation: maybe the economy is not really that overheated.


Note: you can follow me on twitter @TradeandMoney

Wednesday, July 12, 2017

On the Idiocy of Kevin Warsh: More Evidence for the 'Self-Induced Paralysis' Thesis

I believe it is clear that the main reason the economy has been growing slowly since the financial crisis is overly tight monetary policy. Inflation has been chronically low. The unemployment rate now admittedly looks good, but this is primarily due to workers leaving the labor force. The employment rate has not recovered, as can be seen below. Certainly, things are improving, and things look better if you limit the sample to prime-age adults, but then again, you could argue that the prime-age employment rate might look better than usual due to baby boomer retirement. Wage growth is also slow, pointing to a still-weak labor market nearly 10 years after the recession began. And yet, despite all that, the Fed has taken five consecutive tightening actions, first ending QE and then repeatedly raising interest rates. The result has surely been to help keep inflation below target and GDP growth below its long-run trend.

In particular, look at the above graph in 2009, when the Fed adopted no new stimulus in terms of rates or asset purchases (forward guidance aside) despite headline deflation and, on net, mass job losses, or in 2010, when the Fed raised the discount rate. What on Earth could they have been thinking?

Despite this logic, I suspect that many economists have a deep respect for Ben Bernanke -- I also like and respect him, even if I disagree with him on some things -- and thus wonder how he could have gotten things so wrong. Part of the answer might be that Bernanke, ever a consensus builder, would have liked to do more, but was constrained by other members of the FOMC. Sam Bell provides some evidence for this in a can't-miss article on Kevin Warsh, now apparently a front-runner for the Fed Chair job, who was still worried about inflation pressures even after Lehman Brothers failed in 2008.

First, Bell notes that Warsh is a lawyer by training, who was only appointed to the Fed at age 35 with a light resume after his father-in-law, Ronald Lauder, heir to the Estée Lauder fortune and apparently a confidant of Donald Trump, likely influenced his selection with donations.

Even as the economy was tanking in 2008 and 2009, Bell writes that "Warsh adopted a skeptical and increasingly oppositional posture. He doubted the Fed could do much good without creating much bigger problems."

Much bigger problems? What could be a bigger problem than letting the economy burn in a financial crisis?
"In March 2009 he told his Fed colleagues that he was “quite uncomfortable with the idea of purchasing long-term Treasuries in size” because “if the Fed is perceived to be monetizing debt and serving as a buyer of last resort in the name of lowering risk-free rates, we could end up with higher rates and less credibility as a central bank.”"
The Fed should hold off on more stimulus in the worst recession in 75 years because it might actually end up with higher rates and lose credibility? Why wouldn't the Fed lose credibility if it was perceived as not fighting the recession? Warsh continued to warn about the dangers of both monetary and fiscal stimulus in 2010.

Warsh was also far from the only crazy one at the Fed at that time. In 2011, when I worked as a Staff Economist at the President's Council of Economic Advisers, I had a conversation with Daniel Tarullo, who told me he believed that Jean-Claude Trichet's interest rate hikes -- which are widely seen to have been premature and to have helped ignite the European Debt Crisis -- were justified. These comments suggested to me that Tarullo was somewhere to the right of Genghis Khan on monetary policy. Then there were also worthies like Richard Fisher, Often Wrong but Seldom Boring, who "warned throughout most of 2008 that inflation was the primary danger to the economy".

And that, my friends, is how the Tea Party was born.

The other thing to note about the FOMC is that it's a job most people seem not to want to do for very long. It's a revolving door. Most people will do it for 4-5 years and then quit for greener pastures, as it is not a job that pays that much, particularly by the pornographic standards of finance and banking. Even top university professors can make much more. It's a mix of people who are politically connected, bankers, and academic macroeconomists. And even the latter group can be a mixed bag. Despite this (or, should I say, in part because it is a revolving door), the Obama administration never took its appointments seriously. It left in place an FOMC made up mostly of Republicans, including staunch white male MBA-holding Republicans raised in the South in the 1960s. Obama's economic advisors apparently did not see this as potentially problematic.

And then we had Bernanke, who apparently still holds the view that economic growth in the US is more-or-less OK. In 2011, I also had a conversation with Ben Bernanke. I saw as soon as I began talking to him that he figured I would criticize him for QE, or for inciting hyperinflation with all this money printing. He was actually surprised when I asked him why he wasn't doing more, given that core inflation at the time was running around 1.4%. His response was that higher inflation wasn't costless. But I didn't see how inflation of 2% vs. 1.4% would be as costly as millions of people out of work. It seems few people at the Fed were trying to influence him in the direction of doing more.

What all of this evidence does is make the thesis of "Self-Induced Paralysis" -- that the major problem with the US economy is overly tight monetary policy -- more plausible. You had the competent but cautious Bernanke, who likely wanted more stimulus but was surrounded by a group of idiots concerned about inflation in 2008. And even Bernanke himself clearly seems to be in a state of denial about US growth prospects. The reality is that the people who have controlled monetary policy since 2009 are a mix of those who believed hyperinflation was just around the corner, those who believed monetary stimulus in a severe recession would do more harm than good, and, on the dovish side, a Chairman who hasn't noticed that the US economy has, since 2008, consistently been growing slower than it used to.

In any case, let's return to Kevin Warsh for a minute. How bad would he be as Fed Chair? Likely a disaster. Certainly a disaster on regulation, and likely also a disaster on monetary policy. The only catch here is that he would be a perfect Fed Chair for Trump, as he'd be a yes-man Trump can control 100%. Although Trump sounded hawkish on monetary policy on the campaign trail, I always imagined he would eventually change his tune as President -- particularly once re-election looms and he realizes the Fed can deliver faster growth. Thus, Warsh could, in fact, adopt looser monetary policy and pave the way for a second term for Trump. Or, he could be the Kevin Warsh he was during the Financial Crisis, and continue the Yellen tradition of slightly-too-tight policy. I don't think we'll know the answer until it happens, although I would probably put higher odds on the latter.


Note: you can follow me on twitter @TradeandMoney



Sunday, July 9, 2017

Ben Bernanke, in Denial? "When Growth is Not Enough"

"When Growth is Not Enough" is the title of a recent Ben Bernanke speech in Portugal. I found it via the NYT article on the "Robocalypse", which contained this bizarre quote from Ben S. Bernanke: "as recent political developments have brought home, growth is not always enough."

However, as you can see, something terrible has happened to US GDP growth, which is now more than 20% below its long-run trend, even if it has escaped the attention of our former Fed Chair. On twitter, Kocherlakota and I were both hoping he'd been taken out of context. Unfortunately, that turned out not to be the case. 

In his speech, Bernanke is trying to make sense of how his tenure at the Fed was followed by a populist political rebellion. To his credit, early in the essay he does admit that the "recovery was slower than we would have liked", but on the whole, as the title of his speech suggests, he is a glass-is-half-full kind of guy on the economy: "the [Fed] is close to meeting its ... goals of maximum employment and price stability... more than 16 million ... jobs have been created... the latest reading on unemployment, 4.3 percent, is the lowest since 2001." He then asks: "So why, despite these positives, are Americans so dissatisfied?" He lists four reasons:
  1. Slow median income growth, especially for male workers. Hourly wages for males have declined since 1979.
  2. Declining rates of intergenerational mobility.
  3. Social dysfunction in economically marginalized groups (see Case-Deaton on mortality increases for working-class Americans).
  4. Political alienation.
What were the causes of these? Bernanke pushes the Gordon thesis that wartime technologies led to the boom in the early post-war period. He notes that productivity growth has been slow over the past 10 years. He correctly notes that there was a China shock (which is good; it would be nice if he also mentioned exchange rates), and also argues that globalization has led to the rise in inequality. In terms of policy, he argues that more could have been done to secure the safety net and help the downtrodden.

There is much to like in the essay, and I'm not opposed to his policy prescriptions. I also agree that inequality could be part of the problem. But there were several things that struck me.

First, Bernanke doesn't buy the Reagan/Thatcher revolution as the cause of the growth of inequality in the US and UK. He seems to think some combination of globalization and skill-biased technical change is the cause. At least he is in good company -- Krugman, Avent, DeLong, and David Autor, all people I respect and have learned a lot from, don't seem to buy it either. I have no idea why.

In my own research with Lester Lusher (see here and here), we concluded that trade almost certainly was not a major cause of the rise of inequality in the US. The aggregate timing just wasn't quite right, inequality increased just as much in sectors not directly affected by trade, and other countries that trade a lot (Germany, Sweden, Japan) did not see anything like the increase in inequality in the US or UK. And when inequality finally did increase in these countries, it followed cuts in top marginal tax rates just like it did in the US and UK.

Second, reading between the lines, Bernanke seems to have caved a bit in his debate with Summers on the source of Secular Stagnation. Now he seems to be closer to the view that there was some autonomous decline in technological growth. I'm very skeptical of this view, although I'll concede it's hard to prove either way.

The big one, of course, is that Bernanke does appear to be in a bit of denial that GDP growth really has slowed. He credits the Fed for price stability, without noting that the Fed has undershot its own stated inflation target for nearly a decade now. He also doesn't mention how or why both he and the ECB raised interest rates in 2010 (no, that isn't a typo). Why shouldn't tight money in a recession lead to slow growth? Of course, it would be nearly impossible for anyone to look at such a horrible thing as the election of Donald Trump, which was likely caused in part by a weak economy (the economy always matters for elections), and admit that one's own policies were at fault.

Unfortunately, in recent months, the US has gotten more bad news on the GDP front. Is the problem that we've already invented everything worth inventing, and growth will just naturally slow, as Robert Gordon suggests? Or is the Robocalypse upon us, as some would have us believe? Or is it that the Fed ended QE prematurely and then raised interest rates four times in a row despite inflation at 1.5%? 

I'm going to go with the last one. After all, if GDP growth and inflation are both below target, and the Fed tightens monetary policy, tell me what is supposed to happen?

Thursday, June 29, 2017

How to Cure a Cancer: Thoughts on Improving Academic Journals

So, I currently have a paper that has been under review since last December, coming up on 7 months. A friend of mine recently waited something like 16-17 months at the JME. This is ridiculous. It doesn't seem like it would be that difficult to design a system in which this doesn't happen, or happens only very rarely. It's led me to think about all the ways the academic economics publishing system could be improved -- particularly since economists study incentives, you'd think we could design a good system. All economists know and believe there are obvious problems in the academic publishing system, and yet, somehow, the profession soldiers on like a drunken sailor. Thus, here are a few proposals, most of which are obvious and probably not new:

1. A recent positive trend is that many journals pay referees for finishing a report within a certain number of days. I approve. However, my paper is currently at such a journal, which has a limit of 4-6 weeks (range given to protect the anonymity of the journal). After that, a referee gets nothing. What I wonder, then, is this: once a referee misses the deadline, they no longer have any incentive to write the report sooner rather than later. I also wonder if this doesn't make them more likely to delay. If the deadline was 30 days, and you wake up on day 31 and realize you've missed a payday, it seems you might be less likely to submit the report on day 31, feeling like a chump. A phase-out of the payment over several months would likely be more effective. Here's another idea, while I'm throwing them out: referees could be given the option of a larger payoff for a short report turnaround -- say, two weeks -- while also agreeing to pay a fee if they don't finish the report within two months, with the fee increasing incrementally each month. Referees would need to provide their credit card information in advance. The fee for backing out could be set to be even worse, at each date, than submitting a report, to keep the scheme incentive compatible.

2. Another problem is that much research is funded by the government, or by universities paid for by tuition dollars, and yet academic research is not treated as a public good. Academic journals make bank. First, academics write papers and submit them to journals, and not only do they do this free of charge, they actually pay for the right to have the fruits of their free labor published, for the benefit of the journal owners. We must do this, of course, as our academic careers depend on publishing in fancy journals. Our careers are also helped by prestigious editing positions at journals, which we are happy to do for free or for low wages. Referees, too, typically work for free or at below-market wages. Next, the journals charge lots of money to the same universities which pay their professors for access to the same research the universities pay professors to produce. It's a great business to be in, if you can get it. The issue is that the most prestigious journals have near monopolies -- one Top 5 publication is seen as being worth 4-5 publications in Top Field journals (at least for your first one) -- a crazy ratio, but the profession is obsessed with rank and status. The solution here is some form of collective action. For example, the government could tax academic journal profits and use the proceeds to help fund education, or it could regulate the fees that journals charge libraries or the public for access to the research. If the Ivy League or the UC system were to spell out that only research published in open-access journals, or journals meeting some other criterion, would count toward tenure, this could be a start. The problems and solutions here are so obvious I'm hesitant to write them down.
A dream of mine is also to devise a journal ranking system which penalizes journals for bad behavior (e.g., for high fees to libraries, or high submission fees), so that departments could change the ranking system which they use to award tenure and promotion decisions, in order to better incentivize journals. 

3. A third problem is that referees, and even journal editors, have limited incentives to do a good job, much less a timely one. One thing I propose here is for journals to ask authors for feedback on how the refereeing and editorial process went. Clearly, when someone gets a negative referee report, they will give the referee a bad rating, but overall referee scores could be computed controlling for the recommendation of the report. Partly, I think this would be good for purely therapeutic reasons, even if the journals didn't use it for anything. When you're pissed off after getting yet another clueless report, you now have an outlet -- you can kill them in your referee eval! However, the journals could also band together to create a referee ranking. Editors and other referees could rate referees as well. There are certainly cases where referees say crazy things, and this is acknowledged not only by the authors, but by the editor and other referees. And yet the said referee pays no penalty at all. On the contrary, they may receive the benefit of not being asked to referee again for some time. If a system were designed to give feedback to departments -- "your professor X has a history of providing unprofessional reports", or simply "here is the rank of your APs in terms of their prowess at writing referee reports" -- this could change incentives for the better.
4. Another problem is that journals have no incentive to publish comment papers, which tend to reflect poorly on the journal and also tend to be poorly cited. Thus, in the ranking system I proposed above, I would penalize journals that don't accept comment papers on work published in their own pages.
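To make the incentive logic in point 1 concrete, the phase-out could be sketched as a simple declining payment schedule. Every dollar amount and cutoff below is hypothetical; the only property that matters is monotonicity -- finishing earlier always pays at least as much, so there is never a day on which delaying further looks attractive:

```python
def referee_payment(days_elapsed: int) -> int:
    """Illustrative payment schedule for a referee report (all numbers
    hypothetical). Full payment on time, a gradual phase-out instead of
    a cliff, then an escalating fee; negative values are fees charged
    to the referee."""
    if days_elapsed <= 30:                 # on time: full payment
        return 300
    if days_elapsed <= 60:                 # phase-out: lose $5 per late day
        return 300 - 5 * (days_elapsed - 30)
    if days_elapsed <= 90:                 # grace period: no payment, no fee
        return 0
    # Past 90 days: a fee of $50 per additional (partial) month.
    months_over = (days_elapsed - 90 + 29) // 30
    return -50 * months_over

# The schedule never rewards waiting: payment is non-increasing in time,
# so submitting today is always at least as good as submitting tomorrow.
payments = [referee_payment(d) for d in range(1, 366)]
assert all(a >= b for a, b in zip(payments, payments[1:]))
```

The cliff at day 30 in the current system fails exactly this monotonicity test in spirit: once the bonus is gone, day 31 and day 90 pay the same.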

Tuesday, May 9, 2017

Is US Manufacturing Really Great Again?

One chart, more than any other, has influenced people's beliefs about the US manufacturing sector, as can be seen in this debate here, and here (the latter a link to the Autor/Harrison/DeLong/Krugman debate, all of whom I'm an enormous fan of). This graph (below) apparently exposes the lie that trade has played a significant role in hollowing out manufacturing jobs. It shows that manufacturing employment as a share of total non-farm employment has been declining on a fairly steady trend. In fact, in recent years, it has even outperformed this long-run trend. Thus, it seems we can forget the idea that US manufacturing is in any kind of a slump, judging by this. On the contrary, since we are now above trend, it seems that manufacturing is great again!

However, there are several problems with this thesis. The first is that, if we extend this trend far enough into the future, it implies that manufacturing employment should soon comprise a negative share of total employment. Obviously, that can't happen. It would be more natural to assume that the decline will flatten out over time, heading toward zero at an ever-slower rate. Additionally, in terms of output growth, as Noah Smith notes, the period from 2008 hardly looks like any sort of renaissance. And if rising above the trend after 2008 in terms of employment share is not evidence of good performance, then staying on the trend can hardly be conclusive evidence that everything was OK before it.
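The extrapolation point is simple arithmetic. With rough, made-up endpoints for the manufacturing share (not actual BLS figures), a linear trend crosses zero within a few decades:

```python
import numpy as np

# Hypothetical, only roughly realistic endpoints for manufacturing's share
# of nonfarm employment (NOT actual BLS figures): ~30% in 1950, ~8.5% now.
years = np.array([1950.0, 2017.0])
share = np.array([0.30, 0.085])

slope = (share[1] - share[0]) / (years[1] - years[0])
intercept = share[0] - slope * years[0]

# A linear trend must cross zero at -intercept / slope.
zero_year = -intercept / slope
print(f"linear trend hits zero around {zero_year:.0f}")   # ~2043 with these numbers
```

So anyone taking the linear trend seriously as a benchmark is implicitly predicting zero manufacturing employment within a generation, which is exactly why the trend has to flatten.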

An additional problem: imagine for a second that you have a city -- let's call it Detroit -- that loses 80% of its tradable-sector employment. What could theoretically happen is that the city might soon lose 80% of its non-tradable-sector employment as well. One might then wrongly infer from its unchanging tradable-sector employment share that the decline in tradables was not the cause of the overall decline in jobs. (And, yes, as I pointed out in my job-market paper, other tradable sectors besides manufacturing do seem to have been hit in the early 2000s.)

One example that those who imagine themselves to be sophisticated often use is the decline of agricultural employment. Obviously, one major reason for this is dramatic productivity growth in agriculture over time. We're told, then, that the linear decline of manufacturing as a share of total employment is just like agriculture. However, agricultural employment did not continue a linear decline all the way into negative territory. That would be impossible, after all. What happened is that in recent decades the decline in its share of total employment has slowed tremendously. Thus, the question could be why this didn't happen earlier for manufacturing -- why manufacturing was not like agriculture.

(Note: I believe the steep dropoff in agricultural employment around 2000 was related to a change in the classification system.)

However, I do think it is true that while trade is a major driver of the decline in manufacturing employment (see my own research here and here), and was the dominant factor up until the Great Recession, slow overall GDP growth is now likely the dominant factor holding back manufacturing. Thus, those who care about the US economy should be more focused on monetary policy than trade. Incidentally, a focus on monetary policy would also weaken the dollar and help solve the trade problem as well.

In any case, let me plot real manufacturing output relative to trend to make the case that not all is well in this sector. (Frequent readers of this blog or my twitter feed @TradeandMoney are no doubt already familiar.)


And, here is Real GDP relative to the long-run trend.

Lastly, haven't we heard that manufacturing is declining "everywhere" as a share of GDP? Let's do an international comparison of employment, exports, and value-added in that case (see below). Indeed, one sees problems here as well.

The idea that everything is OK in US manufacturing is a cockroach idea that, no matter how much it is at odds with basic facts, can't seem to die. In a future post, I'll write a bit more about my thesis on what went wrong. Spoiler alert: it has to do with real exchange rates. If true, then a massive tax cut from Trump would be the worst thing to happen to the US manufacturing sector since, well, the last two massive tax cuts.

Thursday, April 27, 2017

Reuven Glick Responds! The Currency Unions and Trade Debate Rages On...

Previously, I reviewed the Currency Unions and Trade Literature here, and then shared some of my students' referee reports of a recent Glick and Rose (2016) paper here. My undergraduates were quite skeptical.

In fact, I first wrote a paper on currency unions and trade as my class paper when I was a grad student in Alan Taylor's excellent course in Open-Economy Macro/History. A doubling of trade due to a currency union looked like a classic case of "endogeneity" to me. Currency unions are like marriages: they don't form or break apart for no reason. They are as non-random a treatment as you'll find. In addition, we know from Klein and Shambaugh's excellent work that direct pegs seem to have a much smaller impact on trade, and that indirect pegs -- which probably are formed close to randomly -- have no effect at all. Thus we know that the effect of currency unions doesn't operate via exchange rate volatility, the most plausible channel. Also, the magnitude of the effects bandied about is much too large to be believed. Consider that Doug Irwin finds that the Smoot-Hawley tariff reduced trade by 4-8%. How plausible is it that currency unions have an impact at least 12 times larger? Or the Euro six times larger? It isn't. Even a 5% increase in trade implies an increase of tens of billions of dollars for the EU -- probably still too large. My intuition is that something on the order of .05-.5% would be plausible, but that is too small to estimate.

Since I was skeptical, I fired up Stata. It then took me approximately 30 minutes to make the magical -- yes, magical -- effect of currency unions on trade disappear, at least for one major sub-sample -- former UK colonies. It took only a bit more time to notice that the results were also driven by wars and missing data, but it was still a quick paper.

In any case, to his credit, Reuven Glick responds, making many thoughtful points. Here he is:

My co-author, Andy Rose, already responded to the comments on your blogsite about our recent EER paper. I'd like to offer my additional one-shot responses to your comments of March 24, 2017 and your follow-up on April 24, 2017.

•As Andy noted, we find significant effects for many individual currency unions (CUs), not just for those involving the British pound and former UK colonies. Yes, the magnitude of the trade effects varies across these unions, but that’s not surprising. So the “magical” currency union effect doesn’t disappear, even when attributing the UK-related effects to something other than the dissolution of pound unions. 

There is a pattern in this literature. You guys find a large trade impact, and then somebody points out that it isn't robust -- in short, that the effect does disappear. For example, Bomberger (no paper online, but Rose helpfully posted his referee report here) showed that your earlier results go away for the colonial sample with the inclusion of colony-specific time trends, while Berger and Nitsch showed that a simple time trend kills the Euro impact on trade. Each time, you guys then come back with a larger data set and show that the impact is robust. However, the lessons from the literature curiously do not get internalized; the controls which reversed your previous results are forgotten.

Since you mention the UK, let's have a look at a simple plot of trade between the UK and its former colonies vs. the UK and countries with which it had currency unions. What you see below are the dummies from a gravity regression for the UK with its former colonies vs. countries that ever had a UK currency union. Note that the UK had something like 60+ colonies, while there are only 20-something countries which had currency unions with it, only one of them a non-colony. What it shows is that the evolution of trade between the UK and its former colonies is quite similar to the evolution of trade between the UK and countries which shared a CU with it. The blue bars (axis at left) show the dates of UK currency union dissolutions, mostly during the Sterling Crisis of the 1960s. What one sees is a gradual decay of trade both for UK colonies and for countries ever in a currency union.


So, yes, including a time trend control does in fact make the "magical" result go away. In my earlier paper, I got an impact of 108% for UK Currency Unions with no time trend control, but -3.8% (not statistically significant) when I include a simple UK colony*year trend control. That's a fairly stark difference from the introduction of a mild control. And, in your earlier paper (Glick and Rose, 2002), the UK colonial sample comprises one-fifth of the CU changes with time series variation in the data. In addition, the disappearance of the CU effect for UK CUs raises the question of how robust it is for other subsamples, where CU switches are also not exogenous -- think India-Pakistan. It turns out that almost none of it is robust. Yet, you do nothing to control for the impact of decolonization in your most recent work, nor do you ask what might be driving your results in other settings.
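To see mechanically how a simple trend control can erase an apparently large CU effect, here is a minimal simulation (entirely synthetic data, loosely standing in for the gravity setup). Trade decays steadily for all colonial pairs, with no causal CU effect whatsoever; the unions dissolve in the late 1960s; a naive regression then attributes the early, high-trade years to the CU dummy, and a time trend control makes the effect vanish:

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1950, 1996)
rows = []
for i in range(40):                      # 40 synthetic UK-colony pairs
    ever_cu = i < 20                     # half ever shared a currency union
    for t in years:
        cu = 1.0 if (ever_cu and t < 1968) else 0.0   # CUs dissolve in 1968
        # True model: trade decays with time for ALL colonial pairs,
        # with NO causal CU effect at all.
        log_trade = 10.0 - 0.03 * (t - 1950) + rng.normal(0, 0.1)
        rows.append((log_trade, cu, float(t - 1950)))

y, cu, trend = (np.array(col) for col in zip(*rows))
ones = np.ones_like(y)

# Naive gravity-style regression: log trade on a CU dummy and a constant.
b_naive = np.linalg.lstsq(np.column_stack([ones, cu]), y, rcond=None)[0]

# Same regression with a simple time trend added as a control.
b_ctrl = np.linalg.lstsq(np.column_stack([ones, cu, trend]), y, rcond=None)[0]

print(f"naive CU coefficient: {b_naive[1]:.2f}")   # spuriously large
print(f"with trend control:   {b_ctrl[1]:.2f}")    # essentially zero
```

With real data the control would be a colony-specific (or pair-specific) trend rather than a common one, but the mechanics are the same: if post-colonial decay and CU dissolution coincide, the CU dummy soaks up the decay.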

The thing is, it isn't just that you didn't see my paper (which I emailed to you). Your coauthor clearly refereed Bomberger's paper, which looks to me like a case of academic gatekeeping. I love academia, dammit! Bomberger deserved better than what he got. I want the profession to produce results people can believe in. With all due respect, you guys are repeat offenders at this stage. If this were your first offense, I would not have been so aggressive. And yes, I'll confess to having been put off by the fact that you thanked me, even though I did not comment on the substance of your paper, but did not cite me.


•We tried to control for as many things as possible by including the usual variables, such as former colonial status, as well as appropriate fixed effects, such as country-year and pair effects. Yes, one can always think of something that was left out. For example, if you’re concerned about the effects of wars and conflicts, check out my paper with Alan Taylor (“Collateral Damage: Trade Disruption And The Economic Impact Of War,” RESTAT 2010), where we find that currency unions still had sizable trade effects, even after controlling for wars (see Table 2). Granted, the data set only goes up to 1997 in that paper, and we don’t address the effects of the end of the Cold War, the dissolution of the USSR, and the Balkan wars. That might make an interesting project for one of your graduate students to pursue.

OK, sure. I would like to believe that figuring out what was driving the CU effect on trade was incredibly difficult, and that I was able to solve the puzzle only via genius, but the reality is that others had prescient critiques as well: Bomberger, others who mentioned decolonization as a factor or reversed earlier versions of the CU effect, and let's not forget my undergraduates. I don't get the sense that you guys really searched very hard for alternative explanations (India and Pakistan hate each other!) for the enormous impacts you were getting. It looks to me more like you included standard gravity controls and then weren't overly curious about what was driving your results. As my earlier paper notes, it's actually quite difficult to find individual examples of currency unions in which trade fell or increased only right after dissolution or formation, without a rather obvious third factor driving the results (decolonization, communist takeover, the EU). Even in your recent paper, you find strong "pre-treatment trends" -- trade declines long before a CU dissolution. In modern applied micro, the existence of strong pre-treatment trends is evidence that the treatment is not random. It should have been a red flag.


•I agree that it is important to disentangle the effects of EMU from those of other forms of European trade integration, such as EU membership. In our EER paper we included an aggregated measure that captures the average effect of all regional trade agreements (RTAs). Of course, aggregating all such arrangements together does not allow for possible heterogeneity across different RTAs. To see if this matters, see a recent paper of mine (“Currency Unions and Regional Trade Agreements: EMU and EU Effects on Trade,” Comparative Economic Studies, 2017) where I separate out the effects of the EU and other RTAs so as to explicitly analyze how membership in the EU affects the trade effects of EMU. I also look at whether there are differences in the effects between the older and newer members of the EU and EMU, something that should be of interest to your students interested in East European transition economies. I find that the EMU and EU each significantly boosted exports, and allowing separate EU effects doesn't "kill" the EMU effect. Most importantly, even after controlling for the EU effects, EMU expanded European trade by 40% for the original members. The newer members have experienced even higher trade as a result of joining the EU, but more time and a longer data set are necessary to see the effects of their joining EMU.


OK, but the 40% increase you find in your 2017 paper is already 20% smaller than the 50% you guys argue for in your 2016 paper. At a minimum, the result seems sensitive to specification. I'm not convinced that a single 0/1 RTA dummy for all free trade agreements is remotely enough to control for the entire history of European integration, from the Coal & Steel Community to the ERM and the EU. Also, 0/1 dummies imply that the impact happens fully by year 1, and ignore dynamics -- part of my earlier critique. If two countries go from autarky to free trade, the adjustment should take more than just one year. On a quick look at your 2017 paper, I'd say: I'd like to see you control for an Ever-EU*year interactive fixed effect. If you do that, you'll kill the EMU dummy much like the EMU did to the Greek economy. I'd also like to see you plot the treatment effects over time, as I did in my previous post. (Here it is again:)
[Figure: estimated treatment effects of the EMU on trade over time, for the Euro area vs. the EU and Western Europe]

Once again, I don't know how you can look at the graph above and cling to a 40 or 50% impact of the EMU. The pre-EMU increase in trade mostly happens by 1992. This increase happens for all EU countries, and indeed all Western European countries. The countries that eventually joined the Euro even have trade increasing faster than the EU or all of Western Europe before the EMU exists. If you ignore this pre-trend, the difference between the Euro and the EU/Europe lines in the graph above amounts to several percent at most by 2013, and even that difference won't be statistically significant.
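The Ever-EU*year suggestion can be illustrated with a toy simulation (all parameters hypothetical): generate log trade in which all eventual-EU pairs share one common integration trend and the Euro has no effect at all. A standard two-way (pair and year) within estimator attributes the common trend to the EMU dummy; absorbing Ever-EU-group-by-year effects makes the spurious estimate collapse.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pairs, n_years, euro_year = 300, 25, 15

ever_eu = np.arange(n_pairs) < 150          # pairs of eventual EU members
emu = np.zeros((n_pairs, n_years), dtype=bool)
emu[:100, euro_year:] = True                # a subset of EU pairs adopts the Euro

# Data-generating process: ALL EU pairs share one integration trend, and
# the Euro has NO separate effect. Any estimated "EMU effect" is spurious.
years = np.arange(n_years)
y = 0.04 * years * ever_eu[:, None] + rng.normal(0, 0.1, (n_pairs, n_years))
x = emu.astype(float)

def demean2(z):
    """Two-way (pair and year) demeaning for a balanced panel."""
    return (z - z.mean(axis=1, keepdims=True)
              - z.mean(axis=0, keepdims=True) + z.mean())

beta_naive = (demean2(x) * demean2(y)).sum() / (demean2(x) ** 2).sum()

def cell_demean(z):
    """Absorb Ever-EU-group x year fixed effects (demean within each cell)."""
    out = z.astype(float).copy()
    for g in (ever_eu, ~ever_eu):
        out[g] = z[g] - z[g].mean(axis=0, keepdims=True)
    return out

beta_fe = (cell_demean(x) * cell_demean(y)).sum() / (cell_demean(x) ** 2).sum()

# The naive EMU dummy picks up the shared EU trend; with Ever-EU x year
# effects absorbed, the spurious EMU coefficient collapses toward zero.
print(round(beta_naive, 2), round(beta_fe, 2))
```

The point is not that the true effect is zero, but that a specification which cannot distinguish "Euro" from "eventual-EU pairs trending together" will manufacture an EMU effect out of the broader integration process.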


•Andy and I agree that the endogeneity of CUs could be a concern, and suggested that employing matching estimation might be one way to approach the issue. Perhaps this would be another good assignment for one of your graduate students who is interested in doing more than “seek and destroy.”

I agree with this. The point shouldn't be to destroy, but to provide the best estimate possible. I would be more than happy to join forces with you guys and one of my graduate students in writing a proper mea culpa, to nip this literature in the bud. It would certainly take a lot of intellectual integrity to do this, and you should be commended if you choose to. (Although, I'd note that your results have already been reversed.) Your new paper has a larger data set, but is subject to the same concerns already raised in the literature.

•Lastly, our recent EER paper has over 40 references; sorry we didn’t include you. Please note that your citation date for our paper should be corrected; the paper was published in 2016, not 2017.

OK. You also should have added a citation to Bomberger's unpublished manuscript, which killed your results on a key subsample.

One question -- now that you guys have my earlier paper in hand, and know that a time trend kills the UK CU effect, for example, and know that missing data and wars are driving the other result, and now that you've seen my figure above showing that the EMU increased trade by at most a couple percent, do you still believe that currency unions double trade, on average? Or, which part of my earlier paper did you not find convincing? And which parts were convincing?

Monday, April 24, 2017

How Bad is Peer Review? Evidence From Undergraduate Referee Reports on the Currency Unions and Trade Lit

In a recent paper, Glick and Rose (2016) suggest that the Euro led to a staggering 50% increase in trade. To me, this sounded a bit dubious, particularly given my own participation in the previous currency unions and trade literature (which I wrote up here; my own research on this subject is here). This literature includes papers by Robert Barro that imply that currency unions increase trade on a magical tenfold basis, and a QJE paper which suggests that currency unions even increase growth. In my own eyes, the Euro has been a significant source of economic weakness for many European countries in need of more stimulative policies. (Aside from the difficulty of choosing one monetary policy for all, it also appears that MP has been too tight even for some of the titans of Northern Europe, including Germany. But that's a separate issue...)

Given my skepticism, I gave my sharp undergraduates at NES a seek-and-destroy mission on the Euro effect on trade. Indeed, my students found that the apparent large impact of the Euro, and of other currency unions, on trade is in fact sensitive to controls for trends, and is likely driven by omitted variables. One pointed out that the Glick and Rose estimation strategy implicitly assumes that the end of the Cold War had no impact on trade between East and West. Several of the Euro countries today, such as the former East Germany, were previously part of the Warsaw Pact. Any increase in trade between Eastern and Western European countries following the end of the Cold War would clearly bias the Glick and Rose (2016) results, which naively compare the entire pre-1999 trade history with trade after the introduction of the Euro. Indeed, Glick and Rose assume that the long history of European integration (including the Exchange Rate Mechanism) culminating in the EU had no effect on trade, but that switching to the Euro from merely fixed exchange rates resulted in a magical 50% increase. Several of my undergraduates pointed out that this effect goes away once one adds a simple time trend control. Others noted that the authors clustered in only one direction, rather than in the two or three directions one might naturally expect. In some cases, multi-way clustering reduced the t-scores substantially, although it didn't seem to be critical. One student reasoned that the preferred regression results from GR (2016) don't really suggest that CUs have a reliable impact on trade at all: the estimates from different CU episodes are wildly different. GR found that some CUs contract trade by 80%, others have no statistically significant effect, some have a large effect, and others have an effect that is simply too large to be believed (50-140%).
Many of my students noted that there is an obvious endogeneity problem at play -- countries don't decide to join or leave currency unions randomly -- and the authors did nothing to alleviate this concern. The currency union breakup between India and Pakistan is but one good example of the non-random nature of CU exits.
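On the multi-way clustering point, a minimal numpy sketch of what the students did (simulated data; the Cameron-Gelbach-Miller combination of three one-way clustered variance matrices): when both the regressor and the errors contain exporter-level and importer-level shocks, clustering only by pair badly understates the standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
n_c, T = 30, 10
pairs = [(e, i) for e in range(n_c) for i in range(n_c) if e != i]
exp_id = np.repeat([e for e, _ in pairs], T)
imp_id = np.repeat([i for _, i in pairs], T)
pair_id = np.repeat(np.arange(len(pairs)), T)
n = len(exp_id)

# Both the regressor and the error contain exporter- and importer-level
# shocks, so errors are correlated across ALL observations sharing a country.
x = rng.normal(size=n_c)[exp_id] + rng.normal(size=n_c)[imp_id] \
    + rng.normal(size=n)
u = rng.normal(size=n_c)[exp_id] + rng.normal(size=n_c)[imp_id] \
    + rng.normal(size=n)
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

def cluster_vcov(X, resid, clusters):
    """One-way cluster-robust ('sandwich') variance matrix."""
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        s = X[clusters == g].T @ resid[clusters == g]
        meat += np.outer(s, s)
    return bread @ meat @ bread

# Two-way formula: V_exporter + V_importer - V_pair (the intersection).
V_pair = cluster_vcov(X, resid, pair_id)
V_two = (cluster_vcov(X, resid, exp_id)
         + cluster_vcov(X, resid, imp_id) - V_pair)
se_pair, se_two = np.sqrt(V_pair[1, 1]), np.sqrt(V_two[1, 1])
print(se_pair, se_two)   # the two-way SE is much larger here
```

In a real application you'd also want small-sample corrections and a check that the combined variance matrix is positive semi-definite, but the sign of the problem is the same: one-way clustering can dramatically overstate precision.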

You'd think that a Ph.D.-holding referee for an academic journal still ranked in the Top 50 (Recursive, Discounted, last 10 years) might at least be able to highlight some of these legitimate issues raised by undergraduates. You might imagine that a paper which makes some of the errors above might not get published, especially if, indeed, star economists face bias in the publication system. You might also imagine that senior economists, tenured at Berkeley or at the Fed, might not make the kinds of mistakes that can be flagged by undergraduates (no matter how bright) in the first place. You'd of course be wrong.

The results reported and the assumptions used to get there are so bad that you get the feeling these guys could have gotten away with writing "Get me off your fucking mailing list" a hundred times to fill up space.

Before ending, I should note that I do support peer review, and also believe that economics research is incredibly useful when done well. But science is also difficult. This example merely highlights that academic economics still has plenty of room for improvement, and that a surprisingly large fraction of published research is probably wrong. I should also add that I don't mean to pick on this particular journal -- if a big name writes a bad article, it is only a question of which journal will accept it. However, this view of the world suggests that comment papers, replications, and robustness checks deserve to be valued more highly in the profession than they are at present. Much of the problem with this literature also stems from an almost willful ignorance of history. Thus, it's also sad to see departments such as MIT scale back their economic history requirements in favor of more math. I don't see this pattern resulting in better outcomes.


Update: Andrew Rose responds in the comments. Good for him! Here I consider each of his points.

Rose wrote: "-Get them to explain how that they could add a time trend to regression (2), which is literally perfectly collinear with the time-varying exporter and importer fixed effects."

Sorry, but a Euro*year interactive trend, or, indeed, any country-pair trend, is not going to be even close to collinear with time-varying importer and exporter fixed effects. The latter control for general country trends, but not for trends in country-pair-specific relationships. To be fair, regressing one trending variable on another with no control for trends is the most common empirical mistake people make when running panel regressions.
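For the skeptical, here is a five-minute rank check on a toy 3-country, 6-year directed panel (purely illustrative): adding a single pair-specific trend column to a design matrix of exporter-year and importer-year dummies raises the matrix rank, so the trend is not collinear with the time-varying fixed effects.

```python
import numpy as np
from itertools import permutations

countries, T = range(3), 6
rows = [(e, i, t) for e, i in permutations(countries, 2) for t in range(T)]

def dummies(key):
    """Build a 0/1 dummy column for each distinct value of key(row)."""
    levels = sorted({key(r) for r in rows})
    return np.array([[key(r) == lev for lev in levels] for r in rows], float)

# Time-varying exporter and importer fixed effects: exporter-year and
# importer-year dummies.
X = np.hstack([dummies(lambda r: (r[0], r[2])),    # exporter x year
               dummies(lambda r: (r[1], r[2]))])   # importer x year

# A linear trend specific to one country pair.
pair_trend = np.array([t if (e, i) == (0, 1) else 0.0 for e, i, t in rows])

rank_fe = np.linalg.matrix_rank(X)
rank_both = np.linalg.matrix_rank(np.column_stack([X, pair_trend]))
print(rank_fe, rank_both)   # the rank rises: the trend is NOT collinear
```

With only two countries the pair trend would indeed be absorbed, but with three or more countries (let alone the 200-odd in a gravity dataset), pair-specific trends carry variation the country-year dummies cannot span.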

-Explain to them how time-varying exporter/importer fixed effects automatically wipe out all phenomena such as the effects of the cold war and the long history of European monetary integration.

Sorry, but that's also not the case. A France year dummy, to be concrete, won't do it. That would pick up trade between France and all other countries, including EU, EMU, and former Warsaw Pact countries. You'd need to put in a France*EU interactive dummy, for example. But such a dummy will kill the EMU effect. Below, I plot the evolution of trade flows over time (dummies in the gravity equation) for (a) all of Western Europe, (b) Western European EU countries, and (c) the original entrants to the Euro Area (plus Greece). What you can see is that, while trade between EMU countries was indeed much higher after the Euro than before (your method), most of the increase had in fact happened by the early 1990s. Relative to 1998, trade even declined a bit by 2013. There's nothing here to justify pushing a 50% increase.
[Figure: gravity-equation year dummies tracing trade over time for (a) all of Western Europe, (b) Western European EU countries, and (c) the original Euro Area entrants plus Greece]

Rose also indicated that my undergraduates should "Read the paper a little more carefully. For instance, consider; and a) the language in the paper about endogeneity b) Table 7 which explicitly makes the point about different currency unions. "


Actually, let's do that. When I search for "endogeneity" in the article, the first hit I get is on page 8, where it is asserted that including country-pair fixed effects controls for endogeneity. Indeed, it does control for time-invariant endogeneity. But if countries, such as India and Pakistan, have changing relations over time (such as before and after partition), this won't help.

The second hit I get is footnote 7: "Our fixed‐effects standard errors are also robust. We do not claim that currency unions are formed exogenously, nor do we attempt to find instrumental variables to handle any potential endogeneity problem. For the same reason we do not consider matching estimation further, particularly given the sui generis nature of EMU." [the bold is mine.]

Actually, a correction here: my undergrads report that the FE standard errors are actually clustered, but this is a minor point. You may not claim that currency unions are formed exogenously, but, as you admit, your regression results do nothing to reduce the endogeneity problem. And this despite the fact that I had already shared my own research with you (ungated version), which showed that your previous results were sensitive to omitting CU switches coterminous with wars, ethnic cleansing episodes, and communist takeovers.

Also, the "For the same reason" above is a bit strange. The preceding sentence doesn't give a reason for not trying to handle the endogeneity problem. So what is the reason? In fact, a matching-type estimator would be advisable here.
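A sketch of what I mean, with made-up data: if pairs that exit CUs differ systematically in pre-period trade levels and trends, a naive treated-vs-control comparison is badly biased, while even crude one-nearest-neighbor matching on the pre-period covariates gets much closer to the (simulated) true effect of -0.1.

```python
import numpy as np

rng = np.random.default_rng(4)
n_treat, n_ctrl = 50, 500

# Hypothetical pre-period covariates for country pairs: log trade level and
# pre-period trade growth. CU exits are NOT random: exiting pairs differ
# systematically on both covariates.
X_t = np.column_stack([rng.normal(0.5, 1.0, n_treat),
                       rng.normal(-0.01, 0.02, n_treat)])
X_c = np.column_stack([rng.normal(0.0, 1.0, n_ctrl),
                       rng.normal(0.00, 0.02, n_ctrl)])

# Post-period log trade depends on the covariates; the true effect of a
# CU exit is -0.1.
y_t = X_t[:, 0] + 10 * X_t[:, 1] + rng.normal(0, 0.1, n_treat) - 0.1
y_c = X_c[:, 0] + 10 * X_c[:, 1] + rng.normal(0, 0.1, n_ctrl)

# Naive comparison vs. one-nearest-neighbor matching on standardized covariates.
naive = y_t.mean() - y_c.mean()
scale = np.vstack([X_t, X_c]).std(axis=0)
dist = np.linalg.norm((X_t[:, None, :] - X_c[None, :, :]) / scale, axis=2)
att = (y_t - y_c[dist.argmin(axis=1)]).mean()
print(round(naive, 2), round(att, 2))
```

Matching is no panacea (it only handles selection on the covariates you match on), but it at least confronts the problem the footnote waves away.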

Lastly, in your discussion of Table 7, I see you note that it implies widely varying treatment effects of CUs on trade. But I like my undergraduates' interpretation of this as casting doubt on the whole exercise. Many of the individual results, including an 80% contraction for some currency unions, are simply not remotely plausible. The widely varying results are almost certainly due to wildly different endogeneity factors affecting each group of currency unions, not to wildly different treatment effects.

Update 2: Reuven Glick points out that their paper was published in 2016, not 2017. I've fixed this above.


Thursday, April 13, 2017

Monday, April 10, 2017

A Quick Theory of the Industrial Revolution (or, at least, an answer to 'Why Europe?')

Following a Twitter debate involving myself, Gabe Mathy, Pseudoerasmus, and Anton Howes on the theory that high wages in England induced labor-saving technologies and led to the Industrial Revolution, I thought I should lay out my own quick theory of why the Industrial Revolution happened in Britain, or at least why it happened in NW Europe. In short, this theory is too speculative to write an academic paper about (plus, it is not popular with economic historians, so it wouldn't get published), and I don't have enough time to write a book. Twitter doesn't provide enough space, so a blog post it will be.

As far as we know, the high-wage economy that persisted in Europe after the Black Death was somewhat of an anomaly in the pre-industrial world. Wages don't seem to have been as high in East Asia or India, while we simply don't know what wages were like in Africa. On the other hand, wages were even higher in many parts of the New World, which had been recently depopulated and where land-labor ratios were high. In any case, one can see why economic agents would try to cut back on high labor costs through new technologies. This theory makes some sense, but I also see a few weaknesses. One is that the "wave of gadgets" that swept over England at this time included inventions like the flush toilet, which was not actually labor-saving, as well as the invention of calculus and of Malthusian economics. Also, many of the big technologies invented were actually quite simple, and so effective that they would have been cost-effective to develop and implement at a wide range of relative prices and wages. A recent example of labor-saving technology being worthwhile even where labor is cheap is robots delivering mail in low-wage China.

Instead, I would focus on the other implication of high wages in a Malthusian world: fewer nutritional deficiencies, and higher human capital. This includes having consumers who can do more than simply buy necessities. High wages alone aren't enough; rather, I would think it's the size of the human-capital-adjusted population with which one is in contact that matters for technology growth. Thus, the Industrial Revolution could not have happened on a remote island with high wages, and would have been much less likely on continents with a North-South axis, a la Jared Diamond. It also means that the rise of inter-continental trade would make overall technological growth faster, as technologies could be shared. This latter part of the story is probably crucial, as the rise of cheap American cotton was sure to be combined with the idea of mechanized textile factory production, especially since the latter had already been invented. (Indeed, one can't even grow cotton in England!)

The above logic could also answer why the IR didn't happen immediately after the Black Death. All I would argue is that if human capital is important for growth, then what we should expect to have seen after the BD in Europe is a relative "Golden Age" with a lot of progress and advance. In fact, the Black Death was the beginning of the end of the Dark Ages in Europe, as the Renaissance, the Age of Exploration, the Protestant Reformation, the Scientific Revolution, the Enlightenment, and the US Declaration of Independence all happened in the centuries thereafter. The printing press was invented in the 1450s in high-wage Germany. Modern banking was invented in high-wage Renaissance Italy. Henry the Navigator had his formative years in the post-BD period of affluence in the 1410s. Brunelleschi demonstrated linear perspective in art in the early 1400s. Newton invented calculus in the 1660s. The locus of technological change shifted dramatically to Europe -- before, it had been a backwater. Europe then began colonizing the rest of the world. And, as it did, stagnant Eurasian agriculture imported a wealth of new agricultural technologies from the New World. However, I think it's better to view the Industrial Revolution not as a singular special event, but as one of a long line of major breakthroughs and accomplishments in the centuries after the Black Death, in which one sphere after another of European society was being transformed. Productivity soared in book-making after the introduction of the printing press, only demand for books was much less elastic than demand for clothes and fashion, and so an Industrial Revolution could not depend on it. But the initial idea of mechanizing cotton textiles was probably not a much larger or more difficult technological or intellectual breakthrough than these other revolutions; it just happened to be much more consequential in economic terms due to the nature of the industry.

The key difference between Europe and the rest of the world was that European cities and people were filthy, and thus had high death rates, which kept living standards high. This theory can also explain "Why not Southern Europe?": for Malthusian reasons, the post-Black Death high wages began to decline in the south after around 1650. Note that this theory is also perfectly consistent with a cultural explanation for Britain's "wave of gadgets". A society in which half the people suffer from protein deficiency is probably not going to be very vibrant culturally. Such a society may also develop institutions (the Inquisition) which place further barriers on development. Conversely, a society enjoying high wages for Malthusian reasons might also be more likely to develop a culture and institutions conducive to economic growth.

In any case, this theory isn't completely new -- I've heard others, such as Brad DeLong, mention it as a possibility, but I haven't seen it explored in any detail. However, it won't be as popular with economists as the idea that growth is all about genetics. This idea was actually the first really big "Aha" moment I had after starting my Ph.D., as it popped right out of Greg Clark's excellent course on the Industrial Revolution, from which many of these ideas come. I eventually decided it was too speculative to write my dissertation on, so I switched to hysteresis in trade, and then to the collapse in US manufacturing employment. Maybe one day, post-tenure, I'll return to growth...

Update: Pseudoerasmus points me to a very nice-looking paper by Kelly, Mokyr, and Ó Gráda with a very similar theory to the one I've laid out here. They focus on England vs. France, and on the IR rather than on everything that happened after the Black Death, and they don't appear to include Crosby-Diamond-type effects, but I still approve.

Thursday, April 6, 2017

The Real Exchange Rates and Trade Literature

Noah Smith asks about the literature on the real exchange rate (RER) and trade.

My own main line of research is on RERs and manufacturing (also here on workers) and on RER measurement; I've also done work on the impact of currency unions on trade, or lack thereof.

The first and third of these papers do include a bit on trade, although in neither case is it the focus. The third paper shows that the class of Weighted-Average Relative (WAR) exchange rates does a better job of predicting US trade balances than do traditional indices. The difference largely reflects that the WAR indices capture the impact of increased trade with the People's Republic of China. (In truth, the WARP index was first discovered by Thomas, Fahle, and Marquez in a very under-rated paper...)



I also recently looked, while prepping for class, for a good paper on RERs and trade. In truth, I couldn't really find good, modern work that takes identification seriously (if you know of some, please add it in the comments). Actually, when I began working on RERs and manufacturing employment, I could see that this literature had already become quite dated. Results were mixed, the best paper predated the era in which standard errors were clustered in one direction, much less two, and each paper implied something different (of the four most recent using US data, one said RERs had only a small impact, a second said the impact depended on the specification, a third said it had none, and a fourth said it was large). The literature seemed to me in need of a modern treatment. However, the first time I presented this paper, I got a lot of pushback that the topic had "already been done". I think part of this stems from a certain gullibility, and over-confidence among certain economists in their profession. If a paper is published in a good journal, then that's the end of the story. It must be true. Even if the paper was from 1991, and the results came from 600 data points and included few controls and barely any robustness checks. In truth, I suspect that probably more than half of published research is straight wrong. In addition, much of what is "right" might offer incomplete evidence, or still has plenty of room for improvement. A perfect paper probably does not exist, and so I see papers that subject already-published research results to new tests as being helpful, and quite worthy of publication in top field journals. (This is almost certainly a minority opinion...)

One thing I probably overlooked is simply that the literature on RERs and trade isn't that great. I'm still contemplating doing a paper on this, although I'm sure I'd get a lot of similar pushback. ("This result is not new.") The big issue in this literature is identification. In theory, the correlation between the RER and trade could go in either direction. That's because something like the Asian Financial Crisis could cripple the industrial (tradable) sector and also lead to a severe currency depreciation. Thus, the raw correlation between trade and the RER could be negative, positive, or zero in the aggregate. I haven't seen anyone really try to solve the identification problem here. (Well, OK, there was an economic history paper that looked at countries which pegged to gold vs. silver, and at what happened when the relative price of gold and silver changed, but it seemed to me to miss that this also had implications for monetary policy, and so wasn't quite exogenous.)

In any case, the end conclusion is that this is very much a literature in need of improvement.


Tuesday, April 4, 2017

Some International Data on Industrial Robots

Via Adam Tooze on Twitter.  The picture quality is poor, but here it is below. The US in fact has fewer robots than Germany and Japan, and yet manufacturing employment has fallen by more.
[Chart: international comparison of industrial robot counts]

Sunday, April 2, 2017

The Gold Standard, the Great Depression, and the Brilliance of NES Students...

Tomorrow's lecture is, in part, covering a homework on the Great Depression in which the students were asked to redo some of a classic Bernanke and James article on the Great Depression. It's a great article, although the empirics -- like all empirics from the early 1990s -- are quite dated. This actually makes it good to assign, in my view, as it makes it relatively easy for students to suggest improvements. The first time I assigned it, I had a student (A.A.) who wasn't quite happy with the limited information contained in a regression of industrial production on gold standard status. So she created this. Aside from showing off the wonders of conditional formatting, it makes two things obvious: (1) the later you abandoned the gold standard, the worse you did; and (2) at first, if anything, the gold dead-enders were doing slightly better (or as well), but the longer they held on, the worse they did. It blew my mind. An absolutely brilliant table for an undergraduate to turn in for a two-page essay. Go back and have a look at the Bernanke/James tables and compare. There's no comparison. I have a feeling she, and many others I've had the good fortune to teach, will go on to do great things.


Thursday, March 30, 2017

Staying Rich without Manufacturing will be Hard

This title was written by Noah Smith in a recent Bloomberg column. He makes a lot of good points with which readers of this blog, focused as it is on manufacturing, are already familiar. Manufacturing output isn't really booming. I would stress that, although trade is a big reason why manufacturing isn't so healthy, the number one thing the government could do to promote manufacturing is adopt looser monetary policy. This would help by growing the economy (manufacturing production and employment are strongly pro-cyclical), and also by weakening the dollar. That's a far better way of improving the trade balance than bilateral trade negotiations or protectionism.

In any case, it's worth a read.

Raise Rates to Raise Inflation

That's what Stephen Williamson is still arguing.

"recent data is consistent with the view that persistently low nominal interest rates do not increase inflation - this just makes inflation low. If a central bank is persistently undershooting its inflation target, the solution - the neo-Fisherian solution - is to raise the nominal interest rate target. Undergraduate IS/LM/Phillips curve analysis may tell you that central banks increase inflation by reducing the nominal interest rate target, but that's inconsistent with the implications of essentially all modern mainstream macroeconomic models, and with recent experience."

Wow. This is fantastic (fantastical?) stuff.

A couple of questions for Neo-Fisherians. Both central banks and market participants believe that higher interest rates slow growth and inflation. If, in fact, higher interest rates raise growth and inflation, then, in a booming economy, why don't interest rate hikes and contractions in the money supply lead to hyperinflation? After all, central banks respond to high inflation by raising rates, which they believe will lower inflation. If inflation instead goes higher, they will respond with even tighter money, and so on, in an ever-escalating spiral.

Also, interest rate cuts have been demonstrated to cause currencies to weaken. We know this from, among other things, announcement effects using high-frequency identification. The evidence that real exchange rate movements impact the tradables sector is very strong, and relies merely on the theory that relative prices matter -- the central teaching of economics. So is all of economics wrong, if prices don't matter?

We also know that market participants and the entire financial sector believe that low interest rates are good for the economy and profits, and that tight money is bad. We know this from market reactions to monetary policy. A series of interest rate hikes would raise borrowing costs considerably, cause the dollar to appreciate (which affects inflation directly), cause the stock market to decline, and generally cause financial conditions to tighten. Thus, once again, to buy the Neo-Fisherian story, you have to believe that prices don't matter. In addition, you have to believe that markets aren't remotely efficient. That is, you have to give up the central tenets of mainstream economics. Since "the market" believes monetary policy has the conventional sign, Stephen Williamson could make a lot of money by starting a hedge fund and betting on the "wrong" sign.

Hey, it's worth pointing out that people can disagree about monetary policy. Williamson (2012) says QE is hyperinflationary, while, on the other hand, Williamson (2013) says QE is deflationary. On the third hand, Williamson (2013; blog post) says money is just plain neutral. What's interesting here is that these are all the same Williamsons.

Thus it's worth peering into his intellectual journey. First, after QE, despite high unemployment and a weak economy, he repeatedly predicted that inflation would rise. When it didn't, he changed his mind, which is what one should do. Only, he couldn't concede that standard Keynesian liquidity-trap analysis was largely correct. That would have been equivalent to surrendering his army to the evil of evils, Paul Krugman. Much easier to venture into the wilderness and conclude, not that inflation had stayed low despite low interest rates because the economy was still depressed and banks were just sitting on newly printed cash, but rather that inflation was low because interest rates were low!

Fortunately, not all of Macro went in this direction, as Larry Christiano, a mainstream economist, discusses the Keynesian Revival due to the Great Recession.



Tuesday, March 28, 2017

Is Academia Biased Against Stars?

David Card and Stefano DellaVigna recently came out with a working paper which says yes.

The (abbreviated) abstract:
We study editorial decision-making using anonymized submission data for four leading
economics journals... We match papers to the publication records of authors at the time of submission and to subsequent Google Scholar citations.... Empirically, we find that referee recommendations are strong predictors of citations, and that editors follow the recommendations quite closely. Holding constant the referees' evaluations, however, papers by highly-published authors get more citations, suggesting that referees impose a higher bar for these authors, or that prolific authors are over-cited. Editors only partially offset the referees' opinions, effectively discounting the citations of more prolific authors in their revise and resubmit decisions by up to 80%. To disentangle the two explanations for this discounting, we conduct a survey of specialists, asking them for their preferred relative citation counts for matched pairs of papers. The responses show no indication that prolific authors are over-cited and thus suggest that referees and editors seek to support less prolific authors.
The interpretation of course hinges on whether you think more famous authors are likely to get cited more conditional on quality. To some extent, this must be true. You can't cite a paper you don't know about, and you're almost certainly more likely to know about a famous author's paper. This is part of the reason I started blogging -- as a marketing tool.

In any case, here's a second recent paper courtesy Andrew Gelman. "Daniele Fanelli and colleagues examined more than 3,000 meta-analyses covering 22 scientific disciplines for multiple commonly discussed bias patterns. Studies reporting large effect sizes were associated with large standard errors and large numbers of citations to the study, and were more likely to be published in peer-reviewed journals than studies reporting small effect sizes... Large effect sizes were ...  associated with having ... authors who had at least one paper retracted." So, studies with large effects are cited more and more likely to be retracted. In general, to get published in a top journal, you almost have to make a bold claim. But of course bold claims are also more likely to be wrong or inflated, and also cited more. 

My prior is that any time "quality" is difficult to judge, it is only natural and rational to look for signals of quality, even blunt ones. I would rather be operated on by a surgeon with a degree from Harvard, drink a pricey bottle of wine if someone else is paying, and eat in a restaurant which is full. I'd think twice before putting my money in a bank I've never heard of, or flying on an unknown airline. These are all perfectly rational biases. The more difficult it is to infer quality, the more people are going to rely on signals. An example of this is linemen and linebackers in the NFL -- since statistics for these players are hard to come by, all-star linemen tend to be those drafted in the first or second round. This is much less so for the skill positions, where performance metrics are easier to come by.

In the case of academia, I believe it can be enormously difficult to judge the relative quality of research. How do you compare papers on different topics using different methodologies? Add to this that top people might referee 50 papers a year. Neither they nor the editors have time to read papers in great detail. And referees frequently disagree with each other. I was recently told by an editor at one of the journals in the Card study that in several years of editing she had never seen a paper in which all the referees unanimously recommended accept. If referees always agreed, unanimity should happen in roughly 5% of cases -- about the rate at which any single referee recommends accept. (Unfortunately, the Card/DellaVigna paper doesn't provide information on the joint distribution of referee responses. This is too bad, because one of the big gaps between R&R decisions and citations they find is that papers in which one referee recommends reject and the other a straight accept are usually not given an R&R, and yet are well-cited. The other thing they don't look at is institutional affiliation, but I digress...) This all speaks to the idea that judging research is extraordinarily difficult.
If so, then citations, referee recommendations, and editorial decisions are likely to rely, at least in part, on relatively blunt signals of quality such as past publication record and institutional affiliation. It would be irrational not to.
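To see why unanimous accepts should be rare even without deep disagreement about quality, here's a quick back-of-envelope sketch of my own (the ~5% per-referee accept rate matches the figure above; the independence assumption is purely illustrative, not something from the Card/DellaVigna paper):

```python
# If each referee independently recommends "accept" with probability p,
# how often should an editor see a unanimous accept among n referees?

def unanimous_accept_prob(p: float, n: int) -> float:
    """P(all n referees recommend accept), assuming independence."""
    return p ** n

p = 0.05  # rough per-referee accept-recommendation rate

for n in (2, 3):
    # Under independence, unanimity requires n coin flips to all come up accept...
    print(f"{n} referees, independent: {unanimous_accept_prob(p, n):.4%}")
    # ...whereas under perfect agreement it would occur with probability p itself.
    print(f"{n} referees, perfect agreement: {p:.2%}")
```

With two independent referees, a unanimous accept shows up only once in every 400 papers, and with three, once in 8,000 -- so an editor going years without seeing one is exactly what low referee correlation predicts.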

So, why did the authors go the other way? I didn't find their use of the survey convincing. I suspect it had to do with personal experiences. The average acceptance rate at the QJE is 4%. That means top people like David Card get good papers (and I believe many of his papers are, in fact, very good) rejected at top journals all the time, despite high citations. He has a keen analytical mind, so it's reasonable to conclude based on personal experience that the rejections despite great citation counts are the result of some kind of bias, and perhaps they are. I once had a paper rejected nine times. Of course, I don't believe it could possibly be the fault of the author. Much easier to believe the world is conspiring against me. I'm joking here, of course, but the very low acceptance rates at top journals, combined with some atrocious accepted papers, probably feed this perception for everyone. Personally, I'd still prefer to be an NBER member and in a top 10 department with a weekly seminar series in my field (all of which would increase citations) and a large research budget, but hey, that's just me.

Update: I was interested in how correlated referee recommendations are, so I followed the reference in the Card paper to Welch (2014). Here is the key table. Conditional on one referee recommending "Must Accept", the second referee has a 7.7% chance of also recommending "Must Accept", vs. a 3.8% unconditional probability. If recommendations were perfectly correlated, that conditional probability would be 100%. Even with one "Must Accept", chances are still better than even that the second recommendation will be reject or neutral. So referee recommendations are much closer to completely random than to being in agreement. The lesson I draw from this is not to read too much into a single rejection, and thus to wait before moving down the journal rankings. In a way, this could bolster the arguments of Card and DellaVigna: if referee reports are so random, how can citations be a worse measure of quality? But I think this ignores the fact that referees will spend more time with a paper than potential citers will. Even if all authors read the papers they cite, how meticulously do they read the papers they don't cite? The difficulty of assessing paper quality once again implies that signals should play an important role, not to mention personal connections.
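To put a rough number on "much closer to random," here's a small sketch using the two figures quoted above from Welch's table (the linear interpolation between independence and perfect correlation is my own crude gauge, not a statistic from the paper):

```python
# Numbers quoted from the Welch (2014) table above:
p_uncond = 0.038   # unconditional P(referee says "Must Accept")
p_cond   = 0.077   # P(second says "Must Accept" | first said "Must Accept")

# Under independence, the conditional probability equals the unconditional
# (3.8%); under perfect correlation it would be 100%. A linear interpolation
# gives a crude "fraction of the way to perfect agreement":
agreement = (p_cond - p_uncond) / (1.0 - p_uncond)
print(f"{agreement:.1%} of the way from independence to perfect agreement")
# -> 4.1% of the way from independence to perfect agreement
```

On this crude scale the observed agreement is about 4% of the distance from pure chance to unanimity, which is what I mean by referee recommendations being nearly random.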