"We found that test-first students on average wrote more tests and, in turn, students who wrote more tests tended to be more productive. We also observed that the minimum quality increased linearly with the number of programmer tests, independent of the development strategy employed."
"The external validity of the results could be limited since the subjects were students. Runeson  compared freshmen, graduate, and professional developers and concluded that similar improvement trends persisted among the three groups. Replicated experiments by Porter
and Votta  and Höst et al.  suggest that students may provide an adequate model of the professional population."
For a good summary of that paper, go to Phil's post and read it to the end.
Update: Jacob rips the report to shreds.