Doing a PhD: Keeping it Simple

Given how many words I’ve already written in this series covering preparing at the outset, engaging with academia, reading and writing, organising yourself, and time, money, and location, it might seem a bit weird to finish by advising you to keep it simple. Doing a PhD is a complex matter, so the following points are about keeping it as simple as possible, rather than making it simple overall. There’s no need for additional complexity in an already complex endeavour:


When you encounter problems, look for simple solutions first

This is partly related to your confidence with the analytical technique that you’re using (see below). If you’re anything like me, then if (when) something goes wrong in an analysis that you’re relatively unfamiliar with, your knee-jerk reaction is to panic. That panic sends me casting around for obscure solutions (the logic being that if it’s an obscure mode of analysis then the solution must be obscure too), when it’d be much better to start with the most basic possible option, e.g. checking the distributions of all the variables (which you should have done already, of course!). Countless hours can be wasted looking for complex solutions and, if you didn’t try the easy things first, you’ll feel like a complete tool when you finally realise how simple it was to solve the problem.
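To make the ‘check the distributions first’ point concrete, here’s a minimal sketch in Python. It isn’t what I actually ran; the variable names, the data, and the `summarise` helper are all made up for illustration, and it assumes missing values are stored as `None`. The idea is simply to look at counts, ranges, and averages for every variable before blaming the fancy model.

```python
import statistics


def summarise(values):
    """Return basic distribution checks for one variable (None marks a missing value)."""
    present = [v for v in values if v is not None]
    summary = {"n": len(present), "missing": len(values) - len(present)}
    if not present:
        # Nothing to summarise: every value is missing
        return summary
    summary.update(
        {
            "min": min(present),
            "max": max(present),
            "mean": statistics.mean(present),
            # Standard deviation needs at least two observations
            "sd": statistics.stdev(present) if len(present) > 1 else 0.0,
        }
    )
    return summary


# Hypothetical data: 999 is a missing-value code that was never recoded
data = {
    "age": [23, 35, 41, 999, 29, None, 52],
    "interest_in_politics": [1, 3, 2, 4, 2, 3, 1],
}

for name, values in data.items():
    print(name, summarise(values))
```

In this made-up example the maximum ‘age’ of 999 jumps straight out of the summary, which is exactly the kind of boring, basic problem (an unrecoded missing-value code) that’s worth ruling out before hunting for anything more exotic.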


Don’t use structural equation modelling (SEM), unless…

…you fulfil the following criteria (this is the most specific, and technical, piece of advice that I give):

  • You’re already confident with advanced statistics;
  • There’s a real benefit to using such a complicated approach;
  • Your models aren’t too complex.

Alas, I didn’t meet any of the above criteria. I finished studying maths (a subject in which I felt chronically underconfident) at GCSE and had barely looked at the subject for ten years. I had no A-levels in maths, advanced maths, or statistics, and knew little about any of those topics. As such, I did the basic quantitative module in my Masters (the advanced quantitative module is reserved for what I call ‘stats whiz kids’, and what others have referred to as ‘statsos’). Thus, having been brought up to speed in a relatively introductory (albeit very well taught) manner, I jumped in at the deep end. Try reading an online SEM ‘help’ board some time; if you’re not au fait with statistics then you may as well try to get help from a website written in Latin (apologies to the classical scholars amongst you, who scoff at the idea that one wouldn’t know how to read Latin). Indeed, when one of my fellow PhD students who is much more confident and competent with statistics than me (one of those whiz kids I mentioned) heard that I was using SEM, they remarked on how difficult it is. This should have set off massive deafening alarm bells, but I just waltzed on by and carried on along the path of doom. And for what? I mean, really, what has structural equation modelling added to my analysis? Yeah, sure, I can wheel out arguments in favour:

  • It’s good that it allows for the simultaneous estimation of measurement factors and the structural relationships between them (crowd: ‘oooohhhh!’);
  • It has the helpful capacity to separately estimate residuals and measurement error, which allows for improved accuracy in models (crowd: ‘aaaaaahhhhh!’);
  • It’s neat that you can also estimate plausible alternative measurement and structural loadings (i.e. produce modification indices) and thus, perhaps, test competing causal propositions when running models (crowd: ‘wowwwww!’).

But really, even with all of the above acknowledged, what is structural equation modelling except a very complicated way to (still) not prove causality (even assuming that’s ever possible)? Of course, I can be confident that some of the variables in my model are causally prior to others (e.g. it’s fair to say that age precedes political views), but that would also be the case if I’d used a run-of-the-mill multiple regression. By contrast, all of my variables of interest (e.g. levels of cultural capital and levels of political participation) can be plausibly argued to precede one another (or be mutually reinforcing). This point stands regardless of how complicated the analytical technique used to analyse their relationships is (and such techniques are no substitute for longitudinal or experimental data). Thus, having failed to meet the first criterion, I also fail to meet the second by not really being able to see the benefit of having poured days, weeks, and months into an analytical approach that is effectively just an over-the-top way of saying how clever you are with statistics (which I’m not). Finally, the failure to meet the first two criteria was compounded by the fact that I was trying to analyse overly complex models (e.g. my final model included 106 indicators, estimating 34 latent factors), which the software that I was using (Mplus) really isn’t designed to do. In short, using SEM was a two-year nightmare that greatly undermined the other aspirations that I had for my research. So, my conclusion vis-à-vis SEM? Balls to that.


Don’t get waylaid by side analyses

Interim analyses (by which I mean messing around with your data without a clear purpose), fiddling with interesting data, and working on analyses suggested by other people can all seem useful but, unless they have a concrete pay-off (e.g. for your publications), they should be deprioritised. This means that if you do decide to take them on then your main analyses should remain the priority (i.e. the first thing you spend time on each day), and there should be a limit on the time you spend on such side analyses. I spent months working on multiple regressions (the time-consuming bit was processing and recoding data) that I thought might provide useful interim findings but that ended up being of almost no use. Instead, I could have used that time to start getting my head around SEM (assuming I didn’t follow the above advice) and getting that analysis done in a timely manner. So, decide on your analytical approach and do it. Mistakes will happen, and time will be spent on results that get revised or dropped, but the focus should be on the process that will give you something to write about in your thesis or in publications.


Reserve the last three months

Whenever your final deadline is, make sure that the preceding three months are kept as clear as possible. This means opting out of conferences, extra-curricular responsibilities and (if you can afford it) teaching so that you can focus exclusively on finishing your thesis. Also, don’t be tempted to go to interesting-looking events unless you’re confident that they’ll have a concrete pay-off. If this means that you feel like you have spare time then good, because you’ll need it. It also means that you can afford to take breaks, so that you’re less intellectually (plus emotionally and physically) exhausted as you sprint towards the finish.


Phew, that’s quite a list, but I’ve reached the end of everything I can think of for now. Of course, hindsight is a wonderful thing and all that, which is why I’ll keep this open to additions in future (comments, suggestions?). But, otherwise, and rather unceremoniously, rant ends.




