Some lessons from my Ph.D.

I want to share some general ideas that will make your PhD journey more enjoyable, less stressful, and much more productive. In my experience, most of the tips are useful beyond your college years: I still follow them, so practicing them during the 3-4 years of a PhD can benefit you in the next steps of your career. These ideas won’t work unless they become a habit, but I believe it is worth trying them out and adjusting them to your own needs.

Strategy and time management

Think, reuse, then do

“Only a fool learns from his own mistakes. The wise man learns from the mistakes of others.”

It is very exciting to start implementing a brilliant idea you came up with. It is more efficient, though, to first ask people around you what they think. Do your research on the topic first. Perhaps (almost certainly) others have already done something similar, or failed trying.

Lead or be led

“Lead or be led” is not about being proactive versus following someone. It is about proactively following someone when you have no clue what is going on. Read this article.

If you don’t have enough experience in the topic of your PhD, it will be your job to find people who can lead and support you. Usually, more senior people have a good understanding of what can be done, so listen to them. Digest those ideas, transform them into sensible stories, and bring them to your advisor (with appropriate acknowledgement of the author).

It is easy to succeed if you take small steps; it is easy to fail if you jump too far

A fairly obvious suggestion: make incremental improvements, so that you can control the outcome and the resources it takes to reach a clear target.

When you fail, however, fear not: instead, publish in the Journal of Failed Research. The Journal of Failed Research is your personal collection of notes and code that you spent time developing but that did not quite work in the end. Treat it as a real, peer-reviewed journal - it is really worth documenting what you wanted to achieve, what your journey was like, and why you think it failed. I am sure that you (or even other people) will find it useful to read such a paper in the future, because of the first idea above: think, reuse, then do.

Once I spent 3 months developing adjoints for Perfectly Matched Layers for acoustics just to realize I did not need them - and the outcome of those 3 months was a pack of handwritten notes that I could not follow two years later when I was writing my thesis. Did I waste almost 500 hours of work? Yes. Could I have done better? Absolutely. Adding an extra week of work (which is less than 10% of the time I had already spent!) to properly document everything would have made my whole summer meaningful.

Push it to the limit on your good days, and relax on your bad days

Bad things happen. Sometimes you catch a cold, a migraine suddenly appears, or it is just too warm outside to concentrate. It is perfectly fine to switch focus away from work and watch your favorite Friends episode (or a season. Or two.) I would advise switching off your brain on bad days. Do not try to sit on two chairs - there is little reward in doing hard work when you are not ready.

Account for bad days though. I miss one day a month because of a health condition, which means I miss one month every three years: all of a sudden my PhD was 35 months, not 36. That is unfortunate, but I know that in advance, so I can compensate for that when I get into the flow state.

Technical tools

Use version control

Use version control for everything. Full stop.

Version control systems are used by virtually every sensible company in the world. You keep track of the changes you and your collaborators make to the code, you can mess with a code base and roll back weird changes painlessly, and you can be sure that you and your advisor are on the same page and run the same analysis - all this and much more comes for free. If you are not familiar with the concepts of version control, spend a week on an online Git course to save yourself months (or even years) of time. Unless, of course, you really enjoy trying to recover data and code that you wrote last November but have no clue where they are or how to run them.
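One way to make sure that a result can be traced back to the exact code that produced it is to store the commit hash next to the output. The snippet below is a minimal sketch, assuming Git is installed and the script runs inside a repository; the function names `current_commit` and `tag_results` and the `misfit` value are made up for illustration.

```python
import json
import subprocess
from datetime import datetime, timezone


def current_commit(repo_dir="."):
    """Return the current Git commit hash, or None if it cannot be determined."""
    try:
        completed = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Git is not installed, or repo_dir is not inside a repository.
        return None
    return completed.stdout.strip()


def tag_results(results, repo_dir="."):
    """Wrap a dictionary of results together with provenance metadata."""
    return {
        "results": results,
        "commit": current_commit(repo_dir),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    # Hypothetical outcome of an experiment.
    print(json.dumps(tag_results({"misfit": 0.125}), indent=2))
```

If you save this dictionary next to every figure or table, then checking out the recorded commit should bring back the code that produced it, provided all changes were committed before the run.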

I prefer Git + GitHub. GitHub provides a Student Developer Pack, which contains lots of freebies and technical resources.

Check with your institution: you might have access to a GitLab server hosted by your department, which makes it very easy to share code and data with your colleagues.

Make your research reproducible

Do you have a program that incorporates a fascinating idea and produces useful data? Sounds awesome. Now let’s make it reproducible:
- Document the steps required to run the program on a brand-new machine, and get the same results.
  
  This document should include the assumptions you make about the operating system (installation can be quite different on Windows and Linux), the assumed compiler or interpreter (e.g., is it python3.6 or python3.10?), and external dependencies (do you need numpy? which version?).
- Automate the installation process
  
  In principle, your code should be executable by a headless machine, without any manual interaction. I would suggest looking into GitHub Actions as a way to build your work environment from scratch, and to run and verify that your programs produce sensible results.
  
  Nobody should ever struggle with “it does not work on my machine!”.
- Write tests
  
  Naturally, some experiments or data analyses can run for days. It would be very hard to re-run a 3-day pipeline every time you modify your code to check that everything still works fine. Instead, write small, consistent unit tests that can be executed in a fraction of a second. A unit test exercises the smallest possible piece of a computational pipeline and verifies that one very specific functionality works as expected. It is encouraged to have a unit test for every action your code can perform, but focus on the critical features first.
- Combine version control, automated installation, and testing into a Continuous Integration pipeline
  
  Once you have these three pieces in place, you can execute the whole test suite for your project in a controllable, reproducible manner on any machine. If all tests pass on a fresh machine without manual intervention - congratulations! You, your advisor, and pretty much everyone in the world can verify for themselves that your results make sense, and you are protected from most reproducibility questions.
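To make the "write tests" step above more concrete, here is a minimal sketch of what fast unit tests can look like. The `rms` function is a made-up stand-in for a small piece of a real pipeline; the tests are plain functions whose names start with `test_`, which is the convention that pytest discovers automatically, but the sketch does not require pytest to run.

```python
import math


def rms(values):
    """Root mean square of a non-empty sequence of numbers."""
    if len(values) == 0:
        raise ValueError("rms() requires at least one value")
    return math.sqrt(sum(v * v for v in values) / len(values))


def test_rms_of_constant_signal():
    # The RMS of a constant signal equals the absolute value of the constant.
    assert math.isclose(rms([-2.0, -2.0, -2.0]), 2.0)


def test_rms_known_value():
    # sqrt((3^2 + 4^2) / 2) = sqrt(12.5)
    assert math.isclose(rms([3.0, 4.0]), math.sqrt(12.5))


def test_rms_rejects_empty_input():
    try:
        rms([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")


if __name__ == "__main__":
    # Calling the tests directly keeps this sketch free of dependencies.
    test_rms_of_constant_signal()
    test_rms_known_value()
    test_rms_rejects_empty_input()
    print("all tests passed")
```

Each test checks one specific behavior and runs in well under a second, so the whole suite can be executed on every commit, locally or in a Continuous Integration job.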

Reproducible research is a very interesting topic, and I would recommend familiarizing yourself with the patterns and tools you can apply for your project. A good start can be found here.
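As a small aid for the documentation step above, the environment itself can be recorded programmatically rather than from memory. This is only a sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the function name `environment_snapshot` is hypothetical, and `numpy` is just an example of a dependency to look up.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot(packages):
    """Describe the interpreter, the OS, and the versions of the listed packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "os_release": platform.release(),
        "executable": sys.executable,
        "packages": versions,
    }


if __name__ == "__main__":
    # "numpy" is only an example; list whatever your project imports.
    print(json.dumps(environment_snapshot(["numpy"]), indent=2))
```

Committing the printed snapshot next to your results gives a future reader (including you) a record of which interpreter and package versions the numbers came from.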

Master your tools

Regardless of whether you are a Python, C++, or Matlab adept (please, don’t do Matlab…), getting to know your tool is very important. Many times I have seen (even in mature projects!) code that does not make sense, or code that generates such obscure errors that one could spend a week hunting down a single-character typo that broke everything.

Furthermore, there are so many really powerful products available for free to students (VSCode, JetBrains IDEs, various Git tools, etc.) that can accelerate your research by an order of magnitude. Do not hesitate to ask the professional developer community which tools can work for you.

In the end, understand what you are doing. There is no silver bullet, and if something just works magically for you, be very cautious.

To sum up, a PhD is a big commitment and a very interesting time investment. There is a lot to learn during this time, but beware of picking up bad habits. Regardless of whether things are going great or not quite, there is an almost limitless list of skills to acquire - just keep going, one step at a time.
