If you want a great job,
you need something of great value
to offer in return.
— Cal Newport
Some time back I posted a list of “researchers without a PhD” and it went viral. Even people with a PhD liked the post, which made me think there may be some truth to the idea of becoming a researcher without one.
Colin (an author of the T5 paper) tweeted this recently, and it felt like a sign that I should write this article. Many people are interested in serious research, provided they get the right environment.
The questions and answers below are a personal inquiry into becoming a better data scientist.
Let’s begin…
1. What do master chess players do to keep improving?
They think holistically about the pieces and exploit their strengths for the end goal. They will play a gambit when they can.
2. What does this mean for a data scientist?
We need to have a ‘goal’ in mind and need to achieve it under the constraints of time, data, compute and intellectual resources.
3. What am I saying exactly?
We need to learn to accomplish research with the given constraints.
4. Okay, and what should I work on?
“People who make impact, select important problems to work with”.
— Richard Hamming (Turing award winning researcher at Bell Labs)
5. Who gets the opportunity to work on important problems?
We live in a world of rapid advancement. I am interested in building new language models, but I cannot do it without the compute to run experiments. I also do not have access to the intellectual capital of peers the way FAANG researchers do.
To be a FAANG researcher, I need a PhD from a top college.
6. Do I really need a PhD to do impactful work?
More impactful work is done by PhDs because they have access to the resources that come with joining a research group.
7. I see…How tough is it to get a PhD at top groups?
8. If I luckily get a PhD, how long will it be?
It seems I need to spend around 3-5 years 🤯
9. Will I spend 4±1 years of my life to improve the next 25 years?
I don’t think so.
10. How will I ever get a chance to do impactful work without a PhD?
I really don’t have an answer but I do know some people who made it without a PhD.
Christopher Olah at OpenAI
Known for LSTM blogs and distill.pub
Victor Sanh at Hugging Face
Known for DistilBERT and Movement Pruning
Denny Britz at ex-Google
Known for blogs on CNN, RNN and algo-trading
Melvin Johnson at Google
Known for work on multilingual translation
Niki Parmar at Google Brain
Known as one of the authors of the Transformer paper
Alec Radford at OpenAI
Known for generative modeling such as Jukebox and Image GPT
Jacob Devlin at Google
Known as one of the authors of BERT
Goku Mohandas at ex-Apple and ex-Ciitizen
Known for work in NLP (healthcare) and now runs madewithml.com
Madison May at Indico
Known for NLP blogs and finetune (scikit-learn style model finetuning for NLP)
Andreas Madsen (independent researcher)
Known for his story and Neural Arithmetic Units
Stephen Merity at ex-Salesforce
Known for AWD-LSTM and Single Headed Attention RNN
Jeremy Howard (Who doesn’t know him!)
Known for Kaggle, the fastai initiative and ULMFiT, which kick-started transfer learning in NLP!
(There are many other such researchers, but I only know a few in NLP.)
11. Ahh... there seems to be some hope! How can I stand out as a researcher?
12. What about conference papers?
13. Should I go wide or deep?
14. And what happens after I do some projects and research blogs, if not papers?
Yessss…this question and the answer.
💥 QnA with the researchers 💥
—> Goku Mohandas at ex-Apple and ex-Ciitizen
What are your research interests?
NLP, healthcare, platforms, network effects
What challenges did you face as a researcher without a PhD?
Initially, it was proving myself to those who traditionally hired PhDs from top tier programs who have several publications at top tier conferences. When you don't have this (because you didn't want to do it or couldn't afford to do it), you need to think about how to differentiate yourself and prove that you are fit for the role reserved for those with higher education.
Even if you were hired by proving yourself, your research colleagues will also look for the same proof of value. So you want to really solidify your value add and even better if you can do something the rest of your team cannot.
What are your suggestions for people wanting to research without doing a PhD?
Be different (applies even if you have a PhD). You can demonstrate a focus on a specific area (much like a PhD) but instead of publications at top conferences, you can have a portfolio of end-to-end projects that you've built that demonstrate research and product (which you can still publish on).
Over the years, we've hired many researchers with and without PhD. The best signal for performance has been a portfolio that demonstrates their ability to apply research. Just because someone has X years of formal school doesn't make them a better candidate or researcher.
Don't be afraid to apply your research (applied research is still research). You don't always have to work on devising new architectures or chasing SOTA on specific datasets. Applying and extending research in a focused area is still amazing research and it can have a profound impact on practical applications in your field!
—> Chris Olah at OpenAI
What are your research interests?
NLP, explainability
What challenges did you face as a researcher without a PhD?
Academia challenges
It’s important to be aware that not having a degree can have several negative long-term consequences.
University degrees have a lot of signaling and credentialing value. They cheaply communicate to individuals and organizations that you have some baseline skills, at least in theory. In some fields, you can succeed without a degree by demonstrating skill in other ways (publications, open-source projects, portfolios, talks, awards, work history, referrals, etc). Other fields are less accepting.
How can you tell which type of field you're in? One useful test can be to look for examples of people who are successful without degrees in your field. (Author: The reason for writing this post is now validated 😝)
There's also a weird flip side to all these downsides. Once you establish yourself as competent there is this kind of threshold effect where not having a university degree can suddenly start causing people to actually take you more seriously. This kind of counter signalling effect seems to be common when you do non-traditional things.
Family challenges
Unfortunately, even if you feel confident that you would be best served by taking a non-traditional path, many young people face significant social and emotional barriers to doing so, particularly from adults. While some people are lucky enough to have a family that will support them doing something unusual, many are not.
Social challenges
For many people, university is a period of social development. They learn social skills, make long-term friends, and form romantic relationships. For some people, especially readers of this essay, it seems possible this is the biggest benefit of university.
What are your suggestions for people wanting to research without doing a PhD?
If your plan is to learn or work on projects independently, it’s worth thinking especially carefully. This can be great -- I did it for three years and grew a lot as a result -- but it can also very easily fail.
Some important questions to ask yourself are:
Do I have things I deeply want to spend a year of my life exploring or working on?
Do I have a way to support myself that leaves me time and energy to grow?
Can I really work self-directed for months at a time?
Do I have examples of me working hard on a personal project or learning without external structure?
Do I have or can I learn the skills I need to work on this project independently?
Do I have sources of community, peer support or mentorship for what I want to do?
(You can read his very detailed blog on this topic.)
—> Andreas Madsen (independent researcher)
What are your research interests?
Interpretability, or more generally trust in ML
What challenges did you face as a new researcher?
Lack of computational resources.
Nobody experienced to proofread papers before submission.
Keeping up with research at a broader scope on your own.
Finances, how do I pay my rent when nobody is sponsoring me.
Lack of support network with similar difficulties.
Staying motivated when working in isolation.
Fear of getting the paper rejected.
What are your suggestions for people wanting to research?
Pair up with a friend who can also think critically about research. Being the first author is super important at the beginning of a research career, so don't do equal contributions. Instead, work on two papers, where each of you is first-author on your own paper and is critical of the other’s work.
Secondly, do smaller-scoped open-source solo projects. Their small successes will keep you motivated and make the risk of rejection feel smaller.
Thirdly, find a field with not too much activity and which doesn't require large computational resources. GPT-3 might be cool but it’s not for independent researchers.
—> Excerpts from An Opinionated Guide to ML Research
(The guide is by John Schulman; the “I” in the excerpts below is his.)
Roughly speaking, there are two different ways that you might go about deciding what to work on next.
Idea-driven: Follow some sectors of the literature. As you read a paper showing how to do X, you have an idea of how to do X even better. Then you embark on a project to test your idea.
Goal-driven: Develop a vision of some new AI capabilities you’d like to achieve, and solve problems that bring you closer to that goal.
While I was working on locomotion and starting to get my first results with policy gradient methods, the DeepMind team presented the results using DQN on Atari. After this result, many people jumped on the bandwagon and tried to develop better versions of Q-learning and apply them to the Atari domain.
However, I had already explored Q-learning and concluded that it wasn’t a good approach for the locomotion tasks I was working on, so I continued working on policy gradient methods, which led to TRPO, GAE, and later PPO—now my best-known pieces of work.
Choosing a different problem from the rest of the community can lead you to explore different ideas.
Sometimes, people who are both exceptionally smart and hard-working fail to do great research. In my view, the main reason for this failure is that they work on unimportant problems. When you embark on a research project, you should ask yourself: how large is the potential upside? Will this be a 10% improvement or a 10X improvement?
I often see researchers take on projects that seem sensible but could only possibly yield a small improvement to some metric. During your day-to-day work, you’ll make incremental improvements in performance and in understanding. But these small steps should be moving you towards a larger goal that represents a non-incremental advance.
Besides reading books, PhD theses and seminal papers, you should also keep track of the less exceptional papers being published in your field.
Reading and skimming the incoming papers with a critical eye helps you notice the trends in your field (perhaps you notice that a lot of papers are using some new technique and getting good results—maybe you should investigate it). It also helps you build up your taste by observing the dependency graph of ideas—which ideas become widely used and open the door to other ideas.
Industrial perspective from a veteran PhD
—> Ajit Rajasekharan - CTO at nference
50+ patents and a successful entrepreneur
What are your research interests?
I don't do research. I work on specific problems in NLP that are relevant to certain application domains of interest to my company.
What challenges do you face at work?
The challenges I face have to do with finding working solutions meeting deployment thresholds of performance and scale, in short time windows.
While a PhD could certainly help in solving hard open-ended problems, the nature of problems in the industry often requires a unique skill that is perhaps best honed only by solving such problems, under time constraints, requiring solutions that can be deployed at scale - it is not enough to just beat a previous benchmark by a few percentage points.
A specific class of problems I work on, driven by practical constraints of the absence of labeled data, is finding unsupervised solutions to problems traditionally solved with supervised models.
A concrete example is unsupervised NER for custom entity types without labeled data. After many failed attempts, I converged on a reasonably well-working solution (given some constraints) that can be used in production. I doubt having a PhD would have helped me converge on such a solution, for multiple simple reasons. The approach and nature of the solution certainly wouldn't merit a PhD to begin with: it’s a simple solution leveraging an existing unsupervised model.
Unsupervised NER remains an active area of academic research to date and sadly has orders of magnitude fewer papers relative to supervised NER.
What are your suggestions for people wanting to research without a PhD?
There are companies, even if not many, that have hard problems that need to be solved in shorter time windows than a traditional PhD time frame. While they may not be very open-ended both in their objective and solution strategies, the problems they solve are not trivial. An example of this is Tesla. Tesla continues to make tangible incremental progress in self-driving deployed to its fleet of cars. Any Tesla owner can testify to this progress, despite all the inevitable glitches of such a rapid pace.
However, at the other end are broad, open-ended problems that can only be solved over longer time frames in a PhD setting, ideally even without the time constraint of having to publish regularly. Examples include progress toward Deep Learning 2.0, as outlined by Prof. Yoshua Bengio, Yann LeCun, and others.
Those of us in the industry, regardless of having a PhD, would have nothing to build on without progress in such academic research settings. Tesla largely owes its progress in self-driving to CNNs, which had their origins and development spanning academia and research labs (notably Bell Labs; interestingly, an early embodiment of CNNs was first deployed for the practical problem of recognizing bank checks).
I consciously chose not to directly answer the question above. My suggestion for those entering the field is to choose among these paths based upon your strengths, interests, and circumstances: industry or academia exclusively, a transition between them, or even straddling both.
All in all
It is surely going to be difficult for anyone to simulate the environment of a top research group without access to peers and mentors. That said, you can always develop a taste for finding good problems, make friends with similar interests, and ask someone to be your mentor.
If your end goal is to optimize your impact on the world, keep building things that others need and that nobody else has built.
My framework
Which are the important problems where I can get high improvements?
Which of these problems can I work on within my constraints?
Whom can I ask for guidance and mentoring for the selected problem?
I hope this inquiry helps you become a better researcher with or without a PhD.
Ending this on another of Richard’s quotes 😊
“Beware of finding what you're looking for.”
— Richard Hamming