Transitioning from Engineering PhD to Data Scientist in Industry, Part II

Part II: What engineering PhDs do well & common gaps

Nasir Bhanpuri, PhD

For more context, see Part I: What data scientists do

Engineering PhD students have many of the foundational skills needed for data science, are familiar with many of the common methods, and may even have completed projects similar to daily data science functions. However, there are some common gaps. For this reason, several “bootcamps” have sprung up in the last ~5 years to help STEM graduate students transition to data science roles outside of academia (e.g., Data Science for Social Good, Insight Data Science, and Data Incubator).

The tables below list the Functions, Methods, Tools, and Skills that I described in more detail in Part I. They also include the typical level of someone with an engineering PhD and the typical desired level for a data science job. I have also noted some related resources (non-exhaustive) and a rough estimate of priority from the employer’s perspective.

As a quick reminder, here are the definitions of the major categories:

  • Functions: The activities that data scientists perform on a regular basis.
  • Methods: The approaches/procedures that underlie functions. Typically these are built upon one or more skills and involve tools.
  • Skills: The (generic) abilities that are pertinent to relevant methods and that enable performance of functions.
  • Tools: Common software tools that data scientists use.

Note that there is no consensus in the field on these terms and what they encompass, so do not be surprised if other data scientists or job descriptions organize them differently.

Color key for tables:

Green — indicates items where typical PhD level meets or exceeds typical job level

Orange — indicates high priority items where typical PhD level is below typical job level

White — indicates low priority items

My practical advice for folks looking to transition to data science would be to focus on developing orange items, then white items. I’d recommend highlighting green items during the application & interview process.

[Tables: Functions, Methods, Tools, Skills]

Definition of terms used to describe Levels:

  • Novice: <1 month of experience, familiar with terminology
  • Beginner: 1–6 months of experience, can do basics with little assistance, will need time to learn (and mentoring) for complex projects
  • Competent: 6 months–2 years of experience, can complete complex projects with little assistance, will need time to learn (and mentoring) for building continuous learning systems and functional data products, can mentor others on basics
  • Proficient: 2–5 years of experience, can build continuous learning systems and functional data products, can mentor others on complex projects and basics
  • Expert: 5+ years of experience, can mentor others on building systems/products, complex projects, and basics
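For readers who think in code, the color key and the ordered levels can be sketched in a few lines of Python. This is only an illustration of the logic described in this post, not how the tables were actually produced. The names `Level`, `color`, and `study_order` are my own, the example items are made-up placeholders rather than rows from the tables, and I am assuming that "meets or exceeds" takes precedence over priority when choosing a color.

```python
from enum import IntEnum


class Level(IntEnum):
    """Ordered levels from the definitions above (higher = more experience)."""
    NOVICE = 1
    BEGINNER = 2
    COMPETENT = 3
    PROFICIENT = 4
    EXPERT = 5


def color(phd_level, job_level, high_priority):
    """Apply the color key. Assumption: green is checked before priority."""
    if phd_level >= job_level:
        return "green"  # typical PhD level meets or exceeds typical job level
    return "orange" if high_priority else "white"


def study_order(items):
    """Return names of gap items: orange first, then white; green is skipped.

    `items` is a list of (name, phd_level, job_level, high_priority) tuples.
    """
    rank = {"orange": 0, "white": 1}
    gaps = []
    for name, phd_level, job_level, high_priority in items:
        c = color(phd_level, job_level, high_priority)
        if c != "green":
            gaps.append((rank[c], name))
    # sorted() is stable, so items keep their original order within a color
    return [name for _, name in sorted(gaps, key=lambda pair: pair[0])]


# Hypothetical placeholder items, NOT taken from the tables in this post.
example_items = [
    ("hypothetical item A", Level.COMPETENT, Level.BEGINNER, True),
    ("hypothetical item B", Level.NOVICE, Level.COMPETENT, False),
    ("hypothetical item C", Level.BEGINNER, Level.COMPETENT, True),
]

print(study_order(example_items))
# ['hypothetical item C', 'hypothetical item B']
```

Item A comes out green, so it is something to highlight in applications and interviews rather than something to study.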

On the bright side, there are many green items, which means that many of the qualifications were already met during the PhD journey and not much additional effort is needed in those areas. However, there are just as many orange items, so you will likely need to fill some gaps before you are a competitive data science candidate. Since most PhDs love learning, I’m confident most folks will be able to flip more items to green with a bit of focused effort.

I’d love to hear from folks about which aspects of this post (or Part I) are helpful and which could be improved. Please comment here or connect with me on LinkedIn, where we can continue the conversation. I’m also compiling a list of tips for conducting a data science job search; please reach out if you would like to hear more.

Good luck with your transition! I look forward to having more engineering PhDs join me in the interesting and exciting field of data science.

Nasir Bhanpuri, PhD

AI at Virta Health where I use data science to solve challenges in healthcare/medicine. I also use DS for sports, education, and music.