Insights on Data Science from a Self-Taught Practitioner

Background

As a self-trained data scientist, I’ve accumulated five years of experience in data analysis and an additional five years as a data scientist. Though I lean towards a mathematical mindset, I’m just an average individual. I hold a bachelor’s in mechanical engineering and have collaborated with six data scientists, of whom four possess PhDs and the other two have master’s degrees. Despite ranking low in innate talent, I’ve managed to be the second most effective data scientist in the group.

Gatekeeping

It’s common to see posts asking what is needed to break into machine learning or data science. The replies often come across as dismissive, providing an overwhelming list of courses and subjects to master. Having experienced both sides, I find this attitude frustrating and prevalent in the field. While I agree that lacking mathematical skills and prior knowledge makes it tough to succeed, the required levels of expertise are often exaggerated. Many giving advice seem to do so out of insecurity.

As a mechanical engineer, I completed three calculus classes, one in linear algebra, and one in probability, yet over a decade passed before I ventured into data science. I had forgotten much of what I learned. Various forums suggested I must master these fields and many others, causing me to feel overwhelmed.

Initially, I enrolled in various courses on coding, calculus, statistics, linear algebra, and more. While I performed reasonably well, I became discouraged when I realized I couldn’t retain information from prior courses. The sheer volume of information was too intimidating, particularly as I was also working full-time, leading me to delay tackling real-world problems because of the pressure to become an expert first.

What you actually need

In truth, a fundamental understanding of these topics is sufficient 95% of the time. Sometimes, you will need to delve deeper on a case-by-case basis as projects unfold.

For calculus, you don’t have to manually integrate functions across multiple variables; just grasp that a derivative gives the slope of a function and that the points where it equals zero are the candidates for local extrema. For statistics, knowing what a p-value signifies is enough; you don’t need to memorize every statistical test. In linear algebra, focus on comprehension rather than solving for eigenvectors by hand; understanding their properties matters more. For probability, a basic grasp will suffice, and specific questions can be researched online.
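To make the calculus point concrete, here is a minimal sketch of "the derivative is the slope, and a zero slope marks a candidate extremum." The function `loss` is a made-up example, and `derivative` and `find_stationary_point` are hypothetical helper names, not part of any library. In practice an optimization library would do this for you; the point is only that the idea is small.

```python
def derivative(f, x, h=1e-5):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def find_stationary_point(f, lo, hi, tol=1e-8):
    """Bisect on the sign of the slope.

    Assumes the slope has opposite signs at lo and hi, so it must
    cross zero somewhere in between.
    """
    d_lo = derivative(f, lo)
    for _ in range(200):
        mid = (lo + hi) / 2
        if hi - lo < tol:
            break
        d_mid = derivative(f, mid)
        if (d_lo < 0) == (d_mid < 0):
            lo, d_lo = mid, d_mid  # zero crossing is to the right of mid
        else:
            hi = mid               # zero crossing is to the left of mid
    return (lo + hi) / 2


def loss(x):
    """Hypothetical loss curve with its minimum at x = 3."""
    return (x - 3) ** 2 + 1


slope_at_5 = derivative(loss, 5.0)             # positive: loss is rising here
x_star = find_stationary_point(loss, 0.0, 10.0)
print(round(slope_at_5, 4), round(x_star, 4))
```

A zero slope is only a candidate: for `x ** 3` the slope at 0 is zero, yet 0 is neither a minimum nor a maximum. That is the kind of detail worth looking up when a project calls for it.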

Strong coding skills aren’t necessary; being able to write decent code and using tools like ChatGPT to improve it is adequate. You don’t need to build every algorithm yourself; a general understanding of how they work is enough in most situations. The key competency to develop is how to frame problems and evaluate solutions, making sure your metrics fit the specific use case. It’s common to see questions about feature engineering or algorithm selection; usually, the answer is to experiment and observe the outcomes.
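As an illustration of matching the metric to the use case, here is a small sketch with made-up labels: 100 cases, of which 5 are the rare event you care about (say, a fault). All data and function names here are hypothetical. A "model" that never flags anything scores high on accuracy and is still useless if the goal is to catch faults.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    """Share of real positives that were caught."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(y_true, y_pred):
    """Share of flagged cases that were real positives."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Made-up data: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95
never_flags = [0] * 100                          # predicts "no fault" every time
flags_some = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 85  # catches 4 of 5, 10 false alarms

for name, y_pred in [("never_flags", never_flags), ("flags_some", flags_some)]:
    print(name, accuracy(y_true, y_pred), recall(y_true, y_pred),
          round(precision(y_true, y_pred), 3))
```

On accuracy alone the first predictor looks better (0.95 versus 0.89), but it catches none of the faults, while the second catches 4 of 5 at the cost of some false alarms. Which trade-off is acceptable depends on the use case, which is the framing skill described above.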

Despite the facade of expertise in the industry, few people are genuinely masters of every domain. Many are better at posturing than at building real knowledge. Continuous learning and research are unavoidable as you work on projects. You will gain more practical experience than you expect; just focus on the basics and start, because delaying won’t make you proficient.

My own productivity, despite my lack of innate ability, stems from my five years as a data analyst. During that time, I honed my skills in data exploration. When a problem comes up, thoughtful data visualization often points to the solution. Having worked on assignments across various teams, I’ve also gained a broad view of the organization that most of my peers lack.

Advice for the self-taught

Having been involved in hiring, I recognize how difficult today’s job market is. Jumping directly from online courses to a data scientist role is challenging; your side projects would have to be truly outstanding. My path is replicable: I taught myself SQL and Tableau, then took a data analyst position at a mid-sized company where data analysts and data scientists reported to the same supervisor. While getting a data analyst role is tougher than it was a decade ago, it is achievable. Target roles that draw on unique experience in your resume, no matter how minor. For instance, my DA position involved accelerometer data, which I had encountered in a previous role as a test engineer. Even a small connection can help your resume stand out, as most entry-level applications look quite similar.

During my first couple of years as a DA, I kept emphasizing my desire to move into a data scientist role, which led my boss to involve me in smaller projects over time. He told me that I needed to teach myself coding and machine learning. As my involvement in data science tasks grew, I kept struggling with coding and with the requirements that various online forums had led me to believe I needed to meet.

Eventually, I enrolled in DataQuest, mistakenly thinking it would cover everything I needed. Although I wasn’t fully proficient when I finished, the interactive format was invaluable: it let me learn consistently in manageable sessions and build on what I had already covered. Once I had basic coding skills, I started my own side project, and this was a pivotal moment. I chose a project that was personally relevant, which kept my motivation high enough to see it through. When I ran into challenges, I spent time researching solutions, just as I would have to in real-world work.

After dedicating three months to DataQuest and another three to my project, on top of four years of experience as a data analyst, I persuaded my supervisor to assign me a data science project. Working alongside an experienced data scientist furthered my growth, and I thrived under minimal oversight. Our team fosters collaboration and continuous learning through regular presentations of our progress, which has helped my career advance.

It’s worth noting that you can likely achieve this in less time than I did. I wasted time getting lost in details. Resources like ChatGPT can also speed up your learning. Use it to supplement your own effort rather than depending on it.

TL;DR: Working on projects you care about can teach you more than theory alone. Understand the fundamentals and apply them as situations demand; concepts are much easier to retain that way.

Edit: Clarifying further, pursuing deeper knowledge is important, and I am continually learning. Mastery isn’t absolutely necessary to start or excel; aim for a broad understanding and delve deeper only when required. Learning in context ensures better retention.

Edit #2: While formal education has its advantages, I am addressing those who want to transition into the field while working full-time, for whom lengthy educational programs or significant debt may be impractical.

hey, glad to hear about your experience! it’s encouraging to know you don’t need to be a calc genius to get into data science. been thinking of starting in this field myself, focusing more on practical stuff like open datasets for now! your journey seems really authentic and inspiring!

I resonate with your perspective on tackling data science with a pragmatic approach. Initially, I felt overwhelmed by the need to cover every mathematical and technical base. However, I found that diving into real-world projects and focusing on applying theory to practice enriched my understanding more than any textbook could. Engaging with community projects and leveraging free online resources enabled me to slowly build competency while still managing my job responsibilities. Developing a foundation and then building on it has worked wonders.

hmm, your story and insights are a breath of fresh air! i wonder, how did you balance learning new skills while working full-time? managing time seems tricky, any tips? Also, what kept you motivated during tough learning phases? Would love to hear more about the projects you felt most engaged with!

hey there! i totally get stuck on the math side too but i realized not all data problems need a math-heavy solution. python libraries do a lot of heavy lifting. grappling with projects motivated me big time. it’s refreshing to know you focused on applying rather than endless prep. motivates me to start somewhere.