DSI Halfway Point Check-in

Samuel Yeager
Feb 3, 2021

It’s been a busy 7 weeks.

As of today, February 1st, 2021, I am halfway through my education in General Assembly’s Data Science Immersive course. I decided back in November, after getting laid off from my barista job, that I needed training that would prepare me for a real career. Seven weeks in (one week of which was a holiday break), I feel like I’m really getting there.

My most exciting project so far uses Natural Language Processing to examine posts from two subreddits (in this case r/ffxiv and r/wow) and builds a classification model to discern which subreddit a given Reddit post came from. I went in hoping to find a significant difference in sentiment or in common language, but most of the words that pointed my model to one subreddit or the other were extremely specific to each game (job vs class, free company vs guild, Dragoon vs Shaman). Still, it was an exciting project, and it taught me a lot about the data science process.
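To give a feel for how a classifier like this ends up leaning on game-specific words, here is a minimal sketch. It is not the project's actual code: the toy posts are made up, the function names (`tokenize`, `train`, `predict`, `top_words`) are hypothetical, and it uses a simple bag-of-words Naive Bayes in plain Python rather than the model from the project itself.

```python
import math
import re
from collections import Counter

# Hypothetical toy posts for illustration; not data from the real project.
POSTS = [
    ("ffxiv", "Which job should I level first, Dragoon or White Mage?"),
    ("ffxiv", "My free company is recruiting for savage raids"),
    ("ffxiv", "Dragoon job quest is stuck, any tips?"),
    ("wow", "Which class is best for a new player, Shaman or Druid?"),
    ("wow", "My guild is recruiting for mythic raids"),
    ("wow", "Shaman class changes in the next patch"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(posts):
    """Count words per subreddit and posts per subreddit."""
    word_counts = {}
    doc_counts = Counter()
    for label, text in posts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def log_likelihood(word, label, model):
    """Log P(word | label) with add-one (Laplace) smoothing."""
    word_counts, _, vocab = model
    counts = word_counts[label]
    return math.log((counts[word] + 1) / (sum(counts.values()) + len(vocab)))


def predict(text, model):
    """Return the subreddit with the highest Naive Bayes score."""
    _, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += log_likelihood(word, label, model)
        scores[label] = score
    return max(scores, key=scores.get)


def top_words(label, other, model, n=2):
    """Words whose smoothed log-odds most favor `label` over `other`."""
    _, _, vocab = model

    def log_odds(word):
        return log_likelihood(word, label, model) - log_likelihood(word, other, model)

    return sorted(vocab, key=log_odds, reverse=True)[:n]


model = train(POSTS)
```

Calling `predict("looking for a guild as a Shaman", model)` picks `"wow"` on this toy data, and `top_words("ffxiv", "wow", model)` surfaces "dragoon" and "job", which mirrors the pattern described above: the strongest signals are simply each game's own vocabulary.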

I’ve learned so many things. I’ve learned how to take a dataset of hundreds of house sales and turn that information into a model that would help homeowners improve their chances at a profitable sale. I’ve learned how to take reams of text posts and titles from Reddit and analyze the words, using machine learning to decide which subreddit the posts belong to. I’ve learned how to harness complicated models like Support Vector Machines and Random Forests. I’ve even learned how to take all this information and knowledge and present my own findings to an audience without any technical background at all, in a way that they can understand and use. I feel like I know so much more than I did when I began.

From here on out, I’ll be refining my newfound skills and learning some new ones. SQL is up next, and while I’ve got a little experience with it from some independent study, I expect that the education I’m about to receive will be much, much more thorough. Beyond that lie new, strange horizons: Bayesian statistics, time series analysis, capital-B Big Data. And that’s just what’ll be covered in this class.

Once I’ve completed this course, I hope to take my newfound knowledge into the world and build myself a career. I’ve gotten tastes of some tools that I want to study independently as well, things like Tableau and AWS. I will put all these skills into a toolkit that can better the lives of others, in ways that align with my own ethical principles. I want to use these new tools to find work where I can build a better world, and make sure that better world can be enjoyed by everyone.


