Max Chen

I am a rising junior from Long Island, NY, studying Computer Science with a minor in Statistics and a concentration in Operations Research at Cornell University. My research interests center on data science, particularly computational social science, learning analytics, and financial technology.

Last summer (2018), I interned in Analytics at RapidRatings.

This summer (2019), I am working with the software engineers in the Enterprise Engineering division at Morgan Stanley.

On campus, I am one of the team leads of Cornell Data Science, and I am affiliated with the Future of Learning Lab.


Work Experience

Technology Summer Analyst

Morgan Stanley

Working in an agile environment with a team of software engineers on a C# .NET service that will be deployed across the entire firm.

June 2019 - Present

Teaching Assistant

Cornell University, Department of Computing & Information Sciences

Fall 2019: INFO 5200 - Learning Analytics

  • INFO 5200 is a master's-level class on Learning Analytics that provides a survey of learning science theories (active learning, modalities, Bloom’s taxonomy, metacognition, self-regulated learning) and educational data science methods (predictive modeling, classification, regression, natural language processing, causal inference).
  • Will hold weekly Office Hours
  • Will grade problem sets/exams/projects weekly
  • Will answer questions online through Campuswire and email
Spring 2019: INFO 2950 - Introduction to Data Science
  • Helped lead weekly recitation sections
  • Held weekly Office Hours
  • Graded problem sets/exams/projects weekly
  • Answered questions online through Campuswire and email

January 2019 - Present

Analytics and Coverage Intern

Rapid Ratings International, Inc.

  • Implemented Decision Trees and Approximate String Matching in Python to automate Customer Relationship Management Matching, drastically improving efficiency for another department in an agile environment.
  • Analyzed financial statements of public and private companies using Thomson Reuters data and the Bloomberg Terminal.
  • Manipulated and partitioned, in Python, a CSV file containing millions of lines of private company names.
  • Analyzed probabilistic default rates and risk model building, especially for the Oil Producers industry from 2014 to 2017.
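
As a rough illustration of the approximate string matching idea mentioned above, here is a minimal sketch using only Python's standard library. It is not the code used at Rapid Ratings: the `normalize` and `best_match` helpers, the suffix list, and the 0.85 threshold are all hypothetical choices made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical list of corporate suffixes to ignore when comparing names.
SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop common suffixes."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(tok for tok in cleaned.split() if tok not in SUFFIXES)


def best_match(query, candidates, threshold=0.85):
    """Return (candidate, score) for the closest name, or (None, score) below threshold."""
    q = normalize(query)
    best_name, best_score = None, 0.0
    for cand in candidates:
        # ratio() is a similarity in [0, 1]; 1.0 means the normalized names are identical.
        score = SequenceMatcher(None, q, normalize(cand)).ratio()
        if score > best_score:
            best_name, best_score = cand, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


print(best_match("Acme Corp.", ["ACME Corporation", "Apex Inc", "Acme, Inc."]))
# -> ('Acme, Inc.', 1.0)
```

Comparing every query against every candidate is quadratic, so for a file with millions of names one would likely partition the data first (for example by first letter) before scoring pairs.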

June 2018 - August 2018

Selected Research and Projects

Mining of mobile learning logs to understand and advance educational effectiveness

Education is a fundamental piece of any person’s development, which is why access to education is critical. With the latest innovations in technology, a way to provide access to millions of people in Kenya is mobile education, specifically delivery through SMS. In this project, we examine fine-grained usage data of a large mobile learning platform to answer questions about how students use mobile learning in Kenya and how well it works. This research employs statistical and computational methods to parse and interpret a large educational dataset. As students in Kenya are more likely to have access to feature phones than more advanced technology, there is a lot of potential for SMS-based education to reach underserved populations in the country.

I am using R to engineer features based on learner usage patterns, and clustering based on these features, in order to examine the types of behaviors that lead to success on a mobile learning platform.

September 2018 - Present

INSPIRE - Insightful Spotify Recommendations

Cornell Data Science

Spotify users often wonder why they are recommended certain songs. For instance, a media streaming platform like Netflix may recommend that a user watch "How I Met Your Mother" because they previously watched "Friends." In contrast, Spotify will simply list a few songs at the bottom of your playlist that serve as recommendations. Our project sought to bridge the gap between song recommendations and the inner workings of a recommendation system.

We used R, Python, and JavaScript (D3.js) to create a web application that not only provided recommendations powered by K-Means, but also offered insight into which characteristics of each recommended song were special. The writeup can be found here.
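
To give a sense of how K-Means can drive recommendations, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the project's implementation: the two-dimensional feature vectors, the `kmeans` and `recommend` helpers, and the choice to initialize centroids from the first `k` points are all hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic init (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def recommend(song_index, points, labels):
    """Indices of other songs in the same cluster, nearest first."""
    same = [
        i for i, lab in enumerate(labels)
        if lab == labels[song_index] and i != song_index
    ]
    return sorted(same, key=lambda i: math.dist(points[song_index], points[i]))


# Hypothetical songs described by two audio features scaled to [0, 1].
songs = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.85, 0.9)]
labels, centroids = kmeans(songs, k=2)
print(labels)                       # -> [0, 1, 0, 1]
print(recommend(0, songs, labels))  # -> [2]
```

Comparing a song's features with its cluster centroid is one simple way to surface which characteristics set it apart.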

January 2019 - May 2019

OScrabl

OScrabl is an implementation of the popular board game Scrabble, created from scratch and written purely in OCaml. It was developed as an open-ended midterm project for CS 3110: Data Structures & Functional Programming at Cornell University.

October 2018

Fake News Challenge

Cornell Data Science

People will never stop talking about the 2016 election. While we will never know the actual extent of the effect of fake news, the least we can do is try to classify articles as fake news or real, with the hope of stopping the spread of misinformation.
This project involved building NLP models for stance and relevance classification and building a visual product providing insights into the classification of fake news.
We approached the problem as defined by the Fake News Challenge (FNC-1). Broadly speaking, this is split into two tasks: a relevance detection task and a stance detection task.

We implemented a binary C4.5 Decision Tree and a Random Forest algorithm in Python to tackle the relevance detection task, and used a two-layer deep learning model for the stance detection task. We also visualized the workings of these models, as well as our exploratory data analysis, in JavaScript (D3.js). The writeup can be found here.

September 2018 - December 2018

Social Tribes

Cornell Data Science

Our motivation for this project stemmed from the question of whether social media accounts actually promote interactions among a large group of diverse individuals. We wanted to research and observe if people simply interacted with only a specific “tribe” of peers on social media accounts similar to how they do in real life. Our objective was to determine if people from a given political group solely interacted with other members within that group, or if they also interacted with members of a different affiliation.

We constructed a similarity matrix using similarity scores that we created between each pair of journalists, then investigated the effects of three different clustering algorithms. We visualized the workings of each clustering algorithm in JavaScript (D3.js). The demo link can be found above. The writeup can be found here.
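
As a small illustration of what a pairwise similarity matrix looks like, here is a minimal Python sketch. The similarity score we actually created is not reproduced here; Jaccard similarity over the set of accounts each journalist interacts with is an assumption made for this example, and the names and data are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(interactions: dict):
    """Return (sorted names, symmetric matrix of pairwise Jaccard scores)."""
    names = sorted(interactions)
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    # Each journalist is fully similar to themselves, so the diagonal is 1.0.
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b in combinations(names, 2):
        score = jaccard(interactions[a], interactions[b])
        matrix[index[a]][index[b]] = score
        matrix[index[b]][index[a]] = score
    return names, matrix


# Hypothetical data: the accounts each journalist interacted with.
names, matrix = similarity_matrix({
    "a": {"x", "y"},
    "b": {"y", "z"},
    "c": {"q"},
})
for name, row in zip(names, matrix):
    print(name, [round(v, 2) for v in row])
```

A matrix like this can be passed to a clustering algorithm that accepts precomputed similarities or, after converting each score `s` to `1 - s`, precomputed distances.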

January 2018 - May 2018

Chowtime

Chowtime is a project designed to help aspiring foodies decide where to eat.

July 2018 - Present

Skills

Programming/Scripting Languages
  • Python
  • R
  • OCaml
  • Java
  • C
  • JavaScript
Tools
  • SQL
  • HTML/CSS
  • Stata
  • Coq
  • Git
Specializations
  • Applied Machine Learning
  • Learning Analytics
  • Data Mining
  • Data Visualization
  • Data Manipulation
  • Data Analytics
Interpersonal skills
  • Leadership
  • Conflict resolution
  • Time management



Leadership Experience

Cornell Data Science

Team Lead

As the lead of Cornell Data Science’s Insights team, I lead weekly meetings, hold workshops on machine learning and data science techniques, coordinate guest lectures with professors in the field, and ensure smooth operations of the team.

December 2018 - Present

Student Management Corporation

Board of Directors Member

The Board of Directors of SMC (a 2-million-dollar company) makes decisions impacting thousands of Cornell students who are members of various student-run organizations, as well as full-time employees of the Student Management Corporation.

October 2018 - Present

Phi Delta Theta Fraternity

Kitchen Manager

As Kitchen Manager, I used skills such as conflict resolution to maintain order and functionality while meeting the health standards of an industrial-sized kitchen serving seventy people each night.

April 2018 - April 2019

Interests & Miscellaneous Information

  • Running
    • I ran competitive Track and Field/Cross Country back in high school. These days, I don't have as much time as I'd like to dedicate towards running, but I always love to get out in the trails when I can.
    • I ran on a Distance Medley Relay at New Balance Outdoor Nationals.
    • Some PRs from high school: 800m - 2:04 (relay), 1600m - 4:40, 5km - 17:32 (XC)
  • Weightlifting
  • I was certified as a CompTIA A+ Technician when I was 11 years old.
  • My favorite TV shows are The Office, Brooklyn 99, and How I Met Your Mother.
  • In my free time, I enjoy listening to music, playing sports, and spending time with friends/family.
  • My favorite basketball team is the Toronto Raptors.