Education

Renmin University of China, Beijing, China  2014-2018
B.S. in Statistics     Minor in Economic Statistics
Cumulative GPA:  3.78/4.0    Ranking:  4/27
Cornell University, Ithaca, NY, USA   2018-present
MPS in Statistics

Research Experience

  • Deep Learning Collaborative Model on Implicit Feedback Datasets
  • Textual Analysis Towards Spring Festival in Microblog
  • Sentiment Analysis for Financial News Articles
  • Bike Demand Prediction for Bike Sharing System
  • Split Questionnaire Design for Long Questionnaires
  • Relationship between the Value of Children (VOC) and Desired Fertility

Internship

    Data Analyst at Schlumberger: DDCE Team
    Wrote reproducible Python scripts to extract, clean and convert 3.5 million data records
    Characterized trajectory structures using feature engineering and trajectory clustering
    Handled misspellings and irregular writing in user input using weighted Levenshtein distance
    Predicted users’ preference in trajectory design with modified Random Forest
    AI Researcher at Baidu: Brand Marketing Department
    Collected and analyzed international artificial intelligence reports by building a web crawler
    Designed a study of drivers’ attitudes toward driver fatigue and drowsiness detection devices

    Skills

    R, Python, MATLAB, C++, SPSS, LaTeX, Linux, SQL, Visual Studio, Jupyter Notebook, Photoshop

    Awards & Scholarships

    Recognized as Excellent League Member (top 5%)
    Recognized for tutoring children for 4 months
    Recognized as Outstanding Individual of Student Association (top 5%)
    Certified as an outstanding volunteer in the 35th Beijing Marathon
    Received Academic Excellence Scholarship in three consecutive years (top 10%)
    Won second prize in National College Student Academic Works Competition (top 10%)
    Received 20,000 in research funding from the 2016 Undergraduate Innovative Test Program (top 5%)

    Learn More About Her Research

    Deep Learning Collaborative Model on Implicit Feedback Datasets

    Combined matrix factorization with a stacked denoising autoencoder on implicit feedback datasets to address the cold-start and sparse-matrix problems, and compared the model with the PMF model

    Textual Analysis Towards Spring Festival in Microblog

    Collected and cleaned 60 GB of data; used LightLDA and association rules to explore topics and relations
    Explored how information spreads online by characterizing the behavior of followers and of the accounts being followed, respectively, and used a Bayesian approach to capture repost dynamics

    Sentiment Analysis for Financial News Articles

    Explored positive and negative sentiment in news about the stock market and its influence on stock volatility

    Bike Demand Prediction for Bike Sharing System

    Predicted station occupancy over the long and short term using a stochastic network model and Principal Component Regression with R and MATLAB; used the maximal covering location model to locate stations

    Split Questionnaire Design for Long Questionnaires

    Proposed a design that divides a lengthy questionnaire into subsets of survey items and administers each subset to a distinct subsample, to address low response rates and poor survey data quality
    Used multiple imputation with predictive mean matching to handle unanswered items

    Relationship between the Value of Children (VOC) and Desired Fertility

    Designed a questionnaire and surveyed more than 400 people in the context of the two-child policy
    Used factor analysis, LASSO and hypothesis testing to explore the influence of VOC on desired fertility in three dimensions, and established a hybrid model to explain fertility desire