Project: CoinPilot Premium User Conversion Prediction System
Executive Summary
The CoinPilot Premium User Conversion Prediction System is a machine learning solution that predicts which users of a fintech application will convert to premium services. It uses ensemble learning to analyze user behavior patterns, financial profiles, and engagement metrics, and outputs a conversion probability for each user. The project covers data analysis, model development, and deployment through a web-based architecture built on FastAPI and Streamlit. ...
Project: A specialized Wikipedia Research Assistant
1. Project Overview
This project is an AI-powered, fully automated research system that scrapes information from unstructured sources such as Wikipedia, extracts and structures it using Large Language Models (LLMs), and provides an interactive platform for natural-language queries and analysis. The system forms an end-to-end data intelligence pipeline, from data acquisition and processing to analysis and visualization, demonstrating how modern AI technology can turn complex web information into actionable knowledge. ...
Project: O2O Coupon Usage Prediction
Project Summary
In this project, I developed a machine learning model to predict whether customers would redeem coupons on an O2O (Online-to-Offline) platform. I engineered over 30 features from user and merchant behavior and trained a LightGBM classifier; the resulting model reveals the key drivers of coupon redemption, enabling more targeted and effective marketing strategies.
The Business Challenge
O2O platforms frequently issue coupons to attract customers and drive sales. However, untargeted coupon distribution can be costly and inefficient. The core challenge is to accurately predict which users are most likely to use a given coupon, so that campaigns can be targeted and return on investment maximized. ...
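As a rough illustration of the kind of behavioral feature described above, the sketch below computes a historical redemption rate per user and per merchant in plain Python. The record layout (`user_id`, `merchant_id`, `redeemed`), the function name, and the sample data are all hypothetical; the project's actual 30+ features and its LightGBM training code are not shown in this summary.

```python
from collections import defaultdict


def redemption_rates(records, key):
    """Return {entity: share of received coupons that were redeemed}.

    `records` is a list of dicts; `key` names the grouping field
    (a hypothetical "user_id" or "merchant_id").
    """
    received = defaultdict(int)
    redeemed = defaultdict(int)
    for rec in records:
        received[rec[key]] += 1
        if rec["redeemed"]:
            redeemed[rec[key]] += 1
    return {entity: redeemed[entity] / received[entity] for entity in received}


# Made-up coupon records, for illustration only.
sample_records = [
    {"user_id": "u1", "merchant_id": "m1", "redeemed": True},
    {"user_id": "u1", "merchant_id": "m2", "redeemed": False},
    {"user_id": "u2", "merchant_id": "m1", "redeemed": True},
    {"user_id": "u2", "merchant_id": "m1", "redeemed": True},
]

user_rate = redemption_rates(sample_records, "user_id")
merchant_rate = redemption_rates(sample_records, "merchant_id")
```

Rates like these would typically be computed on a historical window that precedes the prediction period, so that the label does not leak into the feature.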
Intern: Fixed Income Intern at CIB Co., Ltd.
Introduction
Prepared daily, weekly, and monthly reports on the bond market:
Daily: recorded and analyzed bond market transactions, government bond trends, and money and stock market activity and sentiment.
Weekly: compiled interest rate briefs and drafted reports, including a market liquidity review, a central bank operations review, and a market outlook analysis.
Monthly: independently prepared macroeconomic data briefs and reports, covering interbank certificate of deposit trends, custody data analysis, institutional behavior reports, and commercial paper reports.
Compiled and analyzed economic and financial data for monthly reports, including PMI, social financing, CPI & PPI, retail sales, industrial production, and import/export data, producing statistical analyses and commentary reports.
Competition: 2023 Huashu Cup Model Construction Competition
A Machine Learning and Evaluation Framework for Analyzing the Impact of Maternal Health on Infant Development
Summary
This study establishes machine learning models to analyze correlations between infant behavior characteristics, maternal physical and mental health indicators, and infant sleep quality, and proposes treatment strategies.
Problem 1
Preprocessed infant behavior features and maternal health indicators. Designed hierarchical statistics for multiple variables: ANOVA for continuous variables and logistic regression for categorical variables. Conducted correlation analysis and multifactor ANOVA, finding significant relationships:
Maternal age ↔ infant sleep patterns & behavior features
Maternal gestation period ↔ infant wake-up frequency
Maternal HADS score ↔ infant wake-up frequency, total sleep time, behavior features
Maternal EPDS score ↔ infant wake-up frequency, total sleep time
No significant effects were found for other indicators.
Problem 2
Trained models using logistic regression, Random Forest, Neural Networks, and XGBoost. Selected XGBoost as the best-performing model (highest accuracy). Optimized model parameters using loss function minimization and cross-validation. Predicted the behavior types for the last 20 infant samples using the trained XGBoost model.
Problem 3
Combined genetic algorithms with the XGBoost model from Problem 2 to generate treatment plans. Final treatment costs:
Moderate type: 695 CNY
Quiet type: 10,448 CNY
Problem 4
Evaluated infant sleep quality by determining indicator weights with the CRITIC method. Established a comprehensive sleep quality ranking system using rank-sum ratio evaluation, classifying sleep quality as excellent, good, medium, or poor. Trained a Random Forest model to associate comprehensive infant sleep quality with maternal health indicators, predicting sleep quality for the last 20 infant samples.
Problem 5
Based on the evaluation and association models from Problem 4, calculated the initial sleep quality of infant #238. Applied the same approach as Problem 3, updating maternal indicators in the association model to generate a new treatment plan:
Moderate type (sleep quality: excellent), minimum cost: 8,699 CNY
Keywords: XGBoost, Genetic Algorithm, CRITIC Method, Rank-Sum Ratio Evaluation, Random Forest, Association Model ...
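To make the CRITIC weighting mentioned in Problem 4 more concrete, here is a minimal plain-Python sketch of the standard CRITIC procedure: min-max normalize each indicator, measure its contrast by standard deviation, measure its conflict with other indicators by one minus the Pearson correlation, and normalize the products into weights. It assumes every indicator is benefit-type (larger is better) and runs on made-up numbers; the function names and sample matrix are hypothetical and are not taken from the competition paper.

```python
import math


def min_max(column):
    """Scale a column to [0, 1]; assumes a benefit-type indicator."""
    lo, hi = min(column), max(column)
    if hi == lo:
        raise ValueError("constant column carries no information")
    return [(v - lo) / (hi - lo) for v in column]


def sample_std(column):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(column) / len(column)
    return math.sqrt(sum((v - mean) ** 2 for v in column) / (len(column) - 1))


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length columns."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def critic_weights(rows):
    """Return one CRITIC weight per indicator (column) of `rows`."""
    columns = [min_max(list(col)) for col in zip(*rows)]
    information = []
    for col_j in columns:
        # Conflict: how weakly this indicator correlates with the others.
        conflict = sum(1 - pearson(col_j, col_k) for col_k in columns)
        information.append(sample_std(col_j) * conflict)
    total = sum(information)
    return [c / total for c in information]


# Made-up matrix: 4 samples x 3 indicators, for illustration only.
sample_matrix = [
    [1, 2, 9],
    [2, 4, 1],
    [3, 6, 5],
    [4, 8, 3],
]
weights = critic_weights(sample_matrix)
print(weights)
```

In this toy matrix the first two indicators are perfectly correlated, so they receive identical weights, while the third indicator, which conflicts with both, receives a larger weight. Cost-type indicators would need the normalization reversed before weighting.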
Competition: 2024 Mathematical Contest in Modeling
An Analysis of Sustainable Strategies for Property Insurance
Introduction
In this paper, I developed an LSTM model with an accuracy of 85% to predict future natural disasters, using natural disaster data from Florida and California over the past 30 years. I also applied neural network, linear regression, and deep learning models to assess the relationship between natural disasters and property damage.
Summary
In recent years, homeowners and insurance companies have faced significant crises, necessitating comprehensive solutions that meet the needs of all stakeholders in the insurance industry. This paper presents an approach to property insurance built on three models: an insurance company property allocation model based on deep learning and LSTM, a market investment model using regression analysis, and a community conservation building model using grey correlation analysis. These models provide insights and correlation analyses for the property-casualty insurance sector, promoting a more sustainable industry. ...
Intern: Assurance Data Analysis Intern at Ernst & Young
Introduction
Managed and processed financial data exported from Kingdee and UFIDA systems, handling 10,000+ data fields daily. Automated real estate data auditing with Openpyxl and Pandas, achieving faster queries, reduced latency, and streamlined procedures.
Intern: RA IBond Data Analyst Intern at Deloitte
Introduction
Implemented automated ETL pipelines using the WIND Excel plugin and templates to query and preprocess 80,000+ records daily for the Deloitte IBond Smart Bond and CITIC Construction Investment teams. Optimized large-scale data retrieval and analysis pipelines with Python and SQL, improving processing speed and stability. Proficient in WIND (Excel and Python integration).