Kexin Deng

Hi! I am a graduate of the Master of Engineering in Financial Engineering program at Cornell University. My work focuses on systematic equity research, machine learning for financial prediction, and quantitative portfolio construction. At Cornell, I conducted research on market microstructure, execution modeling, and regime-aware portfolio strategies. I have also built large-scale machine learning pipelines for cross-sectional equity prediction, including feature engineering, signal evaluation, and portfolio optimization. My broader interests lie in applying machine learning and data systems to financial markets, including alternative data, large-scale factor modeling, and AI-driven research workflows like RAG.

Email  /  CV  /  SSRN  /  Instagram  /  LinkedIn  /  Github


Research

My research interests lie at the intersection of systematic investing, machine learning, and quantitative finance. I work on building data-driven models for financial markets, including cross-sectional equity prediction, portfolio construction, and regime-aware risk modeling. My recent work includes large-scale machine learning pipelines for equity prediction, retrieval-augmented generation systems for financial knowledge retrieval, and systematic trading strategies informed by market microstructure signals.

Retrieval-Augmented Generation in Finance: Evaluating Vector, Hierarchical, and Graph-Based RAG Architectures
Kexin Deng, Yueying Wang, Yuewei Wang, Yun Zhang,
Yihan Zhao, George Lin, Yu Yu, Charles Zhou,
Sasha Stoikov, Laura Appleby
Cornell Financial Engineering Manhattan, Cornell University, BlackRock
CFEM Capstone Research Project with BlackRock, 2025.8–2026.2
paper / code / poster

This project studies retrieval-augmented generation systems for financial knowledge retrieval. We evaluate multiple RAG architectures including vector search, hierarchical retrieval, and graph-based retrieval pipelines over enterprise-scale financial document corpora.
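
As a rough illustration of the vector-search baseline only, the sketch below ranks documents by cosine similarity between toy bag-of-words vectors. The `embed`, `cosine`, and `retrieve` helpers and the sample documents are hypothetical; the project's actual embedding model, corpus, and index are not described here.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding' (assumption): lowercase token counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

# Hypothetical mini-corpus, for illustration only.
DOCS = [
    "credit spreads widened as liquidity fell",
    "the fund rebalances its equity portfolio monthly",
    "vix futures term structure moved into backwardation",
]
top = retrieve("equity portfolio rebalances", DOCS, k=1)
```

Hierarchical and graph-based retrieval replace the flat ranking step with a tree or graph traversal; they are not shown here.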

Machine Learning and Deep Learning–Enhanced Feature Engineering and Model Zoo Ensembling for the Trexquant Market Prediction
Kexin Deng, Yichen Gao, Michael Cao
Cornell Financial Engineering Manhattan, Cornell University
AI/ML Applications to Trading & Execution, 2025.9–2025.12
Mentored by Giuseppe Nuti (head of Machine Learning & AI for UBS’s Global Markets)
paper / code / slide deck

Developed a large-scale cross-sectional equity prediction system combining domain-driven feature engineering with machine learning and deep learning models. The pipeline expands 400 base alpha signals into 1,947 engineered features and combines LightGBM, IC-oriented neural networks, and Ridge regression in a Model Zoo ensemble optimized with a Correlation-Optimized Ensemble (COE) method. The final system achieved a public leaderboard score of 0.07066, ranking in the top 2 on the public leaderboard and the top 3 on the private leaderboard of the Trexquant competition.
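
For readers unfamiliar with the term, IC is read here as the information coefficient. A common convention, assumed in this sketch because the competition's exact metric is not stated above, is the Spearman rank correlation between predicted and realized cross-sectional returns. The helper names are hypothetical and this is not the project's code.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def rank_ic(predicted, realized):
    """Spearman rank IC for one cross-section of stocks."""
    return pearson(ranks(predicted), ranks(realized))

# Predictions that order the stocks correctly give an IC of 1,
# even though the magnitudes differ.
ic = rank_ic([0.1, 0.5, 0.2, 0.9], [-0.02, 0.01, -0.01, 0.10])
```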

Agentic Artificial Intelligence in Finance: A Comprehensive Survey
Irene Aldridge, Cornell Jolie An, Riley Burke, Michael Cao, Chia-Yi Chien, Kexin Deng, Ruipeng Deng, Yichen Gao, Olivia Guo, Shunran He, Zheng Li, George Lin, Weihang Lin, Fanyi Lyu, Kwunfung Ng, Qi Wang, Hanxi Xiao, Dora Xu, Yuanyuan Xue, Sheng Zhang, Sirui Zhang, Yun Zhang, Sirui Zhao, Xiaolong Zhao, Yihan Zhao, Waner Zheng
Cornell Financial Engineering Manhattan, Cornell University
SSRN Working Paper, 2025
paper

This survey studies the emergence of agentic artificial intelligence in financial markets. We analyze architectures for autonomous financial agents capable of reasoning, planning, and adaptive decision-making across trading, portfolio management, and market infrastructure. The paper reviews system design, multi-agent coordination, regulatory considerations, and the implications of agentic AI for market efficiency, liquidity provision, and systemic risk.

Meet the Rollups: the Digital Broker-Dealers
Irene Aldridge, Cornell, Kexin Deng
Cornell Financial Engineering Manhattan, Cornell University
SSRN Working Paper, 2025
paper

This study discusses how recent innovations in cryptography can help bridge the divide between the optimality of now-traditional dark pools and the advances in the latest digital ledger technologies. We argue that techniques such as Merkle tree compression enable efficient decentralized electronic matching in a traditional dark pool environment. This development could improve market efficiency in markets around the world.
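
To make the Merkle tree idea concrete, the sketch below compresses a batch of records into a single root hash. It is a generic illustration, not the construction used in the paper: the choice of SHA-256 and the rule of duplicating the last node on odd-sized levels are assumptions, and `merkle_root` is a hypothetical helper.

```python
import hashlib

def _h(data: bytes) -> bytes:
    """SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Hash leaves pairwise, level by level, until one root remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last node (assumption)
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()

# Hypothetical order records; any change to a leaf changes the root.
root = merkle_root([b"order-1", b"order-2", b"order-3"])
```

Only the fixed-size root needs to be published, which is what makes the structure attractive for compact commitments to many records.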

Forecasting VIX Spikes with Machine Learning and Market Regime Signals
Sirui Zhao, Kexin Deng, Yichen Gao, Yihan Zhao
Cornell Financial Engineering Manhattan, Cornell University
Systematic Risk Modeling / Volatility Forecasting, 2025
Mentored by Edward Tom (Senior Director at the Chicago Board Options Exchange)
paper

Developed a machine learning framework to forecast extreme spikes in the CBOE Volatility Index (VIX) using cross-asset market signals and macroeconomic indicators. The system integrates equity market features, options-implied volatility metrics, macro factors, and market microstructure signals to detect early-warning patterns preceding volatility explosions. The feature pipeline incorporates realized volatility, term structure dynamics of VIX futures, equity index momentum, credit spreads, liquidity indicators, and macroeconomic variables from the FRED database. Multiple classification and regression models including Gradient Boosting, Random Forest, and neural networks are trained to predict the probability of large VIX spikes and regime transitions. The model is evaluated on historical market stress events such as the COVID crash, inflation shocks, and volatility regime shifts, demonstrating improved early detection of volatility surges and enabling systematic portfolio protection strategies for risk-managed equity portfolios.
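
One way to turn "large VIX spike" into a classification target is a forward-looking label, sketched below. The definition used here (the maximum VIX level over the next `horizon` days exceeds today's level by at least `threshold`) and both default values are assumptions for illustration; the project's actual labeling rule is not stated above.

```python
def spike_labels(vix, horizon=5, threshold=0.30):
    """Label day t as 1 if VIX rises by >= threshold within the next `horizon` days.

    The last `horizon` observations get no label because their future
    window is incomplete.
    """
    labels = []
    for t in range(len(vix) - horizon):
        future_max = max(vix[t + 1 : t + 1 + horizon])
        labels.append(1 if future_max / vix[t] - 1.0 >= threshold else 0)
    return labels

# Hypothetical closing levels, for illustration only.
levels = [15, 16, 15, 22, 30, 28, 25]
labels = spike_labels(levels, horizon=2, threshold=0.30)
```

Because the label looks ahead, features for day t must use only information available at or before t to avoid leakage.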

News Sentiment Analysis and Summarization for Financial Markets
Sirui Zhao, Kexin Deng, Weihang Lin
Cornell Financial Engineering Manhattan, Cornell University
ORIE 5253 Asset Management Seminar, 2025
Mentored by Andrew Chin (Chief Artificial Intelligence Officer at AllianceBernstein)
paper / code / demo

Built an end-to-end platform for extracting structured investment signals from financial news. The system ingests company-specific news articles, applies FinBERT to generate article-level sentiment scores, and aggregates them into daily stock-level sentiment time series for S&P 500 constituents. The pipeline combines sentiment signals with price dynamics to visualize sentiment–return co-movements and generates simple Buy/Hold/Sell regimes based on smoothed sentiment indicators. An LLM-based agent summarizes daily news flows into concise investor-friendly insights, enabling faster interpretation of market-moving events. The platform integrates data ingestion, NLP-based sentiment modeling, signal aggregation, and interactive dashboards, demonstrating how large-scale financial text data can be transformed into actionable signals for quantitative asset management workflows.
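
The aggregation and regime steps can be sketched as follows for a single ticker. Article scores are assumed to lie in [-1, 1]; the exponential smoothing factor and the Buy/Sell cutoffs are illustrative assumptions rather than the platform's settings, and all function names are hypothetical.

```python
from collections import defaultdict

def daily_sentiment(articles):
    """Average article-level scores by date; input is (date, score) pairs."""
    buckets = defaultdict(list)
    for date, score in articles:
        buckets[date].append(score)
    return [(d, sum(s) / len(s)) for d, s in sorted(buckets.items())]

def ema(values, alpha=0.5):
    """Exponential moving average, seeded with the first value."""
    out, prev = [], None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out

def regimes(smoothed, buy=0.2, sell=-0.2):
    """Map smoothed sentiment to Buy/Hold/Sell using fixed cutoffs."""
    return ["Buy" if s > buy else "Sell" if s < sell else "Hold" for s in smoothed]

# Hypothetical scores for one ticker.
articles = [
    ("2025-01-02", 0.8), ("2025-01-02", 0.4),
    ("2025-01-03", -0.6),
    ("2025-01-06", -0.9),
]
daily = daily_sentiment(articles)
signal = regimes(ema([score for _, score in daily]))
```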

Clustering and Network-Based Portfolio Optimization with Shrinkage and CVaR
Kexin Deng, Olivia Guo, Weihang Lin
Cornell University — Operations Research and Information Engineering
ORIE Asset Management Project, 2025
Mentored by James Renegar (Class of 1912 Professor of Engineering Emeritus)
paper / code

Developed a machine-learning–driven portfolio construction framework combining clustering-based stock selection with shrinkage covariance estimation and CVaR optimization. Backtests on S&P 500 equities (2020–2023) show improved stability and downside risk control, achieving Sharpe ratios above 1.6 and significantly outperforming the S&P 500 benchmark.
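
CVaR (also called expected shortfall) is the average loss in the worst tail of the return distribution. The sketch below computes a simple historical estimate; it illustrates the risk measure only, not the optimization itself, and the rounding rule for the tail size and the sample returns are assumptions.

```python
def historical_cvar(returns, alpha=0.95):
    """Average of the worst (1 - alpha) share of losses, as a positive number."""
    losses = sorted((-r for r in returns), reverse=True)
    # Tail size rounded to the nearest integer, with at least one observation.
    n_tail = max(1, int(round(len(losses) * (1 - alpha))))
    tail = losses[:n_tail]
    return sum(tail) / len(tail)

# Hypothetical daily returns: the two worst losses are 10% and 5%.
sample = [0.01, -0.05, 0.02, -0.10, 0.03, 0.00, 0.01, -0.02, 0.02, 0.01]
cvar_80 = historical_cvar(sample, alpha=0.80)
```

A CVaR-based portfolio optimizer minimizes this quantity over portfolio weights, typically through a linear programming reformulation.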

High-Frequency Execution Strategy with Microstructure Signals
Kexin Deng, Yichen Gao, Michael Cao, Dora Xu
Cornell Financial Engineering — ORIE 5259 Market Microstructure & Algorithmic Trading
Mentored by Giuseppe Nuti (head of Machine Learning & AI for UBS’s Global Markets)
slides / code

Built a score-based high-frequency execution model using L2 limit order book signals, including momentum, order flow imbalance, volatility-to-spread ratio, bid–ask spread, and time pressure. Rolling optimization and queue-aware execution improve trade timing and reduce buy–sell spread by 40–125% relative to a TWAP benchmark.
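
As an illustration of one of these signals, the sketch below computes a best-level order flow imbalance between two consecutive book snapshots, following a definition commonly used in the microstructure literature. Whether the project uses this exact form is an assumption, and the snapshot field names are hypothetical.

```python
def order_flow_imbalance(prev, curr):
    """Best-level OFI between two snapshots.

    Each snapshot is a dict with bid_px, bid_sz, ask_px, ask_sz.
    Positive values indicate net buying pressure.
    """
    ofi = 0.0
    if curr["bid_px"] >= prev["bid_px"]:
        ofi += curr["bid_sz"]   # bid held or improved: current bid depth counts as demand
    if curr["bid_px"] <= prev["bid_px"]:
        ofi -= prev["bid_sz"]   # bid held or dropped: previous bid depth is netted out
    if curr["ask_px"] <= prev["ask_px"]:
        ofi -= curr["ask_sz"]   # ask held or improved: current ask depth counts as supply
    if curr["ask_px"] >= prev["ask_px"]:
        ofi += prev["ask_sz"]   # ask held or lifted: previous ask depth is netted out
    return ofi

# Hypothetical snapshots: bid depth grows and ask depth shrinks at unchanged prices.
before = {"bid_px": 10.00, "bid_sz": 100, "ask_px": 10.02, "ask_sz": 120}
after = {"bid_px": 10.00, "bid_sz": 150, "ask_px": 10.02, "ask_sz": 100}
ofi = order_flow_imbalance(before, after)
```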

Automotive Industry Portfolio Optimization and Risk Analysis
Kexin Deng, Michael Wachsman, Klaus Stier, Olivia Guo
Cornell ORIE 5630 — Financial Data Science
paper / code

Quantitative sector analysis of U.S. auto manufacturers (Tesla, GM, Ford) implemented in R (quantmod). Constructed a tangency portfolio using Mean–Variance Optimization with Treasury-bill risk-free rates and benchmarked against the industry index. Extended analysis using CAPM regressions and tail-risk metrics (VaR, Expected Shortfall) to evaluate systematic exposure and downside risk.
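
The tangency portfolio has weights proportional to the inverse covariance matrix applied to excess returns, normalized to sum to one. The project itself was implemented in R; the Python sketch below is a two-asset illustration with made-up inputs, and `tangency_weights_2` is a hypothetical helper.

```python
def tangency_weights_2(mu, cov, rf):
    """Tangency weights for two assets: w proportional to inv(cov) @ (mu - rf).

    Assumes the covariance matrix is invertible and the raw weights
    do not sum to zero.
    """
    excess = [mu[0] - rf, mu[1] - rf]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Closed-form 2x2 inverse applied to the excess-return vector.
    raw = [(d * excess[0] - b * excess[1]) / det,
           (-c * excess[0] + a * excess[1]) / det]
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical inputs: uncorrelated assets with 20% and 10% volatility.
weights = tangency_weights_2(mu=[0.10, 0.06],
                             cov=[[0.04, 0.0], [0.0, 0.01]],
                             rf=0.02)
```

With uncorrelated assets, each raw weight is the asset's excess return divided by its variance, so the lower-volatility asset receives the larger share in this example.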

Siamese ResNet50 for Paired Image Classification
Kexin Deng, Rui Li
Cornell CS ML
code

This project addresses a relational prediction task where the objective is not to classify individual inputs, but to model the interaction between two inputs. Using a Siamese architecture with a shared ResNet50 backbone adapted for grayscale images, each input is encoded into a comparable feature representation, which is then combined through concatenation and passed into a fully connected classifier to capture pairwise dynamics. Beyond architecture design, performance gains were driven primarily by training strategy, including mixed precision for stability in deep networks, balanced binary cross-entropy to address class imbalance, cosine warm restarts for optimization, and F1-based threshold tuning to better align training with evaluation metrics. The final model achieved 0.9048 accuracy on the private leaderboard (Top 4 out of 91), significantly outperforming CNN and ResNet18 baselines. This project highlights that for relational tasks, explicitly modeling interaction structure is often more critical than increasing model complexity, and demonstrates how training methodology can be as important as model architecture in achieving state-of-the-art performance.
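
The F1-based threshold tuning step can be sketched as a search over candidate cutoffs on held-out predictions. The candidate grid, the sample labels, and the helper names are hypothetical; the sketch does not reproduce the project's training code.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_threshold(y_true, probs, candidates):
    """Pick the cutoff that maximizes F1 on validation data."""
    def score(th):
        return f1_score(y_true, [1 if p >= th else 0 for p in probs])
    return max(candidates, key=score)

# Hypothetical validation labels and predicted probabilities.
y_val = [0, 0, 1, 1, 1, 0]
p_val = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
threshold = best_threshold(y_val, p_val, candidates=[0.3, 0.5, 0.7])
```

Here the default cutoff of 0.5 misses one positive, while 0.3 recovers it at the cost of one false positive, which yields the higher F1.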

The Impact of the COVID-19 Pandemic on Chinese Mutual Funds Investors
Kexin Deng
NYU, Alipay
Mentored by Guodong Chen, Yiqing Lv
paper

This research explores the impact of the COVID-19 pandemic on Chinese mutual fund investors across wealth levels, using data from the Alipay AntFin Open Research Laboratory and two-way fixed effects and difference-in-differences analyses. Wealthier investors were observed cashing out and reducing called investments while maintaining higher overall investment levels post-pandemic. The study also examines changes in the disposition effect, portfolio rebalancing, timing, and alpha risk. City-level severity measures suggest that mutual fund turnover and net flows increased where the pandemic was more severe, with investors in harder-hit cities favoring stock and mixed funds over money market funds. The findings suggest that extreme health events can shift investor behavior beyond traditional paradigms, offer a Chinese mutual fund perspective, and argue for considering inequality when assessing financial markets during COVID-19 and similar health crises.
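
In its simplest two-group, two-period form, a difference-in-differences estimate is the change in the treated group minus the change in the control group. The sketch below shows only that arithmetic with made-up numbers; the study's actual specification uses two-way fixed effects on panel data, and the grouping shown here is a hypothetical example.

```python
def mean(xs):
    return sum(xs) / len(xs)

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """(treated change) minus (control change), using group means."""
    return (mean(treat_post) - mean(treat_pre)) - (mean(ctrl_post) - mean(ctrl_pre))

# Hypothetical outcome values, e.g. monthly fund turnover per investor group.
effect = did_estimate(
    treat_pre=[10, 10], treat_post=[13, 15],
    ctrl_pre=[9, 9], ctrl_post=[10, 12],
)
```

The estimate is interpretable as a treatment effect only under the parallel-trends assumption, meaning both groups would have moved together absent the shock.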

Miscellanea

Research Assistant

Research Assistant — Irene Aldridge, Cornell
May 2025 – Sep 2025
Conducted research on crypto and zk-rollups, exploring decentralized dynamics.

Research Assistant — NYU Shanghai's Volatility Institute (VINS)
Oct 2023 – Jun 2024
Conducted empirical research on climate-related financial risks, including termination risk modeling inspired by Robert F. Engle’s framework.

Research Assistant — Yiqing Lv, NYU
Sep 2023 – Jun 2024
Developed NLP-driven factor models and data-asset pricing signals using Chinese listed company filings, and conducted research on global corporate governance, including say-on-pay regulation and CEO compensation.

Research Assistant — Guodong Chen, NYU
Mar 2023 – Aug 2024
Built econometric models on large-scale Alipay transaction data to analyze mutual fund investor behavior, including portfolio rebalancing, timing, and cross-sectional heterogeneity.

Honors & Awards

WorldQuant IQC – Gold Level
MCM/ICM (COMAP) – Honorable Mention (2023)
University Honors Scholar & Founders Day Award
Global Quintessence Scholarship
Dean’s List (2020–2023)

Teaching

Teaching Assistant — Cornell Johnson Graduate School of Management
May 2025 – Aug 2025

Courses
Cornell Johnson MS in Business Analytics
• BANA 5160 Capstone Project
• BANA 5040 Teamwork and Collaboration

Cornell Johnson MBA
• NCC 5040 Leading Teams

Teaching Assistant — Cornell Engineering
Sep 2025 – Dec 2025

Cornell Master of Engineering in Financial Engineering
• ORIE 5257 Special Topics in Financial Engineering VI (AI/ML Applications to Trading & Execution)

Certifications

Financial Data Science Certificate — Cornell University

Academic Engagement

Conference Volunteer
Participated in leading conferences in quantitative finance and financial economics, supporting event operations and engaging with research on asset pricing, market microstructure, and AI-driven finance.

2023 VINS Annual Conference & Festschrift in Honor of Robert F. Engle
2024 Future of Finance & AI Conference — Cornell Financial Engineering Manhattan