Subhojit Biswas

I am an experienced professional with a Ph.D. in Operations Research from Texas A&M University, specializing in game theory, reinforcement learning, machine learning, and deep learning algorithms. My expertise extends to statistical analysis, artificial intelligence, and data-driven decision-making in complex systems. With a strong foundation in energy markets, infrastructure resiliency, and supply chain management, I focus on developing robust optimization models that drive efficiency and strategic growth. My work also includes designing and optimizing trading strategies for financial markets, leveraging advanced analytical techniques to navigate dynamic and uncertain environments.

Professional Experience

Barclays

Modified statistical algorithms to reduce the deviation of trade order execution performance metrics (Hui-Heubel ratio, turnover rate, fill rate, and average slippage) from ~10% to less than 1%.
Developed a critical client offering (Capital Commitment) to mitigate the liquidity risk during high-volume trading of equity products. Awarded “The Best Quarterly Performer,” a global recognition at Barclays, Q3, 2014.
Statistically analyzed the impact of conditional orders on the market to determine price movement at 5-second intervals and to detect anomalies in counterparty trading patterns.
Optimized billing framework for options and equities using the QKdb platform for clients based on real-time intraday data.

Fidelity Investments

Analyzed email campaign performance by classifying and predicting revenue generated from targeted customer segments based on sex, age, and demographics.
Used machine learning algorithms such as Lasso Regression, Regression Splines, Support Vector Machine, AdaBoost, and Random Forest.
Achieved 90% accuracy in revenue prediction with the Support Vector Machine, optimizing marketing strategies and customer targeting.

JPMorgan Chase & Co.

Developed an algorithm using dimensionality reduction techniques (PCA, LDA), autoencoders, and tree-based models (Random Forest, GBT) on the implied volatility surface in the tenor and moneyness dimensions to determine the significant factors (level shift, skew, and curvature) responsible for anomalies in the volatility surface over a rolling time period.
Designed a customized statistical library in Python for various parameters such as implied volatility, skew & curvature for different moneyness and stochastic volatility parameters of several indices to determine their statistical significance.
Developed a methodology to calculate the implied volatility for a basket of options using a geodesic approach.
Designed the success criteria in QKdb based on time series analysis (ARIMA, GARCH, etc.) for a volume profile engine used for forecasting intraday trade volume distribution to downstream algorithms. Achieved an accuracy of 87.5%.
Developed an algorithm for ‘Implementation Shortfall’ (a trading strategy) utilizing the concept of mean-variance optimization.

Skills

AMPL

Python

VS Code

GitHub

MySQL

QKdb

MS Project

MATLAB

Visio

Tableau

Gurobi

Minitab

CPLEX

C++

Research Experience

Ph.D. Working Papers

A Review on Response Strategies in Infrastructure Network Restoration

Subhojit Biswas, Bahar Cavdar, Joseph Geunes. (Draft ready for Submission)

This paper reviews the logistics decisions and mathematical models in operations research for critical infrastructure restoration post-disruptions. Analyzed response methodologies for power, road, water, oil, and gas networks, focusing on resource allocation, scheduling, routing, and repair. Highlighted computational challenges in real-time decision-making and identified open research questions for future work. (Read more...)

Repair Crew Routing for Infrastructure Network Restoration under Incomplete Information

Subhojit Biswas, Bahar Cavdar, Joseph Geunes. (Draft ready for Submission)

This paper introduces the Traveling Repairman Network Restoration Problem (TRNRP), where a repair crew must restore service in a disrupted infrastructure network with incomplete fault information. We model the problem as a finite-horizon Markov Decision Process and develop a reinforcement learning-based solution, incorporating structural results to eliminate suboptimal moves and state aggregation techniques to manage complexity. Computational experiments demonstrate the effectiveness of our approach across various parameter settings compared to benchmark methods. (Read more...)

The Price of Flexibility in Electricity Markets

Subhojit Biswas, Bahar Cavdar, Alfredo Garcia, Joseph Geunes. (Draft ready for Submission)

In this paper, we examine how electricity markets implicitly price flexibility through persistent day-ahead premiums, which arise due to the risk of low-probability but high-impact disruptions. Using a two-stage game-theoretic model, we demonstrate that arbitrageurs' participation is constrained by downside risk, enabling flexible generators to exert market power by withholding capacity. Furthermore, we show that as arbitrageur participation increases, suppliers' expected returns decline more sharply, while arbitrageurs' returns become increasingly suppressed. (Read more...)

Ph.D. Publications

Replicating the performance of a portfolio of stocks using minimum dominating set

Subhojit Biswas. Expert Systems with Applications, Vol 263, 125797, 2025.

This study applies a graph theory framework to construct a tracking portfolio by modeling assets as nodes and correlations as weighted edges. Using Minimum Dominating Sets (MDS), we efficiently manage large portfolios, solving the NP-hard problem with integer linear programming (Gurobi solver) and heuristic algorithms (greedy and stochastic local search). Computational results demonstrate the effectiveness of our approach in capturing expected returns while significantly reducing computation time. (Read more...)
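The greedy heuristic mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the correlation matrix, the 0.6 threshold used to create edges, and the function names `build_graph`, `greedy_dominating_set`, and `is_dominating` are all hypothetical choices made for illustration.

```python
from itertools import combinations

def build_graph(corr, threshold):
    """Link two assets when |correlation| >= threshold (assumed edge rule)."""
    n = len(corr)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if abs(corr[i][j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def greedy_dominating_set(adj):
    """Repeatedly pick the node that covers the most not-yet-dominated nodes."""
    undominated = set(adj)
    chosen = []
    while undominated:
        # Ties are broken by the smallest node index.
        best = max(sorted(adj), key=lambda v: len(({v} | adj[v]) & undominated))
        chosen.append(best)
        undominated -= {best} | adj[best]
    return chosen

def is_dominating(adj, nodes):
    """Check that every node is selected or adjacent to a selected node."""
    covered = set(nodes)
    for v in nodes:
        covered |= adj[v]
    return covered == set(adj)

# Toy correlation matrix for five assets (made-up numbers).
corr = [
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1, 0.0],
    [0.8, 0.2, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.7],
    [0.0, 0.0, 0.1, 0.7, 1.0],
]
adj = build_graph(corr, threshold=0.6)
tracking = greedy_dominating_set(adj)
print(tracking)  # [0, 3]
```

In this toy example, assets 0 and 3 together are selected or adjacent to every other asset, so they form a small candidate tracking set. A greedy pass gives no optimality guarantee, which is why an exact integer programming formulation is still useful as a benchmark.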

GitHub

Optimal investment policy in sharing and standalone economy for solar PV panel under operational cost

Subhojit Biswas. Solar Energy, Vol 264, 112003, 2023.

This research investigates the resilience of solar photovoltaic (PV) investments against natural disruptions like hurricanes and hailstorms. By exploring a sharing economy model and applying game theory, it aims to optimize investment strategies, enhance system resilience, and promote sustainable solar energy expansion despite operational challenges and disaster risks. (Read more...)

GitHub

Other Publications

Portfolio Optimization Managing Value at Risk under Heavy Tail Return, using Stochastic Maximum Principle

Subhojit Biswas, Mrinal Kanti Ghosh, Diganta Mukherjee. Stochastic Analysis and Applications, Vol 39 (6), 1025-1049, 2021.

The research addresses portfolio optimization for an investor holding a risky and a risk-free asset. It employs stochastic maximum principles to manage Value at Risk (VaR) in heavy-tailed returns while maximizing median returns. The model uses non-parametric calibration and quantile-based optimization to adapt to high-frequency trading environments. (Read more...)

Multi-asset Generalized Variance Swaps in Barndorff-Nielsen and Shephard model

Subhojit Biswas, Diganta Mukherjee, Indranil Sengupta. International Journal of Financial Engineering, Vol 7 (04), 2050051, 2020.

This research develops pricing methods for generalized variance swaps in financial markets using the Barndorff-Nielsen and Shephard (BNS) model. It extends covariance-based approaches to multi-asset portfolios, leveraging eigenvalues and covariance matrix traces for risk hedging, with numerical demonstrations applicable to commodities and other financial sectors. (Read more...)

A Proposal for Multi-asset Generalized Variance Swaps

Subhojit Biswas, Diganta Mukherjee. Annals of Financial Economics, Vol 14 (04), 1950019, 2018.

This work introduces generalized variance swaps for multi-asset portfolios, focusing on trace and maximum eigenvalue measures of covariance matrices under Markov-modulated volatilities. It provides pricing methods and compares numerical examples, suggesting eigenvalue swaps may offer cost advantages while addressing variance and correlation dynamics in financial markets. (Read more...)

Multi-asset portfolio optimization with stochastic Sharpe ratio under drawdown constraint

Subhojit Biswas, Saif Jawaid, Diganta Mukherjee. Annals of Financial Economics, Vol 15 (01), 2080001, 2020.

We study an investor’s portfolio optimization problem under a drawdown constraint in a market with local stochastic volatility. The investor seeks to maximize the expected utility of terminal wealth relative to the maximum wealth achieved over a fixed horizon, with asset selection guided by pairs trading. Since closed-form solutions for the value function and optimal strategy are unavailable, we approximate them using coefficient series expansion and finite difference methods, leveraging a risk tolerance function to simplify computations and compare stochastic versus constant volatility scenarios. (Read more...)

Discrete Time portfolio optimization managing value at risk under heavy tail return distribution

Subhojit Biswas, Diganta Mukherjee. International Journal of Mathematical Modelling and Numerical Optimisation, Vol 10 (04), 424-450, 2020.

This research optimizes portfolio returns under heavy-tailed stock price distributions while managing Value at Risk (VaR). It applies dynamic programming and Markov Decision Processes to derive optimal strategies for known and unknown return distributions, addressing transaction costs and maximizing value functions through numerical and parametric methods. (Read more...)

Selected Projects

Artificial Neural Network for Image Classification

This assignment involves implementing and training a Convolutional Neural Network (CNN) using PyTorch for image classification on the FashionMNIST dataset. The project focuses on building and optimizing a CNN classifier for image recognition. GPU acceleration and efficient training techniques were implemented to enhance performance. Additionally, data augmentation and normalization were used to improve model generalization. The assignment also includes a comparison of different configurations to analyze model performance and training stability. The following steps were performed:

GitHub

Details

Installed necessary dependencies (torchinfo for model inspection). Checked for GPU availability to speed up training.
Downloaded the FashionMNIST dataset. Applied data transformations such as Gaussian blur, tensor conversion, and normalization. Implemented helper functions for data visualization.
Defined a custom CNN architecture within the parameter limit of 100,000 parameters. Included convolutional layers, batch normalization, dropout layers, and fully connected layers. Used ReLU activation functions and max-pooling layers to enhance feature extraction.
Wrapped the data loader with a function that moves tensors to the correct device. Trained the CNN model using backpropagation and stochastic gradient descent.
Evaluated the model’s performance using accuracy and loss metrics. Compared different configurations to assess their impact on training stability.
Ensured compliance with the ≤100,000 parameter limit. Achieved an accuracy of 85%.
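The 100,000-parameter budget can be checked by hand before any training. The sketch below counts learnable parameters for a hypothetical architecture on 28x28 grayscale inputs. The layer sizes are assumptions for illustration, not the model used in the project. The counting formulas match the learnable weights and biases of standard convolutional, batch normalization, and fully connected layers.

```python
def conv2d_params(in_ch, out_ch, kernel):
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + out_ch

def batchnorm_params(channels):
    """One learnable scale and one learnable shift per channel."""
    return 2 * channels

def linear_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out

# Hypothetical layout: two 3x3 conv blocks (padding 1), each followed by
# 2x2 max-pooling, so the 28x28 feature map shrinks to 14x14 and then 7x7.
layers = [
    ("conv1 (1 -> 16)", conv2d_params(1, 16, 3)),
    ("bn1", batchnorm_params(16)),
    ("conv2 (16 -> 32)", conv2d_params(16, 32, 3)),
    ("bn2", batchnorm_params(32)),
    ("fc1 (32*7*7 -> 48)", linear_params(32 * 7 * 7, 48)),
    ("fc2 (48 -> 10)", linear_params(48, 10)),
]

total = sum(count for _, count in layers)
for name, count in layers:
    print(f"{name:22s}{count:>8,d}")
print(f"{'total':22s}{total:>8,d}")  # total is 80,698
assert total <= 100_000
```

Most of the budget goes to the first fully connected layer, so the number of channels entering the flatten step and the width of that layer are the main levers for staying under the limit. ReLU, dropout, and max-pooling add no learnable parameters.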

Data Science Project for GLOBE Dataset and K-12 Program

This study applies data science and machine learning techniques to improve the accuracy and reliability of student-collected climate data. By addressing issues like missing and inconsistent temperature readings, the research leverages predictive modeling, correlation analysis, and geospatial methods to refine climate observations. Key findings highlight the effectiveness of automated data processing and the potential for integrating satellite data for enhanced environmental monitoring.

Details

This study leverages data science and machine learning techniques to enhance the quality and reliability of student-driven climate observations using Python 3.10.
It addresses challenges such as missing and inconsistent temperature data through validation, outlier detection, and predictive modeling using Random Forest Regressor and CatBoost Regressor for data imputation. Correlation analysis reveals weak relationships between elevation and temperature, suggesting that other meteorological factors play a larger role. Accuracies of 70% and 78% were obtained for different models.
Additionally, logistic regression models are used to predict sky visibility and sky color based on environmental factors, while geospatial and time-series analysis highlight regional and seasonal trends in climate observations. An accuracy of 65% was obtained.
The findings emphasize the need for automated data cleaning pipelines and integration of satellite data to improve climate monitoring and environmental analysis.

Financial Risk Analysis Using Large Language Models

Developed an LLM-driven financial risk assessment tool that analyzes earnings reports, financial statements, and market news to assess company risk levels. The model provides real-time risk scoring by processing textual data from SEC filings and financial news using retrieval-augmented generation (RAG) and sentiment analysis.

Details

Designed an LLM-based pipeline to extract insights from 10-K, 10-Q reports, and financial news articles using OpenAI GPT-4 & LangChain.
Implemented sentiment analysis on company disclosures and analyst reports using Hugging Face Transformers to identify market sentiment trends.
Developed a financial risk scoring system integrating NER (Named Entity Recognition) for detecting key risk factors such as litigation, regulatory changes, and credit risks.
Integrated real-time data retrieval from SEC's EDGAR database and financial news APIs to ensure up-to-date risk assessment.
Deployed the solution using FastAPI and PostgreSQL, enabling scalable insights for financial analysts and portfolio managers.
Achieved a 30% improvement in risk prediction accuracy compared to traditional risk models by integrating NLP-based insights into financial models.

New Clique Relaxation (2-stable) using Integer Programming

The project focuses on a graph-theoretic clique relaxation problem where the goal is to find the largest induced subgraph whose independence number is ≤ 2. This is the s = 2 case of the s-stable cluster problem. The study involves an extensive literature review, the development of exact algorithms, and computational experiments to determine optimal solutions using Python 3.10.

GitHub

Details

For s = 2, we implement two exact algorithms: the Branch & Cut (BC) algorithm and the Combinatorial Branch & Bound (A2) algorithm.
The BC algorithm leverages column generation techniques, solving a restricted master problem (RMP) iteratively while generating valid inequalities (cuts) to refine solutions. It is particularly efficient for dense graphs and can handle up to 205 nodes efficiently, though performance declines as graph density decreases.
On the other hand, the A2 algorithm follows a hereditary structure approach that systematically explores subgraphs using Russian Doll Search (RDS) with backtracking, making it more suitable for sparse graphs but computationally intensive for graphs beyond 50 nodes due to memory constraints.
Computational experiments using Erdős–Rényi random graphs confirm that BC is more scalable, whereas A2 provides strong results for smaller, low-density graphs. The findings highlight the trade-offs between exact optimization approaches, emphasizing the need for stronger constraints in BC for feasibility and enhanced scalability in A2.
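To make the problem definition concrete, the sketch below solves tiny instances by brute force. It is neither the BC nor the A2 algorithm, only a reference check of the definition: a vertex set is an s-stable cluster when no s + 1 of its vertices are pairwise non-adjacent. The function names and the example graph are illustrative assumptions.

```python
from itertools import combinations

def independence_number_at_most(adj, nodes, s):
    """True if the subgraph induced by `nodes` has no independent set of size s + 1."""
    for group in combinations(nodes, s + 1):
        if all(v not in adj[u] for u, v in combinations(group, 2)):
            return False  # found s + 1 pairwise non-adjacent vertices
    return True

def largest_s_stable_cluster(adj, s=2):
    """Enumerate subsets from largest to smallest (exponential time, tiny graphs only)."""
    vertices = sorted(adj)
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if independence_number_at_most(adj, subset, s):
                return list(subset)
    return []

# Path graph 0-1-2-3-4: vertices 0, 2, 4 are pairwise non-adjacent,
# so the whole graph has independence number 3 and is not 2-stable.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
adj = {v: set() for v in range(5)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

cluster = largest_s_stable_cluster(adj, s=2)
print(cluster)  # [0, 1, 2, 3]
```

The property is hereditary: every subset of a 2-stable cluster is also 2-stable. Search procedures such as Russian Doll Search rely on that structure to prune subproblems, whereas this enumeration uses it only implicitly.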

Predicting Credit Card Default

This project focuses on developing predictive models for credit card default using various machine learning techniques. By addressing data imbalance, selecting significant features, and evaluating multiple models (logistic regression, decision trees, random forests, neural networks, and support vector machines), the study identifies the most effective approach for accurate predictions. Performance metrics such as ROC curves, misclassification rates, and lift charts highlight the strengths of each model.

Details

The credit card data has an imbalanced distribution. We performed downsampling to create a balanced dataset. Chi-Square analysis was used to determine the most significant variables affecting the target variable. No missing values were found, so imputation was not required. We used 70% of data for training and 30% for validation.
We applied Logistic Regression using Stepwise, Forward, and Backward selection methods. In Stepwise Regression, we iteratively added or removed variables to find the best model. In Forward Selection, we started with an empty model and added variables based on improvements in Akaike’s Information Criterion (AIC). In Backward Selection, we started with all variables and removed the least significant ones. The results showed that Stepwise Selection achieved 70.21% accuracy on the training set and 70.56% on the validation set, while Forward and Backward Selection provided similar accuracy (~70.44% on the validation set).
We applied decision trees (Two-Branch, Four-Branch, and Interactive). In the Two-Branch Decision Tree, 11 leaf nodes provided the best balance between training and validation accuracy. In the Four-Branch Decision Tree, 175 leaves provided the best fit on the validation data. Lastly, the Interactive Decision Tree found 22 leaves to be the optimal stopping point.
Next, we used the Random Forest to ensemble multiple decision trees to improve performance. We used 100 trees for training and obtained validation accuracy of 71.40%, which was better than logistic regression and decision trees.
For the neural network, we used a multi-layer perceptron (MLP) model with 6 hidden layers, trained with a high learning rate and momentum. We obtained a validation accuracy of 71.19%, which is slightly lower than Random Forest but better than Decision Trees.
Finally, we tested the Support Vector Machine (SVM) with a Polynomial Kernel of degree 2 and obtained a validation accuracy of 70.34%.
To compare and validate the models, we used the following indices: ROC Curve (Receiver Operating Characteristic), Cumulative Lift (Lift Chart), Gini Coefficient, Misclassification Rate, Average Squared Error, Response Percentage, and Lift and Gain Values. Our observations showed that Random Forest performed the best overall, achieving the highest ROC score and the lowest misclassification rate. The Neural Network performed well in terms of error minimization, yielding the lowest average squared error. While SVM and Logistic Regression delivered decent performance, they were outperformed by ensemble methods (Random Forest) and deep learning models.
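Three of the comparison indices listed above can be computed directly from predicted scores. The sketch below uses made-up labels and scores, and it assumes the common credit-scoring convention Gini = 2 × AUC − 1. The project's own tooling may define or report these quantities differently.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini_coefficient(labels, scores):
    """Gini under the assumed convention Gini = 2 * AUC - 1."""
    return 2 * roc_auc(labels, scores) - 1

def misclassification_rate(labels, scores, cutoff=0.5):
    """Share of cases whose thresholded prediction differs from the label."""
    preds = [1 if s >= cutoff else 0 for s in scores]
    return sum(p != y for p, y in zip(preds, labels)) / len(labels)

# Illustrative data: 1 = default, 0 = no default; scores are predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]

auc = roc_auc(labels, scores)
gini = gini_coefficient(labels, scores)
rate = misclassification_rate(labels, scores)
print(auc, gini, rate)  # 0.9375 0.875 0.25
```

AUC and Gini depend only on how the model ranks cases, while the misclassification rate depends on the chosen cutoff. This is one reason different models can lead on different metrics.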

Teaching Experience

Instructor of Record: ISEN 370 Production Systems Engineering, Texas A&M University, College Station, USA

In this course, I covered essential methods in forecasting, inventory management, supply chain analytics, queuing systems, material requirements planning, and operations scheduling. Techniques such as exponential smoothing, EOQ models, network optimization, queuing theory, and scheduling algorithms provided a comprehensive framework for efficient decision-making in supply chain and operations management.

Details

Forecasting of Supply and Demand: Exponential Smoothing, Moving Average, Regression Analysis, Holt's Method, Winters' Method.
Inventory Management: Basic Economic Order Quantity (EOQ) Model, EOQ Model with Backordering, Newsvendor Problem, Finite Production Rate, Finite Production Rate with Backordering, Quantity Discount Model, Resource Constrained Multi-product System, (Q, R) Policy, (s, S) Policy, Inventory Horizon with Backordering for Newsvendor Problem.
Supply Chain Analytics: Capacity Growth Planning, Transportation Problem, Delivery Routing Problem, Network Optimization Model.
Queuing System: Poisson Distribution, Exponential Distribution, Memoryless Property, Little's Law, M/M/1 Queue, M/M/s Queue, G/G/1 Queue, G/G/s Queue.
Material Requirements Planning: EOQ Lot Sizing, Silver-Meal Heuristic, Least Unit Cost, Part Period Balancing, Wagner-Whitin Algorithm, Backward Dynamic Programming, Shortest Path Labeling Algorithm.
Operations Scheduling: Sequencing Rules (First Come First Serve, Earliest Due Date, Shortest Processing Time, Critical Ratio), Johnson's Algorithm, Stochastic Scheduling, Three Machine Flow Shop, Lawler's Algorithm.
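As one example from the scheduling topics, Johnson's algorithm for a two-machine flow shop can be sketched in a few lines. The job data and function names are made up for illustration and are not taken from the course material.

```python
def johnson_sequence(jobs):
    """jobs maps a job name to (time on machine 1, time on machine 2).

    Jobs that are at least as fast on machine 1 go first, in increasing order
    of their machine-1 time. The remaining jobs go last, in decreasing order
    of their machine-2 time.
    """
    first = sorted((j for j in jobs if jobs[j][0] <= jobs[j][1]),
                   key=lambda j: jobs[j][0])
    last = sorted((j for j in jobs if jobs[j][0] > jobs[j][1]),
                  key=lambda j: jobs[j][1], reverse=True)
    return first + last

def makespan(jobs, order):
    """Completion time of the last job on machine 2 for a given sequence."""
    end1 = end2 = 0
    for j in order:
        t1, t2 = jobs[j]
        end1 += t1                     # machine 1 never idles
        end2 = max(end1, end2) + t2    # machine 2 waits for the job or for itself
    return end2

jobs = {"A": (3, 6), "B": (5, 2), "C": (1, 2), "D": (7, 5), "E": (6, 6)}
order = johnson_sequence(jobs)
print(order, makespan(jobs, order))         # ['C', 'A', 'E', 'D', 'B'] 24
print(makespan(jobs, sorted(jobs)))         # alphabetical order gives 28
```

In this instance the makespan of 24 equals the total machine-1 time (22) plus the shortest machine-2 time (2). No sequence can do better than that bound, so the schedule is optimal here.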

Guest Lecturer: Dynamic Programming in Quantitative Finance, Indian Statistical Institute, Kolkata, India

Explored stochastic control and dynamic programming for optimizing investment and consumption strategies under uncertainty in financial markets. The study modeled an investor's decision to allocate wealth between risk-free assets (e.g., bonds) and risky assets (e.g., stocks) to maximize expected utility over time.

Details

Using dynamic programming, the optimal allocation was determined by solving the Bellman equation, incorporating stochastic asset returns modeled as a normal distribution. The lecture was extended to multi-asset portfolios, where returns followed a multivariate normal distribution, and risk preferences were incorporated through the CRRA utility function.
The lecture further explored continuous-time stochastic control, solving a stochastic differential equation (SDE) for wealth evolution using the Hamilton-Jacobi-Bellman (HJB) equation. Python implementations included Monte Carlo simulations, backward induction for optimal policies, and Euler-Maruyama methods for continuous-time optimization.
Results demonstrated how investors adjusted portfolio allocations based on risk and expected returns, showcasing the power of computational finance in real-world investment decision-making.
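The continuous-time part can be illustrated with a small sketch. It assumes the textbook setting of one risky asset following geometric Brownian motion, one risk-free asset, and CRRA utility of terminal wealth. In that setting the HJB equation gives a constant optimal risky weight of (mu − r) / (gamma × sigma²). The parameter values and function names are illustrative and are not the lecture's own code.

```python
import math
import random

def merton_fraction(mu, r, sigma, gamma):
    """Closed-form risky-asset weight for CRRA utility with constant coefficients."""
    return (mu - r) / (gamma * sigma ** 2)

def crra_utility(wealth, gamma):
    """CRRA utility; gamma == 1 is the log-utility limit."""
    if gamma == 1:
        return math.log(wealth)
    return wealth ** (1 - gamma) / (1 - gamma)

def simulate_terminal_wealth(w0, pi, mu, r, sigma, horizon, steps, rng):
    """Euler-Maruyama for dW = W (r + pi (mu - r)) dt + W pi sigma dB."""
    dt = horizon / steps
    wealth = w0
    for _ in range(steps):
        dB = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        wealth += wealth * (r + pi * (mu - r)) * dt + wealth * pi * sigma * dB
    return wealth

def expected_utility(pi, gamma, n_paths, rng, **market):
    """Monte Carlo estimate of E[U(W_T)] for a constant risky weight pi."""
    total = 0.0
    for _ in range(n_paths):
        total += crra_utility(simulate_terminal_wealth(1.0, pi, rng=rng, **market), gamma)
    return total / n_paths

# Hypothetical market: 8% drift, 2% risk-free rate, 20% volatility, 1-year horizon.
market = dict(mu=0.08, r=0.02, sigma=0.2, horizon=1.0, steps=50)
gamma = 3.0
pi_star = merton_fraction(market["mu"], market["r"], market["sigma"], gamma)

rng = random.Random(0)
for pi in (0.0, pi_star, 1.0):
    estimate = expected_utility(pi, gamma, 2000, rng, **market)
    print(f"pi = {pi:.2f}  estimated E[U] = {estimate:.4f}")
```

With these numbers the closed-form weight is 0.5. The Monte Carlo estimates are noisy, so comparing allocations reliably needs many more paths or common random numbers across the candidate weights.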

Guest Lecturer: Swap Derivatives in Quantitative Finance, Indian Statistical Institute, Kolkata, India

The swap lectures covered the fundamentals of interest rate swaps, currency swaps, and their applications in financial markets. The discussions focused on how swaps are used for hedging interest rate risk, managing currency exposure, and arbitrage opportunities.

Details

Key concepts included fixed-for-floating rate exchanges, pricing models, and counterparty risk considerations. Practical examples demonstrated how corporations, banks, and investors utilize swaps for risk management and speculative strategies.
The lectures also explored valuation techniques using present value formulas and yield curves, highlighting the role of swaps in derivative markets and structured finance.
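The present-value approach to a fixed-for-floating swap can be sketched as follows. The example assumes a single discount curve, continuously compounded zero rates, and valuation at inception, where the floating leg is worth 1 minus the final discount factor per unit of notional. The curve values and function names are illustrative assumptions.

```python
import math

def discount_factor(zero_rate, t):
    """Discount factor from a continuously compounded zero rate (assumed convention)."""
    return math.exp(-zero_rate * t)

def annuity_and_final_df(times, zero_rates):
    """PV of 1 unit paid per year over each accrual period, plus the last discount factor."""
    dfs = [discount_factor(z, t) for z, t in zip(zero_rates, times)]
    accruals = [t - s for s, t in zip([0.0] + times[:-1], times)]
    annuity = sum(a * d for a, d in zip(accruals, dfs))
    return annuity, dfs[-1]

def par_swap_rate(times, zero_rates):
    """Fixed rate that makes the fixed leg and floating leg equal in value."""
    annuity, final_df = annuity_and_final_df(times, zero_rates)
    return (1.0 - final_df) / annuity

def payer_swap_value(fixed_rate, times, zero_rates, notional=1.0):
    """Value to the party paying fixed: floating-leg PV minus fixed-leg PV."""
    annuity, final_df = annuity_and_final_df(times, zero_rates)
    return notional * ((1.0 - final_df) - fixed_rate * annuity)

# Hypothetical upward-sloping zero curve with annual payments for three years.
times = [1.0, 2.0, 3.0]
zero_rates = [0.020, 0.025, 0.030]

par = par_swap_rate(times, zero_rates)
print(f"par swap rate: {par:.4%}")
print(f"payer value at 2.5% fixed on 1,000,000 notional: "
      f"{payer_swap_value(0.025, times, zero_rates, 1_000_000):,.2f}")
```

A swap struck at the par rate has zero value at inception. A payer swap with a lower fixed rate has positive value, since the payer then pays less than the curve implies.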

Student Feedback

Many students found the lecture pacing to be well-structured and felt that the review sessions were particularly helpful in preparing for exams. While students valued the extra credit opportunities, some suggested that quizzes should carry less weight in the overall grading scheme. Overall, the feedback highlights a well-organized course with fair grading and supportive review sessions that aid student learning. Students expressed their appreciation for the practical applications and support provided throughout the course. Some comments:

Details

Fall 2024: 48 students
- "Professor Biswas did provide lots of opportunities for extra credit, and the exams were graded very nicely with lots of partial credit, which I greatly appreciate."
- "I believe that the lectures are paced well, and the reviews are very helpful when it comes to taking the exams."
- "I did like that there were opportunities to earn extra credit, but I think the quizzes should be weighted less."
Spring 2024: 45 students
- "The real-world applications were much more interesting and felt more beneficial to learn."
- "I walked away learning a ton about production systems and industrial engineering as a whole."
Fall 2023: 70 students
- "Those extra credit [assignments] make you think."
- "I learned the importance of how I needed to be way more attentive to detail... I could not have passed this class without your help and without your teaching. This class was a monumental step to get there."

Conferences

The Price of Flexibility in Electricity Markets

I have presented this paper at the following two conferences with incremental updates:

INFORMS Annual Meeting, Seattle, USA – October 2024
IISE Annual Conference, Montréal, Canada – May 2024

Repair Crew Routing for Infrastructure Network Restoration under Incomplete Information

I have presented this paper at the following three conferences with incremental updates:

INFORMS Annual Meeting, Phoenix, USA – October 2023
IISE Annual Conference, New Orleans, USA – May 2023
INFORMS Annual Meeting, Indianapolis, USA – October 2022

Multi-asset portfolio optimization with stochastic Sharpe ratio under drawdown constraint

I have presented this paper at the following conference:

Statfin Conference, Chennai Mathematical Institute, Chennai, India – December 2018

Leadership Experience

President, INFORMS Student Chapter at TAMU (September 2023 - May 2024)

Successfully led INFORMS Student Chapter at Texas A&M University to win the highest Summa Cum Laude award at the INFORMS Annual Meeting, Seattle, 2024.
Secured $1000 in funding for the Student Chapter by participating in an INFORMS event.
Supervised the poster competition for graduate and undergraduate students and the grant writing session series for Ph.D. students in the Industrial and Systems Engineering department.

President, INFORMS Student Chapter at TAMU (September 2023 - May 2024)

Organized a booth to represent the department at the Optimization Society Conference, Houston, March 2024 for post-doctoral and faculty hiring.
Organized workshops on Python and AMPL optimization solvers featuring industry speakers, aimed at assisting graduate and undergraduate students in advancing their professional careers.
Successfully led INFORMS Student Chapter at Texas A&M University to win the highest Summa Cum Laude award at the INFORMS Annual Meeting, Phoenix, 2023.

Vice-President, INFORMS Student Chapter at TAMU (January 2023 – August 2023)

Helped to secure $400 in funding for the Student Chapter by participating in an INFORMS event.
Organized a booth to represent the department for post-doctoral and faculty hiring at the ERS conference, Texas A&M University, August 2023.
Organized INFORMS Coffee Chat with the seminar speakers from different universities.

Conference Session Chair

Cluster: Revenue Management and Pricing
Title of the session: Electricity Market Pricing, focusing on the impact of uncertainties and competition on day-ahead market pricing efficiency, INFORMS Annual Meeting 2024.
Cluster: ENRE Energy Climate
Title of the session: Energy Infrastructure Network Resilience, INFORMS Annual Meeting 2022.