OPEN TO WORK • AI / ML / GENAI / MLOPS • 2026
ML & AI Engineer.
I am Shalin Vachheta, an M.Sc. Mechatronics student at the University of Siegen focused on building practical AI systems. My work spans LLMs, RAG, Deep Learning, and offline-first MLOps — from model development and evaluation to deployment, monitoring, and reproducible ML pipelines.
// Career Journey
DEC 2025 - PRESENT
Master Thesis
University of Siegen
MLOps Pipeline for Continuous Mental-Health Monitoring
- Architecting a production-grade, offline-first MLOps pipeline that processes wearable IMU time series (ACC/GYRO) for anxiety detection.
- Using DVC and MLflow for reproducibility, dataset and model versioning, and artifact-lineage tracking.
- Designing privacy-first deployments using containerized inference, FastAPI services, and structured artifact management.
MAY 2024 - MAY 2025
Student Research Assistant
University of Siegen
Sequence-to-Sequence Event Forecasting
- Developed a BiLSTM Seq2Seq forecasting model for multivariate simulator event streams.
- Improved performance by 25% over the baseline, gaining an additional 15% via Optuna-based hyperparameter optimization.
- Implemented mask-aware evaluation for variable-length sequences, along with reliable logging and checkpointing.
GenAI, LLMs & Retrieval
RAG Chatbot — Multilingual Document Intelligence
- Developed a multilingual RAG pipeline for scalable tender-document analysis.
- Implemented metadata mapping, deduplication, and embedding-based retrieval.
- Enhanced response accuracy with contextual re-ranking for enterprise workflows.
LLM-Based AI Email Generator
- Built a context-aware email drafting system using Llama 3.1 and LangChain.
- Structured persona and company inputs into reusable, semantic prompt templates.
- Delivered a controllable generation workflow via a streamlined Streamlit interface.
PEGASUS Summarization
- Fine-tuned Google PEGASUS on 16k+ SAMSum samples for abstractive summarization.
- Evaluated model performance using ROUGE metrics within a reproducible pipeline.
- Packaged and deployed the final model as a modular FastAPI inference endpoint.
Transformer From Scratch
- Implemented the core 'Attention Is All You Need' architecture in PyTorch.
- Built multi-head attention, positional encoding, and residual connections.
- Enabled sequence modeling experiments with masking and beam search integration.
Research & Deep Learning
Seq2Seq Event Forecasting
- Engineered a BiLSTM encoder-decoder for multivariate simulator event streams.
- Tuned hyperparameters with Optuna, improving performance by 25% over the baseline.
- Integrated mask-aware evaluation and reliable checkpointing for robust research.
Microscopic Denoising & Segmentation
- Designed a dual-branch U-Net pipeline for fluorescence microscopy analysis.
- Achieved simultaneous image denoising and precise semantic segmentation.
- Implemented robust preprocessing, hybrid loss functions, and method benchmarking.
Data Analytics & Classical ML
Retention Analytics — Churn Insights
- Executed end-to-end churn analytics on 10k+ customer records.
- Trained Random Forest classifiers reaching ~89% accuracy with feature analysis.
- Delivered actionable, business-facing retention recommendations.
Predictive Modeling & Decision Support
- Constructed a complete ML workflow spanning feature engineering to evaluation.
- Trained and compared multiple models, achieving an R² of 0.87 with XGBoost.
- Deployed the optimal regression model as a containerized Flask API service.
US Vehicle Sales Forecasting & Insights
- Analyzed 558k+ vehicle sales records with large-scale data preprocessing.
- Engineered temporal and vehicle-level features to uncover price elasticity.
- Developed 10+ EDA visualizations to highlight seasonal market behavior.
IPL Match Outcome Analysis
- Conducted exploratory analytics on 700+ professional cricket matches.
- Studied the impact of toss strategy, venue effects, and team performance.
- Utilized Pandas and NumPy to uncover historical outcome patterns.
Python • SQL • PyTorch • TensorFlow • scikit-learn • MLflow • DVC • Docker • FastAPI • LangChain • Transformers • RAG • Qdrant • Ollama • GitHub Actions