Finance‑Document VQA and Field‑Value Extraction POC
Tech: Python, PyTorch, Amazon Bedrock, SageMaker, Textract, LoRA/QLoRA, Knowledge Distillation, Quantization
- Built a prototype for Visual Question Answering (VQA) and field‑value extraction on tax documents using LLMs/VLMs.
- Raised extraction accuracy from 80–85% to 99% overall and 97% on critical fields.
- Applied prompt engineering (zero‑/few‑shot, chain‑of‑thought), parameter‑efficient fine‑tuning (LoRA/QLoRA), knowledge distillation, and quantization.
Data Pipeline Monitoring System
Tech: AWS Lambda, Athena, DynamoDB, S3, EventBridge, CloudWatch, Secrets Manager, SQS/SNS, Docker
- Designed a scheduled monitoring service that queried multiple data sources to detect missing or late records.
- Implemented automated alerting using CloudWatch metrics, dashboards, and alarms.
- Improved reliability and data integrity across finance automation pipelines.