Vishwajeet Thakur
Software Engineer | https://www.linkedin.com/in/vishwajeet%7Ethakur/
Summary
Backend, Data, and GenAI engineer with experience building cloud-native platforms on AWS, production-grade ETL on Spark and Snowflake, and Retrieval-Augmented Generation (RAG) systems with strong MLOps
and observability.
Skills
Backend: Python, FastAPI (async), Django, Django REST Framework, REST APIs, Pydantic, SQLAlchemy, PostgreSQL, MongoDB
Data: Airflow, PySpark, EMR, Parquet, S3, Snowflake (MERGE, Streams/Tasks), data quality checks
Cloud/DevOps: AWS (EKS, RDS, S3, IAM, CloudWatch), Docker, CI/CD, Prometheus and Grafana
GenAI: RAG, LangChain, LangGraph, LangSmith, Pinecone, OpenAI embeddings (text-embedding-3-large), AWS Bedrock
Experience
Consultadd Inc - Client-facing Engineer (Backend / Data / GenAI)
Sep 2021 - Present
Client: Pfizer | GenAI Engineer
• Built a production FastAPI backend with async endpoints for semantic search, investigation sessions,
analytics, and audit/history; defined strict request/response contracts using Pydantic models (nested
schemas, enums, discriminated unions) to enforce consistent structured outputs end-to-end.
• Implemented a SQLAlchemy persistence layer with declarative models and repositories (data access) plus
an application service layer (transactional workflows) to persist investigation sessions, evidence snippet
references, prompt/model versions, and feedback signals; managed transaction scoping, safe rollbacks,
and high-throughput bulk inserts/updates for ingestion metadata.
• Added a comprehensive pytest suite covering unit and API integration tests: FastAPI dependency overrides for
test isolation, fixtures for deterministic seeded data, contract tests validating Pydantic schema compliance, and mocks/stubs for Bedrock/Pinecone/embedding calls to exercise error paths (timeouts, empty
context, parse failures) and retry behavior.
• Built a centralized MLOps and GenAI platform used across the Oncology, iMaging, IMRU, and MLCS teams to standardize how datasets, embeddings, models, and prompts are built, tracked, and deployed across multiple
initiatives.
• Implemented ClearML as the control plane for experiment tracking and orchestrated pipelines: dataset
versioning, reproducible training runs, model artifacts stored on S3, and promotion between environments
via immutable run IDs and tracked lineage.
• Deployed SageMaker-backed training/inference workflows (containerized jobs/endpoints) triggered from
ClearML pipelines; packaged code and dependencies into Docker images to ensure parity across dev/stage/prod.
• Delivered a RAG system for log analytics: ingestion jobs normalize and chunk operational logs, compute
embeddings using Amazon Titan Text Embedding v2, and upsert vectors into Pinecone with metadata
filters (application/team/environment/time window) to support precise retrieval and scoped investigations.
• Integrated AWS Bedrock for LLM inference; built retrieval and generation chains in LangChain with
strict structured outputs (Pydantic schemas) to return actionable summaries (root-cause hypotheses,
related log clusters, suggested next checks) rather than free-form text.
• Added LangSmith monitoring for end-to-end tracing: retrieval hits/misses, tool timings, prompt versions,
model/endpoint selection, and failure categories (timeouts, empty context, parse errors) to accelerate
debugging and iterative prompt tuning.
• Built the product as a FastAPI backend (auth-protected endpoints for search, analytics, and audit/history)
and a React UI with Material UI for analysts to run queries, review cited evidence snippets, and track
investigation sessions.
Stack: FastAPI, Python (async), SQLAlchemy, Pydantic, pytest, ClearML, AWS (SageMaker, S3, IAM,
CloudWatch), Docker, AWS Bedrock, Amazon Titan Text Embedding v2, LangChain, LangSmith, Pinecone,
React, Material UI
Client: Charter Communications | Software Engineer (Backend / Platform / GenAI)
• Built a greenfield device onboarding & operations platform on AWS: Dockerized FastAPI services on
EKS, PostgreSQL on RDS, artifacts/logs on S3, metrics/alerts in CloudWatch, with Helm-based rolling
and blue-green deployments.
• Implemented vendor adapters (Ciena, Nokia, and others), handling auth, pagination, and payload differences; scheduled polling to reconcile shipments vs. inventory with idempotent upserts and a complete
audit trail.
• Exposed a versioned REST API (45 endpoints) for device search, vendor sync, and engineer actions;
added async workers with per-vendor rate limits; built GlfConnect (a Paramiko wrapper) with retries and
circuit breakers.
• Automated end-to-end SSH workflows (artifact download, image install, verification, health probe) and
surfaced job status and logs via API and UI.
• Wrote a high-throughput ingestion pipeline using multiprocessing, concurrent.futures, and connection
pooling; reduced ingestion time from 219s to 4.07s (a ~98% improvement).
• Introduced request/job correlation IDs and operational dashboards to reduce triage time and make
vendor sync health and job histories queryable.
• Built an internal Dash (Python) ops dashboard (real-time job queue, device health KPIs); designed
MongoDB aggregation pipelines for summary and drill-down views.
• Built a RAG assistant for NOC runbooks and ticket triage: Pinecone vector store (cosine similarity), chunking (512
tokens with 64-token overlap), and hybrid retrieval (BM25 plus vector search) with cross-encoder reranking for high-precision
context.
• Migrated embeddings from text-embedding-ada-002 to text-embedding-3-large; built offline indexing
with incremental upserts, TTL compaction, and provenance stamps on each answer.
• Orchestrated agents with LangChain and LangGraph: tool-calling to inventory, vendor APIs, and SSH
job triggers; LCEL chains with guarded retries, timeouts, and vendor-specific circuit breakers.
• Enforced deterministic structured outputs via Pydantic (schema validation with retry-on-parse-fail),
prompt versioning, and lineage persisted in PostgreSQL.
• Added evaluation harness (Recall@k, MRR@10, faithfulness/groundedness) integrated into CI (pytest)
to prevent regressions across embeddings/chunking/reranking.
• Fine-tuned small LLMs (Phi-4 family) using PEFT/QLoRA with Unsloth on AWS g6e.xlarge; deployed
LoRA adapters behind the same FastAPI/EKS layer.
• Set up GenAI observability (token/cost budgets, latency SLAs, tool-call traces, failure taxonomy) via
CloudWatch/OpenTelemetry; safe fallback to keyword-only search when needed.
Stack: FastAPI, Python (async), PostgreSQL (RDS), Docker, Kubernetes (EKS), AWS (RDS, S3, CloudWatch, IAM), Helm, SSH automation, Dash, MongoDB. GenAI: LangChain, LangGraph, Pinecone, OpenAI
embeddings, Pydantic, OpenSearch (BM25), reranking, PEFT/QLoRA, Unsloth.
Client: Charles Schwab | Software Engineer (Data Engineering)
• Built REST APIs using Django REST Framework: implemented ViewSets and Routers for versioned
endpoints, wrote serializers (field validation, transformation, nested payloads) to enforce consistent contracts, and optimized ORM queries (select_related/prefetch_related) for predictable API latency under
batch-trigger workloads.
• Owned Airflow orchestration for batch ETL: 20+ DAGs (sensors, retries, SLA alerts, backfills) parameterized by load date and environment.
• Wrote PySpark jobs on EMR for ingest/cleanse/conform (schema enforcement, partitioning, broadcast/hash joins, window functions); produced Parquet to S3 with deterministic partition layouts.
• Modeled Snowflake using star schemas: multiple fact tables (event and snapshot grains) and ~18 dimensions; implemented SCD Type 2 with surrogate keys and MERGE-based upserts.
• Added data quality gates (referential integrity, range checks) integrated with Airflow callbacks and
runbooks; ensured deterministic reprocessing via checkpointed stages.
• Tuned shuffles/partitions/file sizes/commit frequency to meet nightly SLAs and reduce warehouse spend
during loads.
Stack: Django, Django REST Framework, Airflow, PySpark, EMR, S3, Parquet, Snowflake (Streams/Tasks,
MERGE, Time Travel), Git, Linux.
Education
National Institute of Technology, Arunachal Pradesh
Arunachal Pradesh, India
B.Tech. Electronics and Communication (GPA: 7.70/10)
Aug 2017 – May 2021