Lucas Carvalho

AI Engineer
Brasília, Distrito Federal
https://www.linkedin.com/in/lucas-carvalho-mle

PROFESSIONAL SUMMARY
AI Architect with 8+ years pioneering enterprise-grade LLM platforms, specializing in dynamic model routing, secure RAG systems, and resilient agentic workflows. At Cognii (award-winning EdTech AI leader), engineered core components of the conversational tutoring infrastructure, which processes 10K+ student interactions weekly with 95% precision. Built multi-LLM pipelines to optimize cost and latency for educational open-response assessments. Expertise spans cross-cloud deployment (AWS/Azure/GCP), GDPR/SOC2 compliance, and n8n/Zapier automation for failover routing.

SKILLS
• LLM & RAG: Prompt engineering (zero/few-shot), multi-LLM orchestration (OpenAI, Azure, Hugging Face), vector-based retrieval pipelines, Zapier, n8n
• Agentic Workflows: Intent classification, retrieval, LLM synthesis, fallback routing
• Backend & APIs: FastAPI, Flask, gRPC; Docker & Kubernetes deployment; CI/CD with Terraform/Jenkins
• Data & Security: Secure connectors (APIs, SharePoint) with AES-256/TLS; GDPR/SOC2 compliance; FAISS/Pinecone vector stores
• Cloud & DevOps: AWS, GCP, Azure for scalable inference and data pipelines
• ML & Evaluation: Model fine-tuning, synthetic data generation, performance metrics (precision, BLEU/ROUGE)

EXPERIENCE
AI Engineer (Lead LLM Architect), 09/2021 - 04/2025
Cognii (San Francisco, CA, USA)
Built AI infrastructure for Cognii's Virtual Learning Assistant, used by 30K+ students globally for personalized tutoring and open-response assessment.
• Multi-LLM Inference Engine: Architected a model-agnostic pipeline dynamically routing queries between OpenAI, Azure, and Hugging Face based on real-time cost ($0.02/token threshold), latency (<500ms SLA), and accuracy (BLEU-4 >0.85). Integrated fallback to lighter models during traffic spikes, maintaining 99.9% uptime during 5x demand surges (e.g., exam periods).
• Multilingual RAG Moderation: Scaled content moderation to 30+ languages using Pinecone vector retrieval and semantic hierarchies (syntax → concept mapping). Achieved 95% precision (F1-score) via few-shot prompt chaining, reducing manual review workload by 70% ($250K annual savings).
• n8n-Automated Agentic Tutoring: Engineered an intent-driven workflow: student query → concept retrieval → LLM synthesis → feedback analytics. Used n8n to trigger Slack alerts for latency breaches, accelerating incident response by 40%. System processed 10K+ cycles/week with sub-500ms P99 latency.
• SOC2-Compliant Data Ingestion: Secured academic data ingestion from SharePoint/APIs using AES-256 encryption and TLS 1.3, aligning with NSF grant requirements and GDPR.

AI Engineer, 08/2019 - 04/2021
DataMind Analytics (London, England, United Kingdom)
Developed ML automation for manufacturing/logistics clients prior to mainstream LLMs.
• RL-driven LLM agents: Created agents that integrated policy-based LLM chains with API calls to automate manufacturing RPA, reducing defects by 28% and boosting throughput by 15%.
• High-Scale Sentiment Analysis: Deployed a Scikit-learn/TensorFlow pipeline processing 1M+ social posts/day (94% accuracy), avoiding vendor API costs.
• LLM summarisation engine: Fine-tuned models for medical report summarisation, accelerating read-out times by 35% in clinical operations.
• Multi-modal LLM perception: Combined LiDAR/camera data with an LLM-based annotation and reasoning agent; delivered 99.98% detection accuracy under harsh weather.
• Zapier-Integrated Anomaly Detection: Connected Keras-based monitoring to Jira/Slack via Zapier, cutting incident response time by 50%.

Machine Learning Developer, 06/2017 - 03/2019
Kis Solutions (São Paulo, Brazil)
• NLP Customer Service Chatbot: Architected an intent recognition pipeline using spaCy entity extraction and BERT embeddings (pre-LLM era), reducing customer resolution latency by 1.7 hours through dynamic FAQ routing and automated ticket classification, handling 5K+ daily queries with 92% accuracy.
• High-Frequency Ad Optimization: Developed a distributed GridSearchCV framework (Dask/Scikit-learn) automating hyperparameter tuning for real-time bidding models; boosted AUC-ROC by 12% and increased client ad spend ROI by 18% through latency-optimized feature engineering.
• Analytics API Revolution: Replaced a monolithic Django backend with an asynchronous FastAPI microservice using Redis caching and WebSocket streaming, achieving 27% higher throughput (tested at 12K RPM) and cutting dashboard load times from 3.2s → 800ms under production load.
• Mission-Critical Anomaly Detection: Designed an LSTM autoencoder (Keras/TensorFlow) with adaptive thresholding for AWS EC2 monitoring, reducing false positives by 36% and cutting incident MTTR to <8 minutes through Slack-integrated alerting.

COURSES / CERTIFICATIONS
• Professional Machine Learning Engineer (PME), Google Cloud, 04/2023
• Deep Learning with PyTorch, Google, 03/2021

PROJECTS
• https://console.vectara.com: Deployed hybrid keyword/vector search for clinical trial RAG pipelines, leveraging their BM25-neural fusion to boost recall 28% while maintaining SOC2-compliant data ingestion via TLS 1.3-encrypted connectors.
• https://app.credal.ai: Integrated their policy-enforced RAG framework into financial agent workflows, automatically redacting PII from loan documents using Azure AD permission syncing and reducing compliance review latency by 40% under GDPR audits.
• https://www.cognii.com: Deployed their Virtual Learning Assistant's NLP assessment engine for university open-response grading, leveraging syntax/semantic analysis to achieve 96% scoring accuracy while reducing instructor workload 70% through automated feedback generation and GDPR-compliant analytics.
• https://studio.graphlit.com: Integrated their multimodal RAG platform to ingest and process 50K+ clinical trial documents via automated OCR/transcription pipelines, cutting healthcare chatbot development time by 80% while maintaining HIPAA-compliant data handling with RBAC controls.

EDUCATION
• Bachelor of Computer Science, 04/2013 - 03/2017
  Universidade de Brasília (UnB)
