Arun Kumar Singh
103, SRECon, San Diego, CA, US 92121
linkedin.com/in/arunsingh23/ | github.com/arunsingh
medium.com/@aruns89-techblog | dyota.substack.com-newsletter
Professional Summary
Business-focused, customer-obsessed engineering aficionado with 16+ years of cross-functional industry, startup, and
R&D experience across ICT, AIOps, FinOps, healthcare, Industrial IoT, and energy & utilities. Specialised in software
product development, SRE, cloud infrastructure, architecture, security, production support, operations, and incident management.
Education
Harvard University, John F. Kennedy School of Government
Jan. 2025 – Ongoing
Public Leadership Credential
Cambridge, MA, USA
Krishna Institute of Engineering & Technology
Aug. 2005 – Jun. 2009
Bachelor of Technology in Information Technology
Delhi NCR, India
Technical Skills
Languages: Python, Java, C, JavaScript, TypeScript, *nix shell scripting, Golang, Ruby, SQL
Developer Tools: VS Code, Eclipse, Google Cloud Platform, PyCharm, AWS Cloud9
Technologies/Frameworks/Toolkits/DB/API: Datadog, Sensu, Puppet, Ansible, Docker, Kubernetes, Packer,
Terraform, Spinnaker, Jenkins, New Relic, Redis, AWS, GCP, Azure, ELK, LitmusChaos, DNS, SSL/TLS, TCP/IP,
HTTP(S), LaTeX, Gatling, Git, Kafka, Grafana, perf, flamegraph, KVM, JMeter, k6, Flask, PagerDuty, Django, Vagrant,
MySQL, MongoDB, PostgreSQL, TimescaleDB, ArgoCD, Mesos, DC/OS, n8n, Llama, GPT, AutoGen, AGI.
Hardware Platform(s): Raspberry Pi, Arduino, Bare-Metal, Beagle, Custom PCBs.
Policies: HIPAA, CIS, SOC2, PCI-DSS, GDPR, ITIL, DPDP.
Professional Certifications
Languages, Domains: NVIDIA AI Infrastructure and Operations (2025), Google Cyber Security (2024),
Python (2018, UMichigan), Unix shell scripting (2019), AWS (2019, 2021, 2023), GCP (2019), Chaos Engineering (2024)
Platforms: Linux (2019, 2023), K8s (2023), Docker (2023), Linkerd (2023), Istio (2024), Prometheus (2023)
Frameworks & Leadership: Harvard University - Exercising Leadership: Foundational Principles (2024), ITIL Service
Management (2024), Google Cloud - SRE: Measuring & Managing Reliability (2019), Sumo Logic - Log, Search, Analytics (2021)
Professional Experience
Godspeed Pte Ltd
Nov 2022 – Present
Co-founder & VP - SRE, Product & Platform Engineering
Singapore — Bangalore, India
• People management to IC ratio: 30:70. Platform owner for multiple projects. Team size: 8-22. Tech stack: AWS, GCP,
Azure, Fastly, Prometheus, Grafana, OTEL, Python, Golang, TypeScript, JS, Docker, Kubernetes, Ansible, GitLab,
ArgoCD, Kafka, Debezium, Airflow, Terraform, Helm, Airbyte, Linkerd, Spark, Snyk, Black Duck, Gatling, ExtraHop.
• Key Accomplishments
• Led the architecture of, and shipped, a generative AI (LLM-based) platform integrating multiple cloud vendors, dev tooling,
code instrumentation, observability, and ProdOps, improving developer productivity 30x through an internal developer platform (IDP).
• Developed a PoC Linux-based traffic generator for network performance testing, plus shift-left QA automation using
IaC for cloud-based workloads.
• Handled fundraising, sales, customer-onboarding PoCs, team building, and training bootcamps; GTM strategies resulted in 5x
month-over-month customer growth. Covered CASB, PCI-DSS, HIPAA, NIST CSF, and GDPR GRC security audits (AppSec, InfoSec,
network security).
Stealth Startup - Inductive Charging EV Mobility
Nov 2021 – Oct 2022
Founder - SRE, Product and Platform Architecture
Canada, India, Taiwan, Singapore
• People management to IC ratio: 10:90. Backend, platform, and hardware owner for multiple projects. Team size: 2-10.
• Key Accomplishments
• Bootstrapped a stealth startup focused on EV mobility, developing an inductive charging prototype for product-market
fit evaluation. Despite progress, commercialization with OEMs was hindered by hardware optimization challenges.
• Created an inductive charging hardware prototype through contract manufacturing in SE Asia.
• Led the software platform for EV mobility: battery observability and health, OEM integration, shift-left SRE,
performance benchmarking, and tooling-based automation across code, build, test, release, deploy,
monitoring, and observability.
NextGen Healthcare Inc (formerly QSI Inc)
Aug 2019 – Oct 2021
Sr. Staff Site Reliability Engineer
San Diego, CA, United States
• People management to IC ratio: 30:70. Backend, platform, and infrastructure owner for multiple projects. Team size: 5-68.
Tech stack: AWS, Fastly, Datadog, Sumo Logic, Python, Java, Docker, Kubernetes, Puppet, GitLab, OPA, Atlantis, Jenkins,
Kafka, Databricks, Kong, Terraform, Airbrake, Linkerd.
• Key Accomplishments
• Enhanced developer workflow and productivity within a cross-functional team by creating tools to automate and
manage multi-cloud infrastructure, cutting operational overhead by a third. Migrated on-premises data center workloads
to the cloud.
• Developed a performance benchmark testing suite focused on improving reliability 5x. Shipped product-wide
instrumentation, enterprise monitoring, and reporting tools for apps used by 4 million users.
• Built PaaS abstractions over infrastructure, reducing multi-cloud deployment time from a day to under an hour.
• Re-architected the NGO BI Reports infrastructure to unlock USD 50M in customer subscriptions.
MachinePulse Tech Pvt Ltd (now MahindraTeqo)
Oct 2017 – Feb 2019
Principal Architect - SRE, Product & Platform Engineering
Bangalore, India
• People management to IC ratio: 40:60. Backend, platform, and hardware owner for multiple projects. Team size: 8-45. Tech
stack: AWS, on-prem, Fastly, Sensu, ELK, Python, Java, Docker, Kubernetes, Puppet, GitLab, Jenkins, Kafka, Spark,
Terraform, Sentry.
• Key Accomplishments
• Launched a leading on-premises monitoring solution for the solar industry within the Industrial IoT domain.
• Scaled the system 8x to 3 million writes/sec across 700+ installations via an event-driven architecture migration and PoC;
implemented the LitmusChaos chaos engineering tool.
• Designed and implemented a tool for load and performance testing of HTTP API backend services.
• Built the support org and led a 30-member cross-functional team of PMs, developers, QA, and DevOps.
Infinite Forest, Inc
Oct 2016 – Aug 2017
Sr. SRE, Platform Engineering
Bangalore, India
• IC role: shipped backend, platform, and infrastructure for multiple projects. Team size: 5.
• Key Accomplishments
• Shipped the back-end engine for a machine learning-based matrimonial product. Created a robust, fault-tolerant platform
and built cross-platform 24x7 centralized monitoring.
Livestream Technologies Pvt Ltd (now Vimeo, Inc)
Jun 2016 – Sept 2016
Sr. Site Reliability Engineer, Platform & Product Group
Bangalore, India
• IC role: shipped platform tools, infrastructure, monitoring, and observability for multiple projects. Team size: 4-20. Tech
stack: AWS, Akamai CDN, Fastly, Sensu, Python, Django, Docker, Ansible, Puppet, GitLab, Jenkins.
• Key Accomplishments
• Managed scalability and uptime for a live video streaming platform on 4550 servers through automated system design,
monitoring, and operational tools development.
Shadowfax Technologies Pvt Ltd
Oct 2015 – Jan 2016
Tech Architect - Backend Infrastructure & DevOps
Bangalore, Karnataka, India
• People management to IC ratio: 20:80. Shipped backend and infrastructure for hyperlocal delivery. Team size: 2-12. Tech
stack: AWS, Python, Django, Docker, Ansible, GitLab, Jenkins, JFrog, CloudWatch.
• Key Accomplishments
• Created an on-demand hyperlocal delivery product using Python and the Django framework.
Indian Institute Of Technology Delhi
Apr 2010 – June 2015
Sr. Computer Scientist
New Delhi, India
• Worked with IIT Delhi and the incubated startup Gramvaani Ltd to study the state of Internet
connectivity in rural areas, performing network audits and performance measurements for wireless service
providers. Built technology infrastructure and deployed technology suites to rural field locations under e-governance
initiatives of the Government of India.
• Multitouch Surface Table: Implemented the TUIO protocol and developed graphical zoomable user interfaces (ZUIs) by
extending the Python (PyMT) and Java (MT4j) library APIs; modified a TFT screen to create a 3D interactive table with
an acrylic mask; and designed a 100-inch interactive wall with fiducial markers that enable the device to detect objects.
Sun Microsystems, Inc
June 2009 – Mar 2010
Member Technical Staff
India
• Worked on Sun Cluster infrastructure on the Solaris kernel: troubleshooting, kernel crash dump analysis, and
subsystem features, viz. VM, procfs, and boot-up.
Sapro Robotics India Private Limited
Feb 2008 – Jan 2010
Co-founder & System Engineering
Ghaziabad, U.P., India
• Invented surgical robots for healthcare, and UAVs and mobile robots for the defence sector.
Open Source and Side Projects
FinOps as a Service | Python, Selenium, Terraform, Ansible, vSphere, AWS, Crossplane
Apr 2024
• Shipped automated tools that abstract operational problems for fintech companies across compute, storage, network,
and databases.
• North Star Architecture IaC tool: designed, implemented, and shipped an in-house Terraform-like IaC tool, reducing
org-wide tech debt.
• Built a factory design pattern-based interface to dynamically load different cloud providers.
• Implemented retries with exponential backoff for transient errors.
• Defined a schema for configuration files and validated configurations against it.
• Transaction Management: Implemented rollback mechanisms to maintain consistency if operations fail.
• Event Logging and Auditing: Maintained logs for all operations for better auditing and troubleshooting.
• WIP: Distributed cron scheduler
Playbox OTT App | Python, React, Redux, Node.js, JS, Elasticsearch, PostgreSQL, Scrapy, Sequelize, Kibana
Mar 2019
• Shipped a complete OTT solution along the lines of Hotstar and Netflix, named PlayBox.
• Automated CI/CD production deployment and security vulnerability scans.
Incydent Commander Bot | Python, YAML, Qdrant
Jan 2023
• Designed an automated tool in Python to integrate incident management with Slack and PagerDuty.
• WIP: Integration of a GenAI (GPT/LLM-based) alert correlation engine.
Open Source Software Contribution, Leadership, Speaking
Google GDG Cloud Bangalore MeetUp
Mar 2018 – Present
Community Founder & Tech Evangelist
Google
• Contributor and maintainer for the Stanford University-based Eyepatch project and creator of Elongpower.
• Managed an executive board of 4 members and ran weekly meetings to oversee progress in essential parts of the chapter.
• Led a chapter of 10,000+ members towards goals that improve and promote community tech talks.
Cilium & Linkerd
2022 – Present
OSS
CNCF
• OSS contributions since 2022 to CNCF Cilium and Linkerd: issue triage and PRs.
References
Available on Request.