About Me
DevOps & Cloud Infrastructure Engineer
My Professional Journey
DevOps & Cloud Infrastructure Engineer with 2+ years of production experience at Protean eGov Technologies, building CI/CD pipelines, managing AWS cloud infrastructure, and operating Kubernetes (EKS) workloads at scale for government-grade platforms (NPS, PAN, eSign) serving 300M+ users.
Provisioned and maintained production EKS clusters via Terraform, managed GitOps deployments with Argo CD and Helm Charts, enforced Kubernetes RBAC and network policies, and ran a unified observability stack (Prometheus, Grafana, kube-state-metrics, ELK) across both EKS and VMware environments.
Maintained 99.9%+ uptime across 1500+ VMs on a large-scale VMware vSphere environment (100+ ESXi hosts), driving performance optimization, zero-downtime upgrades, and infrastructure reliability at enterprise scale.
Led disaster recovery and backup operations for a 10+ PB Commvault environment, automating DR failover with Ansible and reducing recovery time by 60% while consistently meeting RPO/RTO targets.
Promoted to Assistant Manager within 6 months, leading a 6-member infrastructure team, handling P1/P2 incidents, and enforcing ITIL-driven change management to reduce production risks and improve system stability.
Core Expertise
Cloud & DevOps
AWS (EC2, S3, IAM, VPC, Lambda, API Gateway, EKS, CloudWatch, ALB/NLB, Route 53, CloudFront, ECR, KMS), Terraform (modules, remote state, S3 + DynamoDB locking), Docker (multi-stage builds) & Containerization, Kubernetes (EKS), Argo CD (GitOps), Helm Charts & Kustomize, Jenkins, GitHub Actions, Ansible, Agile/Scrum SDLC, Cloud service models (IaaS, PaaS, SaaS), Workflow automation with n8n
Virtualization & Infrastructure
VMware vSphere, ESXi & vCenter, High Availability (HA), Distributed Resource Scheduler (DRS), vMotion, vRealize Operations (vROps), Capacity Planning & Performance Optimization
Observability & Scripting
ELK Stack (Elasticsearch, Logstash, Kibana, Filebeat, ElastAlert2), Prometheus & kube-state-metrics, Grafana, Amazon CloudWatch, Python, Shell Scripting
Professional Experience
Assistant Manager – DevOps & Cloud Infrastructure Engineer
Protean eGov Technologies Ltd (formerly NSDL eGov Infrastructure Ltd)
Mumbai, Maharashtra, India
Key Responsibilities
- Built and maintained Jenkins CI/CD pipelines for AWS EKS and VMware on-prem deployments with separate environment workflows, automated build triggers, test stages, Slack notifications, and rollback triggers; integrated Argo CD for GitOps-driven Kubernetes sync and deployed environment-specific configurations via Kustomize overlays
- Designed end-to-end EKS deployment pipeline: GitHub webhook → Maven build → Docker multi-stage image → ECR push → Helm chart deploy → Argo CD sync, with automated rollback and environment-specific Helm value overrides
- Authored Terraform IaC modules for AWS (EC2, VPC, ALB, IAM, EKS node groups) and VMware vSphere; managed remote state with S3 + DynamoDB locking, ensuring safe concurrent deployments across teams
- Provisioned 2 production EKS clusters (20–40 worker nodes) via Terraform; enforced Kubernetes RBAC (Roles, ClusterRoles, RoleBindings) for least-privilege access control and implemented Pod Security Admission policies to restrict privileged workloads across namespaces
- Managed Kubernetes persistent storage for stateful workloads using PersistentVolumeClaims, StorageClasses, and the EBS CSI Driver, handling dynamic provisioning, volume binding, and lifecycle management across dev and production environments
- Configured Kubernetes Network Policies (VPC CNI) to enforce pod-to-pod traffic segmentation between namespaces; managed AWS Load Balancer Controller for production ingress routing alongside IRSA/OIDC for pod-level IAM
- Managed Kubernetes secrets using Sealed Secrets for GitOps-safe secret storage; containerised microservices (eSign, eKYC) via Docker multi-stage builds deployed via Helm Charts with namespace isolation and resource limits
- Deployed Kubernetes-native observability stack: Prometheus with kube-state-metrics and metrics-server for cluster and workload metrics, Grafana dashboards for pod/node/namespace visibility, and ELK Stack for centralised log aggregation, covering both EKS and VMware environments
- Configured ElastAlert2 detection rules (SSH brute force, CPU spikes, error rate thresholds), achieving sub-5-minute MTTD; configured vROps alerting for VMware resource contention, reducing unplanned downtime by ~40%
- Maintained 99.9%+ uptime across 1500+ VMs for NPS, PAN, TIN, CRA, and eSign platforms serving 60M+ NPS subscribers and 300M+ PAN cardholders; executed zero-downtime ESXi upgrades across 100+ hosts with Python/Shell pre/post-validation scripts
- Automated VM provisioning via Ansible, achieving 100% monthly patch compliance and eliminating configuration drift across 1500+ Windows Server (2016/2019/2022) and Linux VMs
- Led daily operations of a 6-member infrastructure team; served as escalation point for P1/P2 incidents and conducted structured Root Cause Analysis (RCA) for all major events
- Participated in CAB meetings and enforced ITIL-aligned change management processes, reducing change-related incidents by standardizing pre-change checklists
- Built an internal infrastructure asset management platform using Spring Boot, MySQL, and ReactJS to replace manual spreadsheet tracking of 1500+ hardware/software assets, reducing audit preparation time by ~50%
Key Achievements
- Promoted to Assistant Manager within 6 months based on operational ownership and leadership.
- Designed and owned end-to-end GitOps-based EKS deployment pipeline adopted across production environments.
- Reduced DR drill time by 60% through Ansible-driven disaster recovery automation.
Technical Skills
Cloud & DevOps
- AWS (EC2, S3, IAM, VPC, Lambda, API Gateway, EKS, CloudWatch, ALB/NLB, Route 53, CloudFront, ECR, KMS)
- Terraform (modules, remote state, S3 + DynamoDB locking)
- Docker (multi-stage builds) & Containerization
- Kubernetes (EKS)
- Argo CD (GitOps)
- Helm Charts & Kustomize
- Jenkins
- GitHub Actions
- Ansible
- Agile/Scrum SDLC
- Cloud service models (IaaS, PaaS, SaaS)
- Workflow automation with n8n
Kubernetes & Security
- RBAC (Roles, ClusterRoles, RoleBindings)
- Pod Security Admission
- Network Policies (VPC CNI)
- PersistentVolumeClaims & StorageClasses
- EBS CSI Driver
- IRSA / OIDC
- AWS Load Balancer Controller
- Sealed Secrets
- IMDSv2 & KMS
Infrastructure & Virtualization
- VMware vSphere
- ESXi & vCenter
- High Availability (HA)
- Distributed Resource Scheduler (DRS)
- vMotion
- vRealize Operations (vROps)
- Capacity Planning & Performance Optimization
Operating Systems
- Linux (Ubuntu, Amazon Linux, RedHat)
- Windows Server 2016/2019/2022
- OS Hardening
- Patch Management
Backup & Disaster Recovery
- Commvault Enterprise Backup
- Disaster Recovery Automation
- RPO/RTO Validation
- Backup Infrastructure Management
Monitoring & Automation
- ELK Stack (Elasticsearch, Logstash, Kibana, Filebeat, ElastAlert2)
- Prometheus & kube-state-metrics
- Grafana
- Amazon CloudWatch
- Python
- Shell Scripting
Networking
- TCP/IP, DNS, DHCP
- AWS VPC, Security Groups, NACLs
- CNI (VPC CNI)
- Kubernetes Network Policies
Certifications
Advanced Cloud Computing & DevOps Certification from Learnbay, in collaboration with Microsoft
Learnbay
AI Engineer MLOps Track – Deploy GenAI & Agentic AI at Scale
Udemy
PromptOps – AI-Powered DevOps
Udemy
Complete VMware vSphere ESXi and vCenter Administration
Udemy
Java Full Stack Development
TalentSprint / Q-J Spiders
Certified Kubernetes Administrator (CKA)
CNCF