Kshitij Hatwar
Building resilient cloud infrastructure, scalable Kubernetes platforms, and production-grade automation.
Results-driven Kubernetes Administrator with 3+ years of experience designing, deploying, and managing production cloud infrastructure. Holds the Certified Kubernetes Administrator (CKA) and AWS Solutions Architect – Associate certifications.
About
I'm a Kubernetes Administrator currently working at Yotta Data Center, where I design, deploy, and manage production cloud infrastructure at scale.
My core strengths lie in Kubernetes orchestration, AWS cloud services, CI/CD pipeline automation, and infrastructure as code with Terraform and Ansible.
I hold the Certified Kubernetes Administrator (CKA), AWS Solutions Architect – Associate, and ITIL 4 Foundation certifications, demonstrating my commitment to both technical excellence and operational best practices.
I'm passionate about building systems that are resilient, scalable, and maintainable, and that teams can rely on for 99.9%+ uptime.
Skills
Cloud & Infrastructure
DevOps & Automation
Systems & OS
Monitoring & Tools
Experience
Kubernetes Administrator
@ Yotta Data Center
Projects
Small-Scale LLM Training on Kubernetes
Deployed and orchestrated LLM training workloads on Kubernetes clusters with GPU scheduling, distributed training, and automated resource management.
Large-Scale Monitoring with VictoriaMetrics & Grafana
Designed and implemented enterprise-grade monitoring infrastructure handling millions of metrics from 100+ nodes with custom alerting and dashboards.
Multi-Master Kubernetes Cluster Upgrade
Executed zero-downtime upgrades across multi-master HA Kubernetes clusters, including etcd, control plane, and worker node rolling updates.
Certifications
Certified Kubernetes Administrator
Cloud Native Computing Foundation
Demonstrates proficiency in Kubernetes cluster administration, including installation, configuration, and management of production-grade clusters.
AWS Solutions Architect – Associate
Amazon Web Services
Validates expertise in designing distributed systems and deploying scalable, highly available applications on AWS infrastructure.
ITIL 4 Foundation
Axelos
Demonstrates a foundational understanding of IT service management best practices, enabling efficient service delivery and continuous improvement.
Writing
Zero-Downtime Kubernetes Upgrades: A Practical Guide
Learn the strategies and techniques for performing major Kubernetes version upgrades without impacting your production workloads.
Building Resilient Monitoring with VictoriaMetrics
A deep dive into setting up VictoriaMetrics for high-cardinality metrics at scale, with Grafana dashboards and alerting.
etcd Backup Strategies for Production Clusters
Essential backup and disaster recovery patterns for etcd in production Kubernetes environments.
Contact
Let's work together
I'm always interested in discussing new opportunities, challenging projects, or just chatting about cloud infrastructure and Kubernetes. Feel free to reach out!