Senior Site Reliability Engineer

Gruve

All India • 1 month ago

Experience: 6 to 10 Yrs

Job Description

Role Overview: At Gruve, you will lead reliability strategy and architectural improvements across infrastructure, GPU systems, observability, ML Ops, and IT Ops. The role involves mentoring engineers, managing high-severity incidents, and driving SLO governance. Working with a team of SRE engineers, you will set up, maintain, and troubleshoot the stack from bare metal through the application layer.

Key Responsibilities:
- Architect reliability improvements across Kubernetes, GPU infrastructure, ML Ops, networking, and monitoring.
- Lead incident management, blameless post-mortems, and error-budget policies.
- Drive automation, IaC, and reliability tooling at scale.
- Oversee metrics, logs, tracing, and dashboards; ensure actionable alerting.
- Integrate GPU operators/exporters and model lifecycle workflows for inference platforms.
- Mentor junior and mid-level SREs and guide cross-team initiatives.

Qualifications Required:
- 6-9 years of SRE or platform engineering experience.
- Expertise in Kubernetes operations and experience with a cloud platform (AWS/GCP/Azure).
- Advanced networking and security fundamentals.
- Strong coding background in Python, Go, or Java.
- Deep observability knowledge in Prometheus, Grafana, ELK / Fluentd.

About Gruve: Gruve is an innovative software services startup dedicated to transforming enterprises into AI powerhouses. Specializing in cybersecurity, customer experience, cloud infrastructure, and advanced technologies such as Large Language Models (LLMs), Gruve's mission is to help customers use their data to make more intelligent decisions. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks. If you are passionate about technology and eager to make an impact, Gruve fosters a culture of innovation, collaboration, and continuous learning in a diverse and inclusive workplace.

Gruve is an equal opportunity employer welcoming applicants from all backgrounds.

Skills Required

Kubernetes, networking, monitoring, Python, Go, Java, GPU infrastructure, ML Ops, Prometheus, Grafana, ELK, Fluentd

Posted on: March 6, 2026
