AI Systems Engineer

AbleCredit - GenAI Infra for BFSI

All India, Pune • 2 months ago

Experience: 4 to 8 Yrs

Job Description

As an SDE 2 / SDE 3 specializing in AI Infrastructure & LLM Systems Engineering at AbleCredit, you will deploy and maintain production-grade AI systems for BFSI enterprises, with a focus on reducing operational expenses by up to 70% in critical areas such as onboarding, credit, collections, and claims.

**Role Overview:**

You will deploy AI models on GPU infrastructure, expose them through APIs, and keep inference scalable under high parallel load using asynchronous systems and queues. This is a backend and systems engineering role, centered on deploying and optimizing AI workflows in a high-traffic enterprise environment.

**Key Responsibilities:**

- Deploy and manage LLMs on GPU infrastructure, whether in the cloud or on-premises.
- Operate inference servers such as vLLM, TGI, SGLang, Triton, or their equivalents.
- Develop FastAPI or gRPC APIs for interacting with AI models.
- Implement asynchronous, queue-based execution for AI workflows, including fan-out, retries, and backpressure strategies.
- Plan capacity and scaling, weighing GPU count against requests per second (RPS), batching against latency, and cost against throughput.
- Implement observability for latency, GPU utilization, queue depth, and failures.
- Collaborate closely with AI researchers to ensure models are productionized safely.

**Qualifications Required:**

- Solid understanding of backend engineering principles, distributed systems, and asynchronous workflows.
- Hands-on experience deploying GPU workloads in a production environment.
- Proficiency in Python; Golang skills are a plus.
- Familiarity with Docker, Kubernetes, or similar containerization technologies.
- Practical knowledge of queueing systems and worker frameworks such as Redis, Kafka, SQS, Celery, and Temporal.
- Ability to analyze and optimize performance, reliability, and cost metrics quantitatively.

**Additional Company Details:**

At AbleCredit, we build AI systems for the operations of BFSI enterprises, enabling significant cost savings and operational efficiencies. Our work centers on deploying LLMs on GPUs, operating high-concurrency inference systems, and scaling AI workflows under real enterprise traffic.

If you have hands-on experience deploying models on GPUs, troubleshooting GPU-related issues, scaling compute-heavy backends, and designing asynchronous systems, you are an ideal candidate for this role. Familiarity with infra layers like LangChain and LlamaIndex, experience with vector DBs such as Qdrant, Pinecone, and Weaviate, and prior work on multi-tenant enterprise systems will be advantageous.

If your background is primarily in calling OpenAI or Anthropic APIs, or if you are more focused on prompt engineering or frontend-centric AI development without hands-on involvement in infrastructure, scaling, or production reliability, this role may not be the best fit for you.

Posted on: March 19, 2026
