AI Systems Engineer

AbleCredit - GenAI Infra for BFSI

All India, Pune • 2 months ago

Experience: 4 to 8 Yrs

Job Description

As an SDE 2 / SDE 3 specializing in AI Infrastructure & LLM Systems Engineering at AbleCredit, you will play a crucial role in deploying and maintaining production-grade AI systems for BFSI enterprises, with a focus on reducing operational expenses by up to 70% in critical areas such as onboarding, credit, collections, and claims.

**Role Overview:**

You will be responsible for deploying AI models on GPU infrastructure, exposing them through APIs, and ensuring scalable inference under high parallel loads using asynchronous systems and queues. This is a backend and systems engineering role that emphasizes the deployment and optimization of AI workflows in a high-traffic enterprise environment.

**Key Responsibilities:**

- Deploy and manage LLMs on GPU infrastructure, whether in the cloud or on-premises.
- Operate inference servers such as vLLM, TGI, SGLang, Triton, or their equivalents.
- Develop FastAPI or gRPC APIs to serve AI models.
- Implement asynchronous, queue-based execution for AI workflows, including fan-out, retries, and backpressure strategies.
- Plan capacity and scaling, weighing trade-offs such as GPU count versus requests per second (RPS), batching versus latency, and cost versus throughput.
- Implement observability to monitor latency, GPU utilization, queue depths, and failures.
- Collaborate closely with AI researchers to ensure the safe productionization of models.

**Qualifications Required:**

- Solid understanding of backend engineering principles, distributed systems, and asynchronous workflows.
- Hands-on experience in deploying GPU workloads in a production environment.
- Proficiency in Python, with Golang skills being a plus.
- Familiarity with Docker, Kubernetes, or similar containerization technologies.
- Practical knowledge of queueing systems and worker frameworks such as Redis, Kafka, SQS, Celery, Temporal, etc.
- Ability to analyze and optimize performance, reliability, and cost metrics quantitatively.

**Additional Company Details:**

At AbleCredit, we are dedicated to building cutting-edge AI systems that revolutionize the operations of BFSI enterprises, enabling significant cost savings and operational efficiencies. Our focus on deploying LLMs on GPUs, operating high-concurrency inference systems, and scaling AI workflows under real enterprise traffic sets us apart as innovators in the field of AI infrastructure.

If you have hands-on experience in deploying models on GPUs, troubleshooting GPU-related issues, scaling compute-heavy backends, and designing asynchronous systems, you are an ideal candidate for this role. Familiarity with infra layers like LangChain and LlamaIndex, experience with vector DBs such as Qdrant, Pinecone, and Weaviate, and prior work on multi-tenant enterprise systems will be advantageous.

If your background is primarily in calling OpenAI or Anthropic APIs, or if you are more focused on prompt engineering or frontend-centric AI development without hands-on involvement in infrastructure, scaling, or production reliability, this role may not be the best fit for you.

Posted on: March 19, 2026
