Lead Engineer, Senior-Machine Learning Tools

Qualcomm

All India, Hyderabad • 1 month ago

Experience: 6 to 10 Yrs

Job Description

Role Overview: Join the Generative AI team at Qualcomm, which focuses on integrating cutting-edge GenAI models on Qualcomm chipsets. You will use the extensive heterogeneous computing capabilities of Qualcomm chips to run inference of GenAI models on-device, without a connection to the cloud. The role involves spearheading the development and commercialization of the Qualcomm AI Runtime (QAIRT) SDK on Qualcomm SoCs. As an AI inferencing expert, you will push the limits of performance from large models and deploy large C/C++ software stacks using best practices. You will also stay current on GenAI advancements, including LLMs/Transformers and the nuances of edge-based GenAI deployment. A passion for the role of the edge in AI's evolution is a key driving force in this position.

Key Responsibilities:
- Spearhead the development and commercialization of the Qualcomm AI Runtime (QAIRT) SDK on Qualcomm SoCs
- Push the limits of performance from large models
- Deploy large C/C++ software stacks using best practices
- Stay current on GenAI advancements, including LLMs/Transformers and the nuances of edge-based GenAI deployment
- Use the power-efficient hardware and software stack to run Large Language Models (LLMs) and Large Vision Models (LVMs) at near-GPU speeds

Qualifications Required:
- Master's/Bachelor's degree in computer science or equivalent
- 6+ years of relevant work experience in software development
- Strong understanding of Generative AI models (LLMs, LVMs, LMMs) and their building blocks (self-attention, cross-attention, KV caching, etc.)
- Knowledge of floating-point and fixed-point representations and quantization concepts
- Experience with optimizing algorithms for AI hardware accelerators (like CPU/GPU/NPU)
- Proficiency in C/C++ programming, design patterns, and OS concepts
- Good scripting skills in Python
- Excellent analytical and debugging skills
- Good communication skills (verbal, presentation, written)
- Ability to collaborate across a globally diverse team and multiple interests

Skills Required

C, C++, Java, Python, LVM, Design Patterns, Analytical skills, Communication skills, System design, OpenCL, CUDA, Generative AI models, LLM, LMMs, Self-attention, Cross-attention, KV caching, Floating-point representations, Fixed-point representations, Quantization concepts, Optimizing algorithms for AI hardware accelerators, OS concepts, Scripting skills in Python, Debugging skills, SIMD processor architecture, Object-oriented software development, Linux environment, Windows environment, Kernel development for SIMD architectures, Frameworks (llama.cpp, MLX, MLC, PyTorch, TFLite, ONNX Runtime), Parallel computing systems

Posted on: April 3, 2026
