Lead Engineer, Senior-Machine Learning Tools
Qualcomm
All India, Hyderabad • 1 month ago
Experience: 6 to 10 Yrs
Job Description
Role Overview:
Join the exciting Generative AI team at Qualcomm, which focuses on integrating cutting-edge GenAI models on Qualcomm chipsets. You will use the extensive heterogeneous computing capabilities of Qualcomm chips to run inference of GenAI models on-device, without a connection to the cloud. Your role involves spearheading the development and commercialization of the Qualcomm AI Runtime (QAIRT) SDK on Qualcomm SoCs. As an AI inferencing expert, you will push the performance limits of large models and deploy large C/C++ software stacks using best practices. Your responsibilities will include staying up to date on GenAI advancements and understanding LLMs/Transformers and the nuances of edge-based GenAI deployment. Your passion for the role of the edge in AI's evolution will be a key driving force in this role.
Key Responsibilities:
- Spearhead the development and commercialization of the Qualcomm AI Runtime (QAIRT) SDK on Qualcomm SoCs
- Push the performance limits of large models
- Deploy large C/C++ software stacks using best practices
- Stay up to date on GenAI advancements, and understand LLMs/Transformers and the nuances of edge-based GenAI deployment
- Use power-efficient hardware and software stacks to run Large Language Models (LLMs) and Large Vision Models (LVMs) at near-GPU speeds
Qualifications Required:
- Master's/Bachelor's degree in computer science or equivalent
- 6+ years of relevant work experience in software development
- Strong understanding of generative AI models (LLMs, LVMs, LMMs) and their building blocks (self-attention, cross-attention, KV caching, etc.)
- Knowledge of floating-point, fixed-point representations, and quantization concepts
- Experience optimizing algorithms for AI hardware accelerators (such as CPU/GPU/NPU)
- Proficiency in C/C++ programming, design patterns, and OS concepts
- Good scripting skills in Python
- Excellent analytical and debugging skills
- Good communication skills (verbal, presentation, written)
- Ability to collaborate across a globally diverse team and multiple interests
Skills Required
C
C++
Java
Python
LVM
Design Patterns
Analytical skills
Communication skills
System design
OpenCL
CUDA
Generative AI models
LLM
LMMs
Self-attention
Cross-attention
KV caching
Floating-point representations
Fixed-point representations
Quantization concepts
Optimizing algorithms for AI hardware accelerators
OS concepts
Scripting skills in Python
Debugging skills
SIMD processor architecture
Object-oriented software development
Linux environment
Windows environment
Kernel development for SIMD architectures
Frameworks like llama.cpp
MLX
MLC
PyTorch
TFLite
ONNX Runtime
Parallel computing systems
Posted on: April 3, 2026