Lambda Launches Cost-Effective Inference API for AI Model Deployment
Established in San Francisco over a decade ago, Lambda has become synonymous with on-demand graphics processing units (GPUs) tailored for machine learning researchers and professionals working on artificial intelligence models.
Introducing the Lambda Inference API
The company is now expanding its lineup with the Lambda Inference API, which it touts as the most cost-efficient offering of its kind available today. The new application programming interface (API) lets organizations deploy AI models directly into their products without having to procure and maintain the underlying compute.
The offering complements Lambda’s existing services, which center on GPU clusters for training and fine-tuning machine learning models.
Cost Savings Without Compromises
“Our platform is built to be entirely streamlined, allowing us to offer significant savings compared to competitors such as OpenAI,” stated Robert Brooks, Vice President of Revenue at Lambda, during a video discussion with VentureBeat. “Additionally, there are no limitations on usage keeping you from scaling your projects; moreover, initiating service doesn’t involve consultations with sales personnel.”
According to Brooks, developers can head over to Lambda’s new Inference API webpage, generate an API key, and get started in under five minutes.
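For a sense of what that onboarding can look like in practice, here is a minimal sketch that assumes the Inference API follows the widely used OpenAI-compatible chat-completions convention; the base URL, environment-variable name, and model identifier below are illustrative placeholders to verify against Lambda’s own documentation.

```python
# Minimal sketch: calling an OpenAI-compatible inference endpoint with a Lambda API key.
# The base URL and model name are assumptions -- confirm both against Lambda's docs.
import os
from openai import OpenAI  # pip install openai

client = OpenAI(
    api_key=os.environ["LAMBDA_API_KEY"],      # key generated on the Inference API webpage
    base_url="https://api.lambdalabs.com/v1",  # assumed Lambda endpoint; check the docs
)

response = client.chat.completions.create(
    model="llama3.1-8b-instruct",  # one of the smaller supported models
    messages=[
        {"role": "user", "content": "Summarize what an inference API does in one sentence."}
    ],
)
print(response.choices[0].message.content)
```

Because the request shape mirrors the common chat-completions format, existing applications built against that convention would, under this assumption, only need a new API key and base URL to switch over.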
State-of-the-Art Model Support
The Inference API supports cutting-edge models such as Meta’s Llama 3.3 and 3.1, Nous Research’s Hermes-3, and Alibaba’s Qwen 2.5, making it one of the more accessible options for machine learning developers. The full list of supported models includes:
- deepseek-coder-v2-lite-instruct
- dracarys2-72b-instruct
- hermes3-405b
- hermes3-405b-fp8-128k
- hermes3-70b
- hermes3-8b
- lfm-40b
- Llama models:
  - llama3.1-405b-instruct-fp8
  - llama3.1-70b-instruct-fp8
  - llama3.1-8b-instruct
  - llama3.2-3b-instruct
Pricing starts at $0.02 per million tokens for smaller models such as Llama 3.2-3B-Instruct and rises to $0.90 per million tokens for the largest models such as Llama 3.1-405B-Instruct.
Those rates are meant to compare favorably with competing offerings such as OpenAI’s.
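To make those per-million-token rates concrete, here is a small back-of-the-envelope estimate; the token volume is illustrative, not a Lambda figure, and actual billing depends on Lambda’s terms.

```python
# Back-of-the-envelope cost estimate at the quoted per-million-token rates.
RATES_PER_MILLION = {
    "llama3.2-3b-instruct": 0.02,        # smaller model, $ per 1M tokens
    "llama3.1-405b-instruct-fp8": 0.90,  # largest model, $ per 1M tokens
}

def estimate_cost(model: str, tokens: int) -> float:
    """Return the estimated dollar cost of processing `tokens` tokens with `model`."""
    return tokens / 1_000_000 * RATES_PER_MILLION[model]

# Example: a workload of 50 million tokens in a month.
for model in RATES_PER_MILLION:
    print(f"{model}: ${estimate_cost(model, 50_000_000):.2f}")
# llama3.2-3b-instruct: $1.00
# llama3.1-405b-instruct-fp8: $45.00
```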