Together AI promises faster inference and lower costs with enterprise AI platform for private cloud


Running AI in the public cloud presents enterprises with numerous concerns about data privacy and security.

That’s why some enterprises choose to deploy AI in a private cloud or on-premises environment. Together AI is among the vendors aiming to help enterprises deploy AI in private clouds cost-effectively. The company today announced its Together Enterprise Platform, enabling AI deployment in virtual private cloud (VPC) and on-premises environments.

Together AI made its debut in 2023, aiming to simplify enterprise use of open-source LLMs. The company already has a full-stack platform to enable enterprises to easily use open source LLMs on its own cloud service. The new platform extends AI deployment to customer-controlled cloud and on-premises environments. The Together Enterprise Platform aims to address key concerns of businesses adopting AI technologies, including performance, cost-efficiency and data privacy.


“As you’re scaling up AI workloads, efficiency and cost matters to companies, they also really care about data privacy,” Vipul Prakash, CEO of Together AI, told VentureBeat. “Inside of enterprises there are also well-established privacy and compliance policies, which are already implemented in their own cloud setups, and companies also care about model ownership.”

How to keep private cloud enterprise AI costs down with Together AI

The key promise of the Together Enterprise Platform is that organizations can manage and run AI models in their own private cloud deployment.

This adaptability is crucial for enterprises that have already invested heavily in their IT infrastructure. The platform offers flexibility by working in private clouds and enabling users to scale to Together’s cloud.

A key benefit of the Together Enterprise platform is its ability to dramatically improve the performance of AI inference workloads. 

“We are often able to improve the performance of inference by two to three times and reduce the amount of hardware they’re using to do inference by 50%,” Prakash said. “This creates significant savings and more capacity for enterprises to build more products, build more models, and launch more features.” 

The performance gains are achieved through a combination of optimized software and hardware utilization.

“There’s a lot of algorithmic craft in how we schedule and organize the computation on GPUs to get the maximum utilization and lowest latency,” Prakash explained. “We do a lot of work on speculative decoding, which uses a small model to predict what the larger model would generate, reducing the workload on the more computationally intensive model.”
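The speculative decoding idea Prakash describes can be sketched in miniature. In the toy below, a cheap "draft" model proposes k tokens and the expensive "target" model verifies them in a single pass, keeping the longest matching prefix plus one token of its own. Both models are deterministic stand-in functions, not real LLMs, and all names here are illustrative, not Together AI's implementation.

```python
VOCAB = ["the", "cat", "sat", "on", "mat"]

def target_model(ctx):
    """Expensive model: the 'true' next token (toy deterministic rule)."""
    return VOCAB[len(ctx) % len(VOCAB)]

def draft_model(ctx, k):
    """Cheap model: mimics the target but makes occasional mistakes."""
    out, c = [], list(ctx)
    for _ in range(k):
        tok = VOCAB[0] if len(c) % 7 == 3 else VOCAB[len(c) % len(VOCAB)]
        out.append(tok)
        c.append(tok)
    return out

def speculative_decode(prompt, max_new=8, k=4):
    tokens = list(prompt)
    target_passes = 0
    while len(tokens) - len(prompt) < max_new:
        proposal = draft_model(tokens, k)
        target_passes += 1  # one target pass scores all k drafts at once
        ctx = list(tokens)
        for tok in proposal:
            if tok != target_model(ctx):
                break           # first mismatch: discard the rest
            ctx.append(tok)     # draft token verified, keep it
        ctx.append(target_model(ctx))  # one token from the target itself
        tokens = ctx
    return tokens[len(prompt):len(prompt) + max_new], target_passes

out, passes = speculative_decode(["the"], max_new=8, k=4)
```

In this toy run, eight tokens are generated with only two target passes instead of eight; the real-world gain depends on how often the draft model agrees with the target.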

Flexible model orchestration and the Mixture of Agents approach

Another key feature of the Together Enterprise platform is its ability to orchestrate the use of multiple AI models within a single application or workflow. 

“What we’re seeing in enterprises is that they’re typically using a combination of different models – open-source models, custom models, and models from different sources,” Prakash said. “The Together platform allows this orchestration of all this work, scaling the models up and down depending on the demand for a particular feature at a particular time.”

There are many different ways that an organization can orchestrate models to work together. Some organizations and vendors will use technologies like LangChain to combine models together. Another approach is to use a model router, like the one built by Martian, to route queries to the best model. SambaNova uses a Composition of Experts model, combining multiple models for optimal outcomes.
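The model-router approach mentioned above can be sketched as a cheap classifier in front of a routing table. Everything here is a hypothetical illustration: the keyword heuristic and model names are made up, and a production router (such as Martian's) would use a learned scoring model rather than keyword matching.

```python
def classify(prompt):
    """Cheap heuristic router: pick a task label from surface features."""
    if "def " in prompt or "code" in prompt.lower():
        return "code"
    if any(ch.isdigit() for ch in prompt):
        return "math"
    return "general"

# Illustrative model names only; a real table would map to deployed endpoints.
ROUTES = {
    "code": "codellama-70b",
    "math": "deepseek-math-7b",
    "general": "llama-3-8b",
}

def route(prompt):
    """Return the model best suited to this prompt, per the classifier."""
    return ROUTES[classify(prompt)]
```

The design point is separation of concerns: the router is far cheaper than any of the models it fronts, so the routing decision adds negligible latency.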

Together AI uses a different approach that it calls Mixture of Agents. Prakash said this approach combines multi-model agentic AI with a trainable system for ongoing improvement. It works by using "weaker" models as "proposers": each provides a response to the prompt. An "aggregator" model then combines these responses to produce a better overall answer.
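The proposer/aggregator pattern described above can be sketched as follows. The "models" here are plain Python functions standing in for LLM calls; the names and behavior are purely illustrative and do not reflect Together AI's actual implementation.

```python
def make_proposer(style):
    """Build a stand-in proposer model with a given answering style."""
    def propose(prompt):
        return f"[{style}] answer to: {prompt}"
    return propose

def aggregator(prompt, proposals):
    """Stand-in aggregator: in a real system this is itself an LLM
    prompted with all proposer outputs; here we just combine them."""
    joined = " | ".join(proposals)
    return f"final answer to '{prompt}' given: {joined}"

def mixture_of_agents(prompt, proposers, aggregate):
    # Each weaker model proposes a response independently...
    proposals = [p(prompt) for p in proposers]
    # ...then the aggregator synthesizes them into one answer.
    return aggregate(prompt, proposals)

proposers = [make_proposer(s) for s in ("concise", "detailed", "cautious")]
result = mixture_of_agents("What is speculative decoding?", proposers, aggregator)
```

Because the aggregation step sees every proposal at once, the final answer can draw on strengths of each weaker model, which is the core claim of the Mixture of Agents design.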

“We are a computational and inference platform and agentic AI workflows are very interesting to us,” he said. “You’ll be seeing more stuff from Together AI on what we’re doing around it in the months to come.”
