Useful information
Prime News delivers timely, accurate news and insights on global events, politics, business, and technology
Hugging Face has achieved a remarkable advance in AI, presenting vision-language models that run on devices as small as smartphones while outperforming predecessors that require massive data centers.
The company's new SmolVLM-256M model, which requires less than a gigabyte of GPU memory, surpasses the performance of its Idefics 80B model from just 17 months ago, a system 300 times larger. This drastic reduction in size and improvement in capability marks a decisive moment for practical AI deployment.
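As a quick sanity check on the headline figure, the parameter counts implied by the model names alone (256M vs. 80B) yield roughly the 300-fold ratio the article cites:

```python
# Back-of-the-envelope check of the "300 times larger" claim,
# using only the parameter counts implied by the model names.
smolvlm_params = 256e6   # SmolVLM-256M
idefics_params = 80e9    # Idefics 80B

ratio = idefics_params / smolvlm_params
print(f"Idefics 80B is ~{ratio:.0f}x larger than SmolVLM-256M")
# ~312x, i.e. roughly the 300-fold difference cited
```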
"When we released Idefics 80B in August 2023, we were the first company to open-source a video language model," said Andrés Marafioti, a machine learning research engineer at Hugging Face, in an exclusive interview with VentureBeat. "By achieving a 300x size reduction while improving performance, SmolVLM marks a breakthrough in vision-language models."
The advance arrives at a crucial moment for companies struggling with the astronomical computing costs of deploying AI systems. The new SmolVLM models, available in 256M and 500M parameter sizes, process images and understand visual content at speeds previously unattainable in their size class.
The smallest version processes 16 examples per second using only 15 GB of RAM with a batch size of 64, which makes it particularly attractive for companies that need to process large volumes of visual data. "For a mid-sized company processing 1 million images per month, this translates into substantial annual savings in compute costs," Marafioti told VentureBeat. "The reduced memory footprint means companies can deploy on cheaper cloud instances, cutting infrastructure costs."
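Taking the throughput figure quoted above (16 examples per second) at face value, a rough estimate of the wall-clock compute time for the 1-million-image monthly workload Marafioti mentions, ignoring I/O and batching overhead, looks like this:

```python
# Rough wall-clock estimate for the workload mentioned in the article,
# assuming the quoted 16 images/sec throughput is sustained end to end
# (real pipelines also pay for I/O, decoding, and batching overhead).
images_per_month = 1_000_000
throughput = 16  # images per second, per the article

seconds = images_per_month / throughput
hours = seconds / 3600
print(f"~{hours:.1f} hours of compute per month")
# ~17.4 hours on a single instance at this rate
```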
The development has already caught the attention of major technology players. IBM has partnered with Hugging Face to integrate the 256M model into Docling, its document processing software. "While IBM certainly has access to substantial compute resources, using smaller models like these lets them efficiently process millions of documents at a fraction of the cost," Marafioti said.
The efficiency gains come from technical innovations in both the vision processing and language components. The team switched from a 400M parameter vision encoder to a 93M parameter version and implemented more aggressive token compression techniques. These changes maintain high performance while drastically reducing computational requirements.
For startups and smaller companies, these advances could be transformative. "Startups can now launch sophisticated computer vision products in weeks instead of months, with infrastructure costs that would have been prohibitive just a few months ago," Marafioti said.
The impact extends beyond cost savings to enabling entirely new applications. The models power advanced document search capabilities through ColPali, an algorithm that creates searchable databases from document files. "They achieve performance very close to that of models 10 times their size while significantly increasing the speed at which the database is created and searched," Marafioti explained.
The advance challenges conventional wisdom about the relationship between model size and capability. While many researchers have assumed that larger models are needed for advanced vision-language tasks, SmolVLM demonstrates that smaller, efficient architectures can achieve similar results. The 500M parameter version achieves 90% of the performance of its 2.2B parameter sibling on key benchmarks.
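Put in per-parameter terms, and again taking the article's figures at face value, the 500M model retains 90% of the larger sibling's benchmark performance with less than a quarter of the parameters:

```python
# Illustrative size-vs-performance comparison using the article's figures.
small_params = 0.5e9   # 500M parameter version
large_params = 2.2e9   # 2.2B parameter sibling
performance_retained = 0.90  # 500M scores 90% of the 2.2B model

size_ratio = large_params / small_params
print(f"The 2.2B model is {size_ratio:.1f}x larger,")          # 4.4x
print(f"yet the 500M model keeps {performance_retained:.0%} "
      f"of its benchmark performance")
```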
Rather than suggesting an efficiency plateau, Marafioti sees these results as evidence of untapped potential: "Until today, the standard was to release VLMs starting at 2B parameters; we thought smaller models weren't useful. We are demonstrating that, in fact, models one-tenth the size can be extremely useful for businesses."
This development comes amid growing concerns about AI's environmental impact and computing costs. By drastically reducing the resources needed for vision-language AI, Hugging Face's innovation could help address both problems while making advanced AI capabilities accessible to a broader range of organizations.
The models are available open source, continuing Hugging Face's tradition of broadening access to AI technology. This accessibility, combined with the models' efficiency, could accelerate the adoption of vision-language AI across industries from healthcare to retail, where processing costs have previously been prohibitive.
In a field where bigger has long meant better, Hugging Face's achievement points toward a future of efficient models that run directly on our devices. As the industry grapples with questions of scale and sustainability, these smaller models could represent its biggest advance yet.