TLDR AI 2024-09-06

Your one-stop shop for AI search solutions (Sponsor)

Algolia generates 1.7 trillion searches a year at 99.999% availability. Our unique end-to-end AI search allows business and development teams to understand users and show them what they need.

Get a peek at our search solutions and take advantage of our interactive tools to evaluate your search and see the impact of Algolia on your business.

17,000+ businesses trust Algolia to empower their user experience with fast, reliable, and scalable AI search solutions. Learn more at algolia.com

🚀

Headlines & Launches

OpenAI Considers $2,000 Monthly Subscription Prices For New LLMs (1 minute read)

OpenAI is reportedly considering subscription prices as high as $2,000 per month for the company's upcoming large language models, like Strawberry and Orion.

Google's AI-Powered Ask Photos Feature Begins US Rollout (2 minute read)

Google Photos' new AI-powered search feature, "Ask Photos," is rolling out to select users in the U.S., allowing them to search their photos using more complex natural language queries.

Alibaba releases new AI model Qwen2-VL that can analyze videos more than 20 minutes long (5 minute read)

Alibaba Cloud has released Qwen2-VL, a new vision-language model with enhanced visual understanding, video comprehension, and multilingual text-image processing. Qwen2-VL shows superior performance against models like Meta's Llama 3.1 and OpenAI's GPT-4o and supports various applications, including real-time video analysis and tech support. The models come in three sizes (2B, 7B, and soon 72B), with the smaller variants open source under Apache 2.0.

🧠

Research & Innovation

SGLang 0.3 (12 minute read)

The latest SGLang release brings inference improvements, including 7x faster DeepSeek MLA, 1.5x faster torch.compile, multi-image/video LLaVA-OneVision support, and more.

OLMo MoE (27 minute read)

A strong open mixture-of-experts (MoE) model with best-in-class performance for 1B activated parameters.

Aligning Style and Text in Image Generation (26 minute read)

This paper introduces StyleTokenizer, a method for improving style control in text-to-image generation by aligning style representation with text prompts.

🧑‍💻

Engineering & Resources

💪 Small models, massive performance with OctoAI (Sponsor)

OctoAI empowers engineers to leverage small open-source models like Llama 3.1-8B, outperforming GPT-4o and dramatically reducing costs for enterprise tasks.

The team just released a new tutorial to teach you how to:

Apply advanced prompt engineering to slash expenses.
Use parameter-efficient fine-tuning for optimal performance.
Achieve GPT-4o quality with Llama 3.1-8B at a fraction of the cost.

Read the tutorial and get $10 in free credits on OctoAI super-performant endpoints right now.

Cornell's Applied ML Class (GitHub Repo)

Open resources for the Fall 2024 Applied ML class at Cornell.

Laminar (GitHub Repo)

Open-source observability, analytics, evals, and prompt chains for complex LLM apps.

Long-Context Understanding with LongLLaVA (GitHub Repo)

LongLLaVA is a multimodal model designed for handling long-context tasks like video and high-resolution image understanding.

🎁

Miscellaneous

Maturing Enterprise AI Infrastructure (16 minute read)

An interview with the CEO of BentoML about improving your enterprise AI tooling so that it can scale without being over-engineered at the start.

LLM-Based Embedding Models (GitHub Repo)

This study investigates various designs for LLM-based embedding models, comparing different pooling and attention strategies.

Optical connectivity directly in the GPU (12 minute read)

GPU interconnect bandwidth is one of the primary bottlenecks in training large models today. Broadcom is working to integrate optical connectivity directly into GPUs, which would substantially alleviate the issue.

⚡

Quick Links

YouTube Is Making Tools To Detect Face And Voice Deep Fakes (1 minute read)

YouTube is developing new tools to protect artists and creators from unauthorized use of their likenesses, including AI-generated face and singing voice detection technology, with pilot programs launching early next year.

Icon (Product Launch)

Icon helps brands partner with creators, turn 1 video into 20 videos with AI, and A/B test messaging to find winning ads.

Google is working on AI that can hear signs of sickness (1 minute read)

Google is using AI models trained on 300 million audio samples to detect early signs of diseases like tuberculosis.

Thanks for reading,
Andrew Tan & Andrew Carr
