TLDR AI 2024-09-20

Your online privacy matters. Take back control with Incogni (Sponsor)

If you don't mind having your personal data available to every spammer, scammer, and bad actor who's willing to pay for it, skip this ad.

Still here? Check out Incogni — it's the hassle-free way to protect your data privacy:

Incogni scans people search sites for your personal information and sends removal requests on your behalf.
Within about 14 days, your records are off the dark corners of the internet.
Every 10 days, Incogni does it all over again.
You stay in the loop with regular privacy reports.

Take back control. Reduce spam, scam, and cyber risk.

Get 60% off Incogni with code TLDRAI (30 day money back guarantee)

🚀

Headlines & Launches

Snap is introducing an AI video-generation tool for creators (2 minute read)

Snapchat has announced a new AI video-generation tool for select creators that enables video creation from text prompts and, soon, image prompts. The tool, powered by Snap's foundational video models, will be available in beta on the web. Snap aims to compete with companies like OpenAI and Adobe but has not shared output examples yet.

Apple Intelligence is now available in public betas (2 minute read)

Apple has released public betas of iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1 that feature new Apple Intelligence tools like text rewriting and photo cleanup. Only the iPhone 15 Pro, iPhone 16, iPhone 16 Pro, and iPads and Macs with M1 or later chips support these AI features. Final versions are expected in October.

Cruise robotaxis return to the Bay Area nearly one year after pedestrian crash (2 minute read)

Cruise is resuming operations in Sunnyvale and Mountain View, with human-driven vehicles for mapping and plans to progress to supervised AV testing later this fall. This follows a settlement and leadership change after an October 2023 crash. Cruise has issued software updates and signed a partnership with Uber for robotaxi services starting in 2025.

🧠

Research & Innovation

V-STaR: Training Verifiers for Self-Taught Reasoners (31 minute read)

V-STaR is a novel approach to improving large language models that utilizes both correct and incorrect solutions generated during self-improvement to train a verifier, which then selects the best solution at inference time. The method has shown significant improvements in accuracy on code generation and math reasoning benchmarks compared to existing approaches, potentially offering a more efficient way to enhance LLM performance.

Fast 3D Generation from Single Images (31 minute read)

Vista3D is a new framework that generates 3D models from a single image in just 5 minutes. Using a two-phase approach, it quickly forms rough geometry before refining the details, capturing both visible and hidden aspects of objects for more complete 3D reconstructions.

Heart Monitoring from Facial Videos (GitHub Repo)

PhysMamba is a new framework designed for remote heart monitoring via facial videos, addressing challenges in capturing long-range physiological signals.

🧑‍💻

Engineering & Resources

AIAI Boston: the East Coast's most significant summit for applied AI's builders & execs. 🚀 (Sponsor)

Uniting engineering teams & tech leadership unleashing the LLM revolution, AIAI Boston returns on October 16-18.

3 co-located summits. 500+ attendees. CXO speakers from Runway, NVIDIA, Takeda, Optum.

Leaders ➡️ apply for your Chief AI Officer Summit pass.

Engineers ➡️ explore Generative AI Summit & Computer Vision Summit.

GOT OCR (GitHub Repo)

An impressive advance in general-purpose optical character recognition (OCR) that reads text from images with strong performance. This version also substantially improves OCR on in-the-wild images.

Fish Speech (GitHub Repo)

Powerful voice generation and single-shot voice cloning. Completely open source and easy to get running.

1X Genie (GitHub Repo)

Genie is a video generation model for world model systems. 1X has open-sourced a version that mirrors the one it trained internally.

🎁

Miscellaneous

OpenAI Says It's Fixed Issue Where ChatGPT Appeared to Be Messaging Users Unprompted (3 minute read)

A Reddit user reported that OpenAI's ChatGPT initiated a conversation unprompted, leading to speculation about new engagement features. OpenAI acknowledged the issue and issued a fix, attributing it to a glitch with unsent messages. Debate continues over the authenticity of the incident, with similar reports from other users.

Announcing Pixtral 12B (8 minute read)

Pixtral 12B excels in multimodal tasks while maintaining state-of-the-art performance on text-only benchmarks, and it supports variable image sizes in a 128K token context window. Its architecture includes a new 400M parameter vision encoder and a 12B parameter multimodal decoder based on Mistral Nemo. Pixtral outperforms many open and closed models in multimodal reasoning and instruction following without compromising on text capabilities.

Scaling: The State of Play in AI (13 minute read)

LLMs like ChatGPT and Gemini are becoming increasingly capable as they scale up in size, data, and computing power, leading to improved performance across various tasks. Current Gen2 models like GPT-4 and Claude 3.5 are leading the market, with upcoming Gen3 models expected to further escalate capabilities and costs. The discovery of a new scaling law in AI, pertaining to increased "thinking" during inference, promises further advancements in AI performance beyond just model training.

⚡

Quick Links

Overlap (Product Launch)

Overlap (YC S24) is a new AI-powered iOS app that curates the best short video clips on literally any topic you're interested in - built for those quick work or study breaks.

Mistral launches a free tier for developers to test its AI models (2 minute read)

Mistral AI has launched a free tier that lets developers fine-tune its models and build test apps with them, and it has slashed API prices by over 50%.

A Promptable Retrieval Model (GitHub Repo)

Promptriever is the first retrieval model that can be prompted like a language model.

Love TLDR? Tell your friends and get rewards!

Share your referral link below with friends to get free TLDR swag!

https://refer.tldr.tech/6d412934/2

Track your referrals here.

Want to advertise in TLDR? 📰

If your company is interested in reaching an audience of AI professionals and decision makers, you may want to advertise with us.

If you have any comments or feedback, just respond to this email!

Thanks for reading,
Andrew Tan & Andrew Carr


Email Details

Apple Intelligence in public beta 📱, Cruise returns to SF 🌉, Snap AI video generation 📹
