- The Neural Frontier
- Posts
- OpenAI Unveils GPT-4.5 š!
OpenAI Unveils GPT-4.5 š!
Also: Anthropic launches its first hybrid reasoning model, while Figure plans to start alpha testing its humanoid robot in the home in 2025 š¤.
Last week, xAI released Grok 3. This week, we get to double dip: Anthropic released Claude 3.7 Sonnet, and OpenAI debuted the highly anticipated GPT-4.5.
Hello forward thinkers, we have a lot to discuss today!
In this weekās issue, weāre diving into two pivotal model releases: one from OpenAI and one from Anthropic. Also, weāll cover Figureās recent update regarding its alpha tests for humanoid robots.
We know itās a pretty stacked lineup, so letās jump right in! š
In a rush? Here's your quick byte:
š¤ OpenAI unveils GPT 4.5!
š Anthropic launches its first hybrid reasoning model.
š¦¾ Figure plans to start alpha testing its humanoid robot in the home in 2025!
š AI Reimagines: Star signs as mythical warriors!
šÆ Everything else you missed this week.
ā” The Neural Frontierās weekly spotlight: 3 AI tools making the rounds this week.
Source: OpenAI
OpenAI has unveiled GPT-4.5, its latest and largest LLM, designed to enhance world knowledge, writing abilities, and computational efficiency.
GPT-4.5 brings a more natural conversational experience and reduces hallucinations, but itās not considered a frontier AI model like OpenAIās o-series.
Here are the essentials:
š Whatās New in GPT-4.5? OpenAI describes GPT-4.5 as its "most knowledgeable model yet," refining previous capabilities while introducing new training optimizations:
More Natural Conversations: GPT-4.5 has improved ability to follow user intent, interpret subtle cues, and engage in warm, intuitive conversations.
Fewer Hallucinations: OpenAI says GPT-4.5 produces more accurate responses compared to GPT-4o and slightly outperforms o1 in this regard.
Larger Knowledge Base: It recognizes patterns and connections better than its predecessors, making it more effective at writing, programming, and problem-solving.
More Efficient Training: Built on Microsoft Azure AI supercomputers, GPT-4.5 benefits from 10x improved computational efficiency over GPT-4.
However, OpenAI does not consider GPT-4.5 a frontier AI model, stating that it doesnāt introduce enough new capabilities to meet that threshold.
š§ How GPT-4.5 Differs from OpenAIās Other Models: GPT-4.5 takes a different path from OpenAIās reasoning-focused models, such as o1 and o3-mini:
GPT-4.5 ā Scales unsupervised learning, focusing on pattern recognition and world knowledge.
o1 & o3-mini ā Scale reasoning, breaking down complex problems into logical steps before responding.
While reasoning models self-check answers before responding, GPT-4.5 relies on its expansive training data to deliver more intuitive and fluid responses.
š Performance vs. Other OpenAI Models: GPT-4.5 outperforms other OpenAI models like o1, o3-mini, and GPT-4o in factual accuracy and hallucination rates.
Source: OpenAI
š Better Human Collaboration & Emotional Intelligence: One major improvement with GPT-4.5 is its "EQ" (emotional intelligence), allowing for more empathetic responses, as seen below.
Source: OpenAI
Overall, GPT-4.5 adapts its tone based on the situation, knowing when to engage in deeper conversation and when to provide structured guidance.
GPT-4.5 is currently available to ChatGPT Pro users. It will start rolling out to Plus and Team users by next week, and Enterprise and Edu users can expect it later. Itās also available on Microsoftās Azure AI Foundry, alongside models from Stability AI, Cohere, and Microsoft.
Source: Anthropic
Anthropic has launched Claude 3.7 Sonnet, a frontier AI model designed to offer both real-time responses and deep reasoning capabilities. Unlike traditional AI models, users can now control how long Claude "thinks" before answering, making it the first hybrid reasoning model in the industry.
Hereās what you need to know:
š What Makes Claude 3.7 Sonnet Unique? Claude 3.7 Sonnet introduces user-controlled reasoning, allowing users to toggle between quick responses and in-depth, step-by-step thinking. This approach eliminates the need to switch between different AI models based on complexity. Overall, this model features:
Hybrid Reasoning: Users can choose how long Claude thinks about a question, optimizing for either speed or accuracy.
Better Performance Compared to Past Claude Models: Claude 3.7 Sonnet beats its predecessor, Claude 3.5 Sonnet, across reasoning, coding, and automation tasks.
A Larger Context Window: Supports up to 200K tokens, making it one of the most contextually aware models available.
Fewer Unnecessary Refusals: Reduces "overcautious" AI responses by 45%, ensuring better user experience.
š§ How Hybrid Reasoning Works: Claude 3.7 Sonnet blends two AI approaches into one model:
Real-Time Mode ā Instant answers, optimized for casual queries, conversations, and simple lookups.
Reasoning Mode ā Engages in deeper analysis, solving math, coding, and complex logic problems by breaking them down step-by-step.
Anthropic also introduced a "Visible Scratch Pad," which allows users to see Claudeās reasoning process in real time.
š Performance vs. Other AI Models: Claude 3.7 Sonnet outperforms OpenAI and DeepSeek models in reasoning benchmarks, as seen below:
Source: Anthropic
Anthropic says these real-world tests prove Claude 3.7 Sonnetās strength in coding, automation, and AI-assisted workflows.
š Introducing Claude Code: Alongside Claude 3.7 Sonnet, Anthropic is launching Claude Code, a research preview of an AI-powered coding assistant that lets developers:
š» Modify codebases with natural language commands.
š Analyze project structures for better organization.
š Push code to GitHub directly from Claude.
š Availability & Pricing: Claude 3.7 Sonnet is rolling out to Claude Pro users and developers. Free users get standard responses, while premium users unlock reasoning mode.
Regarding pricing, Anthropic has provided the following information:
$3 per million input tokens (~750,000 words per $3)
$15 per million output tokens
Up to 90% cost savings via prompt caching
Moving forward, Anthropic aims to refine Claude 3.7 Sonnetās hybrid reasoning to automatically decide when to "think" longerāremoving the need for manual toggling. Meanwhile, OpenAI is rumored to be working on its own hybrid AI model, with CEO Sam Altman hinting at a similar system launching "in months."
With Claude 3.7 Sonnet, Anthropic is setting the bar for AI adaptabilityābut the competition is hot on the trail.
Source: Figure
š¤ Figureās Humanoid Robot Will Begin Home Testing in 2025: Figure is moving faster than expected in bringing its Figure 02 humanoid robot into homes. CEO Brett Adcock announced that alpha testing for home use will begin later in 2025, thanks to Helix, the companyās new Vision-Language-Action (VLA) model.
Hereās the lowdown:
āļø Why This Matters: Most robotics companiesāincluding Tesla and Apptronikāare prioritizing factories and warehouses before tackling home environments. Figure, however, is accelerating its roadmap, testing humanoids for home use earlier than expected.
š§ Helix ā The AI That Makes It Possible: The Helix model, announced just last week, enables robots to process both visual data and language. This allows Figure 02 to:
Quickly learn new household tasks like food preparation.
Coordinate multiple robots working on a single task.
Adapt to unstructured environments, like homes with clutter, stairs, pets, and children.
Figure had initially partnered with OpenAI but cut ties in favor of its own AI models, allowing full vertical integration of hardware and AI.
š” Challenges of Home Robotics: The home remains one of the most difficult environments for robots due to unpredictable layouts (stairs, different surfaces, lighting), children & pets (fast-moving obstacles), and cost concerns (pricing remains a major unknown).
Norwegian startup 1X is one of the few firms focusing on home robotics, but most competitors remain hesitant due to these challenges.
Figureās 2025 home tests will be in early āalphaā stages, meaning limited trials with controlled testing and continued refinements before potential consumer deployment. And while some claim that this timeline is too ambitious, others remain hopeful that Figureās robots will prove practical for home use.
Source: u/Tizzlefoshizzle123 via Reddit
Yep, thatās Aquarius!
As you can probably tell, this weekās showcase is all about the celestial bodies! From Aquarius to Capricorn, letās dive into a mythical re-imagination of your favorite star signs! š«
šÆ Everything else you missed this week.
Source: Amazon
š§Ø Meta fires āroughly 20ā employees for leaking confidential information.
šļø Airbnb co-founder, Joe Gebbia, joins President Trumpās Department of Government Efficiency (DOGE).
š Waymo doubles its weekly robotaxi rides in < 1 year, logging over 200,000 paid rides by the week!
ā” The Neural Frontierās weekly spotlight: 3 AI tools making the rounds this week.
Source: ChatGPT Image Generator
1. šØ Forage offers an AI-powered email management solution that works directly within Gmail, without requiring users to switch to a new app. The platform focuses on automatic filtering of low-priority emails, daily summary of filtered messages, and AI-generated TLDRs of newsletters.
2. š Coursebox has helped create 20,000 courses in 180+ countries. It stands out by automating traditionally time-consuming tasks like grading and learner support, allowing course creators to focus on content quality.
3. š§Ŗ Basalt stands out by addressing the entire AI feature development lifecycle, from ideation to monitoring. It targets development teams frustrated by the reliability gap between AI prototypes and production features, providing structured tools to test, iterate, and deploy AI capabilities with confidence.
Could this be the year of AGI?
Thatāll surely make the headlines. Granted, itās not looking very likely this year, as some experts even claim weāre years away.
However, with the rate of product releases from AI giants, one canāt help but wonder if this is the year. Regardless of where you stand, only time will tell.
Till then, you can expect us in your inbox, same time, every single week. š¤©
PS: Please hit that Subscribe button to keep receiving content like this š.