iTDAY

DeepSeek and Tsinghua Developing Self-Improving AI Models

by Hana.haghani
2025-04-07
in Technews

Just a few months ago, Wall Street’s heavy bet on generative AI faced a moment of reckoning with the arrival of DeepSeek. Despite its tightly controlled capabilities, the open-source DeepSeek showed that a cutting-edge reasoning AI model can be built with modest resources rather than billions in funding.

DeepSeek quickly gained traction among industry giants like Huawei, Oppo, and Vivo, while major players such as Microsoft, Alibaba, and Tencent integrated it into their platforms. The company’s latest ambition focuses on developing self-improving AI models that employ a looping judge-reward mechanism to enhance their performance.
In a preprint paper (cited by Bloomberg), researchers from DeepSeek and Tsinghua University in China outline a new approach aimed at making AI models smarter and more efficient by letting them improve themselves. The technique is called self-principled critique tuning (SPCT), and it builds on an approach known as generative reward modeling (GRM).
In simple terms, the idea resembles a real-time feedback loop. Typically, making an AI model smarter means scaling up its size during training, which demands extensive human effort and computing power. DeepSeek instead proposes a system in which a built-in “judge” produces its own principles and critiques for the answers the AI model gives to user queries.
This set of critiques and principles is then compared against the static rules built into the AI model and against the expected outcome. If the two align closely, a reward signal is generated, guiding the AI toward better performance in the next cycle.
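
The judge-and-reward loop described above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions, not DeepSeek's implementation: the names `judge`, `reward_signal`, and `STATIC_RULES`, the keyword-based "principles," and the alignment threshold are all hypothetical stand-ins for what would be learned models in a real system. Here the reward is used only to rank candidate answers; in actual training it would feed a model update.

```python
from dataclasses import dataclass


@dataclass
class Judgement:
    principles: list  # standards the judge wrote for this query
    critique: dict    # principle -> True if the response satisfied it


def judge(query, response):
    """Hypothetical judge: write principles for the query, then critique the response."""
    # Toy principle generation: every longer word in the query must be addressed.
    words = [w.strip("?.,!").lower() for w in query.split()]
    principles = [w for w in words if len(w) > 4]
    text = response.lower()
    critique = {p: (p in text) for p in principles}
    return Judgement(principles, critique)


def reward_signal(judgement, response, static_rules, threshold=0.75):
    """Return 1.0 when the critique aligns strongly with the rules, else 0.0."""
    # The fixed rules must all hold before the critique is even considered.
    if not all(rule(response) for rule in static_rules):
        return 0.0
    if not judgement.principles:
        return 0.0
    met = sum(judgement.critique.values()) / len(judgement.principles)
    return 1.0 if met >= threshold else 0.0


# Static rules: fixed checks that do not depend on the query (assumed examples).
STATIC_RULES = [
    lambda r: len(r) > 0,
    lambda r: len(r) < 500,
]

query = "What causes model collapse?"
candidates = [
    "It depends.",
    "Model collapse has several causes, mainly training on self-generated data.",
]
rewards = [reward_signal(judge(query, c), c, STATIC_RULES) for c in candidates]
best = candidates[rewards.index(max(rewards))]
```

In this sketch the second candidate earns the reward because it satisfies every principle the judge derived from the query, while the first satisfies none.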

The researchers call the next generation of these self-improving AI models DeepSeek-GRM. According to the benchmarks in their paper, these models outperform Google’s Gemini, Meta’s Llama, and OpenAI’s GPT-4. DeepSeek plans to release them as open source.

The prospect of AI systems capable of self-improvement has sparked ambitious and contentious discussions. Former Google CEO Eric Schmidt has even suggested the necessity of a kill switch for such technologies, stating, “When the system can self-improve, we need to seriously think about unplugging it,” as reported by Fortune.

The idea of recursively self-improving AI is not new; it dates back to mathematician I.J. Good in 1965, who posited the concept of an ultra-intelligent machine capable of creating even more advanced machines. In 2007, AI specialist Eliezer Yudkowsky envisioned Seed AI, an AI designed for self-understanding, self-modification, and recursive self-improvement.
In 2024, Japan’s Sakana AI introduced the notion of an “AI Scientist,” a system capable of managing the entire research paper process. In March this year, researchers at Meta unveiled self-rewarding language models where the AI itself acts as a judge to issue rewards during training.
Meta’s internal evaluations of the Llama 2 model using this self-rewarding technique showed that it surpassed competitors like Anthropic’s Claude 2, Google’s Gemini Pro, and OpenAI’s GPT-4 models. Additionally, Amazon-backed Anthropic described what it termed reward-tampering, an unexpected behavior in which a model modifies its own reward mechanism.
Google has also ventured into this territory. In a recent study published in the journal Nature, experts from Google DeepMind presented an AI algorithm called Dreamer, which can self-improve, using Minecraft as a testbed.
At IBM, researchers are developing a method known as deductive closure training, where an AI model assesses its own outputs against training data to facilitate self-improvement. However, this entire premise is not without its challenges.
Research indicates that when AI models train themselves on self-generated synthetic data, their quality can degrade, a failure known as “model collapse.” It will be interesting to see how DeepSeek implements this concept and whether it can achieve this goal more efficiently than its Western counterparts.

© 2025 itDay - All rights reserved for the website of the latest technologies in the World.
