How MIT’s new AI framework could reshape machine learning
Researchers at the Massachusetts Institute of Technology (MIT) have developed a new framework that enables large language models (LLMs) not just to use data, but to actively adapt and learn from it as they work, Kazinform News Agency correspondent reports, citing VentureBeat.
The framework is called SEAL - Self-Adapting Language Models. Unlike conventional approaches such as fine-tuning on pre-collected datasets or in-context learning, SEAL allows a model to generate its own training examples and instructions for updating its internal parameters. In other words, the model does not just adjust to new tasks - it can retain new knowledge by updating its internal structure.
How SEAL works
At the heart of SEAL is reinforcement learning. The model learns to create self-edits - textual instructions that guide it in modifying its internal parameters. This process is similar to the model writing its own textbook: rather than simply reading data, it reformats the information into a version optimized for learning.
Training happens in two phases: first, the model makes a small update to its weights based on a self-generated instruction (the inner loop); then the system checks whether its task performance has improved (the outer loop). If the update proves effective, it is kept; if not, it is discarded. Over time, the model becomes more effective at teaching itself.
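Schematically, that keep-or-discard cycle might look something like the toy sketch below. The helper functions are hypothetical stand-ins for the model, the weight update and the benchmark, not the researchers’ code:

```python
import copy
import random

def generate_self_edit(model, context):
    # Hypothetical stand-in: the model rewrites `context` into training
    # material plus update directives (e.g. synthetic examples, a learning rate).
    return {"examples": [context.upper()], "lr": random.choice([1e-5, 5e-5])}

def apply_update(model, self_edit):
    # Inner loop: a small, tentative update based on the self-edit.
    updated = copy.deepcopy(model)
    updated["knowledge"].extend(self_edit["examples"])
    return updated

def evaluate(model):
    # Outer loop signal: task performance after the tentative update.
    return len(model["knowledge"]) + random.random()

model = {"knowledge": []}
best_score = evaluate(model)

for context in ["new document 1", "new document 2"]:
    candidate = apply_update(model, generate_self_edit(model, context))
    score = evaluate(candidate)
    if score > best_score:
        model, best_score = candidate, score   # keep the update
    # otherwise the tentative update is simply discarded
```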
Interestingly, SEAL’s architecture can be split into two parts: one AI module acts as a “teacher,” generating self-edits, while the other acts as a “student,” updating itself based on those instructions. This setup may prove especially valuable in enterprise applications that require highly specialized training workflows.
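A rough way to picture that split, with both roles reduced to placeholder classes (purely illustrative, not the paper’s implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Teacher:
    """Generates self-edits: instructions and data for the student to train on."""
    def propose_edit(self, document: str) -> list[str]:
        # Hypothetical: break the document into bite-sized training statements.
        return [s.strip() for s in document.split(".") if s.strip()]

@dataclass
class Student:
    """Updates itself according to the teacher's self-edits."""
    memory: list[str] = field(default_factory=list)

    def apply_edit(self, facts: list[str]) -> None:
        # Stand-in for a parameter update driven by the instructions.
        self.memory.extend(facts)

teacher, student = Teacher(), Student()
student.apply_edit(teacher.propose_edit("SEAL generates self-edits. The student trains on them."))
print(student.memory)
```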
From theory to practice
The SEAL framework was tested in two areas: integrating new knowledge and learning from a small number of examples.
In the first case, the model was tasked with memorizing facts from a text and answering questions without having access to the original material. Traditional fine-tuning led to only minor improvements, while SEAL (through the generation of implications and synthetic examples) boosted answer accuracy to 47%. Notably, this result outperformed similar attempts using the more powerful GPT-4.1.
In the second case, the model tackled visual problems from the Abstraction and Reasoning Corpus (ARC) - a benchmark designed to test AI’s ability to reason abstractly and generalize from limited data. Here, the model had to not only find the correct answers but also develop its own learning strategy: which data to use, how to reformat it, and what learning pace to follow. With SEAL, the model reached an accuracy of 72.5%. Without reinforcement learning, performance was four times lower, and standard in-context learning produced no meaningful results at all.
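In that setting, a self-edit is less a piece of text to memorize and more a learning recipe. Purely as an illustration, and with field names that are assumptions rather than the paper’s schema, such a recipe might look like this:

```python
# Illustrative shape of a "learning strategy" self-edit for a few-shot task.
# All field names and values are hypothetical.
self_edit = {
    "use_examples": [0, 2, 3],            # which of the few demonstrations to train on
    "augmentations": ["rotate", "flip"],   # how to reformat / expand the data
    "epochs": 3,                           # how long to train
    "learning_rate": 1e-4,                 # the "learning pace"
}
```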
Outlook and limitations
According to the researchers, a shortage of high-quality training data will soon become a major obstacle to the advancement of AI. SEAL offers a partial solution: it enables models to generate their own useful training signals. For example, an AI system could read a scientific paper and produce hundreds of explanations and takeaways to deepen its understanding of the subject.
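As a minimal sketch of that idea, one could imagine prompting any text-generation model to restate a passage as standalone takeaways and then fine-tuning on the output. The function below is illustrative and uses a stub in place of a real model:

```python
from typing import Callable

def generate_implications(llm: Callable[[str], str], passage: str, n: int = 5) -> list[str]:
    """Ask a model to restate a passage as standalone implications that can
    later serve as extra fine-tuning examples (the self-generated training signal)."""
    prompt = (
        f"Passage:\n{passage}\n\n"
        f"List {n} standalone implications or takeaways of this passage."
    )
    return [line for line in llm(prompt).splitlines() if line.strip()]

# Stub "model" so the snippet runs without any API; a real client could be swapped in here.
fake_llm = lambda prompt: "Models can write their own training data.\nThis reduces reliance on scarce human data."
print(generate_implications(fake_llm, "SEAL lets a model generate its own training examples."))
```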
However, the method also comes with limitations. Frequent updates may lead to what is known as catastrophic forgetting - the loss of previously acquired knowledge. To address this, the researchers suggest a hybrid approach: integrating core knowledge through SEAL while storing factual or frequently changing information in external memory.
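One simple reading of that hybrid, sketched with an ordinary dictionary standing in for the external memory (the names and routing logic are assumptions made for illustration):

```python
# Volatile facts live outside the weights and are looked up at answer time;
# only stable knowledge would be folded into the model via self-edits.
external_memory = {"ceo of acme": "Jane Doe (updated 2025-06-01)"}

def model_answer(question: str) -> str:
    # Stand-in for the model's internal, self-edit-updated knowledge.
    return "answer drawn from the model's weights"

def answer(question: str) -> str:
    for key, fact in external_memory.items():
        if key in question.lower():
            return fact            # fast-changing fact: retrieved, not fine-tuned in
    return model_answer(question)  # stable knowledge: lives in the weights

print(answer("Who is the CEO of Acme?"))
```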
There are also practical constraints. Real-time editing of a model’s parameters is not yet feasible. Instead, the proposed solution is to use delayed learning cycles: the model collects data throughout the day and updates itself at set intervals.
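A deliberately tiny sketch of such a delayed cycle, assuming requests are only buffered while the model is serving and a costly update runs on a schedule; all names here are hypothetical:

```python
import time

buffer: list[str] = []
UPDATE_INTERVAL_S = 24 * 60 * 60   # e.g. one update cycle per day
last_update = time.time()

def handle_request(text: str) -> None:
    buffer.append(text)            # cheap: just collect material during the day

def run_update_cycle() -> None:
    global last_update
    # Stand-in for generating self-edits from the buffer and updating the weights.
    print(f"updating on {len(buffer)} buffered examples")
    buffer.clear()
    last_update = time.time()

handle_request("a question that exposes a gap in the model's knowledge")
if time.time() - last_update >= UPDATE_INTERVAL_S:
    run_update_cycle()
```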
Earlier, Kazinform News Agency reported on how ChatGPT may be weakening our minds.