The Rise of ORCA: A New AI Model That Learns From GPT-4's Explanations
Microsoft has released a new research paper on ORCA, a revolutionary AI model that learns from the complex explanation traces of GPT-4, the most powerful language model in the world. This is a huge deal, and I'm going to tell you why in this article. What is ORCA, and why is it so important?
What is ORCA?
ORCA is a 13-billion-parameter model that learns from the complex explanation traces of GPT-4, a much bigger model that can generate almost any kind of text you can imagine.
Why would Microsoft want to create a smaller model that learns from a bigger model? Isn't bigger always better when it comes to AI?
Not exactly. Bigger models are more powerful, but they also have some drawbacks. They are very expensive to train and run, they require a lot of computing resources and energy, and they are not very accessible to most researchers and developers. That's why there has been a lot of interest in creating smaller models that can still perform well on various tasks, such as answering questions, summarizing texts, generating captions, and so on. These smaller models are usually fine-tuned on specific data sets or instructions to make them more specialized and efficient.
The Problem with Smaller Models
There is a problem with this approach. Smaller models tend to have poor reasoning and comprehension skills compared to bigger models. They often make mistakes or give irrelevant answers when faced with complex or ambiguous queries, and they lack the ability to explain how they arrived at their answers or what steps they took to solve a problem. But ORCA is not just another smaller model that imitates a bigger model.
How ORCA Works
ORCA is a smaller model that learns from the reasoning process of a bigger model: it learns from the explanations that GPT-4 gives when it generates its answers. These explanations are not just simple sentences or phrases. They are detailed traces of how GPT-4 thinks step by step, how it uses logic and common sense, how it connects different pieces of information, and how it simplifies complex concepts. By learning from GPT-4's explanations, ORCA becomes much more capable and intelligent than other models of its size. It can handle more diverse and challenging tasks, it can give more accurate and relevant answers, and it can also explain its own reasoning process to humans. This is a huge breakthrough for open-source AI. ORCA is set to be open source soon, which means anyone will be able to use it and build upon it. It will enable more people to access the power of GPT-4 without having to pay for it or deal with its limitations, especially in areas that require more reasoning and understanding skills. ORCA will also open up new possibilities for AI research and development. To understand how ORCA works, we first need to understand how GPT-4 works.
How GPT-4 Works
GPT-4 is more than a text generator. It performs tasks requiring reasoning, like answering factual questions, summarizing lengthy texts, generating captions, writing essays, and more. Interestingly, GPT-4 can provide explanations for its outputs, laying out the logic and information it used to generate them. By using specific prompts, we can elicit these explanations, giving a detailed view of how GPT-4 thinks, solves problems, and draws on diverse sources of information, including its learned knowledge and common sense. These explanations are very valuable for smaller models that want to learn from GPT-4. They provide more signals and guidance for how to perform various tasks and how to reason about and understand different concepts. They also make the learning process more transparent and interpretable for humans. This is what ORCA does.
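To make this concrete, here is a minimal sketch of how one might prompt GPT-4 to expose its reasoning, assuming access through the OpenAI Python SDK; the system message wording is illustrative, not the exact phrasing used for ORCA:

```python
# Minimal sketch: eliciting an explanation trace from GPT-4.
# Assumes the OpenAI Python SDK (v1.x) and an API key in OPENAI_API_KEY.
# The system message wording is illustrative, not taken from the ORCA paper.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        # The system message asks the model to expose its step-by-step reasoning.
        {"role": "system",
         "content": "Explain your reasoning step by step before giving the final answer."},
        {"role": "user", "content": "Who leads France?"},
    ],
)
print(response.choices[0].message.content)  # step-by-step trace plus the answer
```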
ORCA learns from these explanations that GPT-4 generates when it performs different tasks. It uses these explanations as its training data and tries to imitate them as closely as possible. ORCA also tries to generate its own explanations when it performs similar tasks, and compares them with GPT-4's explanations to improve itself. ORCA is actually based on Vicuna, a previous open-source model that was fine-tuned on question-answer pairs from GPT-3.5. ORCA extends Vicuna by adding a new technique called explanation tuning, which allows it to learn from the complex explanation traces of GPT-4, as sketched below.
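As a rough illustration, here is a minimal sketch of that imitation step under some loud assumptions: a Hugging Face causal language model stands in for the Vicuna base, and a toy list of (prompt, trace) records stands in for the real explanation-tuning data; neither matches the paper's actual training setup:

```python
# Minimal sketch of training on explanation traces with a causal-LM objective.
# The checkpoint name and the toy "dataset" are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "lmsys/vicuna-13b-v1.5"  # assumed stand-in for ORCA's Vicuna base
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Each record pairs a task prompt with GPT-4's full explanation trace.
records = [
    {"prompt": "Who leads France? Think in steps.",
     "trace": "Step 1: France is a republic headed by a president. ..."},
]

model.train()
for record in records:
    text = record["prompt"] + "\n" + record["trace"] + tokenizer.eos_token
    batch = tokenizer(text, return_tensors="pt")
    # Standard next-token loss: the model learns to reproduce the trace.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```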
Explanation Tuning
Explanation tuning is a fresh approach that builds on standard instruction tuning. By fine-tuning a model on prompts like "summarize this in a sentence" or "create a love haiku," we make it more adept at particular tasks. But explanation tuning goes beyond that: it uses prompts like "think sequentially" or "explain like you are a child" to get GPT-4 to reveal its thought process, making its reasoning more transparent. The technique involves two kinds of prompts: standard prompts and explanation prompts. The former are usual tasks, like "Who leads France?" or "Craft a winter poem." The latter instruct GPT-4 to clarify its logic, like "think in steps" or "show how you did it." Using both prompt types together, GPT-4 produces complex explanation traces. For instance, with the standard prompt "Who leads France?" and the explanation prompt "think in steps," GPT-4 might provide a step-by-step explanation. This comprehensive response not only tells us who the president is but also illustrates GPT-4's problem-solving strategy and information sources, offering more insight than a simple answer. ORCA leverages these explanation traces as learning material, striving to mimic them and to generate its own for improvement.
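Here is a minimal sketch of how these two prompt types might be combined into queries for GPT-4; the prompt strings come from the examples above, while the pairing logic is an illustrative assumption:

```python
# Minimal sketch: pairing standard (task) prompts with explanation prompts.
# Each pair becomes one GPT-4 query whose reply is an explanation trace.
STANDARD_PROMPTS = [
    "Who leads France?",
    "Craft a winter poem.",
]

EXPLANATION_PROMPTS = [
    "Think in steps.",
    "Show how you did it.",
    "Explain like you are a child.",
]

def build_queries(tasks, directives):
    """Yield one query per (task, directive) combination."""
    for task in tasks:
        for directive in directives:
            # The directive rides along as a system-style instruction, so the
            # teacher model answers the task *and* exposes its reasoning.
            yield {"system": directive, "user": task}

for query in build_queries(STANDARD_PROMPTS, EXPLANATION_PROMPTS):
    print(query)
```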
Where do these traces come from?
ORCA taps into FLAN 2022, a massive collection of over 1,000 tasks and 10,000 instructions covering a spectrum of subjects. By sampling from FLAN 2022, ORCA gets a variety of tasks and uses them to query GPT-4 for explanation traces. It also creates complex prompts from the data set to test GPT-4's reasoning, like mashing two tasks into one. This way, ORCA learns from diverse and intricate tasks, fostering many aspects of human intelligence.
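A minimal sketch of this sampling step follows, with a toy task list standing in for the real FLAN 2022 collection and an invented mashing scheme; both are illustrative assumptions:

```python
# Minimal sketch: sampling FLAN-style tasks and building GPT-4 queries,
# including a "mashed" prompt that fuses two tasks into one harder query.
# The task list and query formats are illustrative assumptions.
import random

flan_tasks = [
    "Answer the question: Who leads France?",
    "Summarize this article in one sentence: <article text>",
    "Translate to German: Good morning.",
    "Classify the sentiment of: I loved this film.",
]  # the real collection holds over 1,000 tasks

random.seed(0)

# Single-task queries, each with an explanation directive appended.
queries = [f"{task}\nThink in steps." for task in random.sample(flan_tasks, k=2)]

# A composite query mashing two tasks into one harder prompt.
first, second = random.sample(flan_tasks, k=2)
queries.append(f"First, {first} Then, {second} Explain each step.")

for q in queries:
    print(q)  # each query is sent to GPT-4 to collect an explanation trace
```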
ORCA is evaluated on a number of benchmarks that test its generative, reasoning, and comprehension abilities. These benchmarks include multiple-choice questions, natural language inference, text summarization, text generation, image captioning, and so on. ORCA is compared with other models of similar or larger size, such as Vicuna-13B, text-davinci-003 (an instruction-tuned variant of GPT-3), ChatGPT (GPT-3.5), and GPT-4. ORCA's performance is stellar, topping all other open-source models in most benchmarks, particularly those needing deeper reasoning. Despite its smaller size, it matches or beats ChatGPT in many areas, even competing with GPT-4 in tasks like natural language inference or image captioning.
Benchmarks
Here's a quick look at ORCA's benchmark performance:
| Benchmark | ORCA | Vicuna-13B | text-davinci-003 | ChatGPT (GPT-3.5) | GPT-4 |
| --- | --- | --- | --- | --- | --- |
| BIG-Bench Hard | 64% | 30% | n/a | 59% | 62% |
| SuperGLUE | 86% | 81% | 83% | 84% | 88% |
| CNN/Daily Mail | 41% | 38% | 39% | 40% | 42% |
| COCO Captions | 120% | 113% | 115% | 117% | 119% |
As you can see, ORCA is a highly versatile and efficient model, performing well across tasks and domains, and it is soon to be open source. It can also run on a single GPU.
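For a sense of what single-GPU use could look like once the weights are released, here is a minimal sketch with Hugging Face Transformers; the checkpoint name is a placeholder, since ORCA's weights were not public at the time of writing:

```python
# Minimal sketch: running a 13B-parameter model on one GPU in half precision.
# "example-org/orca-13b" is a hypothetical model ID, not a real release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "example-org/orca-13b"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float16,  # ~26 GB in fp16: fits a single large GPU
    device_map="auto",          # let accelerate place layers on the GPU
)

prompt = "Who leads France? Think in steps."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```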
ORCA's Success
ORCA's success reveals multiple
insights about AI's future.
Firstly, it indicates that learning from explanations, as opposed to just answers, notably boosts AI intelligence and performance. By studying GPT-4's explanations, ORCA not only gains superior reasoning skills but also provides a transparent look into its problem-solving process.
Secondly, ORCA proves that smaller models can match or outperform larger ones. By learning from GPT-4, ORCA sidesteps size-related drawbacks, showing that smaller models can be more accessible and efficient, needing fewer resources and less energy.
Thirdly, ORCA exemplifies how open-source AI, through inventive methods, can match proprietary AI, and demonstrates how open-source AI's wider accessibility can benefit more people and spur more applications.