This article dissects the main differences between language models, comparing large-scale LLMs such as ChatGPT, Bard, Gemini and Claude with one another and with other types of LLMs, such as open-source and small-scale language models.

The world of language models is in constant flux, with new versions becoming more sophisticated, focused and effective. This diversity offers choice, but picking the right model requires careful analysis of the approaches, technologies and principles behind each one. Below we discuss the most popular LLMs in more detail, including ChatGPT, Bard, Gemini and Claude, and explain when smaller language models and open-source LLMs can be a strong option.

Large-Scale LLMs

The four best-known large-scale LLMs are all proprietary, closed-source models. Because each was developed with its creators' own priorities in mind, these LLMs show clear distinctions and suit different tasks, while also competing on the sheer size of their neural networks and context windows.

ChatGPT-3.5 and ChatGPT-4

A creation of OpenAI, ChatGPT is a buzzword that needs little introduction. Widely considered the industry standard for language-related tasks, OpenAI's large language models support diverse applications, including content generation, coding, summarization and other creative tasks. ChatGPT's ability to keep track of details from earlier in a conversation makes it well suited for building chatbots and customer support assistants.

The lineup of models behind ChatGPT is constantly evolving and currently includes GPT-3.5, GPT-4 and GPT-4 Turbo. Compared to the free version based on GPT-3.5, with a reported 175B parameters and a context window of 4,096 tokens, the GPT-4 context window is expanded to 8,192 tokens, while the GPT-4 Turbo model extends it even further to 128K tokens.
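To make these limits concrete, the following minimal sketch checks whether a prompt is likely to fit in a given context window. It assumes a rough heuristic of about four characters per token for English text, which is only an approximation; a real tokenizer gives exact counts. The names `CONTEXT_WINDOWS`, `estimate_tokens` and `fits_in_context` are hypothetical helpers written for this illustration, not part of any OpenAI library.

```python
# Context window sizes in tokens, as quoted in this article.
CONTEXT_WINDOWS = {
    "gpt-4": 8_192,
    "gpt-4-turbo": 128_000,
}

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate (ceiling of len(text) / CHARS_PER_TOKEN)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, model: str, reserved_for_reply: int = 500) -> bool:
    """Check whether the prompt plus a reply budget fits in the model's window."""
    return estimate_tokens(text) + reserved_for_reply <= CONTEXT_WINDOWS[model]


if __name__ == "__main__":
    document = "word " * 10_000  # about 50,000 characters
    for model in CONTEXT_WINDOWS:
        print(model, fits_in_context(document, model))
```

Under this heuristic, a document of about 50,000 characters comes to roughly 12,500 tokens, which exceeds the 8,192-token window but fits comfortably within 128K.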

While OpenAI has not disclosed the number of parameters in GPT-4 or GPT-4 Turbo, the model behind the paid version of ChatGPT is rumored to have about 1.76 trillion parameters, spread across 8 connected models with 220B parameters each (8 × 220B = 1.76T).

Bard

Often mentioned as a competitor to ChatGPT, Google Bard belongs to a different breed. The AI-powered chatbot created by Google was initially based on the LaMDA family of LLMs, then upgraded to PaLM and later to Gemini Pro. Designed with a focus on web search, Bard performs searches quickly and is free for all users in 40 languages, making it well suited for research applications. Drawing upon Google's capabilities, Bard can retrieve images from the web and integrates easily with Google apps, for example, to export responses to Gmail and Google Docs.

While the key stats for Google Bard are not officially disclosed, it is estimated to have 137B parameters. As to the context window, many observers note that Bard tends to "forget" the conversation after around a dozen interactions. This suggests a rather limited context window, which may be an intentional choice by Google to improve the model's efficiency.
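One plausible way a chatbot ends up "forgetting" is by dropping old turns so that the conversation fits a fixed token budget. The sketch below illustrates that general technique only; it is not a description of how Bard actually works. The names `trim_history` and `count_tokens`, and the word count used as a stand-in for tokenization, are assumptions made for illustration.

```python
from typing import List


def count_tokens(message: str) -> int:
    """Stand-in tokenizer (assumption): one token per whitespace-separated word."""
    return len(message.split())


def trim_history(messages: List[str], max_tokens: int) -> List[str]:
    """Keep the most recent messages whose combined size fits max_tokens."""
    kept: List[str] = []
    used = 0
    # Walk backwards from the newest message and stop once the budget is spent.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = [
        "My name is Ada",
        "Tell me about Paris",
        "What about its museums",
    ]
    print(trim_history(history, max_tokens=8))
```

With a budget of 8 tokens, only the two most recent messages survive, so the model would no longer see the user's name from the first turn.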

Gemini

Named after NASA's Project Gemini, Gemini is a family of LLMs developed by Google DeepMind, succeeding Google's previous models such as LaMDA and PaLM 2. Like Google Bard, Gemini models integrate seamlessly with Google services while generating more personalized output and supporting the creation of nuanced images and visual content from prompts.

As to performance, Gemini Ultra is the first model to outperform human experts on the Massive Multitask Language Understanding (MMLU) benchmark, according to the official Google DeepMind website. Gemini has also demonstrated better performance than GPT-4 on a number of other tests, as shown below.

Figure 1: Comparison of performance on multimodal tasks by Gemini and GPT-4. Source: Google DeepMind.

Claude and Claude-2

Created by the Anthropic team, Claude is an AI assistant that works through a chat interface and is intended to help with searches, creative tasks, coding and more, similar to other AI chatbots. Claude's distinctive advantage lies in its more conversational output and its ability to quickly grasp instructions about the required tone and behavior. Trained with Anthropic's Constitutional AI approach, which aligns the model to a set of ethical principles, Claude emphasizes safety and ethics while being capable of adopting a specific persona and adapting to the required style, tone and personality.

The context window of Claude, initially set at 9K tokens, was later expanded to 100K tokens, according to the official report on Anthropic's website. Claude 2.0 kept the 100K window, while Claude 2.1 extends it to 200K tokens. While the number of parameters in Anthropic's LLMs is not officially announced, research by Ganguli et al. (2023) hints at 175B parameters in the first Claude model.

Small-Scale Language Models

Unlike large-scale models, smaller LLMs offer a more focused approach while requiring less computing power and training. By zeroing in on specific tasks, these models provide better efficiency at considerably lower cost.

For example, models such as Mistral, developed by Mistral AI, and MPT-30B by MosaicML have only 7B and 30B parameters, respectively. According to estimates, it costs less than $500K to train a model like Mistral, almost 10 times less than the amount spent on training GPT-3. Even so, Mistral demonstrates excellent narrative consistency when generating contextually relevant texts, while MPT-30B outperformed the much larger GPT-3, with its 175B parameters, on six out of nine metrics.
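The gap in computing requirements can be illustrated with a back-of-the-envelope estimate of the memory needed just to hold a model's weights. The sketch below assumes 16-bit weights (2 bytes per parameter) and ignores activations, caches and other overhead, so real requirements are higher. `weight_memory_gb` is a hypothetical helper written for this illustration.

```python
# Parameter counts quoted in this article.
MODELS = {
    "Mistral (7B)": 7e9,
    "MPT-30B": 30e9,
    "GPT-3 (175B)": 175e9,
}

BYTES_PER_PARAM_FP16 = 2  # assumption: weights stored in 16-bit precision


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights")
```

Under these assumptions, a 7B model needs roughly 14 GB for its weights, compared with about 350 GB for a 175B model, which is why smaller models can run on far more modest hardware.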

Open Source LLMs

Open-source LLMs make their code publicly available so that it can be freely accessed, modified and distributed. The concept of an open-source LLM is built around collaboration and community involvement.

For example, the 13-billion-parameter LLaMA (Large Language Model Meta AI) was designed as a resource-efficient model to provide wider access to AI technology. It is available under a non-commercial license, making it accessible to researchers and smaller organizations while offering a fair degree of customization. Meanwhile, BLOOM, an open-source model developed by the BigScience collaboration coordinated by Hugging Face, has 176B parameters, slightly more than GPT-3, and was trained on 46 natural languages and 13 programming languages.

Implement AI Technology More Easily with VectorShift

Language models are constantly developing in many ways, including expanding their context lengths and parameter counts, focusing on specific niches and optimizing their efficiency. This diversity makes it possible to take advantage of the language models best suited to particular tasks and applications.

At the same time, startups and businesses interested in introducing AI technology into their projects can make the process much more streamlined by utilizing no-code functionality and SDK interfaces available with platforms like VectorShift. If you are interested in learning more about how to implement AI into your processes, please don't hesitate to get in touch with our team or request a free demo.