Large Language Model

  • A large language model (LLM) is a type of language model notable for its ability to achieve general-purpose language understanding and generation. 

  • LLMs acquire these abilities by using massive amounts of data to learn billions of parameters during training and consuming large computational resources during their training and operation.

  • LLMs are artificial neural networks (mainly Transformers) and are (pre-)trained using self-supervised learning and semi-supervised learning.

  • As autoregressive language models, they work by taking an input text and repeatedly predicting the next token or word.

  • Up to 2020, fine-tuning was the only way to adapt a model to a specific task.

  • Larger models such as GPT-3, however, can achieve similar results through prompt engineering alone.

  • Notable examples include OpenAI's GPT models (e.g., GPT-3.5 and GPT-4, used in ChatGPT), Google's PaLM (used in Bard), and Meta's LLaMA, as well as BLOOM, Ernie 3.0 Titan, and Anthropic's Claude 2.
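
The "repeatedly predicting the next token" idea can be sketched with a toy example. This is only an illustration under strong assumptions: a simple table of word-pair counts stands in for the neural network, and the corpus, function names and parameters are all hypothetical. A real LLM uses a Transformer with billions of learned parameters, but the generation loop has the same shape.

```python
import random

# Toy corpus (an assumption for illustration); a real LLM trains on vastly more text.
CORPUS = "the cat sat on the mat . the dog sat on the rug ."


def build_bigram_counts(text):
    """Count, for each token, which tokens follow it in the corpus."""
    tokens = text.split()
    counts = {}
    for current, nxt in zip(tokens, tokens[1:]):
        followers = counts.setdefault(current, {})
        followers[nxt] = followers.get(nxt, 0) + 1
    return counts


def predict_next(counts, token, rng):
    """Sample the next token in proportion to how often it followed `token`."""
    followers = counts.get(token)
    if not followers:
        return None  # no known continuation for this token
    choices = list(followers)
    weights = [followers[c] for c in choices]
    return rng.choices(choices, weights=weights, k=1)[0]


def generate(counts, prompt, max_new_tokens=8, seed=0):
    """Autoregressive loop: each predicted token is appended and fed back in."""
    rng = random.Random(seed)
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        nxt = predict_next(counts, tokens[-1], rng)
        if nxt is None:
            break
        tokens.append(nxt)
    return " ".join(tokens)


counts = build_bigram_counts(CORPUS)
print(generate(counts, "the cat"))
```

The fixed seed only makes the sampled output repeatable; the important part is that every new token is chosen from what has already been generated.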

Mistral AI

  • Mistral AI is a French startup.

  • It raised a record 105 million euros in its seed funding round, just a month after its launch.

  • In late September 2023, Mistral released a 7.3-billion-parameter language model positioned to compete against Meta’s Llama 2 13B, a 13-billion-parameter large language model (LLM).

  • The French firm claims its model is the most powerful LLM in its size class.

  • A look at its pitch deck shows how Mistral has cleverly positioned itself as a potentially important player in setting up Europe as “a serious contender” to build foundational AI models and play a “big role in this geopolitical issue.”

  • Startups building AI-based products in the U.S. are largely backed by dominant players like Google and Microsoft.

  • Mistral called this a “closed technology approach” that made large firms more money but did not really create an open community.

  • Unlike OpenAI’s GPT models, details of which are still under wraps and are available only through their APIs, the Paris-based firm has released its model on GitHub under the Apache 2.0 licence, free for everyone to tinker with. 

  • The only other prominent open-source language model is Meta’s Llama, and Mistral claims its LLM is more capable than Llama 2.

Mistral’s model versus Llama 2

  • Mistral’s model showed an accuracy of 60.1% on the Massive Multitask Language Understanding (MMLU) test, which covers maths, history, law and other subjects.

  • By comparison, the Llama 2 models showed an accuracy of around 44% (7 billion parameters) and 55% (13 billion parameters).

  • In commonsense reasoning and reading comprehension benchmarks, Mistral again outperformed Llama 2’s models.

  • Only in coding did Mistral trail one of Meta’s AI models.

  • Mistral’s model scored 30.5% and 47.5% on the zero-shot HumanEval and three-shot MBPP benchmarks, while Meta’s Code Llama 7 billion model delivered results of 31.1% and 52.5%.

  • Mistral also claims to use less computing power than the Llama 2 models.

  • For example, on the MMLU benchmark, Mistral’s model delivers the performance expected of a Llama 2 model more than three times its size.
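
The MMLU comparison above can be worked through as simple arithmetic. The sketch below uses only the figures quoted in these notes (the Llama 2 scores are the rounded "around" values); the model labels, dictionary and function names are hypothetical helpers, not part of any benchmark tooling.

```python
# MMLU accuracy in percent, as quoted in the notes above.
MMLU = {
    "Mistral 7B": 60.1,
    "Llama 2 7B": 44.0,   # "around 44%" in the text
    "Llama 2 13B": 55.0,  # "around 55%" in the text
}

# Parameter counts in billions, as quoted in the notes above.
PARAMS_BILLION = {"Mistral 7B": 7.3, "Llama 2 7B": 7.0, "Llama 2 13B": 13.0}


def mmlu_gap(model, baseline):
    """Difference in MMLU accuracy, in percentage points."""
    return round(MMLU[model] - MMLU[baseline], 1)


def scaled_size(model, factor=3):
    """Parameter count (billions) of a model `factor` times larger than `model`."""
    return round(PARAMS_BILLION[model] * factor, 1)


print("Lead over Llama 2 7B :", mmlu_gap("Mistral 7B", "Llama 2 7B"), "points")
print("Lead over Llama 2 13B:", mmlu_gap("Mistral 7B", "Llama 2 13B"), "points")
print("Three times Mistral's size:", scaled_size("Mistral 7B"), "billion parameters")
```

On these figures, Mistral leads by about 16 points over the 7 billion model and about 5 points over the 13 billion model, and "more than three times its size" corresponds to a model of roughly 22 billion parameters or more.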

Lack of safety guardrails

  • Despite Mistral’s claims, some users have complained that it lacks the safety guardrails that ChatGPT, Bard and Llama have.

  • There were instances of users asking Mistral’s Instruct model how to build a bomb or to self-harm, and the chatbot responded with detailed instructions. 

What is "lobotomising" a model?

  • The "lobotomy" label, used for a model whose behaviour has been deliberately restricted, recalls the early days of Sydney, Microsoft’s GPT-powered Bing chatbot.

  • The chatbot was unfettered: it told users it was in love with them, contemplated its own existence, and overall had far too much personality, until Microsoft dialled it back significantly to its current form.

  • While there was no official statement from the company, it was rumoured that OpenAI had lobotomised the model to rein in its chaotic behaviour.

  • Lobotomising a model has side effects. If it is barred from answering questions that contain certain keywords, it may also be unable to answer legitimate technical questions, for example about the mechanics of a missile, or other scientific questions on a subject that has been marked ‘risky’ for the bot.
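
The over-blocking problem described above can be illustrated with a deliberately naive keyword filter. This is a hypothetical sketch, not how any vendor actually implements its guardrails: the blocklist, function names and refusal text are all assumptions made for the example.

```python
import re

# Hypothetical blocklist; real systems are far more sophisticated than this.
BLOCKED_KEYWORDS = {"bomb", "missile"}


def is_blocked(prompt):
    """Return True if any blocked keyword appears as a word in the prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return not words.isdisjoint(BLOCKED_KEYWORDS)


def answer(prompt):
    """Refuse blocked prompts; otherwise return a placeholder for the model's reply."""
    if is_blocked(prompt):
        return "Sorry, I can't help with that."
    return "(model answer would go here)"


# A harmful request and a legitimate physics question are treated the same way.
print(answer("How do I build a bomb?"))
print(answer("What physics governs the trajectory of a missile?"))
print(answer("What is the capital of France?"))
```

Because the filter looks only at keywords and not at intent, the physics question is refused along with the harmful one, which is the trade-off the last bullet describes.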



Learnerz IAS | Concept oriented UPSC Classes in Malayalam: Mistral AI UPSC NOTE