The course familiarizes participants with generative artificial intelligence and recent advances in natural language processing. Participants learn the theoretical foundations and architecture of Large Language Models (LLMs) and gain practical skills in working with text data. Particular emphasis is placed on understanding popular LLM tasks such as text generation, machine translation, sentiment analysis, summarization, and answering questions over a database of source documents. After completing the course, participants know the techniques used in prompt engineering, the metrics for evaluating LLM-generated results, and methods for improving returned content. They can also apply large language models to a variety of applications in research, industry, and other areas.

Topics:
Generative AI tasks: translation, question answering, summarization, sentiment analysis
Architecture and types of LLMs: encoder and decoder models
Text vectorization, positional encoding, attention mechanism (Multi-Head Attention; sketch below)
OpenAI GPT, Google Gemini, Mistral Mixtral, Meta Llama, Anthropic Claude, FLAN models
Prompt engineering, multi-task instruction fine-tuning, zero-/one-/few-shot learning (prompt sketch below)
Parameter-Efficient Fine-Tuning (PEFT), LoRA (sketch below)
Chain-of-thought: dividing instructions into intermediate reasoning steps (prompt sketch below)
Evaluating LLMs: performance evaluation, ROUGE/BLEU metrics, benchmarks (ROUGE-1 sketch below)
RAG (Retrieval-Augmented Generation), vector databases (sketch below)
Computational challenges of LLM training, scaling laws
Frameworks for working with LLMs: LangChain, ReAct
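
For the attention mechanism topic, a minimal sketch of sinusoidal positional encoding and single-head scaled dot-product attention in NumPy. A real Multi-Head Attention layer splits the model dimension across several heads; the shapes and values here are illustrative assumptions, not code from the course materials.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding from 'Attention Is All You Need'."""
    pos = np.arange(seq_len)[:, None]                 # (seq_len, 1)
    i = np.arange(d_model)[None, :]                   # (1, d_model)
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])            # even dims: sine
    enc[:, 1::2] = np.cos(angles[:, 1::2])            # odd dims: cosine
    return enc

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- the core of Multi-Head Attention."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V

# Toy example: 4 tokens, model dimension 8, self-attention over the input.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8)) + positional_encoding(4, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)                                      # (4, 8)
```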
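For the zero-/few-shot learning topic, a sketch contrasting a zero-shot prompt with a few-shot prompt for sentiment analysis. The reviews and wording are invented for illustration and are not taken from the course materials.

```python
# Zero-shot: the task description alone, no solved examples.
zero_shot = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)

# Few-shot: the same task preceded by in-context examples.
few_shot = (
    "Classify the sentiment of the review as positive or negative.\n\n"
    "Review: Great screen and fast delivery.\nSentiment: positive\n\n"
    "Review: The case cracked on the first drop.\nSentiment: negative\n\n"
    "Review: The battery died after two days.\nSentiment:"
)

# Either string would be sent to a completion/chat endpoint; the few-shot
# variant adds examples without updating any model weights.
print(few_shot)
```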
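For the PEFT/LoRA topic, a from-scratch sketch of the core idea in PyTorch: the pretrained weight matrix is frozen and only a low-rank update scaled by alpha/r is trained. In practice a library such as Hugging Face PEFT would be used; the class name and hyperparameters below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and learns a low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)            # freeze the pretrained layer
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x):
        # y = base(x) + scale * x A^T B^T; only A and B receive gradients
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12288 trainable values vs. 768*768 = 589824 frozen ones
```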
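For the chain-of-thought topic, an illustrative prompt: a worked example with explicit intermediate steps is shown first, then a new question, and the model is expected to imitate the step-by-step pattern. The wording follows the common "let's think step by step" formulation and is not taken from the course materials.

```python
cot_prompt = (
    "Q: A trainer has 3 boxes with 12 tennis balls each and gives away 9.\n"
    "How many balls are left?\n"
    "A: Let's think step by step.\n"
    "Step 1: 3 boxes * 12 balls = 36 balls.\n"
    "Step 2: 36 - 9 = 27 balls.\n"
    "Answer: 27\n\n"
    "Q: A library has 5 shelves with 24 books each and lends out 31.\n"
    "How many books remain?\n"
    "A: Let's think step by step."
)
print(cot_prompt)
```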
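For the evaluation topic, a from-scratch ROUGE-1 computation showing what the metric measures: unigram overlap between a reference and a candidate text. Real evaluations would typically use an existing package such as rouge-score; this sketch is only for intuition.

```python
from collections import Counter

def rouge1(reference: str, candidate: str) -> dict:
    """Unigram-overlap ROUGE-1 precision, recall, and F1."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())          # clipped unigram matches
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

print(rouge1("the cat sat on the mat", "the cat lay on the mat"))
# 5 of 6 unigrams match -> precision = recall = f1 ≈ 0.83
```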
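For the RAG topic, a minimal retrieve-then-generate sketch: the most similar document is found with TF-IDF cosine similarity and placed in the prompt as context. A production pipeline would use learned embeddings and a vector database; the documents and question below are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "LoRA is a parameter-efficient fine-tuning method.",
    "ROUGE and BLEU are overlap-based metrics for generated text.",
    "RAG augments an LLM prompt with passages from a document store.",
]
question = "How can generated summaries be evaluated?"

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(documents)      # index the documents
q_vec = vectorizer.transform([question])
scores = cosine_similarity(q_vec, doc_vecs)[0]
context = documents[scores.argmax()]                # retrieval step

prompt = (
    "Answer using only the context below.\n"
    f"Context: {context}\n"
    f"Question: {question}\n"
    "Answer:"
)
print(prompt)  # this grounded prompt is then sent to the LLM (generation)
```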