Michał Żarnecki Portfolio

I'm a programmer and lecturer. I work with Python, PHP and JavaScript and design systems and solutions involving AI/machine learning, data mining, big data and natural language processing.

Tag: RAG

Generative AI in Text Mining – laboratories at the Collegium da Vinci

Posted on 12 July 2024 in lectures

The course introduces participants to generative artificial intelligence and recent advances in natural language processing. Participants learn the theoretical foundations and architecture of large language models (LLMs) and gain practical skills in working with text data. The course places particular emphasis on popular LLM tasks such as text generation, machine translation, sentiment analysis, summarization and question answering over a collection of source documents. After completing the course, participants know the techniques used in prompt engineering, the metrics for evaluating LLM output, and methods for improving the returned content. They can also apply large language models in research, industry and other areas.

Topics:
Generative AI tasks: translation, question answering, summarization, sentiment analysis
LLM architecture and types: encoder, decoder
Text vectorization, positional encoding, attention mechanism (Multi-Head Attention)
OpenAI GPT, Google Gemini, Mistral Mixtral, Meta Llama, Claude, FLAN models
Prompt engineering, multi-task instruction fine-tuning, zero/one/few-shot learning
Parameter-Efficient Fine-Tuning (PEFT), LoRA
Breaking instructions into steps: chain-of-thought
LLM evaluation: performance evaluation, ROUGE/BLEU metrics, benchmarks
RAG – Retrieval-Augmented Generation, vector databases
Computational challenges of LLM training, scaling laws
Frameworks for working with LLMs, LangChain, ReAct
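
Of the evaluation metrics listed in the topics, ROUGE-1 is the simplest to illustrate: it measures unigram overlap between a generated text and a reference text. The sketch below is a simplified version, assuming plain whitespace tokenization and lowercasing with no stemming; the function name `rouge_1` and the sample sentences are illustrative and are not taken from the course materials.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram overlap between candidate and reference.

    Assumption: tokens are lowercase whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it
    # appears in both texts.
    overlap = sum((cand_counts & ref_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat on the mat", "the cat is on the mat")
    print(scores)
```

In this example five of the six words in each sentence overlap, so precision, recall and F1 all equal 5/6.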

AI chatbot for analysing companies' source documents

Posted on 26 May 2024 in projects

https://chat.companyhouse.de/


AI-based chatbot for retrieving reliable, up-to-date and precise information about companies.
The chatbot is built with the Streamlit framework and uses a vector database based on the Postgres pgvector extension to store and access trade register documents.
The application uses the Llama 3 large language model (LLM) together with a retrieval-augmented generation (RAG) approach, which lets users ask questions about the history of a company and its managers, its financial condition and important changes, and receive answers.
Each response also lists its source documents, which makes the chatbot a reliable business intelligence tool.
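
The retrieval step of a RAG pipeline like this one can be sketched in a few lines. This is a minimal illustration, not the project's code: the bag-of-words `embed` function, the sample `DOCUMENTS`, and the `retrieve` and `build_prompt` helpers are all assumptions made for the example. The real application stores embeddings in Postgres with pgvector and passes the assembled prompt to Llama 3.

```python
import math
from collections import Counter

# Hypothetical sample documents standing in for trade register filings.
DOCUMENTS = {
    "doc-1": "Example GmbH appointed Anna Schmidt as managing director in 2021",
    "doc-2": "Example GmbH increased its share capital to 50000 EUR",
    "doc-3": "Other AG moved its registered office to Hamburg",
}


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a neural model."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, documents: dict, top_k: int = 2) -> list:
    """Return the ids of the top_k documents most similar to the question."""
    query = embed(question)
    scored = [
        (cosine_similarity(query, embed(text)), doc_id)
        for doc_id, text in documents.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


def build_prompt(question: str, documents: dict, top_k: int = 2):
    """Assemble an LLM prompt from retrieved sources; also return their ids."""
    source_ids = retrieve(question, documents, top_k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in source_ids)
    prompt = (
        "Answer using only the sources below.\n"
        f"{context}\n"
        f"Question: {question}"
    )
    return prompt, source_ids


if __name__ == "__main__":
    prompt, sources = build_prompt(
        "Who is the managing director of Example GmbH", DOCUMENTS
    )
    print(prompt)
    print("Sources:", sources)
```

Returning the source ids alongside the prompt is what allows the interface to list the supporting documents next to each answer.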

Responsibilities:

  • build application prototype
  • implement application code parts
  • implement authentication mechanism
  • specify and coordinate work on building chatbot interactions
  • specify and coordinate work on synchronizing source documents in real time and making them accessible to the LLM
  • measure answer quality

E-learning course: Machine learning – how to use the potential of data to get better results and make smart decisions

Posted on 3 January 2024 in lectures

Course scenario:

  1. Definition and applications of machine learning
    • Data deluge and the definition of machine learning
    • Machine learning examples and related fields of knowledge
    • Types of machine learning
  2. Machine learning tools used in the course
    • Programs used in the course
    • Orange Data Mining
    • Jupyter Lab
  3. Supervised machine learning
    • Machine learning process
    • Data collection, labeling and analysis
    • Feature engineering and division into training and testing sets
    • Model training and evaluation
    • Model export, corrective actions
    • Regression example
    • Classification example
(more…)
