
Natural Language Processing & Large Language Models | NLP & LLM

This repository is dedicated to projects and some theoretical material that I used to get into the areas of NLP and LLM in a practical and efficient way.

Master's Thesis

My NLP journey started in 2021-2022 during the development of my master's thesis.

The thesis aimed to develop methods for writing clinical reports from EEG signals, adapting current NLP techniques and state-of-the-art captioning approaches for image, signal, and video.

Due to legal/privacy restrictions, the code/implementation cannot be made publicly available; however, the dissertation has been published and can be accessed at link.

The document includes an introduction to NLP, a review of state-of-the-art approaches for image/signal/video captioning, and the detailed development and comparison of 6 pipelines for EEG captioning.

Courses

Since then, interest and popularity in the area have been growing, particularly due to the emergence of transformers and LLM applications. So, to keep up with this development, I took some courses on more recent topics in the area, offered by some of the big players. This repository includes theoretical notes and practical material from those courses.

  • Deeply understand generative AI, describing the key steps in a typical LLM-based generative AI lifecycle, from data gathering and model selection to performance evaluation and deployment.
  • Describe in detail the transformer architecture that powers LLMs, how they're trained, and how fine-tuning enables LLMs to be adapted to a variety of specific use cases (see the minimal sketch after this list).
  • Use empirical scaling laws to optimize the model's objective function across dataset size, compute budget, and inference requirements.
  • Apply state-of-the-art training, tuning, inference, tools, and deployment methods to maximize the performance of models within the specific constraints of your project.
  • Discuss the challenges and opportunities that generative AI creates for businesses after hearing stories from industry researchers and practitioners.
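
As a concrete, heavily simplified illustration of the fine-tuning idea mentioned above (not a notebook from any of the courses), the sketch below adapts a small pretrained model to a labelled text-classification dataset using the Hugging Face `Trainer`. The model name, dataset and hyperparameters are placeholder assumptions.

```python
# Minimal supervised fine-tuning sketch (illustrative only).
# Assumes `transformers` and `datasets` are installed; model/dataset names are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("imdb")                       # any small labelled text dataset works
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Convert raw text into fixed-length token IDs the model can consume.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)         # new classification head on a pretrained encoder

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out",
                           per_device_train_batch_size=8,
                           num_train_epochs=1),
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),  # small subset for speed
    eval_dataset=tokenized["test"].select(range(500)),
)
trainer.train()                                      # updates the pretrained weights on the new task
print(trainer.evaluate())
```

The pattern is always the same: start from a pretrained checkpoint, add or reuse a task head, and continue training on task-specific labelled data.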

Course Certificate: Link; More Info

Relevant/Hot Topics

This repository explores some topics, tools and techniques that are currently widely used in NLP and LLM projects or are in the spotlight in the AI community, such as:

1 - Finetuning LLMs

2 - Large Language Model Operations

3 - Retrieval Augmented Generation (RAG) (see the sketch after this list)
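
To make the RAG idea concrete, here is a minimal, self-contained sketch (not the implementation used in this repository): documents are embedded once, the passages most similar to a query are retrieved by cosine similarity, and the retrieved context is placed into the prompt of whatever LLM you prefer. The `sentence-transformers` model name and the toy document store are assumptions for illustration.

```python
# Minimal RAG sketch (illustrative; not this repository's implementation).
# Assumes `sentence-transformers` and `numpy` are installed; the model name
# and the toy document store below are placeholder assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "EEG captioning maps brain-signal features to clinical report text.",
    "Retrieval Augmented Generation grounds LLM answers in retrieved documents.",
    "LLMOps covers versioning, monitoring and deployment of language models.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = encoder.encode(documents, normalize_embeddings=True)  # one vector per document

def retrieve(query, k=2):
    """Return the k documents most similar to the query (cosine similarity)."""
    query_vec = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ query_vec                 # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

query = "How does RAG reduce hallucinations?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # feed `prompt` to any LLM (OpenAI, Hugging Face, etc.) for the grounded answer
```

Production setups swap the in-memory list for a vector database and add chunking, reranking and prompt templates, but the retrieve-then-generate loop stays the same.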

To do this, I used a wide variety of bibliographic sources. The list of trainings I took on those topics, along with their respective objectives, is detailed in Link.

Those trainings are offered by large companies and big players in AI. Below is the list of completion certificates:

Disclaimer

Copyright of all materials in those courses belongs to DeepLearning.AI, LAMINI, AWS, Google Cloud and LangChain, and they may only be used or distributed for educational purposes. You may not use or distribute them for commercial purposes.

Projects:

Here are links to my NLP & LLM projects: