Artificial Intelligence
Duration:
02.2024 - 05.2024
Client:
Anonymous client in the publishing industry
Technologies:
Python, Llama-3, Hugging Face, SFT, PEFT, LoRA, FastAPI, GPU optimization
Situation
A client in the publishing industry faced high costs because employees wrote summaries of newspaper articles by hand. Off-the-shelf pre-trained foundation models did not meet the client’s quality requirements for German newspaper text.
Task
The client needed an automated system capable of generating high-quality summaries for newspaper articles in German.
Action
I fine-tuned a freely available large language model (Llama-3) to generate summaries in German. The work involved:
Data cleaning and preprocessing.
Supervised Fine-Tuning (SFT) of the publicly available pre-trained Llama-3 model, using Parameter-Efficient Fine-Tuning (PEFT) with Low-Rank Adaptation (LoRA) and the Hugging Face libraries.
Model training on Nvidia A100 GPUs.
Model training optimization through model quantization and GPU memory optimization (FlashAttention-2, gradient checkpointing).
Model inference via FastAPI.
Model inference optimization through prompt engineering and tuning of the sampling parameters that control model “creativity” (temperature, top-k, top-p).
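To illustrate the sampling parameters named in the last step, here is a minimal, library-free sketch of how temperature, top-k and top-p reshape a next-token distribution. It is not the project's code: the function name `filtered_distribution` and the example tokens and scores are hypothetical, and in practice these values are normally passed as generation settings to the inference library rather than implemented by hand.

```python
import math


def filtered_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return the renormalized next-token distribution after temperature,
    top-k and top-p (nucleus) filtering. `logits` maps token -> raw score."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = {tok: score / temperature for tok, score in logits.items()}

    # Unnormalized softmax weights, shifted by the maximum for numerical stability.
    peak = max(scaled.values())
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)

    # Top-k: keep only the k highest-scoring tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    total = sum(w for _, w in ranked)
    kept, cumulative = [], 0.0
    for tok, w in ranked:
        kept.append((tok, w))
        cumulative += w / total
        if cumulative >= top_p:
            break

    # Renormalize so the surviving tokens sum to 1.
    norm = sum(w for _, w in kept)
    return {tok: w / norm for tok, w in kept}


# Hypothetical scores for three candidate next tokens.
example_logits = {"Zeitung": 2.0, "Artikel": 1.0, "Verlag": 0.0}
print(filtered_distribution(example_logits, temperature=0.7, top_k=2))
```

Lower temperature and tighter top-k/top-p values concentrate probability on the most likely tokens, which tends to suit factual summarization; looser values allow more varied wording.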
Result
The client successfully implemented an automated system that generates high-quality summaries of German newspaper articles cost-efficiently. This enabled the client to deploy their staff more efficiently, significantly reducing operational costs.