GenAI for Text Summarization in German - Daniel Manns

Daniel Manns

Data Scientist

ML Engineer

AI Consultant

Available for Amazing Projects

GenAI for Text Summarization in German

A GenAI Application for Creating High-Quality Summaries of German Newspaper Articles

Artificial Intelligence

Duration:

02.2024 - 05.2024

Client:

Anonymous client in the publishing industry

Technologies:

Python, Llama-3, Hugging Face, SFT, PEFT, LoRA, FastAPI, GPU optimization

Situation

A client in the publishing industry faced high costs because employees wrote summaries of newspaper articles by hand. Common pre-trained foundation models did not meet the client’s quality requirements for German newspaper text.

Task

The client needed an automated system capable of generating high-quality summaries for newspaper articles in German.

Action

I fine-tuned one of the large, freely available language models (Llama-3) to generate summaries in German. The fine-tuning process involved:

Data cleaning and preprocessing.
Supervised Fine-Tuning (SFT) of the publicly available pre-trained Llama-3 model, using Parameter-Efficient Fine-Tuning (PEFT) with Low-Rank Adaptation (LoRA) and the Hugging Face libraries.
Model training on Nvidia A100 GPUs.
Training optimization through model quantization and GPU memory optimization (FlashAttention-2, gradient checkpointing).
Model inference served via FastAPI.
Inference optimization through tuning of the sampling parameters that control model “creativity” (temperature, top-k, top-p), combined with prompt engineering.
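
To illustrate what tuning temperature, top-k and top-p means in practice, here is a minimal sketch of how these three settings are commonly applied to a model's next-token scores before a token is drawn. It is not the project's code: the function names `filter_logits` and `sample_token` and the example logits are made up for illustration, and a real deployment would typically pass the equivalent `temperature`, `top_k` and `top_p` arguments to the generation call of the inference library instead.

```python
import math
import random


def filter_logits(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_id: probability} after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration only.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it (more "creative" output).
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank token ids from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # Top-p (nucleus): keep the smallest prefix whose cumulative
    # probability mass reaches top_p.
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the surviving tokens.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


def sample_token(logits, rng=random, **kwargs):
    """Draw one token id from the filtered distribution."""
    dist = filter_logits(logits, **kwargs)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four tokens
    print(filter_logits(example_logits, temperature=0.7, top_k=3, top_p=0.9))
    print(sample_token(example_logits, temperature=0.7, top_k=3, top_p=0.9))
```

For summarization, lower temperatures and tighter top-k/top-p values generally keep the output closer to the most likely wording, which tends to favor faithful, consistent summaries over varied phrasing.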

Result

The client successfully put into operation an automated system that generates high-quality summaries of German newspaper articles at low cost. This allowed the client to deploy staff more efficiently and significantly reduced operational costs.
