Germany 100k.zip Apr 2026
: Approximately 100,000 documents with titles, tables, and images removed to provide clean, plain text.
: Building a set of unique German words or tokens for language modeling.
: Providing a large corpus for both extractive and abstractive summarization techniques.
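
The vocabulary-building use case can be sketched roughly as follows. This is a minimal illustration, not a documented loader for the archive: it assumes the zip holds one UTF-8 plain-text file per document with a `.txt` extension, which the listing does not confirm. The helper names `iter_texts` and `build_vocabulary` are hypothetical.

```python
import io
import re
import zipfile
from collections import Counter

# Runs of letters only (umlauts and ß included); digits and underscores are excluded.
WORD_RE = re.compile(r"[^\W\d_]+")


def iter_texts(archive):
    """Yield the decoded text of every .txt member of a zip archive.

    `archive` may be a path or a binary file object.
    ASSUMPTION: members are UTF-8 plain text; undecodable bytes are replaced.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith(".txt"):
                yield zf.read(name).decode("utf-8", errors="replace")


def build_vocabulary(texts):
    """Count lowercased word tokens across an iterable of document strings."""
    counts = Counter()
    for text in texts:
        counts.update(word.lower() for word in WORD_RE.findall(text))
    return counts


if __name__ == "__main__":
    # Small in-memory archive standing in for the real download.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("doc1.txt", "Der Hund läuft. Der Hund schläft.")
        zf.writestr("doc2.txt", "Die Straße ist lang.")
    buf.seek(0)

    vocab = build_vocabulary(iter_texts(buf))
    print(len(vocab), vocab.most_common(2))
```

For the real archive, passing its path to `iter_texts` in place of the in-memory buffer should work if the layout assumption holds. Lowercasing merges capitalized German nouns with identically spelled lowercase forms, so it may be worth dropping when case carries meaning for the language model.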