HU computer scientists develop effective search engine for quotes

A use case for research into how computers learn to process human language

Are you looking for the latest quotes from specific people or on specific topics from German-language news media? Do you want to verify the source of a quote? Or do you want to know the context in which a quote was made? Prof Dr Alan Akbik and his team at the Department of Computer Science at Humboldt-Universität zu Berlin (HU) have developed a novel search engine for quotes called “Zitatsuchmaschine”.

A fully automated process creates a huge database of quotes and speakers

The Zitatsuchmaschine goes beyond what conventional search engines have to offer. Its web crawlers continuously scan German-language online news articles and extract quotes and their speakers. The fully automated process based on AI models, which Prof Akbik and his team have developed over the past four years, has created a huge database: over two million quotes by over 240,000 speakers from around 50 different journalistic sources. More than 10,000 additional quotes are automatically found and added every day.
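To make the extraction step concrete, here is a minimal Python sketch of what "extracting quotes and their speakers" can look like as input and output. It is not the team's method: the Zitatsuchmaschine relies on trained AI models, whereas this stand-in uses a single hand-written pattern. The names `Quote`, `PATTERN` and `extract_quotes` are hypothetical, and the pattern only assumes direct speech in double quotation marks followed by "says" or "said" and a capitalised name.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Quote:
    text: str      # the quoted words, without quotation marks
    speaker: str   # the name attributed to the quote


# Assumption: a quote in curly or straight double quotation marks,
# followed by "says"/"said" and one or more capitalised name parts.
# Real news text is far more varied; this rule is for illustration only.
PATTERN = re.compile(
    r'[“"](?P<text>[^”"]+)[”"],?\s+(?:says|said)\s+'
    r'(?P<speaker>[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*)'
)


def extract_quotes(article: str) -> List[Quote]:
    """Return every (quote, speaker) pair the simple pattern finds."""
    return [
        # Drop surrounding spaces and a trailing comma inside the quotation marks.
        Quote(m.group("text").strip(" ,"), m.group("speaker"))
        for m in PATTERN.finditer(article)
    ]


if __name__ == "__main__":
    sample = "“We build efficient models,” says Alan Akbik. Nothing else here."
    for quote in extract_quotes(sample):
        print(quote.speaker, "->", quote.text)
```

A rule like this misses indirect speech, titles such as "Prof. Dr.", and quotes whose speaker is named in an earlier sentence, which is one reason a learned model is the more plausible choice for a database of this size.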

For Alan Akbik, head of the Machine Learning Group, the quote search engine is a by-product of his research. Akbik and his team are working on language models and the question of how computers learn to process human language (Natural Language Processing, NLP). The computer scientists want to develop methods that are as data- and resource-efficient as possible.

‘An important question for us is how we as a university can keep up with companies like OpenAI with our own language models,’ says Alan Akbik. ‘That's why we are working on NLP models that can be trained with as little data as possible and require fewer resources. The quote search engine is a use case for this research.’

With their search engine, Alan Akbik and his team aim to provide a research tool for journalists and other users. They intend to improve and expand it continuously.

Further Information

Zitatsuchmaschine - Quote Search Engine - Developed at Humboldt-Universität zu Berlin

Contact

Prof. Dr. Alan Akbik
Machine Learning Group at Humboldt-Universität zu Berlin

alan.akbik@hu-berlin.de