Chair of Digital Libraries and Web Information Systems
Research

Research Areas

Flexible knowledge representation models

In this research line we investigate the complementary aspects between distributional semantics and logical/structured data models, focusing on the analysis of approximate inference on a distributional vector space. While logical models provide an expressive conceptual representation structure with support for inferences and expressive query capabilities, distributional semantics provides a complementary layer where the semantic approximation supported by large-scale comprehensive semantic models and the scalability provided by the vector space model (VSM) can address the trade-off between expressivity and semantic/terminological flexibility.

The contributions of this research line concentrate on advancing the conceptual and formal work on the interaction between distributional semantics and logic, focusing on the investigation of a distributional deductive inference model for large-scale and heterogeneous knowledge bases. The proposed inference model targets the following features:

  1. an approximative reasoning approach for logical knowledge bases,
  2. the inclusion of large volumes of distributional semantics commonsense knowledge into the inference process and
  3. the provision of a principled geometric representation of the inference process.

Reasoning supported by distributional semantics

Building intelligent applications and addressing even simple computational semantic tasks demand coping with large-scale commonsense Knowledge Bases (KBs). Querying and reasoning (Q&R) over large commonsense KBs are fundamental operations for tasks such as Question Answering, Semantic Search and Knowledge Discovery. However, in an open domain scenario, the scale of KBs and the number of direct and indirect associations between elements in the KB can make Q&R unmanageable. On top of the complexity of querying and reasoning over such large-scale KBs come the barriers involved in building KBs that meet the necessary consistency and completeness requirements.

Since information completeness of the KBs cannot be guaranteed, one missing fact in the KB would be sufficient to block the reasoning process. Ideally, Q&R mechanisms should be able to cope with some level of KB incompleteness, approximating and filling the gaps in the KBs. This research line investigates a selective reasoning approach which uses a hybrid distributional-relational semantic model to address the problems described above. In this work, distributional semantic models are used as a complementary semantic layer to the relational model, which supports coping with semantic approximation and incompleteness.

DRMs: Schema-agnostic queries & distributional-relational models

The evolution of data environments towards the growth in the size, complexity, dynamicity and decentralisation (SCoDD) of schemas drastically impacts contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications have a demand for more complete data, produced by independent data sources, under different semantic assumptions and contexts of use.

The emergence of this new data environment demands revisiting the semantic assumptions behind databases and designing mechanisms that can support semantically heterogeneous databases. This research line aims at filling this gap by proposing a complementary semantic model for databases, based on distributional semantics. Distributional semantics provides a complementary perspective to the formal perspective of database semantics, which supports semantic approximation as a first-class database operation. Unlike models that describe uncertain and incomplete data, or probabilistic databases, distributional-relational models focus on the construction of semantic approximation approaches for databases, supported by a semantic model automatically built from large-scale unstructured data external to the database, which serves as a semantic/commonsense knowledge base. The semantic abstraction can be used to abstract the database user from the representation of the data, supporting a schema-agnostic approach to data consumption.
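
As a rough illustration of semantic approximation as a query-side operation, the sketch below maps a user's query term onto the closest attribute of a database schema by cosine similarity. It is not the chair's implementation: the vectors in TOY_VECTORS are hand-assigned placeholders standing in for a distributional model built from an external corpus, and the function name resolve_attribute, the attribute names and the threshold value are all hypothetical.

```python
from math import sqrt

# Hand-assigned toy vectors (assumption): a real system would derive
# these from a distributional model built over a large external corpus.
TOY_VECTORS = {
    "spouse":      (0.9, 0.1, 0.0),
    "married_to":  (0.8, 0.2, 0.1),
    "birth_place": (0.1, 0.9, 0.2),
    "employer":    (0.0, 0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def resolve_attribute(query_term, schema_attributes, vectors, threshold=0.5):
    """Return the schema attribute most similar to query_term.

    Returns None when no attribute reaches the similarity threshold,
    so that unrelated terms are not forced onto the schema.
    """
    best, best_score = None, threshold
    for attribute in schema_attributes:
        score = cosine(vectors[query_term], vectors[attribute])
        if score >= best_score:
            best, best_score = attribute, score
    return best


# The user asks about "spouse", a word the schema does not contain.
schema = ["married_to", "birth_place", "employer"]
match = resolve_attribute("spouse", schema, TOY_VECTORS)
print(match)
```

With these toy vectors the query term resolves to married_to, so the user never needs to know how the schema names that relation. The threshold is a design choice: it trades recall for precision by refusing matches that are only weakly related.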

DSMs: Distributional semantics systems and resources

Distributional semantic models (DSMs) are semantic models based on the statistical analysis of co-occurrences of words in large corpora. DSMs can be used in a wide spectrum of semantic applications, including semantic search, question answering, paraphrase detection and word sense disambiguation. The ability to automatically harvest meaning from unstructured heterogeneous data, their simplicity of use and the ability to build comprehensive semantic models are major strengths of distributional approaches.
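
A minimal sketch of the co-occurrence idea, assuming a three-sentence toy corpus, whitespace tokenisation and a symmetric context window of two words (all illustrative choices, not the infrastructure described here): each word is represented by the counts of the words that appear near it, and words are compared by cosine similarity.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Toy corpus (assumption): real DSMs are built from corpora such as Wikipedia.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the car needs fuel",
]
vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
print(sim_cat_dog > sim_cat_car)  # "cat" shares more contexts with "dog" than with "car"
```

Here "cat" and "dog" both occur next to "the" and "drinks", so they end up closer to each other than "cat" and "car", which share only "the". Production models typically add weighting schemes and dimensionality reduction on top of raw counts.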

The construction of distributional semantic models depends on the processing of large-scale data resources. The English version of Wikipedia 2014, for example, contains 44 GB of article data. The hardware and software infrastructure required to process large-scale corpora creates high entry barriers for researchers and developers who want to start experimenting with distributional semantics.

In order to reduce these barriers, this research line focuses on the development of fundamental distributional research infrastructures, commoditizing access to distributional semantic resources. The infrastructure consists of software, data and service resources that can be easily reused and re-deployed by third parties.

Current Projects

CS-AWARE

Cybersecurity is one of today's most challenging security problems for commercial companies, NGOs, governmental institutions as well as individuals. Reaching beyond the technology-focused boundaries of classical information technology (IT) security, cybersecurity includes organizational and behavioural aspects of IT systems and also needs to comply with the actively developing legal and regulatory framework for cybersecurity.

PACE – Passau Centre for eHumanities

The Passau Centre for eHumanities (PACE) investigates new computer-based approaches to determine the needs of the humanities and cultural studies fields for research and teaching. The Centre is an important component of the eHumanities field at the University of Passau.

Project SSIX – tools to get to know the sentiments of your social-media-audience

Social Sentiment Indices powered by X-Scores (SSIX) aims to provide European SMEs with a collection of easy-to-interpret tools to analyse and understand social media users' attitudes towards any given subject. These sentiment characteristics can be exploited to help SMEs operate more efficiently, resulting in increased revenues.

MARIO - healthy ageing with use of caring service robots

MARIO addresses the difficult challenges of loneliness, isolation and dementia in older persons through innovative and multi-faceted inventions delivered by service robots. The effects of these conditions are severe and life-limiting.

Neoclassica

Antiquarianism played an important role in the shaping of Europe's cultural heritage. Inspired by archaeological discoveries in the Mediterranean and Near East during the late 18th century, Neoclassicism may be regarded, despite sizeable regional differences, as a Pan-European cultural movement engraving antiquity in such diverse fields as architecture, landscape design, the fine and decorative arts, literature and even composition. Neoclassica is a project situated at the crossroads of History, Art History, Media Studies, Linguistics, Cultural Sociology and Computer Science to trace the impact of this movement.

For the moment it provides an ontology for the material culture dimension of Neoclassical artefacts. The ontology currently comprises more than 500 categories from the field of furniture and furnishings and a considerable number of terms that can be used to analyse architectural and constructional features of Neoclassical artefacts. Eventually Neoclassica will mature into a collaborative research platform for tracing the emergence and afterlife of Europe's Neoclassical heritage by analysing and comparing the classic formal vocabulary across various media from a comparative, multi- and transdisciplinary perspective. It will do so by applying new ways of knowledge discovery, in particular by intertwining qualitative research steps with computer-assisted research and assisted learning procedures. It will make use of existing methods, and design and validate new methods, of semantic technologies, in particular distributional semantics, to approach a vast variety of multimodal sources.

By joining such analysis with technologies that have hitherto chiefly been used in opinion mining and sentiment analysis, Neoclassica will also record the web of meaning associated with particular material artefacts, representations or forms, both contemporary and as a part of our cultural memory.