Antiquarianism played an important role in shaping Europe's cultural heritage. Inspired by archaeological discoveries in the Mediterranean and the Near East during the late 18th century, Neoclassicism may be regarded, despite sizeable regional differences, as a pan-European cultural movement that inscribed antiquity in fields as diverse as architecture, landscape design, the fine and decorative arts, literature and even musical composition. Neoclassica is a project situated at the crossroads of History, Art History, Media Studies, Linguistics, Cultural Sociology and Computer Science that traces the impact of this movement.
For the moment it provides an ontology for the material-culture dimension of Neoclassical artefacts. The ontology currently comprises more than 500 categories from the field of furniture and furnishings and a considerable number of terms for analysing the architectural and constructional features of Neoclassical artefacts. Eventually Neoclassica will mature into a collaborative research platform for tracing the emergence and afterlife of Europe's Neoclassical heritage by analysing and comparing the classical formal vocabulary across various media in a comparative, multi- and transdisciplinary perspective. It will do so by applying new ways of knowledge discovery, in particular by intertwining qualitative research steps with computer-assisted research and learning procedures. It will make use of existing methods from semantic technologies, in particular distributional semantics, and design and validate new ones, in order to approach a vast variety of multimodal sources.
By joining such analysis with technologies hitherto chiefly used in opinion mining and sentiment analysis, Neoclassica will also record the web of meaning associated with particular material artefacts, representations or forms, both contemporary and as part of our cultural memory.
Flexible knowledge representation models
In this research line we investigate the complementary aspects of distributional semantics and logical/structured data models, focusing on the analysis of approximate inference over a distributional vector space. While logical models provide an expressive conceptual representation structure with support for inference and expressive query capabilities, distributional semantics provides a complementary layer in which the semantic approximation supported by large-scale comprehensive semantic models, together with the scalability of the vector space model (VSM), can address the trade-off between expressivity and semantic/terminological flexibility.
The contributions of this research line concentrate on advancing the conceptual and formal work on the interaction between distributional semantics and logic, focusing on the investigation of a distributional deductive inference model for large-scale and heterogeneous knowledge bases. The proposed inference model targets the following features:
- an approximate reasoning approach for logical knowledge bases,
- the inclusion of large volumes of distributional commonsense knowledge into the inference process, and
- the provision of a principled geometric representation of the inference process.
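The interplay of the two layers can be illustrated with a minimal sketch (not the project's actual model): a logical triple store answers exact queries, and when a term is absent from the knowledge base the query falls back to its distributionally nearest neighbour. The vectors, terms and threshold below are hypothetical toy values; a real model would derive vectors from large corpora.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical toy vectors; a real model derives them from corpora.
vectors = {
    "sofa":   [0.90, 0.10, 0.20],
    "settee": [0.85, 0.15, 0.25],
    "column": [0.10, 0.90, 0.30],
}

# Deliberately incomplete logical KB: no fact about "settee".
kb = {("sofa", "is_a", "seating_furniture")}

def holds(s, p, o, threshold=0.95):
    """Exact entailment first; otherwise approximate the subject by a
    distributionally near-identical subject already present in the KB."""
    if (s, p, o) in kb:
        return True
    return any(
        p2 == p and o2 == o
        and s in vectors and s2 in vectors
        and cosine(vectors[s], vectors[s2]) >= threshold
        for (s2, p2, o2) in kb
    )
```

Here the query `holds("settee", "is_a", "seating_furniture")` succeeds by semantic approximation, while an unrelated subject such as `"column"` stays below the threshold and is rejected, giving a simple geometric reading of the inference step.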
Reasoning supported by distributional semantics
Building intelligent applications and addressing even simple computational semantic tasks demand coping with large-scale commonsense Knowledge Bases (KBs). Querying and reasoning (Q&R) over large commonsense KBs are fundamental operations for tasks such as Question Answering, Semantic Search and Knowledge Discovery. However, in an open-domain scenario, the scale of KBs and the number of direct and indirect associations between their elements can make Q&R unmanageable. To the complexity of querying and reasoning over such large-scale KBs one must add the difficulty of building KBs that meet the necessary consistency and completeness requirements.
Since the completeness of a KB cannot be guaranteed, a single missing fact can be sufficient to block the reasoning process. Ideally, Q&R mechanisms should cope with some level of KB incompleteness by approximating and filling the gaps in the KB. This research line investigates a selective reasoning approach that uses a hybrid distributional-relational semantic model to address these problems. In this work, distributional semantic models are used as a complementary semantic layer to the relational model, which supports coping with semantic approximation and incompleteness.
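How a distributional layer can unblock a broken inference chain can be sketched as follows. This is an illustrative toy, not the project's selective reasoning algorithm: the KB deliberately lacks the hop from "seat" to "seating", and the reasoner bridges the gap by treating distributionally near-identical terms as one node. All vectors and the threshold are invented for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical vectors; real ones would come from corpus statistics.
vectors = {
    "seat":      [0.80, 0.20],
    "seating":   [0.78, 0.24],
    "chair":     [0.70, 0.30],
    "furniture": [0.50, 0.50],
    "column":    [0.10, 0.90],
}

# Deliberately incomplete KB: the "seat" -> "seating" link is missing.
facts = {("chair", "is_a", "seat"), ("seating", "is_a", "furniture")}

def entails(s, o, threshold=0.95, seen=None):
    """Transitive is_a reasoning; when the chain breaks, bridge the gap
    through a distributionally near-identical term."""
    seen = set() if seen is None else seen
    if s in seen:
        return False
    seen.add(s)
    for (a, _, b) in facts:
        if a == s and (b == o or entails(b, o, threshold, seen)):
            return True
    for t in vectors:  # distributional bridge over the missing hop
        if t != s and t not in seen and s in vectors:
            if cosine(vectors[s], vectors[t]) >= threshold and \
               entails(t, o, threshold, seen):
                return True
    return False
```

With a purely relational reasoner, `entails("chair", "furniture")` would fail at the missing hop; the distributional bridge lets the chain go through, while an unrelated term such as `"column"` is still rejected.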
DRMs: Schema-agnostic queries & distributional-relational models
The evolution of data environments towards growth in the size, complexity, dynamicity and decentralisation (SCoDD) of schemas drastically impacts contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications demand more complete data, produced by independent data sources under different semantic assumptions and contexts of use.
The emergence of this new data environment demands revisiting the semantic assumptions behind databases and designing mechanisms that can support semantically heterogeneous databases. This research line aims to fill this gap by proposing a complementary semantic model for databases based on distributional semantics. Distributional semantics provides a complementary perspective to the formal perspective of database semantics, one that supports semantic approximation as a first-class database operation. Unlike models for uncertain and incomplete data or probabilistic databases, distributional-relational models focus on constructing semantic approximation approaches for databases, supported by a semantic model built automatically from large-scale unstructured data external to the database, which serves as a semantic/commonsense knowledge base. This semantic abstraction can shield the database user from the representation of the data, supporting a schema-agnostic approach to data consumption.
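A minimal sketch of the schema-agnostic idea: the user queries with their own vocabulary ("price", "maker"), and the distributional layer maps each query term onto the closest schema attribute ("cost", "manufacturer") before the relational lookup runs. The table, attribute names and vectors are hypothetical; a deployed distributional-relational model would use corpus-derived vectors and a real database.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical vectors covering both user terms and schema attributes.
vectors = {
    "price": [0.90, 0.10], "cost":         [0.88, 0.14],
    "maker": [0.20, 0.90], "manufacturer": [0.18, 0.92],
}

# Toy relation whose schema the user never sees.
table = [
    {"cost": 1200, "manufacturer": "Jacob Freres"},
    {"cost":  800, "manufacturer": "Thomas Chippendale"},
]

def resolve(term, schema, threshold=0.80):
    """Map a user vocabulary term onto the most similar schema attribute,
    or None when nothing clears the threshold."""
    best, best_sim = None, threshold
    for attr in schema:
        sim = cosine(vectors.get(term, []), vectors.get(attr, []))
        if sim >= best_sim:
            best, best_sim = attr, sim
    return best

def query(term, rows):
    """Schema-agnostic projection: resolve the term, then project it."""
    attr = resolve(term, rows[0].keys())
    return [row[attr] for row in rows] if attr else []
```

The user asks for `query("price", table)` and receives the values of the `cost` column; semantic approximation, not string matching, does the schema alignment.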
DSMs: Distributional semantics systems and resources
Distributional semantic models (DSMs) are semantic models based on the statistical analysis of co-occurrences of words in large corpora. DSMs can be used in a wide spectrum of semantic applications, including semantic search, question answering, paraphrase detection and word sense disambiguation. The ability to automatically harvest meaning from unstructured heterogeneous data, their simplicity of use and the ability to build comprehensive semantic models are major strengths of distributional approaches.
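The core construction behind a count-based DSM can be shown on a toy corpus: collect co-occurrence counts within a fixed window, treat each word's count row as its vector, and compare rows by cosine similarity. The four sentences below are invented for illustration; real DSMs are built from corpora many orders of magnitude larger, typically with weighting (e.g. PPMI) and dimensionality reduction on top.

```python
import math
from collections import defaultdict

# Toy corpus; real DSMs are trained on billions of tokens.
corpus = [
    "the sofa and the settee are seating furniture",
    "the settee and the sofa stood in the salon",
    "the column supports the marble entablature",
    "the marble column crowns the entablature",
]

# Count co-occurrences within a symmetric window of 2 words.
window = 2
cooc = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if i != j:
                cooc[w][words[j]] += 1

def similarity(w1, w2):
    """Cosine similarity over sparse co-occurrence rows."""
    a, b = cooc[w1], cooc[w2]
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Even at this scale the distributional hypothesis shows through: `similarity("sofa", "settee")` exceeds `similarity("sofa", "column")`, because the furniture terms share contexts that the architectural term does not.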
The construction of distributional semantic models depends on the processing of large-scale data resources. The 2014 English version of Wikipedia, for example, contains 44 GB of article data. The hardware and software infrastructure required to process large-scale corpora raises high entry barriers for researchers and developers who want to start experimenting with distributional semantics.
In order to reduce these barriers, this research line focuses on the development of fundamental distributional research infrastructure, commoditizing access to distributional semantic resources. The infrastructure consists of software, data and service resources that can be easily reused and redeployed by third parties.