Other protocols

This page lists protocols, formats or other specifications that are similar to OpenRefine's reconciliation API.

NIF web services

The NLP Interchange Format (NIF) is an RDF-based format to annotate natural language text. It can be used to represent the output of an entity linking system (which spots mentions of entities in text). This format is used by GERBIL to benchmark entity linking heuristics.

Entity linking (spotting mentions of entities in text) and entity matching (linking database entries which refer to the same entity) are different tasks, but they have some similarities.
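To make the contrast concrete, the output of an entity linker can be pictured as a list of text spans, each carrying character offsets and a link to an entity. The Python sketch below is only an illustration: the `Mention` class, the `find_mentions` helper, the naive string lookup and the example.org IRIs are all hypothetical, and the comments merely point at the NIF and ITS properties that play the corresponding roles (nif:beginIndex, nif:endIndex, nif:anchorOf, itsrdf:taIdentRef). It does not produce RDF.

```python
from dataclasses import dataclass

@dataclass
class Mention:
    begin: int   # offset where the mention starts (role of nif:beginIndex)
    end: int     # offset just past the mention (role of nif:endIndex)
    anchor: str  # surface text of the mention (role of nif:anchorOf)
    entity: str  # IRI of the linked entity (role of itsrdf:taIdentRef)

def find_mentions(text, gazetteer):
    """Naive linker: report every exact occurrence of a known surface form."""
    mentions = []
    for surface, iri in gazetteer.items():
        start = text.find(surface)
        while start != -1:
            mentions.append(Mention(start, start + len(surface), surface, iri))
            start = text.find(surface, start + 1)
    return sorted(mentions, key=lambda m: m.begin)

# Hypothetical input text and entity IRIs, for illustration only.
TEXT = "Berlin is the capital of Germany."
GAZETTEER = {
    "Berlin": "http://example.org/entity/Berlin",
    "Germany": "http://example.org/entity/Germany",
}

for mention in find_mentions(TEXT, GAZETTEER):
    print(mention.begin, mention.end, mention.anchor, mention.entity)
```

An entity matcher, by contrast, would take whole database records as input rather than running text, so it has no offsets to report.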

OpenSearch

OpenSearch is a protocol that can be used by search engines to expose their results in a uniform way, to be consumed by client applications such as web browsers.

Given that reconciliation is very often search-based, it might make sense to take some inspiration from this protocol.
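As a rough illustration of the kind of uniformity involved: OpenSearch describes a search endpoint through a URL template containing placeholders such as `{searchTerms}`, where a trailing `?` marks a placeholder as optional. The Python sketch below expands such a template. The endpoint `https://example.org/search` and the helper `expand_template` are hypothetical, and the sketch handles only simple, unprefixed parameter names rather than the full template syntax.

```python
import re
from urllib.parse import quote

# Hypothetical template, in the style of the Url element of an
# OpenSearch description document.
TEMPLATE = "https://example.org/search?q={searchTerms}&page={startPage?}"

def expand_template(template, params):
    """Replace {name} and {name?} placeholders with URL-encoded values."""
    def substitute(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in params:
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # a missing optional parameter becomes an empty string
        raise KeyError(f"missing required parameter: {name}")
    return re.sub(r"\{([A-Za-z]+)(\??)\}", substitute, template)

print(expand_template(TEMPLATE, {"searchTerms": "Douglas Adams"}))
print(expand_template(TEMPLATE, {"searchTerms": "Douglas Adams", "startPage": 2}))
```

A client that knows only the template can query any compliant search engine, which is the property that a search-based reconciliation service might borrow.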

HOBBIT and SEALS

The HOBBIT platform has been developed to benchmark entity matching systems in a uniform and principled way. Matchers need to conform to a Java API to be evaluated. This system is used in the Ontology Alignment Evaluation Initiative, an academic workshop run in conjunction with the International Semantic Web Conference.

Similarly, the SEALS platform, which is not documented as extensively, appears to serve the same purpose. Its use is described on the corresponding OAEI 2019 page.

SemTab

SemTab is the annual Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, co-located with the International Semantic Web Conference. Systems for semantic table interpretation are benchmarked using the SemTab evaluators. The specifications for each SemTab task are documented at AIcrowd.

Questioning Authority

Questioning Authority is a web service that wraps various authority databases into one unified API that can be used to search for records by text, similarly to the reconciliation API. It is used in library communities.

Common Terminology Services 2

The Common Terminology Services 2 (CTS2) specification defines an interface to vocabulary systems in the medical domain. The D2Refine project exposes such systems as reconciliation services in OpenRefine, which suggests that CTS2 might have similarities with the reconciliation protocol we develop in this group. It should therefore be investigated.

Simple Standard for Sharing Ontology Mappings

The Simple Standard for Sharing Ontology Mappings (SSSOM) is a format to represent correspondences between ontologies, which also works at the instance level. Beyond stating that a given URI in one vocabulary matches a URI in another vocabulary, it also makes it possible to state the type of match (using SKOS), the reason why the match was established (for instance because the URIs have similar names), the values which were matched, and likely other things. See for instance this TSV file, which represents a mapping between two ontologies.
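To give an idea of what such a mapping table looks like in practice, the Python sketch below reads a small tab-separated sample and keeps only the exact matches. The column names (`subject_id`, `predicate_id`, `object_id`, `mapping_justification`) follow our understanding of the SSSOM conventions and should be checked against the specification. The rows, the `ex1:` and `ex2:` prefixes and the `read_mappings` helper are invented for illustration.

```python
import csv

# Invented sample data. Lines starting with "#" stand in for the commented
# metadata header that such files can carry; this sketch simply skips them.
SAMPLE = (
    "# example metadata line (ignored by this sketch)\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "ex1:001\tskos:exactMatch\tex2:A\tsemapv:LexicalMatching\n"
    "ex1:002\tskos:broadMatch\tex2:B\tsemapv:ManualMappingCuration\n"
)

def read_mappings(text):
    """Parse tab-separated mapping rows into dictionaries keyed by column name."""
    lines = [line for line in text.splitlines() if not line.startswith("#")]
    return list(csv.DictReader(lines, delimiter="\t"))

mappings = read_mappings(SAMPLE)

# Keep only the rows whose match type is an exact match.
exact = [m for m in mappings if m["predicate_id"] == "skos:exactMatch"]
for m in exact:
    print(m["subject_id"], "->", m["object_id"], "because", m["mapping_justification"])
```

Each row plays a role comparable to an accepted reconciliation candidate, with the added benefit that the match type and its justification are recorded explicitly.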