Data interlinking through robust linkrule extraction

This page provides the linkrule extraction program and the datasets used in the experiments that evaluate it.

Linkrule extraction software

The linkrule extraction algorithm is implemented in Java. It takes as input two RDF files (in either RDF/XML or Turtle), each describing only instances of a particular class, and optionally a third RDF file containing a set of owl:sameAs links that can be used when comparing the objects of properties. The output is a set of candidate linkrules with the following statistics: #links, discriminability, coverage, h-mean.
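The statistics listed above are not defined on this page. As a minimal sketch, assuming that h-mean denotes the harmonic mean of discriminability and coverage and that both values lie in [0, 1], the combined score could be computed as follows. The class and method names are hypothetical and are not part of linkrule-extractor.jar.

```java
public class HMeanSketch {

    // Assumption: h-mean is the harmonic mean of discriminability and coverage.
    // Both inputs are expected to lie in [0, 1].
    static double hMean(double discriminability, double coverage) {
        double sum = discriminability + coverage;
        if (sum == 0.0) {
            return 0.0; // avoid division by zero when both measures are 0
        }
        return 2.0 * discriminability * coverage / sum;
    }

    public static void main(String[] args) {
        // Illustrative values only, not taken from any experiment.
        System.out.println(hMean(0.5, 1.0));
    }
}
```

Under this assumption, a candidate that scores high on only one of the two measures is penalised more than it would be by an arithmetic mean.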

Download linkrule extractor software

Syntax:
java -jar linkrule-extractor.jar dataset1.ttl dataset2.ttl [object-owlsame-links.ttl]

Datasets

The two datasets describing communes: communes_insee.ttl and communes_gn.ttl

The owl:sameAs links between arrondissements (objects of some properties describing communes in both datasets): links_arrondissements_insee_gn.ttl

Reference owl:sameAs links between communes: links_communes_insee_gn.ttl

Reproduce Experiments

The following command runs the robustness experiments. It varies, from 0 to 0.9, the probability of respectively removing instances, removing triples, scrambling triples, and removing reference links. For each probability value, 10 runs are performed and the reported results are the average over these runs. The output files are the following:

Syntax:
java -mx6000m -cp linkrule-extractor.jar fr.inrialpes.exmo.linkrule.EcaiExperiments communes_insee.ttl communes_gn.ttl links_communes_insee_gn.ttl links_arrondissements_insee_gn.ttl
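
The protocol described above (a probability varied from 0 to 0.9, 10 runs per value, results averaged) can be sketched as follows. This is an illustration only and not the code of EcaiExperiments: the class name, the helper methods, the step of 0.1 between probability values, and the placeholder score (fraction of instances kept) are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.ToDoubleFunction;

public class RobustnessSketch {

    static final int RUNS = 10; // 10 runs per probability value, as stated above

    // Removes each item independently with probability p.
    static <T> List<T> degrade(List<T> items, double p, Random rnd) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            if (rnd.nextDouble() >= p) {
                kept.add(item);
            }
        }
        return kept;
    }

    // Averages a score over RUNS independently degraded copies of the input.
    static <T> double averageScore(List<T> items, double p,
                                   ToDoubleFunction<List<T>> score, Random rnd) {
        double total = 0.0;
        for (int run = 0; run < RUNS; run++) {
            total += score.applyAsDouble(degrade(items, p, rnd));
        }
        return total / RUNS;
    }

    public static void main(String[] args) {
        // Stand-in for the instances of a dataset.
        List<Integer> instances = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            instances.add(i);
        }
        Random rnd = new Random(42);

        // Assumed step of 0.1: p = 0.0, 0.1, ..., 0.9.
        for (int step = 0; step <= 9; step++) {
            double p = step / 10.0;
            // Placeholder score: fraction of instances that survive the removal.
            double avg = averageScore(instances, p,
                    kept -> (double) kept.size() / instances.size(), rnd);
            System.out.printf("p=%.1f average fraction kept=%.3f%n", p, avg);
        }
    }
}
```

In an actual experiment the placeholder score would be replaced by the quality of the extracted linkrules on the degraded data, and the same loop would be repeated for each kind of alteration (instances, triples, scrambled triples, reference links).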