Wikidata Graph Split

From BITPlan cr Wiki

Revision as of 13:34, 11 July 2024

Technical details on the Split

See: https://m.wikidata.org/wiki/Wikidata_talk:SPARQL_query_service/WDQS_graph_split

Scholarly graph:
   1. Find subjects whose P31 is a scholarly article (Q13442814) and find all quads whose context matches those subjects.
   2. Find the references for the elements in 1.
   3. Get the values for the elements in 1 and 2.
   4. Add those together to produce the scholarly graph.

Main ("non-scholarly") graph:
   5. From the full graph, subtract the items identified in 1.
   6. Then remove from 5 the references and values that are only attached to the scholarly graph, but keep any other references or values.
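
The selection criterion of step 1 can be illustrated with a small query. This is only a sketch of the criterion, not of how the split is actually computed; it assumes the usual WDQS prefixes (wd:, wdt:) are predefined on the endpoint, and the LIMIT is an arbitrary choice to keep the result small.

```sparql
# Sketch of step 1: subjects whose P31 is scholarly article (Q13442814).
# Assumes the standard WDQS prefixes are predefined by the endpoint.
SELECT ?subject WHERE {
  ?subject wdt:P31 wd:Q13442814 .
}
LIMIT 10
```

Run against the scholarly endpoint, this should list items that the split assigns to the scholarly graph.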

Endpoints:

  • wikidata-main: https://query-main-experimental.wikidata.org/
  • wikidata-scholarly: https://query-scholarly-experimental.wikidata.org/


Which entities are where?

  • The current split only moves entities of type scholarly article (Q13442814) to the scholarly graph
    • The proposed additional subtypes and additional types are not yet included
  • If an entity has two classes, e.g. scholarly article and an additional one, the entity is still included in main
    • I do not know whether this behavior is intended
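
The dual-class observation above can be checked with a query along the following lines. It is a minimal sketch, assuming the standard WDQS prefixes are predefined; the variable name ?otherClass and the LIMIT are illustrative choices.

```sparql
# Sketch: items that are scholarly articles (Q13442814) and also have
# at least one other P31 class.
SELECT ?entity ?otherClass WHERE {
  ?entity wdt:P31 wd:Q13442814 ;
          wdt:P31 ?otherClass .
  FILTER (?otherClass != wd:Q13442814)
}
LIMIT 10
```

Running it on the main endpoint shows whether such items are still present there.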


Example

SELECT DISTINCT ?valueLabel (COUNT(?valueLabel) AS ?count) WHERE {
  ?entity wdt:P5008 wd:Q112895606;
    (wdt:P31/(wdt:P279*)) wd:Q1266946.
  ?entity p:P921 ?prop.
  OPTIONAL { ?prop ps:P921 ?value. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?valueLabel
ORDER BY DESC (?count)


  • ?entity is of type thesis (Q1266946), which is on the candidate list of entity types to be split. The query is therefore not affected at present, but it may be in the future
  • ?prop is the main subject (P921) statement and is in the main graph, so a federated query is needed


Federated Query for unknown location of ?entity

SELECT DISTINCT ?valueLabel (COUNT(?valueLabel) AS ?count) WHERE {
  {
    ?entity wdt:P5008 wd:Q112895606;
      (wdt:P31/(wdt:P279*)) wd:Q1266946.
    ?entity p:P921 ?prop.
  }
  UNION
  {
    SERVICE <https://query-scholarly-experimental.wikidata.org/sparql> {
      ?entity wdt:P5008 wd:Q112895606;
        (wdt:P31/(wdt:P279*)) wd:Q1266946.
      ?entity p:P921 ?prop.
    }
  }
  
  OPTIONAL { ?prop ps:P921 ?value. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?valueLabel
ORDER BY DESC (?count)

Federated Query for known location of ?entity (location=scholarly subgraph)

SELECT DISTINCT ?valueLabel (COUNT(?valueLabel) AS ?count) WHERE {
  SERVICE <https://query-scholarly-experimental.wikidata.org/sparql> {
    ?entity wdt:P5008 wd:Q112895606;
      (wdt:P31/(wdt:P279*)) wd:Q1266946.
    ?entity p:P921 ?prop.
  }
  OPTIONAL { ?prop ps:P921 ?value. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?valueLabel
ORDER BY DESC (?count)
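
To decide which of the two federated variants applies, one can first count where the matching entities are found. The following is a hedged sketch that reuses the patterns from the queries above; the ?location variable and its "main"/"scholarly" values are illustrative names, and the query is assumed to be run on the main endpoint.

```sparql
# Sketch: count matching entities per location.
# "main" = found locally, "scholarly" = found via the federated SERVICE call.
SELECT ?location (COUNT(DISTINCT ?entity) AS ?entities) WHERE {
  {
    ?entity wdt:P5008 wd:Q112895606;
      (wdt:P31/(wdt:P279*)) wd:Q1266946.
    BIND("main" AS ?location)
  }
  UNION
  {
    SERVICE <https://query-scholarly-experimental.wikidata.org/sparql> {
      ?entity wdt:P5008 wd:Q112895606;
        (wdt:P31/(wdt:P279*)) wd:Q1266946.
    }
    BIND("scholarly" AS ?location)
  }
}
GROUP BY ?location
```

If only one location shows up in the result, the simpler variant for that location is sufficient; if both do, the UNION variant is the safer choice.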