Talk:Knowledge graph
This article is rated Start-class on Wikipedia's content assessment scale.
Separated from ontology (information science)
Markus Krötzsch, SebastianHellmann, denny -- I started separating this from ontology (information science), would welcome input. – SJ + 23:25, 29 June 2020 (UTC)
This appears to be a WP:Broad-concept article, not an ambiguous term. There is no other meaning for "knowledge graph" than the one used here, thus no need for a disambiguation page. But perhaps Knowledge Graph should be moved to Google Knowledge Graph. I'm confused about Wikidata. The lead of that article says it's a knowledge base, not a knowledge graph. I guess it's both as apparently a knowledge graph is a specific type of knowledge base. In that case the Wikidata lead should define it by the more specific term. – wbm1058 (talk) 15:08, 30 June 2020 (UTC)
- That makes sense, thanks. – SJ + 18:17, 1 July 2020 (UTC)
- Most of the experts, including the recently elected top 10 most influential scholars in knowledge engineering, were using 'knowledge base' for the newly developing large knowledge graphs. Few were already using 'knowledge graph'. The first reason is that knowledge base is the general hypernym. The second reason, IMHO, was that we stored the knowledge graphs in relational databases for lack of genuine graph databases (hundreds of these exist today): DBpedia was stored in MySQL at first, then moved to Virtuoso in 2008, which still uses a relational database, as SPARQL and SQL have a similar algebra. Wikidata still stores everything in a relational database, but it's definitely a graph model. Technically, I would consider knowledge base too general now. The 2011 Google campaign popularized the term, and it specifically refers to knowledge bases that use a graph model. I find the article here well written. I second wbm1058's opinion not to have a Knowledge Graph disambiguation page, as it is quite well defined here now as a WP:Broad-concept article. SebastianHellmann (talk) 21:46, 17 September 2020 (UTC)
Here is a relevant earlier comment from Sebastian: [1]: – SJ +
I think I found a good approach to sort the terminology between the different things:
- Knowledge graphs are very entity-centric, so they focus primarily on entities and their relations. So the entity-attribute-value model applies.
- Knowledge bases are more focused on axioms, as in Prolog, where you formulate axioms or predicate logic statements. This would also generate a knowledge graph, but more implicitly, as part of the universe. They are a bit more similar to databases, where you have n-ary relations or tuples and something like foreign keys.
- Ontologies: these have a strong modelling aspect and regulate the schema, axiomatic knowledge, and the terminology of the schema, such as subClassOf/subPropertyOf. I would see them as quite distinct from RDF or property graphs. They are more something you can put on top of a knowledge graph, and you can have several in order to have several views of the same knowledge graph.
- To compare: Wikidata has a totally item-centric view and is therefore a knowledge graph. Wikipedia has a lot of wikilinks, which also make up a graph; however, the articles are split into entities and conceptual articles, so it is part graph, part ontology (very broadly). DBpedia is a knowledge graph plus ontologies on top.
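To make the distinction above concrete, here is a minimal Python sketch, not taken from any of the systems mentioned. It assumes a toy set of entity-attribute-value triples as the knowledge graph and a separate set of subClassOf statements as an ontology layered on top. All entity and property names, and the helper functions `describe` and `superclasses`, are hypothetical.

```python
# Entity-attribute-value (EAV) triples: each fact is (entity, attribute, value).
# All identifiers below are made up for illustration.
knowledge_graph = {
    ("Douglas_Adams", "instance_of", "Human"),
    ("Douglas_Adams", "author_of", "Hitchhikers_Guide"),
    ("Hitchhikers_Guide", "instance_of", "Novel"),
}

# An ontology sits on top of the graph as schema-level statements.
ontology = {
    ("Novel", "subClassOf", "Book"),
    ("Book", "subClassOf", "CreativeWork"),
}


def describe(entity, graph):
    """Entity-centric view: every attribute/value pair recorded for one entity."""
    return sorted((a, v) for e, a, v in graph if e == entity)


def superclasses(cls, schema):
    """Follow subClassOf links upward to collect all ancestors of a class."""
    found = []
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in schema:
            if s == current and p == "subClassOf" and o not in found:
                found.append(o)
                frontier.append(o)
    return found
```

Calling `describe("Douglas_Adams", knowledge_graph)` gives the entity-centric view, while `superclasses("Novel", ontology)` consults only the schema layer. A different ontology could be swapped in over the same triples to give a different view of them.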
Merge proposal discussion
- Comment. I've cleaned up the disambiguation page, and it should remain a disambiguation page. -- JHunterJ (talk) 11:15, 1 July 2020 (UTC)
- Agreed, especially after this move: the dab page disambiguates the generic term that Google chose to use for the Google Knowledge Graph. This section is about ambiguity in the definition of the general-purpose term. – SJ + 16:56, 1 July 2020 (UTC)
Section removed
This was unsourced and confusing. A rewritten + sourced version could be appropriate. – SJ + 17:30, 27 July 2020 (UTC)
- The benefits of using a knowledge graph
In the case of integrating supplemental data sources:
A KG formally represents the meaning involved in information by describing concepts, relationships between things, and categories of things. Embedding these semantics with the data offers significant advantages, such as reasoning over data and dealing with heterogeneous data sources. Rules can be applied to a KG more efficiently using graph queries. For example, a graph query performs inference by following connected relations, instead of repeatedly searching entire tables in a relational database. A KG facilitates the integration of new heterogeneous data by just adding new relationships between existing information and new entities. This is especially useful for integration with existing popular linked open data sources such as Wikidata.org.
An SQL query is tightly coupled to the specific database and rigidly constrained by its datatypes. It can join tables and extract data from tables. The result is generally a table. A query can join tables on any columns whose datatypes match. SPARQL is the standard query language and protocol for Linked Open Data on the Web. It is only loosely coupled with the database, which facilitates reusability. It can extract data through relations regardless of datatype, and it can not only extract data but also generate additional knowledge-graph content with more sophisticated logical operations (transitive/symmetric/inverseOf/functional). An inference-based query (a query on the existing asserted facts, without the generation of new facts by logic) can be fast compared with a reasoning-based query (a query on the existing facts plus the facts generated or discovered by logic).
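The contrast between querying asserted facts only and querying asserted plus generated facts can be illustrated with a short Python sketch. It is a simplified forward-chaining loop over toy triples, not how any particular SPARQL engine or reasoner works. The function `materialize`, its parameters, and all place and property names are assumptions made for the example.

```python
def materialize(facts, transitive=(), symmetric=(), inverse_of=None):
    """Forward-chain simple property rules until no new triples appear."""
    inverse_of = inverse_of or {}
    inferred = set(facts)
    while True:
        new = set()
        for s, p, o in inferred:
            if p in symmetric:
                # symmetric: (a p b) implies (b p a)
                new.add((o, p, s))
            if p in inverse_of:
                # inverseOf: (a p b) implies (b q a) for the declared inverse q
                new.add((o, inverse_of[p], s))
            if p in transitive:
                # transitive: (a p b) and (b p c) imply (a p c)
                for s2, p2, o2 in inferred:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= inferred:
            return inferred
        inferred |= new


# Hypothetical asserted facts.
asserted = {
    ("Berlin", "located_in", "Germany"),
    ("Germany", "located_in", "Europe"),
    ("Germany", "borders", "France"),
    ("Berlin", "capital_of", "Germany"),
}

reasoned = materialize(
    asserted,
    transitive={"located_in"},
    symmetric={"borders"},
    inverse_of={"capital_of": "has_capital"},
)
```

A lookup in `asserted` touches only the four stored triples. A lookup in `reasoned` also sees derived triples such as `("Berlin", "located_in", "Europe")`, at the cost of computing the closure first.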
Integrating heterogeneous data sources in a traditional database is intricate: it requires redesigning the database tables, for example by changing their structure and/or adding new data. In the case of semantic queries, a SPARQL query reflects the relationships between entities in a way that is aligned with a human's understanding of the domain, so the semantic intention of the query can be seen in the query itself. Unlike a SPARQL query, an SQL query reflects the specific structure of the database and is derived from matching the relevant primary and foreign keys of tables. It thereby loses the semantics of the query, because the relationships between entities are not stated.
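A rough Python sketch of this point follows. It uses plain in-memory lists and sets, not a real SQL or SPARQL engine. The tables, triples, and function names are all hypothetical. The `match` helper only loosely mimics a single SPARQL triple pattern.

```python
# Relational layout: the link between people and cities exists only as a key match.
people = [
    {"id": 1, "name": "Ada", "city_id": 10},
    {"id": 2, "name": "Grace", "city_id": 11},
]
cities = [
    {"id": 10, "name": "London"},
    {"id": 11, "name": "New York"},
]


def residents_relational(city_name):
    """Join people to cities on city_id = id, as an SQL join would."""
    return sorted(
        p["name"]
        for p in people
        for c in cities
        if p["city_id"] == c["id"] and c["name"] == city_name
    )


# Graph layout: the relationship is named in the data and in the query pattern.
triples = {
    ("Ada", "lives_in", "London"),
    ("Grace", "lives_in", "New York"),
}


def match(pattern, graph):
    """Return triples matching a (subject, predicate, object) pattern; None is a wildcard."""
    return [t for t in graph if all(q is None or q == v for q, v in zip(pattern, t))]


def residents_graph(city_name):
    """Find subjects connected to the city by the named relation lives_in."""
    return sorted(s for s, _, _ in match((None, "lives_in", city_name), triples))
```

Both functions return the same names for a given city. In the relational version the relationship has to be read off the key comparison, while the graph pattern `(None, "lives_in", city_name)` names it directly.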
A KG helps to find latent connections among items, which improves precision. A KG also helps to identify a user's intention that remains hidden when only the ML output is available, which brings explainability to the targeting system.
Diagram(s) needed
This article would benefit from one or more suitable diagrams explaining the structure of knowledge graphs and giving examples of data in such structures. I assume those would be similar to semantic nets so drawing them should not be difficult. Chiswick Chap (talk) 10:17, 4 August 2020 (UTC)
- @Addshore: any better diagrams come to mind? – SJ + 01:32, 4 August 2021 (UTC)
Wiki Education assignment: INFO 505 - Foundations of Information Science
This article was the subject of a Wiki Education Foundation-supported course assignment, between 22 August 2023 and 11 December 2023. Further details are available on the course page. Student editor(s): CarpenterAnt (article contribs). Peer reviewers: SummerNightmare2023, Ftalebhaghighi, Waveformleaf, Bellestar12, Blackshadow005.
— Assignment last updated by Blackshadow005 (talk) 21:37, 6 November 2023 (UTC)