RDF queries using a relational triple store

While I was travelling in the States, I had the idea of writing a SQL stored procedure that would query an RDF triple store and return matching triple sets. The query would itself be specified as a triple set, with variables written as names prefixed by '?'.

For example, a simple query ("find the nodes with a name property that has the literal value of 'ian'") might look like this:

    ['?b', 'name', 'ian']

A more complex Friend of a Friend query ("find names of people that know a person that knows a person called 'ian'") might look like the triple set:

    ['?a', 'knows', '?b'], ['?b', 'knows', '?c'], ['?c', 'name', 'ian'], ['?a', 'name', '?e']

The idea of describing RDF queries as triple sets was inspired by the work of Libby Miller on SquishQL, which treats RDF query definitions as sub-graphs.

The stored procedure I have built returns sets of triples that match all of the constraints declared in the query. A single matching set of triples might be:

    ['_:123', 'knows', '_:456'], ['_:456', 'knows', '_:789'], ['_:789', 'name', 'ian'], ['_:123', 'name', 'james']

The current logic seems to work correctly, but I need to test the unusual cases more extensively before I can be confident in it. The following are examples of queries that have been somewhat harder to deal with:

1. ['?a', 'knows', '?b'], ['?b', 'knows', '?c'], ['?b', 'knows', '?d']
   This translates as "find people who know a person who knows two people". The problem is that the result sets need to repeat the values for ?a and ?b for each pair of known people.

2. ['?a', 'knows', '?b'], ['?c', 'knows', '?b'], ['?b', 'knows', '?d']
   This translates as "find two people who know the same person who knows a person".

3. ['?a', 'knows', '?b'], ['?b', 'knows', '?c'], ['?c', 'knows', '?a']
   This translates as "find people who know a person who knows a person who knows them", i.e. a circular chain relationship.

The next step for me is to build this on top of a large triple store, to look at scalability issues. Following this, I will implement a simple web query interface and get some feedback on the query syntax etc.

As you can see, at this stage I have completely glossed over issues like datatypes, the difference between resources and literals, and other very important areas in RDF. The goal is to build a high-performance system that will scale to millions of triples, and to do it using plain ANSI SQL-92. I think this would be of real value to the FOAF scutter builders, allowing a logical progression from spidering RDF statements to being able to query them.
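To make the idea concrete, here is a minimal sketch of one way a triple-set query could be turned into plain SQL: one self-join of the triple table per pattern, with shared variables becoming equality conditions. It is not the stored procedure described above. It assumes a single table called triples with text columns s, p and o, uses Python with an in-memory SQLite database purely for illustration, and the function names build_sql and run_query are made up for this example.

```python
import sqlite3

def build_sql(patterns):
    """Translate (subject, predicate, object) patterns into one SELECT.

    Terms starting with '?' are variables; everything else is a constant.
    Assumes a table triples(s, p, o); that schema is illustrative only.
    """
    select, where, params = [], [], []
    first_seen = {}  # variable name -> column where it first appeared
    for i, pattern in enumerate(patterns):
        for col, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{col}"
            select.append(ref)
            if not term.startswith("?"):
                # Constant: compare the column against a bound parameter
                where.append(f"{ref} = ?")
                params.append(term)
            elif term in first_seen:
                # Repeated variable: must equal its first occurrence
                where.append(f"{ref} = {first_seen[term]}")
            else:
                first_seen[term] = ref
    sql = "SELECT " + ", ".join(select)
    sql += " FROM " + ", ".join(f"triples t{i}" for i in range(len(patterns)))
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params

def run_query(conn, patterns):
    """Run the query and regroup each flat row into one triple per pattern."""
    sql, params = build_sql(patterns)
    rows = conn.execute(sql, params).fetchall()
    return [[tuple(row[i:i + 3]) for i in range(0, len(row), 3)] for row in rows]

# Sample data taken from the matching set shown in the post
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("_:123", "knows", "_:456"),
    ("_:456", "knows", "_:789"),
    ("_:789", "name", "ian"),
    ("_:123", "name", "james"),
])

foaf = [("?a", "knows", "?b"), ("?b", "knows", "?c"),
        ("?c", "name", "ian"), ("?a", "name", "?e")]
matches = run_query(conn, foaf)
for match in matches:
    print(match)
```

On this sample data the Friend of a Friend query yields the single matching set shown earlier. Because each pattern gets its own copy of the table, a query such as harder case 1 naturally repeats the values for ?a and ?b in every row, once per combination of ?c and ?d. Whether a self-join per pattern performs acceptably on millions of triples is exactly the scalability question still to be tested.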
