Parsing FOAF with PHP

Summary: An introduction to parsing FOAF and RDF using the RAP parser for PHP.

Introduction

This article is a guide to using PHP to parse FOAF documents. FOAF stands for Friend-of-a-Friend and is a fun application of RDF that describes people and their relationships to one another. The article assumes that the reader is familiar with XML and PHP but has little or no knowledge of RDF or FOAF and how to parse them.

Three Minute RDF

RDF (Resource Description Framework) is a format devised by the W3C (World Wide Web Consortium) to represent metadata and knowledge on the web.

RDF uses a graph model to store information. Each node in the graph is called a Resource and may have a URI as a label. Some nodes represent text and are called Literals which are special types of Resources. They are labelled by their text. The arcs that connect the nodes are called Properties and must be labelled by a URI. Arcs always have a direction, i.e. a start node and an end node.

Here's a simple RDF graph to represent the following facts:

  1. John Brown works at a school called "Hill View".
  2. John Brown is aged 27.

A graph with an oval representing John Brown, an oval representing a school, a rectangle with the text "27" and a rectangle with the text "Hill View". An arc links the John Brown oval to the rectangle containing 27. Another arc links the John Brown oval to the school oval. A final arc links the school oval to the rectangle containing Hill View.

Since resources in RDF are labelled with a URI, we have to choose a URI for each of our concepts. In this case http://example.com/people/john represents John, http://example.com/orgs/hillview represents the school, and http://example.com/nouns/employer, http://example.com/nouns/age and http://example.com/nouns/name represent the properties that link these resources and literals together.

An RDF graph can be written down as a list of relationships between the nodes, i.e. the start node, the arc, the end node. These relationships are called Triples. The start node is called the Subject, the arc is called the Predicate and the end node is called the Object. The graph above can be written down as the following triples:

Subject: http://example.com/people/john
Predicate: http://example.com/nouns/employer
Object: http://example.com/orgs/hillview

Subject: http://example.com/people/john
Predicate: http://example.com/nouns/age
Object: "27"

Subject: http://example.com/orgs/hillview
Predicate: http://example.com/nouns/name
Object: "Hill View"
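One way to picture these triples in code, before any RDF library is involved, is as plain PHP arrays. This is only an illustrative sketch: the array layout, the quoting convention for literals and the formatTriple helper are assumptions made for this example and are not part of RAP.

```php
<?php
// Each triple is a plain array: subject, predicate, object.
// Literal objects are wrapped in double quotes here purely as a display
// convention for this sketch; RAP represents literals with its own objects.
$triples = array(
  array('http://example.com/people/john', 'http://example.com/nouns/employer', 'http://example.com/orgs/hillview'),
  array('http://example.com/people/john', 'http://example.com/nouns/age', '"27"'),
  array('http://example.com/orgs/hillview', 'http://example.com/nouns/name', '"Hill View"'),
);

// Hypothetical helper: render one triple in the Subject/Predicate/Object layout.
function formatTriple($triple) {
  list($subject, $predicate, $object) = $triple;
  return "Subject: $subject\nPredicate: $predicate\nObject: $object\n";
}

foreach ($triples as $triple) {
  echo formatTriple($triple), "\n";
}
```

Note that the order of the triples in the array carries no meaning; the graph is the same whichever way round they are listed.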

There are several ways to store RDF in a file, one of which is XML. An XML version of the above could be:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:noun="http://example.com/nouns/">
  <rdf:Description rdf:about="http://example.com/people/john">
    <noun:employer>
      <rdf:Description rdf:about="http://example.com/orgs/hillview">
        <noun:name>Hill View</noun:name>
      </rdf:Description>
    </noun:employer>
    <noun:age>27</noun:age>
  </rdf:Description>
</rdf:RDF>
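Because this is ordinary XML, you can peek at it with PHP's generic XML tools. The sketch below assumes PHP 5 or later with the SimpleXML extension, and it is tied to this one particular way of writing the graph down; the same triples can be serialised in other shapes that this XPath would miss, which is why the rest of the article uses a real RDF parser.

```php
<?php
// The RDF/XML document from above, held in a string for the example.
$rdfXml = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:noun="http://example.com/nouns/">
  <rdf:Description rdf:about="http://example.com/people/john">
    <noun:employer>
      <rdf:Description rdf:about="http://example.com/orgs/hillview">
        <noun:name>Hill View</noun:name>
      </rdf:Description>
    </noun:employer>
    <noun:age>27</noun:age>
  </rdf:Description>
</rdf:RDF>
XML;

$doc = simplexml_load_string($rdfXml);

// XPath needs the namespace prefixes registered explicitly.
$doc->registerXPathNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
$doc->registerXPathNamespace('noun', 'http://example.com/nouns/');

// Find John's description and read the text of its noun:age child.
$ageNodes = $doc->xpath('/rdf:RDF/rdf:Description[@rdf:about="http://example.com/people/john"]/noun:age');
$age = count($ageNodes) > 0 ? (string) $ageNodes[0] : '';

echo "Age: $age\n";
```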

This article only deals with parsing the XML form of RDF since that is the most commonly used form on the web. For a more comprehensive introduction to RDF with tons of examples, see the excellent RDF Primer from the W3C.

Three Minute FOAF

FOAF provides about two dozen useful terms for describing people. These can be plugged into existing RDF or form the basis for a dedicated FOAF file, packed full of information about a person and the people they know. The most important term in FOAF is Person which (surprise!) represents a person. There is also a set of helpful properties such as name, homepage and knows.

A typical FOAF file might look like the following:

<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:rss="http://purl.org/rss/1.0/"
      >

  <rdf:Description rdf:about="">
    <dc:title>FOAF for Ian Davis</dc:title>
    <dc:description>Friend-of-a-Friend description for Ian Davis</dc:description>
    <dc:creator rdf:resource="#ian" />
  </rdf:Description>

  <foaf:Person rdf:ID="ian">
    <foaf:name>Ian Davis</foaf:name>
    <foaf:title>Mr</foaf:title>
    <foaf:firstName>Ian</foaf:firstName>
    <foaf:surname>Davis</foaf:surname>
    <foaf:nick>iand</foaf:nick>
    <foaf:mbox rdf:resource="mailto:iand@internetalchemy.org"/>
    <foaf:mbox_sha1sum>69e31bbcf58d432950127593e292a55975bc55fd</foaf:mbox_sha1sum>
    <foaf:homepage rdf:resource="http://internetalchemy.org/iand/"/>
    <foaf:depiction rdf:resource="http://internetalchemy.org/iand/ianportrait.jpg"/>
    <foaf:workplaceHomepage rdf:resource="http://www.innovateer.com/"/>
    <foaf:schoolHomepage rdf:resource="http://www.rhul.ac.uk/" />

    <foaf:made>
      <rss:channel rdf:about="http://internetalchemy.org/index.rss">
        <dc:title xml:lang="en">Internet Alchemy Weblog RSS Feed</dc:title>
        <dc:description xml:lang="en">An RSS feed of weblog postings to the Internet Alchemy weblog</dc:description>
      </rss:channel>
    </foaf:made>

    <foaf:interest>
      <rdf:Description rdf:about="http://xmlns.com/foaf/0.1" rdfs:label="Friend-of-a-Friend"/>
    </foaf:interest>

    <foaf:interest>
      <rdf:Description rdf:about="http://www.w3.org/2000/01/sw/" rdfs:label="Semantic Web Development"/>
    </foaf:interest>

    <foaf:interest>
      <rdf:Description rdf:about="http://www.w3.org/RDF/" rdfs:label="Resource Description Framework (RDF)"/>
    </foaf:interest>
    
    <foaf:knows>
      <foaf:Person>
        <foaf:name>James Carlyle</foaf:name>
        <foaf:mbox_sha1sum>9b7ac29183a3106b9ca8bc436d42f61ee284d147</foaf:mbox_sha1sum>
        <foaf:depiction rdf:resource="http://internetalchemy.org/iand/pics/jamesCarlyle.jpg" />
        <rdfs:seeAlso rdf:resource="http://www.takepart.com/about/foaf.rdf" />
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>

This file describes a person (me) who has a name of Ian Davis, a title of Mr, a firstName of Ian, a…you get the idea I hope. There are also some other interesting properties such as made which describes something that a person has made, in this case an RSS channel. (RSS is an RDF format for sharing news items). We'll be looking at this property more closely later on. Another interesting property is knows. In the file above it contains another person. This could in turn contain knows properties linking this person with yet more people. This linking feature is one of the things that makes FOAF such an interesting subject for exploring RDF. To find out more about FOAF see the FOAF project page.

RDF Parsers for PHP

There are very few RDF parsers available for PHP. One of the first was the RDF Parser packaged with the phpxmlclasses SourceForge project. It was originally a port of the C Repat parser to pure PHP but has lagged behind both the evolving RDF specification and PHP itself. Running it under PHP 4 results in a number of ugly error messages about deprecated reference syntax.

A more recent, and more compliant, parser is RAP (RDF API for PHP). This is a PHP implementation of Sergey Melnik's proposed RDF API and is the RDF parser we'll be using for the examples in this article.

Getting Started with RAP

Installing RAP involves downloading the latest version from SourceForge and extracting the zipped files to somewhere on your web server. To import the classes into your script you need to define a constant RDFAPI_INCLUDE_DIR that points to the installation directory. RAP uses this constant internally so you can't avoid defining it:

<?php
  define("RDFAPI_INCLUDE_DIR", "lib/rdfapi-php/api/");
  require_once(RDFAPI_INCLUDE_DIR . "RdfAPI.php");
?>

This header code needs to go at the top of any PHP script that uses RAP.

Parsing Some FOAF

Let's dive straight in and get parsing. This first example creates an RDF parser, parses some FOAF and outputs a list of all the triples it finds:

$parser = new RdfParser();
$model = $parser->generateModel("http://internetalchemy.org/iand/foaf.rdf");
$it = $model->getStatementIterator();

while ($it->hasNext()) {
  $statement = $it->next();
  echo "<p>";
  echo "Subject: " . $statement->getLabelSubject() . "<br />";
  echo "Predicate: " . $statement->getLabelPredicate() . "<br />";
  echo "Object: " . $statement->getLabelObject();
  echo "</p>";
}

The first thing this code does is create the parser. The parser will be responsible for fetching the RDF from the URL we provide, parsing it and converting it into a set of triples. The next line actually does the parsing by calling the parser's generateModel method, passing in the URL of my FOAF file. The generateModel method returns a model object which is like a database that stores triples. RAP only has one kind of model and it stores all the triples in memory. More mature RDF parsers typically provide several different implementations of the model class that store the triples in a more scalable database.

Line 3 calls getStatementIterator, one of the model's methods, to create the iterator. You might be wondering why the method is called getStatementIterator and not getTripleIterator. There's no deep magic involved: a statement is just another name for a triple in RDF-world.

Iterators have a number of methods for traversing the triples, the most important of which are hasNext and next. hasNext returns true if there are any triples left in the iterator, while next returns the next triple in the sequence. Yes, the triples are in sequence, but the order is arbitrary and you can't tell in advance how the triples will be organised.

The next few lines simply print out the various components of the triple using the convenience methods getLabelSubject, getLabelPredicate and getLabelObject. For resources, these methods return a string containing the resource's URI; for literals, they return the literal value.

The iterator code from line 3 onwards forms an idiom that you'll use over and over again. Because the number of triples in a model may be very large, it's more scalable and future-proof for RAP to provide access to them through an iterator rather than loading them into an array. In future versions that use database storage for the models, the iterator can switch to making database calls without you having to change your code.

The general form of the idiom is:

Get iterator from model
While iterator has another statement
  get next statement
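To make the idiom concrete without needing RAP installed, here is a minimal sketch of an iterator with the same hasNext and next shape. ArrayStatementIterator is a hypothetical stand-in written for this example (it is not RAP's class), it uses PHP 5 class syntax, and the "statements" are just strings.

```php
<?php
// Hypothetical stand-in for the iterator a model hands back; not RAP code.
class ArrayStatementIterator {
  private $statements;
  private $position = 0;

  public function __construct(array $statements) {
    // Re-index so positions run 0, 1, 2, ... whatever keys were passed in.
    $this->statements = array_values($statements);
  }

  // True while there are statements left to read.
  public function hasNext() {
    return $this->position < count($this->statements);
  }

  // Return the current statement and move on to the following one.
  public function next() {
    return $this->statements[$this->position++];
  }
}

// Get iterator, then loop while it has another statement.
$it = new ArrayStatementIterator(array('first statement', 'second statement'));
$seen = array();

while ($it->hasNext()) {
  $statement = $it->next();
  $seen[] = $statement;
  echo $statement, "\n";
}
```

The calling loop never learns where the statements live, which is what would let a different backing store be swapped in behind the same two methods.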

Here's some of the output from parsing my FOAF file:

Subject: http://internetalchemy.org/iand/foaf.rdf
Predicate: http://purl.org/dc/elements/1.1/title
Object: FOAF for Ian Davis

Subject: http://internetalchemy.org/iand/foaf.rdf
Predicate: http://purl.org/dc/elements/1.1/description
Object: Friend-of-a-Friend description for Ian Davis

Subject: http://internetalchemy.org/iand/foaf.rdf
Predicate: http://purl.org/dc/elements/1.1/creator
Object: http://internetalchemy.org/iand/foaf.rdf#ian

Subject: http://internetalchemy.org/iand/foaf.rdf#ian
Predicate: http://www.w3.org/1999/02/22-rdf-syntax-ns#type
Object: http://xmlns.com/foaf/0.1/Person

Let's look at the triples produced and try to see what they're telling us. If you're au fait with RDF then you can skip this part. Lines 1 to 3 represent the first triple extracted from the RDF. The subject is http://internetalchemy.org/iand/foaf.rdf which happens to be the URI of the FOAF file so this triple is saying something about the FOAF file itself. The predicate is http://purl.org/dc/elements/1.1/title, which corresponds to the title element in the Dublin Core namespace. The Dublin Core namespace defines a vocabulary for describing documents and is hugely successful and widely deployed on the Internet. The object for this triple is 'FOAF for Ian Davis'. So, this triple is telling us that the title of this FOAF file is 'FOAF for Ian Davis'. Phew!

The next triple uses the description element from the Dublin Core namespace and provides a more detailed description of the file, namely 'Friend-of-a-Friend description for Ian Davis'.

The third triple uses the Dublin Core creator element but instead of having a simple text value it has a URI - http://internetalchemy.org/iand/foaf.rdf#ian. This URI looks like it is referring to something in the original FOAF file in the same way that you use a # and a name at the end of an HTML document's URL to refer to a part of that document. However, instead of referring to a piece of markup it refers to a resource in the RDF model that is produced from that document. Somewhere in the RDF document something is defined to have an rdf:ID of 'ian' which is RDF's way of labelling things so that they can be referred to within the same document. Don't worry, the next triple is about to make things clearer.

This fourth triple, on lines 13 to 15, says something about this resource. It has a predicate of http://www.w3.org/1999/02/22-rdf-syntax-ns#type which corresponds to the RDF type property and an object of http://xmlns.com/foaf/0.1/Person. Aha! our first reference to FOAF. This triple is telling us that the resource identified by http://internetalchemy.org/iand/foaf.rdf#ian is a member of the Person class.

Collectively these four triples say something like:

The file at http://internetalchemy.org/iand/foaf.rdf has a title of 'FOAF for Ian Davis', a description of 'Friend-of-a-Friend description for Ian Davis' and was created by the person identified by the URI http://internetalchemy.org/iand/foaf.rdf#ian.
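The way an rdf:ID of 'ian' turned into http://internetalchemy.org/iand/foaf.rdf#ian in the third and fourth triples can be sketched in a few lines. The uriForRdfId function is a hypothetical helper written for this illustration, and it assumes the document is identified by the URI it was fetched from (RDF/XML also allows an xml:base attribute to change the base, which this sketch ignores).

```php
<?php
// Hypothetical helper: build the full URI that an rdf:ID value stands for.
function uriForRdfId($documentUri, $id) {
  // Drop any fragment already present on the document URI...
  $parts = explode('#', $documentUri, 2);
  $base = $parts[0];
  // ...then append the ID as the new fragment.
  return $base . '#' . $id;
}

$ianUri = uriForRdfId('http://internetalchemy.org/iand/foaf.rdf', 'ian');
echo $ianUri, "\n";
```

The parser does this for you, which is why the triples it produces always carry complete URIs.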

Getting Information About a Person

It's time to do something more interesting and start querying the RDF model. FOAF allows you to describe all kinds of things about people, probably the most important of which is their name. We know that http://internetalchemy.org/iand/foaf.rdf#ian refers to a resource that is a member of the Person class. The FOAF schema tells us that there is a property called 'name' which has the URI http://xmlns.com/foaf/0.1/name. So the triples we're looking for have the following pattern:

Subject: http://internetalchemy.org/iand/foaf.rdf#ian
Predicate: http://xmlns.com/foaf/0.1/name
Object: ??

To match the triples in the model with this subject and predicate we create resource objects for the things we're looking for and pass them to the model's find method. Passing NULL for the object tells find to match any value in that position. We can create a Resource object in two ways: by passing it the complete URI of the resource we want to create or by passing in a namespace URI plus the local name of the resource. The following code illustrates both ways of doing it:

$personResource = new Resource("http://internetalchemy.org/iand/foaf.rdf#ian");
$foafNameResource = new Resource("http://xmlns.com/foaf/0.1/", "name");
$matches = $model->find($personResource, $foafNameResource, NULL);

The find method returns another model which contains all the triples that match the pattern we supplied. We can use the model's iterator to print these out:

$it = $matches->getStatementIterator();
echo "<p>http://internetalchemy.org/iand/foaf.rdf#ian has the following names:</p>";

while ($it->hasNext()) {
  $statement = $it->next();
  echo "<p>";
  echo $statement->getLabelObject();
  echo "</p>";
}
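If the wildcard matching feels abstract, this plain-PHP sketch shows the idea on the John Brown triples from earlier. It does not use RAP: the array-based "model" and the findTriples function are assumptions made for illustration, with NULL playing the same match-anything role it plays in the call to find.

```php
<?php
// Plain-PHP stand-in for a model: each triple is array(subject, predicate, object).
$triples = array(
  array('http://example.com/people/john', 'http://example.com/nouns/age', '27'),
  array('http://example.com/people/john', 'http://example.com/nouns/employer', 'http://example.com/orgs/hillview'),
  array('http://example.com/orgs/hillview', 'http://example.com/nouns/name', 'Hill View'),
);

// Hypothetical helper: NULL in any position matches anything.
function findTriples($triples, $subject, $predicate, $object) {
  $matches = array();
  foreach ($triples as $triple) {
    if (($subject === NULL || $triple[0] === $subject) &&
        ($predicate === NULL || $triple[1] === $predicate) &&
        ($object === NULL || $triple[2] === $object)) {
      $matches[] = $triple;
    }
  }
  return $matches;
}

// Everything said about John, then just his age.
$aboutJohn = findTriples($triples, 'http://example.com/people/john', NULL, NULL);
$ages = findTriples($triples, 'http://example.com/people/john', 'http://example.com/nouns/age', NULL);

echo count($aboutJohn), " triples about John\n";
echo "Age: ", $ages[0][2], "\n";
```

Like find, the helper returns a collection of whole triples rather than bare values, so the caller decides which part of each match it wants.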

Of course, most people only have one name and those that have more normally use one at a time. We'll take advantage of this to simplify things a little. We'll create a function that gets the first statement that matches a pattern, ignoring the rest, and use that to write another function that gets the name of a person. For our purposes this will work well, but in general you need to think about resources possessing multiple properties of the same type.

function getFirstMatchingStatement($subjectResource, $predicateResource, $objectResource, $model) {
  $statement = NULL;

  $matches = $model->find($subjectResource, $predicateResource, $objectResource);
  $statementIterator = $matches->getStatementIterator();

  if ($statementIterator->hasNext()) {
    $statement = $statementIterator->next();
  }

  return $statement;
}

function getNameForPerson($personResource, $model) {
  $foafNameResource = new Resource("http://xmlns.com/foaf/0.1/", "name");
  $name = '';

  $statement = getFirstMatchingStatement($personResource, $foafNameResource, NULL, $model);

  if ($statement) {
    $name = $statement->getLabelObject();
  }

  return $name;
}

We can now refactor our original code to get the name very simply:

$personResource = new Resource("http://internetalchemy.org/iand/foaf.rdf#ian");
echo "<p>http://internetalchemy.org/iand/foaf.rdf#ian has the following name:</p>";
echo "<p>";
echo getNameForPerson($personResource, $model);
echo "</p>";
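The caveat about multiple properties of the same type is worth a quick illustration. The sketch below is plain PHP rather than RAP, the getAllObjects helper is hypothetical, and the second nick value is made up for the example; it simply collects every matching value instead of stopping at the first.

```php
<?php
$ian = 'http://internetalchemy.org/iand/foaf.rdf#ian';
$foafNick = 'http://xmlns.com/foaf/0.1/nick';
$foafName = 'http://xmlns.com/foaf/0.1/name';

// Illustrative data: the second nick is invented for this example.
$triples = array(
  array($ian, $foafNick, 'iand'),
  array($ian, $foafNick, 'ian'),
  array($ian, $foafName, 'Ian Davis'),
);

// Hypothetical helper: every object for a given subject and predicate.
function getAllObjects($triples, $subject, $predicate) {
  $values = array();
  foreach ($triples as $triple) {
    if ($triple[0] === $subject && $triple[1] === $predicate) {
      $values[] = $triple[2];
    }
  }
  return $values;
}

$nicks = getAllObjects($triples, $ian, $foafNick);
echo implode(', ', $nicks), "\n";
```

A first-match function would report only one of these nicks, and which one would depend on the order the triples happened to come back in.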

Listing the People in a FOAF File

Now that we can get the name for one person, why not get the names for all the people listed in the FOAF file? The steps we need to take are:

  1. Query the model to find all resources of type Person.
  2. For each person:
    1. Call getNameForPerson.

The first step obviously translates into a call to find. This time we're looking for the subject of the triples because we want all things that are of type Person. This is similar to the last triple we saw previously. The triple pattern we need to use looks like this:

Subject: ??
Predicate: http://www.w3.org/1999/02/22-rdf-syntax-ns#type
Object: http://xmlns.com/foaf/0.1/Person

The second step sounds like iteration again. Here's one way to write it:

$rdfTypeResource = new Resource("http://www.w3.org/1999/02/22-rdf-syntax-ns#", "type");
$foafPersonResource = new Resource("http://xmlns.com/foaf/0.1/", "Person");

$parser = new RdfParser();

$model = $parser->generateModel("http://internetalchemy.org/iand/foaf.rdf");

$people = $model->find(NULL, $rdfTypeResource, $foafPersonResource);

$personIterator = $people->getStatementIterator();

while ($personIterator->hasNext()) {
  $personStatement = $personIterator->next();
  echo "<p>Name: " . getNameForPerson($personStatement->getSubject(), $model) . "</p>";
}

Listing the Relationships Between People

Listing names is all very well, but FOAF is all about relationships. FOAF defines a property called knows (http://xmlns.com/foaf/0.1/knows) which defines a link between two people. FOAF is deliberately silent about the precise meaning of this link apart from saying that one person knows another. Knowing could mean anything from friend-once-told-me-about-them all the way up to sharer-of-my-most-intimate-secrets. Other RDF schemas, such as Eric Vitiello Jr.'s relationship schema can be used to expand on the meaning of knows.

The algorithm is just an extension of the previous one:

  1. Query the model to find all resources of type Person.
  2. For each person:
    1. Call getNameForPerson.
    2. Query the model to find triples with a subject of the person, and predicate of knows.
    3. For each of these triples:
      1. The object of the triple identifies a resource of type Person.
      2. Call getNameForPerson using the object of the triple

Now we have the algorithm it should be easy to apply our pattern matching and iterate techniques to translate it into real code.

The first pattern is the same as before, but the second is new:

Subject: ?Person
Predicate: http://xmlns.com/foaf/0.1/knows
Object: ??

Here I'm using ?Person to denote the resource that's just been found using the first pattern.

Here's the code that does the job:

$rdfTypeResource = new Resource("http://www.w3.org/1999/02/22-rdf-syntax-ns#", "type");
$foafPersonResource = new Resource("http://xmlns.com/foaf/0.1/", "Person");
$foafKnowsResource = new Resource("http://xmlns.com/foaf/0.1/", "knows");

$parser = new RdfParser();
$model = $parser->generateModel("http://internetalchemy.org/iand/foaf.rdf");
  
$people = $model->find(NULL, $rdfTypeResource, $foafPersonResource);

$personIterator = $people->getStatementIterator();
while ($personIterator->hasNext()) {
  $personStatement = $personIterator->next();
  $personName = getNameForPerson($personStatement->getSubject(), $model);

  if ($personName) {
    $personKnows = $model->find($personStatement->getSubject(), $foafKnowsResource, NULL);
    $knowsIterator = $personKnows->getStatementIterator();
    while ($knowsIterator->hasNext()) {
      $knowsStatement = $knowsIterator->next();
      $knownPersonName = getNameForPerson($knowsStatement->getObject(), $model);
      if ($knownPersonName) {
        echo "<p>" . $personName . " knows " . $knownPersonName . "</p>";
      }
    }
  }
}

I've put some extra checks to make sure that each person actually has a name defined. You can see from the code and the algorithm that we're doing some nested iteration. This is characteristic of the way the triple data model works and arises because we're chaining the triples together; the object of one triple becomes the subject of another. An RDF query language such as RQL or Versa would perform all this iteration for you but as yet there aren't any available for PHP so, for now, you're on your own.
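The chaining described above, where the object of one triple becomes the subject of the next lookup, can be seen in isolation in this plain-PHP sketch. It does not use RAP: the array-based triples, the objectsOf helper and the '_:james' label (a placeholder for a person with no URI of their own) are all assumptions made for illustration.

```php
<?php
$foaf = 'http://xmlns.com/foaf/0.1/';
$ian = 'http://internetalchemy.org/iand/foaf.rdf#ian';

// Each triple is array(subject, predicate, object).
$triples = array(
  array($ian, $foaf . 'name', 'Ian Davis'),
  array($ian, $foaf . 'knows', '_:james'),
  array('_:james', $foaf . 'name', 'James Carlyle'),
);

// Hypothetical helper: all objects for a given subject and predicate.
function objectsOf($triples, $subject, $predicate) {
  $values = array();
  foreach ($triples as $triple) {
    if ($triple[0] === $subject && $triple[1] === $predicate) {
      $values[] = $triple[2];
    }
  }
  return $values;
}

$sentences = array();
foreach ($triples as $triple) {
  if ($triple[1] !== $foaf . 'knows') {
    continue;
  }
  // The subject of the knows triple leads to one name...
  $knowerNames = objectsOf($triples, $triple[0], $foaf . 'name');
  // ...and its object becomes the subject of the second lookup.
  $knownNames = objectsOf($triples, $triple[2], $foaf . 'name');
  if ($knowerNames && $knownNames) {
    $sentences[] = $knowerNames[0] . ' knows ' . $knownNames[0];
  }
}

echo implode("\n", $sentences), "\n";
```

Each extra hop in the graph adds another lookup inside the loop, which is exactly the work a query language would take off your hands.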

Listing the RSS Channels Published by a Person's Friends

Let's try something ambitious that involves parsing the FOAF files of the people an individual knows. FOAF provides a property made (http://xmlns.com/foaf/0.1/made) that states that a person made something, usually a document or an RSS channel. There's also a handy property defined by RDF Schema called seeAlso (http://www.w3.org/2000/01/rdf-schema#seeAlso) which is used to refer to another document, usually more RDF. In FOAF it's typically used when describing another person that you know and allows FOAF aggregators to traverse the FOAF network for information.

The final tool we'll need is another FOAF property called mbox_sha1sum (http://xmlns.com/foaf/0.1/mbox_sha1sum) which corresponds to a cryptographic hash of the person's email address. FOAF defines this in such a way that it uniquely identifies a person. (Each mbox_sha1sum can belong to only one person, but a person can have multiple mbox_sha1sums.) In RDF the same resource can be addressed with different URIs, so mbox_sha1sum can be used to reliably detect when two pieces of FOAF are talking about the same person.

If you refer back to the FOAF example at the start of this article you'll see a section of RDF describing someone I know that uses two of these properties, mbox_sha1sum and seeAlso (abbreviated here):

<foaf:Person>
  <foaf:name>James Carlyle</foaf:name>
  <foaf:mbox_sha1sum>9b7ac29183a3106b9ca8bc436d42f61ee284d147</foaf:mbox_sha1sum>
  <rdfs:seeAlso rdf:resource="http://www.takepart.com/about/foaf.rdf" />
</foaf:Person>
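For the curious, an mbox_sha1sum value like the one above can be produced with PHP's built-in sha1 function. The FOAF specification defines the property as the SHA1 hash of the mailbox URI, including the mailto: prefix. The function names and the example.com address below are hypothetical, chosen only for this sketch.

```php
<?php
// Hypothetical helper: SHA1 of the mailto: URI, as FOAF defines mbox_sha1sum.
function mboxSha1sum($email) {
  return sha1('mailto:' . $email);
}

// Hypothetical helper: two descriptions are about the same person
// when they carry the same (non-empty) mbox_sha1sum.
function sameMailbox($sha1sumA, $sha1sumB) {
  return $sha1sumA !== '' && strtolower($sha1sumA) === strtolower($sha1sumB);
}

$sum = mboxSha1sum('someone@example.com');
echo $sum, "\n";
echo sameMailbox($sum, strtoupper($sum)) ? "same person\n" : "different people\n";
```

Publishing the hash rather than the address lets two FOAF files agree on who a person is without exposing the address itself to harvesters.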

The goal of this final example is to fetch the FOAF files for all the people I know and use those to determine what RSS channels those people publish. (RSS is an RDF format for publishing news headlines). I can then subscribe to them in my newsreader and keep up to date with their news.

This is the algorithm we're going to use:

  1. Query the model to find all resources of type Person.
  2. For each person:
    1. Call getNameForPerson.
    2. Query the model to find triples with a subject of the person, and predicate of knows.
    3. For each of these triples:
      1. The object of the triple identifies a resource of type Person.
      2. Call getNameForPerson using the object of the triple.
      3. Get the mbox_sha1sum for the person.
      4. Get any seeAlso for the person.
      5. If the person has both mbox_sha1sum and seeAlso then:
        1. Fetch and parse the person's FOAF file.
        2. Query the model to find the resource that has a mbox_sha1sum with the required value. This is the known person.
        3. Query the model to find resources that are made by the known person.
        4. For each of these made resources:
          1. Check to see if its type is an RSS channel.

This is quite a bit more complicated than previous examples, but believe me it doesn't use any different techniques. It's still pattern matching and iteration. Let's look at the new patterns we need. The first is to get the mbox_sha1sum for a person. This is almost exactly the same as getting the name:

Subject: ?Person
Predicate: http://xmlns.com/foaf/0.1/mbox_sha1sum
Object: ??

In fact, it's so similar we can put it in a function like we did for getting a person's name:

function getMailboxSha1sumForPerson($personResource, $model) {
  $foafSha1sumResource = new Resource("http://xmlns.com/foaf/0.1/", "mbox_sha1sum");
  $mbox_sha1sum = '';

  $statement = getFirstMatchingStatement($personResource, $foafSha1sumResource, NULL, $model);

  if ($statement) {
    $mbox_sha1sum = $statement->getLabelObject();
  }

  return $mbox_sha1sum;
}

The next pattern match is almost identical:

Subject: ?Person
Predicate: http://www.w3.org/2000/01/rdf-schema#seeAlso
Object: ??

We can write an equivalent function. Note that in common with the other utility functions we're only retrieving the first seeAlso property for a person. This is a little bit lazy because quite often there are more seeAlsos that might be useful. However, we've already bitten off quite a lot in this project, so we'll leave it as it is for now.

function getSeeAlsoForPerson($personResource, $model) {
  $rdfsSeeAlsoResource = new Resource("http://www.w3.org/2000/01/rdf-schema#", "seeAlso");
  $seeAlsoUri = '';

  $statement = getFirstMatchingStatement($personResource, $rdfsSeeAlsoResource, NULL, $model);

  if ($statement) {
    $seeAlsoUri = $statement->getLabelObject();
  }

  return $seeAlsoUri;
}

The next pattern we need is one to find the resource that owns the mbox_sha1sum:

Subject: ??
Predicate: http://xmlns.com/foaf/0.1/mbox_sha1sum
Object: "?value"

This pattern is slightly different from the others we have used. The object in this case isn't a resource; it's a piece of text, or literal in RDF terminology. This means we have to create a literal object and pass it to the find method:

function getPersonBySha1sum($sha1sumValue, $model) {
  $foafPersonResource = new Resource("http://xmlns.com/foaf/0.1/", "Person");
  $foafSha1sumResource = new Resource("http://xmlns.com/foaf/0.1/", "mbox_sha1sum");
  $personResource = NULL;

  $statement = getFirstMatchingStatement(NULL, $foafSha1sumResource, new Literal($sha1sumValue), $model);

  if ($statement) {
    $personResource = $statement->getSubject();
  }
  
  return $personResource;
}

The next pattern finds all the things made by a person:

Subject: ?Person
Predicate: http://xmlns.com/foaf/0.1/made
Object: ??

To see if this thing is an RSS channel we have to use an RDF property called type (http://www.w3.org/1999/02/22-rdf-syntax-ns#type). We just need to look in the model to see if there is a triple like the following. If there isn't then we ignore the thing we're checking.

Subject: ?madeThing
Predicate: http://www.w3.org/1999/02/22-rdf-syntax-ns#type
Object: http://purl.org/rss/1.0/channel

We can put these two patterns together with our mbox_sha1sum code into a new function that builds a list of all the RSS channels made by a person with a particular mbox_sha1sum given the URI of their FOAF file:

function getChannelsMadeByPerson($sha1sum, $foafuri) {
  $rdfTypeResource = new Resource("http://www.w3.org/1999/02/22-rdf-syntax-ns#", "type");
  $foafMadeResource = new Resource("http://xmlns.com/foaf/0.1/", "made");
  $rssChannelResource = new Resource("http://purl.org/rss/1.0/", "channel");

  $channels = array();

  $parser = new RdfParser();

  $model = $parser->generateModel($foafuri);

  $personResource = getPersonBySha1sum($sha1sum, $model);
  if (!$personResource) {
    // Nobody in this file has that mbox_sha1sum; a NULL subject would match everything
    return $channels;
  }

  $madeThings = $model->find($personResource, $foafMadeResource, NULL);
  $madeIterator = $madeThings->getStatementIterator();
  while ($madeIterator->hasNext()) {
    $madeStatement = $madeIterator->next();
    // Got something the person made
    // Is it rss?
    $statement = getFirstMatchingStatement($madeStatement->getObject(), $rdfTypeResource, $rssChannelResource, $model);
    if ($statement) {
      // add it to our list
      $channels[] = $madeStatement->getLabelObject();
    }
  }

  return $channels;
}

We now have all the components to write the PHP that implements our algorithm:

$rdfTypeResource = new Resource("http://www.w3.org/1999/02/22-rdf-syntax-ns#", "type");
$foafPersonResource = new Resource("http://xmlns.com/foaf/0.1/", "Person");
$foafKnowsResource = new Resource("http://xmlns.com/foaf/0.1/", "knows");
$rdfsSeeAlsoResource = new Resource("http://www.w3.org/2000/01/rdf-schema#", "seeAlso");

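$parser = new RdfParser();
$model = $parser->generateModel("http://internetalchemy.org/iand/foaf.rdf");

$people = $model->find(NULL, $rdfTypeResource, $foafPersonResource);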

$personIterator = $people->getStatementIterator();
while ($personIterator->hasNext()) {
  $personStatement = $personIterator->next();
  $personName = getNameForPerson($personStatement->getSubject(), $model);
  if ($personName) {

    $personKnows = $model->find($personStatement->getSubject(), $foafKnowsResource, NULL);

    $knowsIterator = $personKnows->getStatementIterator();
    while ($knowsIterator->hasNext()) {
      $knowsStatement = $knowsIterator->next();

      $knownPerson = $knowsStatement->getObject();
      $seeAlsoUri = getSeeAlsoForPerson($knownPerson, $model);
      $knownPersonName = getNameForPerson($knownPerson, $model);
      $knownPersonSha1sum = getMailboxSha1sumForPerson($knownPerson, $model);

      if ($seeAlsoUri && $knownPersonName && $knownPersonSha1sum) {
        $channels = getChannelsMadeByPerson($knownPersonSha1sum, $seeAlsoUri);
        if (count($channels) > 0) {
          echo '<p>' . $knownPersonName . ':</p>';
          echo '<ul>';
          foreach ($channels as $channel) {
            echo '<li>' . $channel . '</li>';
          }
          echo '</ul>';
        } 
      }
    }
  }
}

I've made a version of this code that produces nicer HTML output available as a live service. Click on the Show Source Code link to examine all the code in this article plus the enhancements.

Summary

This article set out to show you how PHP can be used to explore FOAF social networks. Two simple idioms were explored: pattern matching and iteration. Together, these two techniques are at the heart of any application that involves parsing RDF. Hopefully you're now in a position to start exploring the world of FOAF and, by extension, RDF.

Finding out more

The definitive source for information about RDF and the place to find the formal specifications is the W3C's RDF web site.

Libby Miller has written a good introduction to RDF Query aimed at librarians, but anyone interested in learning about more advanced querying techniques will find it useful. There is also a more advanced survey of RDF query languages which is comprehensive, but not for the beginner.

Create your first FOAF file with Leigh Dodds’ FOAF-a-matic or find someone else's using our RDF source catalogue.

About the Author

Ian Davis is a British developer, based in central England. He is a co-founder and contributor to Semantic Planet, a semantic web advocacy website, which can be found at http://www.semanticplanet.com. Ian's weblog is at http://InternetAlchemy.org.

Copyright

This article is copyright Ian Davis 2003. Permission is granted to reproduce this document in its entirety so long as this copyright message is preserved and a link to the original article is provided.
