JSON-LD parser does only find the first item #16

Open
jkphl opened this issue Mar 24, 2017 · 5 comments

@jkphl
Owner

jkphl commented Mar 24, 2017

On 20.03.2017 at 13:59, Claas Kalwa wrote:

Hello Joschi,

I am having trouble extracting multiple JSON-LD items with the
Micrometa V1 parser. It only recognizes the first item, regardless of
whether the items are grouped with @graph or appear separately in their
own script elements.

I have attached an example that I think should actually work.

Do you have any idea where the problem might be?

Example source:

<!DOCTYPE html>

<html>
    <head>
        <title>TODO supply a title</title>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">

	<script type="application/ld+json">
	{
	 "@context": "http://schema.org",
	 "@graph": [
	{
	  "name": "Google Inc.",
	  "@type": "LocalBusiness",
	  "address": {
	    "@type": "PostalAddress",
	    "addressCountry": "United States",
	    "streetAddress": "1600 Amphitheatre Parkway",
	    "addressLocality": "Mountain View",
	    "addressRegion": "CA",
	    "postOfficeBoxNumber": null,
	    "postalCode": "94043",
	    "telephone": "+1 650-253-0000",
	    "faxNumber": "+1 650-253-0001"
	  }
	},
	{
	  "name": "Google Ann Arbor",
	  "@type": "LocalBusiness",
	  "address": {
	    "@type": "PostalAddress",
	    "addressCountry": "United States",
	    "streetAddress": "201 S. Division St. Suite 500",
	    "addressLocality": "Ann Arbor",
	    "addressRegion": "MI",
	    "postOfficeBoxNumber": null,
	    "postalCode": "48104",
	    "telephone": "+1 734-332-6500",
	    "faxNumber": "+1 734-332-6501"
	  }
	}
	 ]
	}
	</script>

    </head>
    <body>
        <div>TODO write content</div>
        
    </body>
</html>
@jkphl jkphl closed this as completed in af5eb6a May 13, 2017
@rvanlaak
Collaborator

rvanlaak commented Oct 31, 2019

The commit that closed this issue does not fully fix it. The JSON-LD implementation still does not find multiple items when the value of @graph has more than one root item (read: is an array).

Why? Because \Jkphl\Micrometa\Infrastructure\Parser\JsonLD::parseRootNode returns only the first node it finds. This is probably the specific framing implementation that the class docblock mentions (?)

Have you ever thought about adding some sort of "filter" option, so users can provide the type from which building the graph should start? That way, returning only one node would still be possible.

I will try to write a test that demonstrates that only the graph of the first node gets returned.

{
  "@context": "http://schema.org",
  "@graph": [
    {
      "@type": "Article",
      "@id": "/articles/foobar",
      "comment": [
        {"@id": "/articles/foobar#comment-1"},
        {"@id": "/articles/foobar#comment-2"}
      ]
    },
    {
      "@type": "Comment",
      "@id": "/articles/foobar#comment-1"
    },
    {
      "@type": "Comment",
      "@id": "/articles/foobar#comment-2"
    }
  ]
}
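
To make the expected behavior concrete, here is a minimal sketch in plain PHP. It does not use Micrometa or ML\JsonLD at all; the function name collectGraphRoots is hypothetical and only illustrates "return every root item of @graph, not just the first", using the document above as input.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Micrometa): return every top-level
 * node of a JSON-LD document instead of only the first one.
 *
 * @return object[] One entry per root item
 */
function collectGraphRoots(string $json): array
{
    $root = json_decode($json, false, 512, JSON_THROW_ON_ERROR);

    // A top-level JSON array is already a list of nodes
    if (is_array($root)) {
        return $root;
    }
    if (!is_object($root)) {
        return [];
    }
    // No @graph: the document itself is the single root node
    if (!isset($root->{'@graph'})) {
        return [$root];
    }

    $graph = $root->{'@graph'};

    // @graph may hold one object or an array of objects
    return is_array($graph) ? $graph : [$graph];
}

$json = <<<'JSON'
{
  "@context": "http://schema.org",
  "@graph": [
    {
      "@type": "Article",
      "@id": "/articles/foobar",
      "comment": [
        {"@id": "/articles/foobar#comment-1"},
        {"@id": "/articles/foobar#comment-2"}
      ]
    },
    {"@type": "Comment", "@id": "/articles/foobar#comment-1"},
    {"@type": "Comment", "@id": "/articles/foobar#comment-2"}
  ]
}
JSON;

// Print the type and id of each root item
foreach (collectGraphRoots($json) as $node) {
    echo $node->{'@type'}, ' ', $node->{'@id'}, PHP_EOL;
}
```

This only walks the decoded JSON; it does not expand the context or resolve the @id references between the Article and its Comments, which is what the library's parser would still need to do for each returned node.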

@jkphl
Owner Author

jkphl commented Nov 1, 2019

@rvanlaak Re-opening ... looking forward to any constructive suggestion! 👍

@jkphl jkphl reopened this Nov 1, 2019
@rvanlaak
Collaborator

rvanlaak commented Nov 6, 2019

For now, we have added a custom JSON-LD parser that decorates the library's parser to support named graphs.

Our domain depends on filtering on @type, so that is embedded in the parser, because the constructor on ParserInterface does not allow us to inject it nicely.

When $jsonLDRoot does not match that shape (read: does not have both @graph and @context), the regular JsonLD behavior is used.

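// Assumed imports: the original snippet does not show them, so these
// namespaces are a best guess. FilterTypes is a project-local class.
use Jkphl\Micrometa\Infrastructure\Parser\JsonLD;
use ML\JsonLD\Exception\JsonLdException;
use ML\JsonLD\GraphInterface;
use ML\JsonLD\JsonLD as JsonLDParser;

class JsonLDFilteredParser extends JsonLD
{
    public const FORMAT = 32;

    protected function parseRootNode($jsonLDRoot)
    {
        // Only handle documents that have both @graph and @context;
        // everything else falls back to the library's behavior
        if (!isset($jsonLDRoot->{'@graph'}, $jsonLDRoot->{'@context'})) {
            return parent::parseRootNode($jsonLDRoot);
        }

        try {
            $jsonLDDocument = JsonLDParser::getDocument($jsonLDRoot, ['documentLoader' => $this->contextLoader]);

            /** @var GraphInterface $graph */
            $graph = $jsonLDDocument->getGraph();

            // Walk the types in priority order and parse the first type
            // that has exactly one matching node in the graph
            foreach (FilterTypes::types as $type) {
                $nodes = $graph->getNodesByType('http://schema.org/'.$type);

                if (1 === \count($nodes)) {
                    $node = current($nodes);

                    return $this->parseNode($node);
                }
            }
        } catch (JsonLdException $exception) {
            $this->logger->error($exception->getMessage(), ['exception' => $exception]);
        }

        // No type matched exactly once, or the document failed to load
        return null;
    }
}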

@Sarke
Contributor

Sarke commented Nov 22, 2019

Same problem, here's an example: https://www.macobserver.com/news/apple-changes-testing-ios-14/

@rvanlaak Where is the FilterTypes class from in your example? I'm inferring that JsonLDParser is ML\JsonLD\JsonLD.

@rvanlaak
Collaborator

FilterTypes::types is one of our local constants; it is just an array, ordered by which node type we want to find first.
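
As a rough sketch of what such a prioritized list could look like: the type names and their order below are made up for illustration, and pickRootType is a hypothetical stand-in for the foreach loop in the parser above, working on plain per-type node counts instead of a real graph.

```php
<?php
declare(strict_types=1);

final class FilterTypes
{
    // Hypothetical priority order: earlier entries win over later ones
    public const types = ['Article', 'LocalBusiness', 'Comment'];
}

/**
 * Hypothetical helper: pick the first prioritized type that occurs
 * exactly once, mirroring the `1 === count($nodes)` check.
 *
 * @param array<string, int> $countsByType Number of nodes per schema.org type
 */
function pickRootType(array $countsByType): ?string
{
    foreach (FilterTypes::types as $type) {
        if (1 === ($countsByType[$type] ?? 0)) {
            return $type;
        }
    }

    // No type occurs exactly once
    return null;
}

// One Article and two Comments, as in the @graph example earlier in the thread
var_dump(pickRootType(['Article' => 1, 'Comment' => 2]));
```

With the exactly-once rule, a graph that contains several nodes of the preferred type skips that type and moves on to the next entry in the list.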
