RDFa in HTML parsing broken #658

Open
timbl opened this issue Jan 26, 2025 · 1 comment

timbl commented Jan 26, 2025

The code seems to assume an RDFaProcessor.trim function, which I guess used to exist but was deprecated... (who deprecates things?)

Solution in rdfaparser: switch it for the native string trim, like foo.node.value.trim()
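
A minimal sketch of that swap, not the actual rdflib.js code. `trimmedNodeValue` is a hypothetical helper and the plain object stands in for a DOM attribute node; the only real API used is the built-in `String.prototype.trim`:

```javascript
// Hypothetical helper, not part of rdflib.js: trims a node's value with the
// built-in String.prototype.trim instead of a static RDFaProcessor.trim call.
function trimmedNodeValue(node) {
  // Guard against a missing node or a null/undefined value.
  if (!node || node.value == null) {
    return '';
  }
  return String(node.value).trim();
}

// Plain object standing in for a DOM attribute node (assumption for the demo).
const attr = { value: '  dct:title \n' };
console.log(trimmedNodeValue(attr)); // prints "dct:title"
```

The guard is only there because a bare `node.value.trim()` throws on a missing attribute; whether the parser needs it at the failing call site is not something this thread establishes.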

Failed 200: Fetch of <file:///Users/timbl_1/Content/DesignIssues/Overview.html> failed: Error trying to parse <file:///Users/timbl_1/Content/DesignIssues/Overview.html> as RDFa:
TypeError: RDFaProcessor.trim is not a function:
TypeError: RDFaProcessor.trim is not a function
    at RDFaProcessor.process (/usr/local/lib/node_modules/rabel/node_modules/rdflib/lib/rdfaparser.js:379:37)

csarven commented Jan 26, 2025

You seem to have introduced static trim here:

c74e911

https://github.com/linkeddata/rdflib.js/blob/main/src/rdfaparser.js#L910

Perhaps that needs a revisit?


Aside: The RDFa in https://www.w3.org/DesignIssues/Overview.html seems fine to me. It also seems fine with rdfa-streaming-parser.js and http://rdf.greggkellogg.net/distiller?command=serialize&url=https:%2F%2Fwww.w3.org%2FDesignIssues%2FOverview.html&raw


Aside: While RDFa processors still support xmlns for prefix mappings from RDFa 1.0, it is deprecated in RDFa 1.1. Consider changing:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:dct=
"http://purl.org/dc/terms/" xmlns:sioc="http://rdfs.org/sioc/ns#"
xmlns:foaf="http://xmlns.com/foaf/0.1/">
...
<body xml:lang="en" bgcolor="#FFFFFF" lang="en" text="#000000">

to:

<!DOCTYPE html>
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
...
<body prefix="dct: http://purl.org/dc/terms/ sioc: http://rdfs.org/sioc/ns# foaf: http://xmlns.com/foaf/0.1/">

(Move lang and xml:lang to <html>. bgcolor and text are not applied, so remove them from <body>)
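
For illustration, the xmlns-to-prefix rewrite above can be sketched as a small function. `xmlnsToPrefix` is a hypothetical name, and the input is a plain object of attribute names and values rather than a real DOM element:

```javascript
// Hypothetical helper: builds an RDFa 1.1 prefix attribute value from
// xmlns:* declarations given as a plain { attributeName: value } object.
function xmlnsToPrefix(attributes) {
  return Object.entries(attributes)
    // Keep only prefixed declarations; the default xmlns has no prefix.
    .filter(([name]) => name.startsWith('xmlns:'))
    // Turn "xmlns:dct" + IRI into "dct: IRI".
    .map(([name, iri]) => `${name.slice('xmlns:'.length)}: ${iri}`)
    .join(' ');
}

// Attributes taken from the <html> element quoted above.
const htmlAttributes = {
  'xmlns': 'http://www.w3.org/1999/xhtml',
  'xmlns:dct': 'http://purl.org/dc/terms/',
  'xmlns:sioc': 'http://rdfs.org/sioc/ns#',
  'xmlns:foaf': 'http://xmlns.com/foaf/0.1/'
};

console.log(xmlnsToPrefix(htmlAttributes));
// prints "dct: http://purl.org/dc/terms/ sioc: http://rdfs.org/sioc/ns# foaf: http://xmlns.com/foaf/0.1/"
```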
