Welcome to NxParser

NxParser is an open-source, streaming, non-validating Java parser for the Nx format, where x = Triples, Quads, or any other number. For more details, see the specification for the NQuads format, an extension of the N-Triples RDF format. Note that the parser handles any combination or number of N-Triples syntax terms on each line (the number of terms per line can also vary).
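
To make "any number of terms per line" concrete, here is a minimal sketch of a non-validating, line-at-a-time tokenizer. It is an illustration only, not NxParser's actual code or API: the class name `NxLineSketch` and the method `parseLine` are made up for this example, and the sketch assumes terms are separated by spaces and that each statement ends with a space followed by a full-stop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; NOT NxParser's implementation or API. */
final class NxLineSketch {

    /** Splits one Nx line into its terms without validating them. */
    static List<String> parseLine(String line) {
        List<String> terms = new ArrayList<>();
        int n = line.length();
        int i = 0;
        while (i < n) {
            char c = line.charAt(i);
            if (c == ' ') {              // skip the space delimiter
                i++;
                continue;
            }
            int start = i;
            if (c == '<') {
                // IRI: runs up to the closing '>' (or end of line if missing)
                int end = line.indexOf('>', i);
                i = (end < 0 ? n : end + 1);
            } else if (c == '"') {
                // Literal: find the closing quote, skipping escaped characters
                i++;
                while (i < n && line.charAt(i) != '"') {
                    if (line.charAt(i) == '\\') i++;
                    i++;
                }
                i++;                     // step past the closing quote
                // Optional @lang or ^^<datatype> suffix runs to the next space
                while (i < n && line.charAt(i) != ' ') i++;
            } else {
                // Blank node (_:label) or the terminating full-stop
                while (i < n && line.charAt(i) != ' ') i++;
            }
            String token = line.substring(start, Math.min(i, n));
            if (token.equals(".")) break; // end of statement
            terms.add(token);
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data: one triple and one quad in the same input.
        String data =
            "<http://example.org/s> <http://example.org/p> \"o\" .\n"
          + "<http://example.org/s> <http://example.org/p> _:b1 <http://example.org/g> .\n";
        try (BufferedReader in = new BufferedReader(new StringReader(data))) {
            String line;
            while ((line = in.readLine()) != null) { // one line in memory at a time
                List<String> terms = parseLine(line);
                System.out.println(terms.size() + " terms: " + terms);
            }
        }
    }
}
```

Because each line is tokenized independently, nothing in the sketch fixes the number of terms: triples, quads, and longer tuples can be mixed freely in the same input.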

It ate 2 million quads (~4 GB; ~240 MB GZIPped) on a T60p (Win7, 2.16 GHz) in ~1 min 35 s (1:18 min). Overall, it is more than twice as fast as the previous version at reading Nx.

It comes in two versions: lite and not-so-lite. The latter is provided "as-is" and comes with a whole bunch of stuff you probably won't need, though it includes code for batch sorting large files and various other utilities that some may find useful. If you just want to parse Nx into memory, go for the lite version.

The NxParser is non-validating, meaning that, e.g., it will happily eat non-conformant N-Triples. Conversely, the NxParser will not parse certain valid N-Triples files where (i) terms are delimited with tabs rather than spaces; or (ii) the final full-stop is not preceded by a space. See here for more info.
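
If your input comes from another tool, it may be worth checking for these two cases before parsing. The following is a rough sketch of such a pre-check, not part of NxParser: the class `NxPrecheckSketch`, the enum `Issue`, and the method `check` are hypothetical names introduced here, and the check is deliberately simplistic (it only looks for a raw tab outside a quoted literal and for a trailing full-stop with no space before it).

```java
/** Illustrative pre-check only; NOT part of NxParser. */
final class NxPrecheckSketch {

    enum Issue { NONE, TAB_DELIMITER, NO_SPACE_BEFORE_FULL_STOP }

    /** Flags the two valid-N-Triples cases described above for a single line. */
    static Issue check(String line) {
        boolean inLiteral = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inLiteral) {
                if (c == '\\') {
                    i++;                 // skip the escaped character
                } else if (c == '"') {
                    inLiteral = false;   // closing quote
                }
            } else if (c == '"') {
                inLiteral = true;        // opening quote
            } else if (c == '\t') {
                return Issue.TAB_DELIMITER; // raw tab outside a literal
            }
        }
        String trimmed = line.trim();
        if (trimmed.length() > 1
                && trimmed.endsWith(".")
                && trimmed.charAt(trimmed.length() - 2) != ' ') {
            return Issue.NO_SPACE_BEFORE_FULL_STOP;
        }
        return Issue.NONE;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, one per case.
        String[] samples = {
            "<http://example.org/s> <http://example.org/p> <http://example.org/o> .",
            "<http://example.org/s>\t<http://example.org/p>\t<http://example.org/o> .",
            "<http://example.org/s> <http://example.org/p> <http://example.org/o>."
        };
        for (String s : samples) {
            System.out.println(check(s) + " <- " + s);
        }
    }
}
```

Lines flagged this way could be normalised (tabs replaced with spaces, a space inserted before the final full-stop) before being handed to the parser; whether that is appropriate depends on your data.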

For example code on how to use the classes see here.