PluginWhio
V8Convert users: please see V8Convert_Whio instead. That's a newer/improved incarnation of this API.

Achtung #1: these docs are for the current version in the subversion source tree (i document here as changes are made). If you are using an "older" version, i recommend searching through the subversion repo for a version of this page which is closer to the version you're working with (follow that link, and there are options to browse all versions).

Achtung #2: It has come to my attention that using this API to read and write non-ASCII data may (and eventually will) produce undefined results (due to how v8 internally stores and translates strings). Do not try to read/write binary data via the JS interface! You Have Been Warned.

The whio I/O plugin

The whio plugin adds object-oriented i/o support to v8 JavaScript applications.
About

whio gets its name from the i/o library it is based on: http://fossil.wanderinghorse.net/repos/whio/

That library is provided with this plugin, and need not be installed separately. This plugin provides the following features:
C++ usage

When using the plugins API, you don't need to do anything C++-side to use this addon - simply load it as described on the plugins page. When this plugin is registered it will create an object called whio inside the target object into which it is added (normally the global object). That whio object holds the rest of the classes and functions for this plugin.

The native whio API has huge amounts of API documentation which might be interesting to users of this API: whio_amalgamation.h

But all of the "less technical" info one needs for using it in JS is documented below. For the real guts (e.g. "what are the behavioural differences for FILE- and memory-based i/o devices?") see the API documentation.

JS usage

load_plugin('v8-juice-whio');
var dev = new whio.IODevice("/path/to/file", true); // true= Write-mode
dev.truncate(0);
dev.write("Hi, world!");
...
dev.close();

See below for the complete class and function list.

TODOs
Error reporting

The i/o classes are based on an abstract C API. That API uses numeric error codes or null values to indicate errors, and this library normally passes errors from the C API directly back to the caller. The value zero is normally the success value and non-zero is some device-dependent error value (this convention is derived from common C I/O interfaces).

Errors which happen at the JS level, as opposed to the C level, are reported via exceptions. All of the constructors will throw if they cannot open their underlying device or stream. For example, a call to read(20) which can only read 13 bytes will tell us so by returning the number 13, but calling read() with no arguments would trigger an exception because there is no sensible default value to pass on to the C API.

Where integer error codes are used, they have the following symbolic equivalents:
All of them have unspecified values which are unique within that list. The only entries with predefined values are OK, which is always 0, and SizeTError, which is always equal to -1. Interestingly enough, the exact native value of that -1 may depend on the compile-time size of the native whio_size_t type.

Errors which happen outside the scope of the underlying i/o library, but within the realm of JS-specific usage, are reported by throwing exceptions. For example, when passing invalid arguments to a constructor, an exception is the only way to report the error.

JS API for the whio classes

The whio Object

The whio object contains all classes and other symbols in this API. The non-class symbols include the error codes listed above and:
IOBase Class

This class is abstract - it cannot be instantiated directly. However, it does specify several functions of the i/o interface used by its subclasses, and it can be used with instanceof to ensure that a given object has the most basic i/o operations. The API for this class includes any routines which are common amongst the InStream, OutStream, and IODevice classes, but those classes sometimes provide their own implementations for these functions.
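
A minimal sketch of the instanceof check mentioned above. IOBase and OutStream here are plain-JavaScript stand-ins (assumptions, so that the snippet runs without the plugin); with the plugin loaded one would presumably test against whio.IOBase instead. The helper name requireIO is made up for this illustration.

```javascript
// Stand-ins for whio.IOBase and whio.OutStream (assumptions, not the plugin's classes).
function IOBase() {}
function OutStream() { this.written = []; }
OutStream.prototype = Object.create(IOBase.prototype);
OutStream.prototype.constructor = OutStream;
OutStream.prototype.write = function (text) {
    this.written.push(text);
    return text.length;
};

// Hypothetical guard: reject anything which is not derived from the given base class.
function requireIO(obj, base) {
    if (!(obj instanceof base)) {
        throw new TypeError("Expected an object derived from IOBase");
    }
    return obj;
}

var out = requireIO(new OutStream(), IOBase); // passes the check
out.write("hi");

var rejected = false;
try {
    // A duck-typed look-alike has a write() function but is not an IOBase.
    requireIO({ write: function () {} }, IOBase);
} catch (e) {
    rejected = e instanceof TypeError;
}
```

The design choice here is to check the class rather than the presence of individual functions, which is what the instanceof support is described as being for.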
InStream Class

InStream represents a read-only stream. In addition to the interface defined by IOBase, it has:

OutStream Class

OutStream represents a write-only stream. In addition to the interface defined by IOBase, it has:

IODevice Class

IODevice represents a random-access i/o device, with read-only or read/write access. When we say "device", we're normally referring to a file, but the concept of i/o device is more abstract than that, and can essentially mean any data source or destination to which we have random access (the ability to access any point in the data at any time). In addition to the interface defined by IOBase, it has:

SEEK_SET and friends

whio.IODevice.seek() is semantically identical to the C function fseek(), and requires a second argument with one of these values:
They are analogous to the C-standard constants of the same names, and are described in the man pages for fseek().

Subdevices

The IODevice class has a constructor with the signature new IODevice(IODevice dev, int lower, int upper) which is used to create "subdevices". A subdevice is a proxy which wraps up a certain range of bytes inside of another IODevice object. This is covered in gross detail in the source file whio_amalgamation.h (search for "subdev_create"); a short explanation and demonstration follow.

Here's one explanation of what subdevices are for: Consider an IODevice pointing to 1000 bytes of storage (somewhere - we don't care where). We can create a subdevice which fences off some amount of that storage. The subdevice is a full-fledged i/o device, but it cannot read or write outside of the bounds it is assigned to. For example:

var origin = new whio.IODevice(":memory:");
origin.write("012345679"); // 9 bytes long
var subdev = new whio.IODevice(origin, 4, 7 ); // fence bytes [4,7)
subdev.write("abcdefghij"); // will only write first 3 bytes
origin.rewind();
print(origin.read(10));
subdev.close();
origin.close();

The output of that script is:

0123abc79

Note that the upper bound is equivalent to EOF, or "one past the end", and is not accessible to the subdevice.

Subdevices have some obscure uses, but probably not many which JS applications can make use of. They were originally designed to partition off access to various internal areas of an embedded filesystem library.

Subdevices get an extra function compared to other IODevices:
And a new property:
Embedded/Virtual Filesystems via whio

This plugin comes with an optional component, whefs, which provides a virtual/embedded filesystem for JavaScript applications. The default build process includes whefs, but it can be disabled for platforms where it won't compile (it is untested on anything other than Unix/Posix platforms). whefs is described on its own page.

Tips and tricks

stdin and stdout

On Unix platforms you can open standard input and output like this:

var stdout = new whio.OutStream("/dev/stdout");
var stdin = new whio.InStream("/dev/stdin");
stdout.write("hi, world!\n");

Note, however, that you shouldn't mix such usage with, e.g., print() or other routines which read/write stdin/stdout, as that may garble your output (depending on the buffering used by the underlying interface, e.g. std::cout vs. printf()). Calling close() on such devices will not close the stdin/stdout associated with the application.

Use in-memory IODevices as buffers

You don't need files to use IODevices:

var dev = new whio.IODevice(":memory:");

That creates an i/o device which uses RAM as storage, growing as necessary. It works just like a file-based i/o device, and can be used to buffer arbitrary data in memory. To specify a starting size of the device, pass it as the optional second parameter. Multiple in-memory devices can be created, and each has its own memory despite having the same name of ":memory:".

There is a caveat, however: the memory used for the buffer is not reported to the v8 engine, which means that it can grow larger than any limit which has been imposed on v8. Adding this awareness would impose a significant performance overhead on all write operations, due to the generic nature of the underlying i/o device API (the JS wrapper doesn't actually know, after construction, that it's an in-memory device, and we'd have to query it on each write or truncate operation in order to figure out if it is).

The memory associated with an in-memory device can be freed by calling the truncate() member (which might or might not actually free any memory, depending on the magnitude of the change in size, but this implementation explicitly frees the memory when truncate(0) is called).

While it may not be immediately obvious, this is one easy way to buffer large amounts of text for later output, and should be much more efficient than using string concatenation to build up very large strings. (See the big ugly warning at the top of this page regarding writing binary data this way!)
If you want to ensure sequential, write-only access to a buffer until you are ready to send it, simply wrap up the IODevice in an OutStream using new OutStream(myDevice).

Using gzip with streams

The InStream class has minimal support for gzip compression and decompression. It does not support incremental de/compression, but does support compressing a whole input stream at once. While the JS API does not allow us to safely pass binary data around, this support happens at a lower level, transferring bytes directly between the two native-level streams. Here's an example:

function tryGzip()
{
    var fname = "test.js";
    var outname = fname + ".gz";
    // Compress test.js into test.js.gz.
    var ist = new whio.InStream(fname);
    var ost = new whio.OutStream(outname);
    var rc = ist.gzipTo(ost);
    print("gzip rc =",rc,'outfile =', outname);
    ist.close();
    ost.close();
    if( whio.rc.OK != rc ) throw new Error("Gzip failed with code "+rc);
    // Decompress the result into test.js.gz.check for comparison.
    ist = new whio.InStream(outname);
    outname = outname + '.check';
    ost = new whio.OutStream(outname);
    rc = ist.gunzipTo(ost);
    print("gunzip rc =",rc,'outfile =', outname);
    ist.close();
    ost.close();
    if( whio.rc.OK != rc ) throw new Error("Gunzip failed with code "+rc);
}
tryGzip();

That might output:

gzip rc = 0 outfile = test.js.gz
gunzip rc = 0 outfile = test.js.gz.check

And the list of files it used or created:

stephan@jareth:~/cvs/v8-juice/trunk/src/lib/plugins/whio$ l test.js*
-rw-r--r-- 1 stephan stephan 5922 2009-03-21 14:16 test.js
-rw-r--r-- 1 stephan stephan 1860 2009-03-21 14:19 test.js.gz
-rw-r--r-- 1 stephan stephan 5922 2009-03-21 14:19 test.js.gz.check
stephan@jareth:~/cvs/v8-juice/trunk/src/lib/plugins/whio$ cmp test.js test.js.gz.check; echo $?
0

Use streams as device proxies

Many i/o algorithms only need sequential access to a stream, and thus accept InStream or OutStream arguments. If you have an IODevice you can easily wrap a stream around it by using new InStream(myDevice) or new OutStream(myDevice), as appropriate. When finished, the stream can be closed, leaving the device intact.

Note that all read/write operations performed on such a stream will use the device's current cursor position for their read/write. Thus, if the device is manipulated while streams are using it, the effects may be unpredictable (unless planned very carefully). Also be aware that an output stream might not actually write its data until it is flushed or closed, so do not rely on the contents of a wrapped i/o device to be accurate until the stream proxy has been closed or explicitly flushed.

Reimplementing print()

It is trivial to re-implement the print() function so that it writes to an output stream of your choice. As a simple example:

function print()
{
    print.stream.write( Array.prototype.slice.apply( arguments, [0] ).join(" ") + "\n" );
    print.stream.flush();
}
print.stream = new whio.OutStream('/dev/stdout'); // or whatever

Now any calls to print() will use the proxy output stream.

Don't forget to flush

The i/o API is ignorant of the underlying storage mechanism (if any) and may inherit certain behaviours which differ from implementation to implementation. The most notable case is flushing. To "flush" a device means "write any pending data," and is necessary because some device types buffer some amount of output to improve overall performance.

When writing to an in-memory buffer, the device's i/o happens at the exact moment it is requested. On the other hand, a write to a file may be cached somewhere (e.g. the C FILE object may do buffering of its own). Such a cache will not be written until flush() is called on the device (or the device is closed, in which case it is automatically flushed).

The case where this is most visible is when using an OutStream to write to stdout. When doing so, it may be necessary to call flush() on the device before any output is written to the console. If the output need not appear immediately (e.g. it's not intended for interactive use) then explicit flushing is not necessary.

It is never harmful to call flush() (except on a closed device, in which case the function will throw an exception), but over-flushing may lead to performance problems for some devices. If you need to be certain that a given bit of output is written to its final destination, call flush(). If you don't need to be certain (that is, you don't mind it internally buffering the data for a bit) then calling flush() is not necessary. The only use case i've personally experienced where it is necessary is when writing to stdout, so that the output becomes available immediately.
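
The buffering behaviour described above can be illustrated with a stand-in. BufferedOut below is a hypothetical plain-JavaScript class, not the plugin's OutStream; it only imitates a stream which holds its output until flush() or close() is called, and which throws if flushed after being closed. The sink array plays the role of the final destination (e.g. the console).

```javascript
// Hypothetical buffered output stream (an assumption for illustration only).
function BufferedOut(sink) {
    this.sink = sink;    // array acting as the "final destination"
    this.pending = [];   // output which has been written but not yet delivered
    this.closed = false;
}
BufferedOut.prototype.write = function (text) {
    if (this.closed) throw new Error("write() on a closed stream");
    this.pending.push(text);
};
BufferedOut.prototype.flush = function () {
    if (this.closed) throw new Error("flush() on a closed stream");
    if (this.pending.length) {
        this.sink.push(this.pending.join(""));
        this.pending = [];
    }
};
BufferedOut.prototype.close = function () {
    this.flush();        // closing flushes any pending data first
    this.closed = true;
};

var sink = [];
var out = new BufferedOut(sink);
out.write("prompt> ");
var visibleBeforeFlush = sink.length;  // nothing has reached the sink yet
out.flush();
var visibleAfterFlush = sink.join(""); // the prompt is now delivered
out.close();
```

For interactive output such as a prompt, the flush() call is what makes the text appear before the program goes on to wait for input; for batch output, leaving it to close() is usually enough.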