|
ByblosSteleDocs
How to write translation stelae for Byblos, Classilla's HTML rewrite engine.
Byblos was introduced in Classilla 9.3.0. This document does not apply to 9.2.3 or previous versions of Classilla. Byblos, in modern Lebanon, is believed by many to be the oldest continuously-inhabited city in the world. The Byblos syllabary is one of the most famous undeciphered writing systems, found on a few bronze tablets and "spatula"-like objects, and fragments of a monument stele. Like the famous Rosetta Stone stele which enabled the translation of Egyptian hieroglyphs from ancient Greek and Demotic scripts (and Apple's Rosetta, allowing superior older PowerPC code to run on newer inferior x86 processors), the Byblos steles you can create for Classilla can allow modern web pages to be rewritten dynamically for Classilla's older layout engine and those of us continuously inhabiting Mac OS 9. How Byblos operatesByblos is similar to Greasemonkey in theory, but implementationally quite different. Greasemonkey is a tool allowing users of Firefox and selected other browsers to dynamically rewrite the layout of a page in JavaScript by manipulating the Document Object Model, which is the abstract representation of the page after it has been loaded by the browser. Tweaking the DOM works fine if the browser core can handle sophisticated runtime manipulations and can do so in a determinative fashion even if the rendering is wrong. On an older core such as Classilla, however, this may lead to subtle bugs or unexpected results, and the DOM parser may not be able to handle certain complex constructs. So Byblos works at a lower level than Greasemonkey and allows you to dynamically rewrite the page's HTML even before the browser tries to process it. The process is very simple: Byblos will check to see if a stele exists for the domain it is accessing (HTTP or HTTPS). If there is a stele and it can be processed, and the stele agrees to process this URL, Byblos downloads the entire page, gives it to the stele, the stele manipulates it, and the HTML is presented to the browser transparently. Stelae are dynamic and bulletproof. You can modify them while the browser is running; you can disable them simply by moving them out of the Byblos folder (in the Classilla folder) as the browser is running; and even if you make a mistake or syntax error, the browser continues to function (the stele just doesn't work). This makes it easy to fine-tune your code on the fly. They are written in JavaScript and use a simple API, allowing anyone with a basic knowledge of the language to write one with just a text editor (we recommend BBEdit 6). Stelae do not translate images, plain text, Gopher or FTP menus, or the source you see when you select View Source (this is intentional, so that you can see the original layout of the page HTML as you write your stele's code). Stelae also do not translate data received through XMLHttpRequests, even if that data is HTML, due to technical constraints. Stelae do not currently translate CSS, although you can add <style> tags to simulate this to your page. CSS rewriting is being considered for a future version. Writing your first steleTo make the most of Byblos, you need to know basic HTML and JavaScript, tutorials of which are beyond the scope of this document. However, let's look at a simple example. function() {
return {
wantURI : function(uri) {
return true;
},
parseHTML : function(text) {
var answer;
// Replace all occurrences of google, regardless
// of case, with g00G13.
answer = text.replace(/google/mi, "g00G13");
return {
response : "ok",
body : answer
};
}
};
};Name this stele www.google.com.js and place it in the Byblos folder (the same folder with your Classilla executable). You do not need to restart Classilla for this to take effect. Open the JavaScript Console (under Web Development) to watch the process, then go to www.google.com (you may need to reload the page if it is cached). You will now see that all occurrences of google in the page text, even the title, have been immaturely replaced with g00G13. This makes the page quite useless, but serves as a demonstration. If you had the Console window open, you will see that Byblos logged the operations it took. If your stele had a syntax or other error in it, Byblos would log it to the console as well. All stelae consist of a single JavaScript function(). This function must return an object with at least two slots: the first, wantURI, being a function that accepts an nsIURI object and returns true or false depending on whether the stele wishes to parse this actual URI, and the second, parseHTML, also being a function that accepts a string representing the document HTML and returns itself an object of two slots, one being the response ("ok" or otherwise) and the other being the body, which is the HTML after it has been rewritten. In our case, this stele returns true for any URI it gets (it will only be presented with URIs that point to www.google.com), and in parseHTML, takes the raw HTML given to it by Byblos and does a simple string replace, and returns that in the response object. By now, people unfamiliar with Mozilla will wonder what an nsIURI is, and will already have visited MDN's nsIURI Documentation. In doing so, you will have executed a stele that comes by default with Classilla which cleans up the MDN pages somewhat. Here is how it appears in 9.3.0:
/* Clean up the header for developer.mozilla.org.
Cameron Kaiser
Public domain. */
function() {
return {
wantURI : function(uri) {
if (uri.path == "" || uri.path == "/" || uri.path == "/en-US/")
return false; // We can't do much with the main page.
return true;
},
parseHTML : function(text) {
var answer;
// Remove the top header.
answer = text.replace(/<\/?header[^>]+>/mig, "");
// Remove the mozilla-tab.
answer = answer.replace(/<a id="moz-tab"/, "<a");
// Remove the stupid JavaScript whine.answer = answer.replace(/<div class="noscript">.+<\/div><\/noscript>/, "</noscript>"); // Remove the styling from the h1 title.
answer = answer.replace(/<h1 id="logo">/,
"<h1 style=\"color: #404040\">");
// Remove the CSS rollovers; they don't work. [\s\S]+? matches all characters,
// including newlines; +? is a non-greedy match a la Perl.
answer = answer.replace(/<nav id="nav"[\s\S]+?<\/nav>/m, "");
return {
response : "ok",
body : answer
};
}
};
};
This example actually is not much more complicated. It checks the path of the URI to see if it's something it can help with. The MDN main page has lots of links and widgets which Classilla can't render right even with help (it lacks those facilities), so there is no point in taking the Byblos performance hit there (more about that in the next section) and it returns false (prompting Byblos to simply fall through to the default handlers). In the HTML rewriting portion, it's the same basic idea as the g00G13 example except we're removing portions of the page and entirely rewriting others. In fact, you could even just take a piece of the page into a temporary variable and write out a new page with just that piece in it, such as for a video player site. The piece would just be the video link, and chaff such as snarky comments and ads would simply be eliminated because you would never pass them back to the browser. Nothing says that what you spit out has to be anything like what you took in. The key idea for Byblos is functionality, not rendering fidelity. Performance considerationsBecause stelae are written in JavaScript, they have the advantage of being dynamic but the disadvantage of being interpreted (and because they are dynamic, are interpreted each time they are invoked). Furthermore, if the stele indicates to Byblos that it can translate a URI, then Byblos must download the entire HTML of the page first before it can hand it to the stele (for obvious reasons), which means that Classilla's HTML parser cannot be working on the page while it downloads. You should therefore make sure that your stele is worth the trouble: it should either restore functionality that is not possible any other way, or it should remove portions of the page that are difficult or slow for Classilla to render ordinarily. In the latter case, even with Byblos' performance impact, a stele can boost the browser's overall performance. When you are unsure if you can help in a situation (such as the MDN stele above), you should return false from wantURI(). Even though the cost of interpreting the script is paid, Classilla can still render HTML as it comes down the pipe instead of waiting for all of it and passing it to a stele which can't do anything useful with it anyway, which at least doesn't add insult to injury in the end. LimitationsWhen writing your stelae, be aware of the following limitations as well as the performance concerns above:
We want your workLike the Greasemonkey community, we would like to build up a library of useful stelae for sites Classilla can't render properly. Scripts that are judged particularly essential or useful may even be included with future versions of Classilla. If you have a stele you would like to submit for review, please open an issue and attach it. Don't take it personally if we decline; we're trying to restrict this to useful ones, in our sole purview. But others might like it through unofficial means, and the great thing about Byblos is there is little penalty for experimentation. |