Introduction
Detailed documentation on sbaz can be found at http://www.lexspoon.org/sbaz/. Everything from design rationale to the data model can be found in the various links on this page.
Fundamentally, sbaz is used to manage a Scala development and execution environment, called a managed directory. Packages can be downloaded from remote locations, installed, and intelligently removed, with package-based dependencies enforced. Any Jar files or class directories installed directly into the /lib directory (nested directories are not searched) are automatically included on the classpath of all executable scripts (scala, scalac, scaladoc, etc.) in the /bin directory. As a result, multiple versions of the same package cannot be used side by side in this way. Packages are not required to place libraries at the top level of the /lib directory, so multiple library versions can be installed into a single working directory, but doing so prevents the automatic inclusion of those Jar files on the Scala tools' classpath. Below, you will find summaries of the core concepts of the sbaz design and some brainstorming on how sbaz could coexist with Maven with some degree of ease.
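The classpath rule described above can be illustrated with a short sketch. It is written in Python purely for illustration and assumes that every Jar file or directory found directly in /lib is included while nothing below that level is searched; the function name lib_classpath is hypothetical, and the real launcher scripts may build the classpath differently.

```python
import os
from pathlib import Path


def lib_classpath(managed_dir):
    """Return a classpath string built from the top level of <managed_dir>/lib.

    Assumption for illustration: every jar file or directory found directly
    in lib is included, and nothing below that level is searched.
    """
    lib = Path(managed_dir) / "lib"
    entries = sorted(
        p for p in lib.iterdir()
        if p.is_dir() or p.suffix == ".jar"
    )
    return os.pathsep.join(str(p) for p in entries)
```

Under this assumption, a Jar placed in a versioned subdirectory of /lib would not be picked up automatically, which is the trade-off the paragraph above describes.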
Core Concepts
Universe - a means to define an index of available packages
- A SimpleUniverse points to a single bazaars server.
- Other kinds of Universe exist to build composites of SimpleUniverses.
- In the current implementation, the logic surrounding the Universe exists primarily on the client side. The client pulls down the available packages from each bazaar server listed in the SimpleUniverses, then joins and filters them according to the kind of Universe.
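A minimal sketch of this client-side join, under stated assumptions: Advert, UnionUniverse and the injected fetch function are hypothetical names, and the filter shown (the first member to advertise a given name/version pair wins) is only one possible rule, not necessarily the one sbaz applies.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Advert:
    """Hypothetical index entry: package name, version and download URL."""
    name: str
    version: str
    url: str


class SimpleUniverse:
    """Points at a single bazaar server.

    `fetch` stands in for the XML message exchange with the server; it is
    injected so the sketch runs without a network.
    """

    def __init__(self, server_url: str, fetch: Callable[[str], Iterable[Advert]]):
        self.server_url = server_url
        self._fetch = fetch

    def available(self) -> List[Advert]:
        return list(self._fetch(self.server_url))


class UnionUniverse:
    """Hypothetical composite: joins its members' indexes on the client.

    The filter used here (the first member to advertise a given
    name/version pair wins) is an assumption, not sbaz's documented rule.
    """

    def __init__(self, members):
        self.members = list(members)

    def available(self) -> List[Advert]:
        merged = {}
        for universe in self.members:
            for advert in universe.available():
                merged.setdefault((advert.name, advert.version), advert)
        return list(merged.values())
```

Other composite kinds would differ only in how they merge or filter the member lists.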
Package - a grouping of files for installation into a managed directory
- A package is an archive file (zip compression) with the .sbp extension.
- A package cannot be installed over a previously installed version. The existing package must be removed before the replacing package can be installed.
- All contents of a package are extracted into the working directory structure, which follows a traditional Unix-style layout. The one exception is the package's meta directory, which is not extracted.
- The data needed to track dependencies and manage installs/removals is stored in the meta directories of the working directory and packages.
- The client side sbaz tool tracks where each file is stored so it can be completely removed later.
- Dependencies are by package name only. Versioned dependencies are not currently supported.
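The bookkeeping described in this list can be sketched as a small in-memory model. This is an illustration only: the class ManagedDirectory and its record layout are hypothetical and do not reflect the actual format of the meta directory.

```python
class ManagedDirectory:
    """In-memory model of the bookkeeping kept in the meta directory."""

    def __init__(self):
        # package name -> {"version": str, "files": [paths], "depends": [names]}
        self.installed = {}

    def install(self, name, version, files, depends=()):
        # A package cannot be installed over a previously installed version.
        if name in self.installed:
            raise ValueError(f"{name} is already installed; remove it first")
        # Dependencies are checked by package name only.
        missing = [d for d in depends if d not in self.installed]
        if missing:
            raise ValueError(f"{name} needs missing packages: {missing}")
        self.installed[name] = {
            "version": version,
            "files": list(files),
            "depends": list(depends),
        }

    def remove(self, name):
        """Return the files to delete, refusing if another package needs this one."""
        dependents = sorted(
            other for other, entry in self.installed.items()
            if name in entry["depends"]
        )
        if dependents:
            raise ValueError(f"{name} is required by: {dependents}")
        return self.installed.pop(name)["files"]
```

Because every installed file is recorded against its package, removal can delete exactly what the package put in place.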
Sbaz Command Line Interface - client-side tool for managing the managed directory
- Installs and uninstalls packages for a working directory with some degree of compatibility assurance.
- Provides a single command to update all packages in the managed directory
- Displays a list of packages available for installation
- Displays a list of installed packages
- Displays detailed info about a specific package
- Stores access keys needed for secured sbaz servers
- Bundles packages for sharing
- Communicates with bazaars server to advertise packages for sharing
- Moving the package to a server for hosting is not supported
- The list of available commands is hard coded, but could probably be enhanced for plug-in support.
- The client side tool itself is managed in the managed directory.
- Somewhat surprisingly, the client does not appear to allow a direct URL read (i.e. of a static XML file) when loading the available package lists for a Universe. It must communicate with a Bazaars Server using XML messages for both reads and writes.
Bazaars Server - a central location for distributing package indexes (not the packages themselves) to clients
- Manages a list of advertised packages (add, remove, distribute)
- Provides basic security through keys.
- Current implementation is simple and Servlet based.
- Current implementation doesn't appear to have much knowledge about Universe design beyond its own SimpleUniverse definition. It does have potential to grow.
Feature differences between Maven and sbaz requiring design decisions
- Unlike in Maven, repositories are not a core concept of the sbaz design. Where sbaz provides a listing of packages and pointers (URLs) to download locations, Maven provides a repository and expects users to know what is available. Fortunately, these design points are not mutually exclusive, and it should be possible to merge them.
- Maven is geared toward project management while sbaz is geared toward managing an environment.
- Maven supports versioned dependencies; sbaz does not.
- With transitive dependencies, sbaz assumes compatibility across the most recent versions. Maven makes no such assumption because it uses explicitly defined dependency versions. When versions conflict in a complex dependency graph, Maven uses the version found at the shallowest position in the graph (a process called Dependency Mediation). Both allow explicit overrides by specifying a version at the top-most level, but Maven has additional version compatibility audits.
- With Maven's versioned dependencies, a single library addition or version change may result in many updates to the versions of transitive dependencies as well. Supporting this kind of behavior in sbaz would involve many removals and installs, possibly requiring a rewrite of the dependency enforcement logic. This can be a performance issue because packages are added incrementally to a working directory, unlike in Maven, where all dependencies are defined within the POM before they are calculated and downloaded.
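The Dependency Mediation rule mentioned in the list above ("nearest definition wins") can be sketched as a breadth-first walk. This is a simplified illustration, not Maven's implementation: the function mediate and the graph layout are hypothetical, and exclusions, scopes and version ranges are ignored.

```python
from collections import deque


def mediate(root, graph):
    """Pick one version per artifact, nearest to the root first.

    `graph` maps (name, version) to the list of (name, version) pairs it
    declares, in declaration order. Breadth-first order means a shallower
    declaration is always seen before a deeper one, and ties at the same
    depth go to whichever declaration is reached first.
    """
    chosen = {}
    queue = deque(graph.get(root, []))
    while queue:
        name, version = queue.popleft()
        if name in chosen:
            continue  # a nearer (or earlier) declaration already won
        chosen[name] = version
        # Only the winning version's own dependencies are explored.
        queue.extend(graph.get((name, version), []))
    return chosen
```

Note how one change near the root can alter which versions win further down the graph, which is the cascade of updates described above.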
Given the above points, I propose sbaz retain its existing dependency management paradigm while adding the ability to reference packages within the Maven repositories. This will require a "dumbing down" of Maven's versioned dependencies paradigm to map into the sbaz way of doing things. This approach will allow the greatest reuse of existing code and design in sbaz, and shouldn't disrupt current users of the tool.
Possible sbaz Enhancements
Collision Management
The client does handle collisions; however, the behavior could be improved. In the upgrade from Scala 2.7.2 RC1 to 2.7.2 RC2, the scala-swing Jar files were merged into the scala-library package. I already had scala-swing installed as a separate package, so I encountered the collision during the upgrade. I received the message 'Error: package scala-swing/0.1 already includes src/scala-swing-src.jar', but there was no indication of how the error was handled or which package was the offending one. The client simply aborted with a partial installation: it did not install the scala-library package, which left my managed directory in an inconsistent state. The client should be enhanced to never leave a managed directory in an inconsistent state and/or provide the user with enough detail to easily recover.
- Collisions can occur between files in different packages.
- Collisions can occur between classes in different Jar files.
We need a clear definition of how such collisions are to be handled.
- As with the implementation today, cross package collisions should be avoided by default. Better feedback to the user must be added to speed problem resolution.
- There may be times when installing a collision is desirable. When forcing such installations, the order of installation (particularly for package level collisions) and inclusion into the classpath (class level collisions) is important. The command line tool should manage this sequencing for subsequent removals and possibly promoting shadowed packages to the front.
- A new 'which' utility (similar to the Unix command) could be made available via the command line tool. This utility could be used when class level collisions exist to identify where the used class file is located on the file system as well as listing all shadowed versions.
- The package advertisement should include a contents list (early collision checks), archive file size (download status feedback), total contents file size (disk management) and archive checksum (security), all of which can be generated directly from the .sbp file.
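As a rough sketch of two of the ideas above, the following shows an early collision check driven by an advertised contents list and a 'which'-style lookup for class level collisions. Both functions and their data layouts are hypothetical; a real implementation would read the contents from the meta directories and from the Jar files themselves.

```python
def find_collisions(installed, candidate_files):
    """Map each installed package to the paths it shares with the candidate.

    `installed` maps "name/version" to the set of paths that package owns;
    `candidate_files` is the contents list from the new advertisement.
    """
    wanted = set(candidate_files)
    report = {}
    for owner, files in installed.items():
        shared = sorted(wanted & set(files))
        if shared:
            report[owner] = shared
    return report


def which(class_file, classpath):
    """Return (used, shadowed) jars for a class, in classpath order.

    `classpath` is an ordered list of (jar_path, entries) pairs, where
    `entries` is the set of class files inside that jar.
    """
    hits = [jar for jar, entries in classpath if class_file in entries]
    return (hits[0], hits[1:]) if hits else (None, [])
```

Running the collision check before anything is downloaded or extracted would let the client report every owner and path involved and abort without touching the managed directory.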
Version Management
The Scala language's trait mechanism is a very powerful feature, but must be treated with special care when packaging and deploying. Aspects of the trait are effectively copied into the extending class, making it impossible for the class loader to adopt changes to the trait alone. Given the way traits work, a version change to the trait package would likely require a new version (recompile) of the dependent packages even if they require no code changes. To manage this scenario, a package should specify the strictness of a version dependency.
- By default, sbaz will prefer the most recent version available.
- No version number restrictions are required (results in same behavior as today). Version number constraints are explicitly added to relevant packages.
- Strict version dependencies can be used to specify a single version number with which the dependency is compatible. This is useful for packages that depend upon traits in other packages.
- Version number dependencies can be specified using ranges to specify a lower bound, upper bound, or both.
- Enhance sbaz to favor explicitly installed packages over transitive dependencies when resolving version conflicts
- Dependency resolution algorithm
- Starting at each explicitly installed package, build a list of all dependencies with version ranges
- If multiple dependency references exist for a specific package, calculate the intersection (inner join) of the version ranges
- Select the latest version of each package falling into its joined version range
- Do not let transitive dependencies install if a package's version ranges cannot be resolved automatically
- Explicitly installed packages take precedence over the automated dependency version algorithm.
- A command line tool should be available to provide the dependency paths leading to a version conflict
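The range-joining and selection steps of the proposed algorithm can be sketched as follows. This is an illustration under simplifying assumptions: versions are tuples of integers, ranges are inclusive, the dependency references are assumed to have been collected already, and the names Range and resolve are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, ...]


@dataclass(frozen=True)
class Range:
    """Inclusive version range; None means the bound is open."""
    lower: Optional[Version] = None
    upper: Optional[Version] = None

    def contains(self, v: Version) -> bool:
        return ((self.lower is None or v >= self.lower)
                and (self.upper is None or v <= self.upper))

    def intersect(self, other: "Range") -> Optional["Range"]:
        lowers = [b for b in (self.lower, other.lower) if b is not None]
        uppers = [b for b in (self.upper, other.upper) if b is not None]
        lo = max(lowers) if lowers else None
        hi = min(uppers) if uppers else None
        if lo is not None and hi is not None and lo > hi:
            return None  # the ranges do not overlap
        return Range(lo, hi)


def resolve(constraints, available, explicit=None):
    """Choose one version per package, or raise ValueError on a conflict.

    constraints: name -> list of Range, one per dependency reference
    available:   name -> list of advertised versions
    explicit:    name -> version the user installed by hand (takes precedence)
    """
    explicit = explicit or {}
    chosen = {}
    for name, ranges in constraints.items():
        if name in explicit:
            chosen[name] = explicit[name]
            continue
        joined = Range()
        for r in ranges:
            joined = joined.intersect(r)
            if joined is None:
                raise ValueError(f"no common version range for {name}")
        candidates = [v for v in available.get(name, []) if joined.contains(v)]
        if not candidates:
            raise ValueError(f"no advertised version of {name} fits the range")
        chosen[name] = max(candidates)
    return chosen
```

A fuller version would also record which dependency path contributed each range, so that the command line tool could report the paths leading to a conflict.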
With the above version management solution, it would be ideal if the entire Scala community could adopt a version number convention that indicates the significance of the changes included within a release. For example:
major.minor.trait.bug
- major denotes a significant design change with no regard to backward compatibility
- minor denotes incremental enhancement where backward incompatibilities should be minimal
- trait denotes a bug fix change to a trait that would only affect dependencies if they extend the trait
- bug denotes backward compatible bug fix release
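To show how a tool might act on such a convention, here is a small sketch that parses the four fields and names the most significant one that changed. The needs_recompile policy is a hypothetical reading of the field descriptions above, not an agreed rule.

```python
FIELDS = ("major", "minor", "trait", "bug")


def parse(version):
    """Turn 'major.minor.trait.bug' into a tuple of four integers."""
    parts = tuple(int(p) for p in version.split("."))
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} numeric fields: {version!r}")
    return parts


def change_kind(old, new):
    """Name the most significant field that differs, or None if equal."""
    for field, a, b in zip(FIELDS, parse(old), parse(new)):
        if a != b:
            return field
    return None


def needs_recompile(old, new, extends_trait):
    """Hypothetical policy: does a dependent package need rebuilding?

    A bug-level change never forces a rebuild; a trait-level change only
    matters to dependents that extend the trait; anything larger is
    treated conservatively as requiring one.
    """
    kind = change_kind(old, new)
    if kind in (None, "bug"):
        return False
    if kind == "trait":
        return extends_trait
    return True
```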
Alternative Packaging Solution
- Add support for an XML-based package that references the components of the package via multiple location URLs. This could be used to define packages that point directly into Maven repositories, Google Code projects and other sources that may not provide .sbp packages.
- Maven Repo Read - Implement or enhance the bazaar server to generate advertisements for XML-based packages pointing into a Maven repo. This could be fully automated, but reasonable packaging of versions may require manual intervention. For example, regular expressions could be used to choose which versions of a library belong to a given package (e.g. Junit3 vs. Junit4).
- Maven Repo Read - How should sbaz treat Maven packages? Should they always be installed into the /lib directory for automatic inclusion on the classpath? Should they be installed elsewhere (e.g. a Maven-style local repo within the workspace) for explicit classpath inclusion? This could be configured within the /config directory as a managed-directory-wide behavior, but it would be nice to be able to mix these options.
- Maven Repo Write - Package Description enhancements to provide required data for Maven repo upload
- Maven Repo Write - Add support to bazaar server for uploading artifacts. Could add audits for valid versions such that the repo cannot become broken when using most recent versions per the sbaz version management way.
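The regular-expression idea from the first Maven Repo Read item above could look something like the sketch below. The rule table, the sbaz package names junit3 and junit4, and the function assign_packages are all hypothetical.

```python
import re

# Hypothetical rules: sbaz package name -> pattern over Maven version strings.
RULES = [
    ("junit3", re.compile(r"3\.\d+(\.\d+)*")),
    ("junit4", re.compile(r"4\.\d+(\.\d+)*")),
]


def assign_packages(versions, rules=RULES):
    """Group Maven versions under the first rule whose pattern fully matches.

    Versions that match no rule are returned separately so that a person
    can decide what to do with them.
    """
    grouped = {name: [] for name, _ in rules}
    unmatched = []
    for version in versions:
        for name, pattern in rules:
            if pattern.fullmatch(version):
                grouped[name].append(version)
                break
        else:
            unmatched.append(version)
    return grouped, unmatched
```

Returning the unmatched versions separately leaves room for the manual intervention mentioned above.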
General Enhancements
- Add project and contact info to the package metadata
- A meta directory summary file could record the referenced Universes, package versions, installation sequences and other such information in a single flat file. This flat file could then be used to restore a broken working directory or to build a new, identical one. Applications of this include simple sharing of setup between developers, version management without storing large third party files, simple environment backups with single command restores, etc.
- A command line tool managed checkpoint/rollback feature may be nice.
- The managed directory itself could become a file system based bazaars "server" to support scenarios where only a file server is available
- Make adding sbaz commands modular such that new commands could be plugged into sbaz via downloadable packages
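As an illustration of the summary-file idea in the list above, the sketch below writes and reads a flat file listing Universes and packages in installation order. The line format ("universe <url>" and "package <name> <version>") is invented for this example and is not an existing sbaz format.

```python
def dump_summary(universes, packages):
    """Serialize universes and installed packages, in install order."""
    lines = [f"universe {url}" for url in universes]
    lines += [f"package {name} {version}" for name, version in packages]
    return "\n".join(lines) + "\n"


def load_summary(text):
    """Parse the text produced by dump_summary back into its two lists."""
    universes, packages = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        kind, _, rest = line.partition(" ")
        if kind == "universe":
            universes.append(rest)
        elif kind == "package":
            name, version = rest.split()
            packages.append((name, version))
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return universes, packages
```

A restore command could replay the package lines in order against the listed Universes to rebuild an identical working directory.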
Some of your general enhancements are rather interesting. Eventually sbaz would have the same features as apt or yum.
In terms of priorities, I see these as being the first things we attack:
1) Meta-data changes (we need to make sure future/existing sbaz packages have enough meta-data for the features outlined).
2) Maven Repo Read enhancements (suddenly, sbaz will have access to many more libraries, improving the Scala community).
Also, an idea for a general sbaz enhancement: support upgrading packages rather than uninstalling/replacing them.
Has Buildr been considered as an alternative to sbaz? It already supports Scala.