Overview

UIMA-connectors mainly aims to bridge markup languages and UIMA's data structure, namely the CAS. By comparison, the Tika project aims to detect and extract metadata and structured text content from documents of various MIME types. UIMA-connectors is more dedicated to mapping between text formats and the CAS, in both directions. It provides solutions for handling formats such as Extensible Markup Language (XML), comma-separated values (CSV), and plain text with whitespace-separated tokens and one sentence per line, as well as applications of these formats such as those of the Message Understanding Conferences (MUC) and Apache OpenNLP.

In practice, the solutions can be collection readers, analysis engines (AE), or CAS consumers. We prefer to develop components as AEs, since an AE can be plugged in at any point of a workflow by specifying the view to process. These components can complement the wrapping performed by the uima-shell component, but they can also be used alongside more direct integrations, such as those that implement the API of third-party tools.

Examples of format transformations supported by UIMA-connectors:
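To illustrate the kind of mapping described above, here is a minimal sketch of reading the "whitespace token, newline sentence" format into offset-based spans. It is not taken from UIMA-connectors and does not use the UIMA API: the class `WhitespaceFormatSketch`, the `Span` type, and the type names "Sentence" and "Token" are hypothetical stand-ins for whatever annotation types a real component would add to a CAS view.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not part of UIMA-connectors and independent of the UIMA API.
class WhitespaceFormatSketch {

    /** Stand-in for a CAS annotation: a type name plus character offsets [begin, end). */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Treats each line as a sentence and each whitespace-separated chunk as a token. */
    static List<Span> toSpans(String text) {
        List<Span> spans = new ArrayList<>();
        int lineStart = 0;
        while (lineStart <= text.length()) {
            int lineEnd = text.indexOf('\n', lineStart);
            if (lineEnd < 0) {
                lineEnd = text.length();
            }
            addLineSpans(text, lineStart, lineEnd, spans);
            lineStart = lineEnd + 1;
        }
        return spans;
    }

    /** Emits one Sentence span followed by its Token spans; blank lines emit nothing. */
    private static void addLineSpans(String text, int lineStart, int lineEnd, List<Span> out) {
        List<Span> tokens = new ArrayList<>();
        int i = lineStart;
        while (i < lineEnd) {
            // Skip the whitespace that separates tokens.
            while (i < lineEnd && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int begin = i;
            // Consume the token itself.
            while (i < lineEnd && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > begin) {
                tokens.add(new Span("Token", begin, i));
            }
        }
        if (!tokens.isEmpty()) {
            // The sentence covers the first token's start through the last token's end.
            out.add(new Span("Sentence", tokens.get(0).begin, tokens.get(tokens.size() - 1).end));
            out.addAll(tokens);
        }
    }

    public static void main(String[] args) {
        String text = "The cat sleeps\nIt purrs";
        for (Span s : toSpans(text)) {
            System.out.println(s.type + " [" + s.begin + "," + s.end + ") "
                    + text.substring(s.begin, s.end));
        }
    }
}
```

In an actual analysis engine, each span would presumably become an annotation indexed in the view selected for processing; the offsets are the part that carries over regardless of the type system used.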
If you want to receive notifications about major updates, please send an email to nicolas.hernandez's Gmail account with the following subject: uima-connectors request for notifcation.