Builtin_functions
JAQL built-in function list - autogenerated
system
externalfn()
Description: An expression that constructs a JSON value for a Java UDF. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any
ls()
Description: This function returns an array of file objects that match a user-provided glob / path filter. Usage :
{ <file status fields> } ls(string glob); Input Parameters: glob or a path / file pattern string. The path pattern is absolute if it begins with a slash. Output: An array of file status records that adhere to the following schema {"accessTime": Date, "blockSize": Long, "group": String, "length": Long, "modifyTime": Date, "owner": String, "path": String, "permission": String, "replication": Long} In standalone mode, the path pattern is applied to the local file system. In distributed modes, ls applies the path pattern to the HDFS file system by default. To apply this function to the local file system in distributed mode, the path pattern has to be prefixed with 'file:///'. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
core
catch()
Description: Wrap any expression with catch to guard against exceptions. Usage:
T1|null catch( T1 e1, { errThresh: long } | null, T2 e2); Wrap the catch expression around the first argument, e1, that needs to be guarded against exceptions. The second argument is optional and specifies an exception handling policy. If it is unspecified or null, the default exception handling policy is used. By default, if an exception occurs, it is propagated (which typically results in aborted execution). This default can be overridden globally using the registerExceptionHandler function, or it can be overridden per usage of catch by using the second argument. Such an override allows an exception to be thrown errThresh times before it is propagated; the default has errThresh set to 0. The third argument, e2, is optional and specifies an expression whose value is logged when an exception is thrown. catch returns the result of evaluating e1 (whose type is T1). If an exception is thrown but skipped, then null is returned. Note that catch is a "blocking" call: the result of e1 will be materialized. If e1 could be streamed (e.g., read(...)), its result will be entirely materialized when it is used in the context of catch. Parameters (1 - 3 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any),( arg2 = null: schema any) Output schema any Examples
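The skip-then-propagate policy described for catch can be modeled outside Jaql. The following is a minimal Python sketch of the documented behavior, not Jaql's implementation; the names `ErrorPolicy` and `guarded` are hypothetical, and it assumes the errThresh budget is counted per policy object.

```python
import logging


class ErrorPolicy:
    """Hypothetical stand-in for { errThresh: long }: a budget of skippable exceptions."""

    def __init__(self, err_thresh=0):
        self.err_thresh = err_thresh
        self.skipped = 0


def guarded(thunk, policy, log_value=None):
    """Model of catch(e1, policy, e2): evaluate thunk, skipping up to err_thresh failures."""
    try:
        result = thunk()
        # catch is "blocking": a streamed result is fully materialized.
        if hasattr(result, "__next__"):
            result = list(result)
        return result
    except Exception:
        if policy.skipped >= policy.err_thresh:
            raise  # budget used up (or errThresh = 0): propagate
        policy.skipped += 1
        if log_value is not None:
            # e2 is only evaluated and logged when an exception is skipped.
            logging.warning("skipped exception, context: %r", log_value())
        return None  # a skipped exception yields null
```

With `err_thresh=2`, the first two failing calls return `None` and the third propagates its exception; with the default of 0, the first failure propagates.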
compare()
Description: This function compares two JSON values with each other.
Usage : long compare(T1 val1, T1 val2); Input Parameters: Two JSON values of the same type Output: Returns -1, 0, or 1 as val1 is less than, equal to, or greater than val2. Parameters (2 inputs) Input Types: ( x, required: schema any),( y, required: schema any) Output schema long Examples
getHdfsPath()
Description: This function returns the absolute path of a file or directory in HDFS.
Usage : string getHdfsPath(string); Input Parameters: The input string is either a fileName, or a directory name residing in HDFS. Absolute paths have to be prefixed with a slash. Output: A string value containing the absolute path name of the file system object in HDFS context. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
getOptions()
Description: Return Jaql's options as a record. Usage:
{ : any, } getOptions(); Jaql maintains globally accessible options, e.g., name-value pairs. These options are represented as a record; the getOptions function returns these options. Note that if you set the field "conf" with a record, those options are overlaid onto Hadoop's JobConf when a MapReduce job is run. Using getOptions and setOptions, one can override settings in the default JobConf. Parameters (0 inputs) Input Types: {{{}}} Output schema any Examples
index()
Description: index(array, index) returns the value at position index in the array passed in. It is equivalent to array[index], but it captures a simpler case that does not use path expressions. array[index] is transformed to use the index() function for better performance.
Note: index() is zero-based. Usage: any index( array, long ) Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
listVariables()
Description: This function lists all global variables that are in scope.
Usage : [{ <variable fields> }] listVariables() Output: Returns an array of JSON records, each representing a global variable. Every JSON record adheres to the following schema {"var": string, "schema": schema, "isTable": bool, "package": string, "module": string, "alias": string}
var is the name of a variable
schema is the schema of the variable
isTable is true if the schema is an array of records
package is the package name that declared the variable, or "" for the top script
module is the module name that declared the variable, or "" for the top script
alias is the module alias used to refer to the variable in the current context, or ""
Parameters (0 inputs) Input Types: {{{}}} Output {{{schema [{"var": string, "schema": schematype, "isTable": boolean}]}}} Examples
perPartition()
Description: perPartition() declares that a function f can be evaluated in parallel using an arbitrary partitioning of the input array in.
Usage: array perPartition( array in, array fn( array ) f ) perPartition() returns the array obtained by applying f to in. Note: This function is declared experimental. perPartition() states that the partitioning of the input array does not influence the result of f. Since f can be an arbitrary function, this statement may or may not be correct. If f is sensitive to the partitioning of in, the result of this function depends on the evaluation strategy of the query and may be arbitrary. Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
range()
Description: Range generates a continuous array of numbers.
Usage:
range(size) = [0,size-1]
range(size,null) = [0,size-1]
range(start,end) = [start,end]
range(start,end,skip) = if skip > 0 then for(i = start, i <= end, i += skip) else error
range(size,null,skip) = if skip > 0 then for(i = 0, i < size, i += skip) else error
Parameters (1 - 3 inputs) Input Types: ( startOrSize, required: schema long?),( end = null: schema long?),( by = 1: schema long) Output {{{schema [long]}}}
registerExceptionHandler()
Description: Register a default exception handling policy. Usage:
bool registerExceptionHandler( { errThresh: long } ); This function allows the default exception handling policy to be overridden. Currently, the policy can specify how many exceptions to skip before propagating the exception up the call stack. This is specified by the errThresh field of the input. By default, errThresh is set to 0, meaning that no exceptions are skipped. When an exception is skipped, the enclosing expression decides what to do. If the exception occurs in the catch function, then it returns null and logs the results of a user supplied expression. If the exception occurs in a transform, then the result is skipped and logged. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
setOptions()
Description: Set Jaql's options as a record. Usage:
bool setOptions( {: any, } ); Jaql maintains globally accessible options, e.g., name-value pairs. These options are represented as a record; the setOptions function modifies these options. Note that if you set the field "conf" with a record, those options are overlaid onto Hadoop's JobConf when a MapReduce job is run. Using getOptions and setOptions, one can override settings in the default JobConf. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
tee()
Description: tee() streams an array into each specified function. It returns its input array. tee() can be thought of as analogous to the tee command in Unix, which is used to replicate data streams.
Usage: array tee( array in, any fn( array ) f1, ..., any fn( array ) fN ) tee() applies the functions f1 through fN to in. It returns in. tee() can be called with no functions, in which case tee() is effectively a no-operation on the array in. The functions passed to tee() usually perform a side effect, as the results of the functions f1 through fN are discarded. A typical use of tee() would be to write the input array to different output files. Callers should not assume a particular order of evaluation of f1 through fN. Parameters (1 - 2 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any)... Output schema any Examples
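The contract of tee() (apply each function for its side effect, discard the results, return the input unchanged) can be modeled as follows. This is a Python sketch of the described semantics only; unlike Jaql's tee(), it materializes the input rather than streaming it, and the function name mirrors the Jaql name purely for illustration.

```python
def tee(items, *fns):
    """Model of tee(): feed the array to every function, return the array itself."""
    materialized = list(items)  # simplification: Jaql streams, this sketch does not
    for fn in fns:
        fn(iter(materialized))  # the return value of each function is discarded
    return materialized


# Usage: two "sinks" that record something about the stream as a side effect.
seen = []
result = tee([1, 2, 3],
             lambda xs: seen.append(sum(xs)),
             lambda xs: seen.append(len(list(xs))))
```

Because the evaluation order of f1 through fN is unspecified, the sinks should not depend on each other; here the recorded values are order-independent once sorted.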
timeout()
Description: Wrap any expression to limit the amount of time it will run. Usage:
T timeout(T e, long millis); Given an arbitrary expression e (of type T), allow it to be evaluated for no more than millis ms. If e completes in less than millis ms, then its value is returned. Otherwise, an exception is thrown. Parameters (1 - 2 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any) Output schema any Examples
hadoop
loadJobConf()
Description: Load a Hadoop JobConf into a record. Usage:
{ : string, } loadJobConf( string? filename ) If the filename of the conf is not specified, then the default JobConf is loaded. Parameters (0 - 1 inputs) Input Types: ( arg0 = null: schema any) Output schema any Examples
mapReduce()
Description: This function runs a MapReduce job from within JAQL.
Usage : mapReduce(JSON record); Input: A JSON record that describes the MapReduce job to run. This record adheres to the following schema: {input: {type: string, location: string}, output: {type: string, location: string}, map: JAQL function, combine: JAQL function, reduce: JAQL function} Input and output expect file descriptors (fd); more information on this can be found in the I/O wiki at "http://code.google.com/p/jaql/wiki/IO". Note that input and output can also be an array of file descriptors, which is needed for other functions like co-group, union, and tee, for example. Also note that the output is also a file descriptor, i.e., the result of a map-reduce invocation can be read. The map function takes as input an array and produces an array of pairs of the grouping key and value. The combiner function takes as input the output of the map function and performs partial aggregation on the map side before sending it to the reduce function. The reduce function processes the output of the map function or combiner function and operates on the grouping key and an array of values. Output: A JSON record describing the type and location of the reduce output, i.e., a file descriptor {type: String, location: String} Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
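The division of work between map, combine, and reduce described above can be illustrated with a small in-memory model. This is a Python sketch, not Jaql and not Hadoop: the function `map_reduce`, its `splits` parameter, and the `(key, values)` signatures of the combine and reduce callbacks are assumptions made for illustration, and file descriptors are replaced by plain lists.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn, combine_fn=None, splits=2):
    """In-memory model: map per split, optional map-side combine, then reduce per key."""
    chunks = [records[i::splits] for i in range(splits)]  # imitate several map tasks
    shuffled = defaultdict(list)
    for chunk in chunks:
        local = defaultdict(list)
        for key, value in map_fn(chunk):  # map: array in, [key, value] pairs out
            local[key].append(value)
        for key, values in local.items():
            if combine_fn is not None:
                values = combine_fn(key, values)  # partial aggregation on the map side
            shuffled[key].extend(values)
    out = []
    for key in sorted(shuffled):
        out.extend(reduce_fn(key, shuffled[key]))  # reduce: key plus array of values
    return out


# Usage: a word count where the combiner pre-sums counts within each split.
lines = ["a b", "b c", "b"]
counts = map_reduce(
    lines,
    map_fn=lambda chunk: [(w, 1) for line in chunk for w in line.split()],
    combine_fn=lambda key, values: [sum(values)],
    reduce_fn=lambda key, values: [{"word": key, "n": sum(values)}],
)
```

The combiner is optional in this model: because summation is commutative and associative, the result is the same with or without it.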
mrAggregate()
Description: This function runs a MapReduce job from within JAQL and allows for running multiple algebraic aggregates in one pass.
Usage : mrAggregate(JSON record); Input: A JSON record that describes the MapReduce job to run. This record adheres to the following schema: {input: {type: string, location: string}, output: {type: string, location: string}, map: JAQL function, aggregate: JAQL function, final: JAQL function} Input and output expect file descriptors (fd); more information on this can be found in the I/O wiki at "http://code.google.com/p/jaql/wiki/IO". Note that input and output can also be an array of file descriptors, which is needed for other functions like co-group, union, and tee, for example. Also note that the output is also a file descriptor, i.e., the result of a map-reduce invocation can be read. This function is evaluated by running map/reduce. However, it allows a user to specify an array of partial aggregates in the aggregate parameter. These algebraic aggregates (which have to be commutative and associative combiner functions) are then evaluated without making multiple passes over the group, as would be necessary with the mapReduce function. More information can be found in the JAQL wiki at "http://code.google.com/p/jaql/wiki/Functions". Output: A JSON record describing the type and location of the reduce output, i.e., a file descriptor {type: String, location: String} Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
nativeMR()
Description: Launch a natively specified MapReduce job. Usage: { status: boolean } nativeMR( { job conf } conf , { apiVersion: "0.0" | "1.0", useSessionJar: boolean } options );
Launch a stand-alone map-reduce job that is exclusively described by job conf settings. The conf can be obtained using loadJobConf or it can be specified using a record literal that lists the needed name/value pairs for the job. If apiVersion is set to "0.0", then the old Hadoop MapReduce API is used. Otherwise, the new API is used. The useSessionJar is convenient for those native MapReduce jobs that use jaql libraries. Since the jaql client already packages up jars when submitting jobs to Hadoop's MapReduce, the useSessionJar is used to specify that the job's jar should use the client's currently packaged jar. Parameters (1 - 2 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any) Output schema any Examples
readConf()
Description: Read a value from Jaql's current Hadoop JobConf. Usage:
string readConf(string name, string? dflt); Jaql stores the JobConf that is associated with the current map-reduce job. This function reads name from this JobConf and returns its value; otherwise it returns the dflt value. Parameters (1 - 2 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any) Output schema any Examples
jaql> readConf( "mapred.reduce.tasks" ); "1"
io
hdfsShell()
Description: This expression allows for running HDFS shell commands. This is equivalent to executing 'hadoop fs'.
Usage : long hdfsShell(string); Input: An HDFS file system command that is supported by Hadoop's FsShell. Output: The command output. Return value -1 --> command failed. Return value 0 --> command successfully executed. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
array
arrayToRecord()
Description: This function generates a JSON record by merging two input arrays. The current sort order is maintained for the merging process.
Usage : T1 arrayToRecord(JSON array1, JSON array2); Input Parameters: Two JSON arrays, where the first array contains the list of names and the second array contains the respective list of values.
If count(array1) > count(array2) --> A null value is used to fill up all names that do not have an associated value
If count(array1) < count(array2) --> Error
Output: Returns a JSON record that contains all merged name/value pairs from array1 and array2 Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
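The name/value merging rule can be modeled in a few lines. This Python sketch is an illustration of the description, not Jaql code; it assumes that the error case is the one where there are more values than names, and that missing values are padded with null (`None`). The function name `array_to_record` is hypothetical.

```python
def array_to_record(names, values):
    """Model of arrayToRecord(): pair names with values positionally."""
    if len(values) > len(names):
        # Assumed error case: a value without a name cannot be placed in the record.
        raise ValueError("more values than names")
    padded = list(values) + [None] * (len(names) - len(values))
    return dict(zip(names, padded))  # dict preserves the input order of names


record = array_to_record(["a", "b", "c"], [1, 2])
```

Here `record` pairs "a" with 1 and "b" with 2, and "c" receives a null because it has no associated value.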
deempty()
Description: This function removes empty sub-objects and null values from a JSON array.
Usage : T2 deempty(T1); Output: Returns a JSON array without empty sub-objects, e.g., empty records and arrays. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
distinct()
Description: List the distinct values of an array, removing duplicates.
Usage: any distinct( any ) Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
jaql> distinct( [1, 1d, 1m, 1.5d, 1.5m, 1.50d, 1.50m ] ) -> sort by [$]; [ 1,1.5 ]
enumerate()
Description: Takes as input an array of any type and returns an array of pairs, one pair per input value. Each pair lists the ordinal of the array value (i.e., its index in the array) along with the value itself.
Usage: [long, T, ] enumerate( T, * ) Parameters (1 - 2 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any) Output schema any Examples
jaql> enumerate( ["a", "b", "c"]); [ [ 0, "a"] , [1, "b"], [2, "c"] ]
exists()
Description: Usage : bool exists(any);
If the argument is null, return null.
If the argument is an empty array, return false.
If the argument is an array with at least one element, return true.
If the argument is not an array or a null, return true.
Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
jaql> exists(null); null
jaql> exists([]); false
jaql> exists([...]); true //when the array has at least one element (even a null)
jaql> exists(...); true //when the argument is not an array or a null
lag1()
Description: This function is deprecated and should not be used.
lag1(arr): arr is [ A ], returns [ {prev: A, cur: A} ]. If arr has k items, the result has k - 1 items. result.prev is the first k-1 items; result.cur is the last k-1 items. E.g.: [1,2,3] -> lag1() == [ { prev: 1, cur: 2 }, { prev: 2, cur: 3 } ] Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any
nextElement()
Description: Given an array in, this function associates the next element with each element of in.
Usage: { cur: any, next?: any } nextElement( array in ) Note: If in has k items, the result has k items. The record returned for the last element does not have a next field. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
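The shape of the nextElement() result, in particular the missing next field on the last record, can be seen in a short model. This is a Python sketch of the documented output, not Jaql; records are represented as dictionaries and the function name is chosen to mirror the Jaql one.

```python
def next_element(items):
    """Model of nextElement(): one { cur, next? } record per input element."""
    items = list(items)
    out = []
    for i, cur in enumerate(items):
        record = {"cur": cur}
        if i + 1 < len(items):
            record["next"] = items[i + 1]  # the last element gets no next field
        out.append(record)
    return out


pairs = next_element([1, 2, 3])
```

For an input of k items the model returns k records, matching the note above; prevElement() and prevAndNextElement() follow the same pattern with a prev field that is absent on the first record.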
pair()
Description: Combines two values into an array.
Usage: array pair( any , any ); The arguments can be any type of data; each of them will be one element of the returned array. Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
pairwise()
Description: Combine two arrays (A,B) into one array C. Assume A = [a1,a2,a3, ...] and B = [b1,b2,b3, ...]; pairwise combines the elements at the same position in each array and produces C = [ [a1,b1], [a2,b2], [a3,b3], ... ].
Usage: array pairwise( array A , array B ); Parameters (2 - 3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2 = null: schema any)... Output schema any Examples
jaql> pairwise([1,2],[3,4]); [ [1,3], [2,4] ]
powerset()
Description: This function returns the power-set of a list of items.
Usage : [T...... ] powerset([T...]) Output: JSON array containing the power-set of the input items Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
jaql> [1,2,3] -> powerset() == [ [], [1], [2], [1,2], [3], [1,3], [2,3], [1,2,3] ]
jaql> ['a', 'b', 'c'] -> powerset() == [ [], ['a'], ['b'], ['a','b'], ['c'], ['a','c'], ['b','c'], ['a','b','c'] ]
prevAndNextElement()
Description: Given an array in, this function associates the previous and the next element with each element of in.
Usage: { cur: any, prev?: any, next?: any } prevAndNextElement( array in ) Note: If in has k items, the result has k items. The record returned for the first element does not have a prev field. The record returned for the last element does not have a next field. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
prevElement()
Description: Given an array in, this function associates the previous element in the array in with each element of in.
Usage: { cur: any, prev?: any } prevElement( array in ) Note: If in has k items, the result has k items. The record returned for the first element does not have a prev field. Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
removeElement()
Description: Remove the element at the given position from an array.
Usage: array removeElement( array arr , int position); Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
jaql> removeElement([1,2,3],0); [ 2,3 ]
replaceElement()
Description: Replace an element of the target array with a given value.
Usage : array replaceElement( array arr , int position, value v ); Parameters (3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2, required: schema any) Output schema any Examples
jaql> replaceElement([1,2,3],2,100); [ 1,2,100 ]
reverse()
Description: Reverse an array.
Usage: array reverse(array arr) Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
jaql> range(1,10) -> reverse(); [ 10,9,8,7,6,5,4,3,2,1 ]
jaql> [[0],[1,2],[3,4,5],[6,7,8,9]] -> transform reverse($)->reverse(); [ [9,8,7,6] , [5,4,3] , [2,1], [0] ] // reverse sequence
shift()
Description: This function is deprecated. Do not use! Parameters (3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2, required: schema any) Output schema any
slice()
Description: slice(array, startIndex, stopIndex) returns the elements of array starting at position startIndex up to the position stopIndex. It is equivalent to array[startIndex:stopIndex], but it captures a simpler case that does not use path expressions. array[startIndex:stopIndex] is transformed to use the slice() function for better performance.
Note: slice() is zero-based. Usage: array slice( array, long, long ) Parameters (3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2, required: schema any) Output schema any Examples
slidingWindow()
Description: Given an array input and two predicates start and end, this function associates each element curElem in input with a sub-array of input, called a window, which is computed using the predicates start and end. The window is a "sliding" window since for each curElem the window is computed relative to curElem, potentially starting at curElem or any position after curElem and ending at any position at startElem or after startElem. The window is said to be sliding across the input array.
Usage
In order to find the sliding window of each element curElem = input[curIndex] in input, slidingWindow() repeatedly evaluates the predicate start() starting from curElem until start() evaluates to true or the end of the array is reached:
Note: slidingWindow() computes its window by value and not by position in the array. See slidingWindowBySize() for a positional version of slidingWindow(). Note: slidingWindow() is optimized as a streaming operation. The memory used by slidingWindow() is proportional to the size of the window, not the input array. Parameters (3 inputs) Input Types: ( input, required: schema [ * ]?),( start, required: schema function),( end, required: schema function) Output {{{schema [{ }]}}} Examples
slidingWindowBySize()
Description: Given an array input and two longs size and offset, this function associates each element curElem in input with a sub-array of input, called a window, which is computed using size and offset. The window is a "sliding" window since for each curElem the window is computed relative to curElem. The window is said to be sliding across the input array.
Usage
input[curIndex + offset : curIndex + offset + size] The optional parameter exact can be specified by callers to indicate whether only windows of size size are returned. Note: slidingWindowBySize() computes its window in a positional way. See slidingWindow() for a value-based version of slidingWindowBySize(). Note: slidingWindowBySize() is optimized as a streaming operation. The memory used by slidingWindowBySize() is proportional to the size of the window, not the input array. Parameters (2 - 4 inputs) Input Types: ( input, required: schema [ * ]?),( size, required: schema long),( offset = null: schema long?),( exact = false: schema boolean) Output {{{schema [{ }]}}} Examples
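The positional window can be modeled as a slice relative to the current index. The following Python sketch rests on several assumptions that the text above does not settle: the window is treated as the half-open range from curIndex + offset with exactly size elements, an unspecified offset is taken to be 0, the result fields are named cur and window, and positions outside the array are simply clipped. It is not Jaql code.

```python
def sliding_window_by_size(items, size, offset=0, exact=False):
    """Model: pair each element with input[cur + offset : cur + offset + size]."""
    items = list(items)
    out = []
    for cur_index, cur in enumerate(items):
        start = cur_index + offset
        # Clip negative positions so Python does not wrap around the array end.
        lo, hi = max(start, 0), max(start + size, 0)
        window = items[lo:hi]
        if exact and len(window) != size:
            continue  # exact: only windows of the full size are returned
        out.append({"cur": cur, "window": window})  # field names are assumptions
    return out


trailing = sliding_window_by_size([1, 2, 3, 4], size=2)
full_only = sliding_window_by_size([1, 2, 3, 4], size=2, exact=True)
```

In this model the last element of `trailing` has a one-element window because the array ends, and `full_only` omits that record.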
toArray()
Description: This function wraps the input into a JSON array. In case the input is a JSON array or null, this function simply returns the input.
Usage : T1 toArray(T1) Output: A JSON array that wraps the input object Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
tumblingWindow()
Description: Given an input array input, a start and a stop predicate, tumblingWindow() computes non-overlapping sub-arrays, also called windows, of input. Starting from the current index in input or the beginning of input, tumblingWindow() probes the predicates start and stop to find an appropriate window to return. Once that window has been determined, it is returned and the current index is advanced past the last element in the returned window. Elements contained in this window are not returned again. The window has "tumbled".
Usage
boolean stop( any prev, any first, any last, any next, long size ) and start is a function defined with the following signature boolean start( any prev, any first, any size ) Note: tumblingWindow() uses a call-by-name model to probe start and stop. That means that the names of the parameters of the user-provided predicates have to match the parameter names provided here. In order to find each tumbling window in input, tumblingWindow() keeps the state of the last element it looked at and the current element. The last element may be null if the first window is being determined. Then, tumblingWindow() repeatedly evaluates the predicate start() beginning at the current element until start() evaluates to true or the end of the array is reached:
If firstGroup is set to false, the first window is not returned. If lastGroup is set to false, the last window is not returned. Parameters (2 - 5 inputs) Input Types: ( input, required: schema [ * ]?),( stop, required: schema function),( start = null: schema function?),( firstGroup = true: schema boolean?),( lastGroup = true: schema boolean?) Output {{{schema [*]}}} Examples
tumblingWindowBySize()
Description: Given an input array input and a size parameter, tumblingWindowBySize() returns non-overlapping sub-arrays, also called tumbling windows, of input of size size.
Usage [ any ] tumblingWindowBySize( any input, long size, boolean lastGroup = true ) If lastGroup is set to false, the last (possibly incomplete) group is not returned. Parameters (2 - 3 inputs) Input Types: ( input, required: schema [ * ]?),( size, required: schema long | double | decfloat),( lastGroup = true: schema boolean?) Output schema [ * ] Examples
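Fixed-size tumbling windows amount to chunking the input. The Python sketch below models that; it is not Jaql, it assumes size is a positive integer, and it reads the lastGroup rule literally, dropping the final window whenever lastGroup is false (whether a complete final window is also dropped in Jaql is not stated above).

```python
def tumbling_window_by_size(items, size, last_group=True):
    """Model of tumblingWindowBySize(): consecutive, non-overlapping chunks."""
    items = list(items)
    windows = [items[i:i + size] for i in range(0, len(items), size)]
    if not last_group and windows:
        windows.pop()  # literal reading: the last group is not returned
    return windows


all_windows = tumbling_window_by_size([1, 2, 3, 4, 5], 2)
without_last = tumbling_window_by_size([1, 2, 3, 4, 5], 2, last_group=False)
```

With five elements and size 2, the final window holds a single element; setting `last_group=False` removes it from the result.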
union()
Description: This function unions multiple JSON arrays into one JSON array in arbitrary order without removing duplicates (like SQL's UNION ALL).
Usage T1 union(JSON array1, JSON array2, ...) Input Parameters: An arbitrary number of JSON arrays Output: A JSON array containing the union of all input arrays Parameters (1 - 2 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any)... Output schema any Examples
index
keyLookup()
Description: This function performs a left outer join on two sets of key/value pairs. It works similarly to a hash join in relational databases. In the first step, the function builds a hash table on the inner key/value pairs (expr1). Then, for each key/value pair in the outer pairs (expr0), it returns a [key, value1, value2] tuple.
Usage : [key,value1 <outer key/value pairs> ] -> keyLookup([key,value2 < inner key/value pairs > ]) ==> [key, value1, value2 ] Input: - JSON array of outer key/value pairs Input Parameter: - JSON array of inner key/value pairs Output: JSON array of joined key/value pairs. If the outer key does not exist in the inner set, null is returned for the inner value, so the outer input is preserved (left outer join). Note: The function assumes that the inner keys are unique; otherwise an arbitrary value is kept. Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
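The build-then-probe steps described above can be sketched with a dictionary standing in for the hash table. This is a Python model of the semantics, not Jaql; the function name `key_lookup` is illustrative, and for duplicate inner keys this sketch happens to keep the last value, which is one possible outcome of the "arbitrary value" rule.

```python
def key_lookup(outer, inner):
    """Model of keyLookup(): hash-based left outer join of [key, value] pairs."""
    table = {}
    for key, value in inner:  # build phase: hash table on the inner pairs
        table[key] = value
    # Probe phase: every outer pair is kept; a missing inner key yields null.
    return [[key, value1, table.get(key)] for key, value1 in outer]


joined = key_lookup(
    outer=[[1, "a"], [2, "b"], [3, "c"]],
    inner=[[1, "x"], [3, "z"]],
)
```

Key 2 has no inner match, so its tuple carries `None` as the inner value while the outer pair is preserved.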
keyMerge()
Description: Given two arrays of key-value pairs ( sub-arrays ) a1 and a2, this function returns an array resulting from merging a1 and a2 based on the key of each pair.
Note: The input arrays a1 and a2 must be sorted on key. The behavior of this function is undefined if there are duplicate keys within a2. Usage Let k be of type T: [ T, any, any ] keyMerge( [ T key1, any value1 ] a1, [ T key2, any value2 ] a2 ) For each key/value1 pair in a1, find the key/value2 pair in a2 and add [ key, value1, value2 ] to the result array, which is returned upon the end of a1.
This function only requires a single key from each array to be in memory at a time. Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
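Because both inputs are sorted on key, the merge can advance through a2 with a single cursor, which is why only one key per array needs to be held at a time. The Python sketch below models this; it is not Jaql, and it assumes that a key of a1 with no match in a2 produces a null value2 (the text above does not say what happens in that case).

```python
def key_merge(a1, a2):
    """Model of keyMerge(): single-pass merge join of two key-sorted pair arrays."""
    cursor = iter(a2)
    current = next(cursor, None)  # the only a2 pair held in memory
    out = []
    for key, value1 in a1:
        # Skip a2 pairs whose key is smaller than the current a1 key.
        while current is not None and current[0] < key:
            current = next(cursor, None)
        matched = current is not None and current[0] == key
        out.append([key, value1, current[1] if matched else None])  # None: assumption
    return out


merged = key_merge(
    [[1, "a"], [2, "b"], [2, "bb"], [4, "d"]],
    [[0, "w"], [2, "y"], [3, "z"], [4, "q"]],
)
```

The cursor is not advanced on a match, so repeated keys in a1 (here key 2) all pair with the same a2 value.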
probeLongList()
Description: probeLongList() probes an array of longs (the build keys) using an array of key/value pairs called input. probeLongList() returns an array of size size(input): For each key/value pair in input, probeLongList() represents the lookup result as a number index in the following way:
- index is >= 0 if the value is found in the list of keys (it is the index of the key in the sorted list of keys, but that may change in the future)
- index < 0 if not found (it is (-(insertion point) - 1) as defined by Arrays.binarySearch(), but that may change in the future)
probeLongList() returns an array of tuples that pair up the key, the value, and the index. Usage [ long key, any value, long index ] probeLongList( [ long? key, any value ], long? key )
probeLongList() builds a compact in-memory representation of an array of longs of build keys. Note that all probe items are returned. This allows us to support in and not-in predicates, as well as just simple annotations. Nulls are tolerated in the probe keys, but they will never find a match. Null [key,value] pairs are not tolerated; a pair is always expected. There is currently an implementation limit of 2B values (~16GB of memory). Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
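The index encoding follows the convention of Java's Arrays.binarySearch(), which can be reproduced with Python's bisect module. The sketch below is a model of the described result, not Jaql's implementation; it assumes duplicate build keys are collapsed before sorting, and it does not model null probe keys.

```python
from bisect import bisect_left


def probe_long_list(pairs, build_keys):
    """Model of probeLongList(): annotate each [key, value] pair with a search index."""
    keys = sorted(set(build_keys))  # assumption: duplicates collapse in the build list
    out = []
    for key, value in pairs:
        pos = bisect_left(keys, key)  # insertion point when the key is absent
        found = pos < len(keys) and keys[pos] == key
        out.append([key, value, pos if found else -pos - 1])
    return out


probed = probe_long_list(
    [[10, "a"], [15, "b"], [5, "c"], [40, "d"]],
    build_keys=[30, 10, 20],
)
```

An "in" predicate then keeps the tuples whose index is >= 0, and a "not-in" predicate keeps those whose index is negative.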
schema
check()
Description: This function checks whether the first argument matches the schema given in the second argument. If so, the function returns the first argument. Otherwise, it throws an exception.
Usage : T1 check(T1 val1, schema any) Input Parameters: A JSON object that is to be verified against the schema. Output: The JSON object is returned in case the verification was successful. Parameters (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any) Output schema any Examples
typeof()
Description: This function returns the type of a JSON object.
Usage : string typeof(T1 val1) Input Parameters: Any JSON object Output: The string value of the JSON object type Parameters (1 inputs) Input Types: ( arg0, required: schema any) Output schema any Examples
xml
== jsonToXml() ==
_*Description*_ An expression for converting JSON to XML. It is called as follows: jsonToXml(). It is the counterpart of xmlToJson(), but it does not perform the exact reverse of the conversion done by xmlToJson(). The reason is:
Only a JSON value satisfying the following conditions can be converted to XML: An array nested in another array does not inherit the nesting array. For
example,
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any) _*Output*_ schema any
== typedXmlToJson() ==
_*Description*_ This function converts an XML string to a JSON object and tries to preserve the type information. Usage : T1 typedXmlToJson(string <XML string>) Output: A JSON object that represents the XML string. This function is similar to xmlToJson, except that it creates typed data, i.e., instead of producing all values as strings, it tries to cast each value to the closest type.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any) _*Output*_ schema any
_*Examples*_
// Typed conversion creates a string and a long for the two elements of the array
jaql> typedXmlToJson("<?xml version=\"1.0\" encoding=\"UTF-8\"?><array><value>test</value><value>1</value></array>");
{
"array":
{
"value":
[
"test",
1
]
}
}
// Typed conversion creates data values that match the type information
jaql> typedXmlToJson("<?xml version=\"1.0\" encoding=\"UTF-8\"?><array><value type=\"string\">test</value><value type=\"long\">2</value></array>");
{
"array": {
"value": [
{
"@type": "string",
"text()": "test"
},
{
"@type": "long",
"text()": 2
} ]
}
}
== xmlToJson() ==
_*Description*_ This function converts an XML string to a JSON object; all values are created with string type.
Usage: T1 xmlToJson(string <XML string>)
Output: A JSON object that represents the XML string.
This function is similar to typedXmlToJson, except that it creates non-typed data, i.e., all values are created with string type.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
// Non-typed conversion creates a string for a long value
jaql> xmlToJson("<?xml version=\"1.0\" encoding=\"UTF-8\"?><array><value>test</value><value>1</value></array>");
{
"array":
{
"value":
[
"test",
"1"
]
}
}
// Non-typed conversion keeps every value as a string, regardless of the type attributes
jaql> xmlToJson("<?xml version=\"1.0\" encoding=\"UTF-8\"?><array><value type=\"string\">test</value><value type=\"long\">2</value></array>");
{
"array": {
"value": [
{
"@type": "string",
"text()": "test"
},
{
"@type": "long",
"text()": "2"
} ]
}
}
== xpath() ==
_*Description*_ This function runs an XPath expression on an XML document.
Usage: [T1] xpath(string <XML string>, string <xpath>, string <namespace>)
Output: A JSON array containing the result of the XPath filter.
_*Parameters*_ (2 - 3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2 = null: schema any)
_*Output*_ schema any
_*Examples*_
jaql> xpath("<?xml version=\"1.0\" encoding=\"UTF-8\"?><record><content type=\"record\"><city>Beijing</city><no type=\"array\"><value type=\"long\">1</value><value type=\"long\">2</value><value type=\"long\">3</value></no></content></record>",
"record/content/city");
[
{
"city": "Beijing"
}
]
== xslt() ==
_*Description*_ This function runs an XSLT transformation on an XML document.
Usage: {T1} xslt(string <XML string>, string <xslt>)
Output: A JSON record holding the result of the XSLT transformation.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> xslt("<?xml version=\"1.0\" encoding=\"UTF-8\"?><?xml-stylesheet type=\"text/xsl\" ?>
<record><content type=\"record\"><city>Beijing</city><no type=\"array\"><value type=\"long\">1</value><value type=\"long\">2</value></no></content></record>",
"<?xml version=\"1.0\" ?><xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">
<xsl:template match=\"city\"> <p><city><xsl:value-of select=\".\"/></city></p> </xsl:template> </xsl:stylesheet>");
{
"p":
{
"city": "Beijing"
}
}
----
=regex=
== regex() ==
_*Description*_ Create a regular expression (regex).
Usage: regex regex(string reg)
regex(string reg) defines a regular expression specified by a string; the regular-expression constructs comply with standard Java.
_*Parameters*_ (1 - 2 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any)
_*Output*_ schema any
_*Examples*_
jaql> reg = regex("[a-z]+"); regex_match(reg,"abc bcd");
["abc"]
== regex_extract() ==
_*Description*_ Capture the first substring that matches each group specified in the regular expression (a group is a pair of parentheses used to group subpatterns). Returns a string array like: ["match_group1", "match_group2", "match_group3" ...]
Usage: [string] regex_extract(regex reg, string text)
reg is the regular expression, text is the target string.
For example, the regular expression (a(b*))+(c*) contains 3 groups:
group 1: (a(b*))
group 2: (b*)
group 3: (c*)
If the input is "abbabcd", regex_extract captures the substring matched by each group (1-3) and returns a string array like [ "ab", "b", "c"], where "ab" is the capture of group 1 (because group 1 is repeated by +, the capture of its last repetition is kept), "b" is the capture of group 2, and "c" is the capture of group 3.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> regex_extract(regex("(a(b*))+(c*)"),"abbabcd");
[ "ab", "b", "c"]
jaql> regex_extract(regex("(a(b*))"),"abbabcd");
[ "abb", "bb"]
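As a further sketch of the group-capture behavior described above, the following call extracts two numeric groups from a string. The pattern and the input text are made up for illustration, and the result noted in the comment is what the description implies, not captured interpreter output.

```jaql
// Hypothetical input: group 1 captures the year, group 2 the month
jaql> regex_extract(regex("([0-9]+)-([0-9]+)"), "order 2000-01");
// by the rules above, group 1 should capture "2000" and group 2 should capture "01"
```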
== regex_extract_all() ==
_*Description*_ Capture all the substrings that match each group specified in the regular expression (a group is a pair of parentheses used to group subpatterns). Returns an array of string arrays like [[match1_group1, match1_group2 ...], [match2_group1, match2_group2] ... ]
Usage: [[string]] regex_extract_all(regex reg, string text)
reg is the regular expression, text is the target string.
For example, the regular expression (a(b*)) contains 2 groups:
group 1: (a(b*))
group 2: (b*)
If the input is "abbabcd", regex_extract_all(regex("(a(b*))"),"abbabcd") captures the substrings matched by each group (1-2) for every match and returns [ ["abb","bb"], ["ab","b"] ], where "abb" and "bb" are the first match of groups 1 and 2 when scanning the text, and "ab" and "b" are the second (last) match.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> regex_extract_all(regex("(a(b*))+(c*)"),"abbabcd");
[
[ "ab", "b", "c"]
]
jaql> regex_extract_all(regex("(a(b*))"),"abbabcd");
[
["abb","bb"],
["ab","b"]
]
== regex_match() ==
_*Description*_ Returns the first substring of the input that matches the regular expression.
Usage: regex_match(regex reg , string text)
reg is the regular expression, text is the target string.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> regex_match(regex("[a-z]?"),"abbabcd");
"a" // "?" matches at most one character
jaql> regex_match(regex("[a-z]*"),"abbabcd");
"abbabcd" // "*" greedily matches as many characters as possible
== regex_spans() ==
_*Description*_ Match a subset of the input and return a [begin,end] pair that indexes into the original string, where begin is the index of the first matched character and end marks where the match ends (in the examples below, the index of the last matched character).
Usage: [string] regex_spans(regex reg, string text);
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> regex_spans(regex("bcd"),"abbabcd");
[ span(4,6) ]
jaql> regex_spans(regex("[a-z]+"),"abbabcd");
[ span(0,6) ]
== regex_test() ==
_*Description*_ Check whether the target string contains a substring that matches the given regular expression. Returns true if at least one match exists, otherwise false.
Usage: bool regex_test(regex reg , string text)
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> regex_test(regex("[a-z]?"),"abbabcd");
true
jaql> regex_test(regex("aaa"),"abbabcd");
false
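Because regex_test returns a boolean, it can plausibly serve as a filter predicate. The sketch below uses made-up data and the filter syntax that appears in the avg() example of this list; the outcome noted in the comment follows from the description, not from a recorded run.

```jaql
// keep only the strings that contain at least one digit (made-up data)
jaql> ["abc", "123", "a1"] -> filter regex_test(regex("[0-9]+"), $);
// "123" and "a1" contain a digit, so they should pass the filter; "abc" should not
```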
----
=binary=
== base64() ==
_*Description*_ Convert an ASCII/UTF-8 base64 string into a binary string.
Usage: binary base64(string str)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> base64("utf8string");
hex('BAD7FCB2DAE20000')
== hex() ==
_*Description*_ Convert a hexadecimal string into a binary string.
Usage: binary hex(string str)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> hex("a00f");
hex('A00F')
----
=date=
== date() ==
_*Description*_ Parse a string into a date value.
Usage: date date(string datestr)
_*Parameters*_ (1 - 2 inputs) Input Types: ( arg0, required: schema any),( arg1 = null: schema any)
_*Output*_ schema any
_*Examples*_
jaql> date('2000-01-01T11:59:59Z');
date('2000-01-01T12:00:00.000Z');
== dateMillis() ==
_*Description*_ Represent the date in milliseconds.
Usage: long dateMillis(date d)
The argument must be of type date; any other type causes a bad-cast exception.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> dateMillis(date('2000-01-01T12:00:00Z'));
946728000000
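The value above is consistent with milliseconds counted from 1970-01-01T00:00:00Z: midnight on 2000-01-01 UTC lies 946684800000 ms after that instant, and the additional 12 hours add 43200000 ms. Assuming that origin (it is inferred from the example, not stated in the description), the following sketch would be a quick sanity check.

```jaql
// assuming the Unix epoch is the origin, this should evaluate to 0
jaql> dateMillis(date('1970-01-01T00:00:00Z'));
```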
== dateParts() ==
_*Description*_ Return a record that stores all readable fields of a date, including year, month, day, dayOfWeek, and so on.
Usage: record dateParts(date d)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> dateParts(date('2000-01-01T12:00:00Z'));
{
"day": 1,
"dayOfWeek": 6,
"hour": 12,
"millis": 946728000000,
"minute": 0,
"month": 1,
"second": 0,
"year": 2000,
"zoneOffset": 0
}
== now() ==
_*Description*_ Return the current system date and time.
Usage: date now()
_*Parameters*_ (0 inputs) Input Types: {{{}}}
_*Output*_ schema any
----
=nil=
== denull() ==
_*Description*_ Remove nulls from a given array.
Usage: [T] denull([T]);
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> denull( [1, null, 3] );
[ 1, 3 ]
----
=agg=
== any() ==
_*Description*_ This function picks any value from an input JSON array. If there is at least one non-null value, it picks a non-null value.
Usage: T1 any([T1])
Output: If one exists, this function returns any single non-null value from the input JSON array. Null is returned only if there is no non-null value in the array.
_*Parameters*_ (1 inputs) Input Types: ( a, required: schema [ * ]?)
_*Output*_ schema any
_*Examples*_
jaql> [null, 1, null]->any();
1
jaql> []->any();
null
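Where any() picks a single non-null value, denull() keeps all of them. A minimal sketch combining denull() with count(), both documented in this list, to count the non-null elements of the same array; the piping form mirrors the `->count()` usage shown elsewhere and is assumed to work for denull() as well.

```jaql
jaql> [null, 1, null] -> denull() -> count();
// denull removes the two nulls, so one element should remain
```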
== argmax() ==
_*Description*_ This function returns the element of a JSON array for which a given function yields the maximum value.
Usage: T1 argmax([T1] array1, schema function(T1));
Input:
- A JSON array array1, which is searched for the max value in the context of a function.
- A function that is applied to every element of the array before evaluating the maximum.
_*Parameters*_ (2 inputs) Input Types: ( a, required: schema [ * ]?),( f, required: schema function)
_*Output*_ schema any
_*Examples*_
jaql> argmax([-3,-2,-1], fn(v) (v));
-1
jaql> argmax([-3,-2,-1], fn(v) (v*v));
-3
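The same pattern should extend to records, with the function selecting the field to compare on. The records and field names below are hypothetical, and the field access `r.score` on a named function parameter is an assumption modeled on the `$.field` syntax used elsewhere in this list.

```jaql
// made-up records; the function extracts the value to maximize
jaql> argmax([{name: "a", score: 1}, {name: "b", score: 5}], fn(r) (r.score));
// the record with score 5 yields the largest function value, so it should be returned
```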
== argmin() ==
_*Description*_ This function returns the element of a JSON array for which a given function yields the minimum value.
Usage: T1 argmin([T1] array1, schema function(T1));
Input:
- A JSON array array1, which is searched for the min value in the context of a function.
- A function that is applied to every element of the array before evaluating the minimum.
_*Parameters*_ (2 inputs) Input Types: ( a, required: schema [ * ]?),( f, required: schema function)
_*Output*_ schema any
_*Examples*_
jaql> argmin([-3,-2,-1], fn(v) (v));
-3
jaql> argmin([-3,-2,-1], fn(v) (v*v));
-1
== array() ==
_*Description*_
Usage: [T1] array([T1])
Output:
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> [1]->array();
[
1
]
== avg() ==
_*Description*_ This function calculates the arithmetic mean (average) of a list of numbers.
Usage: long avg([long,...])
Input Parameters: A JSON array of numbers.
Output: The arithmetic mean over all numbers in the input array.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> [100, 500, 700] -> avg();
433
jaql> books = [
{publisher: 'Scholastic',
author: 'J. K. Rowling',
title: 'Chamber of Secrets',
year: 1999,
reviews: [
{rating: 10, user: 'joe', review: 'The best ...'},
{rating: 6, user: 'mary', review: 'Average ...'}]}
];
// Retrieves all the books with an average rating higher than 5
books-> filter avg($.reviews[*].rating) > 5 -> transform {$.author, $.title};
[
{
"author": "J. K. Rowling",
"title": "Chamber of Secrets"
}
]
== count() ==
_*Description*_ This function counts the number of elements in a JSON array.
Usage: long count([T1])
Output: The number of elements in the JSON array.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> [1,2,3]->count();
3
jaql> books = [
{publisher: 'Scholastic',
author: 'J. K. Rowling',
title: 'Chamber of Secrets',
year: 1999,
reviews: [
{rating: 10, user: 'joe', review: 'The best ...'},
{rating: 6, user: 'mary', review: 'Average ...'}]}
];
// Counts the number of books per publisher
books-> group by $p = ($.publisher) into {publisher: $p, num: count($)} -> sort by [$.publisher];
[
{
"publisher": "Scholastic",
"num": 1
}
]
== javauda() ==
_*Description*_ This function is a function constructor for a user-defined aggregate (function) that is written in Java.
Usage: function javauda( string className, any args )
className is a loadable class which implements the JavaUda interface. The result of javauda() is a function: any myJavaUda( array )
This function is declared experimental. Specifics to be done.
_*Parameters*_ (1 - 2 inputs) Input Types: ( class, required: schema string),( args = null: schema any)...
_*Output*_ schema function
== javaudacall() ==
_*Description*_
_*Parameters*_ (2 - 3 inputs) Input Types: ( array, required: schema [ * ]?),( class, required: schema string),( args = null: schema any)...
_*Output*_ schema any
== max() ==
_*Description*_ Find the max value in an array.
Usage: any max( [ any ] );
Max takes an array as input and returns the max value from the array. The type of the array's elements is not restricted.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> max([1,2,3]);
3
jaql> max(["a","b","c"]);
"c"
jaql> read(hdfs("someFileOfLongs")) -> group into max($);
jaql> read(hdfs("someFileOfPairs")) -> group by g = $[0] into { first: g, maxSecond: max($[*][1]) };
== min() ==
_*Description*_ Find the minimum value in an array.
Usage: any min( [ any ] );
Min takes an array as input and returns the minimum value from the array. The type of the array's elements is not restricted.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> min([1,2,3]);
1
jaql> min(["a","b","c"]);
"a"
jaql> read(hdfs("someFileOfLongs")) -> group into min($);
jaql> read(hdfs("someFileOfPairs")) -> group by g = $[0] into { first: g, minSecond: min($[*][1]) };
== pickN() ==
_*Description*_ Select N elements from an array.
Usage: [T] pickN( [T], long n );
Select n elements from the input array.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> pickN( [1,2,3], 2 )
[1,2]
== singleton() ==
_*Description*_ Ensure that an array has only one element; otherwise, throw an exception.
Usage: T singleton( [T] );
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> [1] -> singleton
1
jaql> [1,2] -> singleton // throws an exception
== sum() ==
_*Description*_ Compute the sum of an array of numbers.
Usage: number sum( [ number ] );
Note that sum is currently evaluated using a sequential plan. To get parallelism, use group by:
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> read(hdfs("someNumbers")) -> group into sum($);
== topN() ==
_*Description*_ Compute the top N values from an array.
Usage: [T] topN( [T], long n, cmp(x) );
Given an input array, a limit n, and a comparator function, compute the top n elements of the input array. This implementation uses a heap to use memory efficiently and to lower the network traffic that is needed for aggregation.
_*Parameters*_ (3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> [1,2,3] -> write(hdfs("test1"));
jaql> read(hdfs("test1")) -> topN( 2, cmp(x) [x desc ] ); // Simplest example
[ 3, 2 ]
jaql> read(hdfs("test1")) -> group into topN( $, 2, cmp(x) [ x desc ] ); // Now, with group by (this uses map-reduce)
jaql> [ [ 1, 1 ], [1, 2], [2, 0], [2, 11], [3, 3], [3, 4], [3, 5] ] -> write(hdfs("test2"));
jaql> read(hdfs("test2")) -> group by n = $[0] into { num: n, top: topN($[*][1], 1, cmp(x) [ x desc ]) }; // Complex data
== uda() ==
_*Description*_ This function is a function constructor for a user-defined aggregate (function).
Usage: function uda( function init, function accumulate, function combine, function final )
The result of uda() is a function: any myUda( array )
This function is declared experimental. Specifics to be done.
_*Parameters*_ (4 inputs) Input Types: ( init, required: schema function),( accumulate, required: schema function),( combine, required: schema function),( final, required: schema function)
_*Output*_ schema function
== udacall() ==
_*Description*_
_*Parameters*_ (5 inputs) Input Types: ( array, required: schema [ * ]?),( init, required: schema function),( accumulate, required: schema function),( combine, required: schema function),( final, required: schema function)
_*Output*_ schema any
----
=number=
== abs() ==
_*Description*_ Return the absolute value of a numeric value.
Usage: number abs(number)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> abs(-100);
100
jaql> abs(-3.14)
3.14
== decfloat() ==
_*Description*_ Construct a decfloat value.
Usage: decfloat decfloat( string | number )
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> decfloat(5m);
5m
jaql> decfloat("5");
5m
jaql> decfloat("-1.5e-5");
-0.000015m
jaql> 5m instanceof schema decfloat(value=5m);
true
== div() ==
_*Description*_ div(A,B) divides A by B and returns a numeric value.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> div(4,2);
2
== double() ==
_*Description*_ Get the double value of a numeric value.
Usage: double double(number A);
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> 22d instanceof schema double;
true
jaql> double(5);
5.0
jaql> double(5m);
5.0
jaql> double(5d);
5.0
== exp() ==
_*Description*_ Raise the base of the natural logarithm (e) to arg: e^a. Note that pow(x,y) = exp( y * ln(x) ).
Usage: decfloat | double exp( decfloat | double );
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> exp( 10 );
22026.465794806718
jaql> exp( 10m );
22026.46579480671789497137069702148m
== ln() ==
_*Description*_ Return the natural logarithm of a numeric value.
Usage: number ln(number)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
== long() ==
_*Description*_ Parse the given atom value to a long value.
Usage: long long(anyatom)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> long(3.14)
3
jaql> long(3)
3
jaql> long(true)
1
== mod() ==
_*Description*_ Return the modulus of a and b; both a and b are numeric values.
Usage: number mod(number a, number b)
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> mod(3,2)
1
== pow() ==
_*Description*_ Raise a number to a power.
Usage: number pow(number a , number b)
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> pow(3,2) // 3 raised to the power 2
== toNumber() ==
_*Description*_ Convert a value to a number.
Usage: number toNumber( e );
Currently, this function converts booleans and strings. If a number is given as input, it is returned verbatim.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
----
=string=
== convert() ==
_*Description*_ Converts an input value (string, array of strings, or record with string values) to the specified types.
Usage: T convert( string s, schema T );
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> convert( "1", schema long);
1
jaql> convert( { a: "1" }, schema { a: long } );
{
"a": 1
}
== endsWith() ==
_*Description*_ Test whether a string has a given suffix.
Usage: bool endsWith( string s, string suffix )
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
== json() ==
_*Description*_ Convert a string in JSON format into Jaql's data model.
Usage: T json( string json );
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> json( "[1,2,3]" );
[1,2,3]
== serialize() ==
_*Description*_ Return a string representation of any value.
Usage: string serialize( value );
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
== startsWith() ==
_*Description*_ Check whether a target string starts with a given prefix; returns true or false.
Usage: bool startsWith(string target, string prefix)
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
== strcat() ==
_*Description*_ Concatenates one or more strings into a new string.
Usage: string strcat(string ... str)
_*Parameters*_ (0 - 1 inputs) Input Types: ( arg0 = null: schema any)...
_*Output*_ schema any
== strJoin() ==
_*Description*_ Build a string that concatenates all the items, adding sep between each item. Nulls are removed, without any separator. If you want nulls, use firstNonNull(e,'how nulls appear').
Usage: string strJoin(array items, string sep)
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
== strLen() ==
_*Description*_ Return the length of the given string.
Usage: long strLen(string str)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
== strPos() ==
_*Description*_ This function returns the index of the first occurrence of the search string within a string.
Usage: long strPos(string <string>, string <search string>, long <startIndex = 0>)
Input Parameters:
string: the string that is searched within
search string: the string that is searched for
startIndex: the starting position of the search within string
Output: the index of the first occurrence of search string within string
_*Parameters*_ (2 - 3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2 = null: schema any)
_*Output*_ schema any
_*Examples*_
// Search for 'sample' within string.
jaql> strPos('This is a sample string', 'sample', 0);
10
// Only the first occurrence is reported
jaql> strPos('The This That These Those', 'Th', 0);
0
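The startIndex parameter is not exercised by the examples above. A minimal sketch, assuming startIndex is the position at which the search begins, as the parameter description says; the result in the comment follows from that description, not from a recorded run.

```jaql
// start searching at index 1, which skips the occurrence at index 0
jaql> strPos('The This That These Those', 'Th', 1);
// the next occurrence ("This") starts at index 4, so 4 is the expected result
```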
== strPosList() ==
_*Description*_ This function returns the indexes of all occurrences of the search string within a string.
Usage: [long] strPosList(string <string>, string <search string>, long <startIndex = 0>)
Input Parameters:
string: the string that is searched within
search string: the string that is searched for
startIndex: the starting position of the search within string
Output: A JSON array with the indexes of all occurrences of search string within string
_*Parameters*_ (2 - 3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2 = null: schema any)
_*Output*_ schema any
_*Examples*_
// Search for 'sample' within string.
jaql> strPosList('This is a sample string', 'sample', 0);
[
10
]
// All occurrences are reported
jaql> strPosList('The This That These Those', 'Th', 0);
[
0,
4,
9,
14,
20
]
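startsWith() and endsWith() are documented earlier in this list without examples. The following sketch uses a made-up file name; the results in the comments are what the descriptions imply and were not captured from an interpreter.

```jaql
jaql> startsWith("jaql-functions.json", "jaql");   // should be true
jaql> endsWith("jaql-functions.json", ".json");    // should be true
jaql> endsWith("jaql-functions.json", ".xml");     // should be false
```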
== strReplace() ==
_*Description*_ Replace every substring that matches the given regular expression (regex) with the replacement.
Usage: string strReplace(string val, regex pattern, string replacement)
_*Parameters*_ (3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> reg=regex("[a-z]+"); // define a regular expression that matches one or more lowercase letters
val = "<abc>,<bcd>,<cde>"; // define a string
strReplace(val,reg,"a"); // replace all matching substrings with "a"
"<a>,<a>,<a>"
== strSplit() ==
_*Description*_ Split a string with the given delimiter.
Usage: [string] strSplit(string val, string delimiter);
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> strSplit("a,b,c,d",",");
[ "a","b","c","d" ]
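strSplit() and strJoin() are natural inverses for a simple delimiter. A small sketch under that assumption, changing the delimiter of the string above; the result in the comment is derived from the two descriptions rather than observed.

```jaql
jaql> strJoin(strSplit("a,b,c,d", ","), "-");
// splitting on "," and joining with "-" should give "a-b-c-d"
```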
== strToLowerCase() ==
_*Description*_ Convert a string to lower case.
Usage: string strToLowerCase(string val)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> strToLowerCase("aBcDEFgHiJ");
"abcdefghij"
== strToUpperCase() ==
_*Description*_ Convert a string to upper case.
Usage: string strToUpperCase(string val)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> strToUpperCase("abcDEFgHijk");
"ABCDEFGHIJK"
== substring() ==
_*Description*_ Get a substring of a string, starting at beginIdx and ending at endIdx. If endIdx is not given or is larger than the length of the string, return the substring from beginIdx to the end of the string.
Usage: string substring(string val, int beginIdx, int endIdx ?);
_*Parameters*_ (2 - 3 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any),( arg2 = null: schema any)
_*Output*_ schema any
_*Examples*_
jaql> substring("I love the game", 2, 7);
"love"
jaql> substring("I love the game", 2);
"love the game"
jaql> substring("I love the game", 2, 20);
"love the game"
----
=function=
== fence() ==
_*Description*_ Evaluate a function in a separate process.
Usage: [T2] fence( [T1], T2 fn(T1 x) );
The fence function applies the function argument to each element of the input array to produce the output array. In particular, the function is evaluated in a separate process. A common use of fence is to shield the Jaql interpreter from user-defined functions that, for example, exhaust memory.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> [1,2,3] -> write(hdfs("test"));
jaql> read(hdfs("test")) -> fence( fn(i) i + 1 );
[2,3,4]
== javaudf() ==
_*Description*_ Construct a Jaql function from a given class.
Usage: fn javaudf( string className );
The javaudf function constructs a function that knows how to evaluate itself given a className that specifies its body. The function can then be assigned to a variable (like any other value) and invoked (like any other function). This is the primary means by which users can supply user-defined functions.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> split = javaudf("com.acme.extensions.fn.Split1"); // define the function and assign it to the variable split
jaql> path = '/home/mystuff/stuff';
jaql> split(path, "/"); // invoke the split function
----
=random=
== randomDouble() ==
_*Description*_ Return a uniformly distributed double value between 0.0 (inclusive) and 1.0 (exclusive).
Usage: double randomDouble( long? seed )
The optional seed parameter is used to seed the internally used random number generator.
Note: randomDouble will produce a pseudo-random sequence of doubles when called in sequence. If it is called by multiple processes in parallel (as done in MapReduce), then there are no guarantees (and in fact, if all sequential instances use the same seed, you'll get common prefixes).
_*Parameters*_ (0 - 1 inputs) Input Types: ( arg0 = null: schema any)
_*Output*_ schema any
== randomLong() ==
_*Description*_ Return a uniformly distributed long value.
Usage: long randomLong( long? seed )
The optional seed parameter is used to seed the internally used random number generator.
Note: randomLong will produce a pseudo-random sequence of longs when called in sequence. If it is called by multiple processes in parallel (as done in MapReduce), then there are no guarantees (and in fact, if all sequential instances use the same seed, you'll get common prefixes).
_*Parameters*_ (0 - 1 inputs) Input Types: ( arg0 = null: schema any)
_*Output*_ schema any
== registerRNG() ==
_*Description*_ Register a random number generator.
Usage: T registerRNG( T key, long seed );
Register an RNG with a given name, key, and a seed.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> registerRNG('r', fn() 17);
"r"
== sample01RNG() ==
_*Description*_ Return a uniformly distributed double value between 0.0 (inclusive) and 1.0 (exclusive) from a registered RNG.
Usage: double sample01RNG( string key )
An RNG associated with key must have been previously registered using registerRNG.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> registerRNG('r', fn() 17);
"r"
jaql> sample01RNG('r');
0.6973704783607497
== sampleRNG() ==
_*Description*_ Return a uniformly distributed long value from a registered RNG.
Usage: long sampleRNG( string key )
An RNG associated with key must have been previously registered using registerRNG.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> registerRNG('r', fn() 17);
"r"
jaql> sampleRNG('r');
-4937981208836185383
== uuid() ==
_*Description*_ Generate a type 4 UUID (random method).
Usage: binary uuid()
_*Parameters*_ (0 inputs) Input Types: {{{}}}
_*Output*_ schema any
_*Examples*_
jaql> uuid();
hex('389878514428442CAE1B86033A32F249')
----
=record=
== arity() ==
_*Description*_ Return the size of a record.
Usage: long arity(record r);
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> arity({a:1,b:2,c:3});
3
== fields() ==
_*Description*_ Convert each key-value pair of a record to a [key,value] array.
Usage: array fields(record r)
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> fields({a:1, b:2, c:3});
[ ["a",1] , ["b",2] , ["c",3] ]
jaql> fields({a:1, b:2, c:3}) -> transform $[0];
[ "a","b","c" ] // this example shows one way to extract all the keys of a record
== names() ==
_*Description*_ Extract all the keys in a record and return them as an array.
names($rec) == for $k,$v in $rec return $k == fields($rec)[*][0];
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> names({a:1, b:2, c:3});
[ "a","b","c" ]
== record() ==
_*Description*_ Convert an array to a single record.
Usage: record record(array arr);
The argument arr has the form [record1,record2,record3...]. Because a record cannot contain duplicate keys, this function asserts that record1, record2 ... have no keys in common.
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> record([{A:11},{B:22}]);
{
"A": 11,
"B": 22
}
== remap() ==
_*Description*_ Join two records.
Usage: record remap(record old, record new)
remap joins two records, old and new, into a new record and returns it. When a key appears in both records, the value from new replaces the value from old.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> remap({a:1,b:2},{a:3,d:4});
{
"a" : 3,
"b" : 2,
"d" : 4
}
== removeFields() ==
_*Description*_ Remove fields of a record by key.
Usage: record removeFields(record target, array names);
names is an array with one or more string key names; removeFields removes each field of the target record whose key appears in names.
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> removeFields({a:1,b:2},["a"]);
{
"b":2
}
== renameFields() ==
_*Description*_ Rename each field of the target record whose key equals oldName to newName.
Usage: record renameFields(record target, record {oldName : newName , ...} );
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> renameFields({a:1,b:2},{"a":"AAA" , "b":'BBB'});
{
"AAA": 1,
"BBB": 2
}
== replaceFields() ==
_*Description*_ Replace fields in oldRec with fields in newRec only if the field name exists in oldRec. Unlike remap, this only replaces existing fields.
Usage: record replaceFields( record old, record new);
_*Parameters*_ (2 inputs) Input Types: ( arg0, required: schema any),( arg1, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> replaceFields( { a: 1, b: 2 }, { a: 10, c: 3 } );
{
"a": 10,
"b": 2
}
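To make the contrast with remap concrete, the sketch below passes the same two records to both functions. The expected behavior in the comments is inferred from the two descriptions and their examples, not from a recorded run.

```jaql
jaql> remap({ a: 1, b: 2 }, { a: 10, c: 3 });
// remap should keep b, take a from the new record, and add c
jaql> replaceFields({ a: 1, b: 2 }, { a: 10, c: 3 });
// replaceFields should only overwrite a; c is dropped because it is not in the old record
```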
== values() ==
_*Description*_ Extract all the values in a record and return them as an array.
values($rec) == for $k,$v in $rec return $v == fields($rec)[*][1];
_*Parameters*_ (1 inputs) Input Types: ( arg0, required: schema any)
_*Output*_ schema any
_*Examples*_
jaql> values({a:1, b:2, c:3});
[ 1,2,3 ]
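The equivalence stated in the description can be spelled out with fields(), reusing the transform form from the fields() example. This is a sketch of that equivalence; the result in the comment is what the description implies.

```jaql
jaql> fields({a:1, b:2, c:3}) -> transform $[1];
// should match the values() result above: [ 1,2,3 ]
```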
----
|
The list of string functions are missing.