Morel language reference

This document describes the grammar of Morel (constants, identifiers, expressions, patterns, types, declarations), and then lists its built-in operators, types, and functions. It also describes the properties that affect the execution strategy and the behavior of the shell.

Query expressions (from, exists, forall) are described in more detail in the query reference.

Grammar

This reference is based on Andreas Rossberg's grammar for Standard ML. While the two documents use a different notation, they are similar enough that the two languages can be compared.

Differences between Morel and SML

Morel aims to be compatible with Standard ML. It extends Standard ML in areas related to relational operations. Some of the features in Standard ML are missing, just because they take effort to build. Contributions are welcome!

In Morel but not Standard ML:

  • Queries (expressions starting with exists, forall or from) with compute, distinct, except, group, intersect, into, join, order, require, skip, take, through, union, unorder, where, yield steps and in and of keywords
  • elem, implies, notelem binary operators
  • current, elements, ordinal nilary operators
  • typeof type operator
  • lab = is optional in exprow
  • record.lab as an alternative to #lab record; for tuples, tuple.1, tuple.2 etc. as an alternative to #1 tuple, #2 tuple
  • postfix method-call syntax exp.f () and exp.f arg, where f is a function whose first parameter is named self
  • identifiers and type names may be quoted (for example, `an identifier`)
  • with functional update for record values
  • overloaded functions may be declared using over and inst
  • (*) line comments (syntax as SML/NJ and MLton)
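A few of these extensions can be sketched as follows. The record and variable names (emp, deptno, pair) are illustrative only, and the values in comments follow from the semantics described in this list.

```sml
(*) A line comment runs to the end of the line.
val emp = {name = "Shaggy", deptno = 10};

(* Field access: the Standard ML selector form and the dot form. *)
val n1 = #name emp;
val n2 = emp.name;

(* Functional update: a copy of emp with one field replaced. *)
val emp2 = {emp with deptno = 20};

(* "lab =" may be omitted when the expression is a variable of that name. *)
val deptno = 30;
val emp3 = {name = "Velma", deptno};

(* Tuple fields by position. *)
val pair = (1, "two");
val second = pair.2;

(* List membership operators. *)
val isMember = 2 elem [1, 2, 3];       (* true *)
val isAbsent = 5 notelem [1, 2, 3];    (* true *)

(* A query over a list. *)
val doubled = from i in [1, 2, 3] where i > 1 yield i * 2;   (* [4, 6] *)
```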

In Standard ML but not in Morel:

  • word constant
  • longid identifier
  • references (ref and operators ! and :=)
  • exceptions (raise, handle, exception)
  • while loop
  • data type replication (type)
  • withtype in datatype declaration
  • abstract type (abstype)
  • structures (structure)
  • signature refinement (where type)
  • signature sharing constraints
  • local declarations (local)
  • operator declarations (nonfix, infix, infixr)
  • open
  • before and o operators

Constants

con → int                       integer
    | float                     floating point
    | char                      character
    | string                    string
int → [~]num                    decimal
    | [~]0xhex                  hexadecimal
float → [~]num.num              floating point
    | [~]num[.num]e[~]num
                                scientific
char → #"ascii"                 character
string → "ascii*"               string
num → digit digit*              number
hex → (digit | letter) (digit | letter)*
                                hexadecimal number (letters
                                may only be in the range A-F)
ascii → ...                     single non-" ASCII character
                                or \-headed escape sequence
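As an illustration of the constant forms above, here is a small sketch; the variable names are arbitrary.

```sml
val i = 42;             (* decimal int *)
val n = ~42;            (* negation is written ~, not - *)
val x = 3.14;           (* floating point *)
val y = 6.02e23;        (* scientific notation *)
val z = 1.5e~3;         (* a negative exponent also uses ~ *)
val c = #"a";           (* character *)
val s = "two\nlines";   (* string containing an escape sequence *)
```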

Identifiers

id → letter (letter | digit | ''' | _)*
                                alphanumeric
    | symbol symbol*            symbolic (not allowed for type
                                variables or module language
                                identifiers)
symbol → !
    | %
    | &
    | $
    | #
    | +
    | -
    | /
    | :
    | <
    | =
    | >
    | ?
    | @
    | \
    | ~
    | `
    | ^
    | '|'
    | '*'
var → '''(letter | digit | ''' | _)*
                                unconstrained
    | ''''(letter | digit | ''' | _)*
                                equality
lab → id                        identifier
    | num                       number (may not start with 0)

Expressions

exp → con                       constant
    | [ op ] id                 value or constructor identifier
    | exp1 exp2                 application
    | exp1 id exp2              infix application
    | '(' exp ')'               parentheses
    | '(' exp1 , ... , expn ')' tuple (n ≠ 1)
    | { [ exprow ] }            record
    | #lab                      record selector
    | '[' exp1 , ... , expn ']' list (n ≥ 0)
    | '(' exp1 ; ... ; expn ')' sequence (n ≥ 2)
    | let dec in exp1 ; ... ; expn end
                                local declaration (n ≥ 1)
    | exp . lab ()              postfix call (no argument)
    | exp1 . lab exp2           postfix call (with argument)
    | exp : type                type annotation
    | exp1 andalso exp2         conjunction
    | exp1 orelse exp2          disjunction
    | if exp1 then exp2 else exp3
                                conditional
    | case exp of match         case analysis
    | fn match                  function
    | current                   current element (only valid in a query step)
    | elements                  elements of current group (only valid in compute)
    | ordinal                   element ordinal (only valid in a query step)
    | exp1 over exp2            aggregate (only valid in compute)
    | from [ scan1 , ... , scans ] step1 ... stept [ terminalStep ]
                                relational expression (s ≥ 0, t ≥ 0)
    | exists [ scan1 , ... , scans ] step1 ... stept
                                existential quantification (s ≥ 0, t ≥ 0)
    | forall [ scan1 , ... , scans ] step1 ... stept require exp
                                universal quantification (s ≥ 0, t ≥ 0)
exprow → [ exp with ] exprowItem [, exprowItem ]*
                                expression row
exprowItem → [ lab = ] exp
match → matchItem [ '|' matchItem ]*
                                match
matchItem → pat => exp
scan → pat in exp [ on exp ]    iteration
    | pat = exp                 single iteration
    | val                       unbounded variable
step → distinct                 distinct step
    | except [ distinct ] exp1 , ... , expe
                                except step (e ≥ 1)
    | group exp1 [ compute exp2 ]
                                group step
    | intersect [ distinct ] exp1 , ... , expi
                                intersect step (i ≥ 1)
    | join scan1 , ... , scans  join step (s ≥ 1)
    | order exp                 order step
    | skip exp                  skip step
    | take exp                  take step
    | through pat in exp        through step
    | union [ distinct ] exp1 , ... , expu
                                union step (u ≥ 1)
    | where exp                 filter step
    | yield exp                 yield step
terminalStep → into exp         into step
    | compute exp               compute step
groupKey → [ id = ] exp
agg → [ id = ] exp [ of exp ]
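The following sketch exercises several of the expression forms above, including queries over in-memory lists. The names (nums, classify and so on) are illustrative; see the query reference for the authoritative description of each step.

```sml
val nums = [3, 1, 2, 3];

(* let, case, if and fn are as in Standard ML. *)
val area = let val w = 3 val h = 4 in w * h end;            (* 12 *)
fun classify n =
  case n of
    0 => "zero"
  | _ => if n < 0 then "negative" else "positive";
val inc = fn x => x + 1;

(* from with where and yield steps. *)
val big = from n in nums where n > 1 yield n * 10;          (* [30, 20, 30] *)

(* Two scans, then a join step with an "on" condition. *)
val pairs = from a in [1, 2], b in ["x", "y"] yield {a, b};
val sums = from a in [1, 2, 3] join b in [2, 3, 4] on a = b
  yield a + b;                                              (* [4, 6] *)

(* order, skip and take steps; DESC is the constructor listed
   under Built-in types. *)
val topTwo = from n in nums order DESC n take 2;
val middle = from n in nums skip 1 take 2;                  (* [1, 2] *)

(* Quantifiers return bool. *)
val hasBig = exists n in nums where n > 2;                  (* true *)
val allPositive = forall n in nums require n > 0;           (* true *)
```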

Patterns

pat → con                       constant
    | _                         wildcard
    | [ op ] id                 variable
    | [ op ] id [ pat ]         construction
    | pat1 id pat2              infix construction
    | '(' pat ')'               parentheses
    | '(' pat1 , ... , patn ')' tuple (n ≠ 1)
    | { [ patrow ] }            record
    | '[' pat1 , ... , patn ']' list (n ≥ 0)
    | pat : type                type annotation
    | id as pat                 layered
patrow → '...'                  wildcard
    | lab = pat [, patrow]      pattern
    | id [, patrow]             label as variable
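A short sketch of the pattern forms in function definitions follows. The function names are made up for illustration.

```sml
(* List patterns: empty list, one element, and infix construction. *)
fun describe [] = "empty"
  | describe [x] = "one element"
  | describe (x :: y :: rest) = "two or more";

(* Tuple, wildcard and constructor patterns. *)
fun valueOr (default, NONE) = default
  | valueOr (_, SOME x) = x;

(* Record pattern: "name" is a label used as a variable,
   "deptno = d" binds the field to d. *)
fun show {name, deptno = d} = name ^ " in " ^ Int.toString d;

(* Layered pattern: "whole" names the entire list while
   the inner pattern extracts its head. *)
fun dupHead (whole as (x :: _)) = x :: whole
  | dupHead [] = [];

val s = show {name = "Fred", deptno = 10};   (* "Fred in 10" *)
val d = dupHead [1, 2];                      (* [1, 1, 2] *)
```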

Types

typ → var                       variable
    | [ typ ] id                constructor
    | '(' typ [, typ ]* ')' id  constructor
    | '(' typ ')'               parentheses
    | typ1 -> typ2              function
    | typ1 '*' ... '*' typn     tuple (n ≥ 2)
    | { [ typrow ] }            record
    | typeof exp                expression type
typrow → lab : typ [, typrow]   type row

Declarations

dec → vals valbind              value
    | fun funbind               function
    | type typbind              type
    | datatype datbind          data type
    | signature sigbind         signature
    | over id                   overloaded name
    | empty
    | dec1 [;] dec2             sequence
valbind → pat = exp [ and valbind ]*
                                destructuring
    | rec valbind               recursive
    | inst valbind              overload instance
funbind → funmatch [ and funmatch ]*
                                clausal function
funmatch → funmatchItem [ '|' funmatchItem ]*
funmatchItem → [ op ] id pat1 ... patn [ : type ] = exp
                                nonfix (n ≥ 1)
    | pat1 id pat2 [ : type ] = exp
                                infix
    | '(' pat1 id pat2 ')' pat'1 ... pat'n [ : type ] = exp
                                infix (n ≥ 0)
typbind → [ vars ] id = typ [ and typbind ]*
                                abbreviation
datbind → datbindItem [ and datbindItem ]*
                                data type
datbindItem → [ vars ] id = conbind
conbind → conbindItem [ '|' conbindItem ]*
                                data constructor
conbindItem → id [ of typ ]
vals → val
    | '(' val [, val]* ')'
vars → var
    | '(' var [, var]* ')'
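The common declaration forms might look like the following sketch. The names are illustrative, and the over and inst forms are omitted here.

```sml
(* Value declaration with a destructuring pattern. *)
val (a, b) = (1, "one");

(* Recursive value declaration. *)
val rec fact = fn n => if n = 0 then 1 else n * fact (n - 1);

(* Clausal function declaration. *)
fun fib 0 = 0
  | fib 1 = 1
  | fib n = fib (n - 1) + fib (n - 2);

(* Type abbreviation and datatype. *)
type point = real * real;
datatype 'a tree = Leaf | Node of 'a * 'a tree * 'a tree;

(* A function over the datatype. *)
fun size Leaf = 0
  | size (Node (_, l, r)) = 1 + size l + size r;

val f5 = fact 5;                                     (* 120 *)
val n = size (Node (1, Leaf, Node (2, Leaf, Leaf)));  (* 2 *)
```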

Modules

sigbind → id = sig spec end [ and sigbind ]*
                                signature
spec → val valdesc              value
    | type typdesc              abstract type
    | type typbind              type abbreviation
    | datatype datdesc          data type
    | exception exndesc         exception
    | empty
    | spec1 [;] spec2           sequence
valdesc → id : typ [ and valdesc ]*
                                value specification
typdesc → [ vars ] id [ and typdesc ]*
                                type specification
datdesc → datdescItem [ and datdescItem ]*
                                datatype specification
datdescItem → [ vars ] id = conbind
exndesc → id [ of typ ] [ and exndesc ]*
                                exception specification

A signature defines an interface that specifies types, values, datatypes, and exceptions without providing implementations. Signatures are used to document module interfaces and, in future versions of Morel, will be used to constrain structure implementations.

Signature declarations appear at the top level (see grammar in Declarations).

Specifications

A signature body contains specifications that describe the interface:

Value specifications declare the type of a value without defining it:

val empty : 'a stack
val push : 'a * 'a stack -> 'a stack

Type specifications can be abstract (no definition) or concrete (type alias):

type 'a stack              (* abstract type *)
type point = real * real   (* concrete type alias *)
type ('k, 'v) map          (* abstract with multiple params *)

Datatype specifications describe algebraic datatypes:

datatype 'a tree = Leaf | Node of 'a * 'a tree * 'a tree

Exception specifications declare exceptions:

exception Empty                  (* exception without payload *)
exception QueueError of string   (* exception with payload *)

Examples

A simple signature with abstract type and value specifications:

signature STACK =
sig
  type 'a stack
  exception Empty
  val empty : 'a stack
  val isEmpty : 'a stack -> bool
  val push : 'a * 'a stack -> 'a stack
  val pop : 'a stack -> 'a stack
  val top : 'a stack -> 'a
end

Multiple signatures declared together using and:

signature EQ =
sig
  type t
  val eq : t * t -> bool
end
and ORD =
sig
  type t
  val lt : t * t -> bool
  val le : t * t -> bool
end

Current Limitations

The current implementation supports parsing and pretty-printing signatures but does not yet support:

  • Structure declarations that implement signatures
  • Signature refinement (where type)
  • Signature sharing constraints
  • Signature inclusion (include)

These features may be added in future versions.

Notation

This grammar uses the following notation:

Syntax             Meaning
symbol             Grammar symbol (e.g. con)
keyword            Morel keyword (e.g. if) and symbol (e.g. ~, "(")
[ term ]           Option: term may occur 0 or 1 times
[ term1 | term2 ]  Alternative: term1 may occur, or term2 may occur, or neither
term*              Repetition: term may occur 0 or more times
's'                Quotation: symbols used in the grammar, such as ( ) [ ] | * ..., are quoted when they appear in the Morel language

Built-in operators

Operator  Precedence  Meaning
*         infix 7     Multiplication
/         infix 7     Division
div       infix 7     Integer division
mod       infix 7     Modulo
+         infix 6     Plus
-         infix 6     Minus
^         infix 6     String concatenate
~         prefix 6    Negate
::        infixr 5    List cons
@         infixr 5    List append
<=        infix 4     Less than or equal
<         infix 4     Less than
>=        infix 4     Greater than or equal
>         infix 4     Greater than
=         infix 4     Equal
<>        infix 4     Not equal
elem      infix 4     Member of list
notelem   infix 4     Not member of list
:=        infix 3     Assign
o         infix 3     Compose
andalso   infix 2     Logical and
orelse    infix 1     Logical or
implies   infix 0     Logical implication
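The effect of the precedence levels listed above can be seen in a few small expressions. This is only a sketch; the values in comments follow from the precedences and associativities in the table.

```sml
val a = 1 + 2 * 3;                 (* 7: * (7) binds tighter than + (6) *)
val b = 7 div 2;                   (* 3 *)
val c = 7 mod 2;                   (* 1 *)
val d = 1 :: 2 :: [3] @ [4];       (* [1, 2, 3, 4]: :: and @ associate to the right *)
val e = "ab" ^ "cd" = "abcd";      (* true: ^ (6) binds tighter than = (4) *)
val f = 1 < 2 andalso 2 < 3;       (* true: comparisons bind tighter than andalso *)
val g = false andalso true orelse true;   (* true: andalso binds tighter than orelse *)
val h = 1 > 2 implies false;       (* true: a false premise implies anything *)
```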

abs is a built-in function (not an operator, because it uses function syntax rather than prefix or infix syntax). It is overloaded: its type is int -> int when applied to an int argument, and real -> real when applied to a real argument. It is equivalent to Int.abs and Real.abs respectively.
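For example, the overloading described above resolves on the argument type:

```sml
val i = abs ~3;                  (* 3, uses the int -> int instance *)
val r = abs ~2.5;                (* uses the real -> real instance *)
val same = Int.abs ~3 = abs ~3;  (* true *)
```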

Built-in types

Primitive: bool, char, int, real, string, unit

Datatype:

  • datatype 'a descending = DESC of 'a (in structure Relational)
  • datatype ('l, 'r) either = INL of 'l | INR of 'r (in structure Either)
  • datatype 'a list = nil | :: of 'a * 'a list (in structure List)
  • datatype 'a option = NONE | SOME of 'a (in structure Option)
  • datatype 'a order = LESS | EQUAL | GREATER (in structure General)
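A brief sketch of how some of these datatypes appear in ordinary code; the variable names are illustrative.

```sml
val xs = 1 :: 2 :: nil;                    (* the same list as [1, 2] *)

(* Pattern matching on the list constructors yields an option. *)
val head =
  case xs of
    nil => NONE
  | x :: _ => SOME x;                      (* SOME 1 *)

(* Int.compare returns a value of type order. *)
val ord = Int.compare (1, 2);              (* LESS *)
```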

Eqtype:

  • eqtype 'a bag = 'a bag (in structure Bag)
  • eqtype 'a vector = 'a vector (in structure Vector)

Exception:

  • Bind (in structure General)
  • Chr (in structure General)
  • Div (in structure General)
  • Domain (in structure General)
  • Empty (in structure List)
  • Error (in structure Interact)
  • Option (in structure Option)
  • Overflow (in structure General)
  • Size (in structure General)
  • Subscript (in structure General)
  • Unordered (in structure IEEEReal)

Structures

Structure Description
Bag Unordered collection of elements with duplicates.
@, all, app, collate, concat, drop, exists, filter, find, fold, fromList, getItem, hd, length, map, mapPartial, nil, nth, null, partition, tabulate, take, tl, toList
Bool Boolean values and operations.
bool, fromString, implies, not, toString
Char Character values and operations.
char, <, <=, >, >=, chr, compare, contains, fromCString, fromInt, fromString, isAlpha, isAlphaNum, isAscii, isCntrl, isDigit, isGraph, isHexDigit, isLower, isOctDigit, isPrint, isPunct, isSpace, isUpper, maxChar, maxOrd, minChar, notContains, ord, pred, succ, toCString, toLower, toString, toUpper
Datalog Datalog query interface.
execute, translate, validate
Date Calendar date and time values.
date, month, weekday, Date, compare, day, fmt, fromString, fromTimeLocal, fromTimeUniv, hour, isDst, localOffset, minute, second, toString, toTime, weekDay, year, yearDay
Either Values that are one of two types.
either, app, appLeft, appRight, asLeft, asRight, fold, isLeft, isRight, map, mapLeft, mapRight, partition, proj
Fn Higher-order function combinators.
apply, const, curry, equal, flip, id, notEqual, o, repeat, uncurry
General Basic types, exceptions, and utility functions.
exn, order, unit, Bind, Chr, Div, Domain, Fail, Match, Overflow, Size, Span, Subscript, before, exnMessage, exnName, ignore, o
IEEEReal IEEE 754 floating-point definitions.
decimal_approx, float_class, real_order, rounding_mode
Int Fixed-precision integer operations.
int, *, +, -, <, <=, >, >=, abs, compare, div, fmt, fromInt, fromLarge, fromString, max, maxInt, min, minInt, mod, precision, quot, rem, sameSign, scan, sign, toInt, toLarge, toString, ~
IntInf Arbitrary-precision integer operations.
int
Interact Interactive session utilities.
use, useSilently
List Polymorphic singly-linked lists.
list, Empty, @, all, app, at, collate, concat, drop, except, exists, filter, find, foldl, foldr, getItem, hd, intersect, last, length, map, mapPartial, mapi, nil, nth, null, partition, rev, revAppend, tabulate, take, tl
ListPair Operations on pairs of lists.
UnequalLengths, all, allEq, app, appEq, exists, foldl, foldlEq, foldr, foldrEq, map, mapEq, unzip, zip, zipEq
Math Mathematical functions for real numbers.
acos, asin, atan, atan2, cos, cosh, e, exp, ln, log10, pi, pow, sin, sinh, sqrt, tan, tanh
Option Optional values.
option, Option, app, compose, composePartial, filter, getOpt, isSome, join, map, mapPartial, valOf
Range Operations on ranges of ordered values.
continuous_set, discrete_set, range, complement, contains, continuousSetOf, discreteSetOf, ranges, toBag, toList
Real Floating-point number operations.
real, !=, *, *+, *-, +, -, /, <, <=, ==, >, >=, ?=, abs, ceil, checkFloat, class, compare, compareReal, copySign, floor, fmt, fromDecimal, fromInt, fromLarge, fromLargeInt, fromManExp, fromString, isFinite, isNan, isNormal, max, maxFinite, min, minNormalPos, minPos, negInf, nextAfter, posInf, precision, radix, realCeil, realFloor, realMod, realRound, realTrunc, rem, round, sameSign, scan, sign, signBit, split, toDecimal, toInt, toLarge, toLargeInt, toManExp, toString, trunc, unordered, ~
Relational Relational algebra operations for Morel queries.
descending, compare, count, elem, empty, iterate, max, min, nonEmpty, notelem, only, sum
String String operations.
string, <, <=, >, >=, ^, collate, compare, concat, concatWith, explode, extract, fields, fromCString, fromString, implode, isPrefix, isSubstring, isSuffix, map, maxSize, scan, size, str, sub, substring, toCString, toString, tokens, translate
StringCvt String conversion utilities and types.
radix, reader, realfmt
Sys System interface utilities.
clearEnv, env, file, plan, planEx, set, show, showAll, unset
Time Time values and operations.
time, Time, +, -, <, <=, >, >=, compare, fmt, fromMicroseconds, fromMilliseconds, fromNanoseconds, fromReal, fromSeconds, fromString, now, toMicroseconds, toMilliseconds, toNanoseconds, toReal, toSeconds, toString, zeroTime
Variant Dynamically-typed variant values.
variant, parse, print
Vector Immutable fixed-length arrays.
vector, all, app, appi, collate, concat, exists, find, findi, foldl, foldli, foldr, foldri, fromList, length, map, mapi, maxLen, sub, tabulate, update
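Members are referenced with a structure prefix. The sketch below uses only members listed in the table above and assumes they take the same argument shapes as their Standard ML Basis Library counterparts.

```sml
val lens = List.map String.size ["a", "bc", "def"];           (* [1, 2, 3] *)
val evens = List.filter (fn n => n mod 2 = 0) [1, 2, 3, 4];   (* [2, 4] *)
val total = List.foldl (fn (n, acc) => n + acc) 0 [1, 2, 3];  (* 6 *)
val csv = String.concatWith ", " ["x", "y", "z"];             (* "x, y, z" *)
val n = Option.getOpt (NONE, 0);                              (* 0 *)
val up = Char.toUpper #"a";                                   (* #"A" *)
val root = Math.sqrt 16.0;
```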

Properties

Each property is set using the function Sys.set (name, value), displayed using Sys.show name, and unset using Sys.unset name. Sys.showAll () shows all properties and their values.

Name                  Type    Default            Description
banner                string  Morel version ...  Startup banner message displayed when launching the Morel shell.
directory             file                       Path of the directory that the 'file' variable maps to in this connection.
excludeStructures     string  ^Test$             Regular expression that controls which built-in structures are excluded from the environment.
hybrid                bool    false              Whether to try to create a hybrid execution plan that uses Apache Calcite relational algebra.
inlinePassCount       int     5                  Maximum number of inlining passes.
lineWidth             int     79                 When printing, the length at which lines are wrapped.
matchCoverageEnabled  bool    true               Whether to check whether patterns are exhaustive and/or redundant.
now                   string  null               Overrides the current time. Value is an ISO-8601 string (e.g. '2024-01-01T00:00:00Z'). If not set, the system clock is used.
optionalInt           int     null               For testing.
output                enum    classic            How values should be formatted. "classic" (the default) prints values in a compact nested format; "tabular" prints values in a table if their type is a list of records.
printDepth            int     5                  When printing, the depth of nesting of recursive data structure at which ellipsis begins.
printLength           int     12                 When printing, the length of lists at which ellipsis begins.
productName           string  morel-java         Name of the Morel product.
productVersion        string  0.8.0              Current version of Morel.
relationalize         bool    false              Whether to convert to relational algebra.
scriptDirectory       file                       Path of the directory where the 'use' command looks for scripts. When running a script, it is generally set to the directory that contains the script.
stringDepth           int     70                 When printing, the length of strings at which ellipsis begins.
timeZone              string  null               Overrides the local timezone. Value is a timezone ID (e.g. 'UTC' or 'America/New_York'). If not set, the JVM default timezone is used.
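A typical shell session might set, inspect, and reset properties as in the sketch below. It uses only the Sys functions named at the start of this section; the property values are arbitrary.

```sml
(* Wrap printed output at 100 columns instead of the default 79. *)
Sys.set ("lineWidth", 100);

(* Show more list elements before the ellipsis. *)
Sys.set ("printLength", 50);

(* Display a single property. *)
Sys.show "lineWidth";

(* Revert a property to its default. *)
Sys.unset "lineWidth";

(* Display all properties and their values. *)
Sys.showAll ();
```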