2013-12-31 10:44:10 +01:00
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
= TOC:
|
2014-04-19 12:10:46 +02:00
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
notes
|
|
|
|
Public API
|
2014-04-19 12:10:46 +02:00
|
|
|
Names - parsing identifiers
|
|
|
|
Typenames
|
|
|
|
Value expressions
|
2014-04-19 12:22:11 +02:00
|
|
|
simple literals
|
2014-04-19 12:10:46 +02:00
|
|
|
star, param
|
|
|
|
parens expression, row constructor and scalar subquery
|
|
|
|
case, cast, exists, unique, array/multiset constructor
|
|
|
|
typed literal, app, special function, aggregate, window function
|
2014-04-19 12:22:11 +02:00
|
|
|
suffixes: in, between, quantified comparison, match predicate, array
|
|
|
|
subscript, escape, collate
|
2014-04-19 12:10:46 +02:00
|
|
|
operators
|
|
|
|
value expression top level
|
|
|
|
helpers
|
2014-04-19 12:22:11 +02:00
|
|
|
query expressions
|
|
|
|
select lists
|
|
|
|
from clause
|
|
|
|
other table expression clauses:
|
|
|
|
where, group by, having, order by, offset and fetch
|
|
|
|
common table expressions
|
|
|
|
query expression
|
|
|
|
set operations
|
2014-04-19 14:10:45 +02:00
|
|
|
lexers
|
2014-04-19 12:22:11 +02:00
|
|
|
utilities
|
2014-04-19 12:10:46 +02:00
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
= Notes about the code
|
|
|
|
|
|
|
|
The lexers appear at the bottom of the file. The intent is a clear
separation between the lexers and the other parsers, which should only
use the lexers. This isn't 100% complete at the moment and needs fixing.
|
|
|
|
|
|
|
|
== Left factoring
|
|
|
|
|
|
|
|
The parsing code is aggressively left factored, and try is avoided as
|
|
|
|
much as possible. Try is avoided because:
|
|
|
|
|
2014-04-19 20:17:19 +02:00
|
|
|
* when it is overused it makes the code hard to follow
|
|
|
|
* when it is overused it makes the parsing code harder to debug
|
|
|
|
* it makes the parser error messages much worse
|
2014-04-19 14:10:45 +02:00
|
|
|
|
|
|
|
The code could be made a bit simpler with a few extra 'trys', but this
|
|
|
|
isn't done because of the impact on the parser error
|
|
|
|
messages. Apparently it can also help the speed but this hasn't been
|
|
|
|
looked into.
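
As a rough illustration of the error message point, here is a minimal
sketch comparing a try-based parser with a left factored one. It is
not code from this module: the Expr type and the exprTry, exprFactored,
parseWith and errorColumn names are made up for the example, and it
uses Text.Parsec.String for brevity.

```haskell
import Control.Applicative ((<**>))
import Text.Parsec (parse, char, letter, many1, try, eof, (<|>),
                    errorPos, sourceColumn)
import Text.Parsec.String (Parser)

-- Hypothetical toy AST: a bare identifier or a one-argument application.
data Expr = Iden String | App String String
  deriving (Eq, Show)

ident :: Parser String
ident = many1 letter

-- try-based version: attempt the whole application, backtrack on failure.
exprTry :: Parser Expr
exprTry = try (App <$> ident <*> (char '(' *> ident <* char ')'))
      <|> (Iden <$> ident)

-- Left factored version: parse the shared identifier prefix once, then
-- choose a suffix. Once '(' is consumed the parser is committed.
exprFactored :: Parser Expr
exprFactored = ident <**> (appSuffix <|> pure Iden)
  where
    appSuffix = (\arg f -> App f arg) <$> (char '(' *> ident <* char ')')

-- Run a parser over the whole input, discarding the error details.
parseWith :: Parser Expr -> String -> Maybe Expr
parseWith p = either (const Nothing) Just . parse (p <* eof) ""

-- Column at which a parser reports its error, if it fails.
errorColumn :: Parser Expr -> String -> Maybe Int
errorColumn p src = case parse (p <* eof) "" src of
  Left e  -> Just (sourceColumn (errorPos e))
  Right _ -> Nothing
```

On the bad input f(1), the try version backtracks over the whole
application and ends up reporting the error at the '(' (column 2),
while the left factored version reports it at the '1' (column 3),
which is where the input actually goes wrong.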
|
|
|
|
|
|
|
|
== Parser error messages
|
|
|
|
|
|
|
|
A lot of care has been given to generating good parser error messages
|
|
|
|
for invalid syntax. There are a few utils below which partially help
|
|
|
|
in this area.
|
|
|
|
|
|
|
|
There is a set of crafted bad expressions in ErrorMessages.lhs, these
|
|
|
|
are used to gauge the quality of the error messages and monitor
|
|
|
|
regressions by hand. The use of <?> is limited as much as possible:
|
|
|
|
each instance should justify itself by improving an actual error
|
|
|
|
message.
|
|
|
|
|
|
|
|
There is also a plan to write a really simple expression parser which
|
|
|
|
doesn't do precedence and associativity, and then fix these with a pass
|
|
|
|
over the ast. I don't think there is any other way to sanely handle
|
|
|
|
the common prefixes between many infix and postfix multiple keyword
|
|
|
|
operators, and some other ambiguities also. This should help a lot in
|
|
|
|
generating good error messages also.
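
A minimal sketch of what such a fix-up pass could look like, assuming
the simple parser returns a flat chain of operands and binary
operators. The Expr and Flat types, the precedence table and the
fixup function are all invented for this illustration (they are not
part of this library), and every operator is treated as left
associative:

```haskell
-- Hypothetical toy AST, not the ValueExpr type from this library.
data Expr = Num Int | BinOp String Expr Expr
  deriving (Eq, Show)

-- What a precedence-unaware parser might return for "1 + 2 * 3 - 4":
-- the first operand followed by (operator, operand) pairs in source order.
data Flat = Flat Expr [(String, Expr)]

-- Assumed precedence table: higher binds tighter.
precedence :: String -> Int
precedence op
  | op `elem` ["*", "/"] = 2
  | otherwise            = 1

-- Rebuild the tree by precedence climbing over the flat chain.
fixup :: Flat -> Expr
fixup (Flat first rest) = fst (climb 0 first rest)

-- climb minPrec lhs rest: absorb operators binding at least as tightly
-- as minPrec, returning the tree built so far and the unconsumed chain.
climb :: Int -> Expr -> [(String, Expr)] -> (Expr, [(String, Expr)])
climb minPrec lhs ((op, rhs) : rest)
  | precedence op >= minPrec =
      let (rhs', rest') = climb (precedence op + 1) rhs rest
      in  climb minPrec (BinOp op lhs rhs') rest'
climb _ lhs rest = (lhs, rest)
```

With this shape the parser itself never has to decide between
operators that share a prefix; it only collects them, and the
grouping is decided afterwards.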
|
|
|
|
|
|
|
|
Both the left factoring and error message work are greatly complicated
|
|
|
|
by the large number of shared prefixes of the various elements in SQL
|
|
|
|
syntax.
|
|
|
|
|
|
|
|
== Main left factoring issues
|
|
|
|
|
|
|
|
There are three big areas which are tricky to left factor:
|
|
|
|
|
2014-04-19 20:17:19 +02:00
|
|
|
* typenames
|
|
|
|
* value expressions which can start with an identifier
|
|
|
|
* infix and suffix operators
|
2014-04-19 14:10:45 +02:00
|
|
|
|
|
|
|
=== typenames
|
|
|
|
|
|
|
|
There are a number of variations of typename syntax. The standard
|
|
|
|
deals with this by switching on the name of the type which is parsed
|
|
|
|
first. This code doesn't do this currently, but might in the
|
|
|
|
future. Taking the approach in the standard grammar will limit the
|
|
|
|
extensibility of the parser and might affect the ease of adapting to
|
|
|
|
support other sql dialects.
|
|
|
|
|
|
|
|
=== identifier value expressions
|
|
|
|
|
|
|
|
There are a lot of value expression nodes which start with
|
|
|
|
identifiers, and can't be distinguished until the tokens after the initial
|
|
|
|
identifier are parsed. Using try to implement these variations is very
|
|
|
|
simple but makes the code much harder to debug and makes the parser
|
|
|
|
error messages really bad.
|
|
|
|
|
|
|
|
Here is a list of these nodes:
|
|
|
|
|
2014-04-19 20:17:19 +02:00
|
|
|
* identifiers
|
|
|
|
* function application
|
|
|
|
* aggregate application
|
|
|
|
* window application
|
|
|
|
* typed literal: typename 'literal string'
|
|
|
|
* interval literal which is like the typed literal with some extras
|
2014-04-19 14:10:45 +02:00
|
|
|
|
|
|
|
There is further ambiguity e.g. with typed literals with precision,
|
|
|
|
functions, aggregates, etc. - these are an identifier, followed by
|
|
|
|
parens comma separated value expressions or something similar, and it
|
|
|
|
is only later that we can find a token which tells us which flavour it
|
|
|
|
is.
|
|
|
|
|
|
|
|
There is also a set of nodes which start with an identifier/keyword
|
|
|
|
but can commit since no other syntax can start the same way:
|
|
|
|
|
2014-04-19 20:17:19 +02:00
|
|
|
* case
|
|
|
|
* cast
|
|
|
|
* exists, unique subquery
|
|
|
|
* array constructor
|
|
|
|
* multiset constructor
|
|
|
|
* all the special syntax functions: extract, position, substring,
|
2014-04-19 14:10:45 +02:00
|
|
|
convert, translate, overlay, trim, etc.
|
|
|
|
|
|
|
|
The interval literal mentioned above is treated in this group at the
|
|
|
|
moment: if we see 'interval' we parse it either as a full interval
|
|
|
|
literal or a typed literal only.
|
|
|
|
|
|
|
|
Some items in this list might have to be fixed in the future, e.g. to
|
|
|
|
support standard 'substring(a from 3 for 5)' as well as regular
|
|
|
|
function substring syntax 'substring(a,3,5)' at the same time.
|
|
|
|
|
|
|
|
The work in left factoring all this is mostly done, but there is still
|
|
|
|
a substantial bit to complete and this is by far the most difficult
|
|
|
|
bit. At the moment, the workaround is to use try, the downside of
which is poor parser error messages.
|
|
|
|
|
|
|
|
=== infix and suffix operators
|
|
|
|
|
|
|
|
== permissiveness
|
|
|
|
|
|
|
|
The parser is very permissive in many ways. This departs from the
|
|
|
|
standard which is able to eliminate a number of possibilities just in
|
|
|
|
the grammar, which this parser allows. This is done for a number of
|
|
|
|
reasons:
|
|
|
|
|
2014-04-19 20:17:19 +02:00
|
|
|
* it makes the parser simpler - fewer variations
|
|
|
|
* it should allow for dialects and extensibility more easily in the
|
2014-04-19 14:10:45 +02:00
|
|
|
future (e.g. new infix binary operators with custom precedence)
|
2014-04-19 20:17:19 +02:00
|
|
|
* many things which are effectively checked in the grammar in the
|
2014-04-19 14:10:45 +02:00
|
|
|
standard, can be checked using a typechecker or other simple static
|
|
|
|
analysis
|
|
|
|
|
|
|
|
To use this code as a front end for a sql engine, or as a sql validity
|
|
|
|
checker, you will need to do a lot of checks on the ast. A
|
|
|
|
typechecker/static checker plus annotation to support being a compiler
|
|
|
|
front end is planned but not likely to happen too soon.
|
|
|
|
|
|
|
|
Some of the areas this affects:
|
|
|
|
|
|
|
|
typenames: the variation of the type name should switch on the actual
|
|
|
|
name given according to the standard, but this code only does this for
|
|
|
|
the special case of interval type names. E.g. you can write 'int
|
|
|
|
collate C' or 'int(15,2)' and this will parse as a character type name
|
|
|
|
or a precision scale type name instead of being rejected.
|
|
|
|
|
|
|
|
value expressions: every variation on value expressions uses the same
|
|
|
|
parser/syntax. This means we don't try to stop non boolean valued
|
|
|
|
expressions in boolean valued contexts in the parser. Another area
|
|
|
|
this affects is that we allow general value expressions in group by,
|
|
|
|
whereas the standard only allows column names with optional collation.
|
|
|
|
|
|
|
|
These are all areas which are specified (roughly speaking) in the
|
|
|
|
syntax rather than the semantics in the standard, and we are not
|
|
|
|
fixing them in the syntax but leaving them till the semantic checking
|
|
|
|
(which doesn't exist in this code at this time).
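
To give an idea of the kind of check that is being deferred, here is a
minimal sketch of a post-parse validation for the typename examples
above. The TypeName type is a simplified stand-in for the real syntax
type, and checkTypeName and the lists of allowed names are assumptions
made for the illustration, not something this library provides:

```haskell
-- Simplified, hypothetical stand-in for the parsed type name AST.
data TypeName
  = TypeName String
  | PrecTypeName String Integer
  | PrecScaleTypeName String Integer Integer
  | CharTypeName String (Maybe Integer) [String] [String]
    -- ^ name, optional precision, character set, collation
  deriving (Eq, Show)

-- Reject combinations which the permissive parser accepts but which
-- the standard grammar would not allow.
checkTypeName :: TypeName -> Either String TypeName
checkTypeName t = case t of
  PrecScaleTypeName n _ _
    | n `notElem` scaleTypes ->
        Left (n ++ " does not take a precision and scale")
  CharTypeName n _ _ _
    | n `notElem` charTypes ->
        Left (n ++ " does not take a character set or collation")
  _ -> Right t
  where
    -- Assumed lists, for illustration only.
    scaleTypes = ["decimal", "dec", "numeric"]
    charTypes  = ["char", "character", "varchar", "character varying"
                 ,"char varying", "clob"]
```

With this, 'int(15,2)' and 'int collate C' would still parse, but
would be rejected by the check afterwards.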
|
|
|
|
|
2013-12-18 14:51:55 +01:00
|
|
> {-# LANGUAGE TupleSections #-}
> -- | This is the module with the parser functions.
> module Language.SQL.SimpleSQL.Parser
>     (parseQueryExpr
>     ,parseValueExpr
>     ,parseQueryExprs
>     ,ParseError(..)) where
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
> import Control.Monad.Identity (Identity)
> import Control.Monad (guard, void, when)
> import Control.Applicative ((<$), (<$>), (<*>) ,(<*), (*>), (<**>), pure)
> import Data.Maybe (catMaybes)
> import Data.Char (toLower)
> import Text.Parsec (setPosition,setSourceColumn,setSourceLine,getPosition
>                    ,option,between,sepBy,sepBy1,string,manyTill,anyChar
>                    ,try,many1,oneOf,digit,(<|>),choice,char,eof
>                    ,optionMaybe,optional,many,letter,runParser
>                    ,chainl1, chainr1,(<?>) {-,notFollowedBy,alphaNum-}, lookAhead)
> -- import Text.Parsec.String (Parser)
> import Text.Parsec.Perm (permute,(<$?>), (<|?>))
> import Text.Parsec.Prim (Parsec, getState)
> import qualified Text.Parsec.Expr as E
> import Data.List (intercalate,sort,groupBy)
> import Data.Function (on)
> import Language.SQL.SimpleSQL.Syntax
> import Language.SQL.SimpleSQL.Combinators
> import Language.SQL.SimpleSQL.Errors
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-04-19 11:47:25 +02:00
|
|
|
= Public API
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
> -- | Parses a query expr, trailing semicolon optional.
> parseQueryExpr :: Dialect
>                   -- ^ dialect of SQL to use
>                -> FilePath
>                   -- ^ filename to use in error messages
>                -> Maybe (Int,Int)
>                   -- ^ line number and column number of the first character
>                   -- in the source to use in error messages
>                -> String
>                   -- ^ the SQL source to parse
>                -> Either ParseError QueryExpr
> parseQueryExpr = wrapParse topLevelQueryExpr
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-31 11:20:07 +01:00
|
|
> -- | Parses a list of query expressions, with semicolons between
> -- them. The final semicolon is optional.
> parseQueryExprs :: Dialect
>                    -- ^ dialect of SQL to use
>                 -> FilePath
>                    -- ^ filename to use in error messages
>                 -> Maybe (Int,Int)
>                    -- ^ line number and column number of the first character
>                    -- in the source to use in error messages
>                 -> String
>                    -- ^ the SQL source to parse
>                 -> Either ParseError [QueryExpr]
> parseQueryExprs = wrapParse queryExprs
|
2013-12-13 23:34:05 +01:00
|
|
|
|
2013-12-19 10:46:51 +01:00
|
|
> -- | Parses a value expression.
> parseValueExpr :: Dialect
>                   -- ^ dialect of SQL to use
>                -> FilePath
>                   -- ^ filename to use in error messages
>                -> Maybe (Int,Int)
>                   -- ^ line number and column number of the first character
>                   -- in the source to use in error messages
>                -> String
>                   -- ^ the SQL source to parse
>                -> Either ParseError ValueExpr
> parseValueExpr = wrapParse valueExpr
|
2013-12-14 09:55:44 +01:00
|
|
|
|
|
|
|
This helper function takes the given parser and:

* sets the position when parsing
* automatically skips leading whitespace
* checks that the parser consumes all the input using eof
* converts the error return to the nice wrapper
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
> wrapParse :: Parser a
>           -> Dialect
>           -> FilePath
>           -> Maybe (Int,Int)
>           -> String
>           -> Either ParseError a
> wrapParse parser d f p src =
>     either (Left . convParseError src) Right
>     $ runParser (setPos p *> whitespace *> parser <* eof)
>                 d f src
>   where
>     setPos Nothing = pure ()
>     setPos (Just (l,c)) = fmap up getPosition >>= setPosition
>       where up = flip setSourceColumn c . flip setSourceLine l
|
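
The effect of the position argument can be seen in isolation with a
small standalone sketch. This reuses the same setPos idea as the helper
above, but parseAt, number and errorLineCol are names invented for the
example, the parser has no dialect state, and "fragment.sql" is just a
placeholder file name:

```haskell
import Text.Parsec (ParseError, runParser, getPosition, setPosition,
                    setSourceLine, setSourceColumn, many1, digit, spaces,
                    eof, errorPos, sourceLine, sourceColumn)
import Text.Parsec.String (Parser)

-- Parse a fragment which starts at the given (line, column) of a
-- larger source, so that error positions refer to the larger source.
parseAt :: Maybe (Int, Int) -> Parser a -> String -> Either ParseError a
parseAt start p =
    runParser (setPos start *> spaces *> p <* eof) () "fragment.sql"
  where
    setPos Nothing = pure ()
    setPos (Just (l, c)) = fmap up getPosition >>= setPosition
      where up = flip setSourceColumn c . flip setSourceLine l

-- A trivial parser to have something which can fail.
number :: Parser String
number = many1 digit

-- Line and column of the reported error, if there is one.
errorLineCol :: Either ParseError a -> Maybe (Int, Int)
errorLineCol (Left e)  = Just (sourceLine (errorPos e), sourceColumn (errorPos e))
errorLineCol (Right _) = Nothing
```

For instance, parsing "12x" with a start of Just (10,5) reports the
error at line 10, column 7, whereas with Nothing it is reported at
line 1, column 3.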
2013-12-13 18:21:44 +01:00
|
|
|
|
2013-12-14 00:14:23 +01:00
|
|
|
------------------------------------------------
|
2013-12-13 18:21:44 +01:00
|
|
|
|
2014-04-19 11:47:25 +02:00
|
|
|
= Names
|
|
|
|
|
|
|
|
Names represent identifiers and a few other things. The parser here
|
|
|
|
handles regular identifiers, dotted chain identifiers, quoted
|
|
|
|
identifiers and unicode quoted identifiers.
|
|
|
|
|
|
|
|
Dots: dots in identifier chains are parsed here and represented in the
|
|
|
|
Iden constructor usually. If parts of the chains are non identifier
|
|
|
|
value expressions, then this is represented by a BinOp "."
|
|
|
|
instead. Dotted chain identifiers which appear in other contexts (such
as function names and table names) are represented as [Name] only.
|
|
|
|
|
|
|
|
Identifier grammar:
|
|
|
|
|
|
|
|
unquoted:
|
|
|
|
underscore <|> letter : many (underscore <|> alphanum)
|
|
|
|
|
|
|
|
example
|
|
|
|
_example123
|
|
|
|
|
|
|
|
quoted:
|
|
|
|
|
|
|
|
double quote, many (non quote character or two double quotes
|
|
|
|
together), double quote
|
|
|
|
|
|
|
|
"example quoted"
|
|
|
|
"example with "" quote"
|
|
|
|
|
|
|
|
unicode quoted is the same as quoted in this parser, except it starts
|
|
|
|
with U& or u&
|
|
|
|
|
|
|
|
u&"example quoted"
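
The identifier grammar above can be sketched as a small standalone
parser. This is only an illustration: the Name type here is a cut down
stand-in, there is no keyword blacklist or dialect handling, a doubled
quote is collapsed to a single quote character (an assumption about
the representation), and try is used for brevity even though the real
code avoids it where it can:

```haskell
import Text.Parsec (parse, char, letter, alphaNum, many, noneOf, oneOf,
                    try, eof, (<|>))
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for the Name syntax type.
data Name = Name String | QName String | UQName String
  deriving (Eq, Show)

-- unquoted: (underscore <|> letter) : many (underscore <|> alphanum)
unquoted :: Parser String
unquoted = (:) <$> (char '_' <|> letter) <*> many (char '_' <|> alphaNum)

-- quoted: double quote, many (non quote character or two double
-- quotes together), double quote
quoted :: Parser String
quoted = char '"' *> many quotedChar <* char '"'
  where
    quotedChar = noneOf "\"" <|> try (char '"' *> char '"')

-- unicode quoted must be tried before unquoted, since a plain
-- identifier can also start with u or U.
name :: Parser Name
name = UQName <$> (try (oneOf "uU" *> char '&') *> quoted)
   <|> QName <$> quoted
   <|> Name <$> unquoted

-- Parse a complete string as a single name.
parseName :: String -> Maybe Name
parseName = either (const Nothing) Just . parse (name <* eof) ""
```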
|
|
|
|
|
|
|
> name :: Parser Name
> name = do
>     d <- getState
>     choice [QName <$> quotedIdentifier
>            ,UQName <$> uquotedIdentifier
>            ,Name <$> identifierBlacklist (blacklist d)
>            ,dqName]
>   where
>     dqName = guardDialect [MySQL] *>
>              lexeme (DQName "`" "`"
>                      <$> (char '`'
>                           *> manyTill anyChar (char '`')))
|
2014-04-19 11:47:25 +02:00
|
|
|
|
2014-05-07 20:53:24 +02:00
|
|
|
todo: replace (:[]) with a named function all over
|
|
|
|
|
2014-04-19 11:47:25 +02:00
|
|
> names :: Parser [Name]
> names = reverse <$> (((:[]) <$> name) <??*> anotherName)
>   -- can't use a simple chain here since we
>   -- want to wrap the . + name in a try
>   -- this will change when this is left factored
>   where
>     anotherName :: Parser ([Name] -> [Name])
>     anotherName = try ((:) <$> (symbol "." *> name))
|
|
|
|
|
2014-04-19 11:47:25 +02:00
|
|
|
= Type Names
|
|
|
|
|
|
|
|
Typenames are used in casts, and also in the typed literal syntax,
|
|
|
|
which is a typename followed by a string literal.
|
|
|
|
|
|
|
|
Here are the grammar notes:
|
|
|
|
|
|
|
|
== simple type name
|
|
|
|
|
|
|
|
just an identifier chain or a multi word identifier (this is a fixed
list of possibilities, e.g. 'character varying'; see below in the
parser code for the exact list).
|
|
|
|
|
|
|
|
<simple-type-name> ::= <identifier-chain>
|
|
|
|
| multiword-type-identifier
|
|
|
|
|
|
|
|
== Precision type name
|
|
|
|
|
|
|
|
<precision-type-name> ::= <simple-type-name> <left paren> <unsigned-int> <right paren>
|
|
|
|
|
|
|
|
e.g. char(5)
|
|
|
|
|
|
|
|
note: everywhere a simple type name appears above and below, it
means a single identifier (regular or quoted), a dotted chain, or a
multi word identifier
|
|
|
|
|
|
|
|
== Precision scale type name
|
|
|
|
|
|
|
<precision-scale-type-name> ::= <simple-type-name> <left paren> <unsigned-int> <comma> <unsigned-int> <right paren>
|
|
|
|
|
|
|
|
e.g. decimal(15,2)
|
|
|
|
|
|
|
|
== Lob type name
|
|
|
|
|
|
|
|
this is a variation on the precision type name with some extra info on
|
|
|
|
the units:
|
|
|
|
|
|
|
|
<lob-type-name> ::=
|
|
|
|
<simple-type-name> <left paren> <unsigned integer> [ <multiplier> ] [ <char length units> ] <right paren>
|
|
|
|
|
|
|
<multiplier> ::= K | M | G | T | P
|
|
|
|
<char length units> ::= CHARACTERS | CODE_UNITS | OCTETS
|
|
|
|
|
|
|
|
(if both multiplier and char length units are missing, then this will
|
|
|
|
parse as a precision type name)
|
|
|
|
|
|
|
|
e.g.
|
|
|
|
clob(5M octets)
|
|
|
|
|
|
|
|
== char type name
|
|
|
|
|
|
|
|
this is a simple type with optional precision which allows the
|
|
|
|
character set or the collation to appear as a suffix:
|
|
|
|
|
|
|
|
<char type name> ::=
|
|
|
|
<simple type name>
|
|
|
|
[ <left paren> <unsigned-int> <right paren> ]
|
|
|
|
[ CHARACTER SET <identifier chain> ]
|
|
|
|
[ COLLATE <identifier chain> ]
|
|
|
|
|
|
|
|
e.g.
|
|
|
|
|
|
|
|
char(5) character set my_charset collate my_collation
|
|
|
|
|
|
|
|
== Time typename
|
|
|
|
|
|
|
|
this is a typename with optional precision and either a 'with time zone'
or 'without time zone' suffix:
|
|
|
|
|
|
|
<datetime type> ::=
    <simple type name>
    [ <left paren> <unsigned-int> <right paren> ]
    <with or without time zone>
<with or without time zone> ::= WITH TIME ZONE | WITHOUT TIME ZONE
|
|
|
|
|
|
|
|
== row type name
|
|
|
|
|
|
|
|
<row type> ::=
|
|
|
|
ROW <left paren> <field definition> [ { <comma> <field definition> }... ] <right paren>
|
|
|
|
|
|
|
|
<field definition> ::= <identifier> <type name>
|
|
|
|
|
|
|
|
e.g.
|
|
|
|
row(a int, b char(5))
|
|
|
|
|
|
|
|
== interval type name
|
|
|
|
|
|
|
|
<interval type> ::= INTERVAL <interval datetime field> [TO <interval datetime field>]
|
|
|
|
|
|
|
|
<interval datetime field> ::=
|
|
|
|
<datetime field> [ <left paren> <unsigned int> [ <comma> <unsigned int> ] <right paren> ]
|
|
|
|
|
|
|
|
== array type name
|
|
|
|
|
|
|
|
<array type> ::= <data type> ARRAY [ <left bracket> <unsigned integer> <right bracket> ]
|
|
|
|
|
|
|
|
== multiset type name
|
|
|
|
|
|
|
|
<multiset type> ::= <data type> MULTISET
|
|
|
|
|
|
|
|
A type name will parse into the 'smallest' constructor it will fit in
|
|
|
|
syntactically, e.g. a clob(5) will parse to a precision type name, not
|
|
|
|
a lob type name.
|
|
|
|
|
2014-05-12 21:06:29 +02:00
|
|
|
Unfortunately, to improve the error messages, there is a lot of (left)
|
|
|
|
factoring in this function, and it is a little dense.
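
To make the shape of the factoring easier to see, here is a much
reduced sketch which only handles the plain, precision and precision
scale variations. It is not the parser below: the TypeName type is a
simplified stand-in, and lexeme, symbol and unsignedInteger are local
toy helpers written for the example. The point is the use of <**> to
parse the shared prefix once and let the suffix pick the constructor:

```haskell
import Control.Applicative ((<**>))
import Text.Parsec (parse, char, digit, letter, many1, spaces, eof, (<|>))
import Text.Parsec.String (Parser)

-- Simplified, hypothetical stand-in for the type name AST.
data TypeName
  = TypeName String
  | PrecTypeName String Integer
  | PrecScaleTypeName String Integer Integer
  deriving (Eq, Show)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: Char -> Parser Char
symbol = lexeme . char

unsignedInteger :: Parser Integer
unsignedInteger = read <$> lexeme (many1 digit)

-- Parse the name once, then let the suffix decide the constructor.
typeName :: Parser TypeName
typeName = lexeme (many1 letter) <**> (withParens <|> pure TypeName)
  where
    -- after "(" and the first integer, choose between ")" and ", n )"
    withParens = (symbol '(' *> unsignedInteger) <**> (scale <|> precOnly)
    precOnly = (\p n -> PrecTypeName n p) <$ symbol ')'
    scale = (\s p n -> PrecScaleTypeName n p s)
            <$> (symbol ',' *> unsignedInteger <* symbol ')')

-- Parse a complete string as a single type name.
parseTypeName :: String -> Maybe TypeName
parseTypeName = either (const Nothing) Just . parse (typeName <* eof) ""
```

No try is needed here, because each choice is made on the next token
only, and a type name ends up in the 'smallest' constructor simply
because the suffix parsers for the larger ones don't match.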
|
2014-04-19 11:47:25 +02:00
|
|
|
|
|
|
> typeName :: Parser TypeName
> typeName = lexeme $
>     (rowTypeName <|> intervalTypeName <|> otherTypeName)
>     <??*> tnSuffix
>   where
>     rowTypeName =
>         RowTypeName <$> (keyword_ "row" *> parens (commaSep1 rowField))
>     rowField = (,) <$> name <*> typeName
>     ----------------------------
>     intervalTypeName =
>         keyword_ "interval" *>
>         (uncurry IntervalTypeName <$> intervalQualifier)
>     ----------------------------
>     otherTypeName =
>         nameOfType <**>
>             (typeNameWithParens
>              <|> pure Nothing <**> (timeTypeName <|> charTypeName)
>              <|> pure TypeName)
>     nameOfType = reservedTypeNames <|> names
>     charTypeName = charSet <**> (option [] tcollate <$$$$> CharTypeName)
>                    <|> pure [] <**> (tcollate <$$$$> CharTypeName)
>     typeNameWithParens =
>         (openParen *> unsignedInteger)
>         <**> (closeParen *> precMaybeSuffix
>               <|> (precScaleTypeName <|> precLengthTypeName) <* closeParen)
>     precMaybeSuffix = (. Just) <$> (timeTypeName <|> charTypeName)
>                       <|> pure (flip PrecTypeName)
>     precScaleTypeName = (comma *> unsignedInteger) <$$$> PrecScaleTypeName
>     precLengthTypeName =
>         Just <$> lobPrecSuffix
>         <**> (optionMaybe lobUnits <$$$$> PrecLengthTypeName)
>         <|> pure Nothing <**> ((Just <$> lobUnits) <$$$$> PrecLengthTypeName)
>     timeTypeName = tz <$$$> TimeTypeName
>     ----------------------------
>     lobPrecSuffix = PrecK <$ keyword_ "k"
>                     <|> PrecM <$ keyword_ "m"
>                     <|> PrecG <$ keyword_ "g"
>                     <|> PrecT <$ keyword_ "t"
>                     <|> PrecP <$ keyword_ "p"
>     lobUnits = PrecCharacters <$ keyword_ "characters"
>                <|> PrecOctets <$ keyword_ "octets"
>     tz = True <$ keywords_ ["with", "time","zone"]
>          <|> False <$ keywords_ ["without", "time","zone"]
>     charSet = keywords_ ["character", "set"] *> names
>     tcollate = keyword_ "collate" *> names
>     ----------------------------
>     tnSuffix = multiset <|> array
>     multiset = MultisetTypeName <$ keyword_ "multiset"
>     array = keyword_ "array" *>
>         (optionMaybe (brackets unsignedInteger) <$$> ArrayTypeName)
>     ----------------------------
>     -- this parser handles the fixed set of multi word
>     -- type names, plus all the type names which are
>     -- reserved words
>     reservedTypeNames = (:[]) . Name . unwords <$> makeKeywordTree
>         ["double precision"
>         ,"character varying"
>         ,"char varying"
>         ,"character large object"
>         ,"char large object"
>         ,"national character"
>         ,"national char"
>         ,"national character varying"
>         ,"national char varying"
>         ,"national character large object"
>         ,"nchar large object"
>         ,"nchar varying"
>         ,"bit varying"
>         ,"binary large object"
>         ,"binary varying"
>         -- reserved keyword typenames:
>         ,"array"
>         ,"bigint"
>         ,"binary"
>         ,"blob"
>         ,"boolean"
>         ,"char"
>         ,"character"
>         ,"clob"
>         ,"date"
>         ,"dec"
>         ,"decimal"
>         ,"double"
>         ,"float"
>         ,"int"
>         ,"integer"
>         ,"nchar"
>         ,"nclob"
>         ,"numeric"
>         ,"real"
>         ,"smallint"
>         ,"time"
>         ,"timestamp"
>         ,"varchar"
>         ,"varbinary"
>         ]
|
|
|
|
|
|
|
|
= Value expressions
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
== simple literals
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
See the stringToken lexer below for notes on string literal syntax.
|
2013-12-13 19:24:20 +01:00
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> stringLit :: Parser ValueExpr
|
|
|
|
> stringLit = StringLit <$> stringToken
|
2013-12-13 16:00:22 +01:00
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> numberLit :: Parser ValueExpr
|
|
|
|
> numberLit = NumLit <$> numberLiteral
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
> characterSetLit :: Parser ValueExpr
> characterSetLit =
>     CSStringLit <$> shortCSPrefix <*> stringToken
>   where
>     shortCSPrefix = try $ choice
>         [(:[]) <$> oneOf "nNbBxX"
>         ,string "u&"
>         ,string "U&"
>         ] <* lookAhead quote
|
2014-04-17 23:16:24 +02:00
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> simpleLiteral :: Parser ValueExpr
|
|
|
|
> simpleLiteral = numberLit <|> stringLit <|> characterSetLit
|
2014-04-17 23:16:24 +02:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
== star, param, host param
|
|
|
|
|
|
|
|
=== star
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2013-12-17 11:24:37 +01:00
|
|
|
used in select *, select x.*, and agg(*) variations, and some other
|
2014-04-19 14:10:45 +02:00
|
|
|
places as well. The parser doesn't attempt to check that the star is
in a valid context; it accepts a star in any value expression context.
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> star :: Parser ValueExpr
|
2013-12-17 14:21:43 +01:00
|
|
|
> star = Star <$ symbol "*"
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-19 09:44:20 +01:00
|
|
|
=== parameter
|
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
unnamed parameter or named parameter
|
2013-12-19 09:44:20 +01:00
|
|
|
use in e.g. select * from t where a = ?
|
2014-04-17 18:27:18 +02:00
|
|
|
select x from t where x > :param
|
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
> parameter :: Parser ValueExpr
> parameter = choice
>     [Parameter <$ questionMark
>     ,HostParameter
>      <$> hostParameterToken
>      <*> optionMaybe (keyword "indicator" *> hostParameterToken)]
|
2014-04-17 18:27:18 +02:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
== parens
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
value expression parens, row ctor and scalar subquery
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
> parensExpr :: Parser ValueExpr
> parensExpr = parens $ choice
>     [SubQueryExpr SqSq <$> queryExpr
>     ,ctor <$> commaSep1 valueExpr]
>   where
>     ctor [a] = Parens a
>     ctor as = SpecialOp [Name "rowctor"] as
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
== case, cast, exists, unique, array/multiset constructor, interval
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
All of these start with a fixed keyword which is reserved, so no other
|
|
|
|
syntax can start with the same keyword.
|
2014-04-18 11:28:05 +02:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== case expression
|
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
> caseExpr :: Parser ValueExpr
> caseExpr =
>     Case <$> (keyword_ "case" *> optionMaybe valueExpr)
>          <*> many1 whenClause
>          <*> optionMaybe elseClause
>          <* keyword_ "end"
>   where
>     whenClause = (,) <$> (keyword_ "when" *> commaSep1 valueExpr)
>                      <*> (keyword_ "then" *> valueExpr)
>     elseClause = keyword_ "else" *> valueExpr
|
2013-12-13 22:18:30 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== cast
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
cast: cast(expr as type)
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
> cast :: Parser ValueExpr
> cast = keyword_ "cast" *>
>        parens (Cast <$> valueExpr
>                     <*> (keyword_ "as" *> typeName))
|
2013-12-14 10:23:58 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== exists, unique
|
|
|
|
|
|
|
|
subquery expression:
|
|
|
|
[exists|unique] (queryexpr)
|
|
|
|
|
|
|
> subquery :: Parser ValueExpr
> subquery = SubQueryExpr <$> sqkw <*> parens queryExpr
>   where
>     sqkw = SqExists <$ keyword_ "exists" <|> SqUnique <$ keyword_ "unique"
|
2013-12-13 22:31:36 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== array/multiset constructor
|
|
|
|
|
|
|
> arrayCtor :: Parser ValueExpr
> arrayCtor = keyword_ "array" >>
>     choice
>     [ArrayCtor <$> parens queryExpr
>     ,Array (Iden [Name "array"]) <$> brackets (commaSep valueExpr)]
|
|
|
|
|
|
|
|
As far as I can tell, table(query expr) is just syntax sugar for
|
|
|
|
multiset(query expr). It must be there for compatibility or something.
|
|
|
|
|
|
|
> multisetCtor :: Parser ValueExpr
> multisetCtor =
>     choice
>     [keyword_ "multiset" >>
>      choice
>      [MultisetQueryCtor <$> parens queryExpr
>      ,MultisetCtor <$> brackets (commaSep valueExpr)]
>     ,keyword_ "table" >>
>      MultisetQueryCtor <$> parens queryExpr]
|
|
|
|
|
2014-04-19 20:17:19 +02:00
|
|
> nextValueFor :: Parser ValueExpr
> nextValueFor = keywords_ ["next","value","for"] >>
>     NextValueFor <$> names
|
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== interval
|
|
|
|
|
|
|
|
interval literals are a special case and we follow the grammar less
|
|
|
|
permissively here
|
|
|
|
|
|
|
|
parse SQL interval literals, something like
|
|
|
|
interval '5' day (3)
|
|
|
|
or
|
|
|
|
interval '5' month
|
|
|
|
|
|
|
|
if the literal looks like this:
|
|
|
|
interval 'something'
|
|
|
|
|
|
|
|
then it is parsed as a regular typed literal. It must have an
interval-datetime-field suffix to parse as an IntervalLit.
|
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
It uses try because of a conflict with interval type names: todo, fix
|
2014-05-09 20:37:09 +02:00
|
|
|
this. also fix the monad -> applicative
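
The decision between a typed literal and an interval literal described
above can be written out as a plain function. This is a sketch only:
the ValueExpr type here is a simplified stand-in with strings for the
type name and the interval fields, and mkInterval returns Either
rather than failing inside a parser:

```haskell
-- Simplified, hypothetical stand-in for the value expression AST.
data ValueExpr
  = TypedLit String String
    -- ^ type name, literal text
  | IntervalLit (Maybe Bool) String String (Maybe String)
    -- ^ sign (True is +), literal text, from field, optional to field
  deriving (Eq, Show)

-- Combine the optional sign, the string literal and the optional
-- interval qualifier which follow the interval keyword.
mkInterval :: Maybe Bool
           -> String
           -> Maybe (String, Maybe String)
           -> Either String ValueExpr
-- interval 'something': a regular typed literal
mkInterval Nothing val Nothing = Right (TypedLit "interval" val)
-- a qualifier is present: a full interval literal, sign optional
mkInterval sign val (Just (from, to)) = Right (IntervalLit sign val from to)
-- a sign is only allowed together with a qualifier
mkInterval (Just _) _ Nothing =
    Left "cannot use sign without interval qualifier"
```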
|
2014-04-19 12:22:11 +02:00
|
|
|
|
|
|
> intervalLit :: Parser ValueExpr
> intervalLit = try (keyword_ "interval" >> do
>     s <- optionMaybe $ choice [True <$ symbol_ "+"
>                               ,False <$ symbol_ "-"]
>     lit <- stringToken
>     q <- optionMaybe intervalQualifier
>     mkIt s lit q)
>   where
>     mkIt Nothing val Nothing = pure $ TypedLit (TypeName [Name "interval"]) val
>     mkIt s val (Just (a,b)) = pure $ IntervalLit s val a b
>     mkIt (Just {}) _val Nothing = fail "cannot use sign without interval qualifier"
|
2014-04-17 20:05:47 +02:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
== typed literal, app, special, aggregate, window, iden
|
|
|
|
|
|
|
|
All of these start with identifiers (some of the special functions
|
|
|
|
start with reserved keywords).
|
|
|
|
|
|
|
|
They are all variations on suffixes on the basic identifier parser.
|
|
|
|
|
|
|
|
The window function syntax is a suffix on the app parser.
|
|
|
|
|
|
|
|
=== iden prefix term
|
2014-04-17 20:05:47 +02:00
|
|
|
|
|
|
|
all the value expressions which start with an identifier
|
|
|
|
|
|
|
|
(todo: really put all of them here instead of just some of them)
|
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> idenExpr :: Parser ValueExpr
|
|
|
|
> idenExpr =
|
2014-04-17 21:35:43 +02:00
|
|
|
> -- todo: work out how to left factor this
|
|
|
|
> try (TypedLit <$> typeName <*> stringToken)
|
2014-05-09 22:26:18 +02:00
|
|
|
> <|> (names <**> option Iden app)
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== special
|
2013-12-14 09:55:44 +01:00
|
|
|
|
|
|
|
These are keyword operators which don't look like normal prefix,
|
|
|
|
postfix or infix binary operators. They mostly look like function
|
|
|
|
application but with keywords in the argument list instead of commas
|
|
|
|
to separate the arguments.
|
|
|
|
|
2013-12-18 14:51:55 +01:00
|
|
|
The special op keywords parser parses an operator which has the form
|
|
|
|
operatorname(firstArg keyword0 arg0 keyword1 arg1 etc.)
|
|
|
|
|
|
|
|
> data SpecialOpKFirstArg = SOKNone
|
|
|
|
> | SOKOptional
|
|
|
|
> | SOKMandatory
|
|
|
|
|
|
|
|
> specialOpK :: String -- name of the operator
|
|
|
|
> -> SpecialOpKFirstArg -- has a first arg without a keyword
|
|
|
|
> -> [(String,Bool)] -- the other args with their keywords
|
|
|
|
> -- and whether they are optional
|
2013-12-31 10:21:03 +01:00
|
|
|
> -> Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> specialOpK opName firstArg kws =
|
|
|
|
> keyword_ opName >> do
|
2014-04-16 17:58:17 +02:00
|
|
|
> void openParen
|
2013-12-18 14:51:55 +01:00
|
|
|
> let pfa = do
|
2013-12-31 10:02:26 +01:00
|
|
|
> e <- valueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> -- check we haven't parsed the first
|
|
|
|
> -- keyword as an identifier
|
|
|
|
> guard (case (e,kws) of
|
2014-04-18 10:43:37 +02:00
|
|
|
> (Iden [Name i], (k,_):_) | map toLower i == k -> False
|
2013-12-18 14:51:55 +01:00
|
|
|
> _ -> True)
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure e
|
2013-12-18 14:51:55 +01:00
|
|
|
> fa <- case firstArg of
|
2014-05-09 20:37:09 +02:00
|
|
|
> SOKNone -> pure Nothing
|
2013-12-18 14:51:55 +01:00
|
|
|
> SOKOptional -> optionMaybe (try pfa)
|
|
|
|
> SOKMandatory -> Just <$> pfa
|
|
|
|
> as <- mapM parseArg kws
|
2014-04-16 17:58:17 +02:00
|
|
|
> void closeParen
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure $ SpecialOpK [Name opName] fa $ catMaybes as
|
2013-12-18 14:51:55 +01:00
|
|
|
> where
|
|
|
|
> parseArg (nm,mand) =
|
2013-12-31 10:02:26 +01:00
|
|
|
> let p = keyword_ nm >> valueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> in fmap (nm,) <$> if mand
|
|
|
|
> then Just <$> p
|
|
|
|
> else optionMaybe (try p)
|
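The keyword-separated argument list is easier to see on a cut-down version.
This is a sketch under stated assumptions: arg is a hypothetical stand-in for
valueExpr that only accepts a bare word or number, the first argument is
always mandatory, and the result is a plain tuple rather than SpecialOpK.
Because arg cannot swallow a keyword as part of a larger expression, the
guard used in pfa above is not needed here.

```haskell
import Data.Bifunctor (first)
import Data.Maybe (catMaybes)
import Text.Parsec
import Text.Parsec.String (Parser)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- A whole-word keyword; try makes it fail without consuming input.
keyword :: String -> Parser String
keyword k = lexeme (try (string k <* notFollowedBy alphaNum))

-- Hypothetical stand-in for valueExpr: a bare identifier or number.
arg :: Parser String
arg = lexeme (many1 alphaNum)

-- Parse: opName ( firstArg kw0 arg0 kw1 arg1 ... ), where each keyword
-- is flagged as mandatory (True) or optional (False).
keywordOp :: String -> [(String, Bool)]
          -> Parser (String, String, [(String, String)])
keywordOp opName kws = do
    _ <- keyword opName
    _ <- lexeme (char '(')
    fa <- arg
    as <- mapM parseArg kws
    _ <- lexeme (char ')')
    pure (opName, fa, catMaybes as)
  where
    parseArg (kw, mandatory) =
        let p = (,) kw <$> (keyword kw *> arg)
        in if mandatory then Just <$> p else optionMaybe p

parseSubstring :: String -> Either String (String, String, [(String, String)])
parseSubstring = first show . parse (substring <* eof) ""
  where
    substring = keywordOp "substring" [("from", False), ("for", False)]
```

The keywords are tried in the order given, so "substring(x for 3)" parses
(the optional from is skipped) but "substring(x for 3 from 2)" does not.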
|
|
|
|
|
|
|
The actual operators:
|
|
|
|
|
|
|
|
EXTRACT( date_part FROM expression )
|
|
|
|
|
|
|
|
POSITION( string1 IN string2 )
|
|
|
|
|
|
|
|
SUBSTRING(extraction_string FROM starting_position [FOR length]
|
|
|
|
[COLLATE collation_name])
|
|
|
|
|
|
|
|
CONVERT(char_value USING conversion_char_name)
|
|
|
|
|
|
|
|
TRANSLATE(char_value USING translation_name)
|
|
|
|
|
|
|
|
OVERLAY(string PLACING embedded_string FROM start
|
|
|
|
[FOR length])
|
|
|
|
|
|
|
|
TRIM( [ [{LEADING | TRAILING | BOTH}] [removal_char] FROM ]
|
|
|
|
target_string
|
|
|
|
[COLLATE collation_name] )
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> specialOpKs :: Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> specialOpKs = choice $ map try
|
|
|
|
> [extract, position, substring, convert, translate, overlay, trim]
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> extract :: Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> extract = specialOpK "extract" SOKMandatory [("from", True)]
|
2013-12-13 21:38:43 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> position :: Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> position = specialOpK "position" SOKMandatory [("in", True)]
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2013-12-18 14:51:55 +01:00
|
|
|
strictly speaking, the substring must have at least one of from and
|
|
|
|
for, but the parser doesn't enforce this
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> substring :: Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> substring = specialOpK "substring" SOKMandatory
|
2014-04-17 23:16:24 +02:00
|
|
|
> [("from", False),("for", False)]
|
2013-12-18 14:51:55 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> convert :: Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> convert = specialOpK "convert" SOKMandatory [("using", True)]
|
|
|
|
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> translate :: Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> translate = specialOpK "translate" SOKMandatory [("using", True)]
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> overlay :: Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> overlay = specialOpK "overlay" SOKMandatory
|
|
|
|
> [("placing", True),("from", True),("for", False)]
|
|
|
|
|
|
|
|
trim is too different because of the optional char, so it gets a custom
parser. 'both' and ' ' are filled in as the defaults if either part is
missing in the source.
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> trim :: Parser ValueExpr
|
2013-12-18 14:51:55 +01:00
|
|
|
> trim =
|
|
|
|
> keyword "trim" >>
|
|
|
|
> parens (mkTrim
|
|
|
|
> <$> option "both" sides
|
2014-04-16 17:58:17 +02:00
|
|
|
> <*> option " " stringToken
|
2014-04-17 23:16:24 +02:00
|
|
|
> <*> (keyword_ "from" *> valueExpr))
|
2013-12-18 14:51:55 +01:00
|
|
|
> where
|
|
|
|
> sides = choice ["leading" <$ keyword_ "leading"
|
|
|
|
> ,"trailing" <$ keyword_ "trailing"
|
|
|
|
> ,"both" <$ keyword_ "both"]
|
2014-04-17 23:16:24 +02:00
|
|
|
> mkTrim fa ch fr =
|
2014-04-18 10:43:37 +02:00
|
|
|
> SpecialOpK [Name "trim"] Nothing
|
2013-12-18 14:51:55 +01:00
|
|
|
> $ catMaybes [Just (fa,StringLit ch)
|
2014-04-17 23:16:24 +02:00
|
|
|
> ,Just ("from", fr)]
|
2013-12-13 23:34:05 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== app, aggregate, window
|
|
|
|
|
2014-05-09 22:26:18 +02:00
|
|
|
This parses all these variations:
|
|
|
|
normal function application with just a csv of value exprs
|
|
|
|
aggregate variations (distinct, order by in parens, filter and where
|
|
|
|
suffixes)
|
|
|
|
window apps (fn/agg followed by over)
|
|
|
|
|
2014-05-12 21:06:29 +02:00
|
|
|
This code is also a little dense, like the typename code, because of
left factoring. Later, the two will even have to be partially combined.
|
2014-05-09 22:26:18 +02:00
|
|
|
|
|
|
|
> app :: Parser ([Name] -> ValueExpr)
|
|
|
|
> app =
|
2014-05-12 21:06:29 +02:00
|
|
|
> openParen *> choice
|
|
|
|
> [duplicates
|
|
|
|
> <**> (commaSep1 valueExpr
|
|
|
|
> <**> (((option [] orderBy) <* closeParen)
|
|
|
|
> <**> (optionMaybe afilter <$$$$$> AggregateApp)))
|
|
|
|
> -- separate cases with no all or distinct which must have at
|
|
|
|
> -- least one value expr
|
|
|
|
> ,commaSep1 valueExpr
|
|
|
|
> <**> choice
|
|
|
|
> [closeParen *> choice
|
|
|
|
> [window
|
|
|
|
> ,withinGroup
|
|
|
|
> ,(Just <$> afilter) <$$$> aggAppWithoutDupeOrd
|
|
|
|
> ,pure (flip App)]
|
|
|
|
> ,orderBy <* closeParen
|
|
|
|
> <**> (optionMaybe afilter <$$$$> aggAppWithoutDupe)]
|
|
|
|
> -- no valueExprs: duplicates and order by not allowed
|
2014-06-20 11:27:23 +02:00
|
|
|
> ,([] <$ closeParen) <**> option (flip App) (window <|> withinGroup)
|
2014-05-12 21:06:29 +02:00
|
|
|
> ]
|
2014-04-19 12:10:46 +02:00
|
|
|
> where
|
2014-05-12 21:06:29 +02:00
|
|
|
> aggAppWithoutDupeOrd n es f = AggregateApp n SQDefault es [] f
|
|
|
|
> aggAppWithoutDupe n = AggregateApp n SQDefault
|
|
|
|
|
|
|
|
> afilter :: Parser ValueExpr
|
|
|
|
> afilter = keyword_ "filter" *> parens (keyword_ "where" *> valueExpr)
|
2014-04-19 17:01:49 +02:00
|
|
|
|
2014-05-09 22:26:18 +02:00
|
|
|
> withinGroup :: Parser ([ValueExpr] -> [Name] -> ValueExpr)
|
|
|
|
> withinGroup =
|
2014-05-12 21:06:29 +02:00
|
|
|
> (keywords_ ["within", "group"] *> parens orderBy) <$$$> AggregateAppGroup
|
2014-04-19 12:10:46 +02:00
|
|
|
|
2014-05-09 22:26:18 +02:00
|
|
|
==== window
|
2014-04-19 12:10:46 +02:00
|
|
|
|
|
|
|
parse a window call as a suffix of a regular function call
|
|
|
|
this looks like this:
|
|
|
|
functionname(args) over ([partition by ids] [order by orderitems])
|
|
|
|
|
|
|
|
Explicit frames are parsed by the optional frameClause below.
|
|
|
|
|
2014-05-09 22:26:18 +02:00
|
|
|
TODO: add window support for other aggregate variations, needs some
|
|
|
|
changes to the syntax also
|
|
|
|
|
|
|
|
> window :: Parser ([ValueExpr] -> [Name] -> ValueExpr)
|
2014-05-12 21:06:29 +02:00
|
|
|
> window =
|
|
|
|
> keyword_ "over" *> openParen *> option [] partitionBy
|
|
|
|
> <**> (option [] orderBy
|
|
|
|
> <**> (((optionMaybe frameClause) <* closeParen) <$$$$$> WindowApp))
|
2014-04-19 12:10:46 +02:00
|
|
|
> where
|
2014-05-12 21:06:29 +02:00
|
|
|
> partitionBy = keywords_ ["partition","by"] *> commaSep1 valueExpr
|
2014-04-19 12:10:46 +02:00
|
|
|
> frameClause =
|
2014-05-12 21:06:29 +02:00
|
|
|
> frameRowsRange -- TODO: this 'and' could be an issue
|
|
|
|
> <**> (choice [(keyword_ "between" *> frameLimit True)
|
|
|
|
> <**> ((keyword_ "and" *> frameLimit True)
|
|
|
|
> <$$$> FrameBetween)
|
|
|
|
> -- maybe this should still use a b expression
|
|
|
|
> -- for consistency
|
|
|
|
> ,frameLimit False <**> pure (flip FrameFrom)])
|
|
|
|
> frameRowsRange = FrameRows <$ keyword_ "rows"
|
|
|
|
> <|> FrameRange <$ keyword_ "range"
|
2014-04-19 12:10:46 +02:00
|
|
|
> frameLimit useB =
|
|
|
|
> choice
|
|
|
|
> [Current <$ keywords_ ["current", "row"]
|
2014-05-12 21:06:29 +02:00
|
|
|
> -- todo: create an automatic left factor for stuff like this
|
|
|
|
> ,keyword_ "unbounded" *>
|
2014-04-19 12:10:46 +02:00
|
|
|
> choice [UnboundedPreceding <$ keyword_ "preceding"
|
|
|
|
> ,UnboundedFollowing <$ keyword_ "following"]
|
2014-05-12 21:06:29 +02:00
|
|
|
> ,(if useB then valueExprB else valueExpr)
|
|
|
|
> <**> (Preceding <$ keyword_ "preceding"
|
|
|
|
> <|> Following <$ keyword_ "following")
|
2014-04-19 12:10:46 +02:00
|
|
|
> ]
|
|
|
|
|
|
|
|
== suffixes
|
|
|
|
|
|
|
|
These are all generic suffixes on any value expr
|
|
|
|
|
|
|
|
=== in
|
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
in: two variations:
|
|
|
|
a in (expr0, expr1, ...)
|
|
|
|
a in (queryexpr)
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> inSuffix :: Parser (ValueExpr -> ValueExpr)
|
2013-12-31 10:02:26 +01:00
|
|
|
> inSuffix =
|
|
|
|
> mkIn <$> inty
|
|
|
|
> <*> parens (choice
|
|
|
|
> [InQueryExpr <$> queryExpr
|
|
|
|
> ,InList <$> commaSep1 valueExpr])
|
2013-12-13 20:00:06 +01:00
|
|
|
> where
|
2014-04-16 19:22:42 +02:00
|
|
|
> inty = choice [True <$ keyword_ "in"
|
2014-04-18 11:28:05 +02:00
|
|
|
> ,False <$ keywords_ ["not","in"]]
|
2013-12-31 10:02:26 +01:00
|
|
|
> mkIn i v = \e -> In i e v
|
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== between
|
2013-12-13 20:00:06 +01:00
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
between:
|
|
|
|
expr between expr and expr
|
|
|
|
|
|
|
|
There is a complication when parsing between - when parsing the second
|
|
|
|
expression it is ambiguous when you hit an 'and' whether it is a
|
|
|
|
binary operator or part of the between. This code follows what
|
|
|
|
postgres does, which might be standard across SQL implementations,
|
|
|
|
which is that you can't have a binary and operator in the middle
|
|
|
|
expression in a between unless it is wrapped in parens. The 'bExpr
|
2013-12-19 10:46:51 +01:00
|
|
|
parsing' is used to create an alternative value expression parser which
|
2013-12-14 09:55:44 +01:00
|
|
|
is identical to the normal one except it doesn't recognise the binary
|
2013-12-31 10:02:26 +01:00
|
|
|
and operator. This is the call to valueExprB.
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> betweenSuffix :: Parser (ValueExpr -> ValueExpr)
|
2013-12-31 10:02:26 +01:00
|
|
|
> betweenSuffix =
|
2014-04-18 10:43:37 +02:00
|
|
|
> makeOp <$> Name <$> opName
|
2013-12-31 10:02:26 +01:00
|
|
|
> <*> valueExprB
|
|
|
|
> <*> (keyword_ "and" *> valueExprB)
|
2013-12-13 20:13:36 +01:00
|
|
|
> where
|
2014-04-16 19:22:42 +02:00
|
|
|
> opName = choice
|
2013-12-13 20:13:36 +01:00
|
|
|
> ["between" <$ keyword_ "between"
|
2014-04-18 11:28:05 +02:00
|
|
|
> ,"not between" <$ try (keywords_ ["not","between"])]
|
2014-04-18 10:43:37 +02:00
|
|
|
> makeOp n b c = \a -> SpecialOp [n] [a,b,c]
|
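The 'and' ambiguity and the bExpr fix can be reproduced with a toy grammar.
This is only a sketch: Expr, term and expr are hypothetical and much smaller
than the real value expression parser, with numbers, '+', 'and' and between
as the only constructs. expr True plays the role of valueExprB.

```haskell
import Data.Bifunctor (first)
import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Expr as E

data Expr
    = Num Integer
    | BinOp String Expr Expr
    | Between Expr Expr Expr
    deriving (Eq, Show)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

keyword :: String -> Parser String
keyword k = lexeme (try (string k <* notFollowedBy alphaNum))

term :: Parser Expr
term = (Num . read <$> lexeme (many1 digit))
   <|> (lexeme (char '(') *> expr False <* lexeme (char ')'))

-- When bExpr is True the binary 'and' row is left out of the table,
-- so an 'and' ends the expression instead of extending it.
expr :: Bool -> Parser Expr
expr bExpr = E.buildExpressionParser table term
  where
    table =
        [ [E.Infix (BinOp "+" <$ lexeme (char '+')) E.AssocLeft]
        , [E.Postfix betweenSuffix]
        ]
        ++ [ [E.Infix (BinOp "and" <$ keyword "and") E.AssocLeft] | not bExpr ]

betweenSuffix :: Parser (Expr -> Expr)
betweenSuffix = do
    _ <- keyword "between"
    lo <- expr True
    _ <- keyword "and"
    hi <- expr True
    pure (\e -> Between e lo hi)

parseExpr :: String -> Either String Expr
parseExpr = first show . parse (spaces *> expr False <* eof) ""
```

With this, "1 between 2 and 3 and 4" parses as (1 between 2 and 3) and 4,
and a binary 'and' inside the between bounds needs parens, as in
"1 between (2 and 3) and 4".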
2013-12-13 20:13:36 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== quantified comparison
|
2014-04-17 21:35:43 +02:00
|
|
|
|
|
|
|
a = any (select * from t)
|
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> quantifiedComparisonSuffix :: Parser (ValueExpr -> ValueExpr)
|
|
|
|
> quantifiedComparisonSuffix = do
|
2014-04-17 21:35:43 +02:00
|
|
|
> c <- comp
|
|
|
|
> cq <- compQuan
|
|
|
|
> q <- parens queryExpr
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure $ \v -> QuantifiedComparison v [c] cq q
|
2014-04-17 21:35:43 +02:00
|
|
|
> where
|
|
|
|
> comp = Name <$> choice (map symbol
|
|
|
|
> ["=", "<>", "<=", "<", ">", ">="])
|
|
|
|
> compQuan = choice
|
|
|
|
> [CPAny <$ keyword_ "any"
|
|
|
|
> ,CPSome <$ keyword_ "some"
|
|
|
|
> ,CPAll <$ keyword_ "all"]
|
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== match
|
|
|
|
|
2014-04-17 21:35:43 +02:00
|
|
|
a match (select a from t)
|
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> matchPredicateSuffix :: Parser (ValueExpr -> ValueExpr)
|
|
|
|
> matchPredicateSuffix = do
|
2014-04-17 21:35:43 +02:00
|
|
|
> keyword_ "match"
|
|
|
|
> u <- option False (True <$ keyword_ "unique")
|
|
|
|
> q <- parens queryExpr
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure $ \v -> Match v u q
|
2014-04-17 21:35:43 +02:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== array subscript
|
2013-12-13 19:43:28 +01:00
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> arraySuffix :: Parser (ValueExpr -> ValueExpr)
|
|
|
|
> arraySuffix = do
|
2014-04-17 21:57:33 +02:00
|
|
|
> es <- brackets (commaSep valueExpr)
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure $ \v -> Array v es
|
2014-04-17 21:57:33 +02:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== escape
|
2014-04-18 19:50:24 +02:00
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> escapeSuffix :: Parser (ValueExpr -> ValueExpr)
|
|
|
|
> escapeSuffix = do
|
2014-04-17 23:16:24 +02:00
|
|
|
> ctor <- choice
|
|
|
|
> [Escape <$ keyword_ "escape"
|
|
|
|
> ,UEscape <$ keyword_ "uescape"]
|
|
|
|
> c <- anyChar
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure $ \v -> ctor v c
|
2014-04-17 23:16:24 +02:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
=== collate
|
|
|
|
|
2014-04-19 12:22:11 +02:00
|
|
|
> collateSuffix:: Parser (ValueExpr -> ValueExpr)
|
|
|
|
> collateSuffix = do
|
|
|
|
> keyword_ "collate"
|
|
|
|
> i <- names
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure $ \v -> Collate v i
|
2014-04-17 23:16:24 +02:00
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
== operators
|
2013-12-14 09:55:44 +01:00
|
|
|
|
|
|
|
The 'regular' operators in this parsing and in the abstract syntax are
|
|
|
|
unary prefix, unary postfix and binary infix operators. The operators
|
|
|
|
can be symbols (a + b), single keywords (a and b) or multiple keywords
|
|
|
|
(a is similar to b).
|
|
|
|
|
2013-12-31 11:20:07 +01:00
|
|
|
TODO: carefully review the precedences and associativities.
|
|
|
|
|
2014-04-18 13:50:54 +02:00
|
|
|
TODO: to fix the parsing completely, I think we will need to parse
|
|
|
|
without precedence and associativity and fix up afterwards, since SQL
|
|
|
|
syntax is way too messy. It might be possible to avoid this if we
|
|
|
|
wanted to avoid extensibility and to not be concerned with parse error
|
2014-04-19 12:10:46 +02:00
|
|
|
messages, but both of these are too important.
|
2014-04-18 13:50:54 +02:00
|
|
|
|
2014-06-28 14:41:11 +02:00
|
|
|
> opTable :: Bool -> [[E.Operator String ParseState Identity ValueExpr]]
|
2013-12-31 10:02:26 +01:00
|
|
|
> opTable bExpr =
|
2014-04-17 21:35:43 +02:00
|
|
|
> [-- parse match and quantified comparisons as postfix ops
|
|
|
|
> -- todo: left factor the quantified comparison with regular
|
|
|
|
> -- binary comparison, somehow
|
2014-04-19 12:22:11 +02:00
|
|
|
> [E.Postfix $ try quantifiedComparisonSuffix
|
|
|
|
> ,E.Postfix matchPredicateSuffix
|
2014-04-17 23:16:24 +02:00
|
|
|
> ]
|
2014-04-17 21:35:43 +02:00
|
|
|
> ,[binarySym "." E.AssocLeft]
|
2014-04-19 12:22:11 +02:00
|
|
|
> ,[postfix' arraySuffix
|
|
|
|
> ,postfix' escapeSuffix
|
|
|
|
> ,postfix' collateSuffix]
|
2013-12-31 10:02:26 +01:00
|
|
|
> ,[prefixSym "+", prefixSym "-"]
|
|
|
|
> ,[binarySym "^" E.AssocLeft]
|
|
|
|
> ,[binarySym "*" E.AssocLeft
|
|
|
|
> ,binarySym "/" E.AssocLeft
|
|
|
|
> ,binarySym "%" E.AssocLeft]
|
|
|
|
> ,[binarySym "+" E.AssocLeft
|
|
|
|
> ,binarySym "-" E.AssocLeft]
|
|
|
|
> ,[binarySym ">=" E.AssocNone
|
|
|
|
> ,binarySym "<=" E.AssocNone
|
|
|
|
> ,binarySym "!=" E.AssocRight
|
|
|
|
> ,binarySym "<>" E.AssocRight
|
|
|
|
> ,binarySym "||" E.AssocRight
|
|
|
|
> ,prefixSym "~"
|
|
|
|
> ,binarySym "&" E.AssocRight
|
|
|
|
> ,binarySym "|" E.AssocRight
|
|
|
|
> ,binaryKeyword "like" E.AssocNone
|
|
|
|
> ,binaryKeyword "overlaps" E.AssocNone]
|
2014-04-18 13:50:54 +02:00
|
|
|
> ++ [binaryKeywords $ makeKeywordTree
|
|
|
|
> ["not like"
|
|
|
|
> ,"is similar to"
|
|
|
|
> ,"is not similar to"
|
|
|
|
> ,"is distinct from"
|
|
|
|
> ,"is not distinct from"]
|
|
|
|
> ,postfixKeywords $ makeKeywordTree
|
|
|
|
> ["is null"
|
|
|
|
> ,"is not null"
|
|
|
|
> ,"is true"
|
|
|
|
> ,"is not true"
|
|
|
|
> ,"is false"
|
|
|
|
> ,"is not false"
|
|
|
|
> ,"is unknown"
|
|
|
|
> ,"is not unknown"]
|
|
|
|
> ]
|
2014-04-18 19:50:24 +02:00
|
|
|
> ++ [multisetBinOp]
|
2014-04-16 19:22:42 +02:00
|
|
|
> -- have to use try with inSuffix because of a conflict
|
2014-04-18 10:18:21 +02:00
|
|
|
> -- with 'in' in position function, and not between
|
2014-04-16 19:22:42 +02:00
|
|
|
> -- between also has a try in it to deal with 'not'
|
|
|
|
> -- ambiguity
|
|
|
|
> ++ [E.Postfix $ try inSuffix,E.Postfix betweenSuffix]
|
2013-12-31 10:02:26 +01:00
|
|
|
> ]
|
|
|
|
> ++
|
|
|
|
> [[binarySym "<" E.AssocNone
|
|
|
|
> ,binarySym ">" E.AssocNone]
|
|
|
|
> ,[binarySym "=" E.AssocRight]
|
|
|
|
> ,[prefixKeyword "not"]]
|
|
|
|
> ++
|
|
|
|
> if bExpr then [] else [[binaryKeyword "and" E.AssocLeft]]
|
|
|
|
> ++
|
|
|
|
> [[binaryKeyword "or" E.AssocLeft]]
|
2013-12-14 14:05:52 +01:00
|
|
|
> where
|
2014-04-16 19:22:42 +02:00
|
|
|
> binarySym nm assoc = binary (symbol_ nm) nm assoc
|
|
|
|
> binaryKeyword nm assoc = binary (keyword_ nm) nm assoc
|
2014-04-18 13:50:54 +02:00
|
|
|
> binaryKeywords p =
|
|
|
|
> E.Infix (do
|
2014-05-09 20:37:09 +02:00
|
|
|
> o <- try p
|
|
|
|
> pure (\a b -> BinOp a [Name $ unwords o] b))
|
2014-04-18 13:50:54 +02:00
|
|
|
> E.AssocNone
|
|
|
|
> postfixKeywords p =
|
|
|
|
> postfix' $ do
|
2014-04-18 16:51:57 +02:00
|
|
|
> o <- try p
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure $ PostfixOp [Name $ unwords o]
|
2013-12-31 10:02:26 +01:00
|
|
|
> binary p nm assoc =
|
2014-05-09 20:37:09 +02:00
|
|
|
> E.Infix (p >> pure (\a b -> BinOp a [Name nm] b)) assoc
|
2014-04-18 19:50:24 +02:00
|
|
|
> multisetBinOp = E.Infix (do
|
|
|
|
> keyword_ "multiset"
|
|
|
|
> o <- choice [Union <$ keyword_ "union"
|
|
|
|
> ,Intersect <$ keyword_ "intersect"
|
|
|
|
> ,Except <$ keyword_ "except"]
|
2014-05-09 22:26:18 +02:00
|
|
|
> d <- option SQDefault duplicates
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure (\a b -> MultisetBinOp a o d b))
|
2014-04-18 19:50:24 +02:00
|
|
|
> E.AssocLeft
|
2014-04-16 19:22:42 +02:00
|
|
|
> prefixKeyword nm = prefix (keyword_ nm) nm
|
|
|
|
> prefixSym nm = prefix (symbol_ nm) nm
|
2014-05-09 20:37:09 +02:00
|
|
|
> prefix p nm = prefix' (p >> pure (PrefixOp [Name nm]))
|
2014-04-16 17:58:17 +02:00
|
|
|
> -- hack from here
|
|
|
|
> -- http://stackoverflow.com/questions/10475337/parsec-expr-repeated-prefix-postfix-operator-not-supported
|
|
|
|
> -- not implemented properly yet
|
|
|
|
> -- I don't think this will be enough for all cases
|
|
|
|
> -- at least it works for 'not not a'
|
|
|
|
> -- ok: "x is not true is not true"
|
|
|
|
> -- no work: "x is not true is not null"
|
2014-05-09 20:37:09 +02:00
|
|
|
> prefix' p = E.Prefix . chainl1 p $ pure (.)
|
|
|
|
> postfix' p = E.Postfix . chainl1 p $ pure (flip (.))
|
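The effect of the prefix' hack is easiest to see side by side with a plain
E.Prefix. A minimal sketch, assuming a hypothetical two-constructor Expr and
a toy identifier parser that refuses the word 'not'; only prefix' itself
mirrors the definition above.

```haskell
import Data.Bifunctor (first)
import Data.Functor.Identity (Identity)
import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Expr as E

data Expr = Var String | Not Expr
    deriving (Eq, Show)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

keyword :: String -> Parser String
keyword k = lexeme (try (string k <* notFollowedBy alphaNum))

-- Identifiers are words other than the reserved word "not".
ident :: Parser String
ident = lexeme (try word)
  where
    word = do
        s <- many1 letter
        if s == "not" then unexpected "keyword" else pure s

notOp :: Parser (Expr -> Expr)
notOp = Not <$ keyword "not"

-- Parse one or more prefix operators and compose them into one function,
-- so that buildExpressionParser only ever sees a single prefix operator.
prefix' :: Parser (a -> a) -> E.Operator String () Identity a
prefix' p = E.Prefix (chainl1 p (pure (.)))

-- A plain E.Prefix accepts at most one operator per term.
single :: Parser Expr
single = E.buildExpressionParser [[E.Prefix notOp]] (Var <$> ident)

repeated :: Parser Expr
repeated = E.buildExpressionParser [[prefix' notOp]] (Var <$> ident)

parseWith :: Parser Expr -> String -> Either String Expr
parseWith p = first show . parse (p <* eof) ""
```

parseWith single "not not a" is a parse error, while parseWith repeated
"not not a" gives Not (Not (Var "a")).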
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
== value expression top level
|
2013-12-14 10:28:45 +01:00
|
|
|
|
2013-12-31 10:02:26 +01:00
|
|
|
This parses most of the value exprs. The order of the parsers and use
|
|
|
|
of try is carefully done to make everything work. It is a little
|
2014-05-09 20:37:09 +02:00
|
|
|
fragile and could at least do with some heavy explanation. Update: the
|
|
|
|
'try's have migrated into the individual parsers, they still need
|
|
|
|
documenting/fixing.
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> valueExpr :: Parser ValueExpr
|
2013-12-31 10:02:26 +01:00
|
|
|
> valueExpr = E.buildExpressionParser (opTable False) term
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> term :: Parser ValueExpr
|
2014-04-19 12:22:11 +02:00
|
|
|
> term = choice [simpleLiteral
|
2013-12-31 10:02:26 +01:00
|
|
|
> ,parameter
|
2014-04-19 12:10:46 +02:00
|
|
|
> ,star
|
2014-04-19 12:22:11 +02:00
|
|
|
> ,parensExpr
|
|
|
|
> ,caseExpr
|
2013-12-31 10:02:26 +01:00
|
|
|
> ,cast
|
2014-04-17 21:57:33 +02:00
|
|
|
> ,arrayCtor
|
2014-04-18 19:50:24 +02:00
|
|
|
> ,multisetCtor
|
2014-04-19 20:17:19 +02:00
|
|
|
> ,nextValueFor
|
2013-12-31 10:02:26 +01:00
|
|
|
> ,subquery
|
2014-04-19 12:22:11 +02:00
|
|
|
> ,intervalLit
|
2014-04-19 12:10:46 +02:00
|
|
|
> ,specialOpKs
|
2014-04-19 12:22:11 +02:00
|
|
|
> ,idenExpr]
|
2014-04-17 17:32:41 +02:00
|
|
|
> <?> "value expression"
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-17 16:29:49 +01:00
|
|
|
expose the b expression for window frame clause range between
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> valueExprB :: Parser ValueExpr
|
2013-12-31 10:02:26 +01:00
|
|
|
> valueExprB = E.buildExpressionParser (opTable True) term
|
2013-12-17 16:29:49 +01:00
|
|
|
|
2014-04-19 12:10:46 +02:00
|
|
|
== helper parsers
|
2014-04-19 11:47:25 +02:00
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
This is used in interval literals and in interval type names.
|
|
|
|
|
2014-04-19 11:47:25 +02:00
|
|
|
> intervalQualifier :: Parser (IntervalTypeField,Maybe IntervalTypeField)
|
|
|
|
> intervalQualifier =
|
|
|
|
> (,) <$> intervalField
|
|
|
|
> <*> optionMaybe (keyword_ "to" *> intervalField)
|
|
|
|
> where
|
|
|
|
> intervalField =
|
|
|
|
> Itf
|
|
|
|
> <$> datetimeField
|
|
|
|
> <*> optionMaybe
|
|
|
|
> (parens ((,) <$> unsignedInteger
|
|
|
|
> <*> optionMaybe (comma *> unsignedInteger)))
|
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
TODO: use datetime field in extract also
|
2014-04-19 11:47:25 +02:00
|
|
|
use a data type for the datetime field?
|
|
|
|
|
|
|
|
> datetimeField :: Parser String
|
|
|
|
> datetimeField = choice (map keyword ["year","month","day"
|
|
|
|
> ,"hour","minute","second"])
|
|
|
|
> <?> "datetime field"
|
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
This is used in multiset operations (value expr), selects (query expr)
|
|
|
|
and set operations (query expr).
|
|
|
|
|
2014-05-09 22:26:18 +02:00
|
|
|
> duplicates :: Parser SetQuantifier
|
|
|
|
> duplicates =
|
2014-04-19 11:47:25 +02:00
|
|
|
> choice [All <$ keyword_ "all"
|
|
|
|
> ,Distinct <$ keyword "distinct"]
|
|
|
|
|
2013-12-13 11:39:26 +01:00
|
|
|
-------------------------------------------------
|
|
|
|
|
|
|
|
= query expressions
|
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
== select lists
|
2013-12-13 16:27:02 +01:00
|
|
|
|
2013-12-31 10:31:00 +01:00
|
|
|
> selectItem :: Parser (ValueExpr,Maybe Name)
|
2014-04-16 19:22:42 +02:00
|
|
|
> selectItem = (,) <$> valueExpr <*> optionMaybe als
|
|
|
|
> where als = optional (keyword_ "as") *> name
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-31 10:31:00 +01:00
|
|
|
> selectList :: Parser [(ValueExpr,Maybe Name)]
|
2013-12-13 16:27:02 +01:00
|
|
|
> selectList = commaSep1 selectItem
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
== from
|
|
|
|
|
2013-12-14 12:05:02 +01:00
|
|
|
Here is the rough grammar for joins
|
2013-12-14 09:55:44 +01:00
|
|
|
|
|
|
|
tref
|
2013-12-14 12:05:02 +01:00
|
|
|
(cross | [natural] ([inner] | (left | right | full) [outer])) join
|
|
|
|
tref
|
2013-12-14 09:55:44 +01:00
|
|
|
[on expr | using (...)]
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> from :: Parser [TableRef]
|
2014-04-16 19:22:42 +02:00
|
|
|
> from = keyword_ "from" *> commaSep1 tref
|
2013-12-13 11:39:26 +01:00
|
|
|
> where
|
2014-05-07 20:53:24 +02:00
|
|
|
> -- TODO: use P (a->) for the join tref suffix
|
|
|
|
> -- chainl or buildexpressionparser
|
2013-12-14 12:05:02 +01:00
|
|
|
> tref = nonJoinTref >>= optionSuffix joinTrefSuffix
|
2014-04-17 19:46:16 +02:00
|
|
|
> nonJoinTref = choice
|
|
|
|
> [parens $ choice
|
|
|
|
> [TRQueryExpr <$> queryExpr
|
|
|
|
> ,TRParens <$> tref]
|
|
|
|
> ,TRLateral <$> (keyword_ "lateral"
|
|
|
|
> *> nonJoinTref)
|
|
|
|
> ,do
|
2014-04-18 10:43:37 +02:00
|
|
|
> n <- names
|
2014-04-17 19:46:16 +02:00
|
|
|
> choice [TRFunction n
|
|
|
|
> <$> parens (commaSep valueExpr)
|
2014-05-09 20:37:09 +02:00
|
|
|
> ,pure $ TRSimple n]] <??> aliasSuffix
|
2014-05-12 21:06:29 +02:00
|
|
|
> aliasSuffix = fromAlias <$$> TRAlias
|
2014-04-19 10:18:29 +02:00
|
|
|
> joinTrefSuffix t =
|
|
|
|
> (TRJoin t <$> option False (True <$ keyword_ "natural")
|
|
|
|
> <*> joinType
|
2013-12-14 13:10:46 +01:00
|
|
|
> <*> nonJoinTref
|
2014-04-19 10:18:29 +02:00
|
|
|
> <*> optionMaybe joinCondition)
|
2013-12-14 12:05:02 +01:00
|
|
|
> >>= optionSuffix joinTrefSuffix
|
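The tref / joinTrefSuffix recursion builds left-nested joins by threading the
table ref parsed so far into the suffix parser. A small sketch of the
pattern, assuming that optionSuffix is defined as 'option a (p a)' (its
definition is not shown in this section), and using a hypothetical TRef type
with bare table names and a single 'join' keyword.

```haskell
import Data.Bifunctor (first)
import Text.Parsec
import Text.Parsec.String (Parser)

data TRef = Tbl String | Join TRef TRef
    deriving (Eq, Show)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

keyword :: String -> Parser String
keyword k = lexeme (try (string k <* notFollowedBy alphaNum))

-- Table names are words other than the reserved word "join".
ident :: Parser String
ident = lexeme (try word)
  where
    word = do
        s <- many1 letter
        if s == "join" then unexpected "keyword" else pure s

-- Assumed definition: try the suffix parser on the value parsed so far,
-- and fall back to that value unchanged if the suffix does not apply.
optionSuffix :: (a -> Parser a) -> a -> Parser a
optionSuffix p a = option a (p a)

tref :: Parser TRef
tref = nonJoin >>= optionSuffix joinSuffix
  where
    nonJoin = Tbl <$> ident
    -- each join wraps everything parsed so far as its left operand
    joinSuffix t =
        (Join t <$> (keyword "join" *> nonJoin)) >>= optionSuffix joinSuffix

parseTRef :: String -> Either String TRef
parseTRef = first show . parse (tref <* eof) ""
```

parseTRef "a join b join c" yields Join (Join (Tbl "a") (Tbl "b")) (Tbl "c"),
that is, the joins associate to the left.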
2014-04-16 19:22:42 +02:00
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
TODO: factor the join stuff to produce better error messages (and make
|
|
|
|
it more readable)
|
2014-04-18 11:28:05 +02:00
|
|
|
|
2014-04-16 19:22:42 +02:00
|
|
|
> joinType :: Parser JoinType
|
|
|
|
> joinType = choice
|
|
|
|
> [JCross <$ keyword_ "cross" <* keyword_ "join"
|
|
|
|
> ,JInner <$ keyword_ "inner" <* keyword_ "join"
|
|
|
|
> ,JLeft <$ keyword_ "left"
|
|
|
|
> <* optional (keyword_ "outer")
|
|
|
|
> <* keyword_ "join"
|
|
|
|
> ,JRight <$ keyword_ "right"
|
|
|
|
> <* optional (keyword_ "outer")
|
|
|
|
> <* keyword_ "join"
|
|
|
|
> ,JFull <$ keyword_ "full"
|
|
|
|
> <* optional (keyword_ "outer")
|
|
|
|
> <* keyword_ "join"
|
|
|
|
> ,JInner <$ keyword_ "join"]
|
|
|
|
|
2014-04-19 10:18:29 +02:00
|
|
|
> joinCondition :: Parser JoinCondition
|
2014-04-19 14:10:45 +02:00
|
|
|
> joinCondition = choice
|
|
|
|
> [keyword_ "on" >> JoinOn <$> valueExpr
|
|
|
|
> ,keyword_ "using" >> JoinUsing <$> parens (commaSep1 name)]
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-04-19 14:10:45 +02:00
|
|
|
> fromAlias :: Parser Alias
|
|
|
|
> fromAlias = Alias <$> tableAlias <*> columnAliases
|
2013-12-17 12:41:06 +01:00
|
|
|
> where
|
2014-04-16 19:22:42 +02:00
|
|
|
> tableAlias = optional (keyword_ "as") *> name
|
|
|
|
> columnAliases = optionMaybe $ parens $ commaSep1 name
|
2013-12-17 12:41:06 +01:00
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
== simple other parts
|
|
|
|
|
|
|
|
Parsers for where, group by, having, order by and limit, which are
|
|
|
|
pretty trivial.
|
|
|
|
|
2014-04-16 19:22:42 +02:00
|
|
|
> whereClause :: Parser ValueExpr
|
|
|
|
> whereClause = keyword_ "where" *> valueExpr
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-04-16 19:22:42 +02:00
|
|
|
> groupByClause :: Parser [GroupingExpr]
|
2014-04-19 14:10:45 +02:00
|
|
|
> groupByClause = keywords_ ["group","by"] *> commaSep1 groupingExpression
|
2013-12-17 18:27:09 +01:00
|
|
|
> where
|
2014-04-19 14:10:45 +02:00
|
|
|
> groupingExpression = choice
|
2014-04-16 19:22:42 +02:00
|
|
|
> [keyword_ "cube" >>
|
2013-12-17 18:27:09 +01:00
|
|
|
> Cube <$> parens (commaSep groupingExpression)
|
2014-04-16 19:22:42 +02:00
|
|
|
> ,keyword_ "rollup" >>
|
2013-12-17 18:27:09 +01:00
|
|
|
> Rollup <$> parens (commaSep groupingExpression)
|
|
|
|
> ,GroupingParens <$> parens (commaSep groupingExpression)
|
2014-04-18 11:28:05 +02:00
|
|
|
> ,keywords_ ["grouping", "sets"] >>
|
2013-12-17 18:27:09 +01:00
|
|
|
> GroupingSets <$> parens (commaSep groupingExpression)
|
2013-12-19 10:46:51 +01:00
|
|
|
> ,SimpleGroup <$> valueExpr
|
2013-12-17 18:27:09 +01:00
|
|
|
> ]
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> having :: Parser ValueExpr
|
2014-04-16 19:22:42 +02:00
|
|
|
> having = keyword_ "having" *> valueExpr
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> orderBy :: Parser [SortSpec]
|
2014-04-18 11:28:05 +02:00
|
|
|
> orderBy = keywords_ ["order","by"] *> commaSep1 ob
|
2013-12-13 16:08:10 +01:00
|
|
|
> where
|
2013-12-18 15:27:06 +01:00
|
|
|
> ob = SortSpec
|
2013-12-19 10:46:51 +01:00
|
|
|
> <$> valueExpr
|
2014-04-18 10:18:21 +02:00
|
|
|
> <*> option DirDefault (choice [Asc <$ keyword_ "asc"
|
|
|
|
> ,Desc <$ keyword_ "desc"])
|
2013-12-17 17:28:31 +01:00
|
|
|
> <*> option NullsOrderDefault
|
2014-04-18 11:28:05 +02:00
|
|
|
> -- todo: left factor better
|
2014-04-16 19:22:42 +02:00
|
|
|
> (keyword_ "nulls" >>
|
2013-12-17 17:28:31 +01:00
|
|
|
> choice [NullsFirst <$ keyword "first"
|
2014-04-16 19:22:42 +02:00
|
|
|
> ,NullsLast <$ keyword "last"])
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2013-12-17 15:00:17 +01:00
|
|
|
Allows offset and fetch in either order. Also accepts the postgresql
variants: offset without row(s), and limit instead of fetch.
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> offsetFetch :: Parser (Maybe ValueExpr, Maybe ValueExpr)
|
2013-12-17 15:00:17 +01:00
|
|
|
> offsetFetch = permute ((,) <$?> (Nothing, Just <$> offset)
|
|
|
|
> <|?> (Nothing, Just <$> fetch))
|
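The 'either order' behaviour comes from Parsec's permutation parsers. A
minimal sketch with two optional clauses, assuming plain integer arguments
and a 'limit' keyword in place of the real offset and fetch parsers:

```haskell
import Data.Bifunctor (first)
import Text.Parsec
import Text.Parsec.Perm (permute, (<$?>), (<|?>))
import Text.Parsec.String (Parser)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- try matters here: each branch must fail without consuming input
-- so that the permutation parser can move on to the other branch.
keyword :: String -> Parser String
keyword k = lexeme (try (string k <* notFollowedBy alphaNum))

number :: Parser Integer
number = read <$> lexeme (many1 digit)

-- Both clauses are optional (default Nothing) and may come in either
-- order; the result tuple is always (offset, limit).
offsetLimit :: Parser (Maybe Integer, Maybe Integer)
offsetLimit = permute ((,) <$?> (Nothing, Just <$> offsetClause)
                           <|?> (Nothing, Just <$> limitClause))
  where
    offsetClause = keyword "offset" *> number
    limitClause  = keyword "limit" *> number

parseOffsetLimit :: String -> Either String (Maybe Integer, Maybe Integer)
parseOffsetLimit = first show . parse (offsetLimit <* eof) ""
```

Both "offset 2 limit 5" and "limit 5 offset 2" give (Just 2, Just 5), and a
clause that is left out comes back as Nothing.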
2013-12-13 16:27:02 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> offset :: Parser ValueExpr
|
2014-04-16 19:22:42 +02:00
|
|
|
> offset = keyword_ "offset" *> valueExpr
|
|
|
|
> <* option () (choice [keyword_ "rows"
|
|
|
|
> ,keyword_ "row"])
|
2013-12-17 15:00:17 +01:00
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> fetch :: Parser ValueExpr
|
2014-06-27 11:19:15 +02:00
|
|
|
> fetch = fetchFirst <|> limit
|
2014-04-18 22:51:05 +02:00
|
|
|
> where
|
2014-06-28 14:41:11 +02:00
|
|
|
> fetchFirst = guardDialect [SQL2011]
|
|
|
|
> *> fs *> valueExpr <* ro
|
2014-04-18 13:50:54 +02:00
|
|
|
> fs = makeKeywordTree ["fetch first", "fetch next"]
|
|
|
|
> ro = makeKeywordTree ["rows only", "row only"]
|
2014-06-27 11:19:15 +02:00
|
|
|
> -- todo: not in ansi sql dialect
|
2014-06-28 14:41:11 +02:00
|
|
|
> limit = guardDialect [MySQL] *>
|
|
|
|
> keyword_ "limit" *> valueExpr
|
2013-12-13 16:27:02 +01:00
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
== common table expressions
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> with :: Parser QueryExpr
|
2014-04-16 19:22:42 +02:00
|
|
|
> with = keyword_ "with" >>
|
|
|
|
> With <$> option False (True <$ keyword_ "recursive")
|
2013-12-17 12:41:06 +01:00
|
|
|
> <*> commaSep1 withQuery <*> queryExpr
|
2013-12-13 23:58:12 +01:00
|
|
|
> where
|
2014-04-19 14:10:45 +02:00
|
|
|
> withQuery = (,) <$> (fromAlias <* keyword_ "as")
|
|
|
|
> <*> parens queryExpr
|
2013-12-13 16:27:02 +01:00
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
== query expression
|
|
|
|
|
|
|
|
This parser parses any query expression variant: normal select, cte,
|
|
|
|
and union, etc.
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> queryExpr :: Parser QueryExpr
|
2014-04-19 14:10:45 +02:00
|
|
|
> queryExpr = choice
|
|
|
|
> [with
|
2014-05-07 20:53:24 +02:00
|
|
|
> ,chainr1 (choice [values,table, select]) setOp]
|
2013-12-13 23:58:12 +01:00
|
|
|
> where
|
2014-04-16 19:22:42 +02:00
|
|
|
> select = keyword_ "select" >>
|
2013-12-17 15:00:17 +01:00
|
|
|
> mkSelect
|
2014-05-09 22:26:18 +02:00
|
|
|
> <$> option SQDefault duplicates
|
2013-12-13 23:58:12 +01:00
|
|
|
> <*> selectList
|
2014-04-17 17:32:41 +02:00
|
|
|
> <*> optionMaybe tableExpression
|
|
|
|
> mkSelect d sl Nothing =
|
|
|
|
> makeSelect{qeSetQuantifier = d, qeSelectList = sl}
|
|
|
|
> mkSelect d sl (Just (TableExpression f w g h od ofs fe)) =
|
|
|
|
> Select d sl f w g h od ofs fe
|
|
|
|
> values = keyword_ "values"
|
|
|
|
> >> Values <$> commaSep (parens (commaSep valueExpr))
|
2014-04-18 10:43:37 +02:00
|
|
|
> table = keyword_ "table" >> Table <$> names
|
2014-04-17 17:32:41 +02:00
|
|
|
|
|
|
|
local data type to help with parsing the bit after the select list,
|
|
|
|
called 'table expression' in the ansi sql grammar. Maybe this should
|
|
|
|
be in the public syntax?
|
|
|
|
|
|
|
|
> data TableExpression
|
|
|
|
> = TableExpression
|
|
|
|
> {_teFrom :: [TableRef]
|
|
|
|
> ,_teWhere :: Maybe ValueExpr
|
|
|
|
> ,_teGroupBy :: [GroupingExpr]
|
|
|
|
> ,_teHaving :: Maybe ValueExpr
|
|
|
|
> ,_teOrderBy :: [SortSpec]
|
|
|
|
> ,_teOffset :: Maybe ValueExpr
|
|
|
|
> ,_teFetchFirst :: Maybe ValueExpr}
|
|
|
|
|
|
|
|
> tableExpression :: Parser TableExpression
|
2014-04-19 14:10:45 +02:00
|
|
|
> tableExpression = mkTe <$> from
|
|
|
|
> <*> optionMaybe whereClause
|
|
|
|
> <*> option [] groupByClause
|
|
|
|
> <*> optionMaybe having
|
|
|
|
> <*> option [] orderBy
|
|
|
|
> <*> offsetFetch
|
2014-04-17 17:32:41 +02:00
|
|
|
> where
|
|
|
|
> mkTe f w g h od (ofs,fe) =
|
|
|
|
> TableExpression f w g h od ofs fe
|
|
|
|
> setOp :: Parser (QueryExpr -> QueryExpr -> QueryExpr)
|
|
|
|
> setOp = cq
|
|
|
|
> <$> setOpK
|
2014-05-09 22:26:18 +02:00
|
|
|
> <*> option SQDefault duplicates
|
2014-05-07 20:53:24 +02:00
|
|
|
> <*> corr
|
2014-04-19 14:10:45 +02:00
|
|
|
> where
|
2014-05-07 20:53:24 +02:00
|
|
|
> cq o d c q0 q1 = CombineQueryExpr q0 o d c q1
|
|
|
|
> setOpK = choice [Union <$ keyword_ "union"
|
|
|
|
> ,Intersect <$ keyword_ "intersect"
|
|
|
|
> ,Except <$ keyword_ "except"]
|
2014-04-19 14:10:45 +02:00
|
|
|
> <?> "set operator"
|
|
|
|
> corr = option Respectively (Corresponding <$ keyword_ "corresponding")
|
|
|
|
|
|
|
|
wrapper for query expr which ignores optional trailing semicolon.
|
|
|
|
|
2014-05-07 20:53:24 +02:00
|
|
|
TODO: change style
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> topLevelQueryExpr :: Parser QueryExpr
|
2014-05-07 20:53:24 +02:00
|
|
|
> topLevelQueryExpr = queryExpr <??> (id <$ semi)
|
2013-12-14 09:55:44 +01:00
|
|
|
|
|
|
|
wrapper to parse a series of query exprs from a single source. They
|
|
|
|
must be separated by semicolon, but for the last expression, the
|
|
|
|
trailing semicolon is optional.
|
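The separator rule above can be sketched in isolation. This is only an
illustration, assuming ReadP from base as a stand-in for the Parsec
combinators used in this file; item, semiTok, items and parseItems are
made-up names, and a bare word stands in for a whole query expression.

```haskell
import Data.Char (isAlpha)
import Text.ParserCombinators.ReadP
    (ReadP, (<++), char, eof, munch1, readP_to_S, skipSpaces)

-- | Stand-in for a query expression: a single word, for illustration only.
item :: ReadP String
item = munch1 isAlpha <* skipSpaces

-- | A semicolon followed by optional whitespace.
semiTok :: ReadP ()
semiTok = char ';' *> skipSpaces

-- | One or more items separated by semicolons. After a semicolon we either
-- read more items or stop, which makes the final semicolon optional.
items :: ReadP [String]
items = do
    x <- item
    rest <- (semiTok *> (items <++ pure [])) <++ pure []
    pure (x : rest)

-- | Parse a whole input; Nothing if anything is left over.
parseItems :: String -> Maybe [String]
parseItems s = case readP_to_S (skipSpaces *> items <* eof) s of
    [(xs, "")] -> Just xs
    _ -> Nothing
```

With these definitions parseItems "a;b" and parseItems "a; b;" both give
Just ["a","b"], while parseItems "a b" gives Nothing because the
separator is missing.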
|
|
|
|
2014-05-07 20:53:24 +02:00
|
|
|
TODO: change style
|
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> queryExprs :: Parser [QueryExpr]
|
2014-04-19 14:10:45 +02:00
|
|
|
> queryExprs = (:[]) <$> queryExpr
|
2014-05-09 20:37:09 +02:00
|
|
|
> >>= optionSuffix ((semi *>) . pure)
|
2014-04-19 14:10:45 +02:00
|
|
|
> >>= optionSuffix (\p -> (p++) <$> queryExprs)
|
|
|
|
----------------------------------------------
|
|
|
|
|
|
|
|
= multi keyword helper
|
|
|
|
|
|
|
|
This helper parses one of several multi-keyword sequences that share
a common prefix, e.g. 'is null' and 'is not null'.
|
|
|
|
|
|
|
|
use to left factor/ improve:
|
|
|
|
typed literal and general identifiers
|
|
|
|
not like, not in, not between operators
|
|
|
|
help with factoring keyword functions and other app-likes
|
|
|
|
the join keyword sequences
|
|
|
|
fetch first/next
|
|
|
|
row/rows only
|
|
|
|
|
|
|
|
There is probably a simpler way of doing this but I am a bit
|
|
|
|
thick.
|
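To make the grouping idea concrete, here is a small sketch that builds the
shared-prefix tree as plain data instead of as a parser. It is not the
code used in this file: KeywordTree and buildForest are hypothetical
names, and the Bool flag marks the places where a sequence may stop.

```haskell
import Data.Function (on)
import Data.List (groupBy, sort)

-- | A hypothetical explicit tree: a keyword, a flag saying whether a
-- sequence may end at this keyword, and the possible continuations.
data KeywordTree = Node String Bool [KeywordTree]
    deriving (Eq, Show)

-- | Build the forest for a set of space-separated keyword sequences.
-- Sequences are sorted first so that equal first words end up adjacent.
buildForest :: [String] -> [KeywordTree]
buildForest = go . sort . map words
  where
    go :: [[String]] -> [KeywordTree]
    go seqs =
        [ Node k (any null rests) (go (filter (not . null) rests))
        | grp@((k:_):_) <- groupBy ((==) `on` take 1) seqs
        , let rests = map (drop 1) grp
        ]
```

For example, buildForest ["is null", "is not null"] yields a single "is"
node with two children, "not" (followed by "null") and "null", so the
shared "is" only has to be matched once.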
|
|
|
|
|
|
|
> makeKeywordTree :: [String] -> Parser [String]
|
|
|
|
> makeKeywordTree sets =
|
|
|
|
> parseTrees (sort $ map words sets)
|
|
|
|
> where
|
|
|
|
> parseTrees :: [[String]] -> Parser [String]
|
|
|
|
> parseTrees ws = do
|
|
|
|
> let gs :: [[[String]]]
|
|
|
|
> gs = groupBy ((==) `on` safeHead) ws
|
|
|
|
> choice $ map parseGroup gs
|
|
|
|
> parseGroup :: [[String]] -> Parser [String]
|
|
|
|
> parseGroup l@((k:_):_) = do
|
|
|
|
> keyword_ k
|
|
|
|
> let tls = catMaybes $ map safeTail l
|
|
|
|
> pr = (k:) <$> parseTrees tls
|
|
|
|
> if (or $ map null tls)
|
2014-05-09 20:37:09 +02:00
|
|
|
> then pr <|> pure [k]
|
2014-04-19 11:47:25 +02:00
|
|
|
> else pr
|
|
|
|
> parseGroup _ = guard False >> error "impossible"
|
|
|
|
> safeHead (x:_) = Just x
|
|
|
|
> safeHead [] = Nothing
|
|
|
|
> safeTail (_:x) = Just x
|
|
|
|
> safeTail [] = Nothing
|
|
|
|
|
2013-12-13 11:39:26 +01:00
|
|
|
------------------------------------------------
|
|
|
|
|
2013-12-14 09:55:44 +01:00
|
|
|
= lexing parsers
|
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
whitespace parser which skips comments also
|
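A rough model of what gets skipped, written as a plain function on
String rather than as a Parsec parser. skipBlanks is a made-up name; the
sketch assumes the same three cases as the parser below (simple
whitespace, "--" line comments, non-nesting block comments) and treats an
unclosed block comment as a failure.

```haskell
import Data.List (isPrefixOf)

-- | Drop leading whitespace, line comments and block comments.
-- Returns Nothing when a block comment is never closed.
skipBlanks :: String -> Maybe String
skipBlanks s
    | "--" `isPrefixOf` s = skipBlanks (drop 1 (dropWhile (/= '\n') s))
    | "/*" `isPrefixOf` s = closeBlock (drop 2 s) >>= skipBlanks
    | (c:rest) <- s, c `elem` " \t\n" = skipBlanks rest
    | otherwise = Just s
  where
    -- scan forward to the first "*/"; block comments do not nest
    closeBlock t
        | "*/" `isPrefixOf` t = Just (drop 2 t)
        | null t = Nothing
        | otherwise = closeBlock (drop 1 t)
```

So skipBlanks "  -- hi\nselect" gives Just "select", and a line comment
that runs to the end of the input is accepted just like one that ends
with a newline.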
2013-12-16 12:33:05 +01:00
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
> whitespace :: Parser ()
|
|
|
|
> whitespace =
|
|
|
|
> choice [simpleWhitespace *> whitespace
|
|
|
|
> ,lineComment *> whitespace
|
|
|
|
> ,blockComment *> whitespace
|
2014-05-09 20:37:09 +02:00
|
|
|
> ,pure ()] <?> "whitespace"
|
2013-12-14 09:55:44 +01:00
|
|
|
> where
|
2014-04-16 17:58:17 +02:00
|
|
|
> lineComment = try (string "--")
|
|
|
|
> *> manyTill anyChar (void (char '\n') <|> eof)
|
|
|
|
> blockComment = -- no nesting of block comments in SQL
|
|
|
|
> try (string "/*")
|
|
|
|
> -- try used here so it doesn't fail when we see a
|
|
|
|
> -- '*' which isn't followed by a '/'
|
|
|
|
> *> manyTill anyChar (try $ string "*/")
|
|
|
|
> -- use many1 so we can more easily avoid non terminating loops
|
|
|
|
> simpleWhitespace = void $ many1 (oneOf " \t\n")
|
2013-12-17 12:21:36 +01:00
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
> lexeme :: Parser a -> Parser a
|
|
|
|
> lexeme p = p <* whitespace
|
2013-12-17 12:21:36 +01:00
|
|
|
|
2014-04-18 16:51:57 +02:00
|
|
|
> unsignedInteger :: Parser Integer
|
|
|
|
> unsignedInteger = read <$> lexeme (many1 digit) <?> "integer"
|
2013-12-14 09:55:44 +01:00
|
|
|
|
|
|
|
|
|
|
|
number literals
|
|
|
|
|
|
|
|
here is the rough grammar target:
|
|
|
|
|
|
|
|
digits
|
|
|
|
digits.[digits][e[+-]digits]
|
|
|
|
[digits].digits[e[+-]digits]
|
|
|
|
digitse[+-]digits
|
|
|
|
|
2013-12-14 16:09:45 +01:00
|
|
|
numbers are parsed to strings, not to a numeric type. This is to avoid
|
2013-12-14 09:55:44 +01:00
|
|
|
making a decision on how to represent numbers; the client code can
|
|
|
|
make this choice.
|
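The grammar target can be tried out on its own with a small recogniser.
This is a sketch only, assuming ReadP from base rather than the Parsec
code below; numberText, orElse and parseNumber are hypothetical names. As
described above, the result is the literal's text, not a numeric value.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP
    (ReadP, (<++), char, eof, munch, munch1, readP_to_S, satisfy)

-- | Left-biased optional part: try the parser, else return the default.
orElse :: ReadP a -> a -> ReadP a
orElse p def = p <++ pure def

-- | Recognise a number literal and return its text unchanged.
numberText :: ReadP String
numberText = (++) <$> mantissa <*> (expo `orElse` "")
  where
    digits = munch1 isDigit
    mantissa = leadingDigits <++ leadingDot
    -- digits, optionally followed by '.' and optional digits
    leadingDigits =
        (++) <$> digits <*> (((:) <$> char '.' <*> munch isDigit) `orElse` "")
    -- '.' followed by at least one digit
    leadingDot = (:) <$> char '.' <*> digits
    -- e or E, an optional sign, then digits
    expo = do
        e <- satisfy (`elem` "eE")
        sign <- ((: []) <$> satisfy (`elem` "+-")) `orElse` ""
        ds <- digits
        pure (e : sign ++ ds)

-- | Accept the input only if all of it is one number literal.
parseNumber :: String -> Maybe String
parseNumber s = case readP_to_S (numberText <* eof) s of
    [(txt, "")] -> Just txt
    _ -> Nothing
```

Here parseNumber "1.5e-3" gives Just "1.5e-3" and parseNumber ".5" gives
Just ".5", while a lone "." or a dangling exponent such as "1e" is
rejected.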
|
|
|
|
2013-12-31 10:21:03 +01:00
|
|
|
> numberLiteral :: Parser String
|
2014-05-07 20:53:24 +02:00
|
|
|
> numberLiteral = lexeme (
|
2014-05-09 20:37:09 +02:00
|
|
|
> (int <??> (pp dot <??.> pp int)
|
|
|
|
> <|> (++) <$> dot <*> int)
|
|
|
|
> <??> pp expon)
|
2013-12-14 09:55:44 +01:00
|
|
|
> where
|
|
|
|
> int = many1 digit
|
2014-05-07 20:53:24 +02:00
|
|
|
> dot = string "."
|
|
|
|
> expon = (:) <$> oneOf "eE" <*> sInt
|
|
|
|
> sInt = (++) <$> option "" (string "+" <|> string "-") <*> int
|
2014-05-12 21:06:29 +02:00
|
|
|
> pp = (<$$> (++))
|
|
|
|
> identifier :: Parser String
|
|
|
|
> identifier = lexeme ((:) <$> firstChar <*> many nonFirstChar)
|
2014-04-17 17:32:41 +02:00
|
|
|
> <?> "identifier"
|
2014-04-16 17:58:17 +02:00
|
|
|
> where
|
2014-04-18 11:28:05 +02:00
|
|
|
> firstChar = letter <|> char '_' <?> "identifier"
|
|
|
|
> nonFirstChar = digit <|> firstChar <?> ""
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
> quotedIdentifier :: Parser String
|
2014-04-18 20:09:46 +02:00
|
|
|
> quotedIdentifier = quotedIdenHelper
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-04-18 20:09:46 +02:00
|
|
|
> quotedIdenHelper :: Parser String
|
|
|
|
> quotedIdenHelper =
|
|
|
|
> lexeme (dq *> manyTill anyChar dq >>= optionSuffix moreIden)
|
|
|
|
> <?> "identifier"
|
|
|
|
> where
|
|
|
|
> moreIden s0 = do
|
|
|
|
> void dq
|
|
|
|
> s <- manyTill anyChar dq
|
|
|
|
> optionSuffix moreIden (s0 ++ "\"" ++ s)
|
|
|
|
> dq = char '"' <?> "double quote"
|
|
|
|
|
|
|
|
> uquotedIdentifier :: Parser String
|
|
|
|
> uquotedIdentifier =
|
|
|
|
> try (string "u&" <|> string "U&") *> quotedIdenHelper
|
|
|
|
> <?> "identifier"
|
2014-04-16 17:58:17 +02:00
|
|
|
|
2014-04-17 18:27:18 +02:00
|
|
|
parses an identifier with a : prefix. The : isn't included in the
|
|
|
|
return value.
|
|
|
|
|
|
|
|
> hostParameterToken :: Parser String
|
|
|
|
> hostParameterToken = lexeme $ char ':' *> identifier
|
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
todo: work out the symbol parsing better
|
|
|
|
|
|
|
|
> symbol :: String -> Parser String
|
2014-04-17 17:32:41 +02:00
|
|
|
> symbol s = try (lexeme $ do
|
|
|
|
> u <- choice (many1 (char '.') :
|
|
|
|
> map (try . string) [">=","<=","!=","<>","||"]
|
|
|
|
> ++ map (string . (:[])) "+-^*/%~&|<>=")
|
2014-04-16 17:58:17 +02:00
|
|
|
> guard (s == u)
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure s)
|
2014-04-17 17:32:41 +02:00
|
|
|
> <?> s
|
2014-04-16 17:58:17 +02:00
|
|
|
|
|
|
|
> questionMark :: Parser Char
|
2014-04-18 11:28:05 +02:00
|
|
|
> questionMark = lexeme (char '?') <?> "question mark"
|
2014-04-16 17:58:17 +02:00
|
|
|
|
|
|
|
> openParen :: Parser Char
|
|
|
|
> openParen = lexeme $ char '('
|
|
|
|
|
|
|
|
> closeParen :: Parser Char
|
|
|
|
> closeParen = lexeme $ char ')'
|
|
|
|
|
2014-04-17 21:57:33 +02:00
|
|
|
> openBracket :: Parser Char
|
|
|
|
> openBracket = lexeme $ char '['
|
|
|
|
|
|
|
|
> closeBracket :: Parser Char
|
|
|
|
> closeBracket = lexeme $ char ']'
|
|
|
|
|
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
> comma :: Parser Char
|
2014-04-18 11:28:05 +02:00
|
|
|
> comma = lexeme (char ',') <?> "comma"
|
2014-04-16 17:58:17 +02:00
|
|
|
|
|
|
|
> semi :: Parser Char
|
2014-04-18 11:28:05 +02:00
|
|
|
> semi = lexeme (char ';') <?> "semicolon"
|
2014-04-16 17:58:17 +02:00
|
|
|
|
2014-04-18 11:28:05 +02:00
|
|
|
> quote :: Parser Char
|
|
|
|
> quote = lexeme (char '\'') <?> "single quote"
|
2014-04-16 17:58:17 +02:00
|
|
|
|
|
|
|
> --stringToken :: Parser String
|
|
|
|
> --stringToken = lexeme (char '\'' *> manyTill anyChar (char '\''))
|
|
|
|
> -- todo: tidy this up, add the prefixes stuff, and add the multiple
|
|
|
|
> -- string stuff
|
|
|
|
> stringToken :: Parser String
|
|
|
|
> stringToken =
|
2014-04-18 11:28:05 +02:00
|
|
|
> lexeme (nlquote *> manyTill anyChar nlquote
|
2014-04-16 17:58:17 +02:00
|
|
|
> >>= optionSuffix moreString)
|
2014-04-17 17:32:41 +02:00
|
|
|
> <?> "string"
|
2013-12-13 11:39:26 +01:00
|
|
|
> where
|
2014-04-17 19:46:16 +02:00
|
|
|
> moreString s0 = choice
|
2014-04-17 18:19:41 +02:00
|
|
|
> [-- handle two adjacent quotes
|
|
|
|
> do
|
2014-04-18 11:28:05 +02:00
|
|
|
> void nlquote
|
|
|
|
> s <- manyTill anyChar nlquote
|
2014-04-17 18:19:41 +02:00
|
|
|
> optionSuffix moreString (s0 ++ "'" ++ s)
|
|
|
|
> ,-- handle string in separate parts
|
|
|
|
> -- e.g. 'part 1' 'part 2'
|
2014-04-18 22:51:05 +02:00
|
|
|
> do --can this whitespace be factored out?
|
2014-04-19 14:10:45 +02:00
|
|
|
> -- since it will be parsed twice when there is no more literal
|
|
|
|
> -- yes: split the adjacent quote and multiline literal
|
|
|
|
> -- into two different suffixes
|
|
|
|
> -- won't need to call lexeme at the top level anymore after this
|
2014-04-18 11:28:05 +02:00
|
|
|
> try (whitespace <* nlquote)
|
|
|
|
> s <- manyTill anyChar nlquote
|
2014-04-17 18:19:41 +02:00
|
|
|
> optionSuffix moreString (s0 ++ s)
|
|
|
|
> ]
|
2014-04-18 11:28:05 +02:00
|
|
|
> -- non lexeme quote
|
|
|
|
> nlquote = char '\'' <?> "single quote"
|
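The two suffix cases handled by stringToken (a doubled quote inside a
literal, and a literal written as several adjacent parts) can be modelled
in a few lines. This is a sketch under assumptions, using ReadP from base
instead of Parsec; chunk, manyL, stringLiteral and parseString are
made-up names, and the parts are joined with no separator, as in the
code above.

```haskell
import Text.ParserCombinators.ReadP
    (ReadP, (<++), char, eof, munch, readP_to_S, skipSpaces)

-- | One quoted chunk, with '' inside the quotes read as a single quote.
chunk :: ReadP String
chunk = char '\'' *> body
  where
    body = do
        s <- munch (/= '\'')
        _ <- char '\''
        -- a second quote immediately after means an escaped quote
        more <- ((:) <$> char '\'' <*> body) <++ pure ""
        pure (s ++ more)

-- | Left-biased repetition: keep going for as long as the parser succeeds.
manyL :: ReadP a -> ReadP [a]
manyL p = ((:) <$> p <*> manyL p) <++ pure []

-- | A literal made of one or more chunks separated only by whitespace.
stringLiteral :: ReadP String
stringLiteral = do
    first <- chunk
    rest <- manyL (skipSpaces *> chunk)
    pure (concat (first : rest))

-- | Accept the input only if all of it is one string literal.
parseString :: String -> Maybe String
parseString s = case readP_to_S (stringLiteral <* eof) s of
    [(txt, "")] -> Just txt
    _ -> Nothing
```

With this, parseString "'it''s'" gives Just "it's" and
parseString "'part 1' 'part 2'" gives Just "part 1part 2".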
|
|
|
= helper functions
|
|
|
|
|
|
|
|
> keyword :: String -> Parser String
|
2014-04-17 17:32:41 +02:00
|
|
|
> keyword k = try (do
|
2014-04-16 17:58:17 +02:00
|
|
|
> i <- identifier
|
|
|
|
> guard (map toLower i == k)
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure k) <?> k
|
2014-04-16 17:58:17 +02:00
|
|
|
|
2014-04-18 11:28:05 +02:00
|
|
|
helper function to improve error messages
|
|
|
|
|
|
|
|
> keywords_ :: [String] -> Parser ()
|
|
|
|
> keywords_ ks = mapM_ keyword_ ks <?> intercalate " " ks
|
|
|
|
|
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
> parens :: Parser a -> Parser a
|
|
|
|
> parens = between openParen closeParen
|
|
|
|
|
2014-04-17 21:57:33 +02:00
|
|
|
> brackets :: Parser a -> Parser a
|
|
|
|
> brackets = between openBracket closeBracket
|
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
> commaSep :: Parser a -> Parser [a]
|
|
|
|
> commaSep = (`sepBy` comma)
|
|
|
|
|
|
|
|
> keyword_ :: String -> Parser ()
|
|
|
|
> keyword_ = void . keyword
|
|
|
|
|
|
|
|
> symbol_ :: String -> Parser ()
|
|
|
|
> symbol_ = void . symbol
|
|
|
|
|
|
|
|
> commaSep1 :: Parser a -> Parser [a]
|
|
|
|
> commaSep1 = (`sepBy1` comma)
|
2013-12-14 09:55:44 +01:00
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
> identifierBlacklist :: [String] -> Parser String
|
2014-04-17 17:32:41 +02:00
|
|
|
> identifierBlacklist bl = try (do
|
2014-04-16 17:58:17 +02:00
|
|
|
> i <- identifier
|
2014-04-18 09:47:39 +02:00
|
|
|
> when (map toLower i `elem` bl) $
|
|
|
|
> fail $ "keyword not allowed here: " ++ i
|
2014-05-09 20:37:09 +02:00
|
|
|
> pure i)
|
2014-04-17 17:32:41 +02:00
|
|
|
> <?> "identifier"
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-06-28 14:43:30 +02:00
|
|
|
> blacklist :: Dialect -> [String]
|
2014-06-28 14:41:11 +02:00
|
|
|
> blacklist = reservedWord
|
2013-12-13 11:39:26 +01:00
|
|
|
|
2014-04-16 17:58:17 +02:00
|
|
|
These blacklisted names are mostly needed when we parse something with
|
|
|
|
an optional alias, e.g. select a a from t. If we write select a from
|
|
|
|
t, we have to make sure the from isn't parsed as an alias. I'm not
|
|
|
|
sure what other places strictly need the blacklist, and in theory it
|
|
|
|
could be tuned differently for each place the identifierString/
|
2014-04-19 14:10:45 +02:00
|
|
|
identifier parsers are used to only blacklist the bare
|
|
|
|
minimum. Something like this might be needed for dialect support, even
|
|
|
|
if it is pretty silly to use a keyword as an unquoted identifier when
|
|
|
|
there is an effing quoting syntax as well.
|
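The alias problem is easy to reproduce in miniature. The following is a
sketch, not this module's code: it assumes ReadP from base in place of
Parsec, a three-word reserved list made up for the demonstration, and the
hypothetical names ident, nonReserved, columnWithAlias and run.

```haskell
import Data.Char (isAlpha, isAlphaNum, toLower)
import Text.ParserCombinators.ReadP
    (ReadP, (<++), munch, pfail, readP_to_S, satisfy, skipSpaces)

-- | A tiny, hypothetical reserved-word list for the demonstration.
reserved :: [String]
reserved = ["select", "from", "where"]

-- | An identifier followed by optional whitespace.
ident :: ReadP String
ident = ((:) <$> satisfy isFirst <*> munch isRest) <* skipSpaces
  where
    isFirst c = isAlpha c || c == '_'
    isRest c = isAlphaNum c || c == '_'

-- | An identifier that is not a reserved word (compared case-insensitively).
nonReserved :: ReadP String
nonReserved = do
    i <- ident
    if map toLower i `elem` reserved then pfail else pure i

-- | A column name with an optional alias.
columnWithAlias :: ReadP (String, Maybe String)
columnWithAlias =
    (,) <$> nonReserved <*> ((Just <$> nonReserved) <++ pure Nothing)

-- | Run a parser; return its result together with the unconsumed input.
run :: ReadP a -> String -> Maybe (a, String)
run p s = case readP_to_S p s of
    [r] -> Just r
    _ -> Nothing
```

Running columnWithAlias on "a a from t" gives the alias Just "a" and
leaves "from t" unconsumed; on "a from t" the blacklist rejects "from" as
an alias, so the result is ("a", Nothing) with "from t" still left for
the from clause.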
|
|
|
The standard has a weird mix of reserved keywords and unreserved
|
|
|
|
keywords (I'm not sure what exactly being an unreserved keyword
|
|
|
|
means).
|
|
|
|
|
2014-06-28 14:43:30 +02:00
|
|
|
> reservedWord :: Dialect -> [String]
|
|
|
|
> reservedWord SQL2011 =
|
2014-04-20 18:38:43 +02:00
|
|
|
> ["abs"
|
2014-04-19 17:01:49 +02:00
|
|
|
> --,"all"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"allocate"
|
|
|
|
> ,"alter"
|
|
|
|
> ,"and"
|
2014-04-19 17:01:49 +02:00
|
|
|
> --,"any"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"are"
|
|
|
|
> ,"array"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"array_agg"
|
|
|
|
> ,"array_max_cardinality"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"as"
|
|
|
|
> ,"asensitive"
|
|
|
|
> ,"asymmetric"
|
|
|
|
> ,"at"
|
|
|
|
> ,"atomic"
|
|
|
|
> ,"authorization"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"avg"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"begin"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"begin_frame"
|
|
|
|
> ,"begin_partition"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"between"
|
|
|
|
> ,"bigint"
|
|
|
|
> ,"binary"
|
|
|
|
> ,"blob"
|
|
|
|
> ,"boolean"
|
|
|
|
> ,"both"
|
|
|
|
> ,"by"
|
|
|
|
> ,"call"
|
|
|
|
> ,"called"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"cardinality"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"cascaded"
|
|
|
|
> ,"case"
|
|
|
|
> ,"cast"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"ceil"
|
|
|
|
> ,"ceiling"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"char"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"char_length"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"character"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"character_length"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"check"
|
|
|
|
> ,"clob"
|
|
|
|
> ,"close"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"coalesce"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"collate"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"collect"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"column"
|
|
|
|
> ,"commit"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"condition"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"connect"
|
|
|
|
> ,"constraint"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"contains"
|
|
|
|
> ,"convert"
|
|
|
|
> --,"corr"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"corresponding"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"count"
|
|
|
|
> --,"covar_pop"
|
|
|
|
> --,"covar_samp"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"create"
|
|
|
|
> ,"cross"
|
|
|
|
> ,"cube"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"cume_dist"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"current"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"current_catalog"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"current_date"
|
2014-04-19 20:17:19 +02:00
|
|
|
> --,"current_default_transform_group"
|
|
|
|
> --,"current_path"
|
|
|
|
> --,"current_role"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"current_row"
|
|
|
|
> ,"current_schema"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"current_time"
|
|
|
|
> ,"current_timestamp"
|
|
|
|
> ,"current_transform_group_for_type"
|
2014-04-19 20:17:19 +02:00
|
|
|
> --,"current_user"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"cursor"
|
|
|
|
> ,"cycle"
|
2014-04-19 10:18:29 +02:00
|
|
|
> ,"date"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"day"
|
|
|
|
> ,"deallocate"
|
|
|
|
> ,"dec"
|
2014-04-19 10:18:29 +02:00
|
|
|
> ,"decimal"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"declare"
|
|
|
|
> --,"default"
|
|
|
|
> ,"delete"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"dense_rank"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"deref"
|
|
|
|
> ,"describe"
|
|
|
|
> ,"deterministic"
|
|
|
|
> ,"disconnect"
|
|
|
|
> ,"distinct"
|
|
|
|
> ,"double"
|
|
|
|
> ,"drop"
|
|
|
|
> ,"dynamic"
|
|
|
|
> ,"each"
|
|
|
|
> --,"element"
|
|
|
|
> ,"else"
|
|
|
|
> ,"end"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"end_frame"
|
|
|
|
> ,"end_partition"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"end-exec"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"equals"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"escape"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"every"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"except"
|
|
|
|
> ,"exec"
|
|
|
|
> ,"execute"
|
|
|
|
> ,"exists"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"exp"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"external"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"extract"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"false"
|
|
|
|
> ,"fetch"
|
|
|
|
> ,"filter"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"first_value"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"float"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"floor"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"for"
|
|
|
|
> ,"foreign"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"frame_row"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"free"
|
|
|
|
> ,"from"
|
|
|
|
> ,"full"
|
|
|
|
> ,"function"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"fusion"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"get"
|
|
|
|
> ,"global"
|
|
|
|
> ,"grant"
|
|
|
|
> ,"group"
|
2014-04-19 20:17:19 +02:00
|
|
|
> --,"grouping"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"groups"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"having"
|
|
|
|
> ,"hold"
|
|
|
|
> --,"hour"
|
|
|
|
> ,"identity"
|
|
|
|
> ,"in"
|
|
|
|
> ,"indicator"
|
|
|
|
> ,"inner"
|
|
|
|
> ,"inout"
|
|
|
|
> ,"insensitive"
|
|
|
|
> ,"insert"
|
|
|
|
> ,"int"
|
|
|
|
> ,"integer"
|
|
|
|
> ,"intersect"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"intersection"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"interval"
|
|
|
|
> ,"into"
|
|
|
|
> ,"is"
|
|
|
|
> ,"join"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"lag"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"language"
|
|
|
|
> ,"large"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"last_value"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"lateral"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"lead"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"leading"
|
|
|
|
> ,"left"
|
|
|
|
> ,"like"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"like_regex"
|
|
|
|
> ,"ln"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"local"
|
|
|
|
> ,"localtime"
|
|
|
|
> ,"localtimestamp"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"lower"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"match"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"max"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"member"
|
|
|
|
> ,"merge"
|
|
|
|
> ,"method"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"min"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"minute"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"mod"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"modifies"
|
2014-04-19 20:17:19 +02:00
|
|
|
> --,"module"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"month"
|
|
|
|
> ,"multiset"
|
|
|
|
> ,"national"
|
|
|
|
> ,"natural"
|
|
|
|
> ,"nchar"
|
|
|
|
> ,"nclob"
|
|
|
|
> ,"new"
|
|
|
|
> ,"no"
|
|
|
|
> ,"none"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"normalize"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"not"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"nth_value"
|
|
|
|
> ,"ntile"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"null"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"nullif"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"numeric"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"octet_length"
|
|
|
|
> ,"occurrences_regex"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"of"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"offset"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"old"
|
|
|
|
> ,"on"
|
|
|
|
> ,"only"
|
|
|
|
> ,"open"
|
|
|
|
> ,"or"
|
|
|
|
> ,"order"
|
|
|
|
> ,"out"
|
|
|
|
> ,"outer"
|
|
|
|
> ,"over"
|
|
|
|
> ,"overlaps"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"overlay"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"parameter"
|
|
|
|
> ,"partition"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"percent"
|
|
|
|
> --,"percent_rank"
|
|
|
|
> --,"percentile_cont"
|
|
|
|
> --,"percentile_disc"
|
|
|
|
> ,"period"
|
|
|
|
> ,"portion"
|
|
|
|
> ,"position"
|
|
|
|
> ,"position_regex"
|
|
|
|
> ,"power"
|
|
|
|
> ,"precedes"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"precision"
|
|
|
|
> ,"prepare"
|
|
|
|
> ,"primary"
|
|
|
|
> ,"procedure"
|
|
|
|
> ,"range"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"rank"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"reads"
|
|
|
|
> ,"real"
|
|
|
|
> ,"recursive"
|
|
|
|
> ,"ref"
|
|
|
|
> ,"references"
|
|
|
|
> ,"referencing"
|
2014-04-19 17:01:49 +02:00
|
|
|
> --,"regr_avgx"
|
|
|
|
> --,"regr_avgy"
|
|
|
|
> --,"regr_count"
|
|
|
|
> --,"regr_intercept"
|
|
|
|
> --,"regr_r2"
|
|
|
|
> --,"regr_slope"
|
|
|
|
> --,"regr_sxx"
|
|
|
|
> --,"regr_sxy"
|
|
|
|
> --,"regr_syy"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"release"
|
|
|
|
> ,"result"
|
|
|
|
> ,"return"
|
|
|
|
> ,"returns"
|
|
|
|
> ,"revoke"
|
|
|
|
> ,"right"
|
|
|
|
> ,"rollback"
|
|
|
|
> ,"rollup"
|
|
|
|
> --,"row"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"row_number"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"rows"
|
|
|
|
> ,"savepoint"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"scope"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"scroll"
|
|
|
|
> ,"search"
|
|
|
|
> --,"second"
|
|
|
|
> ,"select"
|
|
|
|
> ,"sensitive"
|
2014-04-19 20:17:19 +02:00
|
|
|
> --,"session_user"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"set"
|
|
|
|
> ,"similar"
|
|
|
|
> ,"smallint"
|
2014-04-19 17:01:49 +02:00
|
|
|
> --,"some"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"specific"
|
|
|
|
> ,"specifictype"
|
|
|
|
> ,"sql"
|
|
|
|
> ,"sqlexception"
|
|
|
|
> ,"sqlstate"
|
|
|
|
> ,"sqlwarning"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"sqrt"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"start"
|
|
|
|
> ,"static"
|
2014-04-20 18:38:43 +02:00
|
|
|
> --,"stddev_pop"
|
|
|
|
> --,"stddev_samp"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"submultiset"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"substring"
|
|
|
|
> ,"substring_regex"
|
|
|
|
> ,"succeeds"
|
|
|
|
> --,"sum"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"symmetric"
|
|
|
|
> ,"system"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"system_time"
|
2014-04-19 20:17:19 +02:00
|
|
|
> --,"system_user"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"table"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"tablesample"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"then"
|
|
|
|
> ,"time"
|
|
|
|
> ,"timestamp"
|
|
|
|
> ,"timezone_hour"
|
|
|
|
> ,"timezone_minute"
|
|
|
|
> ,"to"
|
|
|
|
> ,"trailing"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"translate"
|
|
|
|
> ,"translate_regex"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"translation"
|
|
|
|
> ,"treat"
|
|
|
|
> ,"trigger"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"truncate"
|
|
|
|
> ,"trim"
|
|
|
|
> ,"trim_array"
|
2014-04-18 23:18:15 +02:00
|
|
|
> --,"true"
|
|
|
|
> ,"uescape"
|
|
|
|
> ,"union"
|
|
|
|
> ,"unique"
|
|
|
|
> --,"unknown"
|
|
|
|
> ,"unnest"
|
|
|
|
> ,"update"
|
|
|
|
> ,"upper"
|
2014-04-19 20:17:19 +02:00
|
|
|
> --,"user"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"using"
|
|
|
|
> --,"value"
|
|
|
|
> ,"values"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"value_of"
|
2014-04-19 17:01:49 +02:00
|
|
|
> --,"var_pop"
|
|
|
|
> --,"var_samp"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"varbinary"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"varchar"
|
|
|
|
> ,"varying"
|
2014-04-20 18:38:43 +02:00
|
|
|
> ,"versioning"
|
2014-04-18 23:18:15 +02:00
|
|
|
> ,"when"
|
|
|
|
> ,"whenever"
|
|
|
|
> ,"where"
|
|
|
|
> ,"width_bucket"
|
|
|
|
> ,"window"
|
|
|
|
> ,"with"
|
|
|
|
> ,"within"
|
|
|
|
> ,"without"
|
|
|
|
> --,"year"
|
|
|
|
> ]
|
|
|
|
TODO: create this list properly
|
|
|
|
|
|
|
|
> reservedWord MySQL = reservedWord SQL2011 ++ ["limit"]
|
|
|
|
|
|
|
|
|
2014-06-28 14:41:11 +02:00
|
|
|
-----------
|
|
|
|
|
|
|
|
bit hacky, used to make the dialect available during parsing so
|
|
|
|
different parsers can be used for different dialects
|
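A small sketch of dialect-guarded parsing, for illustration only. ReadP
from base has no user state, so here the dialect is passed as an ordinary
argument instead of being read with getState as in the code below; the
Dialect type is redeclared locally, and limitClause and parseLimit are
made-up names. Keyword matching is case-sensitive in this sketch.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP
    (ReadP, eof, munch1, pfail, readP_to_S, skipSpaces, string)

-- | Local copy of the dialect type, just for this example.
data Dialect = SQL2011 | MySQL
    deriving (Eq, Show)

-- | Succeed without consuming input only when the active dialect is listed.
guardDialect :: Dialect -> [Dialect] -> ReadP ()
guardDialect d ds = if d `elem` ds then pure () else pfail

-- | "limit n", accepted only for the MySQL dialect.
limitClause :: Dialect -> ReadP Int
limitClause d =
    guardDialect d [MySQL]
        *> string "limit"
        *> skipSpaces
        *> (read <$> munch1 isDigit)

-- | Parse a whole input as a limit clause under the given dialect.
parseLimit :: Dialect -> String -> Maybe Int
parseLimit d s = case readP_to_S (limitClause d <* eof) s of
    [(n, "")] -> Just n
    _ -> Nothing
```

parseLimit MySQL "limit 10" gives Just 10, while
parseLimit SQL2011 "limit 10" gives Nothing because the guard fails
before any input is looked at.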
|
|
|
|
|
|
|
> type ParseState = Dialect
|
|
|
|
|
|
|
|
> type Parser = Parsec String ParseState
|
|
|
|
|
|
|
|
> guardDialect :: [Dialect] -> Parser ()
|
|
|
|
> guardDialect ds = do
|
|
|
|
> d <- getState
|
|
|
|
> guard (d `elem` ds)
|