488310ff6a
change the parser to not attempt to parse the elements following 'from' unless there is an actual 'from'; improve the symbol parser to try to deal with issues when symbols are next to each other with no intervening whitespace; improve number literal parsing to fail if there are trailing letters or digits which aren't part of the number and aren't separated with whitespace; add some code to start analysing the quality of parse error messages
270 lines
7.8 KiB
Plaintext
continue 2003 review and tests
docs: how to run the tests
touch up the expr hack as best as possible
left factor as much as possible (see notes below)
table expression in syntax:
QueryExpr = Select SelectList (Maybe TableExpr)
and the TableExpr contains all the other bits?
finish off ansi 2003 support or specific subset
summarize todos
start looking at error messages
change the booleans in the ast to better types for less ambiguity
represent missing optional bits in the ast as Nothing instead of the
default
look at fixing the expression parsing completely
represent natural and using/on in the syntax closer to the
concrete syntax - don't combine in the ast

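The ast items above (a table expression node, dedicated types instead of booleans, Nothing for missing optional parts, natural kept apart from using/on) might look roughly like this. It is only a sketch: apart from the `QueryExpr = Select SelectList (Maybe TableExpr)` shape quoted above, every type, constructor and field name is hypothetical and not taken from the package, and expressions are left as plain strings to keep it short.

```haskell
-- Hypothetical AST fragment; all names are illustrative assumptions.

-- Dedicated type instead of a Bool such as "is distinct".
data SetQuantifier = All | Distinct
    deriving (Eq, Show)

-- 'natural' is kept separate from the explicit join conditions,
-- mirroring the concrete syntax instead of combining them.
data JoinNatural = Natural | NotNatural
    deriving (Eq, Show)

data JoinCondition
    = JoinOn String        -- on <expr>, expression kept as text here
    | JoinUsing [String]   -- using (a, b, ...)
    deriving (Eq, Show)

data TableRef
    = TRSimple String
    | TRJoin TableRef JoinNatural TableRef (Maybe JoinCondition)
    deriving (Eq, Show)

-- Everything after the select list lives in the table expression.
data TableExpr = TableExpr
    { teFrom    :: [TableRef]
    , teWhere   :: Maybe String
    , teGroupBy :: [String]
    , teHaving  :: Maybe String
    } deriving (Eq, Show)

-- A missing quantifier is Nothing rather than a default, so
-- "select a" and "select all a" stay distinguishable.
data SelectList = SelectList (Maybe SetQuantifier) [String]
    deriving (Eq, Show)

data QueryExpr = Select SelectList (Maybe TableExpr)
    deriving (Eq, Show)

-- select a from t
example1 :: QueryExpr
example1 = Select (SelectList Nothing ["a"])
    (Just (TableExpr [TRSimple "t"] Nothing [] Nothing))

-- select 1   (no 'from', so no table expression at all)
example2 :: QueryExpr
example2 = Select (SelectList Nothing ["1"]) Nothing
```

With this shape, a parser only needs to attempt the table expression when an actual 'from' keyword is present, and the pretty printer can tell an omitted clause from a defaulted one.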
review the token parsers, and make sure they have trailing delimiters
or consume bad trailing characters and fail (e.g. 1e2e3 in a select
list parses as '1e2 e3' i.e. '1e2 as e3')
split the general symbol and operator parsing, and make it tighter
in terms of when the symbol or operator ends (don't allow it to end
early)
approach: review the lexical syntax, create a complete list of
tokens/token generators. Divide into tokens which must be followed
by some particular other token or at least one whitespace, and ones
which can be immediately followed by another token. Then fix the
lexing parsers to work this way
whitespace/comments
integers
numbers
string literals
keywords
operator symbols <>=+=^%/*!|~&
non operator symbols ()?,;"'
identifiers
quoted identifiers

identifiers and keywords are ok for now
there are issues with integers, numbers, operators and non operator
symbols

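As a rough illustration of the approach above, here is a hand-rolled sketch of two token recognisers: a number literal that fails when it is immediately followed by a letter, digit, underscore or dot, and an operator symbol matched longest first so it cannot end early. The function names, the exact set of bad trailing characters and the operator list are assumptions made for this example, not the package's lexer.

```haskell
import Data.Char (isAlphaNum, isDigit)
import Data.List (isPrefixOf)

-- Recognise a number literal at the start of the input. Returns the
-- literal text and the remaining input, or Nothing if there is no
-- number or it is not properly delimited.
numberLit :: String -> Maybe (String, String)
numberLit s0
    | null whole && length frac < 2 = Nothing   -- no digit at all
    | badTrailing rest              = Nothing   -- e.g. 1e2e3, 12abc
    | otherwise                     = Just (whole ++ frac ++ expo, rest)
  where
    (whole, r1)  = span isDigit s0
    (frac, r2)   = fraction r1
    (expo, rest) = exponentPart r2

-- Optional '.' followed by zero or more digits.
fraction :: String -> (String, String)
fraction ('.' : cs) = let (ds, rest) = span isDigit cs in ('.' : ds, rest)
fraction cs         = ("", cs)

-- Optional exponent: e or E, optional sign, at least one digit.
exponentPart :: String -> (String, String)
exponentPart (e : cs)
    | e `elem` "eE" && not (null ds) = (e : sign ++ ds, rest)
  where
    (sign, afterSign) = case cs of
        (c : cs') | c `elem` "+-" -> ([c], cs')
        _                         -> ("", cs)
    (ds, rest) = span isDigit afterSign
exponentPart s = ("", s)

-- Characters which must not directly follow a number literal.
badTrailing :: String -> Bool
badTrailing (c : _) = isAlphaNum c || c == '_' || c == '.'
badTrailing []      = False

-- Operator symbols, longer ones first, so "<=" is never split into
-- "<" followed by "=". The list itself is an assumption.
operators :: [String]
operators =
    [ "<>", "<=", ">=", "!=", "||"
    , "<", ">", "=", "+", "-", "*", "/", "%", "^", "|", "&", "~" ]

operatorSym :: String -> Maybe (String, String)
operatorSym s =
    case [op | op <- operators, op `isPrefixOf` s] of
      []       -> Nothing
      (op : _) -> Just (op, drop (length op) s)
```

With these definitions `numberLit "1e2e3"` is rejected, while `numberLit "1e2 e3"` yields the literal `1e2` and leaves ` e3` for the next token.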
review places in the parser which should allow only a fixed set of
identifiers (e.g. in interval literals)

decide whether to represent numeric literals better, instead of a
single string - break up into parts, or parse to a Decimal or
something

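One possible reading of "break up into parts" is sketched below: the literal is split into whole digits, optional fraction digits and an optional signed exponent. The `NumLit` type and `parseNumLit` function are hypothetical names for this note only; keeping the digit strings avoids losing leading or trailing zeros, which a conversion to a Decimal type might.

```haskell
import Data.Char (isDigit)

-- Hypothetical structured representation of a numeric literal.
data NumLit = NumLit
    { nlWhole    :: String         -- digits before the point, may be empty
    , nlFraction :: Maybe String   -- digits after the point, if a point is present
    , nlExponent :: Maybe Integer  -- signed exponent, if present
    } deriving (Eq, Show)

-- Split a literal such as "12.5e-3" into its parts. Returns Nothing
-- unless the whole string is a well formed literal.
parseNumLit :: String -> Maybe NumLit
parseNumLit s0 = do
    let (whole, r1) = span isDigit s0
    (frac, r2) <- case r1 of
        ('.' : cs) -> let (ds, rest) = span isDigit cs
                      in Just (Just ds, rest)
        _          -> Just (Nothing, r1)
    expo <- case r2 of
        []                       -> Just Nothing
        (e : cs) | e `elem` "eE" -> Just <$> signedInt cs
        _                        -> Nothing
    -- at least one digit is required before or after the point
    if null whole && maybe True null frac
        then Nothing
        else Just (NumLit whole frac expo)

-- Optional sign followed by one or more digits, nothing else.
signedInt :: String -> Maybe Integer
signedInt ('-' : ds) = negate <$> digits ds
signedInt ('+' : ds) = digits ds
signedInt ds         = digits ds

digits :: String -> Maybe Integer
digits ds
    | not (null ds) && all isDigit ds = Just (read ds)
    | otherwise                       = Nothing
```

For example, `parseNumLit "12.5e-3"` gives `Just (NumLit "12" (Just "5") (Just (-3)))`, and `parseNumLit "1e2e3"` gives `Nothing`.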
rough SQL 2003 todo, including tests to write:
can multipart identifiers have whitespace around the '.'?
multipart string literals
national, unicode, hex, bit string literals, escapes
string literal character sets
more work on date and time literals
support "" in delimited identifier
unicode identifier
support needed MODULE syntax in identifiers - already covered?
review qualification names in identifiers support in various contexts
(e.g. function app, table refs)
add missing type name support: lots of missing ones here, including
simple stuff like lob variations, and new things like interval,
row, ref, scope, array, multiset type names.
decide how to represent special identifiers including the session
variables or whatever they are called like current_user
host :parameter + indicator
collation stuff, other character set stuff: list what is needed
array[], multiset[]
grouping - needs special syntax?
review window function support and missing bits
review case expressions
next value for
probably leave for now: subtypes, methods, new /routine, dereference
array element reference
multiset element reference - maybe nothing to do
double check associativity, precedence (value exprs, joins, set ops)
position expressions
length expressions
extract expression
cardinality expression?
check concatenations
collations: review where they can appear
substring expressions
regular expression substring function
convert
translate
trim
overlay
specifictype
datetime value expressions
intervals
array value constructors
multiset value expressions, constructors
row value constructors, expressions review
review table value constructor exactly what is allowed
lots more tests for from clause variations
tablesamples
unnest
table function derived table
only spec
join variations, including union join
review group by
window clauses
all fields reference with alias 'select * as (a,b,c) ... '
search or cycle clause
between symmetric/asymmetric
in predicate review
escape for like
escape for [not] similar to
regular expression syntax?
quantified comparison predicate: represent differently from current
unique predicate
normalized predicate
match predicate
overlaps predicate
distinct from predicate
member predicate
submultiset predicate
set predicate
type predicate
additional stuff review:
interval stuff
collate clause
aggregate functions: lots of missing bits
complete list of keywords/reserved keywords


review areas where this parser is too permissive, e.g. value
expressions allowed where column reference names only should be
allowed, such as group by, order by (perhaps there can be a flag or
warnings or something), unqualified asterisk in select list


left factor/try removal:
try in the interval literal
have to left factor with the typed literal "interval 'xxx'" syntax
+ with identifier
try in the prefix cast: LF with identifier
few tries in the specialopks: need review
+ left factor the start of these (e.g. for function style substring
and for keyword style substring)
not between: needs left factoring with a bunch of suffix operators
subqueries: need left factoring with all the stuff which starts with
open parens. The subquery ast needs a rethink as well
typename: left factor with identifier
inSuffix in expr table: conflicts with 'in' keyword in precision -
left factor
the binary and postfix multi keyword ops need left factoring since
several share prefixes
app needs lf with parens, identifier, etc.
parens lf in nonJoinTref
name start lf in nonJoinTref

all of the above should help the error messages a lot

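To make the left factoring idea concrete, here is a small sketch over a pre-lexed token list. Identifier, typed literal ("interval 'xxx'") and function application all start with an identifier, so the shared prefix is consumed once and only the following token picks the alternative; no try or backtracking is involved. The `Token`, `Expr` and `Parser` types and all function names are hypothetical and much simpler than the real parser.

```haskell
-- Hypothetical token and expression types for illustration.
data Token = Ident String | StrLit String | Symbol String
    deriving (Eq, Show)

data Expr
    = Iden String               -- plain identifier
    | TypedLit String String    -- e.g. interval '3 days'
    | App String [Expr]         -- e.g. f(a, b)
    deriving (Eq, Show)

type Parser a = [Token] -> Either String (a, [Token])

-- Left factored: consume the shared identifier once, then let the
-- next token decide which construct this is.
identLed :: Parser Expr
identLed (Ident n : rest) =
    case rest of
      StrLit s : rest'   -> Right (TypedLit n s, rest')
      Symbol "(" : rest' -> do
          (args, rest'') <- arguments rest'
          Right (App n args, rest'')
      _                  -> Right (Iden n, rest)
identLed ts = Left ("expected identifier, got " ++ describe ts)

-- Zero or more comma separated arguments followed by ")".
arguments :: Parser [Expr]
arguments (Symbol ")" : rest) = Right ([], rest)
arguments ts = oneOrMore ts
  where
    oneOrMore ts' = do
        (e, rest) <- identLed ts'
        case rest of
          Symbol "," : rest' -> do
              (es, rest'') <- oneOrMore rest'
              Right (e : es, rest'')
          Symbol ")" : rest' -> Right ([e], rest')
          _ -> Left ("expected ',' or ')', got " ++ describe rest)

describe :: [Token] -> String
describe []      = "end of input"
describe (t : _) = show t
```

Because nothing is rewound, a failure inside the argument list is reported where the input actually goes wrong (for instance "expected ',' or ')'") rather than back at the start of the expression, which is the effect on error messages hoped for above.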
big feature summary:
all ansi sql queries
better expression tree parsing
error messages, left factor
dml, ddl, procedural sql
position annotation
type checker/ etc.
lexer
dialects
quasi quotes
typesafe sql dbms wrapper support for haskell
extensibility
performance analysis


= next release

try and use the proper css theme
create a header like in the haddock with simple-sql-parser +
contents link
change the toc gen so that it works the same as in haddock (same
div, no links on the actual titles)
fix the page margins, and the table stuff: patches to the css?

release checklist:
hlint
haddock review
spell check
update changelog
update website text
regenerate the examples on the index.txt

= Later general tasks:

docs

add to website: pretty printed tpch, maybe other queries as
demonstration

add preamble to the rendered test page

add links from the supported sql page to the rendered test page for
each section -> have to section up the tests some more

testing

review tests to copy from hssqlppp

add lots more tests using SQL from the xb2 manual

many more table reference tests, for joins and aliases etc.?

review internal sql collection for more syntax/tests

other

review syntax to replace maybe and bool with better ctors

----

demo program: convert tpch to sql server syntax exe processor

review abstract syntax (e.g. combine App with SpecialOp?)

more operators

sql server top syntax

named windows

extended string literals, escapes and other flavours (like pg and
oracle custom delimiters)

run through other manuals for example queries and features: sql in a
nutshell, sql guide, sql reference guide, sql standard, sql server
manual, oracle manual, teradata manual + another pass through the
postgresql manual, and make notes in each case of all syntax which
isn't currently supported.

check the order of exports, imports and functions/cases in the files
fix up the import namespaces/explicit names nicely

ast checker: checks the ast represents valid syntax, the parser
doesn't check as much as it could, and this can also be used to
check generated trees. Maybe this doesn't belong in this package
though?

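A minimal sketch of what such an ast checker could look like, assuming a heavily simplified ast. The types, the `checkQuery` name and the particular rules (group by items must be column references, the select list must not be empty) are assumptions chosen for this example; the real rules would come from the standard.

```haskell
-- Hypothetical, heavily simplified ast.
data ValueExpr
    = ColumnRef [String]               -- possibly qualified name
    | NumberLit String
    | BinOp String ValueExpr ValueExpr
    | Star
    deriving (Eq, Show)

data Query = Query
    { qSelectList :: [ValueExpr]
    , qFrom       :: [String]
    , qGroupBy    :: [ValueExpr]
    } deriving (Eq, Show)

-- Returns one message per problem found; an empty list means the
-- tree passed every check. Works on parsed and generated trees alike.
checkQuery :: Query -> [String]
checkQuery q = concat
    [ [ "group by item is not a column reference: " ++ show e
      | e <- qGroupBy q, not (isColumnRef e) ]
    , [ "empty select list" | null (qSelectList q) ]
    ]
  where
    isColumnRef (ColumnRef _) = True
    isColumnRef _             = False
```

Returning messages instead of failing keeps the parser permissive while still allowing a flag or warnings, as suggested in the note on permissive areas.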
= other sql support

full number literals -> other bases?
apply, pivot

other dialect targets:
postgres
oracle
teradata
ms sql server
mysql?
db2?
what other major dialects are there?
sqlite
sap dbmss (can't work out what are separate products or what are the
dialects)

maybe later: other dml
insert, update, delete, truncate, merge + set, show?
copy, execute?, explain?, begin/end/rollback?