simple-sql-parser/TODO
2014-04-18 00:36:50 +03:00


continue 2003 review and tests
docs: how to run the tests
touch up the expr hack as well as possible
left factor as much as possible (see below on notes)
table expression in syntax:
QueryExpr = Select SelectList (Maybe TableExpr)
and the TableExpr contains all the other bits?
finish off ansi 2003 support or specific subset
summarize todos
start looking at error messages
change the booleans in the ast to better types for less ambiguity
represent missing optional bits in the ast as nothing instead of the
default
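
A minimal sketch of the two ast items above (dedicated types instead
of booleans, Nothing for omitted optional parts). All type and
function names here are hypothetical and are not taken from the
package's actual syntax module.

```haskell
-- Hypothetical names throughout; not the package's actual AST.

-- A dedicated type names both alternatives, where a Bool would leave
-- the reader guessing whether True means ALL or DISTINCT.
data SetQuantifier = All | Distinct
  deriving (Eq, Show)

data Direction = Asc | Desc
  deriving (Eq, Show)

-- An omitted optional part stays Nothing instead of being replaced by
-- the default, so the ast records what was actually written.
data SortSpec = SortSpec
  { sortExpr :: String           -- stand-in for a real expression type
  , sortDir  :: Maybe Direction  -- Nothing: no ASC/DESC in the source
  } deriving (Eq, Show)

-- Render a sort spec, printing only what was present in the source.
renderSortSpec :: SortSpec -> String
renderSortSpec (SortSpec e d) = e ++ maybe "" (\x -> ' ' : showDir x) d
  where
    showDir Asc  = "asc"
    showDir Desc = "desc"
```

With this shape, "order by a" and "order by a asc" produce different
trees, which a pretty printer can reproduce as written.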
look at fixing the expression parsing completely
represent natural and using/on in the syntax closer to the
concrete syntax - don't combine in the ast
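
One possible shape for this, as a sketch only: NATURAL is kept as its
own field because it is a prefix keyword, while ON and USING share an
optional trailing condition. The names Naturalness, JoinCondition,
Join and renderJoin are assumptions made for illustration, and String
stands in for the real table reference and expression types.

```haskell
import Data.List (intercalate)

-- Hypothetical types; not the package's actual AST.
data Naturalness = IsNatural | NotNatural
  deriving (Eq, Show)

data JoinCondition
  = JoinOn String       -- ON <search condition>
  | JoinUsing [String]  -- USING (col, ...)
  deriving (Eq, Show)

data Join = Join
  { joinLeft    :: String
  , joinNatural :: Naturalness
  , joinRight   :: String
  , joinCond    :: Maybe JoinCondition  -- Nothing: no ON/USING written
  } deriving (Eq, Show)

-- Render the join in the same order as the concrete syntax.
renderJoin :: Join -> String
renderJoin (Join l n r c) =
  unwords (filter (not . null) [l, nat, "join", r, cond])
  where
    nat = case n of
      IsNatural  -> "natural"
      NotNatural -> ""
    cond = case c of
      Nothing             -> ""
      Just (JoinOn e)     -> "on " ++ e
      Just (JoinUsing cs) -> "using (" ++ intercalate ", " cs ++ ")"
```

This representation can express combinations the standard rejects
(natural together with on/using), so it would rely on a later check
or on the parser to rule those out.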
review the token parsers, and make sure they have trailing delimiters
or consume bad trailing characters and fail (e.g. 1e2e3 in a select
list parses as '1e2 e3' i.e. '1e2 as e3')
split the general symbol and operator parsing, and make it tighter
in terms of when the symbol or operator ends (don't allow to end
early)
approach: review the lexical syntax, create complete list of
tokens/token generators. Divide into tokens which must be followed
by some particular other token or at least one whitespace, and ones
which can be immediately followed by another token. Then fix the
lexing parsers to work this way
whitespace/comments
integers
numbers
string literals
keywords
operator symbols <>=+-^%/*!|~&
non operator symbols ()?,;"'
identifiers
quoted identifiers
identifiers and keywords are ok for now
there are issues with integers, numbers, operators and non operator
symbols
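
The "must be followed by a delimiter" idea can be sketched for number
tokens as below. This is a hand-rolled illustration using only base,
not the package's lexing code; lexNumber and its helpers are
hypothetical names, and the accepted number format (digits, optional
fraction, optional exponent) is a simplifying assumption.

```haskell
import Data.Char (isAlphaNum, isDigit)

-- Lex digits [. digits] [e [+-] digits] from the front of the input.
-- Returns the literal and the remaining input, or Nothing when there
-- is no number or the number runs straight into an identifier
-- character (so "1e2e3" fails instead of lexing as "1e2" then "e3").
lexNumber :: String -> Maybe (String, String)
lexNumber s0 = do
  (whole, s1) <- digits1 s0
  let (frac, s2) = case s1 of
        '.' : rest | Just (ds, r) <- digits1 rest -> ('.' : ds, r)
        _ -> ("", s1)
      (ex, s3) = case s2 of
        e : rest | e `elem` "eE", Just (ds, r) <- signedDigits rest -> (e : ds, r)
        _ -> ("", s2)
  case s3 of
    c : _ | isAlphaNum c || c == '_' -> Nothing  -- bad trailing character
    _ -> Just (whole ++ frac ++ ex, s3)
  where
    -- One or more digits.
    digits1 str = case span isDigit str of
      ([], _) -> Nothing
      (ds, r) -> Just (ds, r)
    -- Digits with an optional leading sign.
    signedDigits (c : rest) | c `elem` "+-" = do
      (ds, r) <- digits1 rest
      Just (c : ds, r)
    signedDigits str = digits1 str
```

Whitespace, operators and punctuation still end the token normally,
so "1e2 e3" and "12.5+x" lex as before.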
review places in the parse which should allow only a fixed set of
identifiers (e.g. in interval literals)
decide whether to represent numeric literals better, instead of a
single string - break up into parts, or parse to a Decimal or
something
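
As one option for the "break up into parts" route, a sketch: keep the
digits as written so the literal can be printed back unchanged, and
compute an exact value only on demand. NumLit and numLitValue are
hypothetical names, and the field layout is an assumption.

```haskell
import Data.Maybe (fromMaybe)
import Data.Ratio ((%))

-- Hypothetical representation; not the package's actual AST.
data NumLit = NumLit
  { numWhole :: String         -- digits before the point, as written
  , numFrac  :: Maybe String   -- digits after the point, if any
  , numExp   :: Maybe Integer  -- exponent, if any
  } deriving (Eq, Show)

-- Exact value of the literal as a Rational.
numLitValue :: NumLit -> Rational
numLitValue (NumLit w f e) = mantissa * 10 ^^ fromMaybe 0 e
  where
    fracDigits = fromMaybe "" f
    mantissa   = readDigits (w ++ fracDigits) % (10 ^ length fracDigits)
    -- Assumes the string holds only decimal digits; empty means zero.
    readDigits ds = if null ds then 0 else read ds :: Integer
```

Keeping the parts as strings preserves distinctions such as "1.0"
versus "1.00", which parsing directly to a decimal type would lose.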
rough SQL 2003 todo, including tests to write:
can multipart identifiers have whitespace around the '.'?
more work on date and time literals
support "" in delimited identifier
unicode identifier
support needed MODULE syntax in identifiers - already covered?
review qualification names in identifiers support in various contexts
(e.g. function app, table refs)
add missing type name support: lots of missing ones here, including
simple stuff like lob variations, and new things like interval,
row, ref, scope, array, multiset type names.
decide how to represent special identifiers including the session
variables or whatever they are called like current_user
multiset[]
grouping - needs special syntax?
review window function support and missing bits
review case expressions
next value for
probably leave for now: subtypes, methods, new /routine, dereference
multiset element reference - maybe nothing to do
double check associativity, precedence (value exprs, joins, set ops)
position expressions
length expressions
extract expression
cardinality expression?
check concatenations
substring expressions
regular expression substring function
convert
translate
trim
overlay
specifictype
datetime value expressions
intervals
multiset value expressions, constructors
row value constructors, expressions review
review table value constructor exactly what is allowed
lots more tests for from clause variations
tablesamples
unnest
table function derived table
only spec
join variations, including union join
review group by
window clauses
all fields reference with alias 'select * as (a,b,c) ... '
search or cycle clause
between symmetric/asymmetric
in predicate review
escape for like
escape for [not] similar to
regular expression syntax?
quantified comparison predicate: represent differently from the
current normalized predicate
overlaps predicate
distinct from predicate
member predicate
submultiset predicate
set predicate
type predicate
additional stuff review:
interval stuff
aggregate functions: lots of missing bits
especially: filter where, within group
complete list of keywords/reserved keywords
select into
other language format identifiers for host params?
review areas where this parser is too permissive, e.g. value
expressions allowed where column reference names only should be
allowed, such as group by, order by (perhaps there can be a flag or
warnings or something), unqualified asterisk in select list
left factor/try removal:
try in the interval literal
have to left factor with the typed literal "interval 'xxx'" syntax
+ with identifier
try in the prefix cast: LF with identifier
few tries in the specialopks: need review
+ left factor the start of these (e.g. for function style substring
and for keyword style substring)
not between: needs left factoring with a bunch of suffix operators
subqueries: need left factoring with all the stuff which starts with
open parens. The subquery ast needs rethink as well
typename: left factor with identifier
inSuffix in expr table: conflicts with 'in' keyword in precision -
left factor
the binary and postfix multi keyword ops need left factoring since
several share prefixes
app needs lf with parens, identifier, etc.
parens lf in nonJoinTref
name start lf in nonJoinTref
all of the above should help the error messages a lot
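
To illustrate what left factoring means for the typed literal versus
identifier case, here is a small sketch over a token list that does
not depend on any parser library. Token, Expr and term are
hypothetical names; the real parser is structured differently.

```haskell
-- Hypothetical token and expression types for illustration only.
data Token = Word String | StringTok String | Symbol Char
  deriving (Eq, Show)

data Expr = Iden String | TypedLit String String
  deriving (Eq, Show)

-- With backtracking, the parser would attempt "word then string
-- literal", rewind on failure, and then attempt "word" again.
-- Left factored: consume the shared prefix (the word) once, then
-- branch on the next token, so nothing is ever re-parsed.
term :: [Token] -> Maybe (Expr, [Token])
term (Word w : rest) = Just $ case rest of
  StringTok s : rest' -> (TypedLit w s, rest')  -- e.g. interval 'xxx'
  _                   -> (Iden w, rest)         -- plain identifier
term _ = Nothing
```

Because the branch point sits after the shared prefix, a failure is
reported at the token that is actually wrong, which is where the
better error messages come from.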
big feature summary:
all ansi sql queries
better expression tree parsing
error messages, left factor
dml, ddl, procedural sql
position annotation
type checker/ etc.
lexer
dialects
quasi quotes
typesafe sql dbms wrapper support for haskell
extensibility
performance analysis
= next release
try and use the proper css theme
create a header like in the haddock with simple-sql-parser +
contents link
change the toc gen so that it works the same as in haddock (same
div, no links on the actual titles)
fix the page margins, and the table stuff: patches to the css?
release checklist:
hlint
haddock review
spell check
update changelog
update website text
regenerate the examples on the index.txt
= Later general tasks:
docs
add to website: pretty printed tpch, maybe other queries as
demonstration
add preamble to the rendered test page
add links from the supported sql page to the rendered test page for
each section -> have to section up the tests some more
testing
review tests to copy from hssqlppp
add lots more tests using SQL from the xb2 manual
much more table reference tests, for joins and aliases etc.?
review internal sql collection for more syntax/tests
other
review syntax to replace maybe and bool with better ctors
----
demo program: convert tpch to sql server syntax exe processor
review abstract syntax (e.g. combine App with SpecialOp?)
more operators
sql server top syntax
named windows
extended string literals, escapes and other flavours (like pg and
oracle custom delimiters)
run through other manuals for example queries and features: sql in a
nutshell, sql guide, sql reference guide, sql standard, sql server
manual, oracle manual, teradata manual + go back through the
postgresql manual, and in each case make notes of all the syntax,
including which parts aren't currently supported.
check the order of exports, imports and functions/cases in the files
fix up the import namespaces/explicit names nicely
ast checker: checks the ast represents valid syntax, the parser
doesn't check as much as it could, and this can also be used to
check generated trees. Maybe this doesn't belong in this package
though?
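
A rough sketch of what such a checker could look like, using one rule
as an example: group by items must be plain column references. The
Expr and Query types and the checkQuery function are hypothetical
stand-ins, far smaller than the real ast.

```haskell
-- Hypothetical miniature ast for illustration only.
data Expr
  = Iden String
  | NumLit String
  | BinOp String Expr Expr
  deriving (Eq, Show)

data Query = Query
  { qSelect  :: [Expr]
  , qGroupBy :: [Expr]
  } deriving (Eq, Show)

isColumnRef :: Expr -> Bool
isColumnRef (Iden _) = True
isColumnRef _        = False

-- Return a message for every rule violation; an empty list means the
-- tree passed. Works on parsed trees and on generated ones alike.
checkQuery :: Query -> [String]
checkQuery q =
  [ "group by item is not a column reference: " ++ show e
  | e <- qGroupBy q
  , not (isColumnRef e)
  ]
```

Returning a list of messages instead of failing on the first problem
lets the same pass serve as a source of warnings for a permissive
parser.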
= other sql support
full number literals -> other bases?
apply, pivot
other dialect targets:
postgres
oracle
teradata
ms sql server
mysql?
db2?
what other major dialects are there?
sqlite
sap dbmss (can't work out what are separate products or what are the
dialects)
maybe later: other dml
insert, update, delete, truncate, merge + set, show?
copy, execute?, explain?, begin/end/rollback?