medium tasks next release
review alters, and think about adding rename versions
which are really common and useful, but not in ansi
https://github.com/JakeWheat/simple-sql-parser/issues/20
finish off going through the keyword list
do more examples
what are the use cases?
sql generator - queries
sql generator - ddl
parsing some sql - for what purpose
generating documentation of ddl
write some sort of trivial sql engine or wrapper around something?
write something that takes sql, modifies it, and outputs the result
lint checker?
do an example of adding some new syntax
-> seems quite a few people are using this
and there are some feature requests
try to give people a path to implement features themselves
goals:
1. if someone might want to use this, give them some toy examples to
help bootstrap them
2. see if can encourage people who want some missing sql to add it
themselves
review main missing sql bits - focus on more mainstream things
could also review main dialects
syntax from hssqlppp:
query hints, join hints
unescaping identifiers and strings
continuation strings testing
add tests for comment pretty printing:
use pretty then lex
work on better dialect design: more basic customizability and rule /
callback driven
review/fix documentation and website
fix the groups for generated tests
check the .cabal file module lists
medium tasks next release + 1
add annotation
lots more negative tests especially for lexing, and for dialects
escape, uescape
post hoc fixity
switch pretty printing to use ansi-wl-pprint
http://conscientiousprogrammer.com/blog/2015/12/17/24-days-of-hackage-2015-day-17-ansi-wl-pprint-avoiding-string-hacking/
error message analysis:
start with a set of bad sql, generate & write
get error messages:
simplified ssp parser
tutorial parser
hssqlppp
and also:
postgres
mysql
sqlserver
oracle
db2
vertica?
evaluate other parsing libs for error messages and general
feasibility, shortlist is:
megaparsec
trifecta
uuparsinglib
other desirables from parsing lib:
incremental parsing
context dependent lexer switch
continue after error
create some benchmarks (to measure performance when modifying for
error messages, and to compare different parser libs for instance)
use quickcheck in lexing
What will make this library nice and complete:
List of all the SQL that it doesn't support
annotation, with positions coming from the parser
dml
ddl
procedural sql
dialects: reasonable support for sql server and oracle, and maybe also
postgres, mysql, teradata, redshift, sqlite, db2, sap stuff, etc.
good work on error messages
fixity code + get it right
review names of syntax
defaults handled better (use default/nothing instead of substituting
in the default)
evaluate uu parsing lib -> could at least remove need to do left
factoring, and maybe help make better error messages also
-----
work on reasonable subset of sql which is similar to the current
subset and smaller than the complete 2011 target: describe the
exact target set for the next release
improve the dialect testing: add notes on what to do
position annotation in the syntax
simple stuff for error message and pretty printing monitoring:
create a sample set of valid statements to pretty print
pretty print these
compare every so often to catch regressions and approve improvements
start with tpch, and then add some others
same with invalid statements to see the error messages
start with some simple scalar exprs and a big query expr which has
stuff (either tokens, whitespace or junk strings)
semi-systematically added and/or removed
fixing the non idiomatic (pun!) suffix parsing:
typename parsing
identifier/app/agg/window parsing
join parsing in trefs (use chain? - tricky because of postfix onExpr)
top level and queryexprs parsing
review names in the syntax for correspondence with sql standard, avoid
gratuitous differences
touch up the expr hack as best as can, start thinking about
replacement for buildExprParser, maybe this can be a separate
general package, or maybe something like this already exists
careful review of token parses wrt trailing delimiters/junk - already
caught a few issues like this incidentally when working on other
stuff
undo mess in the code created by adding lots of new support:
much more documentation
refactor crufty bits
reorder the code
reconsider the names and structure of the constructors in the syntax
refactor the typename parser - it's a real mess
fix the lexing
add documentation in Parser.lhs on the left factoring/error handling
approach
fixes:
keyword tree, add explicit result then can use for joins also
keyword tree support prefix mode so can start from already parsed
token
left factor/try removal summary (this list needs updating):
identifier starts:
interval literal
character set literal
typed literals, multikeywords
identifier
app, agg, window
keyword function
issues in the special op internals
not between + other ops: needs new expression parsing
not in also
in suffix also
lots of overlap with binary and postfix multi keyword operators
quantified comparison also
issues in the typename parsing
dot in identifiers and as operator
issues in the symbol parser
hardcode all the symbols in the symbol parser/split?
conflict with in suffix and in in position
rules for changing the multi keyword parsing:
if a keyword must be followed by another
e.g. left join, want to refactor to produce 'expected "left join"'
if the keyword is optionally followed by another, e.g. with
recursive, then don't do this.
change join defaults to be defaults
rough SQL 2011 todo, including tests to write:
review the commented out reserved keyword entries and work out how to
fix
test case insensitivity and case preservation
big areas:
window functions
nested window functions
case
table ref: tablesample, time period spec, only, unnest, table, lateral
bug
joined table: partitioned joins
group by: set quantifier
window clause
other areas:
unicode escape, strings and idens
character set behaviour review
datetime literals
mixed quoting identifier chains
names/identifiers careful review
general value bits
collate for
numeric val fn
string exp fn
datetime exp fn
interval exp fn
rows
interval qualifier
with
setop
order/offset/fetch
search/cycle
preds:
between
in
like
similar
regex like?
null
normalize
match
overlaps
distinct
member
submultiset
period
alias for * in select list
create list of unsupported syntax: xml, ref, subtypes, modules?
---
after next release
medium term goals:
1. replace parser and syntax in hssqlppp with this code (keep two
separate packages in sync)
2. this replacement should have better error messages, much more
complete ansi sql 2011 support, and probably will have reasonable
support for these dialects: mssql, oracle and teradata.
review areas where this parser is too permissive, e.g. value
expressions allowed where column reference names only should be
allowed, such as group by, order by (perhaps there can be a flag or
warnings or something), unqualified asterisk in select list
fix the expression parser completely: the realistic way is to adjust
for precedence and associativity after parsing since the concrete
syntax is so messy. should also use this expression parser for
parsing joins and for set operations, maybe other areas.
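a rough sketch of the post-parse fixity idea, assuming the parser first produces a flat "operand, then (operator, operand) pairs" list and a later pass rebuilds the tree by precedence climbing. Expr, Flat, fixities, resolve and climb are all made-up names for illustration, and the precedence numbers are placeholders, not the real sql operator table:

```haskell
-- Toy expression tree; not the library's real scalar expression type.
data Expr = Num Int
          | BinOp String Expr Expr
          deriving (Eq, Show)

data Assoc = AssocLeft | AssocRight deriving (Eq, Show)

-- Hypothetical fixity table: operator name -> (precedence, associativity).
-- Higher numbers bind tighter.
fixities :: [(String, (Int, Assoc))]
fixities =
  [ ("or",  (1, AssocLeft))
  , ("and", (2, AssocLeft))
  , ("=",   (3, AssocLeft))
  , ("+",   (4, AssocLeft))
  , ("*",   (5, AssocLeft))
  , ("^",   (6, AssocRight))
  ]

-- Unknown operators get the tightest precedence, left associative.
fixity :: String -> (Int, Assoc)
fixity op = maybe (9, AssocLeft) id (lookup op fixities)

-- What the parser would hand over before fixing: the first operand,
-- then (operator, operand) pairs in source order.
type Flat = (Expr, [(String, Expr)])

-- Rebuild the tree so it respects precedence and associativity.
resolve :: Flat -> Expr
resolve (e, rest) = fst (climb 0 e rest)

-- Precedence climbing: consume operators that bind at least as
-- tightly as minPrec, and return whatever input is left over.
climb :: Int -> Expr -> [(String, Expr)] -> (Expr, [(String, Expr)])
climb minPrec lhs ((op, rhs) : rest)
  | prec >= minPrec =
      let nextMin = if assoc == AssocLeft then prec + 1 else prec
          (rhs', rest') = climb nextMin rhs rest
      in climb minPrec (BinOp op lhs rhs') rest'
  where (prec, assoc) = fixity op
climb _ lhs rest = (lhs, rest)
```

e.g. resolve (Num 1, [("+", Num 2), ("*", Num 3)]) gives BinOp "+" (Num 1) (BinOp "*" (Num 2) (Num 3)). joins and set operations could in principle go through the same pass with their own table entries, but that part is untested here.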
table expression in syntax:
QueryExpr = Select SelectList (Maybe TableExpr)
and the TableExpr contains all the other bits?
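one possible shape for this, purely as a sketch: every type, constructor and field name below is invented for illustration and is not the current syntax in the library, and the from list is simplified to plain names:

```haskell
-- Illustrative stand-ins, not the library's real types.
type Name = String

data ScalarExpr = Iden Name
                | NumLit String
                deriving (Eq, Show)

-- Everything that only makes sense once there is a from clause.
data TableExpr = TableExpr
  { teFrom    :: [Name]            -- simplified: real table refs are richer
  , teWhere   :: Maybe ScalarExpr
  , teGroupBy :: [ScalarExpr]
  , teHaving  :: Maybe ScalarExpr
  } deriving (Eq, Show)

-- A select is a select list plus an optional table expression.
data QueryExpr = Select [ScalarExpr] (Maybe TableExpr)
  deriving (Eq, Show)

-- "select 1": no table expression at all.
selectOne :: QueryExpr
selectOne = Select [NumLit "1"] Nothing

-- "select a from t": the from clause lives inside the TableExpr.
selectAFromT :: QueryExpr
selectAFromT =
  Select [Iden "a"] (Just (TableExpr ["t"] Nothing [] Nothing))
```

the upside of this shape is that a where or group by without a from cannot be represented at all; the cost is one more level of nesting for every ordinary query.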
change the booleans in the ast to better types for less ambiguity?
decide how to handle character set literals and identifiers: don't
have any intention of actually supporting switching character sets
in the middle of parsing so maybe this would be better disabled?
review places in the parse which should allow only a fixed set of
identifiers (e.g. in interval literals), keep in mind other
dialects and extensibility
decide whether to represent numeric literals better, instead of a
single string - break up into parts, or parse to a Decimal or
something
= future big feature summary
all ansi sql queries
completely working expression tree parsing
error messages, left factor
dml, ddl, procedural sql
position annotation
type checker/ etc.
lexer
dialects
quasi quotes
typesafe sql dbms wrapper support for haskell
extensibility
performance analysis
try out uu-parsing or polyparse, especially wrt error message
improvements
= stuff
try and use the proper css theme
create a header like in the haddock with simple-sql-parser +
contents link
change the toc gen so that it works the same as in haddock (same
div, no links on the actual titles)
fix the page margins, and the table stuff: patches to the css?
release checklist:
hlint
haddock review
spell check
update changelog
update website text
regenerate the examples on the index.txt
= Later general tasks:
docs
add preamble to the rendered test page
add links from the supported sql page to the rendered test page for
each section -> have to section up the tests some more
testing
review tests to copy from hssqlppp
add lots more tests using SQL from the xb2 manual
much more table reference tests, for joins and aliases etc.?
review internal sql collection for more syntax/tests
other
----
demo program: convert tpch to sql server syntax exe processor
run through other manuals for example queries and features: sql in a
nutshell, sql guide, sql reference guide, sql standard, sql server
manual, oracle manual, teradata manual + re-through postgresql
manual and make notes in each case of all syntax and which isn't
currently supported also.
check the order of exports, imports and functions/cases in the files
fix up the import namespaces/explicit names nicely
ast checker: checks the ast represents valid syntax, the parser
doesn't check as much as it could, and this can also be used to
check generated trees. Maybe this doesn't belong in this package
though?
= other sql support
top
string literals
full number literals -> other bases?
apply, pivot
maybe add dml and ddl, source positions, quasi quotes
leave: type check, dialects, procedural, separate lexing?
other dialect targets:
postgres
oracle
teradata
ms sql server
mysql?
db2?
what other major dialects are there?
sqlite
sap dbmss (can't work out what are separate products or what are the
dialects)
here is an idea for a little feature:
crunch sql: this takes sql and tries to make it as small as possible
(basically, combining nested selects where possible and inlining
ctes)
expand sql:
breaks apart complex sql using nested queries and ctes, try to make
queries easier to understand in stages