| medium tasks next release
 | |
| 
 | |
| review alters, and think about adding rename versions
 | |
|   which are really common and useful, but not in ansi
 | |
|   https://github.com/JakeWheat/simple-sql-parser/issues/20
 | |
| 
 | |
| try to get some control over the pretty printing and the error
 | |
| messages by creating some dumps of pretty printing and error messages,
 | |
| then can rerun these every so often to see how they've changed
 | |
| 
 | |
| finish off going through the keyword list
 | |
| 
 | |
| do more examples
 | |
| what are the use cases?
 | |
|   sql generator - queries
 | |
|   sql generator - ddl
 | |
|   parsing some sql - for what purpose
 | |
|     generating documentation of ddl
 | |
|     write some sort of trivial sql engine or wrapper around something?
 | |
|     write something that takes sql, modifies it, and outputs the result
 | |
|     lint checker?
 | |
| 
 | |
| do an example of adding some new syntax
 | |
| -> seems quite a few people are using this
 | |
|   and there are some feature requests
 | |
|   try to give people a path to implement features themselves
 | |
| 
 | |
| goals:
 | |
| 
 | |
| 1. if someone might want to use this, give them some toy examples to
 | |
| help bootstrap them
 | |
| 
 | |
| 2. see if can encourage people who want some missing sql to add it
 | |
| themselves
 | |
| 
 | |
| 
 | |
| 
 | |
| review main missing sql bits - focus on more mainstream things
 | |
|   could also review main dialects
 | |
| 
 | |
| 
 | |
| syntax from hssqlppp:
 | |
|   query hints, join hints
 | |
| 
 | |
| unescaping identifiers and strings
 | |
| continuation strings testing
 | |
| 
 | |
| add tests for comment pretty printing:
 | |
|   use pretty then lex
 | |
| 
 | |
| work on better dialect design: more basic customizability and rule /
 | |
|    callback driven
 | |
| 
 | |
| review/fix documentation and website
 | |
| fix the groups for generated tests
 | |
| 
 | |
| check the .cabal file module lists
 | |
| 
 | |
| 
 | |
| medium tasks next release + 1
 | |
| add annotation
 | |
| lots more negative tests especially for lexing, and for dialects
 | |
| escape, uescape
 | |
| post hoc fixity
 | |
| switch pretty printing to use ansi-wl-pprint
 | |
|   http://conscientiousprogrammer.com/blog/2015/12/17/24-days-of-hackage-2015-day-17-ansi-wl-pprint-avoiding-string-hacking/
 | |
| 
 | |
| 
 | |
| error message analysis:
 | |
| start with a set of bad sql, generate & write
 | |
| get error messages:
 | |
|   simplified ssp parser
 | |
|   tutorial parser
 | |
|   hssqlppp
 | |
|   and also:
 | |
|     postgres
 | |
|     mysql
 | |
|     sqlserver
 | |
|     oracle
 | |
|     db2
 | |
|     vertica?
 | |
| evaluate other parsing libs for error messages and general
 | |
|    feasibility, shortlist is:
 | |
|    megaparsec
 | |
|    trifecta
 | |
|    uuparsinglib
 | |
|    other desirables from parsing lib:
 | |
|      incremental parsing
 | |
|      context dependent lexer switch
 | |
|      continue after error
 | |
| 
 | |
| create some benchmarks (to measure performance when modifying for
 | |
|    error messages, and to compare different parser libs for instance)
 | |
| 
 | |
| use quickcheck in lexing
 | |
| 
 | |
| What will make this library nice and complete:
 | |
| List of all the SQL that it doesn't support
 | |
| annotation, with positions coming from the parser
 | |
| dml
 | |
| ddl
 | |
| procedural sql
 | |
| dialects: reasonable support for sql server and oracle, and maybe also
 | |
|    postgres, mysql, teradata, redshift, sqlite, db2, sap stuff, etc.
 | |
| good work on error messages
 | |
| fixity code + get it right
 | |
| review names of syntax
 | |
| defaults handled better (use default/nothing instead of substituting
 | |
|    in the default)
 | |
| evaluate uu parsing lib -> could at least remove need to do left
 | |
|    factoring, and maybe help make better error messages also
 | |
| -----
 | |
| 
 | |
| work on reasonable subset of sql which is similar to the current
 | |
|    subset and smaller than the complete 2011 target: describe the
 | |
|    exact target set for the next release
 | |
| 
 | |
| improve the dialect testing: add notes on what to do
 | |
| 
 | |
| position annotation in the syntax
 | |
| 
 | |
| simple stuff for error message and pretty printing monitoring:
 | |
| 
 | |
| create a sample set of valid statements to pretty print
 | |
| pretty print these
 | |
| compare every so often to catch regressions and approve improvements
 | |
| start with tpch, and then add some others
 | |
| 
 | |
| same with invalid statements to see the error messages
 | |
| start with some simple scalar exprs and a big query expr which has
 | |
|    stuff (either tokens, whitespace or junk strings)
 | |
|    semi-systematically added and/or removed
 | |
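A minimal sketch of the monitoring idea above, assuming the dumps are kept as simple (input, output) pairs. The names `Snapshot`, `Change`, `takeSnapshot` and `diffSnapshots` are hypothetical and not part of the library; `render` stands in for whatever produces the pretty printed text or the error message.

```haskell
import Data.Maybe (mapMaybe)

-- | One recorded run: each input SQL string paired with the text it
-- produced (pretty printed output or an error message).
type Snapshot = [(String, String)]

data Change
  = Added String String          -- ^ input, new output
  | Removed String String        -- ^ input, old output
  | Changed String String String -- ^ input, old output, new output
  deriving (Eq, Show)

-- | Build a fresh snapshot by running a renderer over the inputs.
takeSnapshot :: (String -> String) -> [String] -> Snapshot
takeSnapshot render inputs = [(i, render i) | i <- inputs]

-- | Compare a stored snapshot with a fresh one. Unchanged entries are
-- dropped, so an empty result means nothing to review.
diffSnapshots :: Snapshot -> Snapshot -> [Change]
diffSnapshots old new = mapMaybe check new ++ removed
  where
    check (i, n) = case lookup i old of
      Nothing -> Just (Added i n)
      Just o
        | o /= n    -> Just (Changed i o n)
        | otherwise -> Nothing
    removed = [Removed i o | (i, o) <- old, lookup i new == Nothing]
```

Each `Changed` entry is then either a regression to fix or an improvement to approve by overwriting the stored snapshot.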
| 
 | |
| fixing the non idiomatic (pun!) suffix parsing:
 | |
|   typename parsing
 | |
|   identifier/app/agg/window parsing
 | |
|   join parsing in trefs (use chain? - tricky because of postfix onExpr)
 | |
|   top level and queryexprs parsing
 | |
| 
 | |
| review names in the syntax for correspondence with sql standard, avoid
 | |
|    gratuitous differences
 | |
| 
 | |
| touch up the expr hack as best as can, start thinking about
 | |
|    replacement for buildExprParser, maybe this can be a separate
 | |
|    general package, or maybe something like this already exists
 | |
| 
 | |
| careful review of token parses wrt trailing delimiters/junk - already
 | |
|    caught a few issues like this incidentally when working on other
 | |
|    stuff
 | |
| 
 | |
| undo mess in the code created by adding lots of new support:
 | |
| much more documentation
 | |
| refactor crufty bits
 | |
| reorder the code
 | |
| reconsider the names and structure of the constructors in the syntax
 | |
| refactor the typename parser - it's a real mess
 | |
| fix the lexing
 | |
| 
 | |
| add documentation in Parser.lhs on the left factoring/error handling
 | |
|    approach
 | |
| 
 | |
| fixes:
 | |
| 
 | |
| keyword tree, add explicit result then can use for joins also
 | |
| 
 | |
| keyword tree support prefix mode so can start from already parsed
 | |
|    token
 | |
| 
 | |
| left factor/try removal summary (this list needs updating):
 | |
| 
 | |
| identifier starts:
 | |
|   interval literal
 | |
|   character set literal
 | |
|   typed literals, multikeywords
 | |
|   identifier
 | |
|   app, agg, window
 | |
|   keyword function
 | |
| issues in the special op internals
 | |
| not between + other ops: needs new expression parsing
 | |
|   not in also
 | |
|   in suffix also
 | |
|   lots of overlap with binary and postfix multi keyword operators
 | |
|   quantified comparison also
 | |
| issues in the typename parsing
 | |
| dot in identifiers and as operator
 | |
| issues in the symbol parser
 | |
|   hardcode all the symbols in the symbol parser/split?
 | |
| conflict between the 'in' suffix and the 'in' inside position
 | |
| 
 | |
| rules for changing the multi keyword parsing:
 | |
|   if a keyword must be followed by another
 | |
|     e.g. left join, want to refactor to produce 'expected "left join"'
 | |
|   if the keyword is optionally followed by another, e.g. with
 | |
|    recursive, then don't do this.
 | |
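A rough sketch of the two rules above over a plain list of token strings. This is not the library's keyword tree; `keywords` and `withClause` are hypothetical helpers, and real tokens would not be bare strings.

```haskell
import Data.Char (toLower)

-- | Match a fixed keyword phrase at the start of the token list.
-- On failure the whole phrase is reported, so a mandatory pair such
-- as "left join" gives: expected "left join".
-- The phrase is assumed to be given in lower case.
keywords :: [String] -> [String] -> Either String [String]
keywords phrase toks
  | map (map toLower) (take n toks) == phrase = Right (drop n toks)
  | otherwise = Left ("expected \"" ++ unwords phrase ++ "\"")
  where
    n = length phrase

-- | "with" optionally followed by "recursive": the second keyword is
-- tried separately, so it never appears in the error as if it were
-- required. Returns whether "recursive" was present.
withClause :: [String] -> Either String (Bool, [String])
withClause toks = do
  rest <- keywords ["with"] toks
  case keywords ["recursive"] rest of
    Right rest' -> Right (True, rest')
    Left _      -> Right (False, rest)
```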
| 
 | |
| change join defaults to be defaults
 | |
| 
 | |
| 
 | |
| rough SQL 2011 todo, including tests to write:
 | |
| 
 | |
| review the commented out reserved keyword entries and work out how to
 | |
|    fix
 | |
| 
 | |
| test case insensitivity and case preservation
 | |
| 
 | |
| big areas:
 | |
| window functions
 | |
| nested window functions
 | |
| case
 | |
| 
 | |
| table ref: tablesample, time period spec, only, unnest, table, lateral
 | |
|    bug
 | |
| joined table: partitioned joins
 | |
| group by: set quantifier
 | |
| window clause
 | |
| 
 | |
| other areas:
 | |
| unicode escape, strings and idens
 | |
| character set behaviour review
 | |
| datetime literals
 | |
| mixed quoting identifier chains
 | |
| names/identifiers careful review
 | |
| general value bits
 | |
|   collate for
 | |
| numeric val fn
 | |
| string exp fn
 | |
| datetime exp fn
 | |
| interval exp fn
 | |
| rows
 | |
| interval qualifier
 | |
| with
 | |
| setop
 | |
| order/offset/fetch
 | |
| search/cycle
 | |
| preds:
 | |
| between
 | |
| in
 | |
| like
 | |
| similar
 | |
| regex like?
 | |
| null
 | |
| normalize
 | |
| match
 | |
| overlaps
 | |
| distinct
 | |
| member
 | |
| submultiset
 | |
| period
 | |
| 
 | |
| alias for * in select list
 | |
| 
 | |
| create list of unsupported syntax: xml, ref, subtypes, modules?
 | |
| 
 | |
| ---
 | |
| 
 | |
| 
 | |
| 
 | |
| after next release
 | |
| 
 | |
| medium term goals:
 | |
| 1. replace parser and syntax in hssqlppp with this code (keep two
 | |
|    separate packages in sync)
 | |
| 2. this replacement should have better error messages, much more
 | |
|    complete ansi sql 2011 support, and probably will have reasonable
 | |
|    support for these dialects: mssql, oracle and teradata.
 | |
| 
 | |
| review areas where this parser is too permissive, e.g. value
 | |
|    expressions allowed where column reference names only should be
 | |
|    allowed, such as group by, order by (perhaps there can be a flag or
 | |
|    warnings or something), unqualified asterisk in select list
 | |
| 
 | |
| fix the expression parser completely: the realistic way is to adjust
 | |
|    for precedence and associativity after parsing since the concrete
 | |
|    syntax is so messy. should also use this expression parser for
 | |
|    parsing joins and for set operations, maybe other areas.
 | |
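One possible shape for the post-parse fixup, as a sketch only: parse binary operators into a flat list with no precedence, then rebuild the tree by precedence climbing. The `Expr` type and the `fixityOf` table are made up for illustration and are not the library's syntax or the SQL standard's precedence levels; prefix, postfix and multi keyword operators are left out.

```haskell
data Expr
  = Lit String
  | BinOp String Expr Expr
  deriving (Eq, Show)

data Assoc = AssocLeft | AssocRight
  deriving (Eq, Show)

-- | Hypothetical fixity table; unknown operators bind loosest.
fixityOf :: String -> (Int, Assoc)
fixityOf op = case op of
  "or"  -> (1, AssocLeft)
  "and" -> (2, AssocLeft)
  "="   -> (3, AssocLeft)
  "+"   -> (4, AssocLeft)
  "-"   -> (4, AssocLeft)
  "*"   -> (5, AssocLeft)
  "/"   -> (5, AssocLeft)
  "^"   -> (6, AssocRight)
  _     -> (0, AssocLeft)

-- | Rebuild a tree from the flat form produced by the parser: the
-- first operand followed by (operator, operand) pairs.
resolve :: Expr -> [(String, Expr)] -> Expr
resolve first rest = fst (climb 0 first rest)

-- | Consume operators whose precedence is at least minPrec, returning
-- the expression built so far and the unconsumed remainder.
climb :: Int -> Expr -> [(String, Expr)] -> (Expr, [(String, Expr)])
climb minPrec lhs ((op, rhs0) : rest)
  | prec >= minPrec =
      let nextMin = if assoc == AssocLeft then prec + 1 else prec
          (rhs, rest') = climb nextMin rhs0 rest
      in climb minPrec (BinOp op lhs rhs) rest'
  where
    (prec, assoc) = fixityOf op
climb _ lhs rest = (lhs, rest)
```

For example, `resolve (Lit "1") [("+", Lit "2"), ("*", Lit "3")]` groups the multiplication under the addition. The same routine could in principle be pointed at joins or set operations by swapping the table.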
| 
 | |
| table expression in syntax:
 | |
|   QueryExpr = Select SelectList (Maybe TableExpr)
 | |
|   and the TableExpr contains all the other bits?
 | |
| 
 | |
| change the booleans in the ast to better types for less ambiguity?
 | |
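To illustrate the point about booleans, a small made-up example (none of these types are taken from the current syntax module): two `Bool` fields whose meaning has to be looked up, next to the same node with one small type per flag.

```haskell
-- | Hypothetical "before": which Bool is which, and what does True mean?
-- Assumed here: first Bool = ascending, second Bool = nulls first.
data SortSpecB = SortSpecB String Bool Bool
  deriving (Eq, Show)

data Direction = Asc | Desc
  deriving (Eq, Show)

data NullsOrder = NullsFirst | NullsLast
  deriving (Eq, Show)

-- | Hypothetical "after": the constructors document themselves and the
-- two flags can no longer be swapped by accident.
data SortSpec = SortSpec String Direction NullsOrder
  deriving (Eq, Show)

-- | Conversion from the boolean form, under the assumption stated above.
fromBools :: SortSpecB -> SortSpec
fromBools (SortSpecB col asc nullsFirst) =
  SortSpec col
    (if asc then Asc else Desc)
    (if nullsFirst then NullsFirst else NullsLast)

renderSortSpec :: SortSpec -> String
renderSortSpec (SortSpec col dir nulls) = unwords [col, d, n]
  where
    d = case dir of
      Asc  -> "asc"
      Desc -> "desc"
    n = case nulls of
      NullsFirst -> "nulls first"
      NullsLast  -> "nulls last"
```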
| 
 | |
| decide how to handle character set literals and identifiers: don't
 | |
|    have any intention of actually supporting switching character sets
 | |
|    in the middle of parsing so maybe this would be better disabled?
 | |
| 
 | |
| review places in the parse which should allow only a fixed set of
 | |
|    identifiers (e.g. in interval literals), keep in mind other
 | |
|    dialects and extensibility
 | |
| 
 | |
| decide whether to represent numeric literals better, instead of a
 | |
|    single string - break up into parts, or parse to a Decimal or
 | |
|    something
 | |
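A sketch of the "break up into parts" option, assuming unsigned literals of the form digits, optional fraction, optional exponent. `NumLit` and `splitNumLit` are hypothetical names; keeping the digit strings (instead of converting to a number type) preserves the source text exactly, which matters for round tripping through the pretty printer.

```haskell
import Data.Char (isDigit)

-- | A numeric literal split into its parts. The digit strings are kept
-- as written, so "1.50" and "1.5" stay distinguishable.
data NumLit = NumLit
  { wholePart :: String         -- ^ digits before the point, may be empty
  , fracPart  :: Maybe String   -- ^ digits after the point, if a point was present
  , expPart   :: Maybe Integer  -- ^ exponent, if present
  } deriving (Eq, Show)

-- | Split a literal such as "12", "1.5", ".5", "1." or "1.5e-3".
-- Returns Nothing if the text is not of that form.
splitNumLit :: String -> Maybe NumLit
splitNumLit s0 = do
  let (whole, r1) = span isDigit s0
  (frac, r2) <- case r1 of
    '.' : r -> let (f, r') = span isDigit r in Just (Just f, r')
    _       -> Just (Nothing, r1)
  ex <- case r2 of
    [] -> Just Nothing
    (e : r) | e `elem` "eE" -> Just <$> readExp r
    _ -> Nothing
  -- there must be at least one digit before the exponent
  if null whole && maybe True null frac
    then Nothing
    else Just (NumLit whole frac ex)

-- | Read an optionally signed exponent.
readExp :: String -> Maybe Integer
readExp ('-' : ds) = negate <$> readDigits ds
readExp ('+' : ds) = readDigits ds
readExp ds         = readDigits ds

readDigits :: String -> Maybe Integer
readDigits ds
  | not (null ds) && all isDigit ds = Just (read ds)
  | otherwise = Nothing
```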
| 
 | |
| 
 | |
| = future big feature summary
 | |
| 
 | |
| all ansi sql queries
 | |
| completely working expression tree parsing
 | |
| error messages, left factor
 | |
| dml, ddl, procedural sql
 | |
| position annotation
 | |
| type checker/ etc.
 | |
| lexer
 | |
| dialects
 | |
| quasi quotes
 | |
| typesafe sql dbms wrapper support for haskell
 | |
| extensibility
 | |
| performance analysis
 | |
| 
 | |
| try out uu-parsing or polyparse, especially wrt error message
 | |
|    improvements
 | |
| 
 | |
| = stuff
 | |
| 
 | |
| try and use the proper css theme
 | |
|   create a header like in the haddock with simple-sql-parser +
 | |
|     contents link
 | |
|   change the toc gen so that it works the same as in haddock (same
 | |
|     div, no links on the actual titles)
 | |
|   fix the page margins, and the table stuff: patches to the css?
 | |
| 
 | |
| release checklist:
 | |
| hlint
 | |
| haddock review
 | |
| spell check
 | |
| update changelog
 | |
| update website text
 | |
| regenerate the examples on the index.txt
 | |
| 
 | |
| = Later general tasks:
 | |
| 
 | |
| docs
 | |
| 
 | |
| add preamble to the rendered test page
 | |
| 
 | |
| add links from the supported sql page to the rendered test page for
 | |
|    each section -> have to section up the tests some more
 | |
| 
 | |
| testing
 | |
| 
 | |
| review tests to copy from hssqlppp
 | |
| 
 | |
| add lots more tests using SQL from the xb2 manual
 | |
| 
 | |
| much more table reference tests, for joins and aliases etc.?
 | |
| 
 | |
| review internal sql collection for more syntax/tests
 | |
| 
 | |
| other
 | |
| 
 | |
| ----
 | |
| 
 | |
| demo program: convert tpch to sql server syntax exe processor
 | |
| 
 | |
| run through other manuals for example queries and features: sql in a
 | |
|    nutshell, sql guide, sql reference guide, sql standard, sql server
 | |
|    manual, oracle manual, teradata manual + go back through the postgresql
 | |
|    manual and make notes in each case of all syntax and which isn't
 | |
|    currently supported also.
 | |
| 
 | |
| check the order of exports, imports and functions/cases in the files
 | |
| fix up the import namespaces/explicit names nicely
 | |
| 
 | |
| ast checker: checks the ast represents valid syntax, the parser
 | |
|    doesn't check as much as it could, and this can also be used to
 | |
|    check generated trees. Maybe this doesn't belong in this package
 | |
|    though?
 | |
| 
 | |
| = other sql support
 | |
| 
 | |
| top
 | |
| string literals
 | |
| full number literals -> other bases?
 | |
| apply, pivot
 | |
| 
 | |
| maybe add dml and ddl, source positions, quasi quotes
 | |
| 
 | |
| leave: type check, dialects, procedural, separate lexing?
 | |
| 
 | |
| other dialect targets:
 | |
| postgres
 | |
| oracle
 | |
| teradata
 | |
| ms sql server
 | |
| mysql?
 | |
| db2?
 | |
| what other major dialects are there?
 | |
| sqlite
 | |
| sap dbmss (can't work out what are separate products or what are the
 | |
|    dialects)
 | |
| 
 | |
| 
 | |
| 
 | |
| here is an idea for a little feature:
 | |
| crunch sql: this takes sql and tries to make it as small as possible
 | |
|   (basically, combining nested selects where possible and inlining
 | |
|    ctes)
 | |
| expand sql:
 | |
|   breaks apart complex sql using nested queries and ctes, try to make
 | |
|    queries easier to understand in stages
 | |
| 
 | 
