
work around haddock's refusal to parse literate comment lines with * in the first character position
get rid of code_units since it is not in SQL 2011
implement next value for
parse the nullary functions with reserved names
updates to the sql2003 file
Jake Wheat 2014-04-19 21:17:19 +03:00
parent 7057241974
commit 7a7f4ba7aa
7 changed files with 167 additions and 66 deletions


@@ -38,9 +38,9 @@ lexers, this isn't 100% complete at the moment and needs fixing.
The parsing code is aggressively left factored, and try is avoided as
much as possible. Try is avoided because:
* when it is overused it makes the code hard to follow
* when it is overused it makes the parsing code harder to debug
* it makes the parser error messages much worse
The code could be made a bit simpler with a few extra 'trys', but this
isn't done because of the impact on the parser error
@@ -74,9 +74,9 @@ syntax.
There are three big areas which are tricky to left factor:
* typenames
* value expressions which can start with an identifier
* infix and suffix operators
=== typenames
@@ -97,12 +97,12 @@ error messages really bad.
Here is a list of these nodes:
* identifiers
* function application
* aggregate application
* window application
* typed literal: typename 'literal string'
* interval literal which is like the typed literal with some extras
There is further ambiguity e.g. with typed literals with precision,
functions, aggregates, etc. - these are an identifier, followed by
@@ -113,12 +113,12 @@ is.
There is also a set of nodes which start with an identifier/keyword
but can commit since no other syntax can start the same way:
* case
* cast
* exists, unique subquery
* array constructor
* multiset constructor
* all the special syntax functions: extract, position, substring,
convert, translate, overlay, trim, etc.
The interval literal mentioned above is treated in this group at the
@@ -143,10 +143,10 @@ standard which is able to eliminate a number of possibilities just in
the grammar, which this parser allows. This is done for a number of
reasons:
* it makes the parser simple - less variations
* it should allow for dialects and extensibility more easily in the
future (e.g. new infix binary operators with custom precedence)
* many things which are effectively checked in the grammar in the
standard, can be checked using a typechecker or other simple static
analysis
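The left-factoring idea described in this comment block can be sketched with base's ReadP combinators (the real parser uses parsec, and the `Expr` type and names below are illustrative, not the library's): the shared prefix is consumed exactly once, then the parser branches, so no try/backtracking is needed.

```haskell
-- Minimal sketch of left factoring, using Text.ParserCombinators.ReadP
-- from base instead of parsec; types and names are illustrative.
import Text.ParserCombinators.ReadP
import Data.Char (isAlpha)

data Expr = NextValueFor String  -- "next value for <name>"
          | Plain String         -- any other bare identifier
  deriving (Eq, Show)

-- one alphabetic word followed by optional trailing spaces
word :: ReadP String
word = munch1 isAlpha <* skipSpaces

keyword :: String -> ReadP ()
keyword k = do
  w <- word
  if w == k then pure () else pfail

-- Left factored: the first word is consumed exactly once, and the
-- branch is chosen afterwards by inspecting it.
expr :: ReadP Expr
expr = do
  w <- word
  if w == "next"
    then NextValueFor <$> (keyword "value" *> keyword "for" *> word)
    else pure (Plain w)

parse :: String -> Maybe Expr
parse s = case readP_to_S (expr <* eof) s of
  [(e, "")] -> Just e
  _         -> Nothing
```

The unfactored alternative, `nextValueFor <|> plainIden`, would need a `try` around the first branch because both start with an identifier; factoring the prefix out removes that and keeps error messages anchored at the point of failure.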
@@ -481,7 +481,6 @@ TODO: this code needs heavy refactoring
> ,return Nothing]
> return (p,x)
> lobUnits = choice [LobCharacters <$ keyword_ "characters"
- > ,LobCodeUnits <$ keyword_ "code_units"
> ,LobOctets <$ keyword_ "octets"]
> -- deal with multiset and array suffixes
> tnSuffix x =
@@ -657,6 +656,10 @@ multiset(query expr). It must be there for compatibility or something.
> ,keyword_ "table" >>
> MultisetQueryCtor <$> parens queryExpr]
+ > nextValueFor :: Parser ValueExpr
+ > nextValueFor = keywords_ ["next","value","for"] >>
+ >     NextValueFor <$> names
=== interval
interval literals are a special case and we follow the grammar less
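The added `nextValueFor` parser maps `next value for a.b` to `NextValueFor [Name "a", Name "b"]`. A self-contained sketch of that shape, again with ReadP standing in for parsec and the `Name`/`ValueExpr` types redefined locally (simplified from the library's syntax module):

```haskell
-- Sketch of the "next value for" parse; ReadP from base stands in for
-- parsec, and the types are simplified local stand-ins.
import Text.ParserCombinators.ReadP
import Data.Char (isAlpha, isAlphaNum)

newtype Name = Name String deriving (Eq, Show)

data ValueExpr = NextValueFor [Name] deriving (Eq, Show)

-- an identifier: a letter followed by alphanumerics
ident :: ReadP String
ident = (:) <$> satisfy isAlpha <*> munch isAlphaNum

keyword :: String -> ReadP ()
keyword k = string k *> skipSpaces

-- next value for a.b  parses to  NextValueFor [Name "a", Name "b"]
nextValueFor :: ReadP ValueExpr
nextValueFor = do
  mapM_ keyword ["next", "value", "for"]
  NextValueFor . map Name <$> sepBy1 ident (char '.')

parseNVF :: String -> Maybe ValueExpr
parseNVF s = case readP_to_S (nextValueFor <* eof) s of
  [(e, "")] -> Just e
  _         -> Nothing
```

In the real parser the three keywords are reserved words, which is why the commit also adjusts the reserved-keyword list below.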
@@ -1161,6 +1164,7 @@ fragile and could at least do with some heavy explanation.
> ,cast
> ,arrayCtor
> ,multisetCtor
+ > ,nextValueFor
> ,subquery
> ,intervalLit
> ,specialOpKs
@@ -1740,7 +1744,7 @@ means).
> ,"class_origin"
> ,"coalesce"
> ,"cobol"
- > ,"code_units"
+ > --,"code_units"
> ,"collation"
> ,"collation_catalog"
> ,"collation_name"
@@ -2002,13 +2006,13 @@ means).
> ,"cube"
> ,"current"
> --,"current_date"
- > ,"current_default_transform_group"
+ > --,"current_default_transform_group"
- > ,"current_path"
+ > --,"current_path"
- > ,"current_role"
+ > --,"current_role"
> ,"current_time"
> ,"current_timestamp"
> ,"current_transform_group_for_type"
- > ,"current_user"
+ > --,"current_user"
> ,"cursor"
> ,"cycle"
> ,"date"
@@ -2052,7 +2056,7 @@ means).
> ,"global"
> ,"grant"
> ,"group"
- > ,"grouping"
+ > --,"grouping"
> ,"having"
> ,"hold"
> --,"hour"
@@ -2088,7 +2092,7 @@ means).
> ,"method"
> --,"minute"
> ,"modifies"
- > ,"module"
+ > --,"module"
> --,"month"
> ,"multiset"
> ,"national"
@@ -2151,7 +2155,7 @@ means).
> --,"second"
> ,"select"
> ,"sensitive"
- > ,"session_user"
+ > --,"session_user"
> --,"set"
> ,"similar"
> ,"smallint"
@@ -2167,7 +2171,7 @@ means).
> ,"submultiset"
> ,"symmetric"
> ,"system"
- > ,"system_user"
+ > --,"system_user"
> ,"table"
> ,"then"
> ,"time"
@@ -2187,7 +2191,7 @@ means).
> ,"unnest"
> ,"update"
> ,"upper"
- > ,"user"
+ > --,"user"
> ,"using"
> --,"value"
> ,"values"


@@ -218,6 +218,9 @@ which have been changed to try to improve the layout of the output.
> valueExpr (Collate v c) =
> valueExpr v <+> text "collate" <+> names c
+ > valueExpr (NextValueFor ns) =
+ >     text "next value for" <+> names ns
> doubleUpQuotes :: String -> String
> doubleUpQuotes [] = []
@@ -263,7 +266,6 @@ which have been changed to try to improve the layout of the output.
> LobG -> text "G") m
> <+> me (\x -> case x of
> LobCharacters -> text "CHARACTERS"
- > LobCodeUnits -> text "CODE_UNITS"
> LobOctets -> text "OCTETS") u)
> typeName (CharTypeName t i cs col) =
> names t
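The pretty-printing side of the new construct can be sketched with plain Strings (the real code uses the pretty package's `Doc` and quotes names where needed; the types here are simplified local stand-ins), mirroring the diff's `text "next value for" <+> names ns`:

```haskell
-- Plain-String sketch of rendering NextValueFor; types simplified
-- from the library's syntax module so the sketch stands alone.
import Data.List (intercalate)

newtype Name = Name String deriving (Eq, Show)

data ValueExpr = NextValueFor [Name] deriving (Eq, Show)

-- render a dotted name chain: [Name "a", Name "b"] -> "a.b"
names :: [Name] -> String
names ns = intercalate "." [n | Name n <- ns]

valueExpr :: ValueExpr -> String
valueExpr (NextValueFor ns) = "next value for " ++ names ns
```

Together with the parser change this keeps the parse/pretty round trip intact: pretty-printing a parsed `next value for a.b` reproduces the input.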


@@ -158,6 +158,7 @@
> | MultisetBinOp ValueExpr CombineOp SetQuantifier ValueExpr
> | MultisetCtor [ValueExpr]
> | MultisetQueryCtor QueryExpr
+ > | NextValueFor [Name]
> deriving (Eq,Show,Read,Data,Typeable)
> -- | Represents an identifier name, which can be quoted or unquoted.
@@ -190,7 +191,6 @@ TODO: add ref and scope, any others?
> data LobMultiplier = LobK | LobM | LobG
> deriving (Eq,Show,Read,Data,Typeable)
> data LobUnits = LobCharacters
- > | LobCodeUnits
> | LobOctets
> deriving (Eq,Show,Read,Data,Typeable)

TODO

@@ -1,5 +1,11 @@
continue 2003 review and tests
+ 1. start replacing the 2003 stuff with 2011
+ 2. create an error message document for the website
+    - base of error messages but add some more variations
+ 3. start thinking about tests for invalid syntax
touch up the expr hack as best as can
careful review of token parses wrt trailing delimiters/junk
@@ -56,6 +62,8 @@ rules for changing the multi keyword parsing:
rough SQL 2003 todo, including tests to write:
+ switch to SQL 2011
now:
review the commented out reserved keyword entries and work out how to
fix
@@ -64,7 +72,6 @@ go through almost all the predicates
window functions missing bits, window clauses
from: more tests, review missing
tablesample, unnest, etc.
- aggregates: where, filter + review
rows review
match missing bit
between symmetric


@@ -1,4 +1,4 @@
- 0.4.0-dev (updated to 7a847045163feb2339ab40ebe93afe2f1c9ad813)
+ 0.4.0-dev (updated to 705724197463cd19dd8749dfd51e2eb8f1d02b8e)
completely remove dependency on haskell-src-exts
improve the error messages a great deal
fix some trailing whitespace issues in the keyword style functions,
@@ -52,6 +52,11 @@
quote character in the identifier
implement complete interval literals (fixed the handling of the
interval qualifier)
+ make most of the standard reserved words actually reserved (still
+ some gaps)
+ change the natural in join abstract syntax to match the concrete
+ syntax instead of combining natural, on and using into one field
+ support filter and within group for aggregates
0.3.1 (commit 5cba9a1cac19d66166aed2876d809aef892ff59f)
update to work with ghc 7.8.1
0.3.0 (commit 9e75fa93650b4f1a08d94f4225a243bcc50445ae)


@@ -28,7 +28,7 @@ library
Language.SQL.SimpleSQL.Parser,
Language.SQL.SimpleSQL.Syntax
other-extensions: TupleSections
- build-depends: base >=4.6 && <4.7,
+ build-depends: base >=4.6 && <4.8,
parsec >=3.1 && <3.2,
mtl >=2.1 && <2.2,
pretty >= 1.1 && < 1.2
@@ -40,7 +40,7 @@ Test-Suite Tests
type: exitcode-stdio-1.0
main-is: RunTests.lhs
hs-source-dirs: .,tools
- Build-Depends: base >=4.6 && <4.7,
+ Build-Depends: base >=4.6 && <4.8,
parsec >=3.1 && <3.2,
mtl >=2.1 && <2.2,
pretty >= 1.1 && < 1.2,
@@ -71,7 +71,7 @@ Test-Suite Tests
executable SQLIndent
main-is: SQLIndent.lhs
hs-source-dirs: .,tools
- Build-Depends: base >=4.6 && <4.7,
+ Build-Depends: base >=4.6 && <4.8,
parsec >=3.1 && <3.2,
mtl >=2.1 && <2.2,
pretty >= 1.1 && < 1.2


@@ -15,7 +15,8 @@ large amount of the SQL.
> sql2003Tests :: TestItem
> sql2003Tests = Group "sql2003Tests"
- > [stringLiterals
+ > [Group "literals" [
+ > stringLiterals
> ,nationalCharacterStringLiterals
> ,unicodeStringLiterals
> ,binaryStringLiterals
@@ -23,32 +24,72 @@ large amount of the SQL.
> ,intervalLiterals
> ,booleanLiterals
> ,identifiers
- > ,typeNameTests
+ > ],Group "value expressions"
+ > [typeNameTests
> ,parenthesizedValueExpression
+ > ,someGeneralValues
> ,targetSpecification
> ,contextuallyTypeValueSpec
- > --,nextValueExpression
+ > ,moduleColumnRef
+ > ,groupingOperation
+ > --,windowFunction
+ > --,caseExpression
+ > --,castSpecification
+ > ,nextValueExpression
+ > -- subtype treatment, method invoc, static m i, new spec, attrib/method ref, deref, method ref, ref res
> ,arrayElementReference
> ,multisetElementReference
- > --,numericValueExpression
+ > ,numericValueExpression
+ > --,numericValueFunction
+ > --,stringValueExpression
+ > --,stringValueFunction
+ > --,datetimeValueExpression
+ > --,datetimeValueFunction
+ > --,intervalValueExpression
+ > --,intervalValueFunction
> --,booleanValueExpression
+ > --arrayValueExpression
> ,arrayValueConstructor
> ,multisetValueExpression
> ,multisetValueFunction
> ,multisetValueConstructor
+ > ],Group "query expressions"
+ > [
+ > -- rowValueConstructor
+ > --,rowValueExpression
> --,tableValueConstructor
> --,fromClause
+ > --,joinedTable
> --,whereClause
- > ,groupbyClause
+ > groupbyClause
+ > --,havingClause
+ > --,windowClause
> --,querySpecification
- > --,queryExpressions
+ > --,querySpecifications
- > ,quantifiedComparisonPredicate
+ > --,setOperations
+ > --,withExpressions
+ > ],Group "predicates"
+ > [--comparisonPredicate
+ > --,betweenPredicate
+ > --,inPredicate
+ > --,likePredicate
+ > --,similarPredicae
+ > --,nullPredicate
+ > quantifiedComparisonPredicate
+ > --,existsPredicate
> ,uniquePredicate
+ > --,normalizedPredicate
> ,matchPredicate
+ > --,overlapsPredicate
+ > --,distinctPredicate
+ > --,memberPredicate
+ > --,submultisetPredicate
+ > --,setPredicate
> ,collateClause
> ,aggregateFunctions
> ,sortSpecificationList
> ]
+ > ]
= 5 Lexical Elements
@@ -1001,8 +1042,6 @@ create a list of type name variations:
> ,("blob(3M)", LobTypeName [Name "blob"] 3 (Just LobM) Nothing)
> ,("blob(4M characters) "
> ,LobTypeName [Name "blob"] 4 (Just LobM) (Just LobCharacters))
- > ,("blob(5 code_units) "
- > ,LobTypeName [Name "blob"] 5 Nothing (Just LobCodeUnits))
> ,("blob(6G octets) "
> ,LobTypeName [Name "blob"] 6 (Just LobG) (Just LobOctets))
> ,("national character large object(7K) "
@@ -1172,7 +1211,19 @@ This is used in row type names.
| USER
| VALUE
TODO: review how the special keywords are parsed and add tests for these
+ > someGeneralValues :: TestItem
+ > someGeneralValues = Group "some general values" $ map (uncurry TestValueExpr) $
+ > map mkIden ["CURRENT_DEFAULT_TRANSFORM_GROUP"
+ > ,"CURRENT_PATH"
+ > ,"CURRENT_ROLE"
+ > ,"CURRENT_USER"
+ > ,"SESSION_USER"
+ > ,"SYSTEM_USER"
+ > ,"USER"
+ > ,"VALUE"]
+ > where
+ > mkIden nm = (nm,Iden [Name nm])
<simple value specification> ::=
<literal>
@@ -1269,8 +1320,11 @@ already covered above in the identifiers and names section
<basic identifier chain>
| MODULE <period> <qualified identifier> <period> <column name>
- TODO: work out the exact syntax and add
+ > moduleColumnRef :: TestItem
+ > moduleColumnRef = Group "module column ref" $ map (uncurry TestValueExpr)
+ > [("MODULE.something.something", Iden [Name "MODULE"
+ > ,Name "something"
+ > ,Name "something"])]
== 6.8 <SQL parameter reference> (p190)
@@ -1304,7 +1358,19 @@ ORDER BY department, job, "Total Empl", "Average Sal";
TODO: de-oracle the syntax and add as test case
+ > groupingOperation :: TestItem
+ > groupingOperation = Group "grouping operation" $ map (uncurry TestQueryExpr)
+ > [("SELECT SalesQuota, SUM(SalesYTD) TotalSalesYTD,\n\
+ > \ GROUPING(SalesQuota) AS Grouping\n\
+ > \FROM Sales.SalesPerson\n\
+ > \GROUP BY ROLLUP(SalesQuota);"
+ > ,makeSelect
+ > {qeSelectList = [(Iden [Name "SalesQuota"],Nothing)
+ > ,(App [Name "SUM"] [Iden [Name "SalesYTD"]],Just (Name "TotalSalesYTD"))
+ > ,(App [Name "GROUPING"] [Iden [Name "SalesQuota"]],Just (Name "Grouping"))]
+ > ,qeFrom = [TRSimple [Name "Sales",Name "SalesPerson"]]
+ > ,qeGroupBy = [Rollup [SimpleGroup (Iden [Name "SalesQuota"])]]})
+ > ]
== 6.10 <window function> (p193)
@@ -1323,6 +1389,10 @@ TODO: de-oracle the syntax and add as test case
TODO: window functions
+ > windowFunctions :: TestItem
+ > windowFunctions = Group "window functions" $ map (uncurry TestValueExpr)
+ > [
+ > ]
== 6.11 <case expression> (p197)
@@ -1371,7 +1441,10 @@ TODO: window functions
TODO: case expressions plus the 'abbreviations'
+ > caseExpression :: TestItem
+ > caseExpression = Group "case expression" $ map (uncurry TestValueExpr)
+ > [
+ > ]
== 6.12 <cast specification> (p200)
@@ -1391,7 +1464,7 @@ This is already covered above
> nextValueExpression :: TestItem
> nextValueExpression = Group "next value expression" $ map (uncurry TestValueExpr)
- > [("next value for a.b", undefined)
+ > [("next value for a.b", NextValueFor [Name "a", Name "b"])
> ]
@@ -1569,13 +1642,16 @@ Specify a numeric value.
> numericValueExpression :: TestItem
> numericValueExpression = Group "numeric value expression" $ map (uncurry TestValueExpr)
- > [("a + b", undefined)
+ > [("a + b", binOp "+")
- > ,("a - b", undefined)
+ > ,("a - b", binOp "-")
- > ,("a * b", undefined)
+ > ,("a * b", binOp "*")
- > ,("a / b", undefined)
+ > ,("a / b", binOp "/")
- > ,("+a", undefined)
+ > ,("+a", prefOp "+")
- > ,("-a", undefined)
+ > ,("-a", prefOp "-")
> ]
+ > where
+ > binOp o = BinOp (Iden [Name "a"]) [Name o] (Iden [Name "b"])
+ > prefOp o = PrefixOp [Name o] (Iden [Name "a"])
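The `binOp`/`prefOp` helpers above keep the expected test values short; spelled out, they build ASTs like this (constructors simplified from the library's syntax module so the sketch stands alone):

```haskell
-- Local, simplified stand-ins for the library's AST constructors,
-- showing what the binOp/prefOp test helpers expand to.
newtype Name = Name String deriving (Eq, Show)

data ValueExpr = Iden [Name]
               | BinOp ValueExpr [Name] ValueExpr
               | PrefixOp [Name] ValueExpr
  deriving (Eq, Show)

-- "a <op> b" is expected to parse to a BinOp with Iden operands
binOp :: String -> ValueExpr
binOp o = BinOp (Iden [Name "a"]) [Name o] (Iden [Name "b"])

-- "<op>a" is expected to parse to a PrefixOp
prefOp :: String -> ValueExpr
prefOp o = PrefixOp [Name o] (Iden [Name "a"])
```

So for example `binOp "+"` is `BinOp (Iden [Name "a"]) [Name "+"] (Iden [Name "b"])`, the expected parse of `"a + b"`.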
== 6.27 <numeric value function> (p242)
@@ -1600,12 +1676,22 @@ Specify a function yielding a value of type numeric.
<string position expression>
| <blob position expression>
+ > numericValueFunction :: TestItem
+ > numericValueFunction = Group "numeric value function" $ map (uncurry TestValueExpr)
+ > [
<string position expression> ::=
POSITION <left paren> <string value expression> IN <string value expression> [ USING <char length units> ] <right paren>
<blob position expression> ::=
POSITION <left paren> <blob value expression> IN <blob value expression> <right paren>
+ > ("position (a in b)",undefined)
+ > ,("position (a in b using characters)",undefined)
+ > ,("position (a in b using octets)",undefined)
TODO: position expressions
<length expression> ::=
@@ -1663,6 +1749,9 @@ TODO: extract expression
TODO: lots more expressions above
+ > ]
== 6.28 <string value expression> (p251)
Specify a character string value or a binary string value.
@@ -3160,9 +3249,3 @@ TODO: review sort specifications
> qe = makeSelect
> {qeSelectList = [(Star,Nothing)]
> ,qeFrom = [TRSimple [Name "t"]]}
- TODO: what happened to the collation in order by?
- Answer: sort used to be a column reference with an optional
- collate. Since it is now a value expression, the collate doesn't need
- to be mentioned here.