A while ago I was building a specialized search engine of sorts, and I wanted to use Google's nice query syntax with my own custom modifiers in place of things like inurl:, intitle:, and site:. Besides these powerful modifiers, Google's search syntax is nice because no query is invalid. Yahoo, MSN, and Amazon also support at least more than basic search expressions.

Sometimes I wish other sites (like reddit) would implement this syntax for their search queries. So tomorrow I'll release a little Python module that parses this query syntax and makes the query easy to read and process. I wrote code that did this successfully a year or two ago, but I'd like to rewrite it now with pyparsing or something. After that, there will be no excuse to offer only lame search queries.

To get the ball rolling, here's the BNF for the syntax I will implement. I have no idea if this is even a proper way to express BNF, but I used the Python grammar as a reference (because if it's good enough for Guido, it's good enough for me).

L ::= expr | expr L
expr ::= term | binary_expr
binary_expr ::= term " " binary_op " " term
binary_op ::= "*" | "OR" | "AND"
include_bool ::= "+" | "-"
term ::= ([include_bool] [modifier ":"] (literal | range)) | ("~" literal)
modifier ::= (letter | "_")+
literal ::= word | quoted_words
quoted_words ::= '"' word (" " word)* '"'
word ::= (letter | digit | "_")+
number ::= digit+
range ::= number (".." | "...") number
letter ::= "A"..."Z" | "a"..."z"
digit ::= "0"..."9"
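
To make the grammar a bit more concrete, here is a rough sketch of how it could be turned into a parser. This is not the module promised above: it uses the standard re module rather than pyparsing, and the names parse_term and parse_query, the dictionary keys, and the fallback for tokens that don't fit the grammar are all assumptions made for illustration. Expressions are assumed to be separated by whitespace, which the grammar for L leaves unstated.

```python
import re

BINARY_OPS = {"*", "OR", "AND"}

# Split on whitespace, but keep a quoted phrase (plus any prefix such as
# `-` or `intitle:`) together as one token.
TOKEN_RE = re.compile(r'[^\s"]*"[^"]*"|\S+')

# One regex for the `term` rule of the grammar.
TERM_RE = re.compile(
    r"""
    (?: (?P<synonym>~)                        # "~" literal
      | (?P<include>[+-])?                    # [include_bool]
        (?: (?P<modifier>[A-Za-z_]+) : )?     # [modifier ":"]
    )
    (?: (?P<low>\d+) \.{2,3} (?P<high>\d+)    # range
      | "(?P<phrase>[A-Za-z0-9_]+(?:\ [A-Za-z0-9_]+)*)"   # quoted_words
      | (?P<word>[A-Za-z0-9_]+)               # word
    )
    """,
    re.VERBOSE,
)


def parse_term(token):
    """Parse one token into a dict describing a term."""
    m = TERM_RE.fullmatch(token)
    if m is None or (m["synonym"] and m["low"] is not None):
        # Assumption: because no query is invalid, anything that does not
        # fit the grammar degrades to a plain literal instead of an error.
        return {"literal": token}
    term = {}
    if m["synonym"]:
        term["synonym"] = True
    if m["include"]:
        term["include"] = m["include"]
    if m["modifier"]:
        term["modifier"] = m["modifier"]
    if m["low"] is not None:
        term["range"] = (int(m["low"]), int(m["high"]))
    else:
        term["literal"] = m["phrase"] or m["word"]
    return term


def parse_query(query):
    """Parse a whole query (the `L` rule) into a list of expressions."""
    tokens = TOKEN_RE.findall(query)
    exprs = []
    i = 0
    while i < len(tokens):
        left = parse_term(tokens[i])
        # binary_expr ::= term " " binary_op " " term (not chained)
        if i + 2 < len(tokens) and tokens[i + 1] in BINARY_OPS:
            right = parse_term(tokens[i + 2])
            exprs.append({"op": tokens[i + 1], "left": left, "right": right})
            i += 3
        else:
            exprs.append(left)
            i += 1
    return exprs
```

With this sketch, parse_query('intitle:"query syntax" -reddit year:2005..2007 ~python cats OR dogs') yields five expressions: a modified phrase, an excluded word, a modified range, a synonym term, and one OR expression. Because the grammar defines binary_expr over exactly two terms, a query like "a OR b OR c" is read as one OR expression followed by two plain words.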

Contrary to what many people believe, you can NOT use parentheses for grouping or precedence in Google search queries. Every punctuation character except for [+-_".] is converted to a space and becomes meaningless.
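
The punctuation rule can be sketched as a small preprocessing step. This takes the sentence above literally, and the function name strip_punctuation is made up for illustration. Note that a literal reading also drops ":", "~" and "*", which the grammar uses, so a real preprocessor would presumably need to add those to the set of kept characters.

```python
import re

# Everything that is not a letter, digit, underscore, whitespace, or one
# of + - " . is treated as meaningless punctuation.
_DROP_RE = re.compile(r'[^\w\s+\-".]')


def strip_punctuation(query):
    """Replace meaningless punctuation with spaces and collapse whitespace."""
    return " ".join(_DROP_RE.sub(" ", query).split())
```

For example, strip_punctuation('(cats OR dogs) AND "pet food"!') gives 'cats OR dogs AND "pet food"', so the parentheses have no effect on how the query is read.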