Data Query Language Syntax Help
General Commands:
Command History: list [NumberOfCommandsToReturn] commands
Exit Program: exit
Reparse The Grammar: reparse grammar
Macro Commands:
DQL Macros Are A Series Of Commands That You Want Executed
Begin Macro: begin macro
Clear All Macros: clear all macros
Delete All Macros: delete all macros
Delete Macro: delete macro ["MacroName"]
Edit Macro Save File: edit macros [from Filename] in Notepad
End Macro: end macro [into "MacroName"]
List Current Macros: list macros
List Macro Commands: list macro "MacroName" commands
List Macro Files: list macro files
Load Macro File: load macros [from FileName]
Run The Macro: run macro ["MacroName"]
Save Macros To File: save macros [into FileName]
Database Object Queries:
List Table Names: list [avante | dw | datawarehouse] tables [into ResultSetName]
List Table Columns: list [avante | dw | datawarehouse] table "ITMMST" columns [into ResultSetName]
Database Queries:
Run Query: run [avante | dw | datawarehouse] query "LIST ITMMST PART.NBR DESCRIPTION" [into ResultSetName]
Result Set Commands:
Copy Results: copy result set SourceResultSetName to DestinationResultSetName
Delete Results: delete ResultSetName
Filter Results: filter ResultSetName by "FilterString1" [and "FilterString2" and ...]
List Result Sets: list result sets
Match Results: match ResultSetName on "MatchString1" [and "MatchString2" and ...]
Output To Excel: output ResultSetName to excel file "ExcelFileName"
Revert Results: revert ResultSetName
Show Results: show [NumberOfLinesToReturn] [ResultSetName]
I wanted the ability to run queries on both of our main enterprise databases and do simple filtering/matching on the results (called Result Sets in this system). I also wanted the ability to create a series of commands and save them in a "macro" that I could run later. I wanted to be able to save these macros to a file to be run later as well. In this way I could build up libraries of commonly used commands. Finally, I wanted to be able to call up lists of database objects I am always searching for (table names and columns).
I knew that the "easiest" way to achieve this would be to create a DSL that encapsulated all of these commands. You could do it by hand with a pile of regexes, but that would be no fun. Luckily, Ruby has many tools for building DSLs. I chose Treetop to create my new language. Treetop is a great tool; the only real negative to using it is the complete lack of good documentation. Hopefully this post will help someone avoid bashing their head into a wall like I did.
In Treetop you build your language using strings and regular expressions. Let's take an example from the command list above:
List Table Names: list [avante| dw | datawarehouse] tables [into ResultSetName]
I represented this in a rule that looks like this:
rule db_command_list_tables
  'list' space database_type space database_object space? optional_resultset:into_resultset?
  {
    def evaluation_code
      eval_code = ""
      my_result_set_name = 'default'
      # Write Code To Get The Table Names
      case database_type.text_value
      when 'avante'
        if database_object.text_value == 'tables'
          eval_code = "result_set = avante_db.get_table_names"
        end
      when 'dw','datawarehouse'
        if database_object.text_value == 'tables'
          eval_code = "result_set = datawarehouse_db.get_table_names"
        end
      end
      # Write Code To Handle Optional Result Set
      if defined?(optional_resultset.result_set_name)
        my_result_set_name = optional_resultset.result_set_name.text_value
      end
      # Add ResultSet To Result Set Dictionary
      eval_code = eval_code + "; result_set_dict.add_existing_result_set(result_set,'#{my_result_set_name}')"
      # Probably Want To List Result Set By Default
      if my_result_set_name == 'default'
        eval_code = eval_code + "; puts result_set.value; puts \"\\n\" + Rainbow(\"Results Stored In Result Set: default\").green"
      end
      return eval_code
    end
  }
end
rule database_type
  'avante' / 'dw' / 'datawarehouse'
end

rule database_object
  'table' [s]* / 'query' / 'data' / 'info'
end

rule into_resultset
  'into' space result_set_name
end

rule space
  [\s]+
end
The db_command_list_tables rule builds on the database_type, database_object, into_resultset and space rules to create a complete command. One important thing to note: when you make part of a rule optional (by appending a question mark to it), it can no longer be referenced directly by name. An example of this is the into_resultset? part of the db_command_list_tables rule, where we added a question mark to the end. Notice that I prepended "optional_resultset:" to into_resultset? That label is what lets me look up its value later:
# Write Code To Handle Optional Result Set
if defined?(optional_resultset.result_set_name)
  my_result_set_name = optional_resultset.result_set_name.text_value
end
It would make sense if you were able to use a construct like into_resultset.result_set_name.text_value to reference the result set name. This will fail. Instead you must prepend a label like optional_resultset: to into_resultset in order to look inside the into_resultset rule. Why? I have no idea, and it took me forever to figure this out. I reiterate: it is completely necessary to prepend an identifier to any referenced rule you make optional (with a question mark). This is nowhere (that I could find) in the "official" Treetop documentation, and it will trip you up.
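To isolate the gotcha, here is a minimal, hypothetical rule (the rule and label names are made up for illustration; the pattern mirrors the rules above). The label opt: is arbitrary; any identifier works, but without some label the optional subrule's contents are unreachable:

```treetop
rule show_tables
  'list tables' space? opt:into_resultset?
  {
    def evaluation_code
      # opt.result_set_name only works because of the "opt:" label;
      # into_resultset.result_set_name would fail here
      if defined?(opt.result_set_name)
        "result_set_name = '#{opt.result_set_name.text_value}'"
      else
        "result_set_name = 'default'"
      end
    end
  }
end
```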
The final item of interest is the:
{
  def evaluation_code
    ....
    ....
  end
}
construct inside the rule. This lets you use Ruby code to process the text from rules that match your input. How meta ;-). In effect you generate code that will be evaluated whenever a rule matches certain input. To use these rules you save them in a Treetop grammar file and load them into a parser. I used a parser class to accomplish this:
class DQL_Parser
  def initialize(grammar_path)
    @path_to_grammar = grammar_path
    Treetop.load(grammar_path)
    @parser = DQLParser.new
  end

  def reload_grammar()
    puts Rainbow("Reparsing Grammar").green
    Treetop.load(@path_to_grammar)
    @parser = DQLParser.new
  end

  def parse(command)
    tree = nil
    if command !~ /^\s*$/
      # Pass the data over to the parser instance
      tree = @parser.parse(command)
      # If the AST is nil then there was an error during parsing;
      # we need to report a simple error message to help the user
      if tree.nil?
        puts Rainbow("ERROR: Cannot parse \"#{command}\". Error is at offset: #{@parser.index}").red
        puts Rainbow((' ' * (@parser.index + 21)) + '^').red
        puts Rainbow((' ' * (@parser.index + 21)) + '|').red
      end
    end
    return tree
  end
end
$parser = DQL_Parser.new($base_path + '\Data_Query_Language.treetop')
.... some code to set up a REPL
parse_tree = $parser.parse(command)
if !parse_tree.nil?
  if parse_tree.evaluation_code !~ /^ERROR\:/i
    if parse_tree.evaluation_code != ''
      eval(parse_tree.evaluation_code)
      ....
These commands create a parser using all of the commands contained in the Data_Query_Language.treetop file. Then I parse the commands a user enters with $parser.parse. If the parse tree is not nil and the evaluation_code Ruby code returns something, we have a winner. We then take the evaluation code returned from the parser and evaluate it with Ruby's eval method. If we put this code in a loop, we have a REPL that we can use to run a series of DQL (Data Query Language) commands.
The key thing to remember is that the evaluation code is run in the context of the Ruby program that is creating the parser instance. By using the Ruby readline and rainbow gems you can create such convenience features as auto command completion and colored error/information output.
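A toy sketch of that context point (the command names and table data here are hypothetical, and a hand-rolled case statement stands in for the Treetop-generated evaluation_code):

```ruby
# Sketch of the parse-then-eval pattern: a command is translated into a string
# of Ruby code, and eval runs that string in the caller's scope, so the
# generated code can see locals like table_names.
def evaluation_code_for(command)
  case command
  when /\Alist tables\z/ then 'table_names.sort'
  when /\Ashow (\d+)\z/  then "table_names.first(#{$1.to_i})"
  else ''
  end
end

table_names = %w[ORDHDR ITMMST ORDDTL]
code = evaluation_code_for('list tables')
result = eval(code) # => ["ITMMST", "ORDDTL", "ORDHDR"]
```

Because eval uses the current binding by default, the generated code can reference whatever database and result-set objects the surrounding program has set up.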
The only things left for me to do were to create the classes that would enable me to interface with my databases and easily manipulate result sets that they returned. This was accomplished using the DBI, net/telnet, net/ftp and good old fashioned Ruby classes.
My database classes had the following structure:
class DBInfo
  def initialize
  end

  def log_in
  end

  def run_query
  end

  def get_table_names
  end

  def get_table_cols
  end

  def close
  end
end
My SQL Server and UniData classes inherit from DBInfo. Your classes would depend on the type of databases you were interfacing with.
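As a sketch, a subclass with this interface might look like the following (StubDB and its canned table data are hypothetical, with no real connection; the actual SQL Server and UniData classes use DBI, net/telnet and net/ftp):

```ruby
# Base class defining the database interface (as in the post).
class DBInfo
  def initialize; end
  def log_in; end
  def run_query(query); end
  def get_table_names; end
  def get_table_cols(table_name); end
  def close; end
end

# Hypothetical subclass returning canned data instead of hitting a server.
class StubDB < DBInfo
  TABLES = {
    'ITMMST' => %w[PART.NBR DESCRIPTION],
    'ORDHDR' => %w[ORDER.NBR CUST.NBR]
  }.freeze

  def get_table_names
    TABLES.keys
  end

  def get_table_cols(table_name)
    TABLES.fetch(table_name, [])
  end
end

db = StubDB.new
db.get_table_names # => ["ITMMST", "ORDHDR"]
```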
The more interesting classes are the result set classes (excuse the mess; I haven't had time to properly refactor these):
class ResultSet
  attr_reader :query, :query_type, :column_chars, :name

  def initialize(orig_result_temp_file, query, query_type, column_chars = '|', name = 'default')
    @orig_result_file = orig_result_temp_file
    @curr_result_file = @orig_result_file
    @query = query
    @query_type = query_type
    @column_chars = column_chars
    @name = name
  end

  def value
    # Read Results From (Possibly Filtered) Temporary Result Files
    if File.exist?(@curr_result_file.path)
      return File.read(@curr_result_file.path)
    else
      puts Rainbow("ERROR: Result File \"#{@curr_result_file.path}\" Does Not Exist").red
      return ""
    end
  end

  def original_value
    # Return Original Results (Before Filters And Matches)
    if File.exist?(@orig_result_file.path)
      return File.read(@orig_result_file.path)
    else
      puts Rainbow("ERROR: Result File \"#{@orig_result_file.path}\" Does Not Exist").red
      return ""
    end
  end

  def print_line_in_columns(lines_to_display = 0)
    line_counter = 1
    if File.exist?(@curr_result_file.path)
      # Even At One Byte Per Line It Will Never Have More Than This Many Lines
      if lines_to_display <= 0 then lines_to_display = File.size(@curr_result_file.path) end
      File.readlines(@curr_result_file.path).each do |line|
        break if line_counter > lines_to_display
        puts line.gsub(/#{@column_chars}/n, "\t")
        line_counter = line_counter + 1
      end
    else
      puts Rainbow("Results File \"#{@curr_result_file.path}\" Does Not Exist").red
      exit
    end
  end

  def write_to_excel(filename)
    # Write This Result Set To Excel (xlsx)
    excel_filename = "#{$base_path}/DQL/Excel/#{filename}.xlsx"
    p = Axlsx::Package.new
    p.use_autowidth = true
    wb = p.workbook
    wb.add_worksheet(:name => "Results") do |sheet|
      File.open(@curr_result_file.path, "r").each_line do |line|
        sheet.add_row(line.split(/#{@column_chars}/n))
      end
    end
    p.serialize(excel_filename)
  end

  def revert_to_original_value
    # Revert Result Set To Original Value (Before Filters And Matches)
    if @orig_result_file != @curr_result_file
      @curr_result_file.close
      @curr_result_file.unlink
    end
    @curr_result_file = @orig_result_file
    puts Rainbow("Result Set \"" + name + "\" Reverted To Original Data").green
    self
  end

  def filter_out(filter_query = "")
    # Filter Things Out Of The Result Set
    if @orig_result_file == @curr_result_file
      if not(File.exist?(@orig_result_file.path))
        puts Rainbow("ERROR: Result File \"#{@orig_result_file.path}\" Does Not Exist").red
        exit
      end
      source = @orig_result_file
      destination = Tempfile.new('OpenODBCECL_Dest')
    else
      if not(File.exist?(@curr_result_file.path))
        puts Rainbow("ERROR: Result File \"#{@curr_result_file.path}\" Does Not Exist").red
        exit
      end
      source = @curr_result_file
      destination = Tempfile.new('OpenODBCECL_Dest')
    end
    # Create New Result File Based On The Filter Query
    output = File.open(destination.path, "a")
    File.readlines(source.path).each do |line|
      if line !~ /#{filter_query}/i
        output << line
      end
    end
    output.close
    if @orig_result_file != @curr_result_file
      # Remove Intermediate Result File If It Is Not Equal To Original Results
      @curr_result_file.close
      @curr_result_file.unlink
    end
    # Set Current Result File To New Destination
    @curr_result_file = destination
    # Give Status
    puts Rainbow("Filter \"#{filter_query}\" Applied To Result Set \"" + name + "\"").green
    # Return self So You Can Chain Filters
    self
  end

  def matching(matching_query = "")
    # Return Items Matching matching_query
    if @orig_result_file == @curr_result_file
      if not(File.exist?(@orig_result_file.path))
        puts Rainbow("ERROR: Result File \"#{@orig_result_file.path}\" Does Not Exist").red
        exit
      end
      source = @orig_result_file
      destination = Tempfile.new('OpenODBCECL_Dest')
    else
      if not(File.exist?(@curr_result_file.path))
        puts Rainbow("ERROR: Result File \"#{@curr_result_file.path}\" Does Not Exist").red
        exit
      end
      source = @curr_result_file
      destination = Tempfile.new('OpenODBCECL_Dest')
    end
    # Create New Result File Based On The Matching Query
    output = File.open(destination.path, "a")
    File.readlines(source.path).each do |line|
      if line =~ /#{matching_query}/i
        output << line
      end
    end
    output.close
    if @orig_result_file != @curr_result_file
      # Remove Intermediate Result File If It Is Not Equal To Original Results
      @curr_result_file.close
      @curr_result_file.unlink
    end
    # Set Current Result File To New Destination
    @curr_result_file = destination
    # Give Status
    puts Rainbow("Match Query \"#{matching_query}\" Applied To Result Set \"" + name + "\"").green
    # Return self So You Can Chain Filters
    self
  end

  def clean_up_result_files
    # Clean Up Result Files
    if !@orig_result_file.nil?
      if File.exist?(@orig_result_file.path)
        @orig_result_file.close
        @orig_result_file.unlink
      end
    end
    # @curr_result_file.path.nil? Necessary If @orig_result_file = @curr_result_file
    if !@curr_result_file.nil? and !@curr_result_file.path.nil?
      if File.exist?(@curr_result_file.path)
        @curr_result_file.close
        @curr_result_file.unlink
      end
    end
  end

  def close
    clean_up_result_files
  end
end
class ResultSetDictionary
  def initialize
    @result_set_hash = Hash.new
  end

  def result_set_exists?(key)
    if @result_set_hash[key].nil?
      return false
    else
      return true
    end
  end

  def list_result_sets()
    puts "Existing Result Sets:"
    @result_set_hash.keys.each {|key| puts "  #{key}"}
  end

  def get_result_set(result_set_name)
    if result_set_exists?(result_set_name)
      return @result_set_hash[result_set_name]
    else
      puts "Result Set \"#{result_set_name}\" Does Not Exist"
    end
  end

  def delete_result_set(result_set_name)
    if result_set_exists?(result_set_name)
      get_result_set(result_set_name).close
      @result_set_hash.delete(result_set_name)
      puts Rainbow("Result Set \"#{result_set_name}\" Is Deleted.").green
    else
      puts Rainbow("ERROR: Result Set \"#{result_set_name}\" Does Not Exist.").red
      return
    end
  end

  def add_existing_result_set(existing_result_set, result_set_name = 'default')
    # Given A ResultSet Add It To My Dictionary
    if result_set_exists?(result_set_name)
      puts Rainbow("ERROR: Result Set #{result_set_name} For Database Type #{existing_result_set.query_type} Already Exists. Deleting Existing Key").red
      delete_result_set(result_set_name)
    end
    @result_set_hash.store(result_set_name, existing_result_set)
    return @result_set_hash[result_set_name]
  end

  def add_new_result_set(orig_result_file, result_set_name = 'default', database_type = '', column_chars = '|')
    # Given Proper Items Add A New Result Set To The Result Set Dictionary
    if result_set_exists?(result_set_name)
      puts Rainbow("ERROR: Result Set #{result_set_name} For Database Type #{database_type} Already Exists. Deleting Existing Key").color(255,102,0)
      delete_result_set(result_set_name)
    end
    @result_set_hash.store(result_set_name, ResultSet.new(orig_result_file, '', database_type, column_chars, result_set_name))
    return @result_set_hash[result_set_name]
  end

  def cleanup
    puts "Cleaning Up Result Sets"
    # Close All Result Sets
    @result_set_hash.keys.each {|key| puts "  Result Set \"#{key}\" Cleaned"; @result_set_hash[key].close}
    puts "Result Sets Cleaned"
  end
end
These two classes let me reference result sets by name in my evaluation_code Treetop definitions. Result sets are essentially temporary text files with my results in them (one record per line). Filtering and matching are done by reading these files one line at a time and dumping the appropriate lines into new temporary result set files. The temporary files are cleaned up on program exit or when a result set is deleted.
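The temp-file mechanics reduce to something like this sketch (simplified from the matching method above; the function name and sample data are illustrative):

```ruby
require 'tempfile'

# Simplified sketch of the matching step: copy only the lines that match the
# query (case-insensitively) from the source result file into a new Tempfile.
def match_lines(source, matching_query)
  destination = Tempfile.new('DQL_Dest')
  File.open(destination.path, 'a') do |output|
    File.readlines(source.path).each do |line|
      output << line if line =~ /#{matching_query}/i
    end
  end
  destination
end

source = Tempfile.new('DQL_Src')
source.write("ITMMST|Widget\nORDHDR|Order Header\nITMLOC|Widget Bin\n")
source.flush
filtered = match_lines(source, 'widget')
File.read(filtered.path).lines.length # => 2
```

Because the filtered output lands in a fresh Tempfile while the original file is untouched, reverting a result set is just a matter of pointing back at the original file, as revert_to_original_value does above.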
This is the process I used to create my DQL metaprogramming language. I was able to create the whole language with 634 lines of Treetop definitions and a 734-line program that contains all of the database, result set and REPL logic.
Overall I am impressed with Treetop's capabilities. More documentation would be nice but answers can be found on the Internet (with a lot of effort). I'm hoping this will become another resource for people curious about this outstanding library.