The SQL Shell

Reads: 31  | Likes: 0  | Shelves: 0  | Comments: 0

More Details
Status: Finished  |  Genre: Non-Fiction  |  House: Booksie Classic
Jack Dikian developed the Structured query language shell in order to provide greater efficiency in use as well as provide a platform supporting a natural language interface

Submitted: August 25, 2015

A A A | A A A

Submitted: August 25, 2015

A A A

A A A


ABSTRACT

The SQL+sh is an interactive front-end to Unify’s Structured Query Language (SQL). It’s main purpose is to add Csh/tenex like functionality to a vanilla query interpreter in the way of SQL. A query history stack, ability to recall and edit previous queries as well as an interactive RECORD and FIELD name recognition and completion mechanism are a sample of the sort of enhancements SQL+sh supports. This paper presents a brief background to SQL before discussing some of the features we included to this package.

Working in an environment where a significant portion of a programmers time is spent writing and maintaining applications software around the Unify relational database; any facility that simplifies database interactions must be an advantage. This database is quickly approaching the 2 G-byte mark with over 300 Mb of supporting software. Like other large database users, the overhead of database related maintenance is a significant consideration. Improvements in database related utilities greatly increase productivity as well as reliability.

One of the most powerful facilities available to the maintenance programmer in our environment is Unify’s Structured Query Language SQL. This utility is often used to interrogate as well as patch the underlying database. Ad hoc SQL queries are often generated to confirm the correctness of application modules as well as serving the more simple day to day user information requirements.

A Quick Look At SQL

SQLis an English keyword orientated query language of great flexibility. It is a language that is easy enough for non-programmers to learn, yet has enough power for data processing professionals. Chamberlin and others originally defined this product at the IBM Research Laboratory in San Jose, California, under the name SystemR.

A family of IBM products based on the System R technology was developed. These products are now generally available and are known as DB2, SQL/DS and QMF [1]. A number of other vendors have also produced systems that support SQL. SQL’s data manipulation statements typically operate on entire sets of records. For example, the select and update clauses can retrieve and modify a set of values and tables. SQL, like all relational data manipulation languages is a set-level language. For this reason, SQL is often described as a nonprocedural language.

The user specifies "what" data they want and not so much "how" to get it. It is up to SQL to decide on how best to execute any particular query. It needs for example to consider which tables are being referenced in any request; the size of the tables; what indexes exits; how selective those indexes are and of course, the form of the where clause. SQL queries consist of clauses, each of which is preceded by a keyword. Examples of keywords include; select, update, delete and insert. In fact, the previous four keywords all belong to that part of SQL, which is commonly referred to as the DMLor Data Manipulation Language. Other optional keywords are used to control, format and operate on the various queries. Some simple examples of queries are given below:-

> select Name, Phone

> from PERSONS?

> where Age > 30/

The above example illustrates the selecting or retrieving of the specified fields Name and Phone from a specified table PERSON where some specified condition is true. It is important to note that the result of the query is another table.

  • select PERSON.*, COMPANY.*?
  • from PERSON, COMPANY?
  • where PERSON.PName = COMPANY.CName/

This example demonstrates the retrieving of data from two tables namely PERSON and COMPANY. We are interested in all instances of the field PName in PERSON matching the field CName in the table COMPANY. This is commonly referred to as "Joining" two or more tables. The availability of the join operation is, almost more than anything else that distinguishes relational from non-relational systems

> update COMPANY?

set CName = ’ACME’

where Type = ’ACTIVE’/?

This example sets the field CName to "ACME" for all COMPANY records that satisfy the condition after the where clause.

The SQL+sh

Our main database currently supports over a 100 tables and close to a 1000 fields. Using SQL to interrogate and manipulate data in this environment almost always requires the programmer to first browse through the Database schema listing. This is not only due to the large number of different tables and fields but is also due to UNIFY’s record and field naming conventions. The maximum length of a record name is eight characters. It is therefore impossible to create two records with the names "PROGRAMMER" and "PROGRAMME". A compromise may lead to the names "PROGMR" and "PROGME" etc. It is easy to see why the schema listing may be required in such cases.

Creating tables in Unify requires the user to nominate both a short and a long field name. Short field names must begin with a letter and can be up to eight characters long. The long field names begin with a letter and can be up to sixteen characters long. It is the long name that SQL requires for carrying out queries. The schema is used to determine or look up this long name. The schema is also used to determine relationships between tables and their corresponding fields.

Editing large queries are handled by – SQL writing the last query in/tmp. The edit facility invokes a standard editor such as vi with the last query loaded in the editor buffer. The user modifies and saves the changes before using the restart clause to re-execute the query. Although this facility is useful, it is however often tedious. This is especially true when a simple typo needs to be repaired. Because only the last query is effectively saved, access to previous queries is lost unless the user explicitly saves the editor buffer to a nominated file.

Interestingly, we required in SQL a similar transformation in functionality as that provided by say csh and tcsh over the bourne shell. Where tcsh provides file name recognition and completion, we required record and field name recognition and completion. Where csh provides a history and edit facility for commands, we required, a history and edit mechanism for queries. In implementing some of the ideas found in csh and tcsh, we were able to address both the above mentioned short comings as well provide a much more effective user interface.

Not having access to SQL source, the only other alternative in implementing the above changes was to write our own parser sitting on top of SQL.This would simply read the input stream, decide if it needs to do anything special like manipulate the history stack, carry through edit commands, expand alias’ etc and then write to SQL via a pipe. The output of SQL is not altered.

SQL+sh reads a schema description file on startup. The systems administrator typically generates this file by running a specially written shell script. The description file describes the database tables, there respective fields and other information such as field type and length. The shell script usesSQL to dump the relevant table, field types and names.

On startup, SQL+sh looks at the environment variable DBPATH and displays the name and address of the working database. After this point,SQL+sh enters a forever loop waiting for queries, internal commands and or the quit or end clause. A new prompt including the event number is displayed. An environment variable defines the maximum history size. An internal command has been added called " Mod On/Off" which enables and disables the availability of non-passive SQL clauses. For example, after entering the command " Mod Off", such clauses as delete, update, insert etc are disabled or ignored. This is useful in cases where support staff use SQL to answer quick telephone queries and should not update the database inadvertently or otherwise.

Unlike Unix commands, which are newline terminated, SQL queries often span over many lines. In fact, users of?SQL are encouraged to use good formatting procedures when making SQL queries. This is in part due to the fact?that quite complex SQL scripts can be written and saved for regular use. These scripts are also used to feed data?to Unify’s report generator RPT. The "/" character is used to indicate the end of a query.

For this reason, SQL+sh supports a modified history substitution command in the way of "!event+". This tells SQL+sh to re- execute the query beginning with the event number "event" and continue to re-execute events forward in the stack until a "/" character is encountered. All other normal history substitution commands such as "!!", "!- number", "!number" as well as "!pattern" etc have been implemented. Where a query spans many lines, SQL+sh collects together the individual clauses to echo a single event in its history stack.

Editing previous queries are handled two ways. The standard SQL procedure is to invoke the system editor with the last query loaded into the editor buffer. The edit clause facilitates this procedure. This method is still available and is usually used for editing large query texts. This method allows only the last query to be modified and re-executed. SQL+sh introduces the csh like "!event s/patternl/patternl" and ^patternl^pattern2^ mechanisms. These are extremely convenient for repairing typos and or for substituting record or field names while leaving the general structure of the query untouched.

One of the most useful additions to SQL was the introduction of record and field name recognition and completion. The idea here was to provide a convenient way to avoid having to look up the record and field names before generating queries. Automatically displaying field types and length was considered useful.  Other considerations included providing a means by which keystrokes could be reduced and accurately associating relevant field names to their correct parent tables. This mechanism is used in conjunction with the database schema description file. It is no longer necessary to type a complete record or field name. Only a unique abbreviation is necessary. Typing the ESCAPE key after the abbreviation will complete the record or field name, echoing the full name. Unlike tcsh, where there is really only one type of file name completion, SQL+sh needs to consider context and determine whether a record, or field name is being sought.

This is achieved by adding some of the SQL syntax rules into SQL+sh. For example the following grammar extracts define the syntax for the insert and select clauses:-

insert into RECORD [(FIELD .... )]: from filenamell select/

select ["unique"] I * I RECORD.* I RECORD.FIELD I FIELD ....?I * I RECORD.* I RECORD.FIELD I FIELD ....

from I RECORD [label] I ....?where ["not"] I FIELD I RECORD.FIELD I constant

SQL+sh tries to carry out a search of either the appropriate record or field based on the position the ESCAPE key was pressed in the input stream. It is obvious from the above two syntax examples that it is not often possible to determine whether a RECORD or a FIELD needs expanding. In the select clause for example, it is possible to say " select record.field from ..." or " select field from...". Hitting the ESCAPE key just after the select token leaves SQL+sh with a choice of searching for appropriate records or fields. In fact, in this particular example, the system will first search through the record list and then the field list. In general, as each word is read, SQL+sh updates a flag indicating whether it is in a "RECORD" or "FIELD" state.

This flag is initially set to a "NULL" state thus causing a bell to sound when the ESCAPE key is pressed. A "BOTH" state causes SQL+sh to search records and then fields. This state is established by tracking entered words against various syntax rules defined in SQL+sh. We have also provided a means of commenting query text. Text found enclosed within the "{" and "}" braces are ignored. This facility was implemented in order to allow a clean method of displaying field types and length in-line. On Hitting ESCAPE in a "FIELD" state, the system will not only display a candidate field name but also place the relevant field type and length already commented.

Besides providing a recognition and completion mechanism, SQL+sh also provides a facility where fields belonging to a particular record can be scanned. For example, after having typed in the sub-clause select * from PERSON where " it is possible to Hit Ctrl-f to echo the first field belonging to the PERSON record. Hitting Ctrl-f again will replace the first displayed field name with the next field. When the list of fields is exhausted, the process is repeated. This allows the user to carry out a query on a record even when they had no idea of the field names associated with the given record. The field type and length is once again displayed in comments.

select * from PE" select * from PERSON.

The cursor sits at the next column position waiting for the rest of the query.

select * from PERSON where

results in select * from PERSON where PName {STRING 12}

Hitting again results in

select * from PERSON where Paddress {STRING 45}?The user can now enter the rest of the query

select * from PERSON where Paddress {STRING 45} = ’Bag End*’ and Hitting here results again

PAge {NUMERIC 3 }

PAge {NUMERIC 3} <= 111/

Often there is the need to carry out repetitive queries involving tests against large text constants such as "0 0811234678905 0" and "Speak Friend And Enter". An ability to implement a concept of macros was also considered a useful enhancement. The same query is often re-executed many times over in the event of a Database maintenance session. One or more parameters in the query may however vary. An ability to expand VMS like "Logical Variables" was added to SQL+sh.

The same variable setting and expansion mechanism is used to set and unset simple and complex variables. There are no inherent differences between variable substitution and macro processing. The difference is operational. SQL+sh maintains a set of variables each of which has as a value a list of zero or more words. Each word in this list could be a simple constant or another variable. This value may be displayed and changed by using the internal commands show and clear. After the input line is parsed, and before each query is executed, variable substitution is performed. Variables are keyed by ’$’ character. The expansion can be prevented by preceding the ‘$’ with a ’V except within ’"s.

A Macro with a single argument can be seen as a variable containing another variable in its assignment string. The second variable has to be resolved before the macro can be executed. Newline characters found in the assignment list are ignored. Looping is prevented by checking that the same variable does not appear in the assignment list of that variable. Examples of variables follow:-

[1] $new_name = "Bilbo Baggins"

[2] $my_update = " update PERSON

[3] set PName = Snew_name

[4] where PName = ’ *’/"

[5] Smy_update

We have been using this utility on a trial basis for the last few weeks. In general, the added convenience of query recall and edit far exceeds the cost of the extra overhead (minimal anyway). The ability to echo the field length and type results in much less references made to the schema listing. Record and field name completion means less typos in general. Although much of this utility came about after a "wouldn’t it be nice..." chat and a couple of very long editing sessions (a few man days), the final thing has proved to be very useful in our environment.


© Copyright 2018 Jack Dikian. All rights reserved.

Add Your Comments:

More Non-Fiction Articles