Architect Shack

Navigation



Quick Search
»
Advanced Search »

PoweredBy

Page History: Poor Man's T-SQL Formatter

Compare Page Revisions



« Older Revision - Back to Page History - Newer Revision »


Page Revision: 2011-04-28 04:35


This is a homebrew SQL (T-SQL) formatter, mainly created because I could not find any free and open-source T-SQL formatters that handle T-SQL exactly the way I want, including full stored procedure formatting.

It's written in C#, and designed to work in three phases, each using a pluggable class:
  • Tokenization - text-level parsing, identifying operators vs keywords vs whitespace vs comments, etc
  • Parsing - keyword-level parsing, identifying hierarchical relationships between components of the T-SQL code, statements, clauses, blocks, flow control, etc.
    • Please note, this is not a fully-fledged SQL parser. It does not distinguish between different types of DML, it does not parse full expression trees, etc - there's a lot it doesn't do, it just does the bare minimum to enable the formatting I was aiming for.
  • Formatting - turning the SQL parse tree back into T-SQL code, with customizable formatting preferences.

The library (and winforms demo and test projects) is released under the Affero GPL, and will continue to be subject to significant changes as I add the major features that I see are missing (keyword capitalization, less-common-keyword-handling, etc). I'll update this again when it reaches a more stable state.

The code can be found on GitHub, I'd be very grateful for any feedback: https://github.com/TaoK/PoorMansTSqlFormatter

Demo

Paste some T-SQL here:

And it will be formatted here:

Please note (1): This demo uses some default default values for formatting options supported by the library. For a more complete customizable demo, please download the windows forms app (sorry, I haven't compiled and made that available here yet - coming soon; in the meantime, you can download the source from github).

Please note (2): This demo does send the SQL that you input to the webserver, because the library is implemented as C# and I didn't feel like messing around with silverlight. I am not doing any logging of request content, however, and as far as I know my hosting provider isn't either - so your SQL code should be as reasonably private as anything you post in an online (non-SSL) form. You can see the formatting web service code on github, and you can see the submission code in the source of this page, it's just some simple jQuery code.

Please note (3): this library is released under the GNU Affero GPL v3, so third parties can feel free to host it as well, if they make available the source of the version that they are hosting (see the detailed terms of the AGPL, eg on wikipedia).

But Why??

You may wonder why create yet another SQL formatting library... I've asked myself that a few times as I've been working on this. One answer is that it is fun, and I think I can do a pretty good job, at least to suit my requirements. Another reason is that when I have searched for this in the past, I've never found a tool that did exactly what I wanted to. Finally, I hadn't found any open-source library that I thought I could help improve, rather than starting my own.

That said, I do plan to review all (or as many as possible) of the tools that exist out there, and decide/confirm whether this has been a waste of my time. To stack the cards in my favour, I only plan to do this once I consider my project V1 feature-complete, however (I'm still a little ways off).

Here are the tools I've encountered so far:
Here are features that I would be looking for / comparing:
  • Open-source / libre-ness
  • Free-ness (as in beer)
  • Installable-ness (avoiding online-only / web service solutions)
  • Accessibility / easy testing (looking for online / form-based solutions)
  • Batchable-ness / command-line execution (for mass-reformatting)
  • SSMS plugin availability
  • Cross-platform support (if not online-only of course)
  • Access to a parse tree (and completeness/detail of the parse tree)
  • Formatting according to my personal preferences
  • Ability to nicely format complete T-SQL scripts (eg stored procedures), not just individual queries
  • Input and output filters (to take and spit out code, eg VBScript, for ad-hoc SQL queries)
  • Support for multiple SQL Dialects
  • Support for different SQL formatting preferences
  • Recent Activity / Updates

Your Formatting preferences? Who are you?

The only reason my personal preferences are important here, is that I'm the one writing this (and, as far as I can tell, the only one using it too :) ).

What I can do, however, is explain the rationale behind my preferences, in the hope that I might sway someone's opinion:
  • I find that too much whitespace in the body/joins distracts from the structure of the query, especially if you end up having to scroll to find parts of the query. This is why I'm not fond of indenting the "ON" portion of a join clause
  • I find that logical expressions are easiest to read (and write) when each portion is on its own line. This applies regardless of whether it's in a join clause's "ON" conditions, or the "WHERE" clause
  • I find it very important to be able to write SQL (in a variety of editors) in the target format, without needing something to clean up after you. This is the main reason I don't like "Aligned" SQL formatting. I understand it can be more legible, but it's too much of a pain to maintain (esp. in embedded/ad-hoc SQL queries). It also breaks when you use tabs (in different editors with different tab stops), and I like tabs! They're easy-to-navigate whitespace!

© 2007-2012 Tao Klerks | Home | Contact Me