Page History: Poor Man's T-SQL Formatter
Compare Page Revisions
Page Revision: 2012-09-03 04:27
Overview
This is a free and open-source SQL (T-SQL) formatter:
- Handles complete multi-batch scripts, including object definition scripts such as stored procedures, triggers, etc.
- Provides formatting options to cater to different common formatting styles/standards
- Optionally outputs "colorized" html code rather than just the formatted SQL
- Also provides "minifier" option to strip out comments and whitespace, to obfuscate rather than pretty-print your code
- Available ready-to-use in a variety of forms
- SSMS (SQL Server Management Studio) Addin that allows you to format the current file or selected text with a single hotkey - supports any version of SSMS or SSMS Express.
- Notepad++ plugin, for quick single-key formatting in your favorite general-purpose text editor.
- Command-line utility that lets you format any number of files in bulk, or format from some other arbitrary program
- Winforms app for easy offline formatting (also lets you look at the token stream and parse tree)
- WinMerge plugin, for automatically formatting SQL files before comparison, allowing WinMerge to display content changes only, ignoring formatting differences.
- Web Service that exposes the same functionality across http, used in the demo/online formatting site http://poorsql.com
- Also available as a .Net 2.0/3.5 library
- Written in C# using a pluggable design that should allow for other SQL dialects to be supported in future
This formatter does implement a high-level SQL tokenizer and parser, but the granularity of the parser is not very high. It does not distinguish between different types of DML, it does not parse full expression trees, etc - there's a lot it doesn't do, it just does the bare minimum to support the target formatting.
The library is largely complete, but undocumented. For the list of known issues, please see the
issue tracker on github. As there is only as much parsing detail as necessary to support the desired formatting, the parse tree format is still not finalized; many formatting enhancements require new elements in the parse tree.
The code is hosted on GitHub at
https://github.com/TaoK/PoorMansTSqlFormatter; I'm always very grateful for any feedback on the functionality, the code or any other aspect of the project!
Download
Ready-to-run:
- Command-line bulk formatting utility (simple exe): SqlFormatter.1.4.1.zip (1289 downloads for this version, about 1000 for previous versions)
- Notepad++ plugin: It's recommended to use the Notepad++ plugin manager if possible (Notepad++ is an amazing free GPL-licensed text editor, available at notepad-plus-plus.org; it has a very rich plugin community available through the "Plugin Manager" plugin that is installed by default).
- If you really want to install the plugin manually, you can get it from here: SqlFormatterNppPlugin.1.4.1.zip (1186 downloads for the latest version, about 24000 for previous versions) and unzip the entire contents of the ZIP file (1 DLL + 1 folder containing other DLLs and stuff) into your Notepad++ install folder's "plugins" subfolder, eg "C:\Program Files\Notepad++\plugins". If you get an error about the plugin not supporting unicode, then the wrong DLL is in the plugins folder, simply delete the incorrect DLLs and start again (please contact me for help if you have any issues with this).
Note: as of 2012-09-02, the plugin manager has not yet been updated to reference the latest version, so to try the new features and bugfixes right now you need to download and install the plugin manually (see above) - or you can just wait a few days/weeks until the plugin manager is updated.
Compiled Library to use in other GPL/AGPL projects or internal non-distributed projects:
Source & Web Service:
Latest Changes
2012-09-02, version 1.4.1
- Github Issue #54: [New Feature] Support for DB2/SQLLite string concatenation operator [thanks to numerous people for suggesting this, sorry about the delay]
- Github Issue #48: [Bugfix] Correction to SQL 2012 add-in install folder
- Github Issue #52: [Bugfix] Fix to space-indent saving in formatting plugin options (SSMS, Notepad++) [thanks to numerous people for reporting this, sorry about the delay]
- Bugfix Fix examples in commandline help to correspond to existing options thanks John Landmesser for the bug report
- Github Issue #74: [Bugfix] Correct parsing & formatting of subqueries & function calls in variable initializers
- Github Issue #73: [Bugfix] Correct parsing & formatting of OPTION clauses
- Github Issue #55: [Bugfix] Correct parsing & formatting of OUTPUT specifiers on proc arguments defined without parens
2012-03-12, version 1.3.1
- Github Issue 23: [New Feature] Support for SSMS 2012 / Denali
- Github Issue 34: [New Feature] Obfuscating formatter / minifier in Winforms app and Web Service
- Github Issue 36: [New Feature] Support for "[noformat][/noformat]" and "[minify][/minify]" block-formatting instructions in standard formatter [thanks dquille for the request]
- Github Issue 37: [Bugfix] SSMS Hotkey binding failed in non-english versions of SSMS [thanks Philipp Schornberg for the bug report]
- Github Issue 38: [Bugfix] Comment-positioning bugs, one of which could result in invalid output SQL! [thanks andywuest for the bug report]
- Github Issue 39: [Bugfix] Formatting Library not usable in .Net 3.5 and later projects, because of Linqbridge conflict on Linq namespace [thanks Sean Bornstein for the bug report]
- Github Issue 40: [Bugfix] Error parsing WITH options containing "ON" in parentheses [thanks Jean-Luc Mellet for the bug report]
2012-01-29, version 1.2.1:
- Github Issue 32: [Feature Request] Enhanced Cmdline Utility to support pipeline input (and output) [thanks William Lin for the request]
- but default behaviour unchanged, still expect to act on files directly unless piped input provided
- expected encoding is UTF-8... in powershell, for example, run "$OutputEncoding = [System.Text.Encoding]::UTF8" first.
- Enhanced Cmdline Utility "no files found" message to include extensions detail
- Github Issue 26: Testing enhancements:
- Changed testing framework to Nunit, included dependencies for building and testing in any environment
- Simplified testing code, with test sources
- Added tests for different formatting options / flags, greater coverage
- GitHub Issue 31: Completed Notepad++ Plugin, with formatting options - available trough Notepad++ plugin manager.
2012-01-21, version 1.1.1:
- Translated programs, ssms plugin + website into French and Spanish
- thanks to Threeplicate for Amanuens, a free (for open-source projects) localization tool that makes managing translations much easier!
- and thanks to my wife for proofreading... technical translations are always a mess, but at least I got a second pair of eyes on it!
- Initial Beta of Notepad++ plugin [thanks UFO and Robert Giesecke for the .Net managed plugin template!]
- See http://sourceforge.net/projects/notepad-plus/forums/forum/482781/topic/4911359 to get the plugin beta
- Corrected "Request Validation" issue on poorsql.com, caused by over-zealous web service validation in ASP.Net 4.0
- Github Issue 22: [Bugfix] Indenting of "AS" clause on 2nd or later CTE in a query
- GitHub Issue 21: [Feature Request] Added optional ignore of errors in CmdLine Formatter [thanks Jörg Burdorf for the request]
- GitHub Issue 18: [Bugfix] Corrected positioning of comments (after linebreak) at end of statements [thanks gvarol for the bug report]
- GitHub Issue 19: [Feature Request] Added optional indenting of join ON sections [thanks Pushpendra Rishi for the request]
- GitHub Issue 24: [Bugfix] HTMLEncoding missing in Web Service (eg for poorsql.com) and Winforms demo app [thanks Gokhan Varol for the bug report]
- GitHub Issue 25: [Feature Request] Settings persistence and optional display simplification in Winforms demo app [thanks Gokhan Varol for the request]
- GitHub Issue 33: [Bugfix] Culture-specific uppercasing bug (Turkish) [thanks Recep Guzel]
- GitHub Issue 35: [Enhancement] WITH clause breaking [thanks Lane Duncan for the suggestion]
- Softcoded parsing error message in library (for translation + optional removal of warning)
2011-08-25, version 1.0.1:
- Added MERGE clause and statement support
- Added OUTPUT and OUTPUT ... INTO ... clause support
- Corrected INSERT ... EXEC parsing/formatting
- Corrected VALUES ... indenting
- Corrected handling of multiple CTEs in a single statement
- Corrected handling of AS in stored procedure argument data type specification
- Added handling of ISO data type synonyms, with keyword consistency correction to Sql Server datatypes
- Added missing SQL Server 2008 datatypes, Date/Time-related, HierarchyID, and Geography
- Corrected minor indenting bug with single-word clauses at the end of multi-statement container content
- Added Expected SQL Parse Tree and Standard Output Sql tests in test suite - parse Xml format and output SQL format effectively finalized.
Full Change Log:
Known / Open Issues:
License / Redistribution
The library and accompanying programs are released under the Affero GPL. This means that you are allowed to make any changes you want, there are restrictions/requirements attached only if you choose to redistribute the original or modified code: If you choose to redistribute/expose the functionality, then you must make the source available to the recipients/users (even of a web service). For private usage, you can make any changes you want and never need to tell anyone; if you distribute the app/functionality or a service that relies on it, then you must also distribute the source. For more details, please see the
AGPL article on wikipedia or at the source, the
FSF (free software foundation).
At least for the moment (as I'm the only one who has contributed to the project so far), the project could be dual-licensed if you really, really wanted to use it in a commercial program. If so please contact me, I'm sure we could work something out.
Tips & Tricks
- If you don't want a certain block of code to be reformatted, then just put a comment containing [noformat] before and [/noformat] after, like this:
SELECT 1,
2,
3
-- [noformat] don't want any formatting here, I like it with spaces before AND after the commas here.
SELECT 4 , 5 , 6 , 7 , 8 , 9
--[/noformat]
SELECT 1,
2,
3
- If you want to minimize the space taken by a chunk of SQL, and don't care about legibility, you can minify it with a comment containing [minify] before and [/minify] after, like this:
SELECT 1,
2,
3
--[minify]
SELECT 4AS[Four],5AS Five,6,7,8,9
--[/minify]
SELECT 1,
2,
3
- Loren Halvorson pointed out you can use the command-line formatter to pre-format SQL files in Beyond Compare, with a command like this:
SqlFormatter –o "%t" "%s"
- Paul Kohler shared a registry change he uses to format files from the Windows Explorer context menu (save as something.reg and run it):
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\Reformat SQL]
[HKEY_CLASSES_ROOT\*\shell\Reformat SQL\command]
@="C:\\Tools\\SqlFormatter.1.2.1\\SqlFormatter.exe \"%1\" /is:\" \" /tc+"
- If you're the author of a SQL-related tool and you want to make use of the formatting library, but you don't want to release your project under the AGPL license, then you can always use the command-line formatter as an optional external tool. If you then redistribute the command-line tool you need to publish/provide the source of the formatter tool, but you don't need to relicense your project to match. Even simpler, you can suggest that users download the formatter separately to enable formatting functionality.
Feedback & Known Issues
Please email me at
,
or simply create issues in the
GitHub issue tracker.
Instant Demo
The demo has been moved to poorsql.com - check it out and contact me with any issues!
Contributing
So far I haven't received any contributions from anyone else, but contributing is easy with GitHub! Just fork the project, make your changes, and create a Pull Request for me to take a look and I will (if I agree with the changes of course) happily incorporate them!
https://github.com/TaoK/PoorMansTSqlFormatterFor translations, I was originally using "Amanuens", a translation platform developed by Dario Solera (the original developer/maintainer of Screwturn Wiki) and a couple of his friends, that sadly had to shut down a few months ago. At this point I haven't looked fotr a replacement yet, the level of interest generated by the translation work was, at best, very low (and that makes sense for a developer-focused tool: a very large proportion of international developers speak enough english to get by).
Background
But Why??
You may wonder why create yet another SQL formatting library... I've asked myself that a few times as I've been working on this. The main reason is that when I have searched for this in the past, I've never found a tool that did exactly what I wanted it to. Another answer is that it is fun, and I think I can do a pretty good job, and I've learned quite a lot about T-SQL in the process. Finally, I hadn't found any
open-source library that I thought I could help improve, rather than starting my own (when I started this project, that is).
There does appear to be at least one other active/in-progress open-source project out there (sqlformat, by Ben Laan), but I don't believe that the approach taken there (as far as I can tell from looking at the code and playing around) could scale to the type of full-script universal T-SQL formatting I want to be able to (and now can, as far as I can tell) perform.
The main challenge, I believe, is that I want to do full re-formatting of *any* T-SQL code, supporting all keywords and syntax constructs. To do this with a full T-SQL parsing model would be difficult, or maybe even unrealistic, to achieve in a one-man (or few-people) part-time open-source project. By scaling it back to a partial parsing model where only formatting-relevant constructs are singled out, the scope of the project is significantly reduced, hopefully to an achievable and maintainable level.
The library is currently "V1 Complete", with no major bugs or omissions that I am aware of - I will probably be scaling back my attention to this project unless/until someone contacts me with suggestions or bugs. I have a list of "Issues" in the github issue tracker, but none are significant enough that I expect to work on them in the coming weeks.
Other Products and Projects
To get a better idea of the value of this project (for myself and potentially others), I started putting together a quick comparison table for this project vs all the other products and projects I come across. Obviously I try to avoid undue bias, and equally obviously I will still be biased towards this project, as I am human :).
Here is the list of all T-SQL formatting tools I've encountered, with a comparison table below:
- SqlFormat .Net 3.5 Formatter Library: The only open-source tool I've found out there. The project aims are different to this one, as they indent to create a full parse-tree for those parts of the language that are supported. As such, only a reasonably small subset of language features are covered at present - primarily those used in Entity Framework-generated SQL.
- GuDu Software options:
- T-SQL Tidy: Online-only free tool, with free webservice and ssms add-in (webservice-based)
- Red Gate options:
- SQLInForm options:
- manoli.net c# code format: Apparently does T-SQL too
- Navicat Lite: DB Management tool with free/lite version for non-commercial use, apparently includes a sql beautifier in the lite version
- RazorSQL Trialware DB Management tool that apparently includes some formatting functionality
- DevArt SQL Complete: SSMS add-in with free version (?) that supports formatting
- UBitSoft options
- SoftTree SQL Assistant: Another trialware code completion and formatting tool, supporting multiple editors and multiple SQL dialects
- Quest Toad for SQL Server: Probably the reference third-party DB management tool; trialware.
- Apex SQL Refactor: Another trialware code completion / refactoring/reformatting add-in for SSMS, this one claims to keep formatting for free after the trial runs out
- Tidycode T-Sql Formatter: Command-line trialware formatter
- SQLIse / SQLPSX: Powershell-based tools and UI for SQL Server management with some(?) formatting functionality using the Microsoft Microsoft.Data.Schema.ScriptDom classes. I haven't tried it yet; someone claimed it strips comments, but I really don't know.
- SqlFormat Online SQL Formatting service: Another open-source project, this one using Python; doesn't support many DDL constructs, but degrades gracefully, so it still works with unknown constructs and structures.
- SilverlightSQLConvert An apparently japan-originated Silverlight-based formatter, enigmatically announced as "Made in Japan" by a commenter named "ninjya" on an old (2006) blog post, in June 2011. There is no information on the page as to the origins, purpose, license or copyright of the project, but in the title there seems to be a mention of "NodaSoft@" - also a company name claimed by a Russian software company. The code in the silverlight "SilverlightSQLConvert" DLL appears to be un-obfuscated and the SQL formatting appears to be text-pattern-matching-based. Trial and error shows it to be effective for single statements that don't use more complex constructs such as "MERGE".
Please note:
- This table is based on my own very brief research, in terms of what information is available online for each of these products. If you see any errors in this list, please contact me and I will correct them asap. Where I was not able to figure something out from online documentation, I have left a question mark.
- This table evaluates sql formatting features only - many of the packages/products in this list have a lot of additional functionality available, which I'm not addressing here at all!
- Please contact me if you're interested in this list but the table headers don't show up properly... I've tested it in the "main" browsers (Firefox, Chrome, Safari, hideous but legible in IE7/8/9), but it took some horrible CSS hacks to get it working, and I haven't tested it in many other browsers.
Your Formatting preferences? Who are you?
The only reason my personal preferences are important here, is that I'm the one writing this :).
What I can do, however, is explain the rationale behind my preferences, in the hope that I might sway someone's opinion:
- I find that too much whitespace in the body/joins distracts from the structure of the query, especially if you end up having to scroll to find parts of the query. This is why I'm not fond of indenting the "ON" portion of a join clause
- I find that logical expressions are easiest to read (and write) when each portion is on its own line. This applies regardless of whether it's in a join clause's "ON" conditions, or the "WHERE" clause
- I find it very important to be able to write SQL (in a variety of editors) in the target format, without needing something to clean up after you. This is the main reason I don't like "Aligned" SQL formatting. I understand it can be more legible, but it's too much of a pain to maintain (esp. in embedded/ad-hoc SQL queries). It also breaks when you use tabs (in different editors with different tab stops), and I like tabs! They're easy-to-navigate whitespace!
What if I want my SQL formatted differently?
I am completely open to providing additional formatting options, obviously within the time I have available to devote to this project. As it's open source and hosted on github, it's also trivial to fork the project, download to your local machine and get cracking on any changes you'd like to see!
If you don't want to, or can't make the changes yourself and you can't get a hold of me or I don't have time to work on the changes, there is always the host of commercial products identified above!