guiguts.pl Front end to gutcheck with some other post processing functions as well.


Latest version:

guiguts.62.zip  - (487k) contains both a perl script and a windows executable. To run the executable, you will need to download and install the perl runtime libraries. Note: Needs at least prl02. The perl script can be run on an install of perl 5.8.1 or above with perl/Tk installed. To use all of the advanced features, it will need perl/Tk804.026 or higher. The perl runtime libraries will allow you to run perl scripts against it as well as the executable.

prl03.zip - (5773k) perl runtime libraries - contains a full complement of perl libraries with Tk804.026 installed to allow either perl scripts or the Gui* executables to run on your system. Installation instructions and information on compiling your own package on this page.


This "manual", while it has some usage instructions, is pretty sketchy and it is not particularly easy to find specific things. At this point it is much more of a change log than a manual. David Cortesi (dcortesi on the DP site) has written a much more user friendly manual. It does not have a permanent home yet, but I have included a redirect page in the guiguts package which I will attempt to keep current. The page is included as an HTML file named ggmanual.html.


Written by Steve Schulze (thundergnat). 


Questions or comments? Leave a message in the Distributed Proofreaders forums or private message me as "thundergnat".

Also check out my pre processing toolkit guiprep.pl




This software has no guarantees as to its fitness to do this or any other task. Any damages to your computer, data, your mental health or anything else as a result of using this software are your problem and not mine. If not satisfied, your purchase price will be cheerfully refunded.

This program may be freely distributed, used, and modified. Reverse engineering is condoned and encouraged. If you come up with some really cool addition (or even just an idea) let me know, and it may be included in future releases. If you do reuse some of my code, I would appreciate you mentioning it in the comments of your script and dropping me a line to let me know.

CONTENTS

What is it for?

What's new?

Background:

Installing the script:

Using the script:

A few details:
Spell check
Column Cut, Copy & Paste
Word Frequency
Fixup Function
Fixup Page Separators
HTML Fixup
Footnote Fixup

How do I....
Open a file?
Save A File?
Append a file?
Abandon changes?
Quit?
See the page images?
Set page image markers?
Set a bookmark?
Go to a bookmark?
Run gutcheck?
Do bulk case adjustment?
Do bulk indenting?
Rewrap the text?
Adjust the rewrap margins?
Check for mismatched (orphaned) brackets?
Check for mismatched (orphaned) HTML markup?
Remove trailing blanks?
Find / fix spaced hyphens/em dashes?
Check for consistent hyphenization?
Check for consistent accents?
Check for unusual or discouraged characters?
Check for unusual  capitalization?
Check spelling?
Check for scannos?
Right justify poetry line numbers?
Easily find mismatched quotes?
Use the ASCII Box drawing tool?
Keep track of what I have done with a file?
Enter accented characters?
Transliterate Greek passages?
Use tear off menus?
Make the displayed text bigger/smaller/a different font?
Change Aspell dictionaries?
Set up External program calling parameters?


Hot keys:

Known bugs and odd behavior:

Hey, it doesn't work!

Change log history:



What is it for?


This script is intended primarily to be a GUI front end to gutcheck, the file checking utility written by Jim Tinsley through which all Project Gutenberg texts must pass. Gutcheck is (right now) only available as a command line utility which produces its findings to the screen or to a capture file. These results must then be compared against the text using a 3rd party tool, namely some sort of text editor. This script is essentially a text editor that has been customized to work very closely with gutcheck to help ease fix up. There are several other specialized functions included that are also helpful to someone trying to get a text formatted for Project Gutenberg.

Guiguts also provides an interface for Jeebies: a command line utility for detecting problems with very common and hard to find he/be scannos in texts. Also written by Jim Tinsley. Get a copy from its sourceforge page.

The script provides an interface to Aspell or Ispell. If you have either program installed on your system, you can do full interactive spell checking. Aspell seems to have better support under windows, and is in general a more capable package. Both are free.


What's new?

Version .62 (487k) Ok, I give up. First there were two rounds of proofing with two proofer names to keep track of. Then there were four rounds; two proofers and two formatters, except when there weren't--and except at DPEU. Now there are five rounds, unless there are four, unless there are three, except when there are two.... (Throws up hands.) Guiguts no longer cares how many rounds there are. (Well, there is an artificial limit to keep processing snappy, but I can increase it indefinitely by changing one variable.) In practice, right now it is limited to 8 rounds. It will adjust all of the proofer name display functions to match the number of rounds in the open file. I shuffled the order of the buttons in the proofer pop-up window slightly to make it easier to compensate for variable round counts. It no longer tries to figure out if a round is a proofing round or formatting or whatever; it just keeps track of whose name is attached to which round, and it is your problem to figure out what was done in which. I did make the (perhaps rash) assumption that every page in a project will go through the same number of rounds. If (when) some of the specialty rounds that have been bandied about ever come to pass, I may have to revise that assumption. I'll burn that bridge when we get there.


Went through the script and pulled out all references to the deprecated %pagemarkers hash in the .bin file. The save functions no longer save the %pagemarkers hash. It has been replaced by the more comprehensive %pagenumbers hash. They have both coexisted for about 10 versions (since .53) to give everyone a chance to get upgraded so that files saved with different versions would be backward and forward compatible. This shouldn't affect users at all unless you try to switch back and forth between pre .53 and post .62 versions. If necessary, open and save a file with one of the interim versions to make a "bridge".

Go to Change log History



Background:


Guiguts came about because of my frustration with Proofreaders Toolkit, an older, no longer supported toolkit that was used to prepare texts for Distributed Proofreaders. Proofreaders Toolkit (PRTK) has a GUI front end to gutcheck built into it. It works, but there are several things about it that are suboptimal.

Number 1. It was designed to work with an older version of gutcheck. The command line options for gutcheck have changed slightly since the PRTK was written, so it doesn't interface very well.

Number 2. A bigger problem: every time you make an edit to the file, the list of gutcheck errors becomes unsynchronized and it becomes hard to find subsequent errors that gutcheck reported.

I had previously written a preprocessing application called Prep to do pre proofing checks on texts before they were uploaded to the site. After 8 versions of Prep, I added a Gui front end to it to make it easier to select options for processing. (There were some 30 or so options and the command line was getting out of hand. There's over 70 now.) When I added the front end, I changed the name to Guiprep to differentiate it from command line Prep.

When the frustration level with PRTK's interface to Gutcheck grew too much, I thought, "Heck, I could probably write something to do that," and did so. When it came time to name it, I thought, "Well, I already have Guiprep, a Gui front end to prep; this is a Gui front end to gutcheck, I'll call it Guigutcheck." But that was too long, so I shortened it to Guiguts (which I find amusing, so that was a big plus too).

It has since grown to be a full featured text processing tool kit, of which the gutcheck interface is a relatively small, (though still important!) part.

Guiguts is written in perl to take advantage of its very powerful text processing functions and cross platform support. It will run on Windows, Linux and Mac OS X platforms. It unfortunately cannot be easily ported to Mac OS 9 and earlier due to the lack of some necessary perl modules for those OSs. Since it is written in perl, the source is automatically available for experimentation and hacking to anyone who is inclined to do so. There is also a "compiled" windows executable version (Winguts) included for those who don't have a perl interpreter on their machine. Winguts will need to have the perl runtime libraries installed or an up-to-date perl install for it to work on your system.

The script requires a perl interpreter to run. The ActiveState perl interpreter is probably the most popular for Windows users (95, 98, 98se, ME, NT, 2K, XP). They also have versions available for Linux and Solaris. It's very functional and free. (They do ask that you register, but you can bypass the registration page without entering anything if you like.) The 5.8.1 or later distribution is necessary to run guiguts. For Windows users, if you use the Microsoft Installer (MSI) version, it is very simple and automatic to set up. If you don't have Microsoft Installer, a link is included on the Activeperl download page.



Installing the script:


Installation is pretty straightforward. The script comes packaged as a zip archive file. Inside the zip, there is a directory named guiguts. That directory contains the script, several support files and several subdirectories. Just select the directory you want to install guiguts in and unzip the archive file into it. That's all that is necessary for installation.

There are a few caveats about which directory you should put it in, however. Guiguts, though a graphical program, is written in perl which is built on a command line foundation. (DOS for you Windows users.) As such, it carries some baggage associated with that. Since it is built over DOS, you need to follow DOS naming conventions for the directory it resides in. I.E. No directory names in the path with more than eight characters (there's actually some wiggle room there but that's essentially it.) and no directory names in the path with spaces in the name.  Something like C:\dp\ is ideal. Something like C:\Program Files\ is not going to work. The script will refuse to run if you install it in a directory with either of those properties. (Note: There is much more leeway for using directories with long names under Win2k and WinXP. The CMD.EXE command interpreter is much better at hiding those from programs that aren't set up to deal with them. COMMAND.COM, the command interpreter under Win98 and WinMe is not that capable though.)
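The naming rules above can be expressed as a small check. What follows is only an illustrative sketch in Python, not the Perl code guiguts itself uses. The function name path_problems and the strict eight character limit are assumptions made for the example (as noted, the real limit has some wiggle room).

```python
from pathlib import PureWindowsPath

MAX_DOS_NAME = 8  # assumed strict limit; the real rule has some wiggle room


def path_problems(install_dir):
    """Return a list of DOS-naming problems found in a Windows install path."""
    problems = []
    path = PureWindowsPath(install_dir)
    # path.parts starts with the drive/root (e.g. "C:\\") when one is present.
    names = path.parts[1:] if path.anchor else path.parts
    for name in names:
        if " " in name:
            problems.append(f"'{name}' contains a space")
        if len(name) > MAX_DOS_NAME:
            problems.append(f"'{name}' is longer than {MAX_DOS_NAME} characters")
    return problems
```

With these assumptions, path_problems(r"C:\dp\guiguts") returns an empty list, while a path under C:\Program Files\ is reported twice for the same directory: once for the space and once for the length.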

Ok, so you've got the script, you've installed it, what else? In order to make full use of the script, you'll probably want to install a spell checking package and some sort of image viewer. The script is designed to work seamlessly with either Aspell or Ispell. Aspell seems to have better support under windows, and is in general a more capable package. Both are free. They both support many different languages, just download the appropriate dictionary (Aspell win). Make sure you select a dictionary compiled for the correct OS.

If you are working on a project from Distributed Proofreaders, you will probably want some sort of image viewer. The script is agnostic about what viewer to use. Whatever you have available that you are comfortable with should work, as long as it will let you pass an image name as a parameter to open it. (Nearly all do.) There are a few free ones that work quite well and have been extensively used. Irfanview is a nice full featured image viewer/processor. XnView is another with very similar capabilities. I tend to favor XnView as a viewer because of its resample-on-scale feature, which makes images much easier to read when shrunk to fit into a window smaller than the image. (I prefer Irfanview for editing or processing images, however. What the heck, get both, they're free!)

If you are generating HTML versions of your files, there is now an interface to HTML Tidy built in to the editor. Handy for HTML checking and cleanup without needing to drop out to another program.

There is a forum thread on Distributed Proofreaders dedicated to questions on setting up these programs.




Using the script:

In order to get full use of the Unicode functions, you will need to have a font that has the Unicode characters defined. Many of the Windows and Mac fonts have some Unicode support, though coverage is spotty at best. There are some fonts with better than average coverage available from various sources. This page discusses several, with links to where they may be obtained. Two that I find useful are: Bitstream Cyberbit, with 29,934 characters, available as a free download here (or Bitstream Cyberbase, which doesn't have the Korean/Japanese characters), and Code2000, with 34,810 characters, available here. Code2000 is a shareware font with a five dollar registration fee. If you like it and use it, I would encourage you to register it. Keep in mind that if you have installed one of these large fonts, it allows you to see the characters on YOUR computer. If someone else does not have the font (or at least a font that supports the characters you use), they may not be able to see what you see.

The script is basically a text editor with some specialized functionality bolted on. The text editor module is actually fairly comprehensive. It supports multiple levels of undo, lots of cut and paste functionality and has many hot key combinations built in. It provides a front end to gutcheck with automatic cursor placement at errors/warnings. Unlike PRTK, the error list is tied to the text itself, so edits early in the file will not cause pointers to later errors to become invalid every time a line is added or deleted. There are also quite a few other post processing functions built in to help tidy up a text.
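To illustrate what "tied to the text itself" means, here is a minimal Python sketch of one common technique: named marks that move along with the lines they point at. It is an assumption about the general approach, not a description of the guiguts source, and the class and method names are made up for the example.

```python
class MarkedText:
    """A list of lines plus named marks that follow the lines they point at."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.marks = {}  # mark name -> 0-based line index

    def set_mark(self, name, line_index):
        self.marks[name] = line_index

    def insert_line(self, line_index, text):
        """Insert a line before line_index; marks at or after it move down."""
        self.lines.insert(line_index, text)
        for name, pos in self.marks.items():
            if pos >= line_index:
                self.marks[name] = pos + 1

    def delete_line(self, line_index):
        """Delete a line; marks below it move up by one."""
        del self.lines[line_index]
        last = max(len(self.lines) - 1, 0)
        for name, pos in self.marks.items():
            if pos > line_index:
                self.marks[name] = pos - 1
            else:
                # A mark on the deleted line stays put, clamped to the text.
                self.marks[name] = min(pos, last)

    def line_at(self, name):
        return self.lines[self.marks[name]]
```

A plain list of line numbers, by contrast, goes stale as soon as a line is added or removed above the reported error, which is the PRTK problem described earlier.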

The first time that you run the gutcheck function, it will ask you where your gutcheck.exe executable is. The latest version is included in the gutcheck directory of the script folder.

The first time that you try to open an image through the script, it will ask you where your image viewer is. Browse to wherever/whichever it is and select the executable.

The first time you try to run a spell check, it will ask where the executable for Aspell/Ispell is. Browse to it and select the executable.

All of these may be set up or changed at any time under the Prefs menu, "Set File Paths".

When you start the script, a GUI window will open. There is a menu across the top, a large text area and a status bar at the bottom. The menu bar groups similar functions together on drop down menus,  pretty much like most other windowed applications.


A short description of the available menu selections:

File: (tear off menu) - File operations; ubiquitous file stuff.

Open - Open a file. The script will remember the last directory that you opened a file in and browse from there.
--
Recently opened Files: 1 through 10
- Click on a file name to open it.
--
Save - Saves the open file with the current name. Short cut keys - Ctrl - s
--
Save As - Save the open file with a different name.

Include - Open and insert a file after the cursor location in the currently open file.

Clear - Abandon the open file but don't close the program

--

Guess Page Markers - If you are working with a file that has no page markers or already has the page markers removed, you can use this function to insert calculated page numbers. Not very accurate.

Set Page Markers - Set page markers all at once using the page separators from the DP file. Will allow you to use the image viewer button before you run the Page Separator Fixup function.
--
Exit - Abandon the open file and close the program. Will confirm discarding any unsaved changes.


Edit: (tear off menu) - File editing functions

Undo - Multi level undo. Track back all of the changes made since saving the file.

Redo - Multi level redo. Undo all your undos. A little buggy. Easier perhaps to make another change to the file, then use undo again. It will undo all your undos.
--
Col Cut - Take the selected column of text and transfer it to the clipboard.

Col Copy - Copy the selected column of text to the clipboard.

Col Paste - Take the contents of the clipboard and insert it at the cursor.
--
Select All - Select all of the text in the window.

Unselect All - Select none of the text in the window.


Search: (tear off menu) - Search and replace functions.

Search & Replace - (Pop up window). Search & replace functions. A fairly comprehensive search engine. Allows use of regular expressions (regexes) while searching to do some pretty complex searches.

Stealth Scannos - (Pop up window). Search & replace functions with automatic loading of stealth scannos. Basically an extension of the search window that supplies built in search and replace terms. In the scannos subdirectory, there are several files containing list of words that are commonly mis scanned for another (en-common.rc) and a list of regexes that are commonly used regex.rc. This function allows you to load the search terms into the search window one by one automatically.

Spell Check - (Pop up window). Spell check the open document if you have Aspell or Ispell installed on your machine. If you select a portion of the document it will only check that portion. If you don't select anything, it will check the entire document. Spellcheck will save the document before it runs if you have unsaved edits.

Goto Line - Goto the specified line. If you know what line you want to be on and don't want to scroll and check, scroll and check, you can jump directly to it. (The current line number is displayed at the bottom of the screen.)

Goto Page - Goto the specified page. Jump directly to the page number you enter. Will only work if your file has page markers.

Which Line? - Find out the line number of the line the cursor is on. Also available at bottom of page in status bar. A little redundant perhaps.

Find next /*..*/ block - Find next block of text with /*..*/ markup. Text surrounded by /* .. */ markup will be skipped when rewrapping. Useful for poetry, tables etc. This function helps you easily cycle through the "non-rewrapped" blocks of text.

Find previous /*..*/ block - Find previous block of text with /*..*/ markup. Mate to the previous.

Find next /$..$/ block - Find next block of text with /$..$/ markup. Text surrounded by /$ .. $/ markup will be skipped when rewrapping. Useful for poetry, tables etc. This function helps you easily cycle through the "non-rewrapped" blocks of text.

Find previous /$..$/ block - Find previous block of text with /$..$/ markup. Mate to the previous.

Find next /P..P/ block - Find next block of text with /P..P/ markup. Text surrounded by /P .. P/ markup will be formatted as poetry. This function helps you easily cycle through the poetry blocks.

Find previous /P..P/ block - Find previous block of text with /P..P/ markup. Mate to the previous.

Find next indented block - Find next block of text that is indented. (May or may not be in a marked up block)

Find previous indented block - Find previous block of text that is indented.

Find Orphaned Brackets - Find orphan or nested brackets: ( ), [ ], { } or < >, or markup /* */, /# #/ or /$ $/. Sometimes brackets are unpaired in a text and it can be a real pain trying to find the unpaired bracket. This special search function makes it easy. It will search for and highlight any unmatched brackets / markup it can find.

Highlight double quotes in selection - Highlight all of the double quotes in selection to find unmatched. A special function to help find mismatched double quotes in a selection (usually a paragraph). It can be very difficult to find missing quote marks. This highlights them to make it easier to pick them out. Hot key -> Ctrl - Shift - "

Highlight single quotes in selection - Highlight all of the single quotes in selection to find unmatched. Same thing for single quotes. Hot key -> Ctrl - '

Remove Highlights - Unhighlight any highlighted text. Typically used together with the previous two functions. Hot key -> Ctrl - 0
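The idea behind the Find Orphaned Brackets entry above can be sketched with a simple stack. This is an illustrative Python example only, not the routine guiguts uses. It handles the four single character bracket pairs and leaves out the /* */, /# #/ and /$ $/ markup. The function name find_orphans is hypothetical.

```python
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = {close: open_ for open_, close in PAIRS.items()}


def find_orphans(text):
    """Return (line, column, char) for every bracket without a partner.

    Lines and columns are 1-based. A closer that does not match the most
    recent unclosed opener is reported as an orphan.
    """
    orphans = []
    stack = []  # unclosed openers as (line, column, char)
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, char in enumerate(line, start=1):
            if char in PAIRS:
                stack.append((line_no, col_no, char))
            elif char in CLOSERS:
                if stack and stack[-1][2] == CLOSERS[char]:
                    stack.pop()
                else:
                    orphans.append((line_no, col_no, char))
    orphans.extend(stack)  # openers that were never closed
    return sorted(orphans)
```

Each reported position is something an editor could highlight. Note that a sketch this simple will also flag a stray "<" or ">" used as a comparison sign.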


Bookmarks: (tear off menu) - Set and jump to bookmarks in the text. You can keep track of up to five places in the text and jump instantly to them using these functions. They will be remembered from session to session. (Each text will remember its own.)


Selection: (tear off menu) - Perform operations that are typically done on a selection of text.

lowercase Selection - Convert the selected text to all lowercase.

Sentence case Selection - Convert the selected text to sentence case. (First word capitalized, the rest all lower case.)

Title Case Selection - Convert the selected text to title case. (First letter of each word capitalized)

UPPERCASE selection - Convert the selected text to all UPPERCASE.
--
Surround Selection With - Insert customizable text around the selected text. Defaults to underscore, the traditional Gutenberg ASCII italic marker.

Flood Fill Selection With - Overwrite selected text with customizable text (defaults to space). Control+w hot key will just overwrite without popping up the string editing window.
--
Indent Selection 1 - Move the selected text right by one space.

Indent Selection -1 - Move the selected text left by one space, will not remove non whitespace characters. You can destroy relative indenting if you continue to left indent after the text is already in the first column.
--
Rewrap Selection - Rewrap the selected text. Will skip any text inside /* */ or /$ $/ markup. Will do block indenting of text inside /# #/ markup.

Block Rewrap Selection - Rewrap the selected text using the block rewrap margins. Will skip any text inside /* */ or /$ $/ markup.

Interrupt Rewrap - Break into a rewrap routine. (It can get pretty long for large files.) A small window with an Interrupt button in it will pop up as well during the rewrapping function.

ASCII Boxes - A special function to automatically draw ASCII boxes around a selection of text. Can be set to automatically rewrap the text and left, center or right justify the text within the box. The drawing characters are selectable. See an example here. (will open in its own window)

Align text on string. - A specialized function to help align columns of numbers or text that should be aligned on a common character. (often a period/decimal point) This function will let you specify a character to align on, then adjust the indent of all the lines in the selection that contain the alignment character so that they line up.

Convert To Named/Numeric Entities - Convert anything in the selected text that isn't ASCII to HTML Named/Numeric Characters.

Convert From Named/Numeric Entities - Convert any HTML Named or Numeric characters in the selection to Unicode.

Convert Fractions - Convert any fractions that are within the Unicode standard to named or numeric entities.
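As an illustration of the "Align text on string" entry above, here is a short Python sketch. It is not the guiguts routine. The function name align_on is hypothetical, and the choice to discard any existing indent on matching lines before realigning is an assumption.

```python
def align_on(lines, marker="."):
    """Indent lines so the first occurrence of marker lines up in one column.

    Lines without the marker are returned unchanged. The marker is assumed
    to be a non-whitespace string.
    """
    targets = [line.lstrip() for line in lines if marker in line]
    if not targets:
        return list(lines)
    # The column where the marker sits in the line that needs the most room.
    column = max(t.find(marker) for t in targets)
    result = []
    for line in lines:
        if marker in line:
            body = line.lstrip()
            line = " " * (column - body.find(marker)) + body
        result.append(line)
    return result
```

Given the lines "1.5", "12.25", "total" and "  123.0", the decimal points end up in the same column and "total" is left alone.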


Fixup: (tear off menu) Specialized functions for post processing

Run Word Frequency Routine - (Pop up window) Make a list of all of the distinct words with a count of how often they occur in the text. List them case sensitively or insensitively (Polish polish), sort them by frequency or alphabetically. Left click on the word to transfer the word to the search function or just right click to search the text for that pattern. There are several sub functions as well. These are more fully explained in the "A Few Details" section.
--
Run Gutcheck - (Pop up window) Run gutcheck against the file that is currently loaded in the text editor. For speed, gutcheck is actually run against the file on the disk rather than the text in the open window. Because of this, the file will be saved first if it has been edited, to prevent gutcheck from running against a stale file. The first time gutcheck is run, it will ask where the gutcheck executable is. There is a copy included in the gutcheck directory under the guiguts directory. Browse to it, select the executable and click OK.

Gutcheck options - (pop up window) Change the behavior of gutcheck. Options -y (redirect stderr) and -e (don't echo lines) are set in the program and are not configurable, (the script wouldn't work very well otherwise). Most other options are available here and can be customized to your satisfaction. See the gutcheck documentation for more information about specific options.
    -v - Enable verbose mode. Should be enabled for most purposes.
    -t - Enable check for common typos. Do some basic checking for common typos/scannos. Enabled automatically if paranoid mode is enabled.
    -x - Disable paranoid mode. Relax checking rules. Not really recommended for most purposes.
    -p - Report ALL unbalanced single quotes.
    -s - Report ALL unbalanced double quotes.
    -m - Interpret HTML markup. Will check line lengths as if markup was not there. Automatically enabled if a threshold of HTML entities are found in a file.
    -l - Do not report non DOS newlines. Unix and Mac use different newline characters. This will suppress warnings about them.
--
Remove End-of-line Spaces - Routine to remove all end-of-line spaces. Also run during Fixup routine. A common gutcheck warning.

Run Fixup Routine - Routine to do automatic fix up of a bunch of common problems. See paragraph below in the "A Few Details" section.

--

Fix Page Separators - Functions to help automate removal of page separators from DP texts and format the text around them.

Remove Blank Lines Before Page Separators - Tidy up blank lines at page separators.

--
Footnote Fixup - Footnote consistency checking and moving tools. Pop up window. See Details for more info.

HTML fixup - HTML tools and conversion. Pop up window. See Details for more info.

Sidenote Fixup - Sidenote moving and checking tool. Will search for [Sidenote  ] markup and move any it finds to the beginning of the paragraph it finds it in.

Reformat Poetry Line Numbers - Right justify and align poetry line numbers to the right of the text. The line numbers must be the last characters on the line and must be separated from the text by at least two spaces.

Convert Windows Codepage characters to Unicode - Convert characters in the hex 80-9F range to their Unicode equivalents. These characters normally shouldn't show up in your text, but if a proofer cut and pasted the page into Word, proofed it and then pasted it back, these characters may get inserted.
--

ASCII Table Special Effects - Tools to adjust and reformat ASCII tables.

--

Clean Up Rewrap Markers - Removes all of the /* */ & /# #/ markup from the text. Deletes the entire line that the markup is on.

--

Add  a Thought Break - Add a standard Distributed Proofreader "Thought break"        *       *       *       *       *
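The "Convert Windows Codepage characters to Unicode" entry in this menu describes remapping characters in the hex 80-9F range. A minimal Python sketch of that idea follows. It relies on Python's built in cp1252 codec and is not the conversion table guiguts itself uses. The function name fix_cp1252 is hypothetical.

```python
def fix_cp1252(text):
    """Replace characters U+0080-U+009F with their Windows-1252 meaning."""
    out = []
    for char in text:
        code = ord(char)
        if 0x80 <= code <= 0x9F:
            try:
                char = bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few positions are undefined in cp1252; leave them alone.
                pass
        out.append(char)
    return "".join(out)
```

For example, the stray characters hex 93 and 94 become proper left and right curly double quotes (U+201C and U+201D).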


Prefs: (tear off menu) - Set up program preferences.

Set Rewrap Margins - Adjust margins for rewrap functions. The left margin function is NOT "leave this many spaces" it is "Start in this column". So a left margin of 5 would leave four spaces before the text.

Font - (Pop up window) Adjust font, font size and font weight. Viewable format only. Will not affect file, just the viewing parameters.

Browser Start Command - Set the startup command for your web browser. Probably best left as 'start' under Windows. Enter the full path to the executable.

Set File Paths - Set the paths to the various support programs. (gutcheck, Aspell, tidy, image viewer, pngs directory)

Leave Bookmarks Highlighted - By default, bookmarks are only highlighted when you set or jump to them. Checking this will leave the bookmarks highlighted all the time.

Disable Quotes Highlighting - The text editor can highlight pairs of quotes, double quotes, brackets or parentheses automatically when the cursor is placed between them. It can be distracting though. This option disables it.

Keep Popups On Top - If this is selected, many pop up windows will stay on top of the main window even if the focus changes.

Disable Bell - Many operations, particularly searches, sound the system bell when warning about errors. This disables the audible warning.

Auto Set Page Marks On File Open - Toggle auto page marker set when file loads. Probably should be left enabled unless you are working with very large files.

Toolbar Prefs - Enable/disable the toolbar, and select which side of the editing window you would like it to be on when you start the program.

Set Button Highlight Color - The highlight color is the color that the button changes to when active. The search window will flash the search button as a warning when no terms were found using the highlight color. I picked a default color. Change it here if you like.

Spellcheck Dictionary Select - Shortcut to the Aspell dictionary select routine.

Toggle Auto Save - Enable or disable automatic saving of the open file every interval of time. Interval defaults to 5 minutes.

Auto Save Interval - Pop up a box where you can adjust the interval between auto saves.

Toggle Auto Backups - Enable or disable automatic backups when the file is saved. Works with both user initiated and automatic saving. Saves the previous two iterations of the file so you could roll back changes if desired.
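The "start in this column" rule described under Set Rewrap Margins in this menu is easy to get off by one. The Python sketch below shows the arithmetic using the standard textwrap module. It is not the guiguts rewrap routine. The function name, the default margins of 1 and 72, and the treatment of the right margin as the last usable column are all assumptions.

```python
import textwrap


def rewrap(text, left_margin=1, right_margin=72):
    """Rewrap text to start in column left_margin and end by right_margin.

    Columns are 1-based, so left_margin=5 puts four spaces before the text.
    """
    indent = " " * (left_margin - 1)
    return textwrap.fill(
        " ".join(text.split()),  # collapse existing line breaks and spacing
        width=right_margin,      # width counts the indent as well
        initial_indent=indent,
        subsequent_indent=indent,
    )
```

Calling rewrap("one two three four five six", left_margin=5, right_margin=16) gives three lines, each preceded by exactly four spaces.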


Help: (tear off menu) - Various help items.

About - About the program

Versions - A pop up window listing the version numbers of most of the software and libraries involved with running guiguts. Mostly useful for troubleshooting.

Open Manual - Open the local copy of this document

Check for updates - Automatically connects to the server where the files are hosted and checks if there is a more recent version available.

Hot keys - A pop up list of the various  hot key bindings

Function History - A history of all of the functions that have been performed on a particular file.

Greek Transliteration - Pop up a Greek transliteration chart.

Latin-1 Chart - Pop up a chart of Latin-1 characters not on a US standard keyboard.


External: (tear off menu) - User configurable hooks to external programs.

0 - 9 - 10 slots where you can set up external program calls.

Setup - Pop up window where you can set up calls to external files and programs.


Unicode - A drop down list of different Unicode character groups. Not comprehensive, though most groups from 0100 through FFFF are represented.

Sort by Name / Range - Change the sort order that the Unicode blocks are displayed in the drop down.



A few details:
Spell check
Column Cut, Copy & Paste
Word Frequency
Fixup Function
Fixup Page Separators
HTML Fixup
Footnote Fixup

Spell check is almost completely implemented at this point. If you have Aspell or Ispell installed on your system, it will check through the file for misspelled words and let you cycle through the file checking each in turn. It will pop up a list of guesses for each misspelled word. Double click on a word to move it to the replacement text box, then click change to replace it in the text. (Or triple left click on a word to use it as a replacement.) It is modeless, so you can go and edit in the main window and then pick up where you left off with the spell check, though you should probably avoid this if possible as it can confuse the spell checking code. The Aspell package will learn from mistakes. If you have a word misspelled in the text and use one of the replacement terms, the next time it sees that misspelling, it will put the previously selected replacement higher in the list of possibles.



The column cut, and column copy functions are a little tricky to use. The selection highlighting does not stay within the bounding box, so it looks like you are selecting entire lines. When you use the column functions though, the only text that is actually selected is whatever is within the bounding box formed by the upper left selection point and lower right selection point. The selection highlighting will extend past the actual selection.
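The bounding box behavior can be pictured with a few lines of Python. This is only a sketch of the idea, not the guiguts code. The function names and the 0-based, slice style coordinates are assumptions made for the example.

```python
def column_copy(lines, top, left, bottom, right):
    """Return the text inside a rectangular selection.

    top/bottom are 0-based line indexes (inclusive); left/right are 0-based
    column indexes with right exclusive, like a Python slice.
    """
    return [line[left:right] for line in lines[top:bottom + 1]]


def column_cut(lines, top, left, bottom, right):
    """Remove the rectangle from lines in place and return what was removed."""
    removed = column_copy(lines, top, left, bottom, right)
    for i in range(top, min(bottom, len(lines) - 1) + 1):
        lines[i] = lines[i][:left] + lines[i][right:]
    return removed
```

With the rows "abcdef", "ghijkl" and "mnopqr", cutting the box from line 0, column 1 to line 1, column 3 removes "bc" and "hi" and leaves "adef", "gjkl" and "mnopqr", even though an on-screen highlight would cover whole lines.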



The gutcheck function will run gutcheck against a copy of the file from your disc and pop up a window with the list of errors/warnings that it found. It may take some time to run, especially on long files and/or slow computers. When you double click on an error in the pop up window, the cursor in the text window will move to the error, or as close to the error as possible. If you click again, the focus will change to the text window. For some errors, the cursor may not end up exactly at the reported error's location. This could be for a variety of reasons. HTML markup changes how gutcheck reports column locations, so that could throw it off. Queries of very short or common words may find an instance earlier on a line than the one actually being queried. In general, it will nearly always get the line right, and get the column right better than 3/4 of the time.

Handy Tip: While working in gutcheck, for mismatched quotes errors, click on the gutcheck warning 3 times. This will move the cursor to the end of the paragraph with the mismatched quotes and change the focus to the text window. Press "Control-Shift-Up Arrow" to select the paragraph before the cursor, press "Control-Shift-Double quote" to highlight all of the double quotes in the paragraph. Makes it much easier to pick out the missing/extra quote marks. (Also works for "Control-Single Quote" for single quote mismatches.)  Press "Control-Zero" to remove all highlights.



The word frequency function will build an index of all of the words in the file with a count of the number of times each appears. By default, the word list is built doing a case sensitive search ("This" is different from "this") with the words listed by frequency. You can change the search parameters to make the search case insensitive, and you can change the order of the list to be sorted alphabetically or by the number of occurrences. When you left click on a word in the list, it will automatically copy it to the search text entry box if the search pop up is open. If you right click on a word, it will automatically search the text for that pattern. It will search using the options selected on the search pop up (case sensitive or not, whole word or not). There are a few sub sort functions too, to allow you to easily do some specialized word frequency sorts.

1st Harmonic - Will pop up a window and display all of the words in a text that are one edit away from the word selected. For instance, if you select "the" and press First Harmonic, you might end up with the list "he, she, She, the, The, them, then, they, thy, tie". These are all the words in the text that can be gotten from the selection with only one edit. (Well, you get the original word back, so one or less...) The edit can be a replaced letter, a removed letter or an added letter, but there can be only one edit. You must have selected a word in the word frequency window or it won't return a list. Different texts will get different lists. It doesn't return every POSSIBLE variation in spelling, only those that are present in the text. There is a hot key shortcut - Ctrl-left click. The harmonics window has the same search bindings as the word frequency window (left click to pop up the search window, right click to search with the current search settings). You can also recursively do a harmonic search on a word in the harmonics window. You must use the hot key to do so.
All Words - Will display all of the words found in the document. (Default display.)
Re Run - Clears all the word hashes out of memory and runs a fresh sort on the file to pick up any edits you may have made.
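
The 1st Harmonic search above amounts to an "at most one edit" test applied to every word found in the text. The following Python sketch shows one way to do that; it is only an illustration, not the routine guiguts actually uses, and the names within_one_edit and first_harmonic are made up for the example.

```python
def within_one_edit(a, b):
    """True if b equals a, or differs by one replaced, added or removed letter."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                      # make a the shorter (or equal) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1                           # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]    # one replaced letter at position i
    return a[i:] == b[i + 1:]            # one extra letter in b at position i


def first_harmonic(word, vocabulary):
    """Return the words from vocabulary that are one edit (or less) from word."""
    return sorted(w for w in vocabulary if within_one_edit(word, w))


vocab = ["he", "she", "She", "the", "The", "them", "then", "they",
         "thy", "tie", "there", "this"]
print(first_harmonic("the", vocab))
```

Because the comparison is case sensitive, "The" counts as one replaced letter away from "the", which is why both show up in the example list.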

Check Emdashes - Will sort out all of the emdash phrases in the text and display them with the frequency that they occurred. If there is a word that is identical to one of the emdash phrases except it has a hyphen instead of an emdash, it will be displayed as well with a string of asterisks next to it **** so it can be easily picked out.
Check Hyphens - Will sort out all of the words with hyphens in them and display them with the frequency they occurred. If there is a word that is identical to one of the hyphenated words only without a hyphen, or is identical except it has an emdash, it will be displayed as well with a string of asterisks next to it **** so it can be easily picked out. Makes it easy to find inconsistently hyphenated words.
Check Alpha/num - Will sort out and display all of the words in a text that contain a mixture of letters and digits, i.e. it will display 1st, 2nd, 23rd, 75c, etc. It will also display l86O, l9th, 0ddba11, grumb1e and M1STAKE.
Check spelling - This will run a spell check on the file and return the list of "misspelled" words that it found in the file. Note: at this time, Aspell will not handle multi byte characters, so any words with multi byte characters are filtered out of the file and added to the "misspelled" list unless they are in the project dictionary. They may be spelled correctly, but Aspell has no way of knowing (yet).
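
To make the Check Hyphens comparison described above more concrete, here is a small Python sketch. It only handles the "same word without the hyphen" case (not the emdash variant), and the function name hyphen_report and the tuple layout are assumptions for the example rather than anything taken from guiguts.

```python
import re
from collections import Counter


def hyphen_report(text):
    """List hyphenated words with counts, flagging unhyphenated twins with ****."""
    counts = Counter(re.findall(r"\w+(?:-\w+)*", text))
    report = []
    for word in sorted(w for w in counts if "-" in w):
        report.append((word, counts[word], ""))
        joined = word.replace("-", "")
        if joined in counts:
            # The same word also appears without its hyphen: flag it.
            report.append((joined, counts[joined], "****"))
    return report


sample = "to-day we met; today we part. to-day again. well-known"
for word, count, flag in hyphen_report(sample):
    print(count, word, flag)
```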

Ital/Bold Words - Will sort out and display words and phrases that are marked up with bold or italics markup. There is a four word threshold set by default. (Will not display phrases longer than 4 words.) The threshold is adjustable by right clicking on the Ital/Bold Words button.
Check All Caps - Will sort out and display all of the words that have no lower case letters. (10TH would be displayed, even though it contains digits, since it contains no lower case letters.)
Check MiXeD CasE - will display all words with a mixture of lower case letters and at least one upper case letter not at the beginning of the word.
Initial Caps - will display all words with an initial capital letter and no other upper case letters.
Character counts - Much like the name suggests, a list of the different characters that appear in the text and how many times they appear. White space characters are represented by their names rather than the actual character.

Check , Upper - will display all phrases that have an upper case character after a comma (comma/period error). Will search across lines. Terms that have a newline in them will have the newline represented by \n.
Check . Lower - will display all phrases that have a lower case character following a period (comma/period error). Will search across lines. Terms that have a newline in them will have the newline represented by \n.
Check Accents - Will sort out and display all of the words in the text that contain an accented letter. If there are any words that are identical except they have an unaccented letter, they will also be displayed with a string of asterisks next to them **** so they can easily be picked out. Makes it easy to find inconsistently accented words.
Unicode > FF - This will sort out and display all words (words, not characters!) that contain any Unicode character greater than hex FF (decimal 255). Characters FF (255) and lower are in the Latin-1 character set.
Stealtho check - This is another way to check for scannos in the file. You can use the en-common.rc file to get the words that are commonly misscanned for each other, or you could use the misspellings.rc word list, which looks for the 3500 or so most commonly misscanned words. Both files are in the scannos directory.
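
The Check Accents idea above (list every accented word, and flag any unaccented twin) can be sketched in a few lines of Python. This is only an illustration under the assumption that "unaccented" means "with combining marks stripped after Unicode decomposition"; the names strip_accents and accent_report are hypothetical.

```python
import re
import unicodedata
from collections import Counter


def strip_accents(word):
    """Return word with accents removed (decompose, then drop combining marks)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_report(text):
    """List accented words with counts, flagging unaccented twins with ****."""
    counts = Counter(re.findall(r"\w+", text))
    report = []
    for word in sorted(counts):
        plain = strip_accents(word)
        if plain == word:
            continue                      # no accented letters in this word
        report.append((word, counts[word], ""))
        if plain in counts:
            report.append((plain, counts[plain], "****"))
    return report


sample = "The r\u00f4le was a role; the caf\u00e9 closed."
for word, count, flag in accent_report(sample):
    print(count, word, flag)
```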



The Fixup function will comb through the text and make innocuous and easily automated repairs to the text. It will pop up a window and allow you to customize which checks you want to perform.
As of now, the fixes it will perform are:

• Remove spaces at end of line.
• Remove spaces on either side of hyphens.
• Remove space before periods.
• Remove space before exclamation points.
• Remove space before question marks.
• Remove space before commas.
• Remove space before semicolons.
• Remove space after opening and before closing brackets.
• Remove space after open angle quote and before close angle quote.
• Remove space after beginning and before ending double quote.
• Ensure space before ellipses except after period.
• Format any line that contains only 5 *s and whitespace to be the standard 5 asterisk thought break.
• Convert multiple spaces to a single space.
• Fix obvious l<-->1 problems.
You can also specify whether to skip text inside the /* */ markers or not.
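
A few of the fixes in the list above are simple enough to express as regular expressions. The Python sketch below covers only three of them (trailing spaces, space before punctuation, multiple spaces). The function name fixup_line, the order of the steps, and the decision to leave leading indentation alone are all assumptions for the example, not a description of how guiguts does it.

```python
import re


def fixup_line(line):
    """Apply a small subset of the Fixup repairs to one line of text."""
    # Remove spaces at end of line.
    line = re.sub(r"[ \t]+$", "", line)
    # Remove space before ! ? , ; and before a period, but leave a space
    # in front of an ellipsis (a period followed by another period).
    line = re.sub(r" +(?=[!?,;]|\.(?!\.))", "", line)
    # Convert multiple spaces to a single space, except leading indentation.
    line = re.sub(r"(?<=\S) {2,}", " ", line)
    return line


print(fixup_line("Hello , world !  "))
print(fixup_line("Wait ... what  now ?"))
```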



Fix Page Separators will pop up a window with several buttons on it to help automate removal of page separators from files from Distributed Proofreaders.
The Buttons are:
Join Lines - join the lines on either side of the separator, removing any blank lines, spaces, asterisks and hyphens as necessary. - Hotkey -> j  (Notice: this will remove any leading hyphen, spaces and asterisks from the line after the separator as well.)
Join, Keep Hyphens - join the lines on either side of the separator, removing any blank lines, spaces and asterisks as necessary, but leaving hyphens in place. - Hotkey -> k
Blank Line - remove the separator, leave one blank line. (paragraph break) - Hotkey -> l
New Chapter - remove the separator, leave four blank lines. (chapter break) - Hotkey -> h
Refresh - find, center and highlight the next page separator - Hotkey -> r
Undo - automatically back out of all of the changes made for the last separator edit  - Hotkey -> u

There are also some check boxes to control some of the functions.
Full Auto will automatically search for the next page separator as soon as one has been done and try to automatically process it if it can. - Toggle state - a
Semi Auto will automatically search for the next page separator as soon as one has been done and wait for you to select an operation. - Toggle state - s



HTML Fixup: Pops up a button bar which has most of the popular HTML markup on it; at least, all that is easily translatable to TEIlite. Will automatically insert the selected markup around the selected text. Some markup buttons act differently depending on what text is selected. There is an Autogenerate HTML function that will do basic conversion to an HTML version. The available markup and functions:

Autogenerate HTML: Will do basic conversion to HTML. Will attempt to make links to out-of-line footnotes if found. Will preserve line breaking in text marked with /*..*/ (there needs to be a blank line before the open and after the close delimiters.) Will try to preserve indenting; not real elegant, but it tries. Will automatically add HTML header and footer if not there already.
Custom Page Labels: Configure page numbers independently of image numbers.
Auto Illus Search: Automatically search for [Illustration: markers and interactively allow you to select images to display there.
Pg #s as comments - Insert the page numbers as HTML comments.

Pg #s as anchors - Insert the page numbers as internal HTML links (anchors).
<i>  Italics  -  Insert <i> </i> around the selected text, removing any that may be in the selection.
<b> Bold  -  Insert <b> </b> around the selected text, removing any that may be in the selection.
<u> Underline -  Insert <u> </u> around the selected text, removing any that may be in the selection.
<center> Center -  Insert <center> </center> around the selected text, removing any that may be in the selection.
<h1>Header 1 -  Insert <h1> </h1> around the selected text.
<h2>Header 2 -  Insert <h2> </h2> around the selected text.
<h3>Header 3 -  Insert <h3> </h3> around the selected text.
<h4>Header 4 -  Insert <h4> </h4> around the selected text.
<h5>Header 5 -  Insert <h5> </h5> around the selected text.
<h6>Header 6 -  Insert <h6> </h6> around the selected text.
<p> Paragraph -  Insert <p> </p> around the selected text.
<br> Line break -  Insert <br> at the end of each line of  the selected text, or at the cursor if no selection is made.
<hr> Horizontal line -  Insert <hr> before  the selected text, or at the cursor if no selection is made.
&nbsp; Non breaking space -  Replace space with &nbsp; wherever there are two or more adjacent spaces in  the selected text, or at the cursor if no selection is made.
Poetry - Will automatically insert markup used in /p p/ blocks in the selection.
<big> - Insert <big> </big> around the selected text
<small> - Insert <small> </small> around the selected text
<ol> Ordered list -  (numbered list) Insert <ol></ol> around the selected text. Also need to define list items inside it.
<ul> Unordered list -  (bulleted list) Insert <ul></ul> around the selected text. Also need to define list items inside it.
<li> List item -  Insert <li></li> around the selected text. A list item. Defaults to bulleted unless surrounded with <ol></ol>.
<sup> Superscript -   Insert <sup> </sup> around the selected text.
<sub> Subscript -   Insert <sub> </sub> around the selected text.
<table> Table -   Insert <table> </table> around the selected text. Will need to define rows and columns.
<tr> Table row -   Insert <tr> </tr> around the selected text. Meaningless without <table></table> markup.
<td> Table column or cell -   Insert <td> </td> around the selected text. Meaningless without <table></table> markup.
<blockquote> Block quote -  Insert <blockquote> </blockquote>  around the selected text. (Indented from each margin)
<code> Code -  Insert <code> </code>  around the selected text. (mono spaced font)
Named anchor:  -  Insert an internal anchor (for a link) before the selected text, using the selected text as the name.
Image:  -  Insert an image anchor before the selected text, using the selected text as the alternate text. Will ask for image directory first time and remember it.
External link:  -  Insert a link around the selected text, using the selected text as the link text, to an external file or location.
Internal link:  -  Insert a link around the selected text, using the selected text as the link text, to a previously created Named anchor.
Remove markup from selection:  -  Remove all HTML markup from the selected text, will warn if it leaves orphans as a result.
Find orphan markup:  -  Search for all HTML markup that is opened but not closed or closed but not opened.
Auto list - Will automatically place list markup in a selection. Each line will be treated as a list item.
Auto table - Will automatically place table markup in a selection. Each line will be treated as a row; two or more spaces between items will denote a cell.
div - Insert a div around a selection. Will use the style in the entry box to the left.
span - Insert a span around a selection. Will use the style in the entry box to the left.
Header: Insert HTML header and footer. You can customize the header in the external file "header.txt" located in the guiguts directory.
Find and Format Poetry Line #s - Mark up any poetry line numbers it finds with appropriate markup during autogenerate.
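
The Auto table rule in the list above (each line is a row, two or more spaces separate cells) is easy to picture with a short Python sketch. The function name auto_table and the exact layout of the generated markup are assumptions for illustration; the markup guiguts actually emits may differ.

```python
import re


def auto_table(selection):
    """Turn lines of text into table markup: one row per line,
    cells split wherever two or more spaces appear."""
    rows = []
    for line in selection.splitlines():
        if not line.strip():
            continue                                  # skip blank lines
        cells = re.split(r" {2,}", line.strip())
        tds = "".join(f"<td>{cell}</td>" for cell in cells)
        rows.append(f"<tr>{tds}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(auto_table("Name  Age\nAda Lovelace  36"))
```

Note that a single space does not split a cell, so "Ada Lovelace" stays together.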



Footnote Fixup: The way the function works: Open the Footnote Fixup routine under the Fixup menu. Right in the middle of the dialog window that opens, is a button called First Pass. This will comb through the file and find everything it thinks is a footnote. Once this is finished, you need to manually step through the footnotes it found and check each one to make sure it has no errors. (Missing open or closing bracket usually) If a bracket is missing, the footnote highlighting will extend beyond its boundaries. You will need to add the enclosing bracket, then hit Adjust Bounds, to re-search for the limits of that footnote.

If the footnote was left as an out-of-line footnote by the proofers, it will try to find the anchor in the text, if possible. The tool depends on the footnote being formatted "[Footnote xx:" where xx is the footnote letter/number/symbol. If the footnote marker is formatted with the number/symbol following the colon, "[Footnote: xx", it will not be able to identify it. In these cases, you must tell the script where to set the anchor. You will get different behaviors depending on whether you elect to do In-line or Out-of-Line footnotes. For inline footnotes, put the cursor at the point where you want the anchor to be and press Set Anchor. If there is an existing anchor, it will be deleted and the footnote will be moved to where the anchor was. If there was no anchor, the footnote will be moved to the new anchor point (where the cursor is). For Out-of-line footnotes, select Number, Letter, or Roman to select the symbol type. If an anchor exists, it will change to the next available symbol of that type. If no anchor exists, it will add an anchor of the selected type at the present cursor location. **Notice, you may get duplicate footnote symbols. That will be fixed in the re indexing step.** If you have a footnote that has been broken across a page, you can use Join With Previous to automatically rejoin the two halves.
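
To show why the "[Footnote xx:" format matters, here is a small Python sketch of a marker search. The pattern is an assumption written for this example, not the expression guiguts itself uses, and the name footnote_labels is hypothetical.

```python
import re

# Matches "[Footnote xx:" where xx is a run of characters with no
# whitespace, colon or closing bracket (assumed form, for illustration).
FOOTNOTE_RE = re.compile(r"\[Footnote ([^\s:\]]+):")


def footnote_labels(text):
    """Return the labels of all footnotes written as "[Footnote xx:"."""
    return [m.group(1) for m in FOOTNOTE_RE.finditer(text)]


sample = (
    "Body text.[1]\n\n"
    "[Footnote 1: A correctly formatted note.]\n\n"
    "[Footnote: 2 The label follows the colon, so it is missed.]\n\n"
    "[Footnote A: Another correctly formatted note.]\n"
)
print(footnote_labels(sample))
```

The second footnote is skipped because its label comes after the colon, which is the situation where you have to set the anchor by hand.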

Once you have stepped through and checked, adjusted, fixed and anchored all of your footnotes, hit the button Re Index.  For inline footnotes, this will go through and delete any remaining anchor markers and move the footnotes into place if necessary. For out-of-line, it will renumber all of the footnotes using the same family of symbol that it had originally or a number if it had no anchor marker. This will close up any gaps in the numbers and remove duplicates. You can make changes and re index as often as you like.

Once that is finished, inline footnotes are done. Out-of-line footnotes will need to have a place (or places) selected for the footnotes to be moved to--end of text, end of each chapter, whatever you want. There are two buttons to automatically set landing zones at the end of every chapter or at the end of the text. Alternatively, you can manually select where you want the footnote landing zones to be. Put the cursor where you want footnotes to be moved to and press Set Landing Zone. This will insert the marker text "FOOTNOTES:" at that spot. The footnotes between that landing zone and the previous one will be moved to just past that marker. You can have as many landing zones as you like, and can step through them adding and removing as necessary.

After your footnotes have been relocated, you can redo a first pass to check that they are all correct. You will probably need to check Unlimited Anchor Search (This keeps the script from searching before the present page, more or less, to keep from finding anchors from previous footnotes early on when there are probably multiple footnote 1s.) You can easily view the anchor and footnote by pressing the appropriate button.

A handy hint: After the first pass has been done but before second, run the Word frequency routine, sort alphabetically, and check to see that you have the same number of occurrences of Footnote in word frequency as you do in the Footnote fixup box. If not, you may have a problem in the text, probably a footnote with missing or incorrect brackets.



Hot keys:


There are many, many functions available through hot key combinations as well. Here is a fairly complete list.

<ctrl>-x -- cut
<ctrl>-c -- copy
<ctrl>-v -- paste
<ctrl>-a -- select all

<ctrl>-s or <ctrl>-S -- save file
<ctrl>-f or <ctrl>-F -- pop up search window

F1 -- column copy
F2 -- column cut
F3 -- column paste  * Notice: column paste should only be used to paste a column onto lines that already contain text. To paste a column of text onto blank lines, use standard paste: <ctrl>-v


<ctrl>-u -- convert selection to upper case
<ctrl>-l -- convert selection to lower case
<ctrl>-t -- convert selection to title case

<ctrl>-i -- insert a tab character before cursor (Tab)
<ctrl>-j -- insert a newline character before cursor (Enter)
<ctrl>-o -- insert a newline character after cursor

<ctrl>-d -- delete character after cursor (Delete)
<ctrl>-h -- delete character to the left of the cursor (Backspace)
<ctrl>-k -- delete from cursor to end of line

<ctrl>-z -- undo
<ctrl>-y -- redo

<ctrl>-e -- move cursor to end of  current line. (End)
<ctrl>-b -- move cursor left one character (left arrow)
<ctrl>-p -- move cursor up one line (up arrow)
<ctrl>-n -- move cursor down one line (down arrow)

<ctrl>Home -- move cursor to the start of the text
<ctrl>End -- move cursor to end of the text
<ctrl>-right arrow -- move to the start of the next word
<ctrl>-left arrow -- move to the start of the previous word
<ctrl>-up arrow -- move to the start of the current paragraph
<ctrl>-down arrow -- move to the start of the next paragraph
<ctrl>PgUp -- scroll left one screen
<ctrl>PgDn -- scroll right one screen

<shift>-Home -- adjust selection to beginning of current line
<shift>-End -- adjust selection to end of current line
<shift>-up arrow -- adjust selection up one line
<shift>-down arrow -- adjust selection down one line
<shift>-left arrow -- adjust selection left one character
<shift>-right arrow -- adjust selection right one character

<shift><ctrl>Home -- adjust selection to the start of the text
<shift><ctrl>End --  adjust selection to end of the text
<shift><ctrl>-left arrow -- adjust selection to the start of the previous word
<shift><ctrl>-right arrow --  adjust selection to the start of the next word
<shift><ctrl>-up arrow --  adjust selection to the start of the current paragraph
<shift><ctrl>-down arrow -- adjust selection to the start of the next paragraph

<ctrl>-/ -- select all
<ctrl>-\ -- unselect all
<Esc> -- unselect all

<ctrl>-' -- highlight all apostrophes in selection.
<ctrl>-" -- highlight all double quotes in selection.
<ctrl>-0 -- remove all highlights.

<Insert> -- Toggle insert / overstrike mode

Double click left mouse button -- select word
Triple click left mouse button -- select line

<shift> click left mouse button -- adjust selection to click point
<shift> Double click left mouse button -- adjust selection to include word clicked on
<shift> Triple click left mouse button -- adjust selection to include line clicked on

Single click right mouse button -- pop up shortcut to menu bar

<alt>-left arrow -- move selection left one space
<alt>-right arrow --  move selection right one space

BOOKMARKS:

<ctrl>-<shift>-1 -- set bookmark 1
<ctrl>-<shift>-2 -- set bookmark 2
<ctrl>-<shift>-3 -- set bookmark 3
<ctrl>-<shift>-4 -- set bookmark 4
<ctrl>-<shift>-5 -- set bookmark 5
       
<ctrl>-1 -- go to bookmark 1
<ctrl>-2 -- go to bookmark 2
<ctrl>-3 -- go to bookmark 3
<ctrl>-4 -- go to bookmark 4
<ctrl>-5 -- go to bookmark 5.

MENUS:

<alt>-f -- file menu
<alt>-e -- edit menu
<alt>-r -- search menu
<alt>-b -- bookmark menu
<alt>-s -- selection menu
<alt>-x -- fixup menu
<alt>-p -- preferences menu
<alt>-h -- help menu



Known bugs and odd behavior:

While doing search and replace, be careful when doing bulk editing on words that may be part of another word with an apostrophe extension, i.e. won, won't, Mike, Mike's etc... Due to the way the script recognizes words, it is very difficult to ignore single quotes yet account for apostrophes, especially since they are being represented by the same character. Actually, same problem with accented characters. It is all due to the semantics of what perl considers to be a word character or not.



Unicode support is non-existent. The Tk text widgets I am using just don't understand Unicode, so there isn't any way for me to implement it with the current widget set. There are rumors that the next major release of perl/Tk will support Unicode, but until it emerges, I won't know for sure.
Unicode is now substantially supported if you are running Tk804.025 or higher. There is still a problem of characters not being fully supported by various fonts, but that is beyond my control. Many fonts support substantial subsets of Unicode, and there are a few that support quite a bit, but you can't rely heavily on any particular character necessarily being available.



Accented character handling is broken. Again, a limitation of the perl/Tk text widget. It knows about the existence of accented characters at least, but doesn't know enough about them to treat them as word characters. Leads to some odd failures when doing  regex operations. I have tried to work around the problem as much as possible, but you will still find some oddities here and there.
This is no longer as true as it was once. Accented character support is still subtly broken, but I have managed to work around it pretty smoothly. There are still a few minor caveats, but it is not as serious as it was before.



The Regex search engine doesn't recognize the newline assertion \n. This is perhaps the biggest and most distressing drawback to the perl/Tk text widget. It has two major and serious implications for doing regex operations.

1) You can't use newline assertions in regex search and replace operations. They won't work. The perl/Tk text widget just doesn't understand them.

2) You can't search for strings that cross over a line boundary. Perl/Tk basically treats a text as an array of separate text strings. A string that crosses a line boundary will not return  as a hit when searching for that string. This is a fairly serious drawback, but can be worked around to some extent if you are aware of it.

I have neatly sidestepped the issue by bypassing the perl/Tk text widget for any regex search that contains a \n assertion. It works pretty well, though it is not completely seamless. It is probably close to as good as it is going to get, however.
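
The difference between a line-by-line search and a whole-text search can be shown with a short Python sketch. It only mimics the behavior described above; it does not use Tk, and the function names search_by_line and search_whole_text are made up for the example.

```python
import re


def search_by_line(pattern, text):
    """Search each line on its own, the way a line-oriented widget would."""
    return [m.group(0)
            for line in text.split("\n")
            for m in re.finditer(pattern, line)]


def search_whole_text(pattern, text):
    """Search the text as one string, so matches may span line boundaries."""
    return [m.group(0) for m in re.finditer(pattern, text)]


text = "the quick brown\nfox jumps"
print(search_by_line(r"brown\s+fox", text))
print(search_whole_text(r"brown\s+fox", text))
```

The phrase that is split over two lines is found only by the whole-text search, which is roughly why searches containing \n are handled outside the text widget.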



When saving a file with Unicode characters, the console window will complain "Wide character in print" for EVERY line that contains a wide (2 byte) character. This is harmless, it saves correctly, it just complains about it. I haven't figured out how to suppress this yet.
Never mind. I figured out how to suppress the warnings.


While doing regex search and replace with variable capturing, you can't use the zero width positive look ahead and look behind in the search term. It is a side effect of the way I have to do the regex handling to work with the Tk text widget. You can use them while searching, but the replacement term will not see any of the captured variables. A work around is to just capture the forward or reverse term in its own set of parentheses and add it back in to the replacement term. Negative lookaround assertions should be ok to use though.
This is still not perfect, though much improved. In general, you can now use positive lookarounds, EXCEPT if they contain a literal closing parenthesis. In that case, just capture the parenthesis and add it back in.
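
The "capture it and add it back in" work around gives the same result as a positive lookahead. The Python sketch below demonstrates the equivalence using Python's own re module (where both forms work); the sample text and variable names are invented for the example, and in guiguts the replacement syntax is Perl's ($1, $2) rather than \1, \2.

```python
import re

text = "price: 100 USD, weight: 100 kg"

# With a positive lookahead: " USD" is required but not consumed.
with_lookahead = re.sub(r"(\d+)(?= USD)", r"[\1]", text)

# Work around: capture " USD" in its own group and put it back.
with_capture = re.sub(r"(\d+)( USD)", r"[\1]\2", text)

print(with_lookahead)
print(with_capture)
```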




Hey, it doesn't work!


When you run winguts, if you only get a DOS box that flashes up on the screen, shows some text and then disappears, there are three possible things that could cause that.

1) A bad directory name/path - Guiguts, though a graphical program, is written in perl which is built on a command line foundation. (DOS for you Windows users.) As such, it carries some baggage associated with that. Since it is built over DOS, you need to follow DOS naming conventions for the directory it resides in. I.E. No directory names in the path with more than eight characters (there's actually some wiggle room there but that's essentially it.) and no directory names in the path with spaces in the name.  Something like C:\dp\ is ideal. Something like C:\Program Files\ is not going to work. The script will refuse to run if you install it in a directory with either of those properties. (Not as true under WinXP.)

2) Executable no longer in right directory - The executable relies on being able to find several external modules and libraries in a predetermined relative location. If the executable is moved from the winguts directory, it will not be able to find the external files it needs and will not run. If you want to run it from your Desktop, Menu, Quick Launch bar, whatever, make a shortcut to it and move that, but leave the executable where it is.

3) Corrupted or missing libraries/modules - There are a whole bunch of external modules and .dll files that the executable needs to run. If they are missing or corrupted, it will refuse to run. Make sure you've downloaded and installed the perl runtime libraries, and that the prl directory is in your path.

How do I....
Open a file?
Save A File?
Append a file?
Abandon changes?
Quit?
See the page images?
Set page image markers?
Set a bookmark?
Go to a bookmark?
Run gutcheck?
Do bulk case adjustment?
Do bulk indenting?
Rewrap the text?
Adjust the rewrap margins?
Check for mismatched (orphaned) brackets?
Check for mismatched (orphaned) HTML markup?
Remove trailing blanks?
Find / fix spaced hyphens/em dashes?
Check for consistent hyphenization?
Check for consistent accents?
Check for unusual or discouraged characters?
Check for unusual  capitalization?
Check spelling?
Check for scannos?
Right justify poetry line numbers?
Easily find mismatched quotes?
Use the ASCII Box drawing tool?
Keep track of what I have done with a file?
Enter accented characters?
Transliterate Greek passages?
Use tear off menus?
Make the displayed text bigger/smaller/a different font?
Change Aspell dictionaries?
Set up External program calling parameters?




Open a file? - Select Open from the File menu at the top of the program window. Or if you have previously opened a file, you can simply click on the name in the recently opened files in the same menu. If you are running winguts, you can associate the extension .ggp (guiguts project) with winguts, and if your files are named with a .ggp extension, you can just double click on the file to open it in winguts.



Save a file? - Select File -> Save from the menus at the top of the program window. Alternately, Ctrl-s will save the file. Some functions will automatically save the file when they are run if there have been edits. (gutcheck, word frequency) Saving the file clears the undo buffer. Use Save As if you want to save with a different name. If saving with a different name, it is generally recommended to not JUST change the extension. There are several files of external information (page markers, bookmarks, function history) that are saved with the same file name but different extensions. If you have two (or more) files in the same directory with the same base name but different extensions, it will cause collisions in the external info files.



Append a file? - Place the cursor where you want the file to be added, then select File -> Insert from the menus at the top.



Abandon changes? - Use File-> Clear to clear the current file from memory, it will ask if you want to save any edits that have been made.



Quit? - Use either File -> Exit or click on the program close button in the program frame. Will ask if you want to save any edits.



See the page images? - To see the page images, you will need to have an image viewer, the page images in an accessible directory (default is the pngs directory under the directory the text file is in), and have the page markers set. If the file is from Distributed Proofreaders, it will have page separator markers in it. You can set the page markers automatically while running Fixup Page Separators, or set them instantly by using File -> Set Page Markers. If your file has no page separators in it, you can use Guess page markers to set some markers which will be close to the correct page (you won't have to search very far at least). Once markers are set, the current page number and a button marked SEE will appear in the bottom status bar. Clicking on SEE will open your image viewer to the image corresponding to the current page.



Set page image markers? - Page markers will automatically be set as you run the Fixup Page Separators function. If you want to set them immediately, before the Page Separator function is run (recommended) use File -> Set Page Markers.



Set a bookmark? - Either use the Bookmarks menu item or Ctrl-Shift-(1 - 5) for up to five bookmarks per file.



Go to a bookmark? - Either use the Bookmarks menu item or Ctrl-(1 - 5)



Run gutcheck? - Under the fixup menu you can select the gutcheck run options and run gutcheck.



Do bulk case adjustment? - Under the Selection menu you can do bulk adjustment of case. Switch the selected text to all lower case, all upper case, sentence case (first word capitalized), or title case (each word initial caps).



Do bulk indenting? - Under the Selection menu you can move the selected text right or left one space with one click. When moving text left, it will not remove non whitespace characters. An easy way to remove relative indenting is to continually move the text left until it is all at the left margin.



Rewrap the text? - Under the Selection menu there are two rewrap functions, Rewrap text and Block Rewrap text. Rewrap (by default) will rewrap the text from column 1 to column 72. It will remove any double spaces or trailing spaces in the text. Any text within /* */ markup will be ignored by the rewrap function. The /* markup should be on a line by itself and there must be a blank line before the markup. The */ markup should be on a line by itself and there must be a blank line after the markup. */* markup will be treated as /* since, at worst, it will fail to rewrap something that should be rewrapped, rather than rewrapping something that shouldn't be. If you want to have a relative amount of indenting without rewrapping, you can use /* markup with an indent modifier. Markup with an indent modifier will adjust the indent in the block so that the left-most line has the indent specified and all other lines are adjusted to keep their same relative indent. The modifier is an absolute indent. Negative numbers will be ignored.

For example:

/*[4]
text text text
  text text text
text text text
  text text text
*/ 

Would become:

/*[4]
    text text text
      text text text
    text text text
      text text text
*/ 

4 spaces before the left most line, relative indenting maintained.
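
The indent adjustment shown above can be sketched in a few lines. This is only an illustration written in Python, not the actual guiguts (Perl) code; the function name apply_indent_modifier is made up for the example, and it assumes the block is indented with spaces rather than tabs.

```python
def apply_indent_modifier(lines, indent):
    """Shift a block so its left-most non-blank line starts at `indent` spaces.

    Hypothetical helper: relative indenting between lines is preserved.
    """
    if indent < 0:
        # The manual says negative modifiers are ignored.
        return list(lines)
    non_blank = [ln for ln in lines if ln.strip()]
    if not non_blank:
        return list(lines)
    # Smallest existing indent among the non-blank lines.
    current = min(len(ln) - len(ln.lstrip(" ")) for ln in non_blank)
    shifted = []
    for ln in lines:
        if ln.strip():
            shifted.append(" " * indent + ln[current:])
        else:
            shifted.append("")
    return shifted


block = [
    "text text text",
    "  text text text",
    "text text text",
    "  text text text",
]
for line in apply_indent_modifier(block, 4):
    print(line)
```

Only the common leading indent is removed before the new indent is added, which is what keeps the relative indenting intact.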

You can call the rewrap function quickly with Alt-s-r (Press Alt s then r without letting go of the Alt key)

Text inside /# #/ markup will be rewrapped using the Block Rewrap margins. Block rewrap (by default) will rewrap text from column 5 to column 72. The same rules apply to block rewrap markup as to non-rewrap markup: the markup should be on a line by itself and there must be a blank line before the opening and after the closing markup. There are ways to override the block markup defaults. If you put margin numbers on the opening line, it will use those numbers for the margins instead of the defaults. They must be formatted thus: ( /#[x.y,z] ) The first number is the general left margin override. ( /#[x] ) It will indent all of the lines x spaces. If there is a period and a second number, ( /#[x.y] ), the first line will be indented y spaces and the rest x. If there is a comma followed by a number, ( /#[,z] ), it will override the default right margin setting. You can override the margins in nearly any combination. If you override the first line (y) you will need to have an x value, otherwise the y will be used for all of the lines, and if you have both a left margin and right margin setting, the left margin needs to come before the right: /#[,z.yx] won't work, at least not like you'd expect.
For example:

/#
Text text text
text text text
#/

will be indented and rewrapped using the standard block rewrap margins.

/#[6,53]
Text text text
text text text
#/

will block rewrap with a left margin of 6 and right margin of 53 instead.

/#[2]
Text text text
text text text
#/

will use a left margin of 2 and a standard block wrap right margin.

/#[4.6,70]
Text text text
text text text
#/

Will have first line margin at 6, the rest of the lines at 4, and wrap after column 70.

And so on.
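
To make the /#[x.y,z] format concrete, here is a minimal sketch of how such an opening line could be parsed. It is written in Python purely as an illustration and is not how guiguts (a Perl script) actually does it; parse_block_open and the dictionary keys are hypothetical names, and the defaults are the block rewrap defaults quoted above.

```python
import re

DEFAULT_LEFT, DEFAULT_RIGHT = 5, 72  # default block rewrap margins from the manual

# Optional [x.y,z] after /# : x = left margin, y = first-line margin, z = right margin.
_OPEN = re.compile(r"^/#(?:\[(\d+)?(?:\.(\d+))?(?:,(\d+))?\])?\s*$")


def parse_block_open(line):
    """Return the margins requested by a /# opening line, or None if it is malformed."""
    m = _OPEN.match(line)
    if m is None:
        return None
    x, y, z = m.groups()
    if x is None and y is not None:
        # Per the manual: with no x value, y is used for all of the lines.
        x = y
    left = int(x) if x is not None else DEFAULT_LEFT
    first = int(y) if y is not None else left
    right = int(z) if z is not None else DEFAULT_RIGHT
    return {"left": left, "first": first, "right": right}


for opener in ("/#", "/#[6,53]", "/#[2]", "/#[4.6,70]", "/#[,70.6]"):
    print(opener, parse_block_open(opener))
```

The last opener puts the right margin first, so the pattern does not match and the sketch returns None. How guiguts itself reacts to that form is not specified beyond "won't work, at least not like you'd expect".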

You can call the block rewrap function quickly with Alt-s-b (Press Alt s then b without letting go of the Alt key)

The markers /p p/ have special meaning to the rewrap function and the HTML autogenerate function. In rewrap, text inside /p p/ will be treated the same as text with the markup /*[4] */. In other words, it will get an absolute 4 space indent on its left-most line and maintain relative indents.

During HTML Autogenerate, text in the /p p/ markup will use special poetry markup styles.

The markers /f f/ are special "Front material" markup. They are meant to be used to enclose the title, author, publishing data, etc. at the front of a text. During rewrap, they will be treated the same as /$ $/; no rewrap, no indent. During HTML autogenerate, they will automatically ensure that the front material is centered.



Adjust the rewrap margins? - Under the Prefs menu there is a selection where you can change the default rewrap margins for both standard and block rewrap. There are some common sense limitations on the allowed margins: there must be a number selected for each, and the right margin cannot come before the left margin.



Check for mismatched (orphaned) brackets? - Under the Search menu there is a function dedicated to finding mismatched brackets and rewrap markup. Select which to search for and press search. It will find all of the suspect markup and let you cycle through and check each (press next).



Check for mismatched (orphaned) HTML markup? - Under the Fixup menu, select HTML fixup. Near the bottom of the window that pops up is a button "Find orphaned markup".



Remove trailing blanks? - There is a dedicated function under the fixup menu, or it can be run as part of the fixup function under the fixup menu, or you could do a regex search and replace '\s+$' => '', (search for 1 or more spaces at the end of a line and replace with nothing), and perform a replace all.
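
For anyone who wants to try the same idea outside guiguts, here is a small Python sketch of the search and replace described above. It is an illustration only. One difference is deliberate: when a whole file is processed as a single string in Python, \s also matches newlines, so the sketch uses [ \t]+$ to avoid swallowing blank lines. Whether guiguts applies its pattern line by line is an assumption here.

```python
import re


def strip_trailing_blanks(text):
    """Remove spaces and tabs at the end of every line, keeping blank lines."""
    # re.MULTILINE makes $ match at the end of each line, not just the end of the text.
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)


sample = "first line   \nsecond line\t\n\nlast line  "
print(repr(strip_trailing_blanks(sample)))
```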



Find / fix spaced hyphens/em dashes? -  The Fixup function under the Fixup menu has a sub function which will attempt to fix all spaced hyphens and em dashes it finds. Running gutcheck will also find any spaced hyphens or em dashes.



Check for consistent hyphenization? - Under the Fixup -> Word frequency function there is a sub function that will sort out all of the words with hyphens in them and display them with the frequency they occurred. If there is a word that is identical to one of the hyphenated words only without a hyphen, it will be displayed as well with a string of asterisks next to it **** so it can be easily picked out.



Check for consistent accents? - Under the Fixup -> Word frequency function there is a sub function that will sort out and display all of the words in the text that contain an accented letter. If there are any words that are identical except they have an unaccented letter, they will also be displayed with a string of asterisks next to it **** so it can easily be picked out.



Check for unusual or discouraged characters? - Under the Fixup -> Word frequency function there is a sub function that will make a list of the different characters that appear in the text and how many times they appear. White space characters are represented by their names rather than the actual character. This makes it easy to check for mismatched brackets, upper ASCII, tabs, etc.



Check for unusual capitalization? - Under the Fixup -> Word frequency function there are the sub functions Check All Caps and Check Mixed Case.
Check All Caps will sort out and display all of the words that have no lower case letters. (10TH would be displayed, even though it contains digits, since it contains no lower case letters.)
Check MiXeD CasE will display all words with a mixture of lower case letters and at least one upper case letter not at the beginning of the word.
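
The two tests described above are simple enough to sketch. The following Python is an illustration, not the guiguts code. The function names are invented, and requiring at least one upper case letter for "all caps" (so that a bare number like 1234 is not listed) is an assumption.

```python
def is_all_caps(word):
    """True for words with no lower case letters and at least one upper case letter."""
    return any(c.isupper() for c in word) and not any(c.islower() for c in word)


def is_mixed_case(word):
    """True for words with lower case letters plus an upper case letter after position one."""
    return any(c.islower() for c in word) and any(c.isupper() for c in word[1:])


words = ["10TH", "Tenth", "MiXeD", "CasE", "plain", "ALL"]
print([w for w in words if is_all_caps(w)])
print([w for w in words if is_mixed_case(w)])
```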



Check spelling? - Either under the Search menu or in the word frequency routine. Search -> Spell Check will spell check like a traditional spell checker, i.e., it will scan the file for unrecognized words, highlight them in the text and suggest several possible replacements. It will learn from mistakes, so that if you have a word misspelled the same way several times and you use a particular replacement, it will be moved higher in the list of possible replacements for subsequent occurrences.
Under the Word Frequency routine, it will scan the file, then return a list of unrecognized words that you can quickly look through to get an idea of how spellcheck intensive the file will be. It also makes it relatively easy to quickly pick out obviously misspelled words rather than just unrecognized ones.



Check for scannos? - Either under the Search menu or in the word frequency routine. Search -> Stealth Scannos will pop up a modified search window that will allow you to load predefined pairs of words that are commonly misscanned for one another and easily cycle through them. The word pair files are in the scannos directory under the guiguts directory.
Under the Word Frequency routine, it will scan through the file and pick out the words that appear in the scannos word pair files so you can quickly look through the list. You can also load a file called misspelled.rc, which contains the top 3500 or so most common misscanned letter combinations.



Right justify poetry line numbers? - Under the fixup menu. This function will move any numbers that are the last characters in a line and separated from any other text by at least two spaces over against the right margin (as specified by the rewrap right margin.)
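
As an illustration of the rule just described (a number at the very end of a line, separated from the text by at least two spaces), here is a Python sketch. It is not the guiguts implementation. The function name is hypothetical, 72 is the default rewrap right margin mentioned elsewhere in this manual, and leaving a line alone when the number will not fit is an assumption.

```python
import re

# Text, then two or more spaces, then digits running to the end of the line.
_TRAILING_NUMBER = re.compile(r"^(.*\S) {2,}(\d+)$")


def right_justify_line_number(line, right_margin=72):
    """Push a trailing line number out so that it ends at the right margin."""
    m = _TRAILING_NUMBER.match(line)
    if m is None:
        return line
    text, number = m.groups()
    pad = right_margin - len(text) - len(number)
    if pad < 2:
        # Not enough room to keep the two-space gap; leave the line as it is.
        return line
    return text + " " * pad + number


print(right_justify_line_number("The curfew tolls the knell of parting day,  5", 60))
```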



Easily find mismatched quotes? - A common error in gutcheck is mismatched quotes. To quickly find them, click on the gutcheck warning 3 times. This will move the cursor to the end of the paragraph with the mismatched quotes and change the focus to the text window. Press "Control-Shift-Up Arrow" to select the paragraph before the cursor, press "Control-Shift-Double quote" to highlight all of the double quotes in the paragraph. Makes it much easier to pick out the missing/extra quote marks. (Also works for "Control-Single Quote" for single quote mismatches.)  Press "Control-Zero" to remove all highlights.



Use the ASCII Box drawing tool? - This will draw ASCII art boxes around selected text. The selection MUST start and end on a blank line. You can change which characters are used for drawing by changing them in the ASCII Boxes pop up window. You can adjust the size of the boxes (default 64 wide). If you elect to rewrap the text as you draw, it will rewrap the text to fit inside the box with a minimum of one space between the text and the frame (default rewrap 60). You can choose to left justify, center or right justify the text within the box. (If you do use ASCII boxes for a Gutenberg text, it is recommended that they be indented at least two spaces to prevent rewrapping during whitewashing.) See an actual example here. (will open in its own window)



Keep track of what I have done with a file? - Under the Help menu there is a Function history that keeps track of what major functions have been performed on a file with a time stamp.



Enter accented characters? - Under the Help menu there is a Latin-1 function that will pop up a little window with all (well, most) of the 8 bit Latin-1 characters. Click on a character to insert it at the cursor.



Transliterate Greek passages? - Under the Help menu there is a Greek transliteration function that will pop up a little window with all common Greek character glyphs. You can select to get Latin character transliteration, Greek character names or HTML codes as output. Click on a glyph to insert the resulting code at the cursor.

There is a newer, more comprehensive transliteration scheme based on beta encoding available for those who want to preserve more information about accented characters. For unaccented characters, the transliteration is the same as the Perseus method (What we use on the site and guiguts has used up to now.)  Beta encoding provides a method to preserve the accents. There are basically eight accents that you need to deal with for Greek, they are detailed below: (You will need a Unicode aware font to view the examples in the chart.)

Popular name                                      Greek name    symbol  example encoded
rough breathing mark                              diasia        (       a(
soft breathing mark                               psili         )       a)
acute                                             oxia          /       a/
grave                                             varia         \       a\
iota subscript                                    prosgegrammi  |       a|
tilde (or inverted breve, depending on the font)  perispomeni   ~       a~
diaeresis (rare)                                  dialyctika    +       ϋ y+
breve (rare)                                      vrachy        =       a=
macron (very rare)                                macron        _       a_


To encode a character in beta code, transliterate the base character as normal. Then, starting from the highest point and working from left to right, place the symbols for the various accent marks after the base character. Stack as many accent symbols as needed to make the character. For example, a character with the base letter Ô carrying a rough breathing mark, an acute accent and an iota subscript would be Ô(/|. There is a utility box at the bottom of the Greek transliteration window to help assemble accented Greek characters. Type in the base character, select the accents you want from the list, and press enter to place the character in the transliteration window.
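
The stacking rule can be sketched as a simple lookup. This Python snippet is an illustration only and is not taken from guiguts. The dictionary keys and the function name beta_encode are made up for the example, the symbols come from the chart above, and the caller is assumed to list the accents in the order the rule requires (highest first, left to right).

```python
# Accent symbols as listed in the chart above; the key names are hypothetical labels.
ACCENT_SYMBOLS = {
    "rough breathing": "(",
    "soft breathing": ")",
    "acute": "/",
    "grave": "\\",
    "iota subscript": "|",
    "perispomeni": "~",
    "diaeresis": "+",
    "breve": "=",
    "macron": "_",
}


def beta_encode(base, accents):
    """Append the symbol for each accent, in the order given, to the transliterated base."""
    return base + "".join(ACCENT_SYMBOLS[name] for name in accents)


print(beta_encode("a", ["rough breathing", "acute", "iota subscript"]))
print(beta_encode("y", ["diaeresis"]))
```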



Use tear off menus? - When you click on one of the menu items at the top of the program window, there is a dotted line separator up near the top. Click on the dotted line to "tear off" the menu and leave it open on your desktop. Especially useful when doing bulk indenting under the Selection menu.



Make the displayed text bigger/smaller/a different font? - Under the Prefs menu, select Font. This will allow you to modify many of the display properties to suit your preferences. It will not have any effect on the text files; it only affects the display properties.



Change Aspell dictionaries? - Start spell check from within guiguts. Click on the options page. The list box in the center of the window that pops up lists all available dictionaries. The currently loaded dictionary is listed at the bottom. Double click on a dictionary to switch to that dictionary. Press OK. Close and restart Spell check to spell check the current document using the new dictionary.



Set up External program calling parameters? - Call any program using the same parameters that would be used in the Windows Start->Run box or at a command prompt. For Windows, if you have a registered extension, you can start the associated program automatically by using 'start [filename]'. Some programs may require rundll [filename]. For instance, to open a web page using the default browser, enter 'start http:\\www.pgdp.net' (without the quotes). If you are calling a program that has a space in the path name, you must enclose the program name in double quotes, e.g. "C:\Program Files\Accessories\wordpad.exe". I have included a few examples. Click on setup at the bottom of the External menu to see/edit them. You can also edit the setting.rc file directly if you prefer. Make a backup copy first though, if you choose to go that route. Changes made to the external calling parameters will not be visible in the menus until guiguts is closed and restarted.

There are a few internal variables exposed for use in calling external modules, if desired.  The exposed variables are:
$d - the directory path to the currently open file
$f - the name of the currently open file (without extension.)
$e - the extension of the currently open file.

In other words, the full canonical name of the open file is $d$f$e.

$i - the (i)mage directory with full path
$p - the file number corresponding to the (p)age where the cursor is in the currently open file.

For example, you can pass the name of the png file of the current page to a program using the command: "C:\some\path\program.exe $i$p.png". Or, under Windows, pass the current file to your default handler with "start $d$f$e" (useful to view HTML files). Note: if you try to use any of these variables when they are not set, you will get errors. For example, trying to use $f before you have opened a file will not be successful.
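
To show how the variable substitution fits together, here is a rough Python sketch. It is an illustration, not the guiguts code. The function name expand_command and the sample paths are hypothetical, and it assumes that $d and $i end with a path separator and that $e includes the leading dot, which is what "$d$f$e is the full name" suggests.

```python
import re


def expand_command(template, variables):
    """Replace $d, $f, $e, $i and $p in a command line with their current values."""

    def substitute(match):
        name = match.group(0)
        value = variables.get(name)
        if value is None:
            # Mirrors the note above: using an unset variable is an error.
            raise ValueError(name + " is not set")
        return value

    return re.sub(r"\$[dfeip]", substitute, template)


# Hypothetical values for a file C:\books\mytext.txt with the cursor on page 017.
current = {
    "$d": "C:\\books\\",
    "$f": "mytext",
    "$e": ".txt",
    "$i": "C:\\books\\pngs\\",
    "$p": "017",
}
print(expand_command("start $d$f$e", current))
print(expand_command("C:\\some\\path\\program.exe $i$p.png", current))
```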




Change log history:


Version .612(487k) Fixed problem where scannospath variable would sometimes get corrupted in settings.rc file. Order-of-operations error. You would think that I would have checked that; look at the first line in the .611 changes note. Apparently not....

Fixed problem with /F .. F/ markup not being correctly handled in HTML auto generate.

Fixed problem where Remove Blank Lines Before Page Separators  would go into an endless loop at the first separator.


Version .611(487k)
Fixed problem where pngspath variable would sometimes get corrupted in bin file. Order-of-operations error.

Tweaked autosave indicator a bit. Depending on several factors, the file may not autosave after the first autosave interval has expired, even if changes have been made. It will autosave after each subsequent autosave interval (assuming changes have been made in the meanwhile). Manually save the file once to sync up the autosave function if it bothers you.

Fixed problem with Font selection dialog not showing up correctly.

Tracked down and fixed problem with Footnote moving code.

Hopefully I've now found and fixed all the things I messed up while refactoring. I REALLY need to write a test suite to automatically exercise the script after I've edited it.

Added some code to check if there is a caption for an illustration when inserting html illustration markup and avoid inserting bogus <p> mark if there isn't.



Version .61(487k) Fixed minor problem in Word frequency sort routines where words that contained an upper case Æ ligature were not ending up in the expected positions.

Twiddled around with the middle button auto-scroll function. Added ability to auto-scroll in x axis as well as y.

Fixed problem with settings save function where it wasn't properly quoting the $jeebiesmode value.

Fixed problem with the gutcheck view option “Carat character” not responding to the view selection.

Added a bunch of stuff to improve auto-save. The auto-save timer will now be reset every time a file is saved or loaded, whether manually, as the result of some other operation, as the result of an auto-save, or when you right click on the Save icon in the toolbar (the little floppy disk). The Save icon now changes its background color to green if auto-save is enabled. When the auto-save timer is down to ten seconds from performing an auto-save, the background of the Save icon will start flashing yellow. Right click on the flashing icon to reset the timer if you want to skip a save. Shift Right-click on the Save icon to toggle auto-save on and off.

When you save a file now, the current position of the insert cursor is saved as well in the bin file (as $bookmarks[0]). When the file is reopened, the cursor and view will automatically return to the saved position, or the top of the file if no position was saved.

Modified the HTML auto-generate routine to optionally (checkbox) not convert non iso-8859-1 characters to numeric entities, ie, leave them as UTF-8. It will also make an attempt to modify the character encoding in the HTML header to UTF-8. This may fail if you have customized your header.txt file, so you may need to check that the charset encoding is set correctly after generation. Note: It is probably better to leave most English language texts as iso-8859-1 encoding, even if they contain a few characters outside of it. The non iso-8859-1 characters will be encoded as numeric entities and will work fine (assuming you have a browser/font which can display those characters, which is a totally separate issue). This is really only intended for texts that are all, or mostly non iso-8859-1 characters. (Mostly for DPEU, in other words.)

Modified Selection pop-up to update the selection parameters every time you modify the selection. Probably more useful that way.

Modified block selection code to select to the end of the line of all internal lines of the selection if the last line of the selection is selected to the end. Before, you could not select any further to the right than the end of the last line of the selection so it was difficult to select a block on the right side of ragged edge text unless you artificially padded the end line with spaces. It is possible that this behavior will be undesired in some instances, if so however, you can get back the old behavior by putting one extra space on the end of the last line and selecting up to just before that space.

Rewrote the code for the HTML pop-up window. I shuffled a few of the buttons around to allow me to factor out some common code. All of the same functionality with about 150 fewer lines of code. Much easier to maintain.

Rewrote Table Fx pop-up window.  Factored out common code. Removed about 50 lines. Changes should be completely invisible to end user.

Rewrote settings save routine to be less error prone when making modifications. The internal layout of the setting.rc file has changed but it is backward and forward compatible.

Went through entire script, editing to follow better coding practices and factoring out common code. Shouldn't affect end user much, if at all, but makes maintenance easier. Touched probably well over 1000 lines of code. Tried to exercise all the changes to make sure I didn't break anything. Likely that I missed something somewhere though.


Version .601(487k) Recoded the various sort routines for the Word Frequency functions using Schwartzian Transforms to cut down on the processing time. Significantly decreased time to sort the lists for large data sets. In the process, I fixed the error that version .60 had if you tried to sort character counts by length. (Which wasn't much use anyway...)
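
For readers unfamiliar with the term, a Schwartzian Transform computes each sort key once, sorts on the precomputed keys, then throws the keys away (decorate, sort, undecorate). The sketch below shows the idea in Python rather than Perl. It is an illustration only, and the particular key (word length, then case-insensitive spelling) is an assumption chosen for the example.

```python
def sort_by_length(words):
    """Sort words by length, then alphabetically, computing each key only once."""
    # Decorate: pair every word with its precomputed sort keys.
    decorated = [(len(word), word.lower(), word) for word in words]
    # Sort: tuples compare element by element, so no key is recomputed per comparison.
    decorated.sort()
    # Undecorate: keep only the original words.
    return [word for _length, _folded, word in decorated]


print(sort_by_length(["pear", "Fig", "apple", "kiwi"]))
```

The saving comes from avoiding repeated key computation inside the comparison, which matters when the key is expensive and the list is large.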

Added some code to see if your local perl installation has the Text::LevenshteinXS module available, and uses it if it is, to calculate Word Frequency harmonics. I in-lined some code from the pure perl Text::Levenshtein module but the compiled XS module is much faster.  Speeds up the harmonics functions by several orders of magnitude. It is recommended that you install Text::LevenshteinXS if at all possible.

Fixed bug in gutcheck display code where multiples of the same query on a single line were causing index problems. Worked around problem by only querying the first instance on a line. (They almost always stem from word queries on markup anyway.)

Modified HTML link checker to pick up images embedded in CSS styles.


Version .60(486k)
In honor of the version number (.60), guiguts will now work with .6x versions of Aspell. Many thanks to bgalbrect for puzzling out the difference between the versions' command lines and submitting patches to get it working. Still backward compatible with .5x versions too. As of now, there still isn't a generally available .6x version compiled for Windows (that I am aware of), so Windows users are kind of stuck with .5x for the time being.

Modified rewrapping routine to ignore <sc> </sc> markup while rewrapping.  Actually, it will ignore all markup enclosed in <> brackets (As long as there are no spaces in it.) except <i></i>, for which it will allow one space. (for when it gets converted to _ _.)

Tweaked the search and Replace histories to store non-Latin-1 characters correctly. Previous changes I had made prevented the histories from corrupting the setting.rc file but there were still issues with Unicode > ordinal 255. Hopefully this will resolve them completely.

Modified sort orders in various windows, (Word Frequency, harmonics, link check, etc.) to use a "natural sort" where numbers are sorted by magnitude and words are sorted alphabetically. They used to sort numbers alphabetically, (well, "ascii-betically") so that, for instance, numbers would be sorted like: 10, 2, 300, 4, 45. Now those numbers will be sorted 2, 4, 10, 45, 300.
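
The difference between the two orderings is easy to demonstrate. This Python sketch is an illustration of a "natural sort" key, not the guiguts routine, and the function name natural_key is hypothetical. It splits each string into digit and non-digit runs so that digit runs compare by magnitude.

```python
import re


def natural_key(text):
    """Build a sort key in which runs of digits compare as numbers."""
    key = []
    # With a capturing group, re.split returns text, digits, text, digits, ...
    for index, part in enumerate(re.split(r"([0-9]+)", text)):
        if not part:
            continue
        if index % 2:
            key.append((0, int(part), ""))        # digit run: compare by magnitude
        else:
            key.append((1, 0, part.lower()))      # text run: compare alphabetically
    return key


numbers = ["10", "2", "300", "4", "45"]
print(sorted(numbers))                   # plain string sort
print(sorted(numbers, key=natural_key))  # natural sort
```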

Added an option to sort by word length (secondary sorts alphabetically). There was limited room to squeeze it in, so the radio button labels are a little cryptic. "Alph" means sort alphabetically (natural sort), "Frq" means sort by word frequency, and "Len" means sort by word length. Changing the sort order will not automatically re-sort the list, you'll need to select a sort order, then select a function to re-sort it.

Modified the first and second harmonics functions to be much less complex. Sped up the second harmonic function by several orders of magnitude at the cost of marginally decreasing the speed of the first harmonic. Both harmonics functions now take about the same amount of time. Both will now handle Unicode characters much better.

Modified behaviors of various list boxes slightly. They no longer have a separate indicator for the 'active' and the 'selected' items. (Minor and probably unnoticeable change.) They also have had their right mouse click actions changed to occur on button release rather than button press. The right mouse button will act on the item under the mouse pointer reliably, even if it is not the currently selected item.

Twiddled around with the Footnote Fixup a bit. Changed the Landing zone code to be much less user unfriendly. Landing zone positions are now located just before the footnotes are moved. They are still denoted by the FOOTNOTES: notation, but there is no underlying significance. You can now add a landing zone by just typing in FOOTNOTES: (on a line by itself, with a blank line after). And you can remove a landing zone by simply deleting the FOOTNOTES: marker. The file will be scanned for landing zones when it is ready to move the footnotes. If you don't have a valid landing zone for any/all of the footnotes, one will be automatically inserted at the end of the file to receive the orphan footnotes.

HTML autoconvert will now convert <tb> to a horizontal rule. (same as the asterisk thought break).


Version .593(484k) The search and replace histories were still causing problems. They didn't handle non-Latin-1 characters very well and had problems with embedded meta and control characters. After messing around with a fragile and somewhat bizarre scheme, I realized I was an idiot and changed the save routine to simply encode all non word characters as their hexadecimal ordinals. This very neatly side steps the issue. It requires absolutely no programming changes to the load routine, removes the necessity to check for wide (multi-byte) characters on save, and is backward compatible. A win all around.
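
The idea of storing every non-word character as its hexadecimal ordinal can be sketched as follows. This is Python, not the guiguts code, and the \x{...} notation is an assumption about the stored format; the change note does not say exactly what is written to the file.

```python
import re


def encode_term(term):
    """Replace every character outside A-Z, a-z, 0-9 and _ with \\x{hex ordinal}."""
    return re.sub(
        r"\W",
        lambda m: "\\x{%x}" % ord(m.group(0)),
        term,
        flags=re.ASCII,  # treat accented and wide characters as non-word too
    )


def decode_term(stored):
    """Turn \\x{hex} sequences back into the characters they stand for."""
    return re.sub(
        r"\\x\{([0-9a-f]+)\}",
        lambda m: chr(int(m.group(1), 16)),
        stored,
    )


term = "caf\u00e9 $1\n"
stored = encode_term(term)
print(stored)
print(decode_term(stored) == term)
```

Because the stored form contains only plain ASCII letters, digits, underscores and escape sequences, quotes, control characters and wide characters in a search term can no longer break the file that holds them.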

Did away with something that has been bothering me for quite a while. The bin files now use the entire base file name as their base name with .bin appended.  Now if you have file.txt your bin file will be named file.txt.bin not file.bin. This will alleviate the problem of name space collisions between plain text and Unicode text or html files. Now you can have file.txt, file.utf and file.html and the bin files will be named file.txt.bin, file.utf.bin and file.html.bin. I have no idea why I didn't just do it that way from the beginning. It seems so much more sensible. Sigh. .593 will attempt to find file.txt.bin first, then will check for file.bin for backward compatibility. It will only save bin files with the new name format. If you want to return to an earlier version of GG you will need to manually edit the bin file name. (The internal structure hasn't changed.)
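
The lookup order described above (new style name first, old style name as a fallback) might be sketched like this. It is a Python illustration rather than the guiguts code, and the function names are hypothetical.

```python
import os


def bin_file_candidates(text_path):
    """Return bin file names to try: new style first, then the pre-.593 style."""
    base, _extension = os.path.splitext(text_path)
    return [text_path + ".bin", base + ".bin"]


def find_bin_file(text_path, exists=os.path.exists):
    """Return the first candidate that exists, or None if there is no bin file."""
    for candidate in bin_file_candidates(text_path):
        if exists(candidate):
            return candidate
    return None


print(bin_file_candidates("file.txt"))
print(bin_file_candidates("file.html"))
```

The exists parameter is there only so the lookup can be tried out without touching the disk.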

Guiguts will now attempt to make a back-up copy of your bin file every time you save. Now, if your bin file gets corrupted, you should be able to recover a lot easier. The file will be the bin file name with .bak appended - file.txt.bin.bak.

Version .592(483k) Fixed stupid error on my part that was causing some saved terms in the search/replace history to corrupt the setting.rc file.


Version .591(483k) Added some logic to the status bar update routine to avoid unacceptable slowdowns in processing time while doing a gutcheck or tidy check on files that don't have any page numbers set. Most (all?) DP texts will have page numbers derived from the page separators, so it shouldn't have been an issue with DP texts. Working with a text that DIDN'T have the DP page separators would run slower and slower, till it eventually would grind to a halt, using 100% of the CPU. Should no longer be a problem. (A text with no page numbers will still be slightly slower to process, but it will be a constant delay of 25 ms or so per update rather than variable depending on how far into the file the cursor is.)

Added some code to the search term entry box for regex searches. Does continuous checking of the regex term while you are entering it. If it is a legal term, the text will be black, if it is NOT a legal regex, the text will turn red. Note: just because the term is LEGAL, it is not necessarily CORRECT.  Also note that there are some regex terms that while technically correct AND legal, will cause exceptions in the guiguts regex engine. (Like escaped alphabetic characters that are NOT a regex assertion, like \h or \y.) By the way, this is nothing new, it just becomes much more apparent with the continuous regex checking.
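
The continuous check amounts to trying to compile the term after every keystroke and coloring the entry according to the result. Here is a minimal Python sketch of that idea. It is an illustration only: Python's regex dialect is not identical to Perl's, and the function name regex_entry_color is made up.

```python
import re


def regex_entry_color(pattern):
    """Return the text color to use: black for a legal regex, red for an illegal one."""
    try:
        re.compile(pattern)
    except re.error:
        return "red"
    return "black"


for candidate in [r"\s+$", "[f-j]", "(unclosed", "[a-"]:
    print(repr(candidate), regex_entry_color(candidate))
```

As the note above says, a black (legal) term is not necessarily a correct one; the check only reports whether the term compiles.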

Added some logic to track the text box that last had focus, so that characters entered from the Latin-1 or Unicode pop-up menus and Unicode character entry will be inserted into the last field to have focus instead of always the main text window. A fairly obvious enhancement, but one that I was unsure of how to implement. As it turns out, it was pretty easy due to some other refactoring I had done several versions back. Hurray for OO methodologies.

Fixed typo in Regex quick reference where the description of the character class [f-j] erroneously left out g and h.

Added search and replace history drop-down menus to the S & R dialog. Press on the down arrow to the left of the entry box and the previous terms will be available to select. Select one and it will be automatically entered into the entry. By default, 20 terms will be saved. Duplicates will be condensed. Adjust the size of the history under the Prefs menu from 1 to 200 terms. (More than 50 or so is probably not a great idea.) Search and Replace histories will be saved from session to session. Clear the history by selecting Clear History from the top line of the history drop-down. All of the replacement term entries share a common history. I could have made them separate but it seemed overkill to me.

Tweaked the page separator fixup; Join, Keep Hyphen function to correctly join a leading emdash to the previous word.  This is very low occurrence, but relatively easy to fix. Note: plain Join Lines will NOT close up the naked leading emdash.


Version .59(474k) Modified Auto Save routine to not bother if there hasn't been anything changed since the last save.

Modified Word Frequency - Accent Check to do some special case checks for the ligature Æ. Previously it would flag as suspect words that had AE but not those with Ae.

Rewrote the Word Frequency harmonic search function to be MUCH more efficient. Sped it up by an order of magnitude. (~10 x faster than it was.)

Added a second harmonic function to the word frequency sort options. Very much like the first harmonic, only it will search for words within a Levenshtein edit distance of two. (The word can be derived from the root word with two or fewer edits: add, change or remove characters.) Again, it only displays words that are present in the open document, not all possible words. It takes about 1-2 seconds (P4 2.2 GHz) per letter in the root word to do its search, so long words may take a while. Be patient.
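
For reference, the Levenshtein edit distance can be computed with a short dynamic programming loop. The Python below is a sketch of the textbook algorithm, not the guiguts code (which is Perl and uses a different search strategy). The harmonics helper and its rule of listing every other word within the given distance are assumptions made for the example.

```python
def levenshtein(a, b):
    """Number of single-character adds, changes or removals needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # remove a character
                current[j - 1] + 1,                    # add a character
                previous[j - 1] + (char_a != char_b),  # change (or keep) a character
            ))
        previous = current
    return previous[-1]


def harmonics(root, words, order=1):
    """Words from the document's word list within `order` edits of the root word."""
    return [w for w in words if w != root and levenshtein(root, w) <= order]


document_words = ["cat", "cot", "coat", "dog", "cart", "cots"]
print(harmonics("cat", document_words, order=1))
print(harmonics("cat", document_words, order=2))
```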

Rearranged buttons slightly to accommodate the 2nd harmonic button. Changed wording slightly to fit into the space available. (Dropped the word "Check" from several buttons, abbreviated "harmonic" to "harm".)

When doing harmonic checks by doing a Ctrl+left click or arrow up and down, the harmonic function that is run will be either 1st harmonic, if you haven't run any harmonic functions before, or the last harmonic function that was run by pressing one of the harmonic buttons. In other words, every implicit harmonic you do will be of the order of the last explicit harmonic. If you do a 2nd harmonic by pressing the 2nd Harm button, then every implicit harmonic you do (Ctrl+left click or arrow up and down) will be a 2nd harmonic until you do a 1st harmonic by explicitly pressing the "1st Harm" button. This was actually a bug, or at least not intended, but I kind of like the effect so I left it that way. It removes the need to have different hot key combinations for the different harmonics.

Added the word that the harmonic is being computed for to the harmonic window header line.

On a trial basis, modified the page separator fixup functions to NOT remove spaces at the beginning of the line after the separator.  When I wrote the routines originally, it was a problem. With the separate formatting rounds, it is probably not as much of one. This should help with not losing formatting around the separators for pre-formatted blocks, (indexes, poetry, TOCs, etc.)

It has come to my attention that support for some of the Unicode lookup functions I use in guiguts was disabled in perl 5.8.5, 5.8.6 and 5.8.7.  I queried the perl maintainer who made those changes in the perl source code and got a rather feeble answer that yes, he had disabled it, no, he didn't have any handy substitute method to do what I was trying to do, and yes, he would probably re-enable it for perl 5.8.9. Sigh. In the meanwhile, for people who are using those versions of perl that lack Unicode Block support, I have written a script that will automatically download all of the latest information from www.unicode.org and rebuild all the Unicode scripts.  The script, named update_unicore.pl, is included with the distribution. It can be run at any time to update your perl installation to the latest Unicode information. (Right now, it is more up-to-date than that included with the latest perl distribution.) The script will be automatically run if you try to use functions that need the information and it is missing. Users of the perl runtime libraries I distributed do not need to run this script and indeed, should not.


Version .583(475k) Tracked down a few issues that were causing the page anchors to be moved to the end of the line under certain circumstances while auto-generating HTML. Fixed most of them. There are still a few extremely obscure circumstances that I know of that could cause it, but they will be trickier to work around. It should be much better anyway.

Fixed fairly serious bug in my file save/load routines that could corrupt certain UTF-8 characters if they were the last character on a line. Order-of-operations problem.

Added a bit more error detecting and reporting code for operations that use temporary files. Should help diagnose problems easier.


Version .582(474k) Worked on the status bar Goto functions to make them a bit more user friendly.

Added a "Goto Label" function to the Label readout to resemble the Goto Page and Goto Line functions.

Fixed all three Goto functions to avoid the problem where they would stop working when you tried to go to a non-existent destination.

Changed the Page Label configuration pop up to be activated by a right mouse click instead of left so it could pop up the Goto dialog on left mouse click (to make it more consistent with the other two.)

Added a label "Lbl:" to the Label status readout to be consistent with the other two. ("Ln:" & "Img:")

Modified Page Label status bar to read out "None" rather than stay blank if there were no label assigned to a page.

Changed the Insert/Overstrike status readout to just be I/O; saves room on the status bar for more critical information.

Added tool tips to all of the status bar readouts that did not already have one.

Added a "Normal" mode selection to the Jeebies interface pop up window. I had Paranoid and Tolerant as the only possible choices.


Version .581(474k) Fixed problem with image file opening introduced when I modified page/image tracking functions to work with alphanumerics.

Version .58(474k) Modified the page and image tracking functions to work with page/file names that contain alphabetic characters: 001a.png, 001b.png, etc. It is kind of a hack and may have subtle issues with blank page handling, but it is pretty close. I only have one real file to test it on and it seems to work ok for that and it doesn't seem to have broken functionality for normal files. I'm sure if something is broken, someone will find it. Note: It probably is not a good idea to have files that have NO leading digits, though it is theoretically possible now. There are a few very low use functions which won't work with alphanumeric page numbers; the original page renumber function, for instance. That is deprecated and very low use though, so I didn't feel it necessary to rewrite it.
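One way to keep names like 001a.png in page order is to split each name into its leading digits and whatever follows. The sketch below is illustrative Python, not the guiguts code; page_sort_key is a hypothetical helper, and sorting names with no leading digits first is purely an assumption.

```python
import re


def page_sort_key(filename: str):
    """Split '001a.png' into (1, 'a') so 001, 001a, 001b, 002 sort in page order."""
    stem = filename.rsplit(".", 1)[0]
    match = re.match(r"(\d*)(.*)", stem)
    digits, suffix = match.group(1), match.group(2)
    # Assumption: names with NO leading digits sort first. The note above
    # advises against using such names at all.
    return (int(digits) if digits else -1, suffix)


pages = ["002.png", "001b.png", "001.png", "010.png", "001a.png"]
print(sorted(pages, key=page_sort_key))
```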

Fixed problem with Auto end landing zone function in Footnote Fixup. Under certain (common) circumstances, it would try to access an undefined variable and get confused.

Messed around some more with the page anchor/number insertion code trying to reduce bare span errors. Think I made it better. Not sure without more testing over a wide array of projects.

Modified how guiguts tracks which platform it is running under (which it checks often during various routines to determine which operations need to/can not be run.) Instead of doing it locally at each place where it needs the information, I am doing it once at the start of the script and assigning the value to a constant, which allows all of the subsequent checks to be optimized away by the compiler, leading to a smaller memory footprint and faster operation. (By very small amounts in the grand scheme, to be sure.) The big win is much improved maintainability.

Version .573(473k) Sigh. Accidentally redefined a variable, which prevented the header file from being loaded. Deleted the bogus line. Should be ok now.


Version .572(473k) A few minor tweaks and twiddles, only one of any import. I have now made the script sensitive to what directory it is in in relation to what directory it is called from. It should now be able to find its support files even when started from a different directory from where it resides. (No longer necessary to cd to the directory before you run it.) This should have been an obvious change, but I long ago set up my system to sidestep the issue and it just never occurred to me that not everyone would, or could, or wanted to.


Version .571(472k) Apparently, the modifications I made to add checks for small cap markup, to put it bluntly, didn't work. Went back in and fixed several stupid errors to get it working and added some code to do better boundary condition checking.

Modified fixup function to not remove spaces before a full stop if it is followed by a digit. Tweaked a few other regexes to run a bit more efficiently.


Version .57(472k) Modified guiguts to provide an interface to jeebies. I am not including jeebies in the guiguts distribution since it is larger than a whole guiguts distribution including gutcheck. Either get it from this forum thread or from the sourceforge page. (Not current as of this writing.) Guiguts will only interface correctly with version .12 or above. Run jeebies from the fixup menu just below gutcheck. Much like gutcheck, the first time you run jeebies, it will ask you to locate the executable. It will then pop up a window with the suspect occurrences of he & be, each one clickable to jump directly to the queried phrase.

Updated to the release version of gutcheck .99. If you compile your own, please recompile to get the release version, which does have a few bug fixes. I inadvertently released guiguts .561 with a pre-release version of gutcheck .99.

Added search for orphan small caps markup to the HTML orphans search. It isn't really HTML markup, but it follows the rules for it, so it was easiest to just add it in there.

Removed "-mustexist" directive from the directory chooser for the pngs directory as it was causing trouble under *nix.



Version .561(487k) Fixed problem with Aspell dictionaries not being displayed in the spellcheck options dialog. I broke the dictionary loading routine when I changed to lexical three argument opens a few versions back. I didn't realize the syntax for 3 argument opens was slightly different from 2 argument opens.

Fixed a few problems with grossly oversized fonts (most noticeably on the proofers pop up). When I changed the font handling code a few versions back, I missed updating a few spots. I think I've fixed the rest.

I liked the drag pad on the main window so much that I added it to most of the pop up windows that might need to be resized. It required that I make optional scrollbars compulsory, but I think the trade off was worth it.

Finally noticed that Jim released .99 gutcheck 4 months ago, grabbed a copy and updated the gutcheck interface to work with the newest features. I had done quite a bit of it about a year or so ago when I was working with a pre-release copy, so getting it finalized didn't take too long.

Now including gutcheck .99 with the distribution. (Bump from .97)


Version .56(462k) Worked quite a bit on layout and sizing issues. Fixed quite a few things that will likely be invisible to the average user, but bothered me.

Finally resolved issue with disappearing status bar if window was sized less than about 10 lines of text.  Now can be reliably sized down to one line of text without the status bar disappearing. I changed the way I was tracking window sizes so .56 is (slightly) incompatible with .551. The only real incompatibility is your saved window size will be off. (You may need to resize your window the first time you run it.)

Fixed problem with window size jumping when the line numbers were enabled/disabled.

Added a "drag handle" to the lower right corner of the text window to make it easier to resize the window. It could be quite fiddly to get the cursor exactly on the window border to click and drag it. Now there is a 14 x 14 pixel pad you can use to resize.

Rewrote and generally cleaned up some other code here and there. Not to add functionality so much as to improve maintainability.

Added a Unicode Character Search button to the tool bar. Just another way to access the function. Also available under the Help menu.

Fixed problem where HTML auto table was not closing cells correctly.


Version .551(461k) Fixed minor puzzling error where the Cut and Copy commands didn't work if invoked from the menu. The keyboard shortcuts still worked fine so it wasn't a huge issue.  Odd, because they both executed the same code, the menu commands were just invoking it indirectly. Changed so that both call it explicitly.

Version .55(461k) Made extensive modifications to the font handling code. Refactored to use more modern methods. Changes should be fairly transparent to the users but maintenance is much easier. You may notice a slight difference in sizes. The new methods store and use the size number slightly differently. When you first start up,  you may need to reset your font preferences.  The guiguts setting.rc file WILL NOT be backward compatible with previous versions unless you edit it to remove the $fontsize, $fontweight and $utffontsize variables. (In which case, they'll reset to defaults.)  Note: with the new font handling code, it is actually possible (and legal!) to have negative font sizes. Positive sizes are in points, negative sizes are pixel widths.

Worked quite a bit on the Unicode character search function. It is now more compact. The letter names, ordinals and blocks are now fixed size and font. Only the character itself is displayed in a variable font/size. I made a bunch of enhancements, some of which required some sacrifices. First the sacrifices:

The list is no longer scrollable with a mouse wheel. An unfortunate but necessary choice to gain a bunch of other functionality. There is still a scroll bar, you just need to use it directly.

You can no longer cut and paste the characters directly. Again, an unfortunate side effect, but one I think I have worked around quite nicely.

Now the benefits:

Left click on a character to automatically paste it into the text window at the cursor.

Right click on a character to stuff it into the clipboard buffer, you can then paste it wherever you want. Very handy for pasting characters into the search entry box without having to paste it into a document and then copy it.

Left click on a character description to pop up a window with that entire character block in it (using the same mechanism as the Unicode menu.) Even character blocks which AREN'T available through the Unicode menu are available this way.

Unicode Block coverage below hex FFFF is now complete. Every block is now available and properly identified in the character description.  I am now using a core module (script really,) to do the block/character lookups. The FFFF limit is due to a limitation in Perl/Tk, not Perl. Perl/Tk cannot handle characters with an ordinal greater than FFFF at this point.

Modified the Unicode character menu pop ups to also support the right mouse click to stuff the character directly into the clipboard buffer.



Version .546(459k) Changed some code in the guiprep import function that was causing trouble under Linux. Removed the option that prevented you from entering non-existent directory names. Replaced it with a manual check to see if the supplied directory name exists before trying to open it.

Fixed problem with HTML auto table function where it was erroneously removing spaces around <i> and <b> markup.

Fixed a problem with an unclosed file handle in the character count function which could cause file saves to fail. Went through the entire file and converted all of the file handle operations to use lexical file handles, which should avoid future problems with file handles staying open beyond their scope.


Version .545(458k) Messed around with the Unicode Character Search function some more. There's a saying among Perl programmers; "First make it work, then make it fast."  Yesterday I made it work. :-)  Today, I sped it up about 90%.


Version .544(458k) Modified header.txt file with new CSS for sidenotes that will not generate warnings at the w3c CSS validator. It WAS ok before, but w3c has updated their validator and it was carping about a missing foreground color attribute.

Added a Unicode Character Search pop up under the Help menu. Ever need a Unicode character but didn't know which character block it was in or where to look for it? Now use this handy code point search tool. Say you need a y with a macron. Now if you work with it often, you may know that it's in the Latin Extended-B character block. If not, it's trial and error searching for it. Now, you can pop up this tool, enter "y macron" (no quotes, case insensitive) into the Search Characteristics box and press Search. It will scan through the Unicode character names looking for one with those properties. It will quickly find:

 Ȳ   -   LATIN CAPITAL LETTER Y WITH MACRON  -  Ordinal 0232
 ȳ   -   LATIN SMALL LETTER Y WITH MACRON  -  Ordinal 0233

The actual character, the full name for the character and the hex ordinal of the character. You can easily cut and paste it.

Want to see if there is a character for a Maltese cross? Try it.

Search Characteristics - Maltese cross
 ✠   -   MALTESE CROSS  -  Ordinal 2720

Cool huh?

Now the caveats. (You knew there were caveats, didn't you?)

1) You can't use the tool to locate character with ordinals over hex FFFF. This is more a limitation of Perl/Tk than anything else. Perl/Tk only understands characters up to FFFF.

2) I chopped out the all of the CJK (Chinese-Japanese-Korean) ideograph blocks. There's just so darn many of them, it was seriously slowing down the searches. (It's none too speedy still...)

3) Not as critical, I chopped out the private use block too.  It is pretty unlikely that anybody is going to be using private use glyphs in a Gutenberg bound text anyway.

By eliminating 2 & 3 I reduced the search space by about 80% (and thus made the search about 5 times faster.)

While the search is ongoing, the text background will turn gray. On completion, it will turn white again. If you want to interrupt it, hit Stop. If you close the window while a search is in progress, you WILL end up with a bunch of (harmless) warnings in the console window. The results window uses the same font as selected in the Unicode character block pop up windows.
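For the curious, the general shape of such a name search can be sketched in a few lines of Python using its standard unicodedata module. This is not the guiguts code (guiguts is Perl and uses its own lookup tables). The function name search_characters is hypothetical, matching on whole words of the character name is an assumption, and the skipped ranges are only the main CJK ideograph block and the basic private use area.

```python
import unicodedata


def search_characters(query: str, limit: int = 0xFFFF):
    """Find characters whose Unicode name contains every word of the query."""
    terms = query.upper().split()
    hits = []
    for code in range(0x20, limit + 1):
        # Skip CJK unified ideographs and the private use area to keep
        # the search space small, as described in caveats 2 and 3.
        if 0x4E00 <= code <= 0x9FFF or 0xE000 <= code <= 0xF8FF:
            continue
        name = unicodedata.name(chr(code), "")  # "" for unnamed code points
        if name and all(term in name.split() for term in terms):
            hits.append((chr(code), name, f"{code:04X}"))
    return hits


for char, name, ordinal in search_characters("y macron"):
    print(f" {char}   -   {name}  -  Ordinal {ordinal}")
```

The default limit of FFFF mirrors caveat 1. Exactly which characters come back depends on the Unicode version bundled with the Python installation.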


Version .543(456k) Tracked down a bug in the file saving code where a relatively rare set of circumstances could block the file from being saved.

Tweaked the Page Anchor HTML code again based on the results of the testing done in the "Lots of Links" thread in the Post Processing forum.


Version .542(456k) Worked on the image filename handling code some more to try to get it to play nicely with jpeg image files. Turns out I had much more hard coded png extensions than I thought.

Tweaked HTML Page anchor insertion code to not add spurious paragraph markup if the page break falls within poetry. Still probably not perfect, but at least it tries to avoid it now.

Modified Generated HTML Page Anchor code again because Internet Explorer still wasn't liking it. (And my distaste for Internet Explorer is starting to tip toward complete disgust.)

Tweaked block markup overrides so that the first line indent will be repeated for each paragraph inside the block. I had specifically made this NOT happen at someone else's request several (many) versions back. I am changing it because it is much easier to add overrides to stop it than to add an override at each paragraph in the block. Note: If you need to change the overrides inside a block quote, you do not need to end each block separately. Any block end encountered ends ALL block quoting. For example, consider the following very boring passage:

/#[5.3,60]
Text text text text text text text text text text text text text text text text text text text text text text text text text text text text text
text text text text text text text text text text text text text text text text text text text text text text text text text text text text
text text text text text text text text text text text text text text text text text text text text text text text text text text text text text text
text text text text text text text text text text text text.

Text text text text text text text text text text text text text text text text text text text text text text text text text text text text text text
text text text text text text text text text text text text text text text text text text text text text text text text text text
text text text text text text text text.

/#[8.48]
Text text text text text text text text text text text text text text text text text text text text text text text text text text text text
text text text text text text.
#/

Rewrapping will indent the first lines of the first two paragraphs 3 spaces, the remaining lines of the first two paragraphs 5 spaces and the third paragraph 8 spaces. There is no need to specifically close both blocks.

Tweaked the wrapping routine to honor non-breaking spaces again. I had very carefully done this when I wrote the wrapping routine but later changed it to automatically convert non-breaking spaces to regular spaces on wrap because there was a burst of projects coming through with many spaces converted to non-breaking spaces during proofing, and it was causing lots of support questions about the wrapping routine "not working". I think the utility of honoring the non-breaking spaces outweighs the confusion it might cause though.

Added a non-breaking space to the Latin-1 character pop-up.


Version .541(454k) Tweaked the arbitrary character highlight dialog a bit. Added some buttons to automatically select the previous selection or the entire file.

Revised the image viewer calling code to be compatible with jpeg images. It was hard coded to expect pngs only. Now it is hard coded to expect pngs or jpegs only. If we start accepting more formats on the site, I'll need to make it more general. It still expects pngs and defaults to pngs, it just will accept jpegs now.

Fixed minor problem where if you pasted some text into an empty file and then saved it as a new file, the file name was not getting updated in the title bar or the recent file list.


Version .54(454k) Modified Generated HTML Page Anchor code to be compatible with Internet Explorer.

Added a new Highlight menu option in addition to the Highlight Single Quotes and Highlight Double Quotes; Highlight Arbitrary Characters. Choose whatever character or sequence of characters you would like to be highlighted in the selected text. Choose to highlight exact text or a regex. Exercise a little caution; if you select the entire file and choose to highlight '.' (any character) in regex mode, be prepared to wait a while.

Changed the quote (and arbitrary text) highlight color to a light lavender. It was the same orange as the search highlight.



Version .538(453k) Fixed problem where Save As would not change the name of the loaded file to the new file name.


Version .537(453k) Fixed problem where Save As would not create a new .bin file if it didn't already exist.


Version .536(453k) Worked on a few aspects of the file save routines (for both the text file and bin file) to avoid problems with read-only directories and/or files. If the directory or file is read-only, the save routines will attempt to modify the write permissions to allow you to save the file(s) anyway rather than failing with error messages. Hopefully this will banish the sporadic save issues under Windows. Note: Under Linux/Unix, if you aren't the owner of the directory/file, it still may fail with permission errors.

Fixed problem with out-of-order markup on page anchors that weren't within paragraphs during HTML auto-generate.

Fixed a few other minor problems that would cause warnings in the console window during HTML operations.


Version .535(453k) Well, this one is just embarrassing. Typed the wrong variable name in the settings save routine which was writing bogus values for the selected aspell dictionary. When you would try to run a spell check, it would have NO idea what you were asking it to do until you manually selected a dictionary to use. Entirely my fault.  .....Well.... I guess the whole thing is entirely my fault except for a couple of good bits here and there that various people submitted.... sigh.

Added another non-standard replacement assertion for Search and Replace:  \G .. \E - Greek Transliterate. (Don't expect to find this anywhere but guiguts, it is extremely non-standard. \G already has a defined meaning in regexes, but not a useful one under guiguts, so I am overloading it.) Very useful to do automatic transliteration. Say you've got things like [Greek: moly/bdides] or [Greek: kekryphalos] or [Greek: nykteri/des] or [Greek: pharmakon] scattered throughout your text, (like I do,) and you want to provide a Unicode version. Do a search:  (\[Greek: ((.|\n)+?)\]) and replace: \G$2\E and end up with  μολύβδιδες or κεκρυφαλος or νυκτερίδες or φαρμακον in one easy operation.
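The effect of that search and replace can be approximated outside guiguts with the Python sketch below. It is NOT the guiguts transliteration table: the letter mappings are a deliberately small, assumed subset (no eta/omega distinction, no breathings, lowercase only), and the handling of "/" as an acute accent on the preceding vowel is inferred from the examples above. The names to_greek and transliterate_markup are hypothetical.

```python
import re
import unicodedata

# Assumed, simplified mapping tables for illustration only.
DIGRAPHS = {"th": "θ", "ph": "φ", "ch": "χ", "ps": "ψ"}
LETTERS = {"a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
           "i": "ι", "k": "κ", "l": "λ", "m": "μ", "n": "ν", "x": "ξ",
           "o": "ο", "p": "π", "r": "ρ", "s": "σ", "t": "τ", "y": "υ",
           "u": "υ"}


def to_greek(latin: str) -> str:
    """Convert a Latin transliteration to (lowercase) Greek letters."""
    out = []
    i = 0
    while i < len(latin):
        pair = latin[i:i + 2].lower()
        ch = latin[i].lower()
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
            continue
        if ch == "/":
            out.append("\u0301")      # combining acute on the preceding vowel
        elif ch in LETTERS:
            out.append(LETTERS[ch])
        else:
            out.append(latin[i])      # pass anything unknown through unchanged
        i += 1
    greek = re.sub(r"σ\b", "ς", "".join(out))    # word-final sigma
    return unicodedata.normalize("NFC", greek)   # fold accents into one character


def transliterate_markup(text: str) -> str:
    """Replace each [Greek: ...] marker with its Greek rendering."""
    return re.sub(r"(\[Greek: ((.|\n)+?)\])",
                  lambda m: to_greek(m.group(2)), text)


print(transliterate_markup("the [Greek: pharmakon] and the [Greek: kekryphalos]"))
```

The regex is the same one given in the search term above; the lambda plays the role of the \G$2\E replacement.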


Version .534(451k) Making one last attempt at solving the problems with my new save routines before rolling back to the old ones. Fixed problem with file save if your temp directory was located on a partition other than the one where your project directory was. (Not uncommon for Linux/Unix systems.) The temp file is now saved in the same directory as the original file rather than the system temp directory. Hopefully this will help with the permissions problem too.

Fixed minor problem with prep file import functions. Now adds a newline before the page separator. If a file didn't end with a newline, the page separator was being appended to the last line of the previous file. Not horrible, but annoying.

Went through the settings save routine and ensured that any setting that could possibly contain a single quote would escape it on saving.


Version .533(449k) Fixed problem where file would sporadically not save with "Permission denied" errors, even if you did have write permissions in the directory.

Fixed a problem with the Save As function where the bin file was not being saved correctly.

Worked on the Footnote Fixup - Check footnotes function a bit, added another check, fleshed out the error mode descriptions a bit, made the header line not disappear while scrolling through the list.

Made Replace All work with a null replacement term.

Fixed a few problems with the Tidy check window where it would have index errors when it was run more than once in a session.

Fixed problem with Page Label popup where it would not remember previously set labels under certain circumstances.



Version .532(449k)  Fixed bug in save routine where it was saving with Unix style line endings, even under Windows.

Modified the search & replace dialog to allow you to replace with nothing (delete). Previously, if the replacement term was empty, no operation was performed.


Version .531(449k) Sigh. Bugs galore. Fixed stupid assumption in Page Label dialog that the image files will be contiguously numbered.

Fixed bug that caused the Page adjust buttons to come up blank under certain circumstances.

Fixed a few warnings that were showing up in the console.


Version .53(449k) Reworked new search dialog with multiple replacement terms to be user configurable whether to show single or multiple terms. Makes the dialog less cluttered when you don't need/want them, but they are instantly available if you do.

Completely disconnected the image numbers from the page labels. You can now edit the page labels without affecting which image is displayed when "See Image" is selected. Added a new pop up dialog linked from the HTML Fixup window where the page labels can be easily customized. You can now easily do page offsets, Roman or Arabic numbering, restart the numbering arbitrarily, skip pages in the numbering sequence, just about anything you could want. (Except compound numbers, e.g. 1-1, 1-2, 1-3, 2-1, etc. I still haven't figured how to handle those.) Since this is active, I have removed the redundant "Page Offset" features from the HTML window. To use the dialog, you must select whether a page will be Arabic, Roman numerals, or the same as the previous page. Then select an action for each page label: either add 1 to the previous page label, start over from an arbitrary number, or do not label. The start point must be an Arabic number, even for Roman labels, e.g. 5, not v. It sounds complex but is pretty intuitive to use. Once you have your layout arranged, press "Recalculate" to modify the labels to reflect the changes. (It was processor intensive to continuously monitor and recalculate the labels so I made it manual.) If the new labels are acceptable, press "Use These Values". When the HTML is generated, the page label will be used for the page anchors.
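The recalculation described above amounts to a single pass over the pages. The Python sketch below shows one plausible reading of it; it is not the guiguts code. The tuple layout, the names to_roman and recalculate, the lowercase Roman numerals, and the choice that unlabeled pages do not advance the counter are all assumptions.

```python
def to_roman(number: int) -> str:
    """Convert a positive integer to lowercase Roman numerals."""
    numerals = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
                (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
                (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = []
    for value, glyph in numerals:
        while number >= value:
            out.append(glyph)
            number -= value
    return "".join(out)


def recalculate(pages):
    """pages: (style, action, start) per page image; returns one label or None each.

    style  is "arabic", "roman" or "same" (same as the previous page)
    action is "+1", "start" (restart from `start`) or "none" (do not label)
    """
    labels = []
    style = "arabic"
    counter = 0
    for page_style, action, start in pages:
        if page_style != "same":
            style = page_style
        if action == "none":
            labels.append(None)   # assumption: counter is left untouched
            continue
        counter = start if action == "start" else counter + 1
        labels.append(to_roman(counter) if style == "roman" else str(counter))
    return labels


layout = [("roman", "start", 5), ("same", "+1", None), ("same", "none", None),
          ("arabic", "start", 1), ("same", "+1", None)]
print(recalculate(layout))
```

Note that the start value is given as an Arabic number (5) even for the Roman pages, matching the rule stated above.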

The page label information is saved to the bin file associated with the text. The page offset information is saved twice, redundantly for now, to retain backward and forward compatibility. After a few releases, when pretty much everyone has upgraded, I will probably start to phase out the older style offset tables.

Rearranged the status bar a little bit. Tweaked a few cells to conserve space. Added a Page Label readout to the status bar next to the Image number and See Image cells. It reads out the label that will be used for HTML generation. If no custom page labels have been configured, displays "No Label".

Made the HTML automatically insert visible page numbers by default. If you don't want visible page numbers, put "display: none;" in the CSS style for the pagenum class.

Worked on display bug where the status bar would be covered up when the window was reduced below a certain height. Came up with a "fix",  you can reduce the window height to about 3 lines of text and the status bar remains visible if the line numbers are off, otherwise, you can only reduce it to about 20 lines of text. Sorry, but that's probably about as good as I can do.

Rewrote Save routine to be a little more robust. The original Save routine would open the file on the disk, clear the disk file, then write the file in memory to the disk. This works fine most of the time, and, in fact, is the way the standard text widgets under Tk do it. However, during a save operation, after the disk file was cleared and before the file was completely transferred from memory, if there was some type of glitch (with guiguts, with the OS, with the hardware, whatever,) you would end up with a partial file and a funny look on your face. Not a particularly happy situation. Now the save routine writes the file in memory completely to a temporary file on the hard disk, verifies that it is intact, then renames the temporary file to the original filename. One obvious drawback to this is that there MUST be at least double the size of a file in free hard disk space or the file will not be able to be saved. (Shouldn't be much of an issue with today's hard disks.) The resultant data integrity is a worthwhile trade off though, in my opinion.
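The write, verify, rename sequence looks roughly like this when sketched in Python with the standard os and tempfile modules. This is an illustration of the pattern, not the guiguts save routine; safe_save is a hypothetical name. The sketch puts the temporary file in the same directory as the target, which is what the later .534 entry settled on.

```python
import os
import tempfile


def safe_save(path: str, text: str) -> None:
    """Write text to a temp file, verify it, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        # newline="" writes the text exactly as held in memory
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())      # push the data out to the disk
        with open(temp_path, "r", encoding="utf-8", newline="") as handle:
            if handle.read() != text:      # verify the temp file is intact
                raise OSError("temporary file does not match the text in memory")
        os.replace(temp_path, path)        # the original is replaced only now
    except BaseException:
        if os.path.exists(temp_path):
            os.remove(temp_path)           # never leave a partial temp file behind
        raise


work_dir = tempfile.mkdtemp()
target = os.path.join(work_dir, "book.txt")
safe_save(target, "first draft\n")
safe_save(target, "second draft\n")
```

If anything goes wrong before the rename, the original file on disk is still whole, which is the point of the exercise.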

Twiddled with the Greek transliteration tool a little bit. Tried to make the y/u <--> upsilon conversion a little more intelligent.


Version .52(445k) Fixed minor problem in Greek transliteration tool  where a 'gamma chi' converted to transliteration and back, was being rendered as 'nu chi'  rather than 'gamma chi'.

Made minor change to poetry CSS markup in header.txt file to make long lines of poetry in narrow browser windows wrap in a more correct manner.

Fixed a problem with indent overrides on blockquotes where the overrides were basically just being ignored.

Tweaked regex searches with newlines to also match newlines with a dot (.) character. Only affects searches that explicitly include a newline character in the search term.

Fixed problem where Control+` shortcut was not working for column paste. (F3 still was, so it wasn't desperate.)

Fixed obscure bug in wrapping code where standalone zeros would mysteriously vanish.

Added some import and export routines to import pre-DP text files where each file is one page named with 3 or more digits and a .txt extension (001.txt, 002.txt, etc.,) and to export a text file by splitting it into individual page text files named with the page number. This is mostly in response to several requests by PMs who would like to be able to use guiguts functions to work on pre-DP text files.

Modified the search dialog to have three separate replacement terms so you can easily select among a few possibilities when doing search and replace. (Much like under guiprep, but with regexes!) The hot keys (Control+Enter, Shift+Enter, Control+Shift+Enter) only work on the top replacement term; all others must use the mouse buttons. I may look into making the number of replacement terms configurable in the future but left it alone for now.

Changed status bar readout to display Img. (number) instead of Page (number). Strictly cosmetic at this point. In anticipation of  coming up with a mechanism to separate tracking of folio page image numbers from book page numbers.

Changed the directory selection dialog for selecting the image directory. Used a more standard dialog. Didn't really add any functionality, but made it easier to:

Added another binding to the See Image status readout. Right click will now bring up the image directory selection dialog.


Version .51(443k) Modified the proofer viewing functions to work with the upcoming four round modification to the site. It now defaults to displaying the user name for each "round": Proof 1, Proof 2, Format 1, Format 2.  If you are working with a file done when there were only the original two rounds, the Format round user will be listed as <none> and the counts will be 0 for those rounds. With a bunch more work I could have made the Format rounds not display if they weren't populated in the page separators, but I didn't really feel like going through all that bother for something that is (hopefully) going to be pretty temporary.  Added buttons to the proofer pop-up to be able to sort on the additional rounds.

The 4-round changes should be both backward and forward compatible. Four round files that are processed with pre .51 guiguts will work ok, they just won't handle the two extra rounds. The Format round user names/counts will be lost. Two round files processed with .51 or later will just have zeros & <none> for the format round page counts and user names.

If you want to play around with a (short, bogus) four round file, you can find one here.

Guiguts now automatically saves the file and .bin just before performing HTML autoconvert so you can back out again in case of trouble. The file will be saved to filename-htmlsave.txt and filename-htmlsave.bin. You are still encouraged to save the file which will be used to generate the HTML to a different name from the text file before autoconversion. This is intended to be a back-out mechanism for the autogenerate process.

Worked on regex search and replace to try to work around a few bothersome problems.
1) Not being able to use positive lookarounds in the search term while doing replacements. This is still not perfect, though much improved. In general, you can now use positive lookarounds, EXCEPT  if they contain a literal closing parenthesis. In that case, just capture the parenthesis and add it back in.

2) Not being able to include a literal dollar sign followed by a digit in the replacement text.  You can now use dollar signs in the replacement text, even when followed by a digit, you just need to escape the dollar sign with a backslash. You only need to escape the dollar sign when followed by a digit, though it won't hurt to do it all of the time.
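The escaping rule in item 2 can be illustrated with a minimal Python sketch. This is not the guiguts Perl code; the helper names expand_replacement and search_replace are hypothetical, and the sketch assumes only the behavior described above: $1-style references are filled from the capture groups, and a backslash-escaped dollar sign is kept as a literal dollar sign.

```python
import re

def expand_replacement(template, match):
    """Fill $N group references; a backslash-escaped dollar sign stays literal."""
    def substitute(token):
        if token.group(1) is not None:
            # The token was a backslash followed by a dollar sign: emit a plain "$".
            return "$"
        # The token was a dollar sign followed by digits: emit that capture group.
        return match.group(int(token.group(2))) or ""
    return re.sub(r"\\(\$)|\$(\d+)", substitute, template)

def search_replace(pattern, template, text):
    """Replace every match of pattern, expanding the template for each match."""
    return re.sub(pattern, lambda m: expand_replacement(template, m), text)

# A literal dollar sign followed by the first captured group.
print(search_replace(r"(\d+) dollars", r"\$$1", "costs 5 dollars"))
```

Escaping the dollar sign is harmless when no digit follows it, which matches the note that it won't hurt to do it all of the time.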

Modified the Fixup function to allow selection of either French style «guillemets» or German style »guillemets«.



Version .503(440k) Fixed another problem in HTML autogenerate code. Subscript and superscript code was not being correctly added to text enclosed in certain markup tags. Order-of-operations problem.

Reworked the middle button autoscroll code a bit. Made the pop up indicator a little less obtrusive. Changed the cursor while it is active to give a better indicator that it is. Changed how the scroll works at low speed. Still can't do a smooth, pixel level scroll, but the speed is now adjustable down to a very low rate without needing to reduce the update speed.  Since it is no longer really necessary to adjust the update speed manually, I removed the option from the Prefs menu again. Apparently, pixel level scrolling is available in Tk 8.5 which will eventually be ported to Perl/Tk 805.xxx. Sigh.

Integrated many of the various support files into the main script file to cut down on dependencies. At this point, the script file itself (either guiguts.pl or winguts.exe) can be dropped into an empty directory and run without raising any errors. There are still external support files that are highly recommended, but they are not absolutely necessary to run the script. (HTML manuals, scannos files, gutcheck, etc.)


Version .502(462k) Worked on HTML autogenerate problems for a bit. Made some changes to try to trap unclosed single line paragraphs at the start of block quotes. Kind of a kludge, seems to work, though I'm not really proud of the code.

Added checks to see if a page anchor is inside a paragraph or not and to add paragraph markup around it if it is not. The checks add quite a bit of overhead to the page anchor insertion routine and increase the processing time by a noticeable amount (a few seconds), but should cut down on the amount of manual intervention necessary to get it to validate as XHTML 1.0 Strict.

Poked around with the ampersand conversion routines in HTML autogenerate. Found a few instances where it would not convert them correctly and reworked them.

Added a new regex to the regex .rc file:  '&c(,| |$)' => '&c.$1'  Looks for an abbreviation "&c." that doesn't have its period and inserts one.
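For readers who want to see what that search and replace pair does, here is a small Python equivalent. It is a sketch only: the function name fix_etc_abbreviation is made up, Python writes the group reference as \1 where the .rc entry uses $1, and the MULTILINE flag is an assumption so that $ matches at the end of every line.

```python
import re

def fix_etc_abbreviation(text):
    # "&c" followed by a comma, a space or the end of a line gets its period;
    # "&c." is left alone because the character after "c" is already a period.
    return re.sub(r"&c(,| |$)", r"&c.\1", text, flags=re.MULTILINE)

print(fix_etc_abbreviation("apples, pears, &c, and so on"))
```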

Made some more changes to the lazy updates of scanno highlighting. As soon as highlighting is enabled, it will start to work through the file adding the highlighting in the background. It is quite processor intensive so it would be a problem to try to do the file all at once, but by processing a small chunk at a time, it can work through the whole file in a reasonable time in the background. The current view is still continuously updated.

Reworked some of the status bar code to update in real time while doing drag selections.

Modified status bar code to have a little more stable cell size layout. Tried to minimize the "dancing window" effect when the character descriptions were enabled in the character ordinal readout.

Added another option under the Prefs menu: 'Leave Space After End-Of-Line Hyphens During Rewrap'. Should be pretty self-explanatory. While not very useful for English texts, other languages, especially German, will have fairly common occurrences of "hanging" hyphens. This option will keep them intact during rewrap. Standard behavior is to join the final hyphen with the first word on the next line during rewrap.

Changed the binding for the middle mouse button, which used to be a very user unfriendly paste routine, to instead activate an autoscroll routine based heavily on the middle button autoscroll as found in the Firefox web browser. I use this all the time in Firefox and find myself trying to use it in other programs too, so I added it here. It doesn't scroll quite as smoothly as the Firefox version because the text widget doesn't support pixel level scrolling, (easily) so it has to scroll in even increments of the height of a line of text. Click the middle button in the text window to enable it, move the pointer up to scroll down and down to scroll up. Press any button or key to disable it again. Middle click drag still works too. If you press the middle button and drag at all, the scroll sigil will not pop up. I tried, but was unable to make the scroll sigil background transparent. In practice, it isn't a big deal, just position it off to the right of the main body of the text.

Added an option to adjust the update interval for the above scroll function. It defaults to a 50 millisecond update interval. If this is too fast for you, you can increase the update interval to slow down the scrolling. Intervals of less than 30 milliseconds are not recommended, to keep the processing load down. Intervals larger than 100 milliseconds will probably be unpleasantly "choppy".


Version .501(462k) Fixed up a few problems with the Unicode character menus, the most annoying of which was that changes to the font or font size would not propagate through the window. Changes would not take effect until the Unicode window was closed and reopened.

Rewrote the mouse selection auto scroll code to be more compact and speedy. Added  (pseudo) logarithmic accelerator functions: the further you drag the pointer outside the text window (top or bottom) the faster the window scrolls.

Modified line number updating code to respond to mouse selection auto scroll better. It now updates in real time, or nearly so.

Tweaked scanno highlighting function to be a bit faster and to do lazy update of the highlights. Once a word has been highlighted, the highlighting is not removed (unless the word is edited) until highlighting is turned off again. This doesn't really make the highlighting run faster, but it makes it feel faster.

Fixed missing semicolon on &quot; in HTML illustration markup

Fixed problem in Footnote Fixup where Auto Landing Zones wasn't working correctly.

Twiddled around with italics handling during poetry markup generation. Should be a bit better. Probably still will have problems with heavily mixed italic and non-italic poetry.

Added italics detection to the non-rewrap markup autogeneration code. Same caveats as for poetry. (Not a surprise, it uses the same algorithm.)


Version .50(461k) Rewrote Unicode character name handling routines to use built in perl functions rather than loading a precompiled hash, seriously reduced size of guiguts package with little penalty. In fact, it seems to start up a little faster now, though I haven't really done any benchmarks.

Cleaned out a bunch of no longer needed modules that were using memory to no purpose.

Fixed major bug in Footnote fixup Reindex function that was pretty much keeping it from running. Fixed several minor problems too, things not exactly wrong, but sub-optimal.

Added the "arabic" subroutine to the package. (Actually, it was available in .49 but I forgot to document it.) This is mostly only useful for converting roman numerals to arabic numbers with regex code assertions. E.G. Search: \b([IVXLCDM]+\.)  Replace: \C arabic("$1") \E  will convert III. to 3 or MCCCLXVII. to 1367 and so on.
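The conversion itself is simple enough to sketch. The Python below is an illustration, not the actual "arabic" subroutine from the script: it assumes plain subtractive notation, ignores anything that is not a roman numeral letter (such as the trailing period captured by the search term), and the wrapper convert_numerals is a hypothetical stand-in for the search and replace step.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def arabic(numeral):
    """Convert a roman numeral string such as 'III.' to an integer."""
    # Keep only roman numeral letters, so a trailing period is dropped.
    letters = [ch for ch in numeral.upper() if ch in ROMAN_VALUES]
    total = 0
    for i, ch in enumerate(letters):
        value = ROMAN_VALUES[ch]
        # A smaller value in front of a larger one is subtracted (IV, XC, ...).
        if i + 1 < len(letters) and value < ROMAN_VALUES[letters[i + 1]]:
            total -= value
        else:
            total += value
    return total

def convert_numerals(text):
    """Apply the search term from the entry above and replace each hit."""
    return re.sub(r"\b([IVXLCDM]+\.)", lambda m: str(arabic(m.group(1))), text)

print(convert_numerals("Chapter III. begins"))
```

Note that the search term will also match ordinary capitalized words made up only of those letters when they are followed by a period, so it is worth stepping through the replacements rather than replacing all.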

Fixed misspelled codepage conversion menu option.

Changed how the cursor is handled while doing selections with the mouse. The cursor now follows the mouse pointer. This makes it possible to do updates of the line numbers while doing mouse selections that scroll past the top or bottom of the screen. Makes it much easier to select a block of lines by line number that is larger than the current screen. And besides, it just bothered me that the screen would scroll but the line numbers wouldn't update.

Oh, and removed the darn setting.rc file that crept into the distribution build somehow.


Version .49(581k) Updated HTML autogenerate to automatically convert <sc> .. </sc> markup to CSS <span class="smcap"> .. </span> markup.
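In outline, the tag conversion is a plain substitution of the opening and closing tags. The Python sketch below only illustrates the mapping described above; the function name convert_smallcaps is hypothetical and the real autogenerate routine may handle more cases.

```python
def convert_smallcaps(text):
    # Swap the proofing markup for the CSS span used in the generated HTML.
    return text.replace("<sc>", '<span class="smcap">').replace("</sc>", "</span>")

print(convert_smallcaps("By <sc>John Smith</sc>"))
```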

Modified how poetry markup for zero indent lines is handled. Changed it to use <span class="i0"> rather than a bare <span>. This will allow easy inclusion of smallcaps or underlined text (or anything else that relies on span markup) within poetry without invoking other side effects.

Modified html header.txt file to contain the new modified poetry style.

Made "Insert anchors at page numbers"  selected by default for HTML autogenerate.

Tweaked a few other things for autogenerated HTML, mostly cosmetic changes.

Fixed a bug where if you had opened a Unicode character chart and then tried to use the Greek transliteration tool, (or vice versa,) it would throw errors and not work correctly. Variable collision.

Added another tool to the Fixup menu: "Convert Windows Codepage characters to Unicode". Can be run standalone from the Fixup menu, is automatically run as part of the HTML autoconvert routine. Acts on the whole file.


Version .482(579k) Fixed fairly obscure but potentially destructive problem where poetry with line numbers, combined with certain indents, would have characters deleted from the beginning of the numbered lines.


Version .481(578k) Fixed problem where starting the script would occasionally throw an error about a missing subroutine.  Seems to be a timing issue. Once the file is loaded into the disk cache, it seems to load without a problem. Needed to shuffle the loading order a bit.

Fixed problem where the script was not stripping the BOM from the front of the file on open. UTF-8 files would end up with an extraneous zero width no-break space at the beginning.
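The same pitfall exists in most languages. As a hedged illustration in Python (not the script's own Perl code; the function names are invented for the example), the byte order mark shows up as the character U+FEFF at the start of the decoded text unless it is stripped explicitly or a BOM-aware codec is used:

```python
def strip_bom(text):
    # U+FEFF at the very start of decoded text is a byte order mark, not content.
    return text[1:] if text.startswith("\ufeff") else text

def decode_utf8(raw_bytes):
    # The "utf-8-sig" codec drops a leading BOM if present and otherwise
    # behaves like plain UTF-8.
    return raw_bytes.decode("utf-8-sig")

print(repr(strip_bom("\ufeffHello")))
print(repr(decode_utf8(b"\xef\xbb\xbfHello")))
```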

Modified highlight words function to be a little more forgiving of the format of the file of words it uses as a source file. It used to only accept files that had a single word per line with no punctuation. Now you can have multiple words on a line, and punctuation not inside a word is ignored. (Hyphens and apostrophes inside words and commas inside numbers will be retained.) Be a little cautious about the size of the word list. Up to ten thousand or so words should be ok. Faster processors can handle several hundred thousand without a problem. An interesting experiment is to do a word frequency on a text, export the list (Ctrl+x), then load the list as the highlight list. EVERY word should be highlighted. (This is how I did testing.)
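The more forgiving parsing rules can be sketched with a single regular expression. This Python example is an approximation written for this manual, not the script's actual parser; the function name parse_word_list and the exact pattern are assumptions based on the rules listed above.

```python
import re

# A word is a run of word characters, optionally continued by an apostrophe
# or hyphen that sits between word characters, or by a comma between digits.
WORD_PATTERN = re.compile(r"\w+(?:(?:['-]|(?<=\d),(?=\d))\w+)*")

def parse_word_list(text):
    """Return the set of words found in a loosely formatted word list."""
    return set(WORD_PATTERN.findall(text))

sample = "well-known, don't; 10,000 (tho')\nhte arid"
print(sorted(parse_word_list(sample)))
```

Punctuation at the edge of a word (the comma after "well-known", the apostrophe after "tho") is dropped, while punctuation inside a word or number is kept.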

Modified how the script decides where to insert any non-default poetry indent classes into the CSS styles. They were winding up in some... interesting places. Should be a little better about inserting them in the correct, or, at least, less nonsensical spot. I can't just insert them in a hard coded spot, because some people like to customize their header.txt files and that will throw it off.


Version .48(577k) Started refactoring to clean up some of the most egregious faults in guiguts. Cleaned up most of the globals that were spread throughout the script. There are still quite a few package globals, but I don't see how to get rid of those without going to a completely different, non-compatible method of saving values in the settings and other peripheral files. Shuffled around a lot of code, grouped a lot of initialization stuff together into subroutines, tried to comment subroutines a bit better, basically did a lot of code cleanup. Though lots of stuff has changed beneath the hood, it shouldn't affect operation at all.

Fixed problem where the HTML External Link function was producing HTML that was just plain wrong. Don't know what I was thinking. I don't use that particular function much (at all) so I hadn't noticed it before.

Changed the last few self closing anchors to be explicitly closed. Turns out I missed a couple, (though some were pretty obscure.)

Changed Auto generate HTML to not globally change " to &quot;. I had done that early on, before my link routines had been worked on much to prevent other headaches. At this point, my link generation code is robust enough to not need the hand holding and it has become more of a headache than a help.

Worked on auto link generation code to be more tolerant of Unicode characters in the link text. Will now decompose any Unicode Latin character to its nearest ASCII base character. For non-Latin characters, it will decompose them to the ASCII name of the characters, separated by hyphens. This is arguably the wrong way to "translate" the characters, but since the link name/id can not have any non-ASCII characters in them, it will yield valid and repeatable results. It kind of falls down for some of the more complex languages. If the character is not specifically a digit, letter or ligature, it will be represented by -X- so ideographic languages may have a problem. (If you've got kanji, you're on your own.)
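To make the idea concrete, here is a rough Python sketch of that kind of transliteration. It is not the script's code, and several details are assumptions made for the example: the function name ascii_anchor, the use of Unicode compatibility decomposition to find the ASCII base character, and the exact way the character name is joined with hyphens.

```python
import unicodedata

def ascii_anchor(text):
    """Reduce link text to ASCII so it can be used as a link name/id."""
    parts = []
    for ch in text:
        if ch.isascii():
            parts.append(ch)
            continue
        # Accented Latin characters decompose to a base letter plus combining
        # marks; dropping the non-ASCII marks leaves the base letter.
        base = unicodedata.normalize("NFKD", ch).encode("ascii", "ignore").decode("ascii")
        if base:
            parts.append(base)
            continue
        # Otherwise fall back to the Unicode character name, hyphen separated,
        # but only for letters, digits and ligatures.
        name = unicodedata.name(ch, "")
        if any(kind in name for kind in ("LETTER", "DIGIT", "LIGATURE")):
            parts.append("-" + name.replace(" ", "-") + "-")
        else:
            parts.append("-X-")
    return "".join(parts)

print(ascii_anchor("café"))
print(ascii_anchor("α"))
```

The result is ugly for non-Latin scripts, but the same input always yields the same ASCII output, which is what a link target needs.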


Version .47(572k) Made Footnote Fixup automatically scroll the footnotes to the bottom of the visible window on search. Makes for less scrolling while checking Footnotes/Anchors during First Pass. Added a "Center on Search" option that will make the Footnote search behave the way it did originally (i.e., center the Footnote in the visible window as you cycle through them).

Fixed problem with the page number variable passed to external commands not passing the complete number. Was chopping off the last digit. Ooops, my regex was a bit greedy.

Removed all instances of self closed anchors from the autogenerated xhtml files. They were causing problems if the MIME types weren't set correctly on the server. Also was apparently causing IE to completely lock up upon occasion.

Kludged up CSS for centered tables and images so broken yet popular browsers would be forced to render the centered sections correctly.

Modified Header file to avoid enabling "Quirks Mode" in Internet Explorer.  Apparently, Quirks Mode means something like "Render this any damn way you want.... except correctly".


Version .463(572k) Fixed problem where if the text window was sized so that the visible window was not an exact multiple of the text line height, and you moved the cursor down with the arrow key at the bottom of the text window, the cursor would skip every other line. Actually, I tracked it back to a bug in the Tk::Text module. I fixed it locally and submitted a bug patch to the Tk maintainers.

Looked at trying to stabilize the window size between sessions under Linux. Opens consistently the same size under Mandrake 10/KDE. Haven't tested it with other distros/window managers. Not sure what else I can do. I have a feeling this is going to be dependent on how different window managers define the window geometry.


Version .462(569k) Added .u class to CSS and autoconvert <u> </u> to <span class="u"> </span> per request.


Version .461(569k) Added a Column format modifier to the HTML table tool so you can set the alignment of columns independently while automatically generating table markup. There already was a set of radio buttons where you could select Left, Center or Right aligned text for the table. This is still there and is used as a default setting.

In general, when laying out tables, it is preferable to be able to align different columns differently. E.G. names are typically left aligned, but columns of numbers are often right aligned. With the original selector, EVERY cell in EVERY column got the same alignment and you had to do a manual alignment change for columns that you wanted to be different. That isn't too bad if you only have one or two tables, or even ten or fifteen. I, however, just PPed a text with 352 tables (yes I counted) where nearly every table required manually adjusting alignment after generation, and I got fed up.

There is now a "Column Fmt" entry just below the Auto Table button where you can specify column alignment. Specify the alignment with the characters "<" for left aligned, "|" for centered & ">" for right aligned text. Every column should have a corresponding character telling it how to align the text. Extras are ignored. If you have more columns than characters, the default setting on the radio buttons is used for the excess. Every cell in the column gets the same alignment.

Say you have an eight column table and you want the columns aligned, going from left to right: left justified, centered, centered, right justified, left justified, left justified, right justified, centered. Put the string "<||><<>|" in the entry box and hit Auto Table. You will probably still have to tweak the header cells a bit, but in general this takes out a lot of the drudgery.
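The way the format string maps onto columns can be sketched in a few lines of Python. This is an illustration of the rules described above, not the script's code; the function name column_alignments and its default parameter (standing in for the radio button setting) are made up for the example.

```python
ALIGNMENTS = {"<": "left", "|": "center", ">": "right"}

def column_alignments(fmt, column_count, default="left"):
    """Return one alignment per column for a Column Fmt string."""
    result = []
    for index in range(column_count):
        if index < len(fmt):
            # Unknown characters fall back to the default alignment.
            result.append(ALIGNMENTS.get(fmt[index], default))
        else:
            # More columns than characters: use the default for the excess.
            result.append(default)
    return result

print(column_alignments("<||><<>|", 8))
```

Characters beyond the number of columns are never looked at, which is the "extras are ignored" behavior.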


Fixed minor problem where doing an explicit Block Rewrap was yielding slightly different results than rewrapping text enclosed in block markup.


Added an Interrupt button to the Search and Replace "Replace All" function. If you had a S&R that was taking a long time there wasn't any way to break out of it without quitting and losing all your unsaved edits. Now there is. Similar to the Interrupt Rewrap button. (Actually, it IS the same as the Interrupt Rewrap button. I generalized the code to Interrupt Operation.)


Fixed small problem where if an illustration was an illuminated initial cap letter, the paragraph markup was being mistakenly removed. Note:  they will still probably require some tweaking, but not as much.


Fixed problem where /F F/ markup was not closing open paragraphs before where it started.

Also, check out the new manual for guiguts that dcortesi has put together. It is much better laid out and more user friendly than the default one (which has become something of a glorified change log). It doesn't have a permanent home right now, but I have put up a redirect page to it that I will attempt to keep current at http://mywebpages.comcast.net/thundergnat/ggmanual.html Comments, questions & suggestions are welcome, though please direct them to the Guiguts Online Manual thread - http://www.pgdp.net/phpBB2/viewtopic.php?t=13808



Version .46(568k) Lots and lots of small tweaks to make interface more consistent.

Changed Quote highlighting to be visible over the selection highlighting.

Worked extensively on Footnote tool to make it less problematic to convert between inline and out-of-line footnotes. Fixed a bunch of small bugs.

Added Control+g hotkey to search for next occurrence of last search. Works very similar to Control+f except focus stays in the text window.

Fixed problem under Linux where script was referencing a no longer needed/supported package option. (KDE drag & drop)

Modified key insert functions to no longer insert control characters into the text when an unbound Control+(Key) was pressed.

Fixed Row:Column readout in status bar to read out the correct number of lines while in block select. (Was off by one.)

Suppressed bogus blank proofer from being displayed in the Display Proofers window.

Fixed a bunch of  operations  that are performed behind the scenes as a series of small steps to Undo as a single step.

Fixed HTML Link check to be a little smarter about checking for links to local files.

Fixed Word Frequency Character Counts to search correctly for whitespace characters. ( again... :-/ )

Fixed annoying problem where text would jump to the right if line numbers were enabled, the text was scrolled left and you inserted the cursor on a long line.

Worked quite a bit on the Tfx functions to be less user unfriendly. Insert/Add line will now automatically select the Inserted/Added line. Selecting a line is much less finicky now. Added Select Previous Line / Select Next Line buttons, which cycle through the lines to the left and right respectively. Place the cursor anywhere between two lines and Select Next or Previous to select the desired line. Does a lot more "do what I mean" now. Added a column width readout. Displays the number of spaces in the selected column. (The column to the left of the selected line.) Makes it easier to rewrap columns to a certain width if you can see at a glance what the width is. Bound the left and right arrow keys to select previous and select next line, and bound Control-Left-Arrow and Control-Right-Arrow to the "Move Line Left" and "Move Line Right" functions.

Changed how automatically generated illustration captions are handled. Now uses a "caption" style sheet markup instead of fixed markup. Fixed Image code insert function to work if there is no text selected.

Fixed problem where Auto generated html page title was not handling words with apostrophes correctly.


Version .456(564k) Changed default HTML encoding in header file. Apparently  the whitewashers would prefer not to have UTF-8 encoded files unless you really, really need them. Ah well. Wasn't particularly necessary for the guiguts auto generated HTML anyway. I just put it in there since it seemed like a safe default, but the law of unintended consequences reared its ugly head. I could do encoding detection I guess, but it would be rather pointless. The HTML files guiguts generates are us-ascii. All the time. Every time. No matter what characters may be in the text file.

Fixed a small problem in the table Tfx function "Convert Step to Grid". It would choke on table cells that had only one character in them.

Tweaked generated HTML a little more. Trapped a few <p> marks that were creeping into the /X X/  marked passages.

Modified "Clean up markers" routine to remove /X .. X/ markup. Oops

Worked on Footnote functions a bit. Fixed a bug in the reindex function that was losing track of anchors under certain conditions. Added code in the Check footnotes function to check for footnotes that are not in sequence. Out of sequence footnotes can cause odd errors while reindexing inline footnotes.

Added option in the footnote functions to convert all of the footnote markers to numbers, letters or Roman numerals. Select the appropriate checkbox and reindex.

Fixed a few oddities in the Greek transliteration tool. Not really bugs, just things that worked a little unexpectedly. (To me at least.) Which is close enough to being a bug, I guess.



Version .455(564k) Reworked end of line removal some more. The routine I had implemented WAS faster than the previous one, it was just deceptive because it blocked the program when it was running so it just seemed like the program had locked up and made it seem longer. I have made some changes which should help it run much faster.

Worked on a few of the HTML markup generation routines (not in auto generate) to generate correct markup. 

Added /x .. x/  markup - Skip. For text version, does basically the same as /$ .. $/ , for HTML, it does nothing; well, almost nothing. It adds <pre> </pre> markup around the block, and named entities, angle brackets, ampersands and quotes will still be converted.

Added an option to the page marker adjust function where you can insert persistent page markup into the text.



Version .454(562k) Removed the Control-` (Column Paste) key binding completely. Was causing more trouble than it was worth. Column Paste is still available bound to F3 and through the menu.

Fixed typo in spell check bookmark function that was causing it not to work. It used to work, I don't know how it got changed. Gremlins perhaps...

Rewrote Change All function in Spell Check. I was never very happy with it anyway. Should be able to use this now without adversely affecting spell check.

Changed the key bindings for the spell check hot keys to Control key combinations. The bare hot keys were having unanticipated interactions with the term entry boxes. The hot keys are now: "Control-a" - add word to Aspell dictionary, "Control-p" - add word to project dictionary, "Control-s" - skip word and "Control-i" - skip all occurrences of word (ignore). "Return" to accept the proposed replacement and search for the next misspelling is unchanged.

Modified Selection popup to update to the current selection if invoked while already open.

Modified header file a bit. Added a footnote anchor style - .fnanchor. Tweaked the footnote markup in general. Made a few other small changes.


Version .453(562k) Worked a great deal on converting HTML auto generate routine to produce valid XHTML 1.0 strict. Not really successful for complex texts. In general, it will produce valid XHTML 1.0 transitional, but (especially for texts that have page markers and sidenotes/line numbers) there will be some cleanup necessary of page anchors that aren't inside block elements to make it strict. Revamped the style markup for footnotes and sidenotes. The footnote style I blatantly stole from one of Jon Ingram's projects. The sidenote markup I made up myself; it may be a bit over the top, but I like it. Image markup now uses floats instead of the deprecated align markup. Modified image routine to automatically insert Illustration caption as the image caption.

Made blockquote mark up selectable between CSS markup or HTML <blockquote> </blockquote> styles. I have switched this back and forth several times now and no matter how I have it, I get grumbling. Fine. Select which style you like and do it that way.

Fixed minor bug in file save routine where it appended a blank line to the end of the file every time you saved and reopened the file.

Modified search routine to automatically vertically center the found item in the screen, even if it doesn't technically need to scroll to get to it. Helps in situations where you need to see what  is on the line AFTER the found term in order to determine what action to take with it. It would often end up with the found item as the last line in the window necessitating manual scroll to view the next line.

Added some hot keys for case modifications Ctrl+u - upper case selection, Ctrl+l - Lower case selection and Ctrl+t - Title case selection. I sacrificed the transpose function which used to be attached to Ctrl+t. I don't think it was heavily used anyway. I didn't assign Sentence case to a hot key, I am running out of keys that can be assigned as hot keys and I doubt that one is as useful.

Modified the selection popup to be a little more useful. It will not disappear every time you use it now and will automatically enter the values for the current selection when it is invoked.

Worked on column paste functions to work better when trying to paste a block at the end of a line or across a line that is shorter than the insertion point. A drawback is it tends to leave trailing spaces around, but it works much more intuitively now.

Added another button to the tool bar: Eol - End of line trailing space cleanup. With the column cut & paste functions and the table tools, you'll often end up with odd trailing spaces. This is just a quick shortcut to the function already available in the menu. I rewrote the function to be much (MUCH) faster too. I have gotten a lot better with perl and Tk since I wrote the original. It will operate on a selection if there is one or the whole file if there isn't.

Modified gutcheck report window delete function to automatically search for the next item in the list after the deleted item upon deletion. Modified search routine to vertically center the found item in the display window.

Modified Word Frequency Harmonics window to allow you to scroll through the main word frequency list using the up and down arrows. If the Harmonics window has focus, pressing either the up or down arrow will move to the next/previous word in the main word frequency window and run a harmonics check on it.

Added some hot keys to the spell check pop up window. "a" - add word to Aspell dictionary, "p" - add word to project dictionary, "s" - skip word, "i" - skip all occurrences of word (ignore) & "Return" - accept suggested replacement (Change).

Added a Unicode character entry pop up under the Help menu. If you know the hex or decimal ordinal of a Unicode character you can use this tool to insert the character at the cursor.
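What the entry pop up does boils down to turning an ordinal into a character. A minimal Python sketch of that step follows; the function name char_from_ordinal and its hexadecimal flag are hypothetical and say nothing about how the dialog itself is laid out.

```python
def char_from_ordinal(value, hexadecimal=False):
    """Return the character for a Unicode ordinal given as a string."""
    ordinal = int(value.strip(), 16 if hexadecimal else 10)
    return chr(ordinal)

# Both calls name the same character, U+00E9 (e with acute accent).
print(char_from_ordinal("E9", hexadecimal=True))
print(char_from_ordinal("233"))
```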



Version .452(557k) Added some more functionality to the Table effects function. It is set up to operate on tables that have multiple lines per cell and a blank line between each row. Many (most?) tables have only one line of text in each cell and are packed with no spaces between rows. You could still use the table tools but you needed to manually add a space between each row and then remove them again after the adjustments were done. I have added two new buttons, Space Out Table and Compress Table, to automate those operations. Space Out Table will put a blank line after each line in the table. This will allow you to use the column adjustment tools without incorrectly rewrapping the columns. Compress Table will remove all of the blank lines between rows again when you are done making adjustments.
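The two operations are inverses of each other for a packed table. The Python sketch below shows the idea on a list of lines; it is an illustration with invented function names, not the code behind the buttons.

```python
def space_out_table(lines):
    """Put a blank line after each line of the table."""
    spaced = []
    for line in lines:
        spaced.extend([line, ""])
    return spaced

def compress_table(lines):
    """Remove all of the blank lines between rows again."""
    return [line for line in lines if line.strip()]

table = ["Name | Age", "Ann  | 7"]
print(space_out_table(table))
print(compress_table(space_out_table(table)))
```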

I have also added an Auto Columns button. This will try to figure out the column layout and automatically insert vertical lines between columns. If your table cells are delimited with vertical bars "|",  it will align on the vertical bars. Otherwise, it assumes  multiple consecutive spaces delimit the columns  and will insert vertical bars and align them.  If you accidentally get extra cells, delete the vertical bar just before the cell(s) in error and re run Auto Columns.
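The column detection described above can be approximated as follows. This Python sketch is a guess at the general approach, not the script's implementation: it assumes one line per row, treats runs of two or more spaces as column breaks when no vertical bars are present, and pads every cell so the bars line up. The function name auto_columns is made up, and the sketch expects at least one line of input.

```python
import re

def auto_columns(lines):
    """Split rows into cells and rejoin them with aligned vertical bars."""
    if any("|" in line for line in lines):
        # Cells are already delimited with vertical bars: align on those.
        rows = [[cell.strip() for cell in line.strip().strip("|").split("|")]
                for line in lines]
    else:
        # Otherwise assume multiple consecutive spaces delimit the columns.
        rows = [re.split(r" {2,}", line.strip()) for line in lines]
    column_count = max(len(row) for row in rows)
    rows = [row + [""] * (column_count - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(column_count)]
    return ["|".join(cell.ljust(widths[i]) for i, cell in enumerate(row)) + "|"
            for row in rows]

for line in auto_columns(["Name   Age", "Alexander   7"]):
    print(line)
```

Running the function again on its own output leaves the table unchanged, which mirrors the advice to delete a stray bar and re-run Auto Columns.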

Changed Alt selection to Shift selection. Using the Alt keys carries a bunch of other baggage along with it that was becoming increasingly hard to work around as I added functions to work with the block selection.

Worked on expanding the functions that work correctly with the Shift-select selection boxes. Now the Case modifiers, Surround Selection,  Flood Fill, Convert To/From  Named Entities and Convert Fractions work correctly with non contiguous selection blocks.

Eliminated annoying flicker during Shift-selection drag operations.

Worked on making Shift-select Block selection work seamlessly with cut-'n-paste. Almost, but not quite; don't know that I can make it better though. You can now use Cut and Copy, (Ctrl+x & Ctrl+C) with either selection method and it will do a normal or column (block) operation depending on the selection made. Normal cut/copy for normal selections and column cut/column copy if there is a block selection made. It is impossible to automatically determine if you want a normal paste or column paste in a modeless operation, so you must specify normal paste, (Ctrl+v) or column paste, (Ctrl-`).  Note: the original hot keys for column  cut, copy and paste are still active, F1, F2 & F3 respectively. When doing column cut and paste operations, you will most likely be happier with the results if you change from Insert mode to Overstrike mode.

Added a Selection block vertex readout to the status bar. It shows the start point and end point of the current selection. Possibly useful during column cut and paste operations. Added some more functionality as well. If you are doing block selections, single click on the selection box in the status bar and it will read out the size of the selection box. (If you are doing normal selection, it will not change, as the size of the selection box is less useful there.) If you have made a previous selection and right click on the selection status box, the previously selected text will be selected again. If you double left click on the box, a selection dialog will pop up where you can enter a start and end point for the selection and then select it.

Reduced the size of  some of the items on the status bar, it was starting to get out of hand.

Tweaked the default English common highlighting word list. Added some more words.

Fixed problem with highlighting function where the first time you invoked it, you could not cancel out of choosing a word list.

Fixed minor problem in Word Frequency, character counts, where you couldn't search for zero. Thought I fixed this before; I must have broken it again at some point.

Started working on undo functions so that actions performed with a single operation can be undone with a single undo. I had tried to do this before and mostly succeeded in making the undo functions almost useless. I think I have now figured out the proper methods to manipulate the undo buffer without corrupting it. There are some operations that, while they can be undone, probably should not be. Namely, rewrap operations. At this point, if you undo a rewrap operation, any page markers that are in the middle of a paragraph will be shifted to the end of the paragraph. I am looking into ways around this but it may take me a while to figure out a solution. For now, avoid undoing rewrap if at all possible. If you rewrap to the wrong margins, adjust your margins and rewrap again. If you accidentally rewrap a table or index or some such, you may be forced to undo, but you'll need to manually reset the page markers where they belong (assuming they were in the middle of paragraphs).

Worked on the link checker yet some more. Now it is nearly bullet proof.  The only fragile part is if you get the directory wrong in the first image link, all of your images will be reported as not found.

Fixed spelling error in Greek transliteration tool; lamda -> lambda


Version .451(550k) Fixed stupid error in Greek transliteration tool where the ou ligature was mistakenly listed as an au ligature and was inserting au. I have no excuse for that. That is just embarrassing.

Fixed problem with inserting Unicode encoded files into an open file not decoding the first line properly. (Or inserting it in the proper place either...)

Reworked link checking code quite a bit. Still looks essentially the same to the user, but the underlying parsing code is much more tolerant of variation in the layout of the anchor. Incidentally, made the checking code much more efficient too, though that's not a real big win here. Running in .1 seconds instead of .25 seconds doesn't mean a lot in the grand scheme of things if said function only runs once or twice a session. Tweaked external link warning code to differentiate between externally linked local files and remote links. Will now check to see if externally linked local files exist and warn if they cannot be found. (Will only check files located in or below the directory the main file is in. Files in other paths are assumed to be an error and are not checked for existence.)

Added new function: Automatic word highlighting. Primarily geared toward automatic stealth scanno highlighting, it can be easily customized to highlight any word list you choose. Word lists must be plain text files with one word per line, and nothing else. A default word list is included with the script in the new guiguts/wordlist directory. It is essentially the English common scannos list reformatted with one word per line and all of the extremely common variations removed (and a few other words added). Left click on the little H in the status bar. The first time it runs, it will ask which word list to use. Browse to the desired word list and open it. The H status box will change to the highlight color and any of the words from the word list file located in the text will be highlighted. The highlight updating occurs every 3/4 to 1 second, so if the status bar H is highlighted, but there are no words highlighted in the text, scroll around a bit. Left click on the H again to disable highlighting. The box will return to a gray background. Once you select a word list, it will continue to be used until you change it or restart guiguts. To change the word list, right click on the H in the status bar. The same dialog will pop up asking which file to use. If you don't care for the default highlight color, you can change it under the Prefs menu. The highlight color will be retained from session to session. Words in the word list cannot contain any punctuation except apostrophe. They can, however, contain any legal Unicode alpha-numeric character below ordinal FE00. Words are case sensitive.
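
To make the word list rules concrete, here is a small Perl sketch of loading a one-word-per-line list and finding the listed words in a piece of text. It is only an illustration under assumptions: the function names are invented, the list is read from an in-memory string instead of a real file, and a "word" is approximated as a run of \w characters and apostrophes (which also lets underscores through).

```perl
use strict;
use warnings;

# Build a lookup table from a word list handle (one word per line).
sub load_words {
    my ($fh) = @_;
    my %words;
    while ( my $line = <$fh> ) {
        $line =~ s/\s+\z//;    # strip the newline and any trailing spaces
        next unless length $line;
        $words{$line} = 1;
    }
    return \%words;
}

# Return [offset, word] pairs for every listed word found in $text.
# Matching is case sensitive, as described above.
sub find_highlights {
    my ( $words, $text ) = @_;
    my @hits;
    while ( $text =~ /([\w']+)/g ) {
        push @hits, [ $-[1], $1 ] if $words->{$1};
    }
    return @hits;
}

my $list = "arid\nhe\ntbe\n";    # stand-in for a word list file
open my $fh, '<', \$list or die "Cannot open word list: $!";
my $words = load_words($fh);
close $fh;

printf "%d: %s\n", @$_ for find_highlights( $words, 'He said tbe garden was arid.' );
```

Note that "He" is not reported even though "he" is in the list, because the comparison is case sensitive.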

Note: I am open to suggestions for words to be included in the default English list. (Though I may choose to ignore them.) Due to the way I am processing the list, there is little penalty for having a fairly large one, though more than a thousand words would get unwieldy. Too many words would reduce the effectiveness too by overloading the screen with highlighted words.

An interesting possible use for the word highlighting function. Open a project and go to the Word Frequency function. Do a Spell check. Press Control+x to export the word list from the word frequency window. You can now open the wordlist.txt file in the highlighting function to do automatic highlighting of  words not recognized by the spell checker while you scroll through the text!



Version .45(546k)
Twiddled around with HTML auto generate code on non rewrap marked sections to try to get rid of italics markup it persisted in inserting in error. Think I'm finally getting close.

Added a "Flood Fill" tool, sort of based on vitalogy's suggestion. Select a section of text and fill it with a specific character or string (by default, a space). There is a pop up window under the Selection menu where you can edit the fill string and activate the function. Also available directly as a hot key: Control+w. (It's not really mnemonic for anything, I just prefer hot keys on the left side of the keyboard and I'm running out of available keys.)
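
A minimal Perl sketch of the flood fill idea follows. It is not the guiguts implementation: the function name is invented, and it assumes that a multi-character fill string is repeated and truncated to the length of each selected line, with line breaks left alone.

```perl
use strict;
use warnings;

# Replace every character of the selection (except newlines) with the fill
# string, repeated and cut to fit. An empty fill falls back to a space.
sub flood_fill {
    my ( $selection, $fill ) = @_;
    $fill = ' ' unless defined $fill && length($fill);
    my @out;
    for my $line ( split /\n/, $selection, -1 ) {
        my $len      = length($line);
        my $repeated = $fill x ( int( $len / length($fill) ) + 1 );
        push @out, substr( $repeated, 0, $len );
    }
    return join "\n", @out;
}

print flood_fill( "abc\nde", '-' ), "\n";
print flood_fill( 'abcde',   'xy' ), "\n";
```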

Fixed incorrect label on Gutcheck -t option. It was improperly labeled as the opposite of what it actually is. Oops. I never noticed because I always run in paranoid mode, which automatically enables all checks.

Added a "book mark" function to the spell check. If you are partway through spell checking a document, you can set a "book mark" which will allow you to come back to the same spot later and pick up where you left off. (Words which have been "Skip All"ed will continue to be skipped.) You don't NEED to start where you left off; by default it will start over at the beginning each time you restart. You can choose to set a book mark and return to it later. You can even set a book mark, then restart spell checking at the beginning and then start again at the book mark. Once you set a "book mark", it will remain until you set another.

Fixed a small problem with the tidy error check where it would have problems with going to the errors if you ran it several times in a row.

Putzed around with HTML auto generate, making the code it generates a little more XHTML compliant. Not completely there, but it's getting there.

Added an alternate mouse selection method. If you hold down the Alt key while selecting with the mouse, the selection will be confined to the rectangle defined by the anchor point vertex and the current pointer position. This isn't MUCH use yet, as all of the selection mode functions need to be converted to be able to use non contiguous selection segments. Right now though, if you have a block selected, and you hold down the Alt key and press one of the arrow keys, the block will move in that direction. (Note: don't try to move a selected block up or down through a line that isn't filled past the block. It won't do what you want.) Right now, this is best used with the table tool below to adjust the vertical alignment of header cells in a table after it has been rewrapped.

Added a whole new tool section for ASCII table munging.

When post processing table heavy texts, it is very common to get tables which have been bizarrely and inconsistently formatted by the proofers. Especially when you get multi page tables in which consecutive pages were done by different proofers. You could literally spend hours carefully and laboriously reformatting them, trying to fit them into the 75 space maximum allowed by PG (80 space absolute max). It was enough to make many people avoid texts with lots of tables like poison. Well, help has arrived.

Guiguts now has a special purpose ASCII Table Special Effects tool (Tfx in the tool bar). If you have a table that has vertical bar separators, you can now move the separators left and right, automatically rewrapping and maintaining cell layout in the columns. Your table doesn't have vertical bar separators? Not to worry. There are tools to easily and automatically add, remove and relocate them. Your table is too wide to fit within 75 spaces in a grid layout no matter how narrow you make your columns? Automatically reformat your grid format table as a stepped column table, which will typically allow at least 3 times the nominal width and still fit in 75 spaces. Don't like your stepped format table? Automatically reformat it as a grid layout table with 1 button press. To adjust the spacing of columns in a grid layout table, there needs to be a bar on both sides of the column being adjusted. Select the bar to the RIGHT of the column. (To select a bar, highlight any ONE segment of it and press Vertical Line Select.) Select whether to automatically rewrap the column as you move the bar and whether to left, center or right justify the column. Move the bar left or right, and the text in the column will automatically adjust to follow the bar depending on your settings.

The toolset doesn't do absolutely everything, but it probably takes about 85-90% of the effort out of table reformatting.

Say you have a table that looks like this: (An actual table from a book I am working on)

|              |             |         |AMOUNT OF|            |
|              |             |         |  WATER  | AMOUNT OF  |
|              |             |         | NEEDED  |   SUGAR    |
|              |CHARACTER OF | HOW TO  |   FOR   | NEEDED FOR |
|KIND OF FRUIT |    FRUIT    | PREPARE | COOKING |  JELLYING  |
|              |             |         |         |            |
|APPLES, SOUR  |Excellent for|Wash,    |Include  |¾ cupful of |
|              |jelly making |discard  |One-half |sugar to 1  |
|              |             |any      |as much  |cupful of   |
|              |             |unsound  |water as |juice       |
|              |             |portions,|fruit    |            |
|              |             |cut into |         |            |
|              |             |small    |         |            |
|              |             |pieces.  |         |            |
|              |             |         |         |            |
|APRICOTS      |Not suitable |Leave a  |For jam  |¾ cupful of |
|              |for jelly    |few      |use just |sugar to 1  |
|              |making.      |stones in|enough   |cupful of   |
|              |Excellent for|for      |water to |apricots    |
|              |jam.         |flavor.  |keep from|for jam     |
|              |             |         |burning  |            |
|              |             |         |         |            |
|BLACKBERRIES  |Excellent for|Wash     |1 cupful |¾ cupful of |
|              |jelly making |         |of water |sugar to 1  |
|              |             |         |to 5     |cupful of   |
|              |             |         |quarts of|juice       |
|              |             |         |berries  |            |
|              |             |         |         |            |
|BLUEBERRIES   |Excellent for|Wash     |1 cupful |1 cupful of |
|              |jelly making;|         |of water |sugar to 1  |
|              |make a sweet |         |to 5     |cupful of   |
|              |jelly        |         |quarts of|juice       |
|              |             |         |berries  |            |
|              |             |         |         |            |
|CRANBERRIES   |Excellent for|Wash     |One-half |¾ cupful of |
|              |jelly making |         |as much  |sugar to 1  |
|              |             |         |water as |cupful of   |
|              |             |         |berries  |juice       |
|              |             |         |         |            |
|CHERRIES      |Pectin must  |Pit the  |For jam, |¾ cupful of |
|              |be added for |cherries |use just |sugar to 1  |
|              |jelly making |for jam  |enough   |cupful of   |
|              |             |         |water to |cherries for|
|              |             |         |keep from|jam         |
|              |             |         |burning  |            |
|              |             |         |         |            |
|CRAB APPLES   |Excellent for|Same as  |One-half |¾ cupful of |
|              |jelly making |apples   |as much  |sugar to 1  |
|              |             |         |water as |cupful of   |
|              |             |         |apples   |juice       |


With a few button presses you can make it look like this: (adjustable wrap margin; set to 50 here.)

KIND OF FRUIT
    |CHARACTER OF FRUIT
    |    |HOW TO PREPARE
    |    |    |AMOUNT OF WATER NEEDED FOR COOKING
    |    |    |    |AMOUNT OF SUGAR NEEDED FOR
    |    |    |    |JELLYING
    |    |    |    |
APPLES, SOUR
    |Excellent for jelly making
    |    |Wash, discard any unsound portions, cut
    |    |into small pieces.
    |    |    |Include One-half as much water as
    |    |    |fruit
    |    |    |    |¾ cupful of sugar to 1 cupful
    |    |    |    |of juice
    |    |    |    |
APRICOTS
    |Not suitable for jelly making. Excellent for
    |jam.
    |    |Leave a few stones in for flavor.
    |    |    |For jam use just enough water to
    |    |    |keep from burning
    |    |    |    |¾ cupful of sugar to 1 cupful
    |    |    |    |of apricots for jam
    |    |    |    |
BLACKBERRIES
    |Excellent for jelly making
    |    |Wash
    |    |    |1 cupful of water to 5 quarts of
    |    |    |berries
    |    |    |    |¾ cupful of sugar to 1 cupful
    |    |    |    |of juice
    |    |    |    |
BLUEBERRIES
    |Excellent for jelly making; make a sweet jelly
    |    |Wash
    |    |    |1 cupful of water to 5 quarts of
    |    |    |berries
    |    |    |    |1 cupful of sugar to 1 cupful
    |    |    |    |of juice
    |    |    |    |
CRANBERRIES
    |Excellent for jelly making
    |    |Wash
    |    |    |One-half as much water as berries
    |    |    |    |¾ cupful of sugar to 1 cupful
    |    |    |    |of juice
    |    |    |    |
CHERRIES
    |Pectin must be added for jelly making
    |    |Pit the cherries for jam
    |    |    |For jam, use just enough water to
    |    |    |keep from burning
    |    |    |    |¾ cupful of sugar to 1 cupful
    |    |    |    |of cherries for jam
    |    |    |    |
CRAB APPLES
    |Excellent for jelly making
    |    |Same as apples
    |    |    |One-half as much water as apples
    |    |    |    |¾ cupful of sugar to 1 cupful
    |    |    |    |of juice


And with a few more button presses, you can change it back.

Now, for this table the step format isn't really necessary, but if your cells contain a LOT of text, the step format can be MUCH more efficient.

There are some things it doesn't deal with well. Cells that span multiple columns may (probably will) be problematic. You may need to reformat tables that have them, in sections. You will probably need to clean up some extra cell division bars at the bottom of converted tables. No big deal, just select and delete. If your table doesn't have a right edge bar, the routines may leave some spaces at the end of the line, again, there are already methods to fix that available.

I am quite pleased with the table tools. The table conversion tools, though impressive, were relatively easy to program. That was mostly just a matter of reading the text into a matrix, doing a transform and rewrap, and writing it back out again. The grid column adjusting tool was surprisingly complex to write. Every time the column moves a space, it needs to do four table matrix transforms and a rewrap operation on each cell in the column, while keeping track of individual cell heights in relation to other cells on the same row and adjusting so they don't collide when being written back out.




Version .446(537k) And arggh again. Should have let .445 bake a bit longer.

Fixed problem with bin file saving code where it was adding an extraneous ');' and causing the file to be unparsable by guiguts. This would cause you to lose your page markers if you closed the file and then reopened it. Not good.

Fixed problems with Auto Save and Auto Backup functions. Now they can be enabled and disabled properly and settings are retained session to session.

Added a custom paste routine. I call it "Overpaste". If your Insert/Overstrike mode is in Overstrike, the paste function will overwrite text as you paste it in.  I found this very handy when I was reformatting tables in ASCII. Using this function, I could set up an empty framework, and then cut and paste the text into the frame without disturbing the layout. (The text I was pasting in would overwrite the spaces.) I cut the time I was spending on reformatting tables by probably at least 30-40%  since I didn't need to constantly delete spaces after I pasted in text. The function is intelligent enough to check if it is near the end of a line and not overwrite the newline and start of the next line if the pasted text is longer than the text between the insertion point and the end of the line. In that case, it will overwrite characters until the end of the line and then just append the remaining text.
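
The end-of-line rule can be sketched in a few lines of Perl. This is an illustration only, with an invented function name, working on a single line held in a string rather than on the editor's text widget, and assuming the insertion column is not past the end of the line.

```perl
use strict;
use warnings;

# Overwrite $line starting at column $col with $paste. Characters up to the
# end of the line are replaced; whatever does not fit is simply appended.
sub overpaste {
    my ( $line, $col, $paste ) = @_;
    my $room = length($line) - $col;    # characters available to overwrite
    my $n    = length($paste) < $room ? length($paste) : $room;
    substr( $line, $col, $n, $paste );  # swap $n old characters for the paste
    return $line;
}

print overpaste( '|        |', 1, 'Wash' ), "\n";    # text fits inside the cell
print overpaste( 'abc',        1, 'WXYZ' ), "\n";    # text runs past the end
```

In the first call the pasted word replaces four of the spaces in the empty frame, so the closing bar stays where it was.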

Fixed a few more problems with HTML auto generate with italics markup in poetry sometimes going awry.

Fixed incorrect title text in Browser Start Command dialog window.



Version .445(531k) Arrgghhh! It helps if I actually decode the correct variable. Fixed problem where first line of text in a UTF-8 encoded file was not being decoded properly if there were any multi byte characters on the line.

Added Auto save function. Toggle it on and off under the Prefs menu. If enabled, it will automatically save at every interval. Also added an Auto save Interval Adjust function under the Prefs menu, to pick how many minutes pass between automatic saves when auto save is enabled. Set to 5 minutes by default; can be any integer from 1-999. If set to zero, it will revert to the default (5 minutes).

Now will check for and warn  when you attempt to Save As to a file name that will have a collision with its .bin file. In other words, if you have a file myfile.txt and want to generate an HTML version, you will typically  save a copy and work on that.  If you save it as myfile.html , there will be a collision of .bin files, since it just takes the part before the extension and changes to a .bin extension to name the bin.  Will now warn you if you attempt to do this.
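
The collision rule is easy to express in Perl. The sketch below is not taken from guiguts; the function names are invented and it only assumes what is stated above, namely that the .bin name is the file name with its last extension replaced by .bin.

```perl
use strict;
use warnings;

# Derive the .bin file name: drop the final extension (if any), add ".bin".
sub bin_name {
    my ($file) = @_;
    ( my $base = $file ) =~ s{\.[^./\\]*\z}{};
    return "$base.bin";
}

# Return the existing files whose .bin file would clash with $new_file's.
sub bin_collisions {
    my ( $new_file, @existing ) = @_;
    my $bin = bin_name($new_file);
    return grep { $_ ne $new_file && bin_name($_) eq $bin } @existing;
}

my @clash = bin_collisions( 'myfile.html', 'myfile.txt', 'other.txt' );
print "Warning: .bin file shared with $_\n" for @clash;
```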

Fixed problem with Goto Page and Goto Line dialogs where they would occasionally refuse to close.

Added  Auto Backups function. If enabled, will automatically save copies of the two most recent editions of the file. Assuming a file myfile.txt. When you save the file, if it is already saved to disk, the copy on disk will be renamed to myfile.txt.bk1 and the current file saved to myfile.txt. If you save it again, myfile.txt.bk1 will be renamed myfile.txt.bk2, myfile.txt will be renamed to myfile.txt.bk1, and the current file will replace myfile.txt. Any further saves will result in myfile.txt.bk2 being deleted, and the same shuffle as previously described for the remaining files. This function may be enabled or disabled under the Prefs menu.
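
The rename shuffle described above could be written roughly like this in Perl. Treat it as a sketch with an invented function name, not as the code guiguts actually uses; the demonstration runs in a temporary directory so it does not touch real files.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Save $content to $file, keeping the two most recent older copies as
# "$file.bk1" (newest backup) and "$file.bk2" (older backup).
sub save_with_backups {
    my ( $file, $content ) = @_;
    if ( -e $file ) {
        unlink "$file.bk2" if -e "$file.bk2";    # oldest copy is dropped
        rename "$file.bk1", "$file.bk2" if -e "$file.bk1";
        rename $file, "$file.bk1" or die "Cannot back up $file: $!";
    }
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} $content;
    close $out or die "Cannot close $file: $!";
    return;
}

my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/myfile.txt";
save_with_backups( $file, "edition $_\n" ) for 1 .. 4;
print "$_ exists\n" for grep { -e } $file, "$file.bk1", "$file.bk2";
```

After four saves, myfile.txt holds the fourth edition, .bk1 the third and .bk2 the second; the first edition is gone.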



Version .444(531k) Applied fix I had made to /P P/ poetry italics markup detection, to the /* */ block markup detection as well. It had the same problem of getting confused by markup that was inside the line.


Modified guiguts to add the BOM \x{FEFF} marker at the beginning of files that are saved in a multi-byte format. Should have been doing that from the beginning. :-( It is astonishing how much I don't know about Unicode files and formats, especially given that I'm allegedly developing a Unicode text editor. Should be more compatible now with other text editors. (Theoretically.) Thanks to garweyne for pointing this out.
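
For readers unfamiliar with the BOM, here is a short Perl sketch of writing one. It assumes UTF-8 output (the note above only says "multi-byte format") and uses an invented function name; it writes to an in-memory buffer so the resulting bytes can be inspected.

```perl
use strict;
use warnings;

# Write $text to $fh as UTF-8, preceded by the byte order mark U+FEFF.
sub write_with_bom {
    my ( $fh, $text ) = @_;
    binmode $fh, ':encoding(UTF-8)';
    print {$fh} "\x{FEFF}", $text;
    return;
}

my $bytes = '';
open my $out, '>', \$bytes or die "Cannot open in-memory file: $!";
write_with_bom( $out, "na\x{EF}ve" );
close $out;

# Show the encoded bytes in hex; the first three are the UTF-8 form of the BOM.
print join( ' ', map { sprintf( '%02X', ord($_) ) } split //, $bytes ), "\n";
```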


Fixed badly implemented bug fix to convert fractions routine. Now actually works again.


Fixed problem with setting file code where it would choke on file names with an apostrophe in the path/name. Would cause settings to revert to defaults for no apparent reason. Apostrophe is a legal character in file names, but was causing the settings file loading code to get confused. Rather tortuous and twisted method for fixing that, but it works at least. (On my system; I'm sure some other bug will manifest once it is in the wild. :-/)


Modified the auto table defaults slightly. No longer has borders set by default. I found that 95% of the time I was setting them to '0' anyway.  Increased cell padding a bit too. Another setting I was constantly modifying. Just because I like it better that way.


Added another custom assertion to the regex replacement parser. \C..\E - C is a mnemonic for "code". It will parse everything between the \C and \E as perl code, execute it and return whatever the results of the execution were. I found this useful when I needed to add an offset to a series of integers. When I autogenerated my HTML for a file, I had my page markers off by two; they were in the correct positions, just numbered incorrectly. I had done quite a bit of other tweaking to the file before I noticed, so was loath to abandon the work I had already done and regenerate the file. Using this regex assertion, I was able to search for the page anchors and increment them by 2 in about 5 seconds. (Well... not counting the 30 minutes it took me to write and debug the code assertion...)

Search for:
<a name='Page_(\d+)

replace with:

<a name='Page_\C$1+2\E

Very useful. You could also use this for more complex calculations too. Say you wanted to find all real numbers in a file(positive or negative, integer or floating point) and find the natural logarithm of the square of the number to 4 decimal places. (Why? I don't know, it's a hypothetical scenario, bear with me.)

Search for:
(?<![\.\d])([-\.]{0,2}\d(\d*)?\.?\d*)       <-- Find a real number

Replace with:
Natural log of $1 squared is \C sprintf("%.4f", log($1*$1)) \E       <-- Replace with calculated value

So 255 would return "Natural log of 255 squared is 11.0825"
and -2.56 would return "Natural log of -2.56 squared is 1.8800" and so on.

Note: If the operation yields an error or an undefined value, you will get an error message in the console window and a blank return value for the calculation.
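
To show how such an assertion can work, here is a stand-alone Perl sketch. It is an approximation, not the guiguts parser: the function name is invented, and it assumes the captured groups are substituted into the template as plain text before each \C..\E span is handed to eval. A failed evaluation prints a warning and yields an empty string, mirroring the note above.

```perl
use strict;
use warnings;

# Expand a replacement template for one match. @caps holds the captured
# groups, so $caps[0] plays the role of $1, $caps[1] of $2, and so on.
sub expand_replacement {
    my ( $template, @caps ) = @_;

    # Substitute $1, $2, ... with the captured text.
    ( my $out = $template ) =~ s/\$(\d+)/ defined $caps[$1 - 1] ? $caps[$1 - 1] : '' /ge;

    # Evaluate each \C...\E span as Perl code and splice in the result.
    $out =~ s{\\C(.*?)\\E}{
        my $value = eval $1;
        warn "code assertion failed: $@" if $@;
        defined $value ? $value : '';
    }gse;
    return $out;
}

my $text = "<a name='Page_12'></a> <a name='Page_13'></a>";
$text =~ s/<a name='Page_(\d+)/ expand_replacement( q{<a name='Page_\C$1+2\E}, $1 ) /ge;
print "$text\n";
```

Because eval runs whatever it is given, the caution below applies to this sketch just as much as to the real feature.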

This assertion should be used cautiously. It will execute nearly any valid perl code it is fed. If you set it with code to delete all the files on your hard drive and run it, it will cheerfully try to do so. As a bit of a safety, if you try to run regex code assertions, it will pop up a warning message asking if you really want to do so. If you do not know what the code does and/or do not trust the source of the regex term, click Cancel to escape without executing. Click OK to continue. You can also select Warnings Off if you like. It is sort of an "Expert Mode" and assumes you know what you are trying to do and take responsibility for the consequences.

Don't panic and be afraid to use this assertion; if you don't tell it to delete files, it is not going to. I just want to impress upon you that if you DO tell it to delete files, it WILL. Use judgment when  executing arbitrary code. (And no, I'm NOT going to give you an example of code to delete files, if you really want to know, read up about perl and figure it out yourself, I am NOT going to help you shoot yourself in the foot.)


Added a pop up Regex Quick Reference under the help menu. Actually loads a text file in  the guiguts directory  named regref.txt. Contains a summary of the assertions that are legal for use in the regex engine as implemented in guiguts. You can edit the file if you want too.  It can contain anything you want. A likely use would be a scratch pad to hold useful regexes that don't really belong in the scanno regex file.  (Like the above  custom regex to find a real number, for instance.)


Changed the Goto Page Number and Goto Line Number dialogs to highlight the value in the entry box when it pops up, so that if you type something, it automatically replaces the value already there. Simple to do in the Goto Page dialog; required overriding default methods and rewriting of code for the Goto Line dialog. A bit of a pain for a minor gain, but little things like that can be a major irritation after a while.





Version .443(524k) Fixed bug where margin prefs were being modified unintentionally by some program operations.

Fixed bug in See Page code where, if you had more than one page marker adjacent to each other, See Page would display the image associated with the FIRST marker in the series, not the LAST. (Typically only happened if you had a blank page in the text.)

Fixed oddity in HTML auto generate code where adjacent page markers would be inserted in random order.

Fixed bug in Internal Link pop up window where See Page Anchors check box was not retaining state.

Fixed potential problem in Convert Fractions routine where it would mistakenly convert part of some fractions. I.E. 1/25 would mistakenly be converted to ½5 (one-half 5). Fixed now.
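
One way to avoid that kind of partial match is to refuse any fraction that touches another digit or slash. The Perl sketch below illustrates the idea with lookaround assertions; it is not the guiguts routine, the function name is invented, and the table holds only a few sample fractions.

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';

# A few sample entries; a real table would list every supported fraction.
my %fraction = (
    '1/4' => "\x{BC}",      # one quarter
    '1/2' => "\x{BD}",      # one half
    '3/4' => "\x{BE}",      # three quarters
    '5/8' => "\x{215D}",    # five eighths
);
my $alternatives = join '|', map { quotemeta($_) } sort keys %fraction;

sub convert_fractions {
    my ($text) = @_;

    # An optional hyphen after a whole number ("1-1/2") is dropped. The
    # lookarounds reject a fraction that is glued to another digit or slash,
    # so "1/25" is left alone instead of having its "1/2" converted.
    $text =~ s{(?:(?<=\d)-)?(?<![\d/])($alternatives)(?![\d/])}{$fraction{$1}}g;
    return $text;
}

print convert_fractions('1-1/2 cups, 5/8 inch, 1/25 part'), "\n";
```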

Fixed a problem where some  arrays  were not getting cleared when you loaded a project. If you were working on a project that had footnotes and autogenerated HTML; then opened another different project with MORE footnotes, without restarting guiguts, and autogenerated HTML on the second project, it would freeze while generating the HTML.

Worked on trying to make Gutcheck view options sticky from run to run.  They are (and were) if you don't close the gutcheck window between runs. Made some modifications that will improve the situation but am not easily able to completely overcome the problem without major redesign of the view option code. (And I don't think it is serious enough a problem to invest major amount of time in.)

Worked on HTML autogenerate code for poetry. The way guiguts handles poetry is to enclose each line in a <span></span> block. However, if there are italicised portions of a poem that span lines, it was occasionally having trouble. The HTML spec does not allow italics markup to be improperly nested with other elements, so something like this is illegal:

<span><i>A line of poetry that is italicized</span>
<span>and here is another.</i></span>

The <i> markup can't be  continued across a <span>. To compensate for this, guiguts was detecting whether there was unclosed  <i> markup on a line and then inserting open and closing markup on each line until it got to a line that had an unopened </i>. That all worked, but it would get confused if there was  other separate italics markup on either the starting or ending line.

I've reworked the markup detection logic and made it a lot more resistant to getting confused by those types of situation.
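
The approach can be illustrated with a short Perl sketch. This is a guess at the general technique, not the reworked guiguts logic: it walks the italics tags of each line in order, so a complete <i>..</i> pair elsewhere on the line does not disturb the "still open" state. The function name is invented.

```perl
use strict;
use warnings;

# Wrap each poetry line in <span>..</span>, closing and reopening italics
# markup at line boundaries so that it never straddles a span.
sub balance_italics {
    my @lines = @_;
    my $open  = 0;    # true while an <i> from an earlier line is still open
    my @out;
    for my $line (@lines) {
        $line = "<i>$line" if $open;

        # Walk the tags in order; the last one seen decides the state.
        while ( $line =~ m{<(/?)i>}g ) {
            $open = $1 ? 0 : 1;
        }
        $line .= '</i>' if $open;
        push @out, "<span>$line</span>";
    }
    return @out;
}

print "$_\n" for balance_italics(
    '<i>A line of poetry that is italicized',
    'and here is another.</i>',
);
```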




Version .442(523k) Oops, left some debugging code active in the HTML autoconvert routine. Wasn't hurting anything, but it slowed down the routine by a significant amount.

Worked on link checker some more. Now only reports image links if there is a problem. Now checks that all the files in the image directory are used and reports any unused files. (Note: this will give some false positives if the images are not in their own directory. It is highly recommended to use an "images" directory if your file contains any images, even if it is just one or two.)

Modified built in link checker to work correctly with "tidied" files. It was fragile in that it expected to find each link or anchor completely on one line. Tidy often breaks long links and anchors across lines. This would lead to spurious errors being reported by the link checker. Fixed now.

Figured out how to stop tidy from echoing to the console, just need to redirect STDOUT to null. I didn't think of that before because I need to capture the output to get the error list. I didn't realize at first that the error list is sent to STDERR, not STDOUT  so redirecting STDOUT  works fine. Sometimes it really does pay to read the documentation... :-/ Tidy error check runs much faster now with the echo suppressed.

Found and fixed a long-standing bug in the HTML page anchor generation code where it would sometimes inexplicably skip adding some anchors if you had a page offset set.

Added another option to the HTML autogenerate window: Convert Fractions. If selected, it will automatically convert all written out fractions in the text to Named/Numeric entities while autogenerating HTML. I.E. 1-1/2 will become 1½, 5/8 will become ⅝ and so on, for all of the available Latin 1 and Unicode fractions (halves, quarters, fifths, sixths and eighths). The same function is available under the Selection menu in the main window. If you select some text and press Convert Fractions, all of the fractions inside the selection will be converted. If you DON'T make a selection it will work on the whole document.



Version .441(521k)
Modified search window to avoid weird growing problem under Linux.

Modified HTML header file load function to convert to native line breaks on load. Should alleviate  troubles under Linux while generating HTML version.

Fixed yet another buglet in overridden margin block wrapping code.

Tweaked Auto List code a bit to generate more useful code that should require less hand tuning afterwards. Fixed up unclosed paragraph markup for the previous line.

Added "t" hot key to the page separator fixup function, leave two blank lines  where separator was. Similar to "l" for one blank line or "h" for four. Very useful while PPing  cookbooks with 2 blank lines between recipes. (Wonder what I've been post processing? ;-) )

Modified named anchor routine to handle  named entities &amp; &quot; and &mdash; more gracefully.

Modified  "make anchor" replacement assertion \A..\E to not automatically add the original text back in. It is now necessary to explicitly add it back in. That allows adding tags around the displayed text in a single search and replace operation. For example:  Say you want to find all occurrences of "CHAPTER" followed by a roman numeral, make an anchor and make  the displayed text bold. With the old method, there was no way to do it in one step.  With the modified assertion, you can do a regex search for:

(CHAPTER )\s*([IVXLC]+)

and replace with

\A$1$2\E<b>$1$2</b>

For "CHAPTER    XVI" you would get: <a name='CHAPTER_XVI'></a><b>CHAPTER XVI</b>

This example is rather trivial, but the \A..\E assertion will correctly deal with punctuation, named entities, spaces and such automatically.
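
For comparison, the same search and replace can be mimicked in plain Perl. The make_anchor function below is a hypothetical stand-in for what \A..\E does; its handling of named entities and punctuation (entities become separators, punctuation is dropped, whitespace becomes underscores) is an assumption made for this sketch, not a description of the guiguts code.

```perl
use strict;
use warnings;

# Turn a piece of text into a named anchor.
sub make_anchor {
    my ($text) = @_;
    $text =~ s/&[a-zA-Z]+;/ /g;    # assumption: named entities act as separators
    $text =~ s/[^\w\s]//g;         # drop punctuation
    $text =~ s/^\s+|\s+$//g;       # trim both ends
    $text =~ s/\s+/_/g;            # runs of whitespace become one underscore
    return "<a name='$text'></a>";
}

my $line = 'CHAPTER    XVI';
$line =~ s{(CHAPTER )\s*([IVXLC]+)}{
    my ( $word, $numeral ) = ( $1, $2 );
    make_anchor("$word$numeral") . "<b>$word$numeral</b>";
}e;
print "$line\n";
```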

Modified built in link checker to display all of the critical things first (internal links without an anchor, external links, links with illegal characters) before displaying informational stuff (anchors without a link).

HTML link checker now does more comprehensive image link verification. Specifically warns if an image link name contains upper case characters (not allowed under the PG spec) or if an image file cannot be found.
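
A minimal sketch of those two image checks, in Python rather than guiguts' Perl. The function name, the warning wording and the regex used to pull out src attributes are all assumptions for illustration.

```python
import os
import re

def check_image_links(html, base_dir="."):
    warnings = []
    # Collect the src value of every <img> tag (quoted attributes only).
    for src in re.findall(r"<img\b[^>]*?\bsrc\s*=\s*['\"]([^'\"]+)['\"]", html, re.I):
        if src != src.lower():
            warnings.append(f"upper case characters in image link: {src}")
        if not os.path.isfile(os.path.join(base_dir, src)):
            warnings.append(f"image file not found: {src}")
    return warnings
```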



Version .44 (519k) Modified search routine to work correctly under Linux when searching across line breaks. Modified routine to be more robust and work seamlessly across platforms.

Fixed search and replace window to not delete text after the cursor in the search term or replacement term entry boxes if you use the enter key to do searches.

Modified search and replace entry boxes to be adjustable width instead of just 40 characters. Box is resizable in the X direction if necessary. Very handy when you are working with very long regexes. Quite pleased with getting this to work. Trickier to pull off than it appears.

Modified the various list windows (word frequency, gutcheck, link check, etc.) to standardize on double left click as the primary function, (search) and right click as the secondary function, (varies by window). I can't use single left click because  single left click is already used for select. It will be a little confusing for a while, but in the long run will be more user friendly.

Modified setting file loading code.  Before, if it encountered an error or something it couldn't parse while loading the settings, it would revert to defaults and then overwrite the  file containing the error with the default settings. Made it very difficult to troubleshoot when someone had a problem with the setting file. Happens rarely, but it happens. Now, if it has a problem, it will write out the original setting file to a file called setting.err before it overwrites with defaults.  That way you (I) can at least try to figure out what the problem was and probably recover the settings.
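
The "keep a copy before overwriting" logic might look roughly like this. It is a Python sketch, not the guiguts code: the JSON format, the function name and its parameters are assumptions made only so the example is self-contained.

```python
import json
import shutil

def load_settings(path, err_path, defaults):
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        pass  # first run: there is no file to preserve
    except (OSError, ValueError):
        # Unreadable or unparsable: keep the original for troubleshooting
        # (the role setting.err plays) before it is overwritten below.
        shutil.copyfile(path, err_path)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(defaults, fh)
    return dict(defaults)
```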

Made some additions to the HTML image window. Now allows you to maintain the aspect ratio automatically while changing the display size of images. Also displays actual size of image for reference. Slightly buggy in that if you highlight and delete either the width or height value, it will not start to calculate the other until you have at least two digits in the box you are modifying. (Done to prevent divide-by-zero errors. In practice, shouldn't be much of a problem.)
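
The aspect ratio arithmetic is simple enough to sketch. This is Python for illustration only; the function name is hypothetical, and the two-digit guard merely imitates the behaviour described above rather than reproducing the actual check.

```python
def scaled_height(width_text, actual_width, actual_height):
    """Return the height matching a typed width, or None while the entry is unusable."""
    # Assumption: wait for at least two digits before calculating anything.
    if len(width_text) < 2 or not width_text.isdecimal():
        return None
    return round(int(width_text) * actual_height / actual_width)
```

For an 800 x 600 image, typing a width of 400 would give a height of 300; a half-typed or empty entry gives no result instead of an error.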

Fixed yet another minor bug in the rewrapping code. If you overrode the block wrap margins, the first paragraph inside the block markup would use the standard rewrap margins instead of the overrides.

Added a "Multi line" option to both the Auto Table and Auto List HTML markup generators. In the auto table function, if multi line "ML" is checked, it will treat each paragraph of text as a single row, grouping everything that aligns vertically into a single cell. (In other words, you need to leave a blank line between "rows" of the table.) It can either use multiple spaces or vertical bars as the cell delimiter. Vertical bar delimiter is better if there are lines in the columns that are empty.

For example if you select this ASCII table and do an Auto Table:

    Carolina Poplar |100 ft.|Grows in a dry soil. Fastest growing street
                    |       | tree. Its dropping fruit is a nuisance.
                    |       | Sheds leaves early.

    Catalpa         |50ft.  |Lovely white blossoms in June. Seed pods
                    |       | stay on into winter. Quick growing.
                    |       | Good lawn tree.

    English Hawthorn|30ft.  |Flowers in June. Red berries. Grows on
                    |       | dry soils. Slow grower. Sharp thorns.

    Linden          |90ft.  |Easy to grow. Fragrant flowers. Rapid
                    |       | grower. European species smaller than
                    |       | American.

    Live Oak        |100 ft.|Not hardy in the North. Grows south of
                    |       | Virginia. Beautiful evergreen oak. Likes
                    |       | moist soil.

It would yield this, if "ML" is checked:

<table align='center' border='1' cellpadding='2' cellspacing='0' summary=''>
<tr><td align='left'>Carolina Poplar</td><td align='left'>100 ft.</td><td align='left'>Grows in a dry soil. Fastest growing street tree. Its dropping fruit is a nuisance.  Sheds leaves early.</td></tr>
<tr><td align='left'>Catalpa</td><td align='left'>50ft.</td><td align='left'>Lovely white blossoms in June. Seed pods stay on into winter. Quick growing.  Good lawn tree.</td></tr>
<tr><td align='left'>English Hawthorn</td><td align='left'>30ft.</td><td align='left'>Flowers in June. Red berries. Grows on dry soils. Slow grower. Sharp thorns.</td></tr>
<tr><td align='left'>Linden</td><td align='left'>90ft.</td><td align='left'>Easy to grow. Fragrant flowers. Rapid  grower. European species smaller than  American.</td></tr>
<tr><td align='left'>Live Oak</td><td align='left'>100 ft.</td><td align='left'>Not hardy in the North. Grows south of Virginia. Beautiful evergreen oak. Likes  moist soil.</td></tr>
</table>

Which would display as:
Carolina Poplar  | 100 ft. | Grows in a dry soil. Fastest growing street tree. Its dropping fruit is a nuisance. Sheds leaves early.
Catalpa          | 50ft.   | Lovely white blossoms in June. Seed pods stay on into winter. Quick growing. Good lawn tree.
English Hawthorn | 30ft.   | Flowers in June. Red berries. Grows on dry soils. Slow grower. Sharp thorns.
Linden           | 90ft.   | Easy to grow. Fragrant flowers. Rapid grower. European species smaller than American.
Live Oak         | 100 ft. | Not hardy in the North. Grows south of Virginia. Beautiful evergreen oak. Likes moist soil.

For best results you'll need to remove any leading or trailing vertical bars before you run the function.  Note: The vertical bars DO NOT need to line up in the ASCII table for the Auto Table converter to work correctly, they just need to be somewhere between the cells.

Multiple spaces work as separators as well, but cannot compensate for blank cells. You can only use multiple spaces as the separator if every line in every cell has something on it.
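
The multi line grouping with vertical bar delimiters can be sketched like this. It is a Python approximation, not the guiguts routine: the function name is made up, only the vertical bar delimiter is handled, and the table attributes are simply copied from the sample output above.

```python
import re
from html import escape

def auto_table_ml(text):
    rows = []
    # Each blank-line separated paragraph becomes one table row.
    for para in re.split(r"\n\s*\n", text.strip()):
        cells = []
        for line in para.splitlines():
            parts = [p.strip() for p in line.split("|")]
            # Grow the row if this line has more cells than the earlier lines.
            cells.extend([] for _ in range(len(parts) - len(cells)))
            for cell, part in zip(cells, parts):
                if part:
                    cell.append(part)
        tds = "".join(f"<td align='left'>{escape(' '.join(c))}</td>" for c in cells)
        rows.append(f"<tr>{tds}</tr>")
    head = "<table align='center' border='1' cellpadding='2' cellspacing='0' summary=''>"
    return "\n".join([head, *rows, "</table>"])
```

Since cells are matched by position between the bars rather than by column offset, the bars do not have to line up, and an empty stretch between two bars just contributes nothing to that cell.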

Auto List ML is similar in that it treats each paragraph as a single list item. Essentially, it uses blank lines to denote item breaks rather than line breaks.


Added a basic interface to HTML Tidy. HTML Tidy is a very comprehensive HTML checker/corrector. It would be almost impossible (and rather pointless) for me to duplicate the functionality in guiguts. However, it has a command line interface, which can be difficult for people unused to one. This eases the interface at the cost of some restricted customization. If you have an HTML file you want to check, open the file, open the HTML window and click on the HTML Tidy button near the bottom. It will run tidy on the open file and generate an error and warning report similar to the gutcheck report. You can double left click on an error to go to that error, and right click on an error to remove it from the list. Note: tidy works on the open file; you don't need to save it before running tidy. There is also an option to have tidy automatically fix the file. If you select this option, it will apply all of the changes and save the result to a file with a "tidy." prefix. The open file WILL NOT BE CHANGED. For example, if your file is named "myfile.html" and you run the tidy modify function, myfile.html will not be changed; the changes will be written to the file tidy.myfile.html. The tidy.myfile.html file WILL NOT be compatible with the page/proofer notation from the .bin file, because the tidy process will throw off the indexes.

When you run tidy on the file, it will echo the file to the console as it is parsing it, which slows down the process quite a bit. For large files and slow computers it may take several tens of seconds to work its way through the file. Note: This is somewhat dependent on which version of tidy you are using too. I haven't yet figured out how to suppress the echo during processing. The -quiet option doesn't seem to do anything.

I did not include a copy of HTML Tidy in the download package. Visit the Tidy project page to download the executable (or source, if that's your thing,) appropriate to your platform.



Version .436 (513k) Arrgh. The bug fix in .433 that fixed the validation fault bug was itself buggy. It inserted the hex ordinal of the character instead of the decimal ordinal. (Basically, it inserted the wrong character: &#27; when it should have been &#39;.) Now fixed. Only a problem if you have image links with alt or title text that has apostrophes or single quotes in it.
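
The difference between the two ordinals is easy to see in a few lines. This is a Python illustration, not the guiguts code, and the function names are made up.

```python
def decimal_entity(ch):
    # Decimal character reference: the apostrophe is ordinal 39.
    return f"&#{ord(ch)};"

def hex_entity(ch):
    # The same ordinal in hexadecimal is 27, which is only valid with the "x" marker.
    return f"&#x{ord(ch):X};"
```

Writing the hex digits without the "x" (as in &#27;) refers to decimal ordinal 27, which is a different character altogether.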

Version .435 (513k) Added spell check dictionary select option under Prefs menu so you don't need to run a bogus spell check before you can change dictionaries.

Added a few hot keys to the word frequency window to allow saving of the word lists: Ctrl+s and Ctrl+x. Ctrl+s will Save the contents of the word frequency window exactly as it is displayed (with the overview and word counts) to a file, by default named wordfreq.txt in the same directory as the original text file. (May be changed if desired.) Ctrl+x will eXport the contents of the word frequency window to a file that only contains the actual words in the list, in the order displayed, new-line separated; no frequency counts, no overview line. Exports to a file, by default named wordlist.txt in the same directory as the original file. (May be changed if desired.) At this time, the export function will strip the asterisks off of "suspect" words in the lists (if there are any in the current list). I debated just removing "suspect" words completely from the exported list, but figured this is probably more useful. If someone has a differing opinion, I would be interested to hear it. The focus must be on the Word Frequency window to use these hot keys. If the focus is on the main window, the actions associated with the hot keys for that window will be performed instead (Save and Cut). Probably not a great idea to overload the hot keys, but there are just too many functions and not enough keys. I put the hot key codes in the title of the Word Frequency window so it would be easier to remember them.


Version .434 (512k) Fixed bug in block wrapping code where if there were multiple paragraphs within one block, the first paragraph would be indented one more space than the following paragraphs.


Version .433 (511k) Fixed bug in image viewer initialization code where it would look in the default directory for image files if it existed, even if a different directory was selected in the file preferences.

Fixed bug in generated HTML image links where apostrophes were not being escaped properly in alt and title properties. W3C validator would complain about them.


Version .432 (510k) Fixed very aggravating bug introduced in .43 where script would occasionally lock up while selecting text or rewrapping.

Modified rewrap routine to automatically convert non-breaking spaces to regular spaces while rewrapping. I had set up the wrapping routine to honor non-breaking spaces under the assumption that if you used them, you wanted to maintain a certain layout. However, someone has been converting all spaces to non-breaking spaces during proofing, and it really interferes with wrapping unless you happen to notice and change them manually.

Added hot key 'v' (view) to the page separator function. Will open the image viewer to the current page. Handy for doing quick checks of ambiguous paragraph breaks at page breaks.

Trapped warning that would come up on the console  window during the page separator  function if you pressed one of the buttons (or hot keys) while in full auto and not waiting for input.

Some functions write temp files to the disk while they are running. If  there was a problem where the file couldn't be written or read, the script would just silently and mysteriously fail. (Usually space or permissions problem.) Am now explicitly checking  that read and write operations are successful and popping up a warning message if they aren't.



Version .431 (510k) Fixed the normal bug I introduce fixing some other bug. When rewrapping a text, the rewrap margin for standard text would be shifted right by one space after a block rewrap block. Now returns to the correct indent (normally zero for standard text, but it could be set to something else).

Fixed  behavior under Linux where window was expanding and contracting when the line numbers were toggled on and off.



Version .43 (510k) Worked on Italic/Bold Word Frequency routine to have more accurate counts on the non-marked up phrases. Still not perfect, in a few unusual circumstances it can still be off, but it is much more accurate in general.

Added an optional line number bar to the main text window. Right click on the line number / column in the status bar at the bottom to toggle it on and off. Note: having line numbering on will significantly slow down functions that scroll the display (Rewrap, Fix-up, etc.). Tracking and updating the line numbers adds a fair amount of overhead. It is probably a good idea (though not strictly necessary) to toggle the line numbers off while performing one of those functions.

Tweaked page separator fix-up code to recognize and work with markup that is split over a page break. If a page ends in a closing markup and the next starts with an opening markup, it will delete the extraneous markups.

Fixed stupid bug in rewrap code. When I changed the parser to recognize block markup without a leading blank line, I accidentally introduced a bug. It was adding a spurious blank line before all block markups that DID have a leading blank line already. Doh! Fixed now.

Cut out some debugging code I accidentally left in the previous release.



Version .424 (504k) Sigh. Fixed bone-headed error in Guess Page Markers function where it was not left padding the page names from 26-99 with zeros.


Version .423 (504k) Added Word Frequency Marked up phrase search. Sort out words / phrases with italic or bold markup and display them and similar words / phrases without markup. WARNING: Counts  for the unmarked phrases may be inaccurate, especially if they cross line boundaries. Right click on the Ital/Bold Words button to change the phrase word limit. Marked up phrases with word counts above that threshold will not be included. Default threshold is 4. If a phrase crosses a line boundary, the threshold may be off by one.

Fixed regex search results that cross line boundaries to be highlighted correctly under Linux and OSX.


Version .422 (504k) Fixed bug in Word Frequency spell check where it was always using the English dictionary no matter which dictionary was selected.

Added the currently selected dictionary to the spell check window title bar to make it easier to keep track of which is selected.



Version .421 (503k) Modified spell checking code to not be confused by unrecognized abbreviations that contain a number. (1er, though a valid abbreviation in French, was causing problems in Aspell.)

Changed spell checker to clear highlighting and word list when it is re-run while open and when changing dictionaries.

Bound F7 key to spell checker. Pressing F7 while the focus is in the main window will run spell check on the selected text or the entire document if no selection is made.

Added Word suggestion mode to Aspell options. Customize how aggressively Aspell looks for possible replacement words.

Modified suggestion list header to report how many suggestions were returned. Allows easy comparison of the various suggestion modes.

 

Version .42 (502k) Fixed small problem where if there was a hyphen at the end of the last line in a block rewrap block, it was moving the last word down on to the same line as the block end marker. A fairly rare problem, but annoying when it happens.

Worked on getting guiguts to play nicely with the upcoming .99 version of gutcheck. Lots of new functionality in gutcheck, which required some fairly substantial tweaking of the interface code. Expanded view selection list to include ALL of the possible error queries that gutcheck can emit, including some that were in earlier versions but were very low frequency. Rearranged view selection list to be in alphabetical order rather than the haphazard order it was in before. Makes it easier to find a specific selection in the list. (NOTE: Gutcheck .99 has not yet been released. I have been working with a somewhat buggy beta version. Guiguts will still work with (and ships with) gutcheck .981, but when .99 becomes available, it should drop in without modification.)

Played around with the proofer messaging code a bit. Now when you press "send message" in the proofer pop-up window, if no proofer name is highlighted, it will open a generic send message window. If a proofer name IS highlighted, it will open that proofer's profile page. (From which you can send a PM, but which has lots of other useful information too.) Modified it a bit to work with proofer names that contain spaces or non-alphabetic characters. They aren't common, but they DO exist. You can just select an entire line and the proofer name will be extracted. (Triple left click on the proofer name.) Selecting the entire line won't work in sort-by-page view, as both rounds' proofer names are on each line. You'll need to highlight just the name of the proofer to whom you wish to send a message.

Added another optional status bar to the bottom of the screen. If you are working on a DP file that has the proofer names in it, and right click on the See Proofers button in the status bar, another bar will open that displays the names of the proofers of the current page, updated as you move through the file. (NOTE: if you move page markers around, this will be thrown off. It works by matching up the current page marker with the proofer names stored in a hash indexed by page number. Once the page markers are moved or reindexed, the markers will no longer correspond to the correct hash entry. The pop-up proofers window will always be accurate.)

Reworked sort order for proofer names to be case insensitive. It was sorting in ASCII sort order (Capitals before lower case).  Now does true alphabetical sorting.
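
The difference between the two orderings, illustrated in Python (the function name is hypothetical and this is not the guiguts sort routine):

```python
def sort_proofers(names):
    # Compare on a case-folded key so capitals no longer sort ahead of lower case.
    return sorted(names, key=str.casefold)
```

A plain sort compares character codes, so every name starting with a capital would come before any name starting with a lower case letter.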

Fixed a bunch of problems with the link checking code. It wasn't finding links that were broken over multiple lines. (Uncommon for guiguts auto generated code, but common in many HTML editors and tidied files.) Was erroneously reporting links with forward slashes as errors. (Was supposed to be back slashes.)

Fixed small problem in rewrapping module where it was miscounting line lengths that contained bold HTML markup. Not a big deal, but it could affect rewrapping slightly if you had lots of bold markup.

Modified block rewrapping functions to work even if there ISN'T a blank line on either side of the /# #/ markup.  Will only use default indents in this case though.

Added some features to the search function. If you have a word selected in the text, and click Search (or Ctrl+f), the search box will load the selected word and search for it. Since the search function searches first on selected text (a previously requested feature), it will find the selected word first, then search the file past where the word was selected. You can manually set the cursor where you want it to start searching if desired. If you have text that extends over one line selected, only the first line of the selection will be used as the search term. The rest will be truncated.

Added a "Start at Beginning" selector in the Search window to restart the search from the beginning (or end, if searching reverse) of the file.

Added Ctrl+f as a hot key combo to the search window. Does the same thing as the search button or Enter key. Now Ctrl+f in the main window starts the search function and Ctrl+f in the search window searches for the next instance. Just because I felt like it.

Tried to make HTML autogenerate more resistant to errors caused by customized header files.

Modified block rewrapping slightly. It would ignore a first line indent override if the indent was set to zero. (Fairly obscure) Fixed. Also modified so that if several paragraphs were included in a block rewrap, and a first line override was given, each paragraph in the block would get the override instead of just the first.

Worked on page marker routines to be a little more tolerant of  four digit page numbers and work smoothly at boundaries (before first page marker and after last.)

Added an Insert button to the page marker adjust functions. Will automatically make room for a page marker if necessary and add it at the cursor. (The Add button will only  insert a marker if there is already room for it.)

Finally got around to installing Linux on one of my systems and worked on trying to iron out a few of the peculiarities guiguts exhibits on Linux.

Got external programs to run in non-blocking mode under Linux. Necessitated extensive rewrite of the external program calling code, and the external program spawning script. Changed the name of the external script from runner.pl to spawn.pl. For some reason, the name runner.pl seemed to confuse a lot of people and I got quite a few questions about it.

Attempted to trap right click menu error in text window under OSX and Linux. Still am not able to get a right click to pop up a menu like under Windows, but at least it doesn't just crash the program any more when you right click.

Modified the ", Upper" and ". Lower" frequency searches to work correctly (or, at least, reasonably correctly) under Linux and presumably OSX.

Modified Unicode routine to give a little more indication of what is going on under Linux. The Linux X font server takes a long time to fault out when a particular character is not implemented in the selected font. If there are many unimplemented characters, it can seem like the program has hung. At least now you have an indication every 3-4 seconds that it is still working. Not a problem under Windows. Apparently the Windows font manager routines are much faster in dealing with unimplemented characters.

Added another preference to the Prefs menu. Set Browser Start Command. For Windows, it is probably best to leave it as 'start'. That will start whatever your default browser is. Otherwise, enter the full path to the executable. This will allow non-Windows users to customize the browser start command for their OS.



Version .411 (499k) Sigh. I had made some changes to the link generation code and changed the link searching code to find the newer links, but broke searching for older style links in the process. Fixed.


Version .41 (499k) Fixed minor problem in word frequency, ". Lower" search where searching on the results would occasionally return unexpected results under some circumstances.

Added a "Surround Selection With..." function. Replaced "Insert _ _ Around Selection" function. It does the exact same thing except the text that is inserted around the selection is editable. (It is still _ _ by default.) I was contemplating adding a bunch of single purpose functions to add various markups, but realized this was much more flexible (and less aggravation in the long run).

Fixed File Open dialog parameters for Linux thanks to a bug report and patch submitted by Gregory Margo. Probably will fix nagging problem under OSX too.

Worked on HTML internal link function a bit, to be better about guessing the anchor you are trying to link to. Much better about finding exact matches, much better about finding not exact, but close matches. When you select some text to be an internal link, it will try to find a named anchor that has that exact wording, case insensitive, and punctuation removed. Next, it will find all of the named anchors in the file that have some of the words in the selected text (excluding "the", "a", "and" and "to") and list them. Finally, it will list ALL of the named anchors in the file. If you have fewer than a hundred or so named anchors, this is probably overkill, but I just completed a project with nearly 1800 named anchors and 3900 internal links and this was a real sanity saver.

Added a rudimentary internal link checker to the HTML fixup page. This will find all of the named anchors, internal & external links and image links in your file and will list: totals of each; links without anchors (this is a critical error, you've got an internal link that goes nowhere); anchors without links (informational, you will probably have several, especially if you have included page anchors); all image links (check the image links to make sure they are all lower case to comply with Gutenberg standards, and that they are all relative, not absolute links; it will warn if there ARE any image links with upper case characters); and external links (this is probably an error unless you have a multi-volume, cross-linked text; normally, external links are frowned upon in Gutenberg texts). It will also list any links that have any spaces or backslashes in them (or their numeric equivalent) ' ', '%20', '\', '%5C', which are almost always a mistake.
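
The anchor and link bookkeeping can be sketched as below. This is a Python approximation and not the guiguts checker: the function name, the result keys and the regexes are assumptions, and image links are left out for brevity.

```python
import re

def check_links(html):
    # Named anchors and hrefs; [^>] also matches newlines, so tags that
    # are broken over several lines are still found.
    anchors = set(re.findall(r"<a\b[^>]*?\b(?:name|id)\s*=\s*['\"]([^'\"]+)['\"]", html, re.I))
    hrefs = re.findall(r"<a\b[^>]*?\bhref\s*=\s*['\"]([^'\"]*)['\"]", html, re.I)
    internal = {h[1:] for h in hrefs if h.startswith("#")}
    return {
        "links_without_anchor": sorted(internal - anchors),  # critical
        "anchors_without_link": sorted(anchors - internal),  # informational
        "external_links": [h for h in hrefs if not h.startswith("#")],
        # Spaces, backslashes, or their %-encoded equivalents.
        "suspicious_links": [h for h in hrefs if re.search(r" |%20|\\|%5C", h, re.I)],
    }
```

Reporting the first key ahead of the second matches the idea that a link going nowhere is an error while an unused anchor is only informational.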

Fixed (or at least worked around) annoying bug where the HTML image function would not let you edit the title or alt text until you shifted focus away from the window and back. (By the simple expedient of automatically shifting focus away when it is opened.)

A bunch more useful patches from Gregory Margo:

Modified various word frequency routines to not include page separators in the word counts.  (I was of two opinions about this but went ahead and added it.)

Modified spell checking to skip page separators. (This was definitely useful. I can't think of any reason to spell check the page separators.)

Modified spell checking process handling to be more Linux friendly. Dictionary handling is much cleaner. (Also helps under windows and presumably OSX as well. No more zombie processes hanging around after you run spell check. Yay! :-) )

Big thanks to Gregory!

Made spell check re run automatically if dictionary is changed. No longer necessary to close and re open spell check.

Fixed a rather egregious error in the spell check function. The Change All button would change ALL of the words queried by the spell checker to the first suggested replacement. Almost definitely not what you would expect (and not what I intended.) Now will change all occurrences of the PRESENT word to the selected replacement.



Version .403 (493k) Arrgh. Neglected to escape forward slash in line 6389.

Version .402 (570k) Fixed problem in spellcheck where it wasn't replacing the misspelled word completely.

Changed the poetry HTML autogenerate to use markup that would be friendlier to non CSS aware browsers. Also changed <br> and <hr> to use XMLish markup; i.e. <br /> & <hr />.

Changed Unicode information file to use a save method that will be compatible with big-endian byte order OSs (OSX).


Version .401 (493k) Changes I made to fix regex \n search in files with multi byte Unicode characters broke \n searching in files without them. Inserted a switch statement in the code to use appropriate character counting scheme depending on whether the file contains multi byte characters or not.

Fixed minor problem in search popup where selecting Regex wouldn't automatically unselect Whole Word. The Regex and Whole Word are mutually exclusive. Setting both will never return any results.


Version .40 (493k) Version .40 needs to use the perl runtime libraries version 3 prl03.zip (5772kb).  If you already have prl02, you can update to the prl03 package by just downloading the prl03update.zip file (52kb) and unzipping it in the directory where your current prl directory is located. (Less than one hundredth the size.)

Finally worked out a way to get more intuitive sort order for Latin-1 accented characters in the word frequency routines without breaking Unicode compatibility. Not as elegant as I would have liked. Basically, brute forcing the sort routine. Nearly doubling the time the sort takes. Still, the trade off is acceptable IMO. Except on very large files or very slow computers, the sort still only takes a few seconds.

Added two more buttons to the word frequency routine to check for instances of a comma followed by an uppercase character or  a period (full stop) followed by a lower case character.  These will find all instances of  these whether they cross line boundaries or not. If  they DO cross a line boundary, the newline will be represented by "\n". You will be able to search for these using right and left clicks in the word frequency window, terms that have a newline in them \n may take a few seconds to find the first one when using right click. You may still need to manually check for paragraphs ending with a comma. Here is a nice regex to do so:
',("?\n{2,})' => '.$1'
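
For anyone wanting to try that regex outside guiguts, an equivalent substitution in Python might look like this (the function name is made up; Python writes the back-reference as \1 where the replacement above uses $1):

```python
import re

def fix_comma_paragraph_ends(text):
    # A comma, an optional closing double quote, then two or more newlines:
    # the comma becomes a period and the rest is kept unchanged.
    return re.sub(r',("?\n{2,})', r'.\1', text)
```

Blindly replacing every hit is only safe after reviewing the matches, which is why the search is worth stepping through by hand.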

Added an Initial Caps sort function to the word frequency window by special request. Sorts out all of the words in the file that have the first letter capitalized, and at least one other non upper case character.

Twiddled around with the sort and parse logic for many of the word frequency functions to try to speed them up a bit.

Removed the Re Sort button from Word Frequency window. It was contributing to the large memory footprint significantly, due to the state information I needed to keep in memory to know WHAT to re sort. It has been replaced with All Words, which is more useful, in my opinion. It allows you to get back to the full list without having to do a full search and count sequence.

Added Unicode > FF button to the word frequency window. Sorts out all words that contain characters over hex FF. (Outside of Latin-1).  The sort order is by ordinal for characters over FF so the display order may seem unintuitive.
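
The filter itself is a one-liner in most languages. A Python illustration (hypothetical function name, not the guiguts routine):

```python
def words_above_latin1(words):
    # Keep words with at least one character beyond U+00FF (outside Latin-1).
    hits = [w for w in words if any(ord(c) > 0xFF for c in w)]
    # Python compares strings by code point, so this is an ordinal sort.
    return sorted(hits)
```

Sorting by ordinal is why, for example, a word starting with a Greek letter lands after one starting with "œ": the order follows the code charts rather than any alphabet.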


Rewrote the Search Stealth Scannos routine to no longer use recursion. It wasn't really hurting anything, but it would pop up warnings about it on the console if it recursed more than 256 levels. (Which was pretty easy to do.)

Tweaked the search code to scroll the end of the found term completely onto the screen on search. It was set up to scroll the beginning of the found term onto the screen, which was fine for single line searches, but multi line searches would often end up with the found term being half on, half off the screen. This will ensure that the found term (unless it is exceedingly long) will be completely visible after a search.

Added the number of words left to check in the spell check pop up window title bar. That is, the total number of words in the unrecognized list, not just the unique words. If you Skip All or Add To Dictionary, the count will be reduced by the number of times the word appears in the list of misspelled words.

Added a tool bar. Put some of the often used routines on it. I'm not really sure if this is necessary or even particularly desirable, but it was interesting to play with. I am open to suggestions as to which functions should be accessible through the toolbar. If you don't like or want the tool bar, disable it under the Prefs menu. You can drag the toolbar to the side of the window you want it to dock on, or drag it onto the desktop to use it as a floating widget. Select the side you want the toolbar to  start on under the prefs menu.

Fixed error in stealth scanno editors' save code where it was incorrectly escaping backslashes for the hint index terms.

Fixed file save functions to save text with Unicode characters as Unicode, and text without as Latin-1. Was saving text with Unicode as a bizarre blend of Latin-1 and Unicode. Had to rework the gutcheck functions to feed it the bizarre blend, as it doesn't like Unicode AT ALL. As a result, gutcheck will no longer save the file when you run it. (Actually, it does, but it saves it to a temp file, runs gutcheck on THAT, then deletes it again.)
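
The save decision can be sketched as "try Latin-1 first, fall back to Unicode". This is a Python illustration; the function name is hypothetical, and UTF-8 as the Unicode encoding is an assumption, since the text above only says "Unicode".

```python
def encode_for_save(text):
    try:
        # Every character fits in Latin-1: save one byte per character.
        return text.encode("latin-1"), "latin-1"
    except UnicodeEncodeError:
        # At least one character is outside Latin-1: save the whole file as Unicode.
        return text.encode("utf-8"), "utf-8"
```

Encoding the whole text one way or the other is what avoids a mixed file.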

Had to reconfigure the spell checking code to deal with the new file save code a bit. Aspell  0.50.03 doesn't handle Unicode characters well at all. I did some work arounds for the word frequency spellcheck routine, but the main spell check will choke on words with non Latin-1 characters. Apparently Aspell version 0.60 is due to be released soon, and that has Unicode support built in. It is available in beta, but you must build both it and the dictionaries yourself, so it is not really recommended yet. As soon as it is officially released, I will probably change over to it.

Modified HTML character code to use <center></center> markup for centered images. Apparently the align="center" attribute is not very well supported.

Modified the file save logic to let you save a file, even if you have made "no edits". There are quite a few functions now that bypass the undo buffer and so don't raise the "edited" flag. As a result, there are many occasions when you might want to save the file, even if guiguts doesn't know it has been edited. This change allows you to do so without having to make a bogus edit to set the flag.

Fixed the regex engine so that searches for terms containing newlines work correctly with files that contain multi byte Unicode characters. It was blindly counting every byte as a character, so it would get out of sync when there were Unicode characters in the text.
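To show why counting bytes as characters goes wrong, here is a small Python sketch. The sample text and the helper name `find_offsets` are hypothetical, and UTF-8 is assumed as the byte encoding.

```python
def find_offsets(text, needle):
    """Return (character_offset, byte_offset) of `needle` within `text`."""
    char_offset = text.find(needle)
    byte_offset = text.encode("utf-8").find(needle.encode("utf-8"))
    return char_offset, byte_offset

# One two-byte character (the i with diaeresis) sits before the search term.
text = "na\u00efve \u03bb\u03cc\u03b3\u03bf\u03c2\nnext line"
needle = "\u03bb\u03cc\u03b3\u03bf\u03c2\nnext"
print(find_offsets(text, needle))
```

The two offsets differ by one for each extra byte that precedes the match, which is the kind of drift the fix removes.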

Went through code trying to tighten up variable scoping and reduce memory footprint. Guiguts, by its nature, is a memory hog. Don't know if I accomplished much, but *I* feel better.

Added an ordinal readout to the bottom status bar. Displays the ordinal of the character just to the right of the cursor in both decimal and hexadecimal. If you click on it, it will toggle showing the name of the character as well.
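A readout of this kind can be sketched in Python as follows. The display format and the function name `ordinal_readout` are assumptions made for illustration; the actual status bar layout is not specified here.

```python
import unicodedata

def ordinal_readout(ch, show_name=False):
    """Format the ordinal of `ch` in decimal and hexadecimal, optionally with its name."""
    code = ord(ch)
    readout = f"Dec {code}, Hex {code:04X}"
    if show_name:
        # Fall back to a placeholder for characters that have no Unicode name.
        readout += ", " + unicodedata.name(ch, "<no name>")
    return readout

print(ordinal_readout("A"))
print(ordinal_readout("\u00e9", show_name=True))
```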

Fixed problem where thought breaks could get munged during rewrap under certain fairly rare circumstances.

Tweaked spell check and word frequency routine to not NEED to have files saved before they can be run. For best results, you SHOULD save a file that has been edited before you run them, but now, if you just want to run a quick spell check or word frequency analysis on a new, unsaved file, you can.


Version .39 (485k) Added some functions that are not supported by my  first release of the perl runtime libraries. Since I had to release a new prl, I took the opportunity to update Tk to a newer version with several important bug fixes. You will need to update to prl02.zip to be able to run version .39 (Or install Tk804.026, Image::Size and Tk::Toolbar in your local perl package.) Guiguts version .39 WILL NOT WORK with prl01.

Heavily modified the image handling code for HTML generation. It is still not automatic, but much more of the work is done for you. Added a button to the HTML fixup window, "Auto Illus Search", that will scan through the file and semi-automate HTML image code insertion. Any text inside [Illustration: ...] markup will be placed in an alt=" " attribute, and you will be able to see/adjust the size, alt, title, and alignment properties. Once you select an image file, a thumbnail image will be placed at the bottom of the Image Selection box. Click on the thumbnail to change the file. *Note: file names will be displayed as absolute in the Image Selection window. As long as the image file is located in a sub folder or the same folder as the HTML file, it will be converted to a relative path name when it is inserted. The thumbnail size will vary somewhat with different image sizes. Tk does not have any easy mechanism for scaling images smoothly to any ratios besides 1/n, where n is a positive integer, which makes it difficult to scale them to an exact size. For simple thumbnail previews, where it is not critical what the actual size is, as long as you can see it, it works ok though.
 
Tweaked the Unicode -> Beta code & Beta code -> Unicode functions under the Greek transliteration a bit to give more consistent results. Added the Greek letters with a tonos accent to the translation code. Tonos is essentially the modern equivalent of acute accent in ancient Greek. There is some overlap between characters with tonos and characters with acute, but there are a few unique combinations too. Added them for completeness.

Rewrote the character builder nearly from scratch. Expanded the capabilities quite extensively. Made it able to generate all Greek characters. Just type in the character you are looking for into the character builder box and the corresponding Greek character will appear in the box next to it. Press enter to accept the character. If you press enter with an empty character builder box, it will place a line return in the Greek text window. If you press backspace in an empty character builder box, it will backspace one in the Greek text window. You can add a space by adding a space in the builder box and pressing enter. To get a terminating lowercase sigma, type in "s " (s+space) in the character builder box. (Or you can just put a standard sigma and then do a back and forth transliteration with the ASCII or Beta code buttons.) You can get a ô (transliteration for omega) by typing o^ in the builder box. Likewise, ê (eta) can be obtained by typing e^. You can also use H & h as aliases for upper and lower case eta and W & w as aliases for upper and lower case omega in the character builder.

Rewrote the menuing code for Unicode character windows. Made it much more compact and maintainable. Added the hexadecimal character range that each block covers for those times when you know a characters index number but not which block it is in. Made the sort order of the Unicode menu selectable by block range name (default) or by block range index.

Tweaked named character conversion during HTML autogenerate to convert strings of  em dashes properly.

Fixed a bunch of other minor buglets.


Version .382(482k) A bunch of minor tweaks.

Tried to fix spell check in word frequency to recognize words with Latin-1 accented characters again. Broke Unicode compatibility in the process. Seems they are mutually exclusive right now. Oh well, right now, Latin-1 spell check is more important to me than Unicode spell check.

Fixed oddity in page separator routine where it would open a different page than you would expect when you pressed See Image. Now opens the page that you are currently working on.

Worked on page marker moving and reindexing routines. Made them a lot less fragile. It was very easy to break them before. They are much more error resistant now.

Worked on internal link sorting code to work better with numerical link names. Really improved the ease of index hyperlinking. Fixed the hide footnote and hide page number link view options.

Note: now that the page marker reindexing tools are working rather well, you can hyperlink a fairly large index in a very short time. You need to make sure your page markers are aligned with the actual pages in the text. (Time consuming, but worth it.) Double check the numbers in the index against the original. (Very time consuming, but necessary, I've found. :-( ) Make sure you select "Insert Anchors at Pg #s" when you auto generate, then do a regex search and replace '(?<!\d)(\d{1,3})' => '<a href="#Page_$1">$1</a>' in the index (use (\d{1,4}) if your text uses 4 digit page numbers). Press enter to search for the next page number and Ctrl+Enter to replace and search. Once the preliminary setup is done, you can hyperlink an index in minutes.
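For readers who want to try the index substitution outside guiguts, here is the same pattern as a Python sketch. Python writes the replacement back-reference as \1 where the Perl-style replacement above uses $1; the sample index line and the function name `link_page_numbers` are hypothetical.

```python
import re

# Same pattern as in the note: up to three digits not preceded by another digit.
PAGE_NUMBER = re.compile(r'(?<!\d)(\d{1,3})')

def link_page_numbers(index_line):
    """Wrap each page number in a link to the matching Page_N anchor."""
    return PAGE_NUMBER.sub(r'<a href="#Page_\1">\1</a>', index_line)

print(link_page_numbers("Aardvark, 12, 145"))
```

With the three digit pattern, a four digit number such as 1234 would only have its first three digits linked, which is why the wider \d{1,4} form is suggested for longer books.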

Added /f f/ to the rewrap marker cleanup routine.

Fixed problem in HTML auto generate where it was not adding all of the poem styles if there were multiple levels of indent beyond 4. Fixed syntax error in generated indent styles.

Fixed problem where autogenerated TOC entries would sometimes end up with markup inside them.

Tried to trap a few more instances where orphaned markup could be generated. Still not perfect, but I'm chipping away at it.


Version .381(480k) Fixed self inflicted gunshot wound to foot. When I made changes to word sorting routines to allow for Unicode characters, I broke the code so it wasn't allowing words with mixed characters and digits, thus rendering the Mixed AlphaNum word frequency absolutely worthless. Now repaired. Thanks Aria!

Version .38(480k) Version .38 .bin files are NOT backward compatible with previous versions. I have slightly modified the formats of  a few stored variables. The only one that would probably be a problem is the page marker hash. Earlier versions used pg001 as their key format, it has been changed to Pg001, uppercase first char. .38 will automatically convert earlier versions forward. If you want/need to go back, you may need to manually reset the page marker hash to use the earlier format. (I didn't do this lightly, it really made other things much easier having done this.)
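The key format change amounts to upper-casing the first character of each page marker key. A minimal Python sketch, assuming the markers live in a plain dictionary (the real .bin layout is not shown here, and the positions below are made up):

```python
def upgrade_page_keys(markers):
    """Return a copy of `markers` with old pg001 style keys renamed to Pg001."""
    upgraded = {}
    for key, position in markers.items():
        if key.startswith("pg"):
            key = "Pg" + key[2:]
        upgraded[key] = position
    return upgraded

# Hypothetical marker table: key -> text position.
old_markers = {"pg001": "1.0", "pg002": "40.0"}
print(upgrade_page_keys(old_markers))
```

Going back to an earlier version would need the reverse rename, which is the manual reset mentioned above.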

Reworked Title Case function to work a little better with quoted words.

Started working on method to view and adjust page number markers after the page separators have been removed. Right click on Page # in status bar to toggle visible page numbers. Click on a page marker to adjust the placement. You can also adjust the offset of the various page markers and add and remove page markers. The functions TRY to disallow undesirable results. They will not allow you to add a page marker if the page markers on either side of it are only 1 apart. You can INCREASE the offset by (nearly) unlimited amounts, but it will only decrease offsets until any gap is closed, no matter what you enter for the decrease offset amount. (IE Say you have page markers that range Pg004, Pg005, Pg009, Pg010. You can decrease the offset of marker Pg009 by -4. If you try to decrease it by -10, it will only do -4.) When adding page markers, place the cursor where you would like the mark to go and press Add. It will look at the previous page marker, increment it by one and place it at the cursor as long as it doesn't already exist.
Right now there is no mechanism to change the png image names to match the changed page marker names, and I doubt that there will be one in the near future, at least, not one that is tied directly to the page marker adjust functions.

Changed named anchor generation code to strip apostrophes from the link name. Apparently, some code checkers were complaining about it, though the W3C validation service seemed to have no problem with it. (On further testing, W3C seems to ignore it sometimes and complain others. Ah well, removed anyway.)

Added two functions under the Selection menu, Convert To Named/Numerical Entities, Convert From Named/Numerical Entities. These will convert selected text in the main window to and from HTML encoded text. Note: these functions will NOT add or remove any HTML markup, they only convert named and numeric entities to and from HTML style encoding.

Worked on Word Frequency Accent check to try to catch more variations of  spellings in the suspicious category.  Semi successful.

Tweaked rewrap code a bit to try to compensate for possible interference between rewrap markers and page markers.

Fixed problem with /F F/ code not terminating during rewrap.

Added some more buttons to the Greek Transliteration window; ASCII->Greek and Greek->ASCII. These will take the selected text in the Greek transliteration window text box (or all of it, if none is selected) and try to transliterate it using the rules from the site. (Except using ê for eta and ô for omega to distinguish them from epsilon and omicron.) (As an aside, this is a variation of the Perseus system for transliterating Greek.) These functions are not perfect, and shouldn't be used blindly, but they will do perhaps 95-98% of the work in transliteration. The U/Y transliterations are especially suspect, since they are somewhat subject to interpretation. This means the text is not 100% reversible: if you plug in some English text and transliterate it back and forth, you most likely WILL NOT end up with exactly what you started with. These functions are somewhat inefficient, and probably shouldn't be used on chunks of text larger than about 10-20 K at a time. For small passages, they work ok though. The punctuation transliteration is also suspect and may be changed after some testing and feedback. **Note: the Greek auto transliteration functions will only be available if you have Tk 804.025 installed in your perl libraries.

Added two more buttons to the Greek transliteration window. Unicode->Beta & Beta->Unicode. These implement a subset of beta encoding to allow more detailed markup of accented Greek if you should desire to. For unaccented characters, the transliteration is the same as the Perseus method (What we use on the site and guiguts has used up to now.)  Beta encoding provides a method to preserve the accents. There are basically eight accents that you need to deal with for Greek, they are detailed below: (You will need a Unicode aware font to view the examples in the chart.)

Popular name                                       Greek name       symbol   example   encoded
rough breathing mark                               dasia            (        ἁ         a(
soft breathing mark                                psili            )        ἀ         a)
acute                                              oxia             /        ά         a/
grave                                              varia            \        ὰ         a\
iota subscript                                     prosgegrammeni   |        ᾳ         a|
tilde (or inverted breve, depending on the font)   perispomeni      ~        ᾶ         a~
diaeresis (rare)                                   dialytika        +        ϋ         y+
breve (rare)                                       vrachy           =        ᾰ         a=
macron (very rare)                                 macron           _        ᾱ         a_


To encode a character in beta code, transliterate the base character as normal. Then, starting from the highest point, working from left to right, place the symbols for the various accent marks after the base character. Stack as many accent symbols as needed to make the character. IE: ᾭ would be Ô(/|. There is a utility box at the bottom of the Greek transliteration window to help assemble accented Greek characters. Select the base character and accents you want from the list and press enter to place the character in the transliteration window. **Note: for purists, this is not EXACTLY beta code, as beta code uses uppercase letters for all Greek letters, which, by default, means lowercase. :-? To encode an uppercase letter you are supposed to precede the base character with an asterisk. I elected not to do this, as we already have too many overloads for asterisk. The accent encoding is pretty close to standard beta encoding though.
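The encoding scheme above can be tried out with a short Python sketch. It is deliberately not the method guiguts uses: it appends Unicode combining marks to the base letter and lets NFC normalization pick the precomposed character. The `BASE` table is a small hypothetical subset of the transliteration rules, and the function name `beta_to_unicode` is made up for the example.

```python
import unicodedata

# Hypothetical subset of base letters (ê for eta and ô for omega, as in the notes).
BASE = {"a": "α", "e": "ε", "ê": "η", "i": "ι", "o": "ο", "ô": "ω", "y": "υ"}

# Accent symbols from the chart, mapped to Unicode combining marks.
ACCENTS = {
    "(": "\u0314",   # rough breathing
    ")": "\u0313",   # soft breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "|": "\u0345",   # iota subscript
    "~": "\u0342",   # perispomeni
    "+": "\u0308",   # diaeresis
    "=": "\u0306",   # breve
    "_": "\u0304",   # macron
}

def beta_to_unicode(encoded):
    """Convert one encoded character such as 'a(' into a Greek character."""
    base, marks = encoded[0], encoded[1:]
    letter = BASE[base.lower()]
    if base.isupper():
        letter = letter.upper()
    combined = letter + "".join(ACCENTS[mark] for mark in marks)
    # NFC composes the letter and its marks into a single code point when one exists.
    return unicodedata.normalize("NFC", combined)

print(beta_to_unicode("a("), beta_to_unicode("y+"), beta_to_unicode("Ô(/|"))
```

If no precomposed character exists for a combination, NFC leaves the remaining marks as separate combining characters rather than failing.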

The character builder is not yet capable of dealing with combining characters to make Unicode characters, it only indexes combination characters which are already defined in the Greek and Coptic and Extended Greek ranges of Unicode. So it is possible, (quite easy in fact,) to come up with a diacritical combination that it can't handle. If you want to look at the characters that it can handle, open the Greek and Extended Greek windows under the Unicode menu.

The character builder is somewhat of a pain to use, (hence the name, Character builder, by using it you build character(s). :-) ) I haven't been able to come up with a better way yet though.

Fixed a few problems with sidenote conversion during HTML auto generate that could hang the process.

Fixed aggravating bug in rewrap code in pre release that was leaving control characters in the text.


Version .372(467k) Fixed problem where under certain circumstances, the Goto Page pop up would be unclosable.
Fixed problem where /F F/ markup was not being recognized.
Fixed problem with unclosed paragraphs before /* */ blocks during HTML auto generate.

Version .371(467k)
Under certain circumstances, if you opened and worked on more than one file in a guiguts session, without restarting, the bin files could get cross contaminated, resulting in spurious page markers and bookmark markers.

Version .37(467k)
Started trying to wrap my head around OOP in perl (I don't need any snide comments from you python people either. :-\ ) Derived my own class of text widget and overrode the Load and Insert methods to deal with Unicode characters. No longer any need to patch the default perl packages to be able to load files containing Unicode.
Wrote some new insert and delete methods which bypass the undo buffer. I am using them in the HTML auto generate functions, since they work 2-3 times as fast as the methods which manipulate the undo buffer. They also cut down on memory usage quite a bit. The caveat? (You knew there was going to be one...) SAVE YOUR FILE BEFORE YOU RUN HTML AUTO CONVERT! You can no longer undo back out of it.
Fixed problem with auto generate poetry line numbers inside /p p/ markup.
Rewrote a bunch of the header.txt header file after discussion with some users. You are still free to modify it locally, I just changed to more logical defaults.
Added markup and auto generate code for sidenotes. Thought I had done this before, but I guess it never got into the distribution package.
Added tool tip pop-ups to the Unicode windows. The tool tip displays the decimal and hexadecimal ordinals of the Unicode character the pointer is over.
Added feature to the Unicode tool tips. Now also pops up the name of the character as well as the decimal and hexadecimal index. Will pop up tool tip with information, even if that particular character is not implemented in the font you are using. *Note: this feature is somewhat resource intensive, and will cause Unicode windows to load slower the first time one is called in a guiguts session. If you want to disable it, search for the file 'Unicode' in the guiguts directory and rename it to something else. You can rename it back if you want to re-enable it again.
Made guiguts automatically convert files to native line endings on file load. *Note: just converts the file in memory, not the file on the disk.  When you save the file, it will be saved with native line endings.
Fixed missing anchor closing markup in page anchors.
Added a summary attribute to the auto table generator function in HTML fixup.
Propagated icon to all sub windows that get opened in guiguts. It was only being shown in main window.
Added a unique identifier to footnote anchors allowing you to reuse the same footnote numbers/letters multiple times in one file. Useful if you set up your footnotes index to start over every chapter.
Cleaned up a whole bunch more poor code. Broke a bunch of stuff in the process.  :-(  Went back and fixed everything I broke. (hopefully...)
Changed the binding for Control-A to be Select All, like nearly all Windows programs. Control-/ will still select all too.
Did some more work on HTML auto generate. Poetry inside footnotes should now be converted correctly.
Made word frequency routine aware of non-breaking spaces. Listed as *nbsp*. It knew they were there before, it just didn't know what to do with them.
Modified word frequency's Mixed Case sort to be Unicode aware.
Added a way to check to see if you have duplicate anchors in your HTML file. Duplicate anchors can cause problems for browsers and will make validation checkers complain. In HTML fixup, without selecting anything in the text, press Internal Links. It will check through the text for duplicate link names and throw a warning if it finds any. If you don't get a warning, your file is Ok.
Modified Latin-1 popup window to be selectable to enter either the literal character or HTML named entities. Added pretty much the rest of the printable Latin-1 characters to the chart while I was at it. Non-breaking space and non-breaking hyphen are the only ones not there.
Added the pop up tool tips to the Latin-1 Chart too.
Added /p p/ and /# #/ searches under the Search menu.
Added /p p/ search to the orphans search pop-up.
**Note: Gutcheck will complain about /p p/ poetry markup. "Paragraph starts with lower case." The easy work around is to use /P P/  (upper case P) instead. The markups are identical as far as guiguts is concerned.
I have removed, (for the time being, at least) the Korean, Japanese ideograph characters (CJK Unified Ideographs) from the Unicode popup choices. There are over 21000 characters in the chart, and trying to load it,  with the method I was using, was sucking up all available memory and crashing the program. If we need to start using Korean / Japanese ideographs, I'll need to figure out a different way to display them.
Added another special markup /f f/ or /F F/ - front material. Should only be used at the front of a text around the title, author, publishing data etc. In text rewrap it is treated just like /$ $/; IE no rewrap, no indent. In HTML autogenerate, it will allow the standard title and author markup, but will center everything else within the block. Not strictly necessary, but a time saver.
Modified Replace All function in Search and Replace to work with selections. If you have a selection of text, Replace All will only operate within the selection. If you don't have a selection, it will Replace All on the whole file.
Fixed problem with checkfil.chk temp file not being deleted when spell check was done.
Played around with the status bar looks a little bit. I like the changes. Made the line number, page number & mode in the status bar respond to left mouse clicks. Pops up the Goto Line, Goto Page windows & toggles mode, respectively. Changed the Images and Proofers buttons to match.
Made some changes to the word frequency filtering routine to make it better able to deal with non English texts.
Fixed the Alphabetical sort order in word frequency to ignore case when sorting. It had gotten changed due to programming modifications I had to make to accommodate Unicode characters, and was yielding unexpected results.
Added a "Suspects Only" option to the Word Frequency window. Will filter result to only show suspect word pairs in searches that return them. You will need to re run the particular search if you change the state of Suspects Only, Re Sort does not understand the filter.
Have added another preference setting, "Auto Set Page Marks On File Open". This has been the default behavior for several versions (and is STILL the default setting); you now just have the option of turning it off, if desired. For very long texts, turning it off will speed file load by a significant amount. If your file already has page markers set, you don't need to reset them every time the file is loaded. I would recommend leaving it on by default, unless you are working with VERY large files. You can still run the set page markers routine from under the File menu.
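The duplicate anchor check described in the Version .37 notes above can be approximated outside guiguts with a short Python sketch. The regular expression is an assumption about what the generated anchors look like (name or id attributes on a tags, double quoted), and the function name `duplicate_anchors` is hypothetical.

```python
import re
from collections import Counter

# Assumed anchor shape: <a ... name="X"> or <a ... id="X">, double quoted.
ANCHOR = re.compile(r'<a\s+[^>]*?\b(?:name|id)\s*=\s*"([^"]*)"', re.IGNORECASE)

def duplicate_anchors(html):
    """Return a sorted list of anchor names that occur more than once."""
    counts = Counter(ANCHOR.findall(html))
    return sorted(name for name, count in counts.items() if count > 1)

sample = '<a name="Page_1"></a> text <a name="Page_2"></a> more <a id="Page_1"></a>'
print(duplicate_anchors(sample))
```

An empty list corresponds to the "no warning" case in the entry.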

Version .363(342k) Fixed a major foofoo in HTML orphans check. Seemingly minor change had major ramifications.

Version .362(339k) Fixed a bug in the external commands routine that would cause some commands not to work inexplicably.
Modified HTML generating code for poetry markup to avoid some fairly obscure errors.
Added a Versions command under Help menu, mostly for easy troubleshooting. Reports on most relevant version numbers.
Added an icon. :-) I included a gg.ico in the distribution package that you can use for a Windows icon should you so choose. Also available as gg.gif
Rewrote major portions of search code. Exact same functionality. Much more compact and readable.
Fixed some other really bad code. Hopefully, close scrutiny of the source will now only induce nausea instead of making you violently ill.

Version .361(338k)
Fixed bug in fixup routine where it was removing  a character after a thought break.
Fixed obscure problem where rewrapping would sometimes yield odd results at page boundaries if page separators were removed manually (select and delete) rather than using the page separator tool, or if there were lots of blank pages in a text.
Fixed problem where file names with multiple full stops could result in truncated (wrong) file names for the project info and project dictionary files.
Fixed bug in .bin file save code that was causing problems on unix based OSs. Was making faulty assumptions about path separator character.
Fixed a bug where regex searches with a newline character in them would only return one match on any particular line. On the face of it, that doesn't seem like it would be much of a problem. You would think that, if a search term has a newline in it, it should not occur more than once on any single line. However, if you searched for a term that MIGHT have a newline, it would ALSO only return the first occurrence on any one line. A relatively obscure, but quite annoying bug.
Modified \n assertion special regex search code to allow forward and reverse searching like the regular searches. Regex searches with newline assertions will now respond to directional searches.(Previously they were forward only.)
Changed the HTML autogenerate option for adding page numbers as anchors, to ONLY add them as anchors, rather than adding the page numbers AND anchors.
Changed the Unicode pop up window code a bit to have less scrolling, not sure if I like the change or not.

Version .36(337k) Downloaded, compiled and installed beta 14 of PerlTk 804.025 and started forward porting. Unicode, here I come.
Went through the script making changes to bring it up to spec for tk804. Most of the changes necessary are due to tightening of rules for variable type definitions and parameter naming conventions. Everything I have had to change so far is backward compatible with tk800. Tk804 is much stricter about type mismatches.
Had to rewrite recent files and external functions menuing code to work under tk804. After many hours of frustration, gave up on elegance and went for brute force. On reflection, I probably should have done this in the first place, since it made it possible to merge a lot of special purpose menu handling code into one general purpose menu handling routine. Cleaned up a lot of code in the menu building function to make it more uniform and readable (and smaller, incidentally).
Rewrote the Greek transliteration tool nearly from scratch. What I had worked, but it was bulky and hard to maintain. Tore out nearly 2000 lines of code and replaced it with under 200, with the same functionality.
Added option to output utf-8 encoded characters to the Greek transliteration tool. (Only available if you have tk 804 installed.)
Went through and did some cleanup of inefficient code (space wise). Broke apart several functions into sub functions that can share a lot of code.
Battled the word sorting functions to make them work with a mixture of Latin-1 encoded letters and utf-8 encoded letters. Think I have reached an uneasy truce, though it needs to be tested across platforms.
Added Drag 'n Drop capability for opening files. Open up a guiguts instance and open up an explorer window. Drag a file from the explorer window to the guiguts window. Voila!
Fixed bug where a single line inside /* */ markup would not get indented by the rewrap function.
Fixed bug where Fix-up was adding an extraneous space between strings of hyphens and double quotes.
Fixed a few problems with the auto generated footnote markup that wasn't passing the W3C validator.
Added a function to convert utf encoded characters to numeric HTML entities during HTML auto generate.
Fixed single space indent to not add spaces to blank lines that are part of the selection, thus avoiding adding "space at end of line" errors.
Fixed problem with aspell.conf file update where dictionary changes were being appended to the end instead of overwriting previous settings.
Fixed word frequency em dash check to ignore HTML comments.
Fixed problem where named entities in auto generated TOC were not being converted during auto generate HTML.
Fixed /# #/ marked up text to be enclosed in appropriate HTML markup during auto generate HTML.
Modified spell check in word frequency to allow you to add words to the project dictionary. Highlight a word in the word frequency window and press Control+Right Mouse Button to add it to the project dictionary and redisplay the list.
Added a new experimental markup for poetry; /p .. p/ (or /P .. P/). During text operations (rewrapping), it is treated just like /* .. */ markup except the default indent is hard coded to 4 spaces rather than being adjustable. During HTML auto generation, text enclosed in /p .. p/ markup will use special poetry markup similar to the markup Jon Ingram proposed for the Mirror periodicals. In order for the /p .. p/ markup to be converted properly to HTML there MUST be a minimum of four spaces of indent (the default if you wrap the text with that markup).
Added lots of Unicode character popup charts. Somewhat buggy in that they rely on the display font having the characters implemented. If characters aren't implemented, they display as an empty box. Every character that is not implemented takes a short while to time out when it is trying to display, so if there are a LOT of unimplemented characters in the block of Unicode you are trying to display, it may take a long time for the pop up window to show up. In the meanwhile, the program appears to be frozen up.  ***Note: Unicode functions will only be available if you have Tk804 installed on your system.
Tracked down problem where a section for rewrapping that had three or more blank lines following it would throw up a bunch of warnings on the console window.
Fixed problem where rewrapping a file that had three or more blank lines at the end would fall into an uninterruptable loop.
Rewrote part of  Separator Fixup code to work around problems in Tk804 that were causing it to run at glacial speeds.
Fixed problem with fix that would cause part of page separator to remain if you opted to insert page numbers as HTML comments.
Fixed bug in spell check code that was causing spell checker to appear to fail when it encountered an unrecognized word with underscores.
Added "Goto Page..." function to search menu, right under "Goto Line...". Much as you might suspect, allows you to jump directly to a page. Will only work if your file has page markers set.
Fixed problem in auto generate HTML code that would add extraneous bold markup to already bolded text in the generated TOC.
Added a function to quickly enter a thought break "       *       *       *       *       *" under fixup menu. Will automatically add it at the end of the line the cursor is in.
Overrode the default binding for Control-f to make it run the search function instead of moving the cursor right one space. Control-f and Control-F are equivalent.
Added binding for Control-S to also save the file (in addition to Control-s.)
Made many of the pop up windows stay on top. (At least, on top of guiguts.) Works under windows, may not work cross platform. A drawback is that the popup windows no longer have their own process on the task bar. However, raising any of the popups will raise all of them. Also added an option under preferences to enable or disable this feature.
Worked on /$ $/ block handling in HTML auto generate to try to eliminate odd errors.
Fixed problem with page separator code locking up if there were several page seps at the end of a file with nothing between them.
Fixed problem with scannos directory path being DOS formatted for all OSs.
Finally fixed misspelled Interrupt Rewrap button that has been misspelled for an embarrassingly long time.
Added another popup window to the footnote fixup function that will display and highlight footnote anchor pairs that are suspected to have a problem.
Changed the default for footnote fixup to be out-of-line.
Rewrote font choosing code. Now all fonts that are available on your system are offered as options in a drop down list.
Made Unicode pop up windows have selectable and resizable fonts. Unicode window and main window do not need to use same font.
Fixed oddity in word frequency window where words with an emdash would have incorrect word counts.
Changed right click context menu pop up to mirror the main menu. It has always been there, but it was popping up the default menu which was confusing at best.
Added option to output HTML numeric entities directly from the Unicode pop up windows.
Fixed scrolling behavior in several popup windows.
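The conversion of utf encoded characters to numeric HTML entities mentioned in the Version .36 notes above can be illustrated with a few lines of Python. The cut-off is an assumption: this sketch leaves everything inside Latin-1 alone and converts only characters above U+00FF. The function name `to_numeric_entities` is hypothetical.

```python
def to_numeric_entities(text):
    """Replace each character above U+00FF with a decimal numeric HTML entity."""
    return "".join(ch if ord(ch) < 256 else f"&#{ord(ch)};" for ch in text)

print(to_numeric_entities("caf\u00e9 \u03bb"))
```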

Version .35(259k) Fixed boundary condition error in page separator fixup code.
Worked heavily on HTML functions. Added div and span buttons, worked on file parsing code, and tried to automate title and author markup. Made it generate a Table of Contents link list. Added some more styles to the header file to deal with page numbers and poetry line numbers. Changed line number code to display line numbers as well as make anchors.
Added a pop up status window while auto converting, to let user know that it hasn't frozen, it is just working.
Fixed bug in recent files list where files that were renamed and saved were not being added to recent files.
Fixed boundary condition bug in recent files list too.
Expanded recent files to 10 instead of five.
Modified special \n regex searches to start from scratch each time you cycle through the file. The way it was, it only actually searched for ALL of the instances of the assertion the FIRST time you searched on it. It would save all of the hits to an array of indexes and then cycle through the array. It still does that, but now it resets the search each time through so that any edits you may have made to the file will be picked up the next time through. Before, it wasn't compensating for edits and kept pointing to the same indexes, even if they had been changed.
Fixed bug in guess page markers code that was preventing it from running.

Version .341(260k) Realized I forgot to change the version number in the title bar from pre.34. Ooops. Oh well. Now you've got .341
Added default indent setting for /* */ markup. Set the default indent under Prefs->Set Rewrap Margins. The local overrides will still work ( /*[6] .. */  ) but now you can set a default other than 0. If you want to not indent, set default to zero.
Added \t tab interpolation to the replacement text for regex search & replace. Any \t will be replaced by a tab in the replacement text.
Fixed a bunch of problems that were causing trouble for perl/Tk 804 series as reported by khandy. I'll need to take care of them eventually since I plan to make the move myself in the relatively near future. (The 804 series supports Unicode much better.)
Fixed the problem with extracting \n assertions for replacement substitutions while searching with \n regex assertions. IE, '(\W)to([\s\n])he(\W)' => '$1to$2be$3' will now work correctly. Quote from .34 release notes: "You can not capture newlines for replacement terms. That kinda sucks, but I don't see an easy  way around it (right now at least)." Turns out I needed to add one (one!!!) character to the script to get it to work......  No easy way around it indeed. Ah well..... It works now anyway.
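For illustration only, here is roughly what that substitution does, sketched in Python rather than in the Perl that guiguts is written in. The function name fix_to_he is made up for the example, and Python spells the replacement groups \1, \2, \3 instead of $1, $2, $3.

```python
import re

def fix_to_he(text: str) -> str:
    """Replace the scanno "to he" with "to be", even across a line break."""
    # Group 2 captures whatever separates the two words (a space or a
    # newline) so the replacement can put the same character back.
    return re.sub(r"(\W)to([\s\n])he(\W)", r"\1to\2be\3", text)

sample = "He wanted to\nhe alone, or to he left in peace."
print(fix_to_he(sample))
```

The line break between "to" and "he" survives the replacement because it is captured and written back, which is the part that did not work in .34.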


Version .34(259k)
Added check for possible emdash errors to Word Frequency hyphen check. Will now list all of the words that contain a hyphen, all of the words that are identical except that they DON'T have a hyphen and (new) all the words that are identical except that they contain an emdash (two hyphens).
Added a "Check  Emdashes" function to the Word Frequency window. This is somewhat misleading. An emdash phrase is, by definition, not a word, so having it under word frequency is not absolutely accurate. I was forced to allow a lot of non-word characters through to cut down on unacceptably high false positive error reporting.  The addition of this function probably adds about 10-15% to the initial processing time of the word frequency routine, but it is useful enough to justify it, I believe.
Added an option under the Prefs menu to disable the automatic highlighting of pairs of quotes, single quotes, brackets and parentheses, if desired. Some people found it distracting.
Fixed bug in File open code where if you already had a file open that had edits, and opened another file, it was discarding the edits without notice and opening the new file.
Reformatted the Greek Transliteration tool from a vertical to a horizontal layout. Removed whitespace from character images to reduce display size of tool. Redid the "Rough Breathing" accents to be easier to discern. They looked too much like acute accents. Changed the HTML code generation to produce the correct accented character codes. Added a text box to the bottom of the Greek tool. Assemble the Greek phrase in the transliteration tool, then transfer it over to the main editing window. Either cut and paste, or place the cursor in the main window then hit "Transfer".
Made a bunch of minor improvements to the spell check handling code. Due to the way the spell check interacts with guiguts you could get odd results if you have made a bunch of unsaved edits before running spell check. To prevent this, I made the spell check code save the file if you have any unsaved edits when you run spell check. Just something to be aware of.
Added ability to check for mismatched European style angle quotes « » (guillemets) to the Find Orphaned Markup and Brackets. It is not really intuitive, it being there, but it easily uses the same search code, so it was a win as far as programming goes.
Added "Middle Dot", char code 0183, "·" to the Latin-1 character tool.
Added an "Edit" button to the stealth scannos window. Add, delete or edit the regexes and/or regex hints for the currently open scanno file. Everything is keyed off of the regex search term. Right now if you want to edit a regex, you need to add it, then delete the original. If you only edit the replacement term or hint, just press add when you are done editing. Save will write the changes permanently to the disk. Cancel will drop you back to the scannos window where you can use and test any edits you have made. CHANGES WILL NOT BE SAVED TO DISK UNTIL YOU HIT SAVE FROM THE EDIT WINDOW. If you want to revert, hit Cancel then rerun Stealth Scannos. That will reload the regex file from the disk.
Added a "term count" to the stealth scanno window to help you keep track of how many regexes are available and where you are in the list.
Changed sort order of regex scannos to be in alphabetical order. It doesn't really make sense, but at least it is guaranteed to be in that order. Hashes are not guaranteed to be in any particular order, leading to oddities in how the terms are presented  from search to search.
Radically improved search for words with accented characters in them by changing the border condition assertions used in the searching code. Instead of using \b to detect word borders (which gave incorrect results for accented characters), I switched to (?<!\p{Alpha}) to detect a leading border and (?!\p{Alpha}) to detect a trailing border. These assertions work exactly like the \b assertion for unaccented characters but, unlike \b, will also work for accented characters.
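A rough sketch of the difference, in Python rather than Perl and not the actual guiguts code. Two assumptions to note: Python's built-in re module has no \p{Alpha}, so a letter is spelled [^\W\d_] here, and the re.ASCII flag is used to imitate a \b that does not know about accented letters. The function names are made up for the example.

```python
import re

# Assumption: a single letter is written [^\W\d_] because Python's
# built-in re module does not support \p{Alpha}.
LETTER = r"[^\W\d_]"

def count_with_b(word: str, text: str) -> int:
    """Count hits using \\b with ASCII-only word characters."""
    return len(re.findall(rf"\b{re.escape(word)}\b", text, flags=re.ASCII))

def count_with_lookarounds(word: str, text: str) -> int:
    """Count hits using letter-based lookarounds as the word borders."""
    pattern = rf"(?<!{LETTER}){re.escape(word)}(?!{LETTER})"
    return len(re.findall(pattern, text))

text = "caf café cafés"
print(count_with_b("caf", text))            # also counts the hits inside café and cafés
print(count_with_lookarounds("caf", text))  # counts only the standalone word
```

With the ASCII-only \b, the é looks like a non-word character, so a border is reported in the middle of "café". The lookarounds treat é as a letter and skip it.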
Added some regex verification code to the regex search functions. It will warn you if you are trying to search with an invalid regex assertion rather than just silently failing.
☺☺☺Very cool news! After much twiddling, hair pulling and muttering, I have gotten the \n newline regex assertion working!☺☺☺
Finally an easy way to do those ',(?=\n\n)'  =>  '.'  search and replaces. It is inelegant, slow, memory intensive and slightly erratic, but it is 100% better than it used to be. :-) There are several idiosyncrasies; you cannot search reverse if you have a newline assertion in your regex. (You can set reverse search, it will just continue to search forward.) The highlighting may be slightly off, especially if you start doing variable length \n searches, but it is pretty close most of the time. You can not capture newlines for replacement terms. That kinda sucks, but I don't see an easy way around it (right now at least). The initial search is somewhat slow, especially if there are lots of matches found in your text. (I wouldn't recommend searching for '\n' unless you have nothing better to do for a while. It will match EVERY line.) It slams memory pretty hard too; if you don't have much physical memory, it may end up swapping in and out to disk a lot. Nevertheless, this is still pretty awesome!
The newline assertion activation really makes a lot of neat tricks possible that weren't either easy or possible before. When the regex engine searches for a term with a newline assertion in it, it does a search that will match newlines with a "." (match any character). This allows you to do neat little tricks to search across multiple lines even for regexes that don't nominally use \n assertions. Just having a \n assertion activates the special search function. Say you wanted to find matching pairs of guillemets « ». They are not guaranteed to be on one line, and in fact are more likely than not to be split across lines. However, you can search using the \n assertion to treat the file as one long string. Search for '«[^»]+»\n?'. The \n in the regex is not necessary for the search; however, it is necessary to activate the special search. This will work for any single character searches. Search for pairs of low lines: '_[^_]+_\n?'; search for pairs of double quotes: '"[^"]+"\n?'. And so on. A common scanno is "to he" instead of "to be". A regular search will turn up any occurrences in one line. But what if "to" is at the end of a line and "he" is at the start of the next? Search for 'to[\s\n]he'. The drawback with this is you can't extract variables for replace if you use a newline, so it may be better to break that into two searches; this is more for examples rather than suggestions.
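The searches above can be tried outside guiguts on any text held as one long string. This is a small Python sketch with made-up sample text, not the guiguts search code. Python always searches the whole string, so the trailing \n? that activates the special search in guiguts is left out.

```python
import re

text = (
    "She said «this quotation\n"
    "runs over a line break» and went on to\n"
    "he very quiet, _as if\n"
    "nothing_ had happened."
)

# A negated character class such as [^»] also matches a newline, so the
# pair is found even though it spans two lines.
quotes = re.findall(r"«[^»]+»", text)

# The same trick for a pair of low lines (underscores).
italics = re.findall(r"_[^_]+_", text)

# "to" at the end of one line and "he" at the start of the next.
scannos = re.findall(r"to[\s\n]he", text)

print(quotes, italics, scannos)
```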
Removed the three "Blank Line" searches from the menus as they are no longer needed with the \n assertion activated in regex searches.
Added option under the Prefs menu to shut off  the audible bell on warnings. It was starting to annoy me while working on the regex search code. Setting will be saved and remembered.
Made search buttons flash on warning. Visible warning useful if  audible warning is turned off. I had to change the highlight color from default gray, otherwise it flashes from gray to the same shade of gray..... which isn't very useful.
Added option under the Prefs menu to change the highlight color of buttons. When I added the highlight color, I chose a default that I liked. Other people may disagree. Now they can change it.
Broke apart the $f internal variable for external module calls into components as I speculated in the forums. Now have $d - directory path, $f - file name and $e - file extension. To get the equivalent of what used to be $f all by itself, you now need to use $d$f$e.
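As a hedged sketch of the split (in Python, not the Perl used by guiguts, with a hypothetical function name and sample path), the three pieces can be cut so that gluing them back together gives the original full path. Whether guiguts keeps the trailing separator in $d and the dot in $e is an assumption made here so that $d$f$e round-trips.

```python
import os.path

def split_path(path: str) -> tuple:
    """Split a full path into (directory, name, extension) so that
    directory + name + extension gives the original path back."""
    _, tail = os.path.split(path)               # tail is "name.ext"
    directory = path[: len(path) - len(tail)]   # keeps the trailing separator
    name, ext = os.path.splitext(tail)          # ext includes the leading dot
    return directory, name, ext

d, f, e = split_path("/home/user/book/text.html")
print(d, f, e)
```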
Reconfigured the regex hint window to be a little more user friendly. No longer need to close it every time between hints.
Added an "Auto Advance" option to the scannos window. When checking stealth scannos, selecting auto advance will automatically cycle through the search terms until it returns a successful search on the text. If  the search term is not found in the text, it will load the next scanno and search again.


Version .33(251k) Added a few new hot keys. Alt-right arrow will indent selected text one space. Alt Left arrow will move selected text out one space.
Added a whole bunch of menu shortcuts. Nearly every menu item can now be run from the keyboard, (rather than needing a mouse.) Some functions will still need a mouse to be used effectively, but you can at least run the function from the keyboard. Press the Alt button to see the hot letters stand out on the menu items.
Fixed obnoxious bug in search code where you could not search for a zero. Was using a brain dead method to check if there was a search term. It wasn't smart enough to know that zero ("0") was not equivalent to null ("").
Exposed a few internal variables for use in calling external modules, if desired. Right now I only have three. I can easily add more if they are desired. The exposed variables are "$f"; the current open file name with full path, "$i"; the (i)mage directory with full path and "$p"; the file number corresponding to the (p)age where the cursor is in the currently open file. For example you can pass the name of the png file of the current page to a program using the command:  "C:\some\path\program.exe $i$p.png"  -  Or pass the current file to your default handler "start $f" (useful to view HTML files in progress) - Note: if you try to use any of these variables when they are not set, you will get errors. IE, trying to use $f before you have opened a file will not work. Caveat: For Windows systems, you must use DOS friendly path names for programs. The passed parameters will be automatically DOSified, parameters you type in will not.
Added a button to the scannos search function called "Hint". This will pop up a window with a brief description of the regex search term that is currently loaded in the search window. The descriptions are located in the same file that the scannos are in and are loaded at the same time. There does not need to be a hint for each scanno entry; there can be, but it isn't strictly necessary. I have edited the regex.rc included with the script file to include hints for all of the regex expressions. If you want to add or modify any hints, please use the format demonstrated in regex.rc. Any scanno file that can be used with the stealth scannos search function can have hints.
Modified search function to allow you to search on a selection. In order to make the function as unobtrusive as possible, I made it so it doesn't reset the search when you reach the end of the searched selection. Instead, it will beep when you reach the end of a selection but continue searching past the end of the selection if you search again.
Added another custom interpolation to the regex replacement engine: "\A" - Convert to an HTML named anchor. Used much like the \U (upper case), \L (lower case) and \T (title case) interpolations. Do a regex search for some text and do an interpolation on it. In this case, convert it to a named anchor. For example, say you are working with a cook book that has all of the recipe names in all caps, left justified, and want to make named anchors to hyper link the index or TOC. Do a search and replace: '^(\S\P{Lower}+)$' => '\A$1\E'. In other words: "Find all lines that don't have a space as their first character (not indented) and have no lower case characters (but may have punctuation, accented characters or spaces) and convert them to a named anchor." (The \E means "End of Interpolation".) A named anchor looks like this: <a name="ANCHOR_NAME"></a>. A named anchor can not contain double quotes, spaces or accented characters. Those will all automatically be removed/replaced. For example: say you did the aforementioned cookbook search and replace. You might start with the text -
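A minimal sketch of how such a command template might be expanded, in Python rather than Perl. This is not the guiguts code; the function name and the sample values are made up. It also shows the "not set" error case mentioned above.

```python
import re

def expand_command(template: str, values: dict) -> str:
    """Replace $f, $i and $p in a command template with their values."""
    def substitute(match):
        name = match.group(1)
        if values.get(name) is None:
            # Mirrors the note above: using an unset variable is an error.
            raise ValueError(f"${name} is not set")
        return values[name]
    return re.sub(r"\$([fip])", substitute, template)

# Hypothetical values for illustration only.
values = {"f": "C:\\books\\text.txt", "i": "C:\\books\\pngs\\", "p": "042"}
print(expand_command(r"C:\some\path\program.exe $i$p.png", values))
```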

BUTTERSCOTCH PUDDING
Ingredients
How to cook
How to serve
CHOCOLATE PUDDING
Ingredients
How to cook
How to serve
VANILLA PUDDING
Ingredients
How to cook
How to serve
YORKSHIRE PUDDING FLAMBÉ
Ingredients
How to cook
How to serve

After running the regex search and replace, '^(\S\P{Lower}+)$' => '\A$1\E', you would have:

<a name="BUTTERSCOTCH_PUDDING"></a>BUTTERSCOTCH PUDDING
Ingredients
How to cook
How to serve
<a name="CHOCOLATE_PUDDING"></a>CHOCOLATE PUDDING
Ingredients
How to cook
How to serve
<a name="VANILLA_PUDDING"></a>VANILLA PUDDING
Ingredients
How to cook
How to serve
<a name="YORKSHIRE_PUDDING_FLAMBE"></a>YORKSHIRE PUDDING FLAMBÉ
Ingredients
How to cook
How to serve

Notice the É was deaccented and spaces converted to underscores. Very useful if the items needing named anchors can be found using regexes.
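For the curious, the same transformation can be approximated in a few lines of Python. This is only a sketch under assumptions, not the guiguts implementation. The function names are made up, double quotes are simply dropped (the notes above only say "removed/replaced"), and because Python's built-in re module has no \P{Lower}, the line test is written out by hand.

```python
import unicodedata

def anchor_name(text: str) -> str:
    """Build an anchor name: no accents, no double quotes, no spaces."""
    # Decompose accented letters and drop the combining marks (É -> E).
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    return plain.replace('"', "").replace(" ", "_")

def is_heading(line: str) -> bool:
    """Approximate '^(\\S\\P{Lower}+)$': a non-space first character
    followed by one or more characters that are not lower case."""
    return (
        len(line) >= 2
        and not line[0].isspace()
        and not any(c.islower() for c in line[1:])
    )

def add_anchors(text: str) -> str:
    """Prefix every heading line with a named anchor built from it."""
    out = []
    for line in text.split("\n"):
        if is_heading(line):
            line = f'<a name="{anchor_name(line)}"></a>{line}'
        out.append(line)
    return "\n".join(out)

recipe = "YORKSHIRE PUDDING FLAMBÉ\nIngredients\nHow to cook"
print(add_anchors(recipe))
```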
Modified File open routine to automatically run Set Page Markers on file open. It won't hurt if there are none, and runs so quickly that the overhead is negligible.
Fixed External Calls menu building function to update correctly when setting up calls. No longer need to exit and restart to see modifications.


Version .321(247k) Minor but aggravating bug in file save code would sometimes not let you save edits.


Version .32 (247k) Fixed a few more spelling errors in the UI. Sigh.
Made guiguts recognize the /$  $/ markup for compatibility purposes. At this point, /* */ and /$  $/ are treated almost exactly the same. (I'm pretty sure I got all of the places where it will matter. If not, I'm sure someone will let me know.) Added or modified functions where appropriate to make the /$ $/ markup behave as expected. At this point, the only difference is with /* */ you can set a relative indent for wrapping and with /$ $/ you can't.
Implemented a "Last five files opened" history under the File menu for one click opening of previously opened files. It has a little intelligence in that it considers multiple opens of the same file as one instance. IE, you won't end up with the recent files history filled with pointers to the same file. I had to rewrite some portions of the menuing code to allow finer control of display parameters. The original method was very concise and compact, but did not allow granular control over the menu display. The changes should not be visible to the user, (except for the recent files list addition.) I've tested it a fair amount and it seems to look and work pretty much like it did before.
Fixed some minor bugs in the file saving code which would sometimes make it impossible to not save changes when opening another file without closing the program. Wasn't really noticeable until the recent files list became available.
Fixed an error in the "Block rewrap with parameters" code. When a block rewrap with parameters was used with a block rewrap without parameters (use defaults) after it, the values for the Block rewrap margins were not returning to the defaults for the non parametrized Block rewrap markup. If this means nothing to you, don't worry about it. It would only bother you in fairly obscure circumstances. Thanks to Curtis W. for finding the bug and for submitting a patch!
Fixed problem with page separator fixup routine where it would mistakenly delete thought breaks if they were adjacent to a page separator.
Added another menu to the interface: External operation calls. It is a user configurable menu of calls to external programs, so you can launch them from within guiguts. There are 10 slots that you can set up with external calls. Call any program using the same parameters that would be used in the Windows Start->Run box or at a command prompt. For Windows, if you have a registered extension, you can start the associated program automatically by using 'start [filename]'. For instance, to open a web page using the default browser, enter 'start http://www.pgdp.net' (without the quotes). If you are calling a program that has a space in the path name, you must enclose the program name in double quotes. IE, "C:\Program Files\Accessories\wordpad.exe". I have included a few examples. Click on setup at the bottom of the External menu to see/edit them. You can also edit the setting.rc file directly if you prefer. Make a backup copy first though, if you choose to go that route. Right now you can only make explicit calls. Eventually I hope to have hooks to internal variables available too. (Current open file name, current page number, current working directory, etc.) At this point, when you make changes to the menu parameters, they will not be updated on the interface until you close and restart the program. (They actually WILL be active, but the interface will not change to reflect it.) I am having trouble figuring out how to dynamically update the menus. The code that worked for the recent files list fails miserably here. I'll keep poking at it.
Fixed a bunch of broken links in manual. Did some more proofing and editing and added some new material.


Version .31 (244k) Finally got selection of different dictionaries from within guiguts working. Not as elegant as I would have liked, but it works. Could not seem to get it working through the programming interface. Finally gave up and am just writing changes to the aspell.conf  file. Will create one if it doesn't exist. (as a consequence, spell check will need to be closed and restarted for dictionary change to take effect.) Will modify "master" line (master dictionary) and lang (language) lines if it does exist. Should not affect any other settings in your aspell.conf file.
Fixed several minor bugs in the Greek transliteration function. (Actually one bug and several aesthetic problems.) Thanks for the heads up, Curtis!
Revamped and fleshed out manual a great deal. Added a lot more detailed info, added a bunch of explanation for things I have gotten questions about. Rearranged layout a bit. Added TOC with links. I've spell checked it and read through it several times, but I'm sure there are still errors. If you spot something (misspelling, wrong word, whatever) please let me know. I figure, with this bunch reading it, there's a fair chance that errors will be spotted. :-)
Added "Align text on string" function. Useful for aligning columns of text that you want to align on a certain text string. IE, align columns of numbers on decimal points, or contents lists on a period. The default alignment character is a period/decimal point (full stop). You can change it to any character or string of characters you like. It will align on the first occurrence of the marker string found in the line. If the alignment marker string is not found in a line, the line will not be changed.
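The behaviour described above can be sketched like this in Python. It is an illustration with a made-up function name, not the guiguts code, and it assumes alignment is done by padding on the left.

```python
def align_on_string(lines, marker="."):
    """Pad lines on the left so the first occurrence of marker lines up."""
    positions = [line.find(marker) for line in lines]
    target = max(positions, default=-1)   # rightmost marker column
    aligned = []
    for line, pos in zip(lines, positions):
        if pos < 0:
            aligned.append(line)          # no marker: leave the line alone
        else:
            aligned.append(" " * (target - pos) + line)
    return aligned

print("\n".join(align_on_string(["1.5", "12.25", "total", "123.0"])))
```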
Tweaked a few of the scanno.rc regex expressions very slightly. Added the v\b -> y assertion to the file.
Added some buttons to the gutcheck view to easily make bulk changes to the gutcheck error view options; Hide all, See all and Toggle view.
Made a minor change to the scanno window file loading code. If you are loading a scanno file that contains the string reg somewhere in the name, it will automatically set the search window to use regex search settings. If it doesn't have reg in the name, regex search will be automatically unselected.
Changed all of the word frequency routines to be sortable both alphabetically and by frequency. The main function was sortable both ways, but all of the sub functions were only available with an alphabetical sort. I had to make some subtle changes to some of the functions to accomplish this, hopefully nothing that will be problematic. The default sort order for all functions is now by frequency (it is the word frequency routine, after all.) To change the sort order, check Sort Alpha and press Re sort. Change back by unchecking Sort Alpha and pressing Re Sort again.

Version .302 (229k) Arrgh. Stupidity fix to gutcheck filename parsing code.

Version .301(229k)
Minor fix to HTML image insertion code. Was forgetting image directory every time an image was inserted.

Version .30 (k) Made changes in the scroll wheel handling code to try to get it to work correctly under Windows XP. It has always worked under Win2K, WinMe, Win98 and Win95, but for some reason WinXP handles scroll events slightly differently. Not having an XP system myself, I never noticed that it didn't work. Hopefully this will address the problem.
Fixed scroll bar in spell check replacement words list to resize correctly to the length of the list.
Fixed HTML image code generator to use forward slashes instead of back slashes.
Added update check function under help menu. Will connect to the server where guiguts is hosted and compare your version with the latest version on the server. Will pop up a message saying either your version is the most current or that there is a newer version available. (Or that it couldn't connect to the server.) Obviously, you are going to need some sort of Internet connection for this to work.
Added a "stealth scanno" function to the word frequency window. It will ask for a scanno list to use (it needs to be a word list; the regex lists won't work, at least not the way you would want), then sort out all of the words that appear in the list. Since the scannos can theoretically go either way, both terms will appear in the word list. (As long as the actual word appears in the text.) This isn't just confined to lists of scannos either. You could use any list of words to come up with matches as long as it is formatted correctly.
Added another word list to use in the new scanno word frequency function called misspelled.rc. It contains about 3500 of the most common misscanned words. About 95 % of this will already be covered by guiprep during preprocessing, but, hard as it is to believe, there are some people who don't use guiprep. :-)  This word list lends itself very well to the word frequency scanno function but would be extremely aggravating to use in the search scanno function.
Changed calling of search window from word frequency to pop it up on top if it is already open. If a search window already existed, it would not call it again, but it would also leave it covered by other windows. Now it will pop up if you call it.
Fixed file open code to open the bin file (with the page markers in it) if the filename is passed as an argument to the program. (Assuming it exists.)
Fixed code to save settings file in the correct spot if a file name is passed as an argument to the program.
Made some edits to the english-common scannos file distributed with the program. Added a few, removed a few.


Version .29 (217K)
Beefed up the ASCII box art drawing function quite a bit to make it more adaptable and user friendly. Allows customizable frame characters, justification, and selectable rewrap.
Added whitespace characters to the character count function. The characters are represented by their names rather than by the actual character (for what should be semi-obvious reasons). The searching code has been modified to allow searching on whitespace characters. It may not look like the search for newlines works, but watch the bottom line indicators; they change every time you search for one. Since there is one at the end of every line (by definition), it is of limited use. The tab searching is of value though.
Worked on the word frequency page Up/Down code, now moves correct number of lines and moves the active selection fairly predictably.  Had to override some of the default behavior to get this working. I'm pretty sure I didn't break anything else in the process.
Worked a bit on auto generation of chapter named anchors during HTML auto convert.
Fixed boneheaded problem with fixup routine with over enthusiasm in "fixing" lst -> 1st "errors".

Version .28 (216K)
Fixed a couple of warnings that were popping up occasionally when doing search and replace.
Found a bug in the regex search and replace function. When doing variable length extraction of named property assertions, the replacement extraction only works once. I have no idea why this is happening. Everything seems to be working correctly, it just will not allow you to use a variable length named property assertion more than once.  IE  (\p{IsUpper}+) => \L$1\E will work, but only once, and I can't figure out why. For now, try to avoid using variable length named property assertions for variable extraction during regex searches.
Worked on Auto List and Auto Table code in the HTML Fixup window quite a bit, to make it a little more user friendly and intuitive. Lots of little things added/changed. Too many to list (or remember.)
Added a few more markup buttons to the HTML window: <small> and <big>.
Added the optional modifier code to the /* .. */ markup like I speculated about in the forums. /* .. */ markup with no modifiers works like it always did, no rewrap, no indent. Markup with an indent modifier,  /*[4] .. */   will adjust the indent in the block so that the left-most line will be set to have the indent specified and all other lines will be adjusted to keep their same relative indent. The modifier is an absolute indent. Negative numbers will be ignored.
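To make the "absolute indent, relative lines preserved" rule concrete, here is a small Python sketch. It is not the guiguts rewrap code; the function name is made up, and leaving blank lines untouched is an assumption.

```python
def reindent_block(lines, indent):
    """Shift a block so its left-most line starts at column `indent`,
    keeping every other line's indent relative to it."""
    if indent < 0:
        return list(lines)                # negative modifiers are ignored
    widths = [len(l) - len(l.lstrip(" ")) for l in lines if l.strip()]
    if not widths:
        return list(lines)                # nothing but blank lines
    shift = indent - min(widths)
    result = []
    for line in lines:
        if not line.strip():
            result.append(line)           # assumption: blank lines stay as-is
        else:
            current = len(line) - len(line.lstrip(" "))
            result.append(" " * (current + shift) + line.lstrip(" "))
    return result

block = ["  Roses are red,", "    Violets are blue."]
print("\n".join(reindent_block(block, 4)))
```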
Changed the auto generated HTML to use text style span indents instead of padding with nonbreaking spaces. A little more elegant and easier to read, especially for deep indenting.
Changed Gutcheck routine to automatically save the file if it has been edited rather than just flash up a message about it.
Changed the sorting code in the word frequency - Alpha/Num check to not list numbers with commas and periods since they are fairly common and make the results less useful. (Too many false positives) Changed the Check MiXeD CaSe routine to sort out all words that have lower case letters and an upper case letter not in the first position. Makes it much easier to pick out errors LlKE THlS.
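The mixed case test described above boils down to two conditions. A hedged Python sketch (made-up function name, not the actual guiguts routine):

```python
def is_mixed_case(word: str) -> bool:
    """True if the word has a lower case letter and an upper case letter
    somewhere after the first position."""
    has_lower = any(c.islower() for c in word)
    upper_inside = any(c.isupper() for c in word[1:])
    return has_lower and upper_inside

words = ["LlKE", "THlS", "Hello", "HELLO", "hello"]
print([w for w in words if is_mixed_case(w)])
```

Ordinary capitalized words and all-caps words pass through; only words like the scanno examples above get listed.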
Gave up on my attempt to make my rewrapping routine follow text layout conventions regarding orphaned words at the ends of paragraphs. In general, it is considered poor practice to leave a line less than 10 characters at the end of a paragraph. Typically, what is done is a word is moved down from the preceding line to flesh it out. I had this implemented, and it worked pretty well, but occasionally, more often if the rewrap margins were set fairly low (60-65), it would result in a line that would be reported as short by gutcheck. Well, yes, it was short by gutcheck's standards, but it was carefully trying to follow standard text layout convention. I have, however, grown tired of explaining that that is indeed a feature, not a bug, and so, since nobody seems to want it, disabled that part of the rewrapping code. Now if you end up with one character left over at the end of a line, the rewrapper will cheerfully put it on a line by itself. (Of course, gutcheck will report THAT as an error too....)
Fixed minor bug where rewrap function would strip spaces from in front of thought break if selection ended just before it.
Added a somewhat bizarre function to automatically draw ASCII art boxes around a selected block of text.  I was using it while post processing a Punchinello issue, to lay out the advertisements more like they are in the magazine. It works fairly well but I'm not sure that it should be used, really. It makes it necessary that the text be viewed using a fixed width font, and makes it difficult to rewrap. Still, it is fun to play with, and makes it easy to do such things, if you are of a mind to. The selection NEEDS to start and end on blank lines for it to work correctly.



Version .27(213K)
Made orphan brackets and markup search function recognize simple 1 level nesting.  IE (Text like (this) will pass.) (But text (like) (this) will still need to be checked.) I could  allow for unlimited nesting, but then missing or wrong brackets would slip through fairly easily.
Made bracket search highlight both brackets it is questioning instead of just the first.
Made the script remember the pngs directory from session to session as long as the file has been saved. Since it is project specific, it is saved in the bin file associated with a particular project file. As a result, it will only be saved when the bin file is saved, which is only saved when the project file is saved.
Fixed error when setting images directory through the prefs menu. Was supposed to open the first image file in the directory (a quick check that you had the correct directory) but would pop up an error that file (path).png could not be found. Changed code so it should work as expected.
Modified code to save the window geometry (for the windows for which it tracks the geometry) right after it gets a resize (or move) command. There is actually a 300 ms delay so it doesn't try to continuously save the settings as you are trying to resize/move the window.
Puzzled out how to allow navigation of listboxes using arrow and page up/down keys. (I feel pretty stupid about this one. All I had to do was switch the focus. Clicking on it with the mouse doesn't change the focus oddly enough. It needs to be explicitly set. Or you can set the focus by tabbing between the widgets until the listbox has focus, but that little tidbit is not documented anywhere easily accessible.) The drawback is that the current selection is now underlined, which I don't really care for, but the benefits outweigh the negatives.
Experimented a bit with not sorting a scannos file if it contains the character sequence "reg" in the title. Much as suspected, the results come back in nearly random order. (Which, for the regex files, is not much worse than alphabetical order.)
Fixed problem with rewrap function where it would sometimes lose some of the page markers if text was heavily rewrapped. (For instance, change the rewrap margin from 72 to 50 and rewrap.)
Fixed problem with rewrap function where it would eat a blank line if there was an even number of blank lines in a row. It wouldn't change anything if there was 1, 3, 5, etc. but if there were 2, 4,... lines in a row, it would delete one. :-?
Fixed oddity where rewrap function would add an extra blank line at the end of the rewrapped text if the selection did not end exactly on a blank line.
Fixed problem where if selection did not begin on a line containing text, rewrap function would delete one blank line before the paragraph.
Cleaned up a bunch of warnings in the rewrap routine. (mostly boundary conditions on empty variables, either uninitialized or empty after processing.)
Added "Auto Table" and "Auto List" functions to the HTML fixup window. Auto table will put <table>  </table> around the selection, put <tr><td> at the beginning of each line in the selection, put </td></tr> at the end of each line and replace any instance of two or more spaces together in a line with </td><td>. Auto List puts <ul> </ul> around the selection, <li> at the beginning of each line and </li> at the end of each line in the selection.
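The Auto Table and Auto List transformations described above can be sketched roughly as follows. This is an illustration in Python, not the Perl code guiguts actually uses; the function names `auto_table` and `auto_list` are made up for the example.

```python
import re

def auto_table(selection):
    """Wrap each line in <tr><td>..</td></tr>; runs of 2+ spaces become cell breaks."""
    rows = []
    for line in selection.splitlines():
        cells = re.sub(r" {2,}", "</td><td>", line)
        rows.append("<tr><td>" + cells + "</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"

def auto_list(selection):
    """Wrap each line in <li>..</li> and the whole selection in <ul>..</ul>."""
    items = ["<li>" + line + "</li>" for line in selection.splitlines()]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Because a cell break needs at least two spaces, single spaces inside a cell are left alone.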
Added a few more Greek glyphs to the transliteration chart. They are low usage but not terribly uncommon. For the most part they are not available as HTML entities.


Version .26 (202K)
Added a function to reformat poetry line numbers along the right side of the text. It will look for numbers in the rightmost columns separated from the text by at least two spaces, then add spaces until the right edge of the number is at the rewrap margin. Adjust the right rewrap margin to change where they are placed. It will put at least two spaces before the number, even if it makes the number exceed the rewrap limit, so it can find it again if you choose to run the routine again.
Reformatted the menu layout as discussed in the forums.
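A rough sketch of the line number alignment rule, in Python rather than guiguts' Perl. The function name `align_line_number` and the default margin of 72 are assumptions for the example; the real routine may recognize line numbers differently.

```python
import re

# Text, then at least two spaces, then a number at the end of the line.
LINE_NUMBER = re.compile(r"^(.*?\S) {2,}(\d+)\s*$")

def align_line_number(line, margin=72):
    """Pad so the number ends at the margin, keeping at least two spaces before it."""
    m = LINE_NUMBER.match(line)
    if not m:
        return line                      # no trailing line number: leave untouched
    text, number = m.group(1), m.group(2)
    gap = max(2, margin - len(text) - len(number))
    return text + " " * gap + number
```

The `max(2, ...)` is what lets a second run find the number again: even a line that is already too long keeps the two-space separator.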
Changed the fixup routine to first pop up an option window so you can select what fixes to run. You can also select whether to run the routine inside /* */ marked blocks or not.
Started writing a routine to find orphaned markup, but decided to cheat and just grafted it onto the bracket orphans search routine. :-)
Added a routine to find and remove blank lines before page separators without actually removing the separator. This is a low usage function, only certain projects will benefit from it, but the ones that need it, will now have it.
There is a bug in .25a that prevents gutcheck from running. I have no idea what the bug is or why it was getting an error. Version .25 worked and this version works, and I haven't touched the code in that part of the program lately. Rather than release a third version of .25, I'm pushing up the release of .26 a bit.


Version .25a (200K)
Fixed some fairly serious boundary condition bugs in the block rewrapping parameter code that I hadn't taken into account initially. Thanks to DaveKline for the bug reports and examples of failure mode text.


Version .25 (200K)
Fiddled with the Internal links guessing code a bit more. When making internal links to named anchors, the window will pop up a list of all of the named anchors in the text. If you are hyperlinking an index, it can get pretty unwieldy. Added some code to try to guess which link you will want and put the likely candidates near the top of the list. If you name your anchors with this in mind, it can work pretty well. It looks at the first word in the selected text for the internal link and searches for named anchors that contain that word. Works pretty well for indexes.
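The guessing rule described above (take the first word of the selection, float anchors containing it to the top) could look something like this. It is a Python sketch with a made-up function name, `rank_anchors`; matching case-insensitively is an assumption, not something the notes state.

```python
import re

def rank_anchors(selected_text, anchors):
    """Return anchors with those containing the selection's first word listed first."""
    words = re.findall(r"\w+", selected_text)
    if not words:
        return list(anchors)
    first = words[0].lower()
    likely = [a for a in anchors if first in a.lower()]
    rest = [a for a in anchors if first not in a.lower()]
    return likely + rest
```

For an index entry such as "Bread, rye", anchors named Rye_Bread and Bread_Pudding would both move ahead of unrelated ones, which is why naming anchors with this in mind pays off.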
Worked some more on the Greek pop-up window. Added the capital vowels with rough breathing marks. Added selection for what kind of mark up the function produces; transliteration, letter names or HTML entities.
Added some new markup for the rewrapping function. If the rewrapping function encounters /*..*/ markup, it skips over everything in the block enclosed by the markup. Added /#..#/, similar to gutwrench and rewrap-indent: anything enclosed in /#..#/ will be block indented using the standard block indent margins. If you put margin numbers on the opening line, it will use those numbers for the margins instead. They must be formatted thusly: ( /#[x.y,z] ) The first number is the general left margin override. ( /#[x] ) It will indent all of the lines x spaces. If there is a period and a second number, ( /#[x.y] ), the first line will be indented y spaces and the rest x. If there is a comma followed by a number, ( /#[,z] ), it will override the default right margin setting. You can override the margins in nearly any combination. If you override the first line (y) you will need to have an x value, otherwise the y will be used for all of the lines, and if you have both a left margin and right margin setting, the left margin needs to come before the right: /#[,z.yx] won't work, at least not like you'd expect.
For example:

/#
Text text text
text text text
#/

will be indented and rewrapped using the standard block rewrap margins.

/#[6,53]
Text text text
text text text
#/

will block rewrap with a left margin of 6 and right margin of 53 instead.

/#[2]
Text text text
text text text
#/

will use a left margin of 2 and a standard block wrap right margin.

/#[4.6,70]
Text text text
text text text
#/

will have the first line margin at 6, the rest of the lines at 4, and wrap after column 70.

And so on.

This markup is available in addition to the block rewrap function, not as a replacement for it. It is really just a way of overriding the default rewrap margins inline while running a standard rewrap.
Made the file open dialog see .htm and .html files also by default. Because I want it that way and what I think carries a lot of weight with the author. :-)
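As an illustration of the /#[x.y,z] rules, here is a small Python sketch that parses an opening line into margins. It is not guiguts' parser: the function name `parse_block_margins` is hypothetical, and the default values 4 and 72 are placeholders standing in for whatever the standard block rewrap margins are set to.

```python
import re

# /#  optionally followed by [x], [x.y], [x,z], [x.y,z] or [,z]
SPEC = re.compile(r"^/#(?:\[(\d+)?(?:\.(\d+))?(?:,(\d+))?\])?\s*$")

def parse_block_margins(opening_line, default_left=4, default_right=72):
    """Return a dict of left, first-line and right margins, or None if no match."""
    m = SPEC.match(opening_line)
    if not m:
        return None
    x, y, z = m.groups()
    left = int(x) if x else default_left
    first = int(y) if y else left
    if x is None and y is not None:
        left = first = int(y)            # y without x applies to every line
    right = int(z) if z else default_right
    return {"left": left, "first": first, "right": right}
```

With this pattern a right margin placed before the left, as in /#[,70.6], simply fails to match, which mirrors the "won't work" caveat in the notes.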
Fixed the File->Save As and File->Include functions to default to the directory the open file is in rather than the directory the guiguts executable is in.
Added a function to the Fixup menu near the Rewrap function to automatically clean up the rewrap markup /* */ & /# #/.  Anything on the line with the markup WILL BE DELETED. The entire line including the newline will be removed, so leave a space before the open and after the close markup. (the standard anyway.)
Added a Footnote format tidier for non HTML versions. It will reformat the footnotes to be a little more aesthetically pleasing and rewrap them, however, it will destroy the footnote markup so that other automated tools will no longer be able to work with them. If you plan to make an HTML version, SAVE THE FILE WITH A DIFFERENT NAME BEFORE YOU RUN THIS, because this function will make the automated footnote hyperlinking tool ineffective.


Version .24 (195K)
Added a Latin 1 pop up chart under the Help menu. It contains the bulk of the characters that aren't directly available on a std US layout keyboard. The accent marks aren't included, neither are the nonbreaking space & hyphen and a few other obscure characters.
Added a pop up Greek transliteration chart under the Help menu. It uses a very similar scheme for transliteration as the site, based on the encoding used by the Helen  project. It uses Latin-1 encoding rather than ASCII. The only real difference is eta ( η ) is encoded as ê rather than ae and omega ( ω ) is encoded as ô rather than o. That makes it easier to distinguish eta from alpha epsilon and omega from omicron. The upsilon defaults to y rather than u since it will only be a u if it is part of a diphthong. (Ζευς is Zeus, not Zeys.) I also put buttons for the "rough breathing" marks. The rough breathing marks were in essence, the "h" in Greek. They only occur over vowels (and rho) and signal that there should be an h sound before a word. So ύδρα  is hydra, not ydra or udra. I considered implementing beta encoding but dismissed that after investigating a bit. Beta encoding is very good for having something that exactly records the original Greek, but is almost unreadable in the transliterated form.
Trapped a warning in the rewrap routine and tried to find a reported problem of falling into an infinite loop. Not terribly successfully, I'm afraid.
Added button to word frequency window to re run the word frequency routine. It will save the file first if it has been edited to get a more accurate frequency count.
Added a "Find orphaned and nested brackets" function under the search menu. It will automatically find all orphaned and nested brackets and allow you to step through them to check their validity. Will work for parentheses ( ), square brackets [ ],  braces { } and angle brackets < >.  It can take a while to do the initial search, especially on the < > search in HTML marked up texts, so be patient.
Fixed regex extracted variables to be usable more than once in a replacement term. They were artificially limited to once per replacement term.
Made default markup for image tags include the align tag for flowing text around the image. Defaults to align="left", change to right, center or whatever suits.
Added option to Internal links pop up selection window to sort alphabetically. Possibly handy while hyperlinking indexes. (I wanted it, so you got it.)
When doing internal links, if the selected link text matches one of the anchor names, that anchor will be displayed at the top of the list. (Handy when hyperlinking indexes to cookbooks where the index is basically an alphabetically sorted list of all recipes in the book. (Guess what I've been post processing.))
Added a few more search terms to the regex.rc file.


Version .23 (143K)
Fixed very nasty bug with spell checking function where spell check would skip a word every time you added one to a dictionary, either the project or Aspell one. Thanks to martinag for finding this and bringing it to my attention!
Removed key bindings for HTML header markup Alt-1 through Alt-6. They were interfering with adding high ASCII characters using the alt keys.
Added another subfunction to the word frequency window, Check Accents. Similar to the Check Hyphens function, will sift out all of the words in the text that contain accented letters and display them along with their frequency count. If a word is found that is the same as one of the accented words except it has no accents, it will be displayed too, marked with ****. The unaccented word may show up more than once in the list if more than one variation of accented characters show up.
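The Check Accents idea can be sketched as below. This is a Python illustration with made-up names (`strip_accents`, `check_accents`), not the guiguts routine, and it assumes the text uses precomposed accented characters (as Latin-1 text would).

```python
import re
import unicodedata
from collections import Counter

def strip_accents(word):
    """Drop combining marks after decomposing, e.g. e-acute becomes plain e."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def check_accents(text):
    """List accented words with counts, plus unaccented twins marked with ****."""
    counts = Counter(re.findall(r"[^\W\d_]+", text))
    report = []
    for word in sorted(counts):
        plain = strip_accents(word)
        if plain == word:
            continue                     # no accents in this word
        report.append((word, counts[word]))
        if plain in counts:
            report.append((plain + " ****", counts[plain]))
    return report
```

As in the description, an unaccented word is listed once per accented variant it matches, so it can show up more than once.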
Added a second harmonic display function to the word frequency list but, after tinkering with it for a while, removed it again. It ran very slowly for longer words, (about 5 minutes for a 10 letter word on a P4 2 GHz processor, during which time, it was completely locked up,) and returned almost uselessly long lists for short words. (Search on any 2 letter word and you got back EVERY one and two letter word in the text plus a significant portion of the three letter words.) Ah well, I didn't spend that long implementing it so I was able to discard it without a twinge.
Did some more tweaking of the Auto HTML generation code, particularly with respect to Footnotes. Trapped some more potential formatting problems.

Version .22 (142K)
Fixed flaky found term highlighting. Another consequence of the cut 'n paste fix.
Fixed the undo EVERYTHING bug in the search and replace function. Something I was trying out didn't work too well. (Actually it DID work too well. Removed now.)
Got the Save As function to save the bin files with the correct name and in the correct directory. Should work correctly all the time now, worked sporadically before.
Worked on a problem with the guess page numbers function not being able to open page images with less than 100 or larger than 1000. Works much better now and gives more indication of  problems if it encounters one.
Worked on the search window option selection code. The options would get confused and eventually disabled if you had both the Search window and the Word Frequency window open at the same time and were using both. (Which is pretty much exactly what you want to do most of the time.) After puzzling over the problem for some time, I figured out that the GUI is pretty picky about how it will let you access variables associated with GUI elements (check boxes). If I manipulate the variables directly, the GUI disassociates the variable from the element. It insists that I interact through the GUI hooks. Fine, but if the GUI window hasn't been instantiated, the GUI hooks aren't available and I NEED to manipulate the database directly. Sigh. What it all boils down to is I added a whole bunch of if-then-else statements, and the Search options are much more stable now.
Not really a bug or a fix, just a notice: If you have an extremely long search or replace term (more than 40 characters), it will automatically scroll the text over 39 characters and seem like the search term has been chopped off. It hasn't. You can move back and forth with the arrow keys. Yet another consequence of the cut 'n paste fix (which I'm beginning to suspect was more trouble than it was worth.)
Found and fixed some problems with case interpolations in regex replacement text. Under certain conditions, the case interpolation was being applied to the term after the one specified in the replacement term.
Fixed the three Blank line searching functions to automatically start searching again from the beginning if they reach the end of the file. (They were supposed to before but they weren't.)
Worked on the spell checking program interface a bit. Got the replacement term selection intelligence working. It will now learn from experience and put words that are often chosen as the replacement for a particular misspelling earlier in the replacement terms list the next time that misspelling is encountered. I have also enabled automatic entry of the top guess into the replacement entry box. This will probably be wrong as often as it is right, but even if it is only right 10 % of the time, it is 10 % better than having nothing in the replacement box at all.
Added another binding to replacement words list box, Double click moves the word up to the replacement term box. Triple click will automatically replace the word in the text and advance to the next misspelled word.
Spent some time trying to get the aspell interface to allow you to change dictionaries from within guiguts. Finally got aspell to admit it has other dictionaries, (assuming it does), but it is steadfastly refusing to change to them. I'll have to poke at it some more....
Added still another function to the word frequency window. Harmonics. This is a list of all of the words that are in the current text that are within one edit of the currently selected word. For instance, if you select "the" and click Harmonics, you might end up with the list "he, she, She, the, The, them, then, they, thy, tie". These are all the words in the text that can be gotten from the selection with only one edit. (Well, you get the original word back, so one or less... :roll: ) The edit can be a replaced letter, a removed letter or an added letter, but there can be only one edit. You must have selected a word in the word frequency window or it won't return a list. Different texts will get different lists. It doesn't return every POSSIBLE variation in spelling, only those that are present in the text. There is a hot key shortcut - Ctrl-left click. The harmonics window has the same search bindings as the word frequency window (left click to pop up the search window, right click to search with the current search settings. You can also recursively do a harmonic search on a word in the harmonics window. You must use the hot key to do so.) The Harmonics function is fairly intensive to run, it does a large number of comparisons for each word, (259 * (# of letters in word) + 124) so running the harmonics routine on EVERY word in the text would take an unacceptably long time. In practice, running on one word at a time is pretty snappy and probably more useful anyway.
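To make "within one edit" concrete, here is a Python sketch. Note that it uses a different technique from the one implied by the comparison count above: rather than generating spelling variants, it compares the selected word against each word in the list directly. The names `within_one_edit` and `harmonics` are made up, and ignoring case is an assumption drawn from the example list.

```python
def within_one_edit(a, b):
    """True if b differs from a by at most one added, removed or replaced letter."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                      # make a the shorter (or equal length) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1                           # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]    # at most one replaced letter
    return a[i:] == b[i + 1:]            # exactly one extra letter in b

def harmonics(word, vocabulary):
    """Words from the text's vocabulary within one edit of word, ignoring case."""
    target = word.lower()
    return [w for w in vocabulary if within_one_edit(target, w.lower())]
```

Only words actually present in the vocabulary are ever returned, matching the note that different texts give different lists.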


Version .21a (139K)
Fixed problem in Search window where search text would disappear if you use the hot keys to search.

Version .21 (139K)
Fixed, or at least, worked on a problem with the spellchecking function where, under certain circumstances, it would skip words that weren't  correctly spelled, only to find them later in the text. Not sure if it is completely fixed, but after the changes I made, I was no longer able to reproduce the behavior.
Fixed spellcheck function to allow checking of a selection of text instead of the whole file. This was nominally implemented already but had bugs and would just spell check the whole file no matter what was selected.
Made the search function under the word frequency spell checking function do a regex search for -misspelling|misspelling- when the word is found zero times in the dictionary. That happens because of the different way aspell and guiguts treat hyphenated words. Aspell treats hyphenated words as separate words, guiguts as a single word.
Made some changes to the search window entry boxes to try to compensate for them not deleting selected text when you cut and paste. The changes I made introduced their own set of problems that I think I have worked around. Going to have to whack on the search window for a while to make sure it behaves as expected.
Made the word frequency and gutcheck windows use the same display font as selected for the main editing window. You may need to resize your windows, or scroll a bit to see the results now, but I think the change was worthwhile.
Added a character count subfunction to the word frequency window. Will give counts of all non whitespace characters in the text.
Changed Save As function so it will save the markup .bin file with the new name too. Be warned. If all you do is change the extension, the bin files will collide and may cause problems. (Part of the reason I wasn't too keen about implementing this.)
Modified the Save and Save As functions to only actually save anything if a file has already been opened.

Version .20 (138K)
Fixed a bunch of minor errors in the bookmarking functions. Bookmark highlighting works correctly now. Does not jump to bookmark on file load. After working with the bookmarks for a while, I found the bookmark highlighting annoying. Put an option under preferences to turn it off.
If you have a bookmark set in the middle of a paragraph and then rewrap the paragraph, the bookmark will be moved to the end of the paragraph. I could fix this but it would just add more overhead to the rewrap function which has quite a bit already, and is not really all that critical anyway, in my opinion.
Found a bug in the HTML named anchor function, was not generating a name for the named anchor if you had selected text. Traced it back to a change I had made to compensate for something else. Oops. Fixed.
Changed how the footnotes are formatted in the auto generated HTML. They look a little more aesthetically pleasing now, I think.
Fixed space in link names to footnotes.
Added sub function to autogenerate HTML function to convert all characters from x80-xFF to named HTML entities, as well as &,<,>, and ". The windows 1252 codepage characters are converted as well. On large files or slow computers, may take a while to run, be patient.
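A minimal Python sketch of this kind of entity conversion, using the standard library's `html.entities.codepoint2name` table. The function name `to_named_entities` is made up, the numeric fallback for characters without a named entity is my own addition, and the sketch assumes it is run on text where < and > are literal characters rather than markup you want to keep.

```python
from html.entities import codepoint2name

def to_named_entities(text):
    """Replace &, <, >, " and every character above 0x7F with an HTML entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in '&<>"' or cp >= 0x80:
            name = codepoint2name.get(cp)
            # Fall back to a numeric reference when no named entity exists.
            out.append("&" + name + ";" if name else "&#" + str(cp) + ";")
        else:
            out.append(ch)
    return "".join(out)
```

Characters from the Windows 1252 range, such as curly quotes, are covered by the same test once the file has been decoded, since they sit above 0x7F as Unicode code points.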
Worked on autogenerate some more to prevent errant markup at boundaries.
Fixed a minor error with automated HTML superscript markup.
Modified behavior of some of the HTML markup insert functions. All of the header markup <h1>-<h6> will remove any paragraph markup from the selection when applied.
Finally got disgusted with battling the built in rewrapping function, ripped it out and wrote my own. Tried to make it behave as much like the original as possible without the idiosyncratic indenting. There may be some subtle differences that I haven't compensated for, but it is pretty close. In general, the new function seems to work pretty well. It will try to prevent "widowed" (extremely short) last lines by stealing a word from the previous line to pad it out if it is less than 10 characters. I am not doing any line end smoothing; I suppose I could mess around with it at some point to see what I can come up with.
Added a menu option under fixup to insert underscores around the selected text.
Came up with a function to try to guess page numbers based on average page length for people working on files that no longer have the page markers in them. It will ask for some page and line numbers to try to calculate an average page length. Avoid selecting pages for the calculation that are not in the body of the text (contents, index.) They tend to have wildly different page lengths and will throw off the calculation. This is only going to be somewhat accurate for texts that have mostly the same number of lines on each page. The more the text varies, the further the calculated pages will differ from actual. DO NOT USE THIS FUNCTION UNLESS YOU HAVE NO OTHER OPTION. Or at least, don't save the changes.
Fixed a few spelling errors in the UI. If you didn't notice them, too bad,  I'm not telling you where they were.
Figured out and fixed problem where manual wasn't opening under winguts. (I think)


Version .19 (131K)
Added capability to add fonts to the display font list. Will retain them once entered. If you want to delete them, open the setting.rc file and remove the ones you no longer want. (You can add them in the setting.rc file too if you want.) Arial, Courier New & Times New Roman are the defaults and will be re added even if deleted from the file. (Under most circumstances, at least. You can make them not appear but, what's the point?)
Made a link to the HTML manual (here :-) ) under the help menu. Should work under all versions of windows. Probably won't work under Linux, but I have no system to test it on. Turned out to be much simpler than I expected, although I was getting pretty far afield before I figured it out.
Added a new menu item and a whole raft of hot keys for bookmarks. Mark and jump back and forth between up to 5 spots in your text. Control-Shift-(1-5) sets a bookmark and Control-(1-5) jumps to that bookmark if it has already been set. Bookmarks can also be set/accessed through the menu. They will be saved from session to session. They can be reused, just set it in another spot to reuse it.
Wherever applicable, inserted shortcut hot keys notation next to menu item in all menus.
Have enabled highlighting for zero width search results. Kind of cheesy, I'm just highlighting the character after the zero width assertion. In practice, it works pretty well. Search regex for ^$ (blank line) to see an example.
Worked on trying to come up with a work around for the lack of a newline assertion for the regex search. After about 10 hours of work, came up with something that would work about one third of the time. Another third, it wouldn't find what you were looking for, and the remaining times, it would just lock up the computer completely (once I managed to spontaneously reboot my computer too). Gave up in disgust and removed all the code again.
Added three more single function search items to the search menu: search for two consecutive blank lines, search for three consecutive blank lines & search for four consecutive blank lines. (To find single blank lines, just use the regex search function with the search assertion ^$.) This will probably cover about 50 % of what the \n assertions would be useful for.
Added a new replacement text assertion: \T. Similar to \L & \U, this will adjust case in the replacement text. Whereas \L will set to lower case and \U to upper case, \T will set to title case. (First Letter Of Each Word Capitalized). This is not a standard regex assertion, but it allows you to do things that would not otherwise be easily done. An example: search for (CHAPTER) and replace with \T$1\E will yield "Chapter".
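To show how case assertions in a replacement term can be interpreted, here is a Python sketch that expands $1-style variables and applies \L (lower), \U (upper), \T (title) and \E (end) to the text between them. It is an emulation written for this note, not guiguts' Perl; `expand_replacement` is a made-up name, and Python's `str.title` is used as a stand-in for title casing (it capitalizes after apostrophes too, which may differ from the real behavior).

```python
import re

CASE_FUNCS = {"L": str.lower, "U": str.upper, "T": str.title}
TOKEN = re.compile(r"\\([LUTE])|\$(\d+)")

def expand_replacement(template, match):
    """Build replacement text from a template and a regex match object."""
    out = []          # finished pieces
    pending = []      # text collected since the last case switch
    mode = None       # current case function, or None for "leave as is"

    def flush():
        text = "".join(pending)
        out.append(mode(text) if mode else text)
        pending.clear()

    pos = 0
    for tok in TOKEN.finditer(template):
        pending.append(template[pos:tok.start()])    # literal text before the token
        pos = tok.end()
        if tok.group(1):
            flush()                                   # close the previous case region
            mode = CASE_FUNCS.get(tok.group(1))       # \E is not in the table: None
        else:
            pending.append(match.group(int(tok.group(2))) or "")
    pending.append(template[pos:])
    flush()
    return "".join(out)
```

Called with the template \T$1\E and a match for (CHAPTER), this gives back Chapter, and a case assertion in the first position of the template is handled like any other.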
Uncovered a bug in the case assertions while I was implementing the \T assertion. Would not let you use a case assertion in the first position of the replacement string. It would just delete the text and not replace it with anything. Fixed for all case assertions.


Version .18 (131K)
Fixed a bug in footnote moving function where if there were no footnotes for the last landing zone, the script would silently lock up and not move anything.
Added \n variable interpolation to regex replacement term. If there is a \n in the replacement text, it will insert a new line at that point.
Fixed a couple of instances where the search window could get set to do both regex and whole word search at the same time, leading to unpredictable searches.
Added a function under the fixup menu to set the invisible page markers before page separators have been deleted. The page join function will still set markers if they haven't already been set.
Have figured out a way to do non blocking calling of external programs in the compiled executable version. Had figured out a workaround many months ago for the script version, (actually for guiprep,) but was not able to get it to work with compiled version. (I'll have to backport this into winprep too.)
Since I can now execute external programs, I have put in hooks to an external image viewer. If you have set page markers, (or removed page separators, which amounts to the same thing,) the page number will appear in the bottom status bar, along with a button that will open an image viewer to the image file corresponding to the current page. It defaults to looking in a "pngs" directory one level below the directory the project file is in, however you can change the directory it looks in for the png files. The image file directory is not sticky from session to session. It will ask each time you restart the program. It will, however retain the directory for a session, once it has been set. You can change the paths to both the image viewer and the pngs directory under the prefs->set file paths menu. When you set the images path through the menu, it will attempt to open the first image file in that directory.


Version .17 (127K)
Changed some of the menu items around as suggested in the forums. Moved the search and highlighting item to under the search menu. Made all of the menus tear off. I don't think it is particularly useful for some of them, but hey, now you have the option if you want to. :-) It only took about 25 seconds worth of coding so I was amenable.
Worked on the footnote parsing function to make it try to recover from minor formatting errors a little more gracefully. Will search for and correct the most common misspellings of Footnote I've come across: Fotonote, Footnoto and footnote (lowercase F). It will also assume that if it can't find a colon within twenty characters of the end of the word Footnote, that the colon has been omitted and will place one at the end of Footnote. This will allow for up to 19 digit footnote numbers, should be enough for most books. :-) The missing colons were not so much causing problems with the footnote moving routine as the automated generation of footnote links during HTML autogenerate where it would just silently and mysteriously fail. It relies on the colon to help parse the Footnote number (letter).
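The recovery rules could be sketched like this in Python. The function name `repair_footnote_tag` is made up, and the sketch assumes footnotes open with "[Footnote", which the notes above do not spell out. It follows the description literally by putting a missing colon directly after the word Footnote.

```python
import re

# The three misspellings named in the notes.
MISSPELLED = re.compile(r"\[(?:Fotonote|Footnoto|footnote)\b")

def repair_footnote_tag(line):
    """Fix a misspelled Footnote tag and add a colon if none follows within 20 chars."""
    line = MISSPELLED.sub("[Footnote", line)
    m = re.search(r"\[Footnote", line)
    if m and ":" not in line[m.end():m.end() + 20]:
        line = line[:m.end()] + ":" + line[m.end():]
    return line
```

Lines without a footnote tag pass through unchanged.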
Fixed some rather bad bugs with landing zone handling when using more than one landing zone in a text. Changed a bunch of things and added error checking to make the whole process a lot less fragile. It was very easy to make it not work if you didn't do things in a very specific order. Made it a lot more forgiving of "out-of-sequence" operation.
Reworked the layout of the footnote moving tool window a bit. It was pretty sloppy and not very easy to figure out what some of buttons did.
Added a couple more buttons to the footnote window to automatically insert landing zones at the end of each chapter or at the end of the text. The auto insert functions will remove any existing landing zones before adding new ones. (So if you want to remove all of the LZs, click on Auto End LZ which will remove all but the one at the end of the text, then remove that one manually.) The chapter end auto insert LZ function is rather simplistic. It looks for 4 blank lines in a row and assumes that it is a chapter break. (The standard layout for chapter breaks, so not a big stretch.) It will skip the first 200 lines of the text to avoid putting footnote LZs in the title page or contents page. If you have an especially long contents or preface, you may end up with some unnecessary LZs. Don't despair, it will automatically remove any landing zones that haven't been used after it is done moving the footnotes.
When the footnotes are moved, the script will attempt to move the anchors against the text they are referring to. The anchors are often spaced or have a line break between them and the text they refer to. This just automates the fixup so you don't have to go back and do as much manual tweaking.
Putzed around with the HTML generation code some more. Tweaked the footnote layout a bit. Using blockquote tags to set them in a bit, probably would be better done with CSS instead but it can be changed when necessary. Worked quite a bit on the auto generation function. Got it reacting fairly predictably. Fixed a bunch of border effect errors. Shouldn't auto generate orphan markup anymore.
Added more markup to the detect orphans function. Will now check nearly all standard HTML markup instead of just i and b. Of course, it's a lot slower now... :-(
Made the auto generate function automatically handle subscript and super script markup. _{xx} will be changed to <sub>xx</sub> and ^{xx} to <sup>xx</sup>. This is the only destructive change that the auto generate function makes. Everything else can be backed out of by selecting the whole document and hitting "Remove markup from selection". (Will leave italic and bold markup.) It will take a while to chug through the file....
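The subscript and superscript conversion amounts to two substitutions. A Python sketch (the function name `convert_sub_sup` is made up, and nested braces are not handled):

```python
import re

def convert_sub_sup(text):
    """Turn _{xx} into <sub>xx</sub> and ^{xx} into <sup>xx</sup>."""
    text = re.sub(r"_\{([^}]*)\}", r"<sub>\1</sub>", text)
    text = re.sub(r"\^\{([^}]*)\}", r"<sup>\1</sup>", text)
    return text
```

Since the braces are discarded, there is no reliable way back to the original notation afterwards, which is why this counts as a destructive change.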
Added a button to the HTML popup named "Poetry". It will automatically add non breaking spaces to preserve indenting and insert line breaks after each line of the selected text with one button press.
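Roughly what the Poetry button does to a selection, sketched in Python. The name `poetry_to_html` is made up, and the exact line break tag written here (`<br />`) is an assumption; the notes do not say which form guiguts emits.

```python
def poetry_to_html(selection):
    """Replace leading spaces with &nbsp; and end every line with a line break tag."""
    out = []
    for line in selection.splitlines():
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)      # number of leading spaces
        out.append("&nbsp;" * indent + stripped + "<br />")
    return "\n".join(out)
```

Only leading spaces are converted; spaces inside a line are left for the browser to handle normally.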
Got annoyed with some of the idiosyncrasies of the rewrap function, so I whacked on that for a while. Think I've got it to a point where it is not going to go charging off in odd directions too often anymore.


Version .16 (127K)
Added a function to the word frequency routine to sift out and display all Mixed Case words. These will primarily be initial caps words, but it will also find words with caps in the middle of the word. (It will not display words that are ALL caps.)
Added another function to the case adjustment functions; automatically convert selected text to title case. (This was actually already active in version .15, but I forgot to mention it.)
Added some markup shortcut keys for use when generating HTML versions. Hot keys Alt-1 through Alt-6 will insert markup <h1>..</h1> through <h6>..</h6> respectively around the selected text.
Added a bunch of unlikely character combination checks to the regex.rc list.
Added quite a bit of functionality to the HTML function under the fixup menu. Made a button bar which has most of the popular HTML markup on it; at least, all that is easily translatable to TEIlite. (Figured, "why make life harder for myself later?" :-) ) Will automatically insert the selected markup around the selected text. Some markup buttons act differently depending on what text is selected. There is an Autogenerate HTML that will do basic conversion to an HTML version. I still haven't thought of a good way to parse the title page and contents and automatically mark them up so the generated file will still need some tuning. I am quite pleased at the automated generation of links to out-of-line footnotes though. :-) I already had all of the code in place to parse the footnotes, so it wasn't all that difficult to implement. I am including a file called "header.txt" that has the basic HTML header information it uses to make the header to the HTML file. It is very basic, pretty much the absolute minimum to be valid HTML. You can edit it however you like, if you want some custom features. I am not using cascading style sheets at this point, though I am leaning in that direction for the future. I'd like to get some other opinions and suggestions before I go there.
This is not a very high end HTML editor. For simple texts it is probably sufficient, and it will generate something that can be further tuned in a more powerful editor if necessary. It will automate a bunch of stuff that would be very tedious in a standard HTML editor though.

Version .15 (121K)
Added a function to the word frequency routine to find all capitalized words. Unlike PRTK and Gutwrench, I include single character words because it will only grow my list by a maximum of 26 words due to the way they are presented. This is a function that I don't find particularly useful, but other people seem to, so I added it. (Besides, it only took about 5 minutes to do it. :-) ) BTW, if you sort case insensitively and then search for ALL CAPS, you won't find any. This shouldn't be too surprising if you think about it.
Found an error in how warnings were set up (actually, it was pointed out to me) that was preventing warnings from being raised. Combed through code fixing loads of minor errors (warnings) that were not fatal but could lead to obscure bugs occurring while running. Fixed a bunch of stunningly bad code that I was getting away with 'cause it sorta worked and no one had called me on it.
Added some options under the Prefs menu item where you can set paths to the various support programs. (gutcheck, aspell.) These were accessible in other places but this collects them into one place where they might be expected to be. Also moved rewrap margin setup to under Prefs.
Added a few variable interpolations for regex replace. Have enabled the \L, \U & \E case-modification escapes for replacement text. Text surrounded by \L and \E will be lowercased in the replacement text. Text surrounded by \U and \E will be upper cased. If the \E escape is omitted, all of the text after the \L or \U escape will be lower or upper cased respectively. They will usually be used to change the case of extracted variables. An example: Search for <(\/?)(\p{IsUpper}+)> and replace with <$1\L$2\E> will change any upper case HTML markup to lower case. Any instance of <I>, </I>, <B> or </B> will be converted to lower case. (Useful for XHTML. HTML is somewhat blasé about the case of its markup, XHTML is much more finicky.)
Worked with donovan to get guiguts to run correctly under Linux. He also was invaluable in helping track down some of the more obscure warnings. Thanks donovan! Linux compatibility is about 80-90% there. Still need to work out some odd things here and there.
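The case escapes behave the same way in a plain Perl substitution, so the example above can be tried outside guiguts. This is only a sketch: the sample strings and variable names are made up, and the /g flag stands in for pressing Replace All.

```perl
use strict;
use warnings;

# Sample text with upper-case HTML markup (made up for this example).
my $text = 'Some <I>italic</I> and <B>bold</B> words.';

# Capture the optional slash in $1 and the upper-case tag name in $2,
# then lower-case only the tag name with \L ... \E.
(my $lowered = $text) =~ s/<(\/?)(\p{IsUpper}+)>/<$1\L$2\E>/g;

# Without a closing \E, \U applies to everything that follows it.
(my $shouted = 'chapter one') =~ s/^(\w+) (\w+)$/\U$1 $2/;

print "$lowered\n";
print "$shouted\n";
```

The first substitution leaves the angle brackets and the slash alone because only $2 sits between \L and \E.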

Version .14 (119K)
Fixed rather serious problem where aspell personal word list would be corrupted when guiguts exited. Only seemed to be an issue with Win 98; Windows NT and 2000 (and presumably XP) didn't seem to have the problem, or at least, not as bad.
Tweaked page separator routine a little. Auto join was failing if the last character of the line prior to the page separator was upper case. It was pointed out that that was unnecessary. Changed. Modified logic to allow the line after a page separator to start with "I " (capital I space) without faulting and needing user intervention. Normally, it fails to autojoin if the line after a page separator starts with a capital letter, since it is not uncommon that page ending punctuation is missed. "I" is so common and occurs in texts so frequently, that it is probably one of the biggest false negatives. Added it as a special case.
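The rule described above can be sketched as a small predicate. This is not the code guiguts uses; can_autojoin is a hypothetical name, and the check only looks at the first characters of the line that follows the separator.

```perl
use strict;
use warnings;

# Hypothetical check: may the line after a page separator be joined
# automatically, or should the user be asked?
sub can_autojoin {
    my ($next_line) = @_;
    return 1 if $next_line =~ /^I /;            # special case: the pronoun "I"
    return 0 if $next_line =~ /^\p{IsUpper}/;   # other capital: possible missed punctuation
    return 1;                                   # lower case start: safe to join
}

for my $line ('I went home.', 'The next day', 'and then left', 'It was late') {
    printf "%-15s %s\n", $line, can_autojoin($line) ? 'join' : 'ask';
}
```

Note that "It was late" still asks for intervention, because the special case requires a space right after the capital I.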
Fixed routine history to only record page separator routine when first invoked, instead of each time it removed a separator. Arrgh...
Added a few more regex expressions to the regex.rc file.
Tweaked a few other minor user interface bugs.

Version .13 (119K)
Fixed a problem with the Footnote Fixup routine. It was not dealing with double open or closing brackets very well. They are rare, but they do happen.
Made up a new scannos list containing some useful regex search and replace terms, called regex.rc. It works like the other scannos list except that you'll need to have Regex checked in the search window when you use them. There aren't many in there, just a few I thought of off the top of my head. If anybody comes up with any other useful regex search expressions they think should be in there, let me know and I'll add them to the distribution.
Made some changes to the Footnote re indexing routine. Makes more of an attempt to preserve the original style anchor markers (letters, Roman numerals) instead of changing them all to numbers.
Tweaked various word frequency display lists (Frequency sort, Alphabetic sort, Hyphen sort, Alpha/Numeric sort, Spellcheck sort) to be more uniform in how they handle searches. Most of the changes are behind the scenes and not visible to the ordinary user, but I know they are there.
Added a Sidenote fixup function. Will find all Sidenotes marked with "[Sidenote" and move them to just before the paragraph they are in. It will leave 1 blank line before and after each Sidenote. This will allow the text to be re wrapped without folding the sidenotes back into the paragraph. (And besides, I like it better that way.) It will also do some basic error checking and alert you if it finds something it thinks is a sidenote but has bad markup. This hasn't been a high demand function, but I've got a text to post process with about 18 bazillion (loosely defined as 283) sidenotes in it and I didn't want to deal with them manually. So there.
Fixed minor problem where stealth scannos directory wouldn't be remembered if stealth scannos function was run more than once per session.

Version .12 (117K)
Added some more functionality to the Footnote fixup function. Separated first pass and re index buttons. Added buttons to allow you to switch view from the footnote to the anchor with one button press. (Useful for footnotes after they have been moved out-of-line.) Added option to do unlimited search for anchors. The search was artificially limited to only the previous page (more or less) to prevent getting 50 footnotes all pointing to the anchor point [1]. After the footnotes have been reindexed and moved, however, that limit was preventing the script from finding anchors. The unlimited search should only be used on footnotes that, to a fairly high confidence level, don't have any duplicate anchor markers.
Found and fixed bug where script was sometimes skipping adjacent footnotes. (less than 2 characters between the end of one and the start of another.)
Added spell checking functionality to the word frequency function. Will filter the list to only show words that the spell checker doesn't recognize. You may end up with words in the list with a frequency of zero due to the different definitions of what a word is by the word frequency routine and the spell checking program; typically, these are parts of a hyphenated word.
Finally got regex variable extraction for replacement working. Still not perfect, but not bad. Will support up to 8 replacement variable extractions. Use standard regex syntax: surround the match variable with parentheses and use numbered back references in the replacement. For example: for the string " [12] " you could match \[(\d+)\] and replace with [Footnote $1: ] to end up with the string " [Footnote 12: ] ". I am not doing a true regex replace so you don't need to escape meta characters in the replacement text. The search is a true regex search so meta characters must be escaped. Short list of meta characters - "{}[]()^$.|*+?\-". Any of these characters need a backslash before it if you want to search for the literal character.
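One way to get "true regex search, literal replacement" behaviour is to match first and then fill the $1 through $8 placeholders into the replacement text by hand. The sketch below assumes that approach; fill_template and the sample line are made up for illustration and are not taken from guiguts.

```perl
use strict;
use warnings;

# Hypothetical helper: fill $1..$8 placeholders in a literal template.
sub fill_template {
    my ($template, @captures) = @_;
    # Swap each $N (N = 1..8) for the matching capture, or nothing if unset.
    $template =~ s/\$([1-8])/defined $captures[$1 - 1] ? $captures[$1 - 1] : ''/ge;
    return $template;
}

my $line     = 'See the note [12] for details.';
my $search   = '\[(\d+)\]';          # true regex: the brackets are escaped
my $template = '[Footnote $1: ]';    # literal text: no escaping needed

if (my @captures = $line =~ /$search/) {
    # Remember where the whole match started and ended before doing anything else.
    my ($start, $end) = ($-[0], $+[0]);
    substr($line, $start, $end - $start, fill_template($template, @captures));
}
print "$line\n";
```

Because the template is never compiled as a regex, the square brackets in it need no backslashes.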
Regexes are extremely powerful and useful to do variable search and replace operations. There have been whole books written on regexes so I'm not going to try to cover them in great detail. Some of the more complex ones look more like line noise than a search function. A decent basic tutorial is in the perl documentation. See http://www.perldoc.com/perl5.8.0/pod/perlrequick.html.
There are a bunch of regex search and replace expressions that will be useful over and over. (Only the expression within the double quote marks.)
search - "(\S)\s\s(\S)"   replace - "$1 $2" -- Find exactly two spaces between any non space characters and remove one space. Will ignore indenting and Multi space strings.
search - "\.(\s\p{IsLower})"  replace - ",$1" -- Find period followed by a space and a lowercase letter, replace with comma. Will get lots of false positives.
search - ",(\s\p{IsUpper})"  replace - ".$1" -- Find comma followed by a space then an upper case letter, replace with a period. Will get lots of false positives.
search - "(?<=[^\-])-{4,}"   replace - "----"  -- Find a string of hyphens at least four in a row, preceded by something that is not a hyphen and replace with a string of four hyphens.
--and many, many more. That last one is not specifically useful, it is more just an example of what kind of neat tricks you can do.
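The four expressions above can be checked against sample text in a short Perl script before being used on a real file. The sample strings are made up, and the /g flag replaces every match at once, which corresponds to Replace All rather than stepping through matches one by one.

```perl
use strict;
use warnings;

# Made-up sample lines, one per expression.
my $spaces  = 'two  spaces here';
my $period  = 'It was late. and dark.';
my $comma   = 'It was late, And dark.';
my $hyphens = 'wait-------what';

(my $s1 = $spaces)  =~ s/(\S)\s\s(\S)/$1 $2/g;       # collapse exactly two spaces
(my $s2 = $period)  =~ s/\.(\s\p{IsLower})/,$1/g;    # period before lower case
(my $s3 = $comma)   =~ s/,(\s\p{IsUpper})/.$1/g;     # comma before upper case
(my $s4 = $hyphens) =~ s/(?<=[^\-])-{4,}/----/g;     # long hyphen runs to four

print "$_\n" for $s1, $s2, $s3, $s4;
```

The final period in the second sample is untouched because it is not followed by a space and a lowercase letter.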
There is now a Function History pop up window available under the Help menu. It tracks the major functions as they are performed and keeps a record of them. It will be saved session to session as long as the file is saved. In other words, it will only retain records of functions performed that have had the file saved afterwards.

Version .11 - (116K)
Added another button to gutcheck window. Allows you to easily rerun gutcheck without switching back to the main window.
Changed the function of the Del button in the page separator routine. Now completely removes line instead of just clearing it.
Footnote moving tool has been activated. It's complex, ugly, limited, buggy and much less automated than I originally planned, but it's a start. Right now, it works best with simple footnotes. Nested footnotes (the bane of my existence) are not supported very well. (You can do it, you just need to be very careful.) I plan to do more with this, but I've already spent over a week just trying to get this one function working and I'm sick of pounding my head on it. Though the concept is simple, there is an amazing amount of fancy dancing that needs to go on behind the scenes to keep everything straight.

Version .10 - (110K) Modified gutcheck view options to only allow one instance to be created. Made closing gutcheck window destroy the options window as well.
Fixed bad subroutine call in "Change All" routine under spell check function. Caused it to quietly fail.
Framework for footnote moving and checking tool is implemented. It doesn't really do anything useful yet, but it is fun to play with. :-)
Modified executable version to get around problem with aspell blocking and failing. It will now open a console window. (DOS box) You shouldn't have to do anything with it, although you may need to close it separately after you close winguts. It is necessary to have it open for the program to communicate with aspell.


Version .09 - (109K) Got interface to Aspell/Ispell functioning. Either can be used, though aspell seems to be slightly better supported under windows and seems to be the more capable spelling package. Can now spell check a selection, or the whole file if no selection is made, with replacement suggestions. Have enabled function to allow you to add words to the Aspell dictionary. Still have to puzzle out how to allow you to select dictionaries in aspell. Added "Change All" button in Spell check box to allow you to change all occurrences of a misspelled word with one button press.
Added a "project dictionary" function. Allows you to skip project specific words. Use it for words that are common in a project, but you don't want to add to your standard dictionary. (Dialect, proper names, etc.) Project dictionary is saved in the directory the text file is in with the same name, but a ".dic" extension.
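The naming rule for the project dictionary (same directory, same name, different extension) amounts to swapping the final extension of the text file's path. A minimal sketch, assuming that is all there is to it; sidecar_path and the sample paths are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build a companion file name next to the text file.
sub sidecar_path {
    my ($text_file, $extension) = @_;
    # Drop the final extension, if any, without touching directory names.
    (my $base = $text_file) =~ s/\.[^.\/\\]+$//;
    return $base . $extension;
}

print sidecar_path('/books/moby/moby.txt', '.dic'), "\n";
print sidecar_path('notes', '.dic'), "\n";
```

A file with no extension simply gets the new one appended.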
Tied Word Frequency information into spell checking function. If you have run Word Frequency before you do spell check, it will display how many times that particular word occurs in the text to help decide whether it is a misspelling or not. **Note: the case of the word must match in both the spell check and word frequency list. You probably won't get useful results if you do a case insensitive sort in word frequency.
Added alphanumeric check to word frequency window. Allows you to easily check all of the words with mixed digits and non-digits. It will include numbers that have commas and/or hyphens. I debated filtering them out, but during testing, I found several problems with dates in my test project that way, so I elected to leave them in. Makes it very easy to find "H0ME" and "a11" errors as well as "l87I" and "l9OO", which are sometimes missed by a standard spell checker.
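The idea behind the alphanumeric check can be sketched in a few lines: keep any word that contains both a digit and a non-digit. The word splitting here is a simplification (whitespace only, with outer punctuation trimmed) and is an assumption, not the rule the word frequency routine actually uses.

```perl
use strict;
use warnings;

# Made-up sample sentence with typical digit/letter scannos.
my $text = 'He came H0ME in l9OO with a11 of 1,200 men on 1914-18 leave, aged 42.';

my @mixed;
for my $word (split ' ', $text) {
    # Trim punctuation from both ends but keep inner commas and hyphens.
    $word =~ s/^[^\w]+|[^\w]+$//g;
    # Keep words that contain at least one digit and one non-digit.
    push @mixed, $word if $word =~ /\d/ && $word =~ /\D/;
}
print join(' | ', @mixed), "\n";
```

Numbers with commas or hyphens stay in the list, as described above, while the plain number 42 drops out.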
Made some of the gutcheck error searching routines a little more robust. Should set the cursor exactly at the error more often. Still gets thrown off by HTML markup, especially markup adjacent to the error.
Trapped another potential problem with rewrapping mangling the hidden page markers. If you interrupted rewrapping, it was losing most of the page markers and strewing cedillas (don't ask) throughout the text. :-(   Fixed now.
Worked on regex search highlighting, still not perfect--but much better.
Fixed minor bug in Replace All function of search box. Was not replacing first occurrence when Replace All was selected.
Added selectable view options to gutcheck window. Only see the errors you want. If you right click on an error in the list it will be deleted from the list and will not be recoverable unless you run gutcheck again.


Version .08 - (105K) Fixed bug where bin file was not being saved on program exit.
Added ability to change Font, size and weight in main editing window. Very limited set of fonts available right now. I could probably add more if desired. The font information is for viewing only. There are no formatting changes made to the text files.
Have spell check partially functional. If you have aspell or ispell installed on your system, it will search and find all of the "misspelled" words in the file and let you cycle through the file and check them. The replacement choices option is being resistant to implementation. I decided to release this as an interim since I figured half a loaf is better than none. You also can't add words to the dictionary yet. It will get there.
Fixed problem with rewrapping mangling the page index markers. It is not perfect but will generally remember the correct page to within one word of the original page breaks, especially for words that were broken across pages.

Version .07 - (102K) Fixed bug in Search function dialog box, was not updating "number of times term found" label until a search was performed.
Fixed fairly serious bug in rewrap / block rewrap function which would cause it to randomly fail if the selection did not start and stop exactly at a blank line. (Notice: rewrap functions still work best if the selection starts and stops exactly on a blank line.  If you insist on rewrapping a selection that terminates in the middle of a line of text, the results may not be exactly what you expect or want).
Fixed a subtle and obscure bug with page separator function. If there were two page separators in a row (from an illustration or blank page) with a word hyphenated across both separators, it would yield odd (read wrong) results.
Tweaked separator search so that first few separators will scroll to the center of the screen. It was not centering the separators until you were further into the text than twice the height of your text window. There is a very minor bug when converting the very first page separator. It will always add one more blank line than you request. I probably won't fix this since it is so minor, it is just something I've noticed.
Added a "Delete" button to the page separator dialog. Deletes page separator without making any other edits. Added a pop up help screen too to explain the different page separator editing functions briefly and listing the hot keys.
Fixed undo buffer to reset on file save. It was not, and I left it that way for several versions but if you start doing major editing (rewrap), the undo buffer starts chewing up large amounts of memory and slowing down your computer. It will still do that if you don't save periodically, but that is probably A Bad Idea™.
Added some hot keys to the Search function. Enter will search. Shift-Enter will replace. Ctrl-Enter will replace and search. Ctrl-Shift-Enter will replace all.
Added -v verbose switch to gutcheck options. Was enabled by default, now you have a choice.
Got the full automatic header replacement working rather well, after much dithering and fruitless experimentation. While full automatic header replacement is being done, the Undo button reverts to doing single step undo. It was very problematic trying to predict what exactly you would want undone in that situation, so if you want to undo something while in full automatic, you'll just need to single step back through the undo buffer. I am debating adding some more automatic fixes but want to get some feedback first.
Cobbled up a method to track page numbers after page separators are removed. An ugly, nasty kludge... but it seems to work fairly well. Page numbers will now show up in the bottom status bar if available. Need to come up with some way to save the now invisible page markers from session to session.
Debated various things and finally settled on writing a hash of the page markers and indices to a separate file rather than trying to keep them in the same file. The cons: need to have two separate files, one with the text info and one with the markup data. The pros: able to use the existing tools without modification to skip the binary data. Separate files win. This actually may be a good thing as I can start to keep indices to HTML markup in the separate file and only instantiate it when you want to generate an HTML version. Anyway... the page marker information will be written to a file in the same directory as the text file with the same name but a ".bin" extension. It is actually just a text file, and you can open it and look in there if you like, but be VERY cautious about editing it. It is pretty sensitive to correct formatting. As long as the ".bin" file is found in the same directory as the text file, it will load when the text file loads.
This may be something we want to start saving with the archive files since it can be used to reconstruct the text layout long after it has been post processed.
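To illustrate keeping a hash of page markers in a separate plain-text file, here is a round trip through a string. Everything specific in it is an assumption: the marker names, the "line.column" index values, and the tab-separated layout are invented for the example and are not the actual ".bin" format.

```perl
use strict;
use warnings;

# Hypothetical page markers: page label => "line.column" index in the text.
my %page_index = (Pg001 => '1.0', Pg002 => '41.17', Pg003 => '83.5');

# Serialize one "name<TAB>index" pair per line (an assumed format).
sub dump_markers {
    my ($markers) = @_;
    my $out = '';
    $out .= "$_\t$markers->{$_}\n" for sort keys %$markers;
    return $out;
}

# Parse the same format back into a hash, skipping malformed lines.
sub load_markers {
    my ($data) = @_;
    my %markers;
    for my $line (split /\n/, $data) {
        my ($name, $index) = split /\t/, $line, 2;
        $markers{$name} = $index if defined $index && length $name;
    }
    return \%markers;
}

my $saved    = dump_markers(\%page_index);
my $restored = load_markers($saved);
print "Pg002 starts at $restored->{Pg002}\n";
```

A strict one-pair-per-line layout like this also shows why hand edits are risky: a stray tab or missing newline changes what gets loaded.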

Version .06 - (99K) Found and fixed a few bugs in the fixup routine. Under certain circumstances, it would leave whitespace at the end of a line. Fixed.
It was having problems with strings of hyphens longer than three in a row. Fixed.
In the rewrap routine under certain conditions, it would either trample the text after the selection or not rewrap the entire selection. (Usually only extremely long paragraphs or extremely long lines that needed lots of rewrapping) Fixed.
Finally tracked down a fix (or at least a work around) for the scrollbar not resizing in proportion to the text list size in the various popup windows. Not real elegant, but it works.
When using the search function, it will no longer display the number of times a word is found in the document if "Whole Word Only" is unchecked. This led to some confusing disparities between how many times a term was actually found versus how many times it was reported.
Added another failure detect mode to the HTML orphan markup detection.
Totally overhauled and changed the Page Separator Removal function. Now is interactive much like PRTK. It was too problematic trying to make it fully automatic. Now, a small window with several option buttons will pop up. It will automatically search for and highlight the next page separator and wait for you to make a decision on how to handle it. There is also an option to save a marker in the text with the page number as an HTML comment. Unfortunately, this will drive Gutcheck batty as it will generate 3-4 warnings for EACH.... :-(  I'll have to figure out some tap dance I can do behind the scenes to make this more usable. As of now it is available but not recommended.....
Added a regex search option to the search popup. It is subtly broken in three or four ways but it is as good as it is going to get.
1) you can't match a newline (\n) character. 
2) it won't perform matches across line boundaries. (Actually 1 & 2 are the same problem stated two ways)
3) accented characters will not match \w or \b assertions.
(Sorry, but that's the way it is. These are all flaws inherent in the Tk text widgets and there's nothing I can do about it)
4) found term highlighting is broken for regex searches, especially if you start using a bunch of variable or zero width assertions. It will highlight something, but it may be more or less than the actual matching text. I'm tempted to just turn off highlighting for regex matching, we'll see.
Broke out "Remove end-of-line white space" from fixup function to let it be run separately if desired. (It is still in fixup too, this just lets it run separately)
Updated Manual.

Version .05 - (97K)
Fix one thing, break something else, :-\  When I did away with my previous file name parsing, it broke the check to see if the file was edited before running Gutcheck. Fixed.
Added another button to scannos search dialog to swap the search and replace terms so you can easily do reciprocal searching. Made searching better able to deal with accented characters. Still some bizarreness though. For some reason I am not able to make perl detect a border of a word if it begins or ends with an accented character. Makes it impossible to do a "Whole Word Only" search for those. I've done a kind of half assed work around by detecting if a word starts or ends with an accented character, then doing a pattern search instead of a whole word search in those cases. I am talking with the maintainers of the perl Tk text widgets to see if I can get this working correctly.
Made a few minor bug fixes to script; update line/column indicators when a gutcheck error is selected, added <ctrl>-s to the hot key bindings. <ctrl>-s will save the file. Fixed a few errors in documentation.  I've added the extension ".ggp" (guiguts project) to the default search list in the file open dialog. If you are running the windows executable version, you can register the ".ggp" extension to be opened by the winguts.exe program. That way you will be able to open winguts by double clicking on a file with a ".ggp" extension. Made search term entry take focus when Search & Replace box opens. (Thanks to martinag for the suggestions!) Trimmed some unnecessary files from the executable build, drastically reduced file size.

Version .04 - (96K)
Added some more function to the search / scannos functions. If  word frequency has been run before the search function is called, it will display how many times a particular search term appears in the text. Note that this is only accurate for whole words. Searching on a part word or punctuation will be labeled as "not found"; not because it doesn't exist in the text, but because it doesn't exist in the word list.
Added ability to search text either forward or reverse. The search will automatically wrap around to the beginning (or end) of the file when it reaches the end (or beginning).
Added a pop up hot key listing under the help menu.
Worked on filename / path parsing for converting to DOSish format for gutcheck. Gutcheck is still just a DOS program so it can't parse file names with more than 8.3 characters or pathnames with spaces / more than 8 characters. Poked around in some of the more obscure perl documentation. Think I've got it now. Fixed problem throughout program.
By user request, the gutcheck box stays on top when focus changes to the main window.
Added file filter on file open dialog to default to .txt files.

Version .03 - (94K) Fixed problem where file name would not be updated in header until you clicked in the text box. Twiddled around with fixup routine, sped it up by an order of magnitude without sacrificing function. :-) Now it's only about 20% slower on average than it would be without updating the display at all. Made some changes to word counting routine to recognize numbers with commas or periods in them as a single entity. Changed the file open routine to remember the last directory session to session. Added a couple more hot keys. Added a hyphen check function to the word frequency window. Shows a list of all of the hyphenated words in the file along with any words that are identical except without a hyphen. Added a basic stealth scannos checking routine. Grafted it onto the search and replace function. Basically, it allows you to load a file of stealth scanno pairs and automatically load them into the search and replace box one by one. I've included Big Bill's English Common Stealth Scannos list from the CVS site, formatted to work with the script. It is in the scannos directory, under the script folder. You can add others if you like. Fixed problems with windows that would only open once. Lots of miscellaneous tweaking and tuning. Updated manual.
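The hyphen check described above can be sketched as a word count followed by a lookup of each hyphenated word with its hyphens removed. This is a rough illustration only: the definition of a word (ASCII letters with internal hyphens, compared in lower case) and the report layout are assumptions.

```perl
use strict;
use warnings;

# Made-up sample text with inconsistent hyphenation.
my $text = 'To-day the school-room was empty; today the schoolroom '
         . 'and the old bed-room were full.';

# Count every word, treating internal hyphens as part of the word.
my %count;
$count{lc $1}++ while $text =~ /([A-Za-z]+(?:-[A-Za-z]+)*)/g;

# For each hyphenated word, look up its unhyphenated twin.
my @hyphenated = grep { /-/ } keys %count;
my @report;
for my $word (sort @hyphenated) {
    (my $joined = $word) =~ tr/-//d;
    my $twin = exists $count{$joined}
        ? "$joined ($count{$joined})"
        : 'no unhyphenated form';
    push @report, "$word ($count{$word}) : $twin";
}
print "$_\n" for @report;
```

Pairs such as to-day and today are the ones worth a second look; bed-room has no twin in this sample and is probably consistent.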

Version .02 - (90K)
What version .01 should have been before it ever saw the light of day. Better calling of gutcheck routine. Better parsing of gutcheck error file. Better coupling of gutcheck output window with text window. Vastly improved search function with text highlighting. Made Gutcheck options sticky session to session. Lots more fix up functions and fixed a lot of bugs in the existing ones. Added Word frequency / index function to count words and frequency with direct ties to search function. Completely overhauled menu to work around bugs in perl text module. Now include the latest version of gutcheck with the script. Wrote a manual.

Version .01 - (6K)
Initial release. No manual. Partial functionality. Flaky operation. Basically a fetid pile of crap.... but it runs... sort of.